Collating sequence

If you set the COLLATING_SEQUENCE server option to 'Y', you are telling the federated database that the data source collating sequence matches the collating sequence of the federated server. This setting allows the optimizer to consider pushing down order-dependent processing to a data source, which can improve performance.

If the data source collating sequence is not the same as the federated database collating sequence and you set the COLLATING_SEQUENCE server option to 'Y', you can receive incorrect results. For example, if your plan uses merge joins, the optimizer might push down ordering operations to the data sources. If the data source collating sequence is not the same, the join results might not have a correct result set. Set the COLLATING_SEQUENCE server option to 'N', if you are not sure that the collating sequence at the data source is identical to the federated database collating sequence.

Alternatively, you can configure a federated database to use the same collating sequence that a data source uses. You then set the COLLATING_SEQUENCE server option to 'Y'. This allows the optimizer to consider pushing down order-dependent operations on character columns.

To determine if a data source and the federated database have the same collating sequence, consider the following factors:

  • National language support

    The collating sequence is related to the language supported on a server. Compare the federated database NLS information for your operating system to the data source NLS information.

  • Language-aware collations

    Check whether the federated database or the data source uses language-aware collations. If they use different collations, then you should not set COLLATING_SEQENCE to Y.

  • Data source characteristics

    Some data sources are created using case-insensitive collating sequences, which can yield different results from the federated database in order-dependent operations.

  • Customization

    Some data sources provide multiple options for collating sequences or allow the collating sequence to be customized.

When a query fragment from a federated server requires sorting, the place where the sorting is processed depends on several factors. If the federated database's collating sequence is the same as the data source collating sequence, the sort can take place at the data source or at the federated server. The query optimizer can determine if a local sort or a remote sort is the most efficient way to complete the query.

Numeric comparisons, in general, can be performed at either location even if the collating sequence is different. You can get incorrect results, however, if the weighting of null characters is different between the federated database and the data source.

Likewise, for comparison statements, be careful if you are submitting statements to a case-insensitive data source. The weights assigned to the characters "I" and "i" in a case-insensitive data source are the same. For example, in a case-insensitive data source with an English code page, STEWART, SteWArT, and stewart would all be considered equal. The federated database, by default, is case-sensitive and would assign different weights to the characters.

If the collating sequences of the federated database and the data source differ, the federated server retrieves the data to the federated database, so that it can do the sorting locally. The reason is that users expect to see the query results ordered according to the collating sequence defined for the federated server; by ordering the data locally, the federated server ensures that this expectation is fulfilled.

If your query contains an equality predicate on the character column, it is possible to push down that portion of the query even if the collating sequences are different (set to 'N'). For example, the predicate C1 = 'A' would retrieve the same values regardless of collating sequence and therefore could be pushed down to a data source that has a different collating sequence than the federated server. However, such predicates cannot be pushed down when the collating sequence at the data source is case-insensitive (COLLATING_SEQUENCE='I'). When a data source is case-insensitive, the results from C1= 'A' and C1 = 'a' are the same, which is not consistent with a case-sensitive environment such as Db2®.

Administrators can create federated databases with a particular collating sequence that matches the data source collating sequence. This approach can speed performance if all data sources use the same collating sequence or if most or all column functions are directed to data sources that use the same collating sequence. For example, in Db2 for z/OS®, sorts defined by ORDER BY clauses are implemented by a collating sequence based on an EBCDIC code page. If you want to use the federated server to retrieve Db2 for z/OS data sorted in accordance with ORDER BY clauses, it is advisable to configure the federated database so that is uses a predefined collating sequence based on the EBCDIC code page.

If the collating sequences at the federated database and the data source differ, and you need to see the data ordered in the data source's sequence, you can submit your query in a pass-through session, or define the query in a data source view.