UNICODE_BE and UNICODE_LE Character Sets


InterBase now supports the 16-bit UNICODE_BE and UNICODE_LE character sets as server character sets. These character sets cannot be used as client character sets. If your client needs full UNICODE character support, use UTF8 instead of UNICODE_LE or UNICODE_BE as the client character set (a.k.a. LC_CSET). A client can use UTF8 (or another native client character set) to connect to a UNICODE database.

To use the new character set, declare it as the default character set of the database in the CREATE DATABASE statement, as follows:

CREATE DATABASE <filespec> <...> DEFAULT CHARACTER SET UNICODE;

Note that InterBase uses “big endian” ordering by default.
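To make the difference between the two byte orderings concrete, the following minimal sketch uses Python's standard codecs, not InterBase itself. It assumes that the 16-bit UNICODE_BE and UNICODE_LE character sets store the same 16-bit code units as Python's utf-16-be and utf-16-le codecs for the characters shown. The variable names are illustrative only.

```python
# Illustration only: Python's standard codecs, not InterBase internals.
# Assumption: for these characters, UNICODE_BE / UNICODE_LE hold the same
# 16-bit code units that Python's utf-16-be / utf-16-le codecs produce.
text = "A\u20ac"  # U+0041 (A) and U+20AC (euro sign)

big_endian = text.encode("utf-16-be")     # most significant byte first
little_endian = text.encode("utf-16-le")  # least significant byte first

print(big_endian.hex(" "))     # 00 41 20 ac
print(little_endian.hex(" "))  # 41 00 ac 20
```

Each character occupies two bytes in both forms; only the order of the bytes within each 16-bit unit differs.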

The attributes for the UNICODE_BE and UNICODE_LE character sets are shown in InterBase Character Sets.

Note: This release of InterBase 2008 does not support UNICODE collations. The default collation for UNICODE is binary sort order.
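A binary sort order compares the stored code values rather than applying language-specific rules. The sketch below approximates this in Python by sorting strings on their big-endian 16-bit encoding. This is an assumption about how a binary collation behaves, not InterBase code, and the sample words and the binary_key helper are hypothetical.

```python
# Illustration only: approximates a binary collation with Python's codecs.
words = ["apple", "Zebra", "\u00e9clair", "banana"]  # "\u00e9" is e-acute


def binary_key(s):
    # Assumption: a binary collation compares raw big-endian 16-bit code units.
    return s.encode("utf-16-be")


binary_sorted = sorted(words, key=binary_key)
print(binary_sorted)  # ['Zebra', 'apple', 'banana', 'éclair']
```

Uppercase "Z" (U+005A) sorts before lowercase "a" (U+0061), and the accented "é" (U+00E9) sorts after every unaccented ASCII letter, which differs from what a dictionary ordering would produce.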

Support for the UTF-8 Character Set

The UTF-8 character set is an alternative coded representation form for all of the characters of the ISO/IEC 10646 standard. To use UTF-8, declare it as the default character set of the database in the CREATE DATABASE statement, as shown below:

CREATE DATABASE <filespec> <...> DEFAULT CHARACTER SET UTF8;

Additionally, you may use the alias UTF_8.

The attributes for the UTF-8 character set are shown in InterBase Character Sets.
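Unlike the fixed 16-bit UNICODE_BE and UNICODE_LE forms, UTF-8 is a variable-width encoding that uses one to four bytes per character. The following sketch shows this with Python's standard utf-8 codec. It illustrates the encoding itself rather than InterBase storage, and the sample characters are chosen only for demonstration.

```python
# Illustration only: shows the variable width of UTF-8 with Python's codec.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

utf8_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    utf8_lengths.append(len(encoded))
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3 a9
# U+20AC -> 3 byte(s): e2 82 ac
# U+1F600 -> 4 byte(s): f0 9f 98 80
```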