Unicode Support (FireDAC)

From RAD Studio
Jump to: navigation, search

Go Up to Working with Connections (FireDAC)


This topic describes how FireDAC works with Unicode data and metadata. FireDAC fully supports Unicode data and metadata. However, an application may need to take additional steps to use Unicode properly.

General

Most enterprise-class database applications must be able to work with character data encoded provided as Unicode. FireDAC provides seamless support for:

  • different single byte client character sets and code pages, including standard ANSI.
  • multi-byte strings, such as Unicode, including UTF8, UTF16, and UCS2 encodings.

FireDAC Unicode handling depends on:

  • the version of Delphi being used.
  • the client character set - UTF8/UTF16 or ACP compatible.
  • the DBMS client / driver.

FireDAC performs transparent character set conversion between the Delphi application and the DBMS client character set, when it is required. Note that the complete Unicode support can be achieved only with the Unicode Delphi version usage - Delphi 2009 and later. The following table summarizes where and what encoding can be used, depending on a Delphi version:

Usage Unicode enabled Delphi 2009 and higher Non-Unicode enabled Delphi 2007 and less
SQL command text Unicode ANSI
SQL script text Unicode ANSI
Result set field values Unicode / ANSI ANSI / Unicode
Parameter values Unicode / ANSI ANSI / Unicode
Metadata values, including:
  • column names
  • table names
  • generator names
  • index names
  • and so on
Unicode ANSI
Metadata query values Unicode Unicode
Text data files Unicode / ANSI ANSI / Unicode
Trace output Unicode / ANSI ANSI / Unicode
Tools Unicode Unicode

Configuring Connection Definitions for Unicode

Note: It is recommended to set the DBMS client character set to Unicode for Delphi 2009 and later to avoid conversion loss.

In general, to configure connection definitions for Unicode, it is required to set the DBMS client character set to Unicode. How this is done depends on the version of Delphi and the DBMS driver that is used:

DBMS Parameter
Advantage Database Server Not supported.
IBM DB2 Server Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.
DataSnap server Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.
Firebird CharacterSet=utf8.
Informix Automatically set to UTF16 in Delphi 2009 and later.
InterBase CharacterSet=utf8 .
Microsoft SQL Server Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.
Microsoft Access Database Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.

Set StringFormat=Unicode to force all string columns to be of Unicode data types.

MySQL Server CharacterSet=utf8.
Oracle Server
  • CharacterSet=utf8.
  • or NLS_LANG=_.UTF8, if CharacterSet is not specified.
PostgreSQL CharacterSet=utf8.
SQLite database Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.

Set StringFormat=Unicode to force all string columns to be of Unicode data types.

Sybase SQL Anywhere Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.
Teradata Database CharacterSet=utf8 or CharacterSet=utf16.
dbExpress bridge driver Depends on the dbExpress driver.
ODBC bridge driver Automatically set to UTF16 in Delphi 2009 and later. Not supported in Delphi 2007 or less.

SQL Command Text

Starting with Delphi 2009, TFDQuery.SQL, TFDCommand.CommandText, IFDPhysCommand.CommandText, and other property values are Unicode encoded. The SQL command text preprocessor supports only UCS2 encoding, which means that surrogate pairs are not supported.

Pre-Delphi 2009 versions support only ANSI encoded command text.

Before sending a SQL command text to the DBMS, FireDAC converts it to:

  • Active Code Page, if the client character set is SBCS (non Unicode). If the client character set is not compatible with ACP, conversion loss is possible.
  • UTF8, if the client character set is UTF8.
  • UTF16, if the client character set is UTF16.

To specify Unicode encoded SQL command text, just do the following:

FDQuery1.SQL.Text := 'select Hello world ! where Lang = RU and Val = Привет мир !';

SQL Script Text

Starting with Delphi 2009, TFDScript.SQLScripts collections can contain Unicode encoded SQL script texts. Other than that, SQL script text processing is similar to SQL command text processing.

Pre-Delphi 2009 versions support only ANSI encoded script text. TFDScript can load Unicode encoded SQL script files, but they will be converted into ACP (ANSI) encoding before processing.

To control SQL script file and log file encoding, use the TFDScript.ScriptOptions.FileEncoding property:

FDScript1.ScriptOptions.FileEncoding := enUTF16;
FDScript1.SQLScriptFileName := 'u:\builddb.sql';
FDScript1.ExecuteAll;

Result Set Fields

DBMS Description
Advantage Database Server
  • NCHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • NVARCHAR - ftWideString
  • NMEMO - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
IBM DB2 Server
  • GRAPHIC - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARGRAPHIC - ftWideString
  • LONG VARGRAPHIC, DBCLOB - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
DataSnap server
  • ftFixedWideChar on Delphi 2006 and higher and ftWideString
  • ftWideMemo on Delphi 2006 and later and ftFmtMemo
Firebird If CharacterSet is UTF8 or UNICODE_FSS, then:
  • CHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARCHAR - ftWideString
  • BLOB SUB_TYPE TEXT - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Informix

If the connection definition parameter StringFormat=Unicode, then all string columns are Wide strings. If StringFormat=Ansi, then all string columns are ANSI strings. By default, StringFormat=Ansi.

InterBase If CharacterSet is UTF8, UNICODE_FSS, UNICODE_LE, or UNICODE_BE, then:
  • CHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARCHAR - ftWideString
  • BLOB SUB_TYPE TEXT - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Microsoft SQL Server
  • NCHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • NVARCHAR - ftWideString
  • NTEXT, NVARCHAR(MAX) - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Microsoft Access Database
  • CHARACTER - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • MEMO - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise

If the connection definition parameter StringFormat=Unicode, then all string columns are Wide strings. If StringFormat=Ansi, then all string columns are ANSI strings. By default, StringFormat=Choose.

MySQL Server If CharacterSet is UTF8, then:
  • CHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARCHAR, TINYTEXT - ftWideString
  • MEDIUMTEXT, TEXT, LONGTEXT - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Oracle Server
  • NCHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • NVARCHAR2 - ftWideString
  • NCLOB - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise

If CharacterSet is UTF8, then:

  • CHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARCHAR2 - ftWideString
  • LONG, CLOB - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
SQLite Database If StringFormat=Choose, then:
  • NCHAR, NATIONAL CHAR, NATIONAL CHARACTER - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • NVARCHAR, NVARCHAR2, NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING - ftWideString
  • TEXT, NTEXT, WTEXT, NCLOB, NMEMO, LONG NTEXT, LONG WTEXT, NATIONAL TEXT, LONGWVARCHAR - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise

If the connection definition parameter StringFormat=Unicode, then all string columns are Wide strings. If StringFormat=Ansi, then all string columns are ANSI strings. By default, StringFormat=Choose.

PostgreSQL If CharacterSet is UTF8, then:
  • CHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • VARCHAR - ftWideString
  • TEXT - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Sybase SQL Anywhere
  • NCHAR - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • NVARCHAR - ftWideString
  • LONG NVARCHAR - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
Teradata Database Not supported.
dbExpress bridge driver Depends on the dbExpress driver and on the DBMS. The general rule:

DBX1-3:

  • fldZSTRING, fldstUNICODE - ftWideString
  • fldZSTRING, fldstUNICODE, fldstFIXED - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • fldBLOB, fldstMEMO, fldstUNICODE - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise
  • fldBLOB, fldstFMTMEMO - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise

DBX4:

  • TDBXDataTypes.WideStringType - ftWideString
  • TDBXDataTypes.WideStringType, TDBXDataTypes.FixedSubType - ftFixedWideChar
  • TDBXDataTypes.BlobType, TDBXDataTypes.WideMemoSubType - ftWideMemo
ODBC bridge driver Depends on the ODBC driver and on the DBMS. The general rule:
  • SQL_WCHAR, SQL_GRAPHIC - ftFixedWideChar on Delphi 2006 and later, ftWideString otherwise
  • SQL_WVARCHAR, SQL_VARGRAPHIC, SQL_LONGVARGRAPHIC - ftWideString
  • SQL_WLONGVARCHAR, SQL_DBCLOB - ftWideMemo on Delphi 2006 and later, ftFmtMemo otherwise

To read / write Unicode string values programmatically use:

  • The TField.AsWideString or TField.Value properties.
  • TField.AsString on Delphi 2009 and later. On earlier Delphi versions, the reading/writing of the AsString property may lead to conversion loss, because AsString returns ANSI strings there.

To read/write the Unicode field on any Delphi version:

FDQuery1.Edit;
FDQuery1.Fields[0].AsWideString := FDQuery1.Fields[0].AsWideString + 'русский' + 'english';
FDQuery1.Post;

To read/write the Unicode memo field on Delphi 2009:

FDQuery1.Edit;
FDQuery1.FieldsByName('memo').Assign(Memo1.Lines);
FDQuery1.Post;

Parameter Values

The Unicode encoded parameter value is converted to a Unicode character set, which is supported by the DBMS, and sent to the DBMS. This does not depend on a client character set or on a Delphi version. Note that Firebird and InterBase cannot send Unicode character data to a server if the CharacterSet parameter is not UTF8.

To read/write a Unicode parameter value on any Delphi version, use the AsWideString property. The AsString behaviour depends on the platform:

  • on NextGen (iOS and Android) platforms, AsString is equivalent to AsWideString (Unicode).
  • on other platforms, AsString is equivalent to AsAnsiString (SBCS / ANSI strings).

For example:

FDQuery1.Params[0].AsWideString := 'русский' + 'english';
// The following code lines are equivalent to the previous one:
FDQuery1.Params[0].DataType := ftWideString;
FDQuery1.Params[0].Value := 'русский' + 'english';

Metadata Values

Starting with Delphi 2009, the Unicode encoded metadata text is supported for:

  • table names.
  • result set column names.
  • stored procedure names.
  • stored procedure parameter names.
  • SQL command parameter names.
  • SQL command macro names.
  • user name.
  • schema, catalog names.

Pre-Delphi 2009 versions support only ANSI encoded metadata text.

All columns in metadata result sets, returned by TFDMetaInfoCommand, TFDMetaInfoQuery, and IFDPhysMetaInfoCommand are of Unicode string data types.

Trace Output

To enable Unicode encoding for trace file output (MonitorBy=FlatFile), set the TADMoniFlatFileClientLink.FileEncoding property value to ecUTF8 or ecUTF16.

Tools

All FireDAC utilities are compiled with the Unicode Delphi version and are 100% Unicode enabled.