Wide Character Routines


Wide strings store text as a sequence of two-byte (16-bit) elements. UnicodeString, which is a wide string type, is the default string type in RAD Studio.

You might also choose to use wide strings because they simplify some of the string-handling issues in applications that have multiple target locales. With a wide character encoding scheme, you can make many of the usual assumptions about strings that do not hold for multi-byte character sets (MBCS), in which a string is a sequence of single bytes and one character can occupy one or more of them. For wide strings, there is a direct relationship between the number of bytes in the string and the number of elements in the string: every element is exactly two bytes. In an MBCS string, you have to be careful not to cut a character in half or mistake the second part of a character for the start of a different character. A similar, though less frequent, issue exists for wide strings. Although all elements are two bytes, characters outside the Basic Multilingual Plane (BMP) require two elements (a surrogate pair), so the number of elements is not always the number of characters.
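The point about characters outside the BMP can be made concrete with a minimal sketch. It is written in standard C++ and uses std::u16string as a stand-in for a wide string; it does not use UnicodeString, WideString, or any RAD Studio RTL routine. The helper names count_code_points and safe_cut are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// True if the element is a high (leading) surrogate, U+D800..U+DBFF.
bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// True if the element is a low (trailing) surrogate, U+DC00..U+DFFF.
bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical helper: counts characters (code points) rather than elements.
// A well-formed surrogate pair is counted once; an unpaired surrogate is
// counted as one character so the loop always makes progress.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_high_surrogate(s[i]) && i + 1 < s.size() &&
            is_low_surrogate(s[i + 1])) {
            ++i;  // skip the trailing half of the pair
        }
        ++count;
    }
    return count;
}

// Hypothetical helper: returns the largest index <= pos at which the string
// can be cut without separating the two halves of a surrogate pair.
std::size_t safe_cut(const std::u16string& s, std::size_t pos) {
    if (pos >= s.size()) return s.size();
    if (pos > 0 && is_low_surrogate(s[pos]) && is_high_surrogate(s[pos - 1])) {
        return pos - 1;  // pos points into the middle of a pair
    }
    return pos;
}
```

For a string made of 'a', U+1F600, and 'b', the element count is 4, while count_code_points reports 3, and safe_cut at index 2 moves the cut back to index 1. Code that indexes or truncates a wide string by element would need a check along these lines wherever non-BMP characters can occur.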

Two types represent wide strings: UnicodeString and WideString.

  • The WideString format is essentially the same as a Windows BSTR. WideString is appropriate for use in COM applications.
  • UnicodeString is reference counted and WideString is not, so UnicodeString is more flexible and efficient in other types of applications. In addition, more utility functions are available for UnicodeString than for WideString, so UnicodeString is generally preferred.

This topic deals with WideString, not UnicodeString. The VCL now uses the UnicodeString type; it no longer represents string values as single-byte or MBCS strings.

The following functions convert between standard single-byte character strings (or MBCS strings) and Unicode strings:

In addition, the following functions translate between WideStrings and other representations:

The following routines work directly with WideStrings:

Finally, some routines include overloads for working with wide strings:
