Wide-character And Multi-character Constants


Wide-character Literals

Wide-character types represent characters that do not fit into the storage space allocated for the char type. In C++Builder, a wide character is stored in two bytes.

A character constant preceded immediately by an L is a wide-character constant of the wchar_t data type (defined in stddef.h). For example:

wchar_t ch = L'A';

In a C program, wchar_t is a type defined in the stddef.h header file. In a C++ program, wchar_t is a keyword that names a type able to represent distinct codes for every element of the largest extended character set in any of the supported locales. In C++Builder, wchar_t has the same size, signedness, and alignment requirement as the unsigned short type.
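
The following minimal sketch checks these properties on whatever compiler builds it. The helper names wide_code, wchar_matches_unsigned_short, and check_wide_char are hypothetical and exist only for illustration. The comparison with unsigned short covers size and signedness only (not alignment), and it is expected to hold in C++Builder as described above; other compilers may report a different result.

```cpp
#include <cassert>
#include <type_traits>

// The L prefix gives the constant the type wchar_t rather than char.
static_assert(std::is_same<decltype(L'A'), wchar_t>::value,
              "L'A' has type wchar_t");

// Hypothetical helper: returns the numeric code held by a wide character.
unsigned long wide_code(wchar_t ch) {
    return static_cast<unsigned long>(ch);
}

// Hypothetical helper: true when wchar_t has the same size and signedness
// as unsigned short on the compiler that builds this file.
bool wchar_matches_unsigned_short() {
    return sizeof(wchar_t) == sizeof(unsigned short) &&
           std::is_unsigned<wchar_t>::value;
}

// Small self-check; assumes an ASCII-compatible wide character set,
// in which 'A' has the code 65.
void check_wide_char() {
    wchar_t ch = L'A';
    assert(wide_code(ch) == 65UL);
}
```

Calling wchar_matches_unsigned_short() from a test program is one way to confirm that code relying on a two-byte wchar_t is being built with a compiler that provides one.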

A string literal preceded immediately by an L is a wide-character string. In C++Builder, it is stored using two bytes per character, including the terminating null wide character. For example:

const wchar_t *str = L"ABCD";
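
As a rough illustration of the storage rule, the sketch below measures a wide string literal in units of sizeof(wchar_t), so it does not hard-code the two-byte size. The names wide_str, wide_length, wide_storage_bytes, and check_wide_string are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// A wide string literal is an array of const wchar_t, so the pointer is const.
const wchar_t *wide_str = L"ABCD";

// Number of wide characters before the terminating null.
std::size_t wide_length() {
    return std::wcslen(wide_str);
}

// Bytes occupied by the literal, including the terminating null wide character.
std::size_t wide_storage_bytes() {
    return sizeof(L"ABCD");
}

// Small self-check: four characters plus one terminator.
void check_wide_string() {
    assert(wide_length() == 4);
    assert(wide_storage_bytes() == 5 * sizeof(wchar_t));
}
```

With a two-byte wchar_t, wide_storage_bytes() evaluates to 10: eight bytes for the four characters and two for the terminator.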

Unicode Literals in C++11

The C++11 standard adds two new character types, char16_t and char32_t, for representing Unicode characters. Both are keywords in C++11. char16_t is a 16-bit character type that holds UTF-16 code units; a character outside the Basic Multilingual Plane occupies two of them. char32_t is a 32-bit character type that holds UTF-32 encoded characters.

You can use the following new formats of UTF-16 and UTF-32 encoded literals:

  • A character constant preceded immediately by u is the UTF-16 encoded Unicode character of the char16_t data type.
  • A character constant preceded immediately by U is the UTF-32 encoded character of the char32_t data type.
  • A string literal preceded immediately by u contains UTF-16 encoded Unicode characters of the char16_t data type.
  • A string literal preceded immediately by U contains UTF-32 encoded characters of the char32_t data type.
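
The four literal forms listed above can be sketched as follows. The variable names and the helper code_units are hypothetical; code_units simply counts the elements of a string literal, excluding the terminating null. The character U+1F600 is used here only as an example of a character outside the Basic Multilingual Plane.

```cpp
#include <cassert>
#include <cstddef>

char16_t c16 = u'A';             // UTF-16 character constant
char32_t c32 = U'A';             // UTF-32 character constant
const char16_t *s16 = u"ABCD";   // UTF-16 string literal
const char32_t *s32 = U"ABCD";   // UTF-32 string literal

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// Small self-check of the literal forms above.
void check_unicode_literals() {
    assert(c16 == 0x41 && c32 == 0x41);
    assert(s16[3] == u'D' && s32[3] == U'D');
    assert(code_units(u"ABCD") == 4);
    assert(code_units(U"ABCD") == 4);
    // U+1F600 needs a surrogate pair in UTF-16 but a single unit in UTF-32.
    assert(code_units(u"\U0001F600") == 2);
    assert(code_units(U"\U0001F600") == 1);
}
```

The last two checks show why the length of a char16_t string is not always the number of characters it contains.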

Multi-character Constants

The compiler also supports multi-character constants, which can consist of as many as four characters. For example, 'ABCD' and '\006\007\010\011' are multi-character constants. In C++Builder, multi-character constants are always 32-bit int values. Their value is implementation-defined, so these constants are not portable to other C++ compilers.
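
Because the value of a multi-character constant depends on the compiler, code that must build elsewhere can pack the characters explicitly. The sketch below is one possible approach, not a C++Builder facility: make_tag and check_tag are hypothetical names, and placing the first character in the most significant byte is an assumption chosen for the example, which may differ from the layout a given compiler uses for 'ABCD'. Some compilers emit a warning for the multi-character constant itself.

```cpp
#include <cassert>
#include <cstdint>

// A multi-character constant has type int; only its value is
// implementation-defined.
static_assert(sizeof('ABCD') == sizeof(int),
              "a multi-character constant is an int");

// Hypothetical portable alternative: pack four characters into 32 bits,
// with the first character in the most significant byte (an assumption
// made for this example).
constexpr std::uint32_t make_tag(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// Small self-check; assumes an ASCII-compatible character set.
void check_tag() {
    assert(make_tag('A', 'B', 'C', 'D') == 0x41424344u);
}
```

Unlike 'ABCD', the result of make_tag is the same on every compiler that uses an ASCII-compatible character set.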
