Wide-character And Multi-character Constants
Wide-character Literals
Wide-character types can be used to represent a character that does not fit into the storage space allocated for a char type. In C++Builder, a wide character is stored in two bytes.
A character constant preceded immediately by an L is a wide-character constant of the wchar_t data type (defined in stddef.h). For example:
wchar_t ch = L'A';
When wchar_t is used in a C program, it is a type defined in the stddef.h header file. In a C++ program, wchar_t is a keyword that can represent distinct codes for any element of the largest extended character set in any of the supported locales. In C++Builder, wchar_t has the same size, signedness, and alignment requirement as the unsigned short type.
A string literal preceded immediately by an L is a wide-character string. Each character of a wide-character string, including the terminating null character, occupies two bytes. For example:
const wchar_t *str = L"ABCD";
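The storage arithmetic above can be checked with a minimal sketch. The helper names wide_storage_bytes and check_wide_literals are hypothetical and used only for illustration. The checks are written relative to sizeof(wchar_t) rather than a fixed two bytes, because the width of wchar_t differs between compilers and platforms.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t
#include <cwchar>   // std::wcslen

// Hypothetical helper: bytes occupied by a wide string,
// including its terminating L'\0'.
std::size_t wide_storage_bytes(const wchar_t *s) {
    return (std::wcslen(s) + 1) * sizeof(wchar_t);
}

// Hypothetical helper: exercises a wide-character constant and a
// wide-character string literal.
void check_wide_literals() {
    wchar_t ch = L'A';
    const wchar_t *str = L"ABCD";

    // The literal holds four characters plus the terminator.
    static_assert(sizeof(L"ABCD") == 5 * sizeof(wchar_t),
                  "wide literal size is per-character size times count");

    assert(ch == str[0]);
    assert(std::wcslen(str) == 4);
    assert(wide_storage_bytes(str) == sizeof(L"ABCD"));
}
```

With a two-byte wchar_t, as described for C++Builder, wide_storage_bytes(L"ABCD") evaluates to 10.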
Unicode Literals in C++11
The C++11 standard adds two new character types, char16_t and char32_t, which can represent Unicode characters. Both are keywords in C++11. char16_t is a 16-bit character type that can be used to represent UTF-16 encoded Unicode characters. char32_t is a 32-bit character type that can be used to represent UTF-32 encoded Unicode characters.
You can use the following new formats of UTF-16 and UTF-32 encoded literals:
- A character constant preceded immediately by u is the UTF-16 encoded Unicode character of the char16_t data type.
- A character constant preceded immediately by U is the UTF-32 encoded character of the char32_t data type.
- A string literal preceded immediately by u contains UTF-16 encoded Unicode characters of the char16_t data type.
- A string literal preceded immediately by U contains UTF-32 encoded characters of the char32_t data type.
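The four literal forms listed above can be sketched as follows. The helper names code_units and check_unicode_literals are hypothetical. The example also shows one practical difference between the two encodings: a character outside the Basic Multilingual Plane (here U+1F600, chosen only for illustration) takes two char16_t code units in a UTF-16 string but a single char32_t code unit in a UTF-32 string.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Character constants with the u and U prefixes.
char16_t c16 = u'A';  // UTF-16 encoded, type char16_t
char32_t c32 = U'A';  // UTF-32 encoded, type char32_t

// String literals with the u and U prefixes.
const char16_t *s16 = u"ABCD";
const char32_t *s32 = U"ABCD";

// Each literal holds four code units plus a terminator.
static_assert(sizeof(u"ABCD") == 5 * sizeof(char16_t), "UTF-16 literal size");
static_assert(sizeof(U"ABCD") == 5 * sizeof(char32_t), "UTF-32 literal size");

// Hypothetical helper: counts code units before the terminating zero.
template <typename CharT>
std::size_t code_units(const CharT *s) {
    std::size_t n = 0;
    while (s[n] != CharT()) {
        ++n;
    }
    return n;
}

// Hypothetical helper: runtime checks on the literals above.
void check_unicode_literals() {
    assert(c16 == s16[0]);
    assert(c32 == s32[0]);
    assert(code_units(s16) == 4);
    assert(code_units(s32) == 4);
    // U+1F600 needs a surrogate pair in UTF-16, one unit in UTF-32.
    assert(code_units(u"\U0001F600") == 2);
    assert(code_units(U"\U0001F600") == 1);
}
```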
Multi-character Constants
The compiler also supports multi-character constants, which can consist of as many as four characters. For example, the constant '\006\007\010\011' is valid only in a C++Builder program. Multi-character constants are always 32-bit int values, and they are not portable to other C++ compilers.
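As a rough illustration, the sketch below contrasts an ordinary character constant with a multi-character constant and shows one portable alternative. The names make_tag, pack4, and check_multichar are hypothetical. The byte order used by pack4 (first character in the most significant byte) is an assumption made for the example and is not necessarily the layout C++Builder uses. Some compilers emit a warning for multi-character constants.

```cpp
#include <cassert>
#include <cstdint>      // std::uint32_t
#include <type_traits>  // std::is_same

// In C++, an ordinary character constant has type char,
// while a multi-character constant has type int.
static_assert(std::is_same<decltype('A'), char>::value,
              "ordinary character constant is char");
static_assert(std::is_same<decltype('AB'), int>::value,
              "multi-character constant is int");

// Hypothetical helper: the value it returns depends on the compiler,
// so code should not rely on it across toolchains.
int make_tag() {
    return 'ABCD';
}

// Hypothetical portable alternative: packs four bytes explicitly.
// Assumption: the first character goes into the most significant byte.
constexpr std::uint32_t pack4(unsigned char a, unsigned char b,
                              unsigned char c, unsigned char d) {
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(b) << 16) |
           (static_cast<std::uint32_t>(c) << 8) |
           static_cast<std::uint32_t>(d);
}

// Hypothetical helper: runtime checks that hold on any compiler.
void check_multichar() {
    assert(sizeof(make_tag()) == sizeof(int));
    assert(pack4(0x06, 0x07, 0x08, 0x09) == 0x06070809u);
}
```

Writing the packing out explicitly keeps the resulting value the same on every compiler, which a multi-character constant cannot guarantee.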