System.SysUtils.CharLength
Delphi
function CharLength(const S: UnicodeString; Index: Integer): Integer;
C++
extern DELPHI_PACKAGE int __fastcall CharLength(const System::UnicodeString S, int Index)/* overload */;
Properties
Type | Visibility | Source | Unit | Parent |
---|---|---|---|---|
function | public | System.SysUtils.pas System.SysUtils.hpp | System.SysUtils | System.SysUtils |
Description
Returns the number of bytes used by a character.
Call CharLength to determine the size in bytes of the character starting at Index in S. If the character does not start at Index, this function returns the size of the remainder of the character, not the full character length.
Note: Index is an element index into S, not a byte or character index.
If S is an AnsiString and the system is not using a multibyte character set (MBCS), CharLength always returns 1.
The following example illustrates CharLength's operation.
type
  SJISString = type AnsiString(932); // AnsiString tagged with code page 932 (Shift-JIS)
var
  A: SJISString;
  L: Integer;
begin
  A := 'A' + 'B' + #$82#$A0 + // Japanese Hiragana 'A'
       #$82#$A2 + // Japanese Hiragana 'I'
       #$82#$A4 + // Japanese Hiragana 'U'
       'C' + 'D';
  L := CharLength(A, 1); // returns 1 ('A')
  L := CharLength(A, 2); // returns 1 ('B')
  L := CharLength(A, 3); // returns 2 (first byte of Hiragana 'A')
  L := CharLength(A, 4); // returns 1 (second byte of Hiragana 'A')
end.
In this example, when the index is 1 or 2, it points to the beginning of a single-byte character, so the function returns 1. When the index is 3, it points to the beginning of a two-byte character, and the function returns 2. When the index is 4, it points to the second half of a two-byte character and returns 1. Note that for this example, the element size is 1. Some characters require two elements, and some only need one element.
CharLength can be used to locate the position of multibyte characters in a string.
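To make the idea of locating multibyte characters concrete, here is a minimal standalone sketch in standard C++. It does not call the RTL: `isSjisLeadByte`, `sjisCharLength`, and `multibytePositions` are hypothetical helpers that only imitate the behavior described above for a code page 932 string, assuming well-formed input and the usual Shift-JIS lead-byte ranges (0x81 to 0x9F and 0xE0 to 0xFC).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the RTL: true if b is the first byte
// of a two-byte Shift-JIS (code page 932) character.
bool isSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical stand-in for CharLength on a code page 932 byte string.
// index is 1-based, like Delphi string indexes.
// Assumes s is well formed and 1 <= index <= s.size().
int sjisCharLength(const std::string& s, std::size_t index) {
    std::size_t pos = 1;
    // Scan from the start so that a trail byte is never mistaken for a lead byte.
    while (pos <= s.size()) {
        const bool twoByte =
            isSjisLeadByte(static_cast<unsigned char>(s[pos - 1])) && pos < s.size();
        if (pos == index) return twoByte ? 2 : 1;   // index is at a character start
        if (twoByte && pos + 1 == index) return 1;  // index is at a trail byte: remainder only
        pos += twoByte ? 2 : 1;
    }
    return 1;
}

// Returns the 1-based start position of every two-byte character in s.
std::vector<std::size_t> multibytePositions(const std::string& s) {
    std::vector<std::size_t> result;
    std::size_t pos = 1;
    while (pos <= s.size()) {
        const int len = sjisCharLength(s, pos);
        if (len > 1) result.push_back(pos);
        pos += static_cast<std::size_t>(len);
    }
    return result;
}
```

For the ten-byte string used in the example above, this sketch reports two-byte characters starting at positions 3, 5, and 7. Rescanning from the start on every call keeps the helper simple but makes the walk quadratic; a single forward pass would be preferable for long strings.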
This function also works for Unicode characters:
var
  U: UnicodeString;
  L: Integer;
begin
  U := 'abc';
  L := SysUtils.CharLength(U, 1); // returns 2
  L := SysUtils.CharLength(U, 2); // returns 2
  U := #$20BB7; // surrogate pair
  L := SysUtils.CharLength(U, 1); // returns 4 (first element of the pair)
  L := SysUtils.CharLength(U, 2); // returns 2 (second element of the pair)
end.
Note that the element size is 2 in this example, and the surrogate pair character consists of two elements.
When the string is 'abc', each character is a single two-byte element, so the function returns 2. String literals are Unicode by default.
For the surrogate pair, when the index is 1, it points to the first element in the character, so the function returns 4. When the index is 2, it points to the second element in the character and returns 2.
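The surrogate-pair rule described above can be sketched in standard C++ as follows. This is an illustration only, not the RTL implementation: `utf16CharLength` and `countCharacters` are hypothetical names, and the sketch assumes a UTF-16 string with a valid 1-based index. The behavior for an unpaired high surrogate (returning 2) is an assumption of this sketch, not something the description above specifies.

```cpp
#include <cstddef>
#include <string>

// UTF-16 surrogate ranges as defined by the Unicode Standard.
bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
bool isLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical stand-in for CharLength on a UTF-16 string.
// index is 1-based, like Delphi string indexes; assumes 1 <= index <= s.size().
// Returns a size in bytes: 4 at the first element of a surrogate pair, else 2.
int utf16CharLength(const std::u16string& s, std::size_t index) {
    const std::size_t i = index - 1;
    if (isHighSurrogate(s[i]) && i + 1 < s.size() && isLowSurrogate(s[i + 1]))
        return static_cast<int>(2 * sizeof(char16_t));
    return static_cast<int>(sizeof(char16_t));
}

// Counts characters by advancing one whole character at a time.
// The byte count is divided by the element size to get an element offset.
std::size_t countCharacters(const std::u16string& s) {
    std::size_t count = 0;
    std::size_t index = 1;
    while (index <= s.size()) {
        index += static_cast<std::size_t>(utf16CharLength(s, index)) / sizeof(char16_t);
        ++count;
    }
    return count;
}
```

Because the result is a byte count while Index is an element index, the loop divides by the element size before advancing. The same conversion applies when stepping through a UnicodeString with CharLength itself.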