System.SysUtils.CharLength

From RAD Studio API Documentation

Delphi

function CharLength(const S: UnicodeString; Index: Integer): Integer;

C++

extern DELPHI_PACKAGE int __fastcall CharLength(const System::UnicodeString S, int Index)/* overload */;

Properties

Type: function
Visibility: public
Source: System.SysUtils.pas, System.SysUtils.hpp
Unit: System.SysUtils
Parent: System.SysUtils

Description

Returns the number of bytes used by a character.

Call CharLength to determine the size, in bytes, of the character that starts at Index in S. If Index points into the middle of a character rather than to its start, the function returns the size of the remainder of the character, not the full character length.

Note: Index is an element index into S, not a byte or character index.

If S is an AnsiString and the system is not using a multibyte character set (MBCS), CharLength always returns 1.

The following example illustrates CharLength's operation.

type
  SJISString = type AnsiString(932);

var
  A: SJISString;
  L: Integer;

begin
  A := 'A' + 'B' + #$82#$A0 + // Japanese Hiragana 'A'
    #$82#$A2 + // Japanese Hiragana 'I'
    #$82#$A4 + // Japanese Hiragana 'U'
    'C' + 'D';

  L := CharLength(A, 1); // returns 1 ('A')
  L := CharLength(A, 2); // returns 1 ('B')
  L := CharLength(A, 3); // returns 2 (first byte of Hiragana 'A')
  L := CharLength(A, 4); // returns 1 (second byte of Hiragana 'A')

end.

In this example, indexes 1 and 2 each point to the beginning of a single-byte character, so the function returns 1. Index 3 points to the beginning of a two-byte character, so the function returns 2. Index 4 points to the second byte of that two-byte character, so the function returns 1, the size of the remainder. The element size in this example is 1 byte: some characters occupy two elements, and others occupy only one.

CharLength can be used to locate the position of multibyte characters in a string.
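The idea of locating characters by stepping through a string can be sketched in standard C++. This is only a model of the behavior documented for the code page 932 example, not the RTL implementation: IsSjisLeadByte, ModelCharLength, and CharStarts are hypothetical helpers, the lead-byte ranges are those of code page 932, and returning 1 for a lead byte at the very end of the string is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if b is a lead byte in code page 932
// (Shift-JIS), that is, 0x81..0x9F or 0xE0..0xFC.
inline bool IsSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical model of the documented result for a string whose
// elements are single bytes. index is 1-based, as in the Delphi example.
inline int ModelCharLength(const std::string& s, std::size_t index) {
    assert(index >= 1 && index <= s.size());
    const unsigned char b = static_cast<unsigned char>(s[index - 1]);
    // A lead byte with at least one byte after it starts a two-byte character.
    // Assumption: a lead byte that is the last byte counts as 1.
    if (IsSjisLeadByte(b) && index < s.size()) {
        return 2;
    }
    return 1;
}

// Walks the string from the first element, advancing by the length of
// each character, and collects the 1-based index where each one starts.
inline std::vector<std::size_t> CharStarts(const std::string& s) {
    std::vector<std::size_t> starts;
    std::size_t i = 1;
    while (i <= s.size()) {
        starts.push_back(i);
        i += static_cast<std::size_t>(ModelCharLength(s, i));
    }
    return starts;
}
```

For the ten-byte string from the example above, CharStarts yields the indexes 1, 2, 3, 5, 7, 9, and 10. The walk has to begin at a known character boundary, because a call made in the middle of a character only reports the remainder.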

This function also works for Unicode characters:

var
  U: UnicodeString;
  L: Integer;

begin
  U := 'abc';
  L := SysUtils.CharLength(U, 1); // returns 2
  L := SysUtils.CharLength(U, 2); // returns 2

  U := #$20BB7; // surrogate pair
  L := SysUtils.CharLength(U, 1); // returns 4
  L := SysUtils.CharLength(U, 2); // returns 2

end.

Note that the element size is 2 in this example, and the surrogate pair character consists of two elements.

When the string is 'abc', each character is a single two-byte element, so the function returns 2 for every index. String literals are Unicode by default.

For the surrogate pair, when the index is 1, it points to the first element in the character, so the function returns 4. When the index is 2, it points to the second element in the character and returns 2.
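The surrogate-pair rule described above can be modeled in standard C++ over UTF-16 elements. This is a sketch under stated assumptions, not the RTL code: ModelCharLength and CharacterCount are hypothetical names, the element size is taken to be 2 bytes, and an unpaired high surrogate is assumed to count as a single element.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr int kElementSize = 2;  // assumed bytes per UTF-16 element

// Hypothetical model: size in bytes of the character, or of the remainder
// of a character, at a 1-based element index into a UTF-16 string.
inline int ModelCharLength(const std::u16string& s, std::size_t index) {
    assert(index >= 1 && index <= s.size());
    const char16_t unit = s[index - 1];
    const bool isHighSurrogate = unit >= 0xD800 && unit <= 0xDBFF;
    if (isHighSurrogate && index < s.size()) {
        const char16_t next = s[index];
        if (next >= 0xDC00 && next <= 0xDFFF) {
            return 2 * kElementSize;  // full surrogate pair: two elements
        }
    }
    // BMP character, low surrogate (second half), or unpaired high surrogate.
    return kElementSize;
}

// Counts characters by advancing one character length at a time,
// starting from the first element.
inline std::size_t CharacterCount(const std::u16string& s) {
    std::size_t count = 0;
    std::size_t i = 1;
    while (i <= s.size()) {
        i += static_cast<std::size_t>(ModelCharLength(s, i) / kElementSize);
        ++count;
    }
    return count;
}
```

With this model, the string u"\U00020BB7" has two elements; index 1 gives 4 and index 2 gives 2, matching the values in the Delphi example. CharacterCount shows why the element count and the character count can differ for the same string.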

See Also