ANSI Implementation-specific Standards


Certain aspects of the ANSI C standard are not defined exactly by ANSI. Instead, each implementer of a C compiler is free to define these aspects individually. This topic describes how Embarcadero has chosen to define these implementation-specific standards. The section numbers refer to the February 1990 C ANSI/ISO Standard. Remember that there are differences between C and C++; this topic addresses C only.

How to identify a diagnostic

When the compiler runs with the correct combination of options, any messages it issues beginning with the words Fatal, Error, or Warning are diagnostics in the sense that ANSI specifies. The options needed to ensure this interpretation are as follows:

Option   Action
–A       Enable only ANSI keywords.
–C–      No nested comments allowed.
–i32     At most 32 significant characters in identifiers.
–p–      Use C calling conventions.
–w–      Turn off all warnings.
–wbei    Turn on warning about inappropriate initializers.
–wbig    Turn on warning about constants being too large.
–wcpt    Turn on warning about nonportable pointer comparisons.
–wdcl    Turn on warning about declarations without type or storage class.
–wdup    Turn on warning about duplicate nonidentical macro definitions.
–wext    Turn on warning about variables declared both as external and as static.
–wfdt    Turn on warning about function definitions using a typedef.
–wrpt    Turn on warning about nonportable pointer conversion.
–wstu    Turn on warning about undefined structures.
–wsus    Turn on warning about suspicious pointer conversion.
–wucp    Turn on warning about mixing pointers to signed and unsigned char.
–wvrt    Turn on warning about void functions returning a value.

Other options not specifically mentioned here can be set to whatever you want.


The semantics of the arguments to main

When the program is run on DOS, argv[0] points to the program name. The remaining argv strings point to each component of the DOS command-line arguments. Whitespace separating arguments is removed, and each sequence of contiguous non-whitespace characters is treated as a single argument. Quoted strings are handled correctly (that is, as one string containing spaces).
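The argument handling described above can be observed with a small helper that lists each argv entry. This is only a sketch; the name dump_args is hypothetical and not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: prints every argv entry and returns the count. */
int dump_args(int argc, char *argv[])
{
    int i;
    for (i = 0; i < argc; i++)
        printf("argv[%d] = \"%s\"\n", i, argv[i]);
    return argc;
}
```

Calling dump_args(argc, argv) from main makes it easy to check that a quoted argument containing spaces arrives as a single argv string.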

What constitutes an interactive device

An interactive device is any device that looks like the console.

The collation sequence of the execution character set

The collation sequence for the execution character set uses the value of the character in ASCII.
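One visible consequence of collating by ASCII value is that every uppercase letter sorts before every lowercase letter. A minimal sketch, assuming an ASCII execution character set (collates_before is a hypothetical name):

```c
#include <string.h>

/* Returns 1 if a sorts before b when compared by character value. */
int collates_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

With ASCII values, collates_before("Zebra", "apple") is true because 'Z' (90) is less than 'a' (97).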

Members of the source and execution character sets

The source and execution character sets are the extended ASCII set supported by the IBM PC. Any character other than Ctrl+Z can appear in string literals, character constants, or comments.

Multibyte characters

Multibyte characters are supported in C++Builder.

The direction of printing

Printing is from left-to-right, the normal direction for the PC.

The number of bits in a character in the execution character set

There are 8 bits per character in the execution character set.

The number of significant initial characters in identifiers

The first 32 characters are significant, although you can use a command-line option (–i) to change that number. Both internal and external identifiers use the same number of significant characters. (The number of significant characters in C++ identifiers is unlimited.)

Whether case distinctions are significant in external identifiers

The compiler normally forces the linker to distinguish between uppercase and lowercase. You can use a command-line option (–lc–) to turn off case sensitivity.

The representations and sets of values of the various types of integers

Type             Minimum value     Maximum value
signed char      –128              127
unsigned char    0                 255
signed short     –32,768           32,767
unsigned short   0                 65,535
signed int       –2,147,483,648    2,147,483,647
unsigned int     0                 4,294,967,295
signed long      –2,147,483,648    2,147,483,647
unsigned long    0                 4,294,967,295

All char types use one 8-bit byte for storage. All short types use 2 bytes. All int types use 4 bytes. All long types use 4 bytes. If alignment is requested (–a), all nonchar integer type objects will be aligned to even byte boundaries. If the requested alignment is –a4, the result is 4-byte alignment. Character types are never aligned.
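Rather than hard-coding the limits listed above, portable code can test values against the macros in limits.h. The helpers below are a hedged sketch with hypothetical names; they report whether a value fits the range of a narrower type on whatever compiler builds them.

```c
#include <limits.h>

/* Returns 1 if v can be stored in a signed short without loss. */
int fits_in_short(long v)
{
    return v >= SHRT_MIN && v <= SHRT_MAX;
}

/* Returns 1 if v can be stored in a signed char without loss. */
int fits_in_signed_char(long v)
{
    return v >= SCHAR_MIN && v <= SCHAR_MAX;
}
```

With the ranges given in this topic, fits_in_short(32767L) is true and fits_in_short(32768L) is false.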

The representations and sets of values of the various types of floating-point numbers

The IEEE floating-point formats as used by the Intel 8086 are used for all C++Builder floating-point types. The float type uses 32-bit IEEE real format. The double type uses 64-bit IEEE real format. The long double type uses 80-bit IEEE extended real format.
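The formats in use can be checked through the macros in float.h. The following sketch (function names are hypothetical) returns the significand width of each type, counted in base-FLT_RADIX digits, which are bits when FLT_RADIX is 2.

```c
#include <float.h>

/* Significand width in base-FLT_RADIX digits, including the leading digit. */
int float_significand_digits(void)       { return FLT_MANT_DIG; }
int double_significand_digits(void)      { return DBL_MANT_DIG; }
int long_double_significand_digits(void) { return LDBL_MANT_DIG; }
```

For the formats named above, the expected widths are 24 for the 32-bit format, 53 for the 64-bit format, and 64 for the 80-bit extended format. Other compilers may use a different long double format.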

The mapping between source and execution character sets

Any characters in string literals or character constants remain unchanged in the executing program. The source and execution character sets are the same.

The value of an integer character constant that contains a character or escape sequence not represented in the basic execution character set or the extended character set for a wide character constant

Wide characters are supported.

The current locale used to convert multibyte characters into corresponding wide characters for a wide character constant

Wide character constants are recognized.

The value of an integer constant that contains more than one character, or a wide character constant that contains more than one multibyte character

Character constants can contain one or two characters. If two characters are included, the first character occupies the low-order byte of the constant, and the second character occupies the high-order byte.
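The byte layout described above can be written out as plain arithmetic. This sketch computes the value the text describes without relying on how any particular compiler treats a two-character constant; two_char_value is a hypothetical name.

```c
/* Value of a two-character constant under the rule described above:
   first character in the low-order byte, second in the high-order byte. */
unsigned int two_char_value(unsigned char first, unsigned char second)
{
    return (unsigned int)first | ((unsigned int)second << 8);
}
```

For example, with ASCII values 'A' = 0x41 and 'B' = 0x42, two_char_value('A', 'B') is 0x4241. Other compilers may order the bytes differently, so code that depends on this layout is not portable.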

The result of converting an integer to a shorter signed integer, or the result of converting an unsigned integer to a signed integer of equal length, if the value cannot be represented

These conversions are performed by simply truncating the high-order bits. Signed integers are stored as two’s complement values, so the resulting number is interpreted as such a value. If the high-order bit of the smaller integer is nonzero, the value is interpreted as a negative value; otherwise, it is positive.
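The truncation rule can be emulated explicitly, which is useful when the same result is needed on compilers that behave differently. A minimal sketch for an 8-bit target, assuming two's complement representation (the function name is hypothetical):

```c
/* Emulates converting v to an 8-bit signed value by truncation:
   keep the low 8 bits, then read them as a two's complement number. */
int truncate_to_signed_8(long v)
{
    unsigned long low = (unsigned long)v & 0xFFUL;
    return low >= 0x80UL ? (int)low - 256 : (int)low;
}
```

For example, 300 is 0x12C, so the low byte is 0x2C and the result is 44; 255 is 0xFF, whose high-order bit is set, so the result is –1.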

The direction of truncation when an integral number is converted to a floating-point number that cannot exactly represent the original value

The integer value is rounded to the nearest representable value. Thus, for example, the long value (2^31 –1) is converted to the float value 2^31. Ties are broken according to the rules of IEEE standard arithmetic.
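The rounding in the example above can be reproduced directly. This sketch assumes a 32-bit IEEE float with a 24-bit significand; long_to_float is a hypothetical name, and the volatile store is there only to force the value to float precision.

```c
/* Converts v to float; the volatile store forces rounding to float precision. */
float long_to_float(long v)
{
    volatile float f = (float)v;
    return f;
}
```

Under these assumptions long_to_float(2147483647L) compares equal to 2147483648.0f, because 2^31 – 1 needs 31 significant bits and the nearest representable float is 2^31.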

The direction of truncation or rounding when a floating-point number is converted to a narrower floating-point number

The value is rounded to the nearest representable value. Ties are broken according to the rules of IEEE standard arithmetic.

The results of bitwise operations on signed integers

The bitwise operators apply to signed integers as if they were their corresponding unsigned types. The sign bit is treated as a normal data bit. The result is then interpreted as a normal two’s complement signed integer.

What happens when a member of a union object is accessed using a member of a different type

The access is allowed and the different type member will access the bits stored there. You’ll need a detailed understanding of the bit encodings of floating-point values to understand how to access a floating-type member using a different member. If the member stored is shorter than the member used to access the value, the excess bits have the value they had before the short member was stored.
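As an illustration of reading a floating-point member through a member of a different type, the sketch below stores a float and reads the same bits back as an integer. It assumes that float and unsigned int are both 32 bits wide and that float uses the IEEE format; the type and function names are hypothetical.

```c
/* Assumes float and unsigned int are both 32 bits wide. */
union float_view {
    float f;
    unsigned int bits;
};

/* Stores a float, then reads the same storage back as an integer. */
unsigned int float_bits(float value)
{
    union float_view v;
    v.f = value;
    return v.bits;
}
```

Under these assumptions float_bits(1.0f) is 0x3F800000: a zero sign bit, a biased exponent of 127, and a zero fraction.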

The type of integer required to hold the maximum size of an array

For a normal array, the type is unsigned int, and for huge arrays the type is signed long.

The result of casting a pointer to an integer or vice versa

When converting between integers and pointers of the same size, no bits are changed. When converting from a longer type to a shorter type, the high-order bits are truncated. When converting from a shorter integer type to a longer pointer type, the integer is first widened to an integer type the same size as the pointer type. Thus signed integers will sign-extend to fill the new bytes. Similarly, smaller pointer types being converted to larger integer types will first be widened to a pointer type as wide as the integer type.

The sign of the remainder on integer division

The sign of the remainder is negative when only one of the operands is negative. If neither or both operands are negative, the remainder is positive.

The type of integer required to hold the difference between two pointers to elements of the same array, ptrdiff_t

The type is signed int.

The result of a right shift of a negative signed integral type

A negative signed value is sign extended when right shifted.
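Sign extension means that a right shift of a negative value keeps the result negative, much like a division by a power of two that rounds toward negative infinity. A minimal sketch (shift_right is a hypothetical name); the C standard leaves this result implementation-defined, so other compilers may differ.

```c
/* Right shift of a signed value; the result for negative v is
   implementation-defined in C. */
int shift_right(int v, int count)
{
    return v >> count;
}
```

With sign extension, shift_right(-8, 1) is –4 and shift_right(-1, 4) is still –1.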

The extent to which objects can actually be placed in registers by using the register storage-class specifier

Objects declared with any one, two, or four-byte integer or pointer types can be placed in registers. At least two and as many as seven registers are available. The number of registers actually used depends on what registers are needed for temporary values in the function.

Whether a plain int bit-field is treated as a signed int or as an unsigned int bit field

Plain int bit fields are treated as signed int bit fields.
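A practical consequence of signed bit fields is that a one-bit plain int field can hold only 0 and –1, not 0 and 1. The sketch below shows this under the assumption that plain int bit fields are signed, as stated above; the struct and function names are hypothetical.

```c
struct flags {
    int bit : 1;   /* plain int bit field, one bit wide */
};

/* Stores value in a 1-bit plain int field and reads it back. */
int round_trip_one_bit(int value)
{
    struct flags f;
    f.bit = value;
    return f.bit;
}
```

When the field is signed, round_trip_one_bit(1) yields –1, so comparisons such as f.bit == 1 never succeed. Declaring such fields as unsigned int avoids the surprise.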

The order of allocation of bit fields within an int

Bit fields are allocated from the low-order bit position to the high-order.

The padding and alignment of members of structures

By default, no padding is used in structures. If you use the word alignment option (–a), structures are padded to even size, and any members that do not have character or character array type are aligned to an even multiple offset.

Whether a bit-field can straddle a storage-unit boundary

Depending upon the overall alignment and the type alignment, a bit field is allowed to straddle a storage unit.

The integer type chosen to represent the values of an enumeration type

By default, as specified by the –b compiler option, all enumerations are stored as full ints. An enumeration is stored in a long or unsigned long if its values don't fit into an int. With –b–, enumerations are stored in the smallest integer type that can represent the values. This includes all integral types: signed char, unsigned char, signed short, unsigned short, signed int, unsigned int, signed long, and unsigned long.

For C++ compliance, –b– must be specified because it is not correct to store all enumerations as ints for C++.

What constitutes an access to an object that has volatile-qualified type

Any reference to a volatile object will access the object. Whether accessing adjacent memory locations will also access an object depends on how the memory is constructed in the hardware. For special device memory, such as video display memory, it depends on how the device is constructed. For normal PC memory, volatile objects are used only for memory that might be accessed by asynchronous interrupts, so accessing adjacent objects has no effect.

The maximum number of declarators that can modify an arithmetic, structure, or union type

There is no specific limit on the number of declarators. The number of declarators allowed is fairly large, but when nested deeply within a set of blocks in a function, the number of declarators will be reduced. The number allowed at file level is at least 50.

The maximum number of case values in a switch statement

There is no specific limit on the number of cases in a switch. As long as there is enough memory to hold the case information, the compiler will accept them.

Whether the value of a single-character character constant in a constant expression that controls conditional inclusion matches the value of the same character constant in the execution character set. Whether such a character constant can have a negative value

All character constants, even constants in conditional directives, use the same character set (execution). Single-character character constants will be negative if the character type is signed (default and –K not requested).
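Whether a single-character constant can be negative follows from the signedness of plain char, which can be checked at run time. A hedged sketch with hypothetical function names:

```c
#include <limits.h>

/* Returns 1 if the constant '\xFF' has a negative value. */
int high_char_constant_is_negative(void)
{
    return '\xFF' < 0;
}

/* Returns 1 if plain char is a signed type on this compiler. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

The two functions are expected to agree: when plain char is signed, '\xFF' has the value –1; when it is unsigned, the value is 255.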

The method for locating includable source files

For include file names in angle brackets, if include directories are given in the command line, then the file is searched for in each of the include directories. Include directories are searched in this order:

  1. In directories specified on the command line.
  2. In directories specified in the configuration file of the command-line interface of the compiler. For example, in BCC32.CFG for BCC32.EXE.
  3. If no include directories are specified, search only the current directory.

The support for quoted names for includable source files

For quoted includable file names, the file is searched in the following order:

  1. In the same directory of the file that contains the #include statement.
  2. In the directories of files that include (#include) that file.
  3. The current directory.
  4. Along the path specified by the /I compiler option.
  5. Along paths specified by the INCLUDE environment variable.

The mapping of source file name character sequences

Backslashes in include file names are treated as distinct characters, not as escape characters. Case differences are ignored for letters.

The definitions for __DATE__ and __TIME__ when they are unavailable

The date and time are always available and will use the operating system date and time.

The decimal point character

The decimal point character is a period (.).

The type of the sizeof operator, size_t

The type size_t is unsigned.

The null pointer constant to which the macro NULL expands

NULL expands to an int zero or a long zero. Both are 32-bit signed numbers.

The diagnostic printed by and the termination behavior of the assert function

The diagnostic message printed is “Assertion failed: expression, file filename, line nn”, where expression is the asserted expression that failed, filename is the source file name, and nn is the line number where the assertion took place.

The abort function is called immediately after the assertion message is displayed.
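A typical use of assert is to document a precondition. The following is a minimal sketch; checked_divide is a hypothetical function, not a library routine.

```c
#include <assert.h>

/* Divides a by b; the assertion documents that b must be nonzero. */
int checked_divide(int a, int b)
{
    assert(b != 0);
    return a / b;
}
```

If checked_divide were called with b equal to zero, the assertion would fail with a message naming the expression b != 0, the source file, and the line number, and the program would then terminate. Defining NDEBUG before including assert.h disables the check.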

The implementation-defined aspects of character testing and case-mapping functions

None, other than what is mentioned in the section below.

The sets of characters tested for by the isalnum, isalpha, iscntrl, islower, isprint and isupper functions

First 128 ASCII characters for the default C locale. Otherwise, all 256 characters.

The values returned by the mathematics functions on domain errors

An IEEE NAN (not a number).

Whether the mathematics functions set the integer expression errno to the value of the macro ERANGE on underflow range errors

No, only for the other errors—domain, singularity, overflow, and total loss of precision.

Whether a domain error occurs or zero is returned when the fmod function has a second argument of zero

No domain error occurs; fmod(x,0) returns 0.

The set of signals for the signal function

SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, and SIGTERM.

The semantics for each signal recognized by the signal function

See the description of signal.

The default handling and the handling at program startup for each signal recognized by the signal function

See the description of signal.

If the equivalent of signal(sig, SIG_DFL); is not executed prior to the call of a signal handler, the blocking of the signal that is performed

The equivalent of signal (sig, SIG_DFL) is always executed.
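Because the default handling is restored before the handler runs, a handler that should stay installed has to re-register itself. The sketch below shows that pattern with SIGTERM; all names are hypothetical. On systems that do not reset the handler, the extra call to signal is harmless.

```c
#include <signal.h>

static volatile sig_atomic_t handled_count = 0;

/* Handler: re-installs itself first, because the disposition has
   already been reset to SIG_DFL by the time it is called. */
static void on_sigterm(int sig)
{
    signal(sig, on_sigterm);
    handled_count++;
}

/* Raises SIGTERM twice and returns how many times the handler ran. */
int raise_twice(void)
{
    handled_count = 0;
    if (signal(SIGTERM, on_sigterm) == SIG_ERR)
        return -1;
    raise(SIGTERM);
    raise(SIGTERM);
    signal(SIGTERM, SIG_DFL);   /* restore default handling */
    return (int)handled_count;
}
```

Without the call to signal inside the handler, the second raise would be processed by the default action, which for SIGTERM normally terminates the program.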

Whether the default handling is reset if the SIGILL signal is received by a handler specified to the signal function

No, it is not.

Whether the last line of a text stream requires a terminating newline character

No, none is required.

Whether space characters that are written out to a text stream immediately before a newline character appear when read in

Yes, they do.

The number of null characters that may be appended to data written to a binary stream

None.

Whether the file position indicator of an append mode stream is initially positioned at the beginning or end of the file

The file position indicator of an append-mode stream is initially placed at the beginning of the file. It is reset to the end of the file before each write.
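The effect of append mode is easiest to see by writing to the same file twice and checking where each write ends. A minimal sketch, with a hypothetical function name and a caller-supplied path:

```c
#include <stdio.h>

/* Appends text to the file at path and returns the file position
   after the write, or -1 on failure. */
long append_and_tell(const char *path, const char *text)
{
    FILE *f = fopen(path, "a");
    long pos;
    if (f == NULL)
        return -1L;
    if (fputs(text, f) == EOF || fflush(f) == EOF) {
        fclose(f);
        return -1L;
    }
    pos = ftell(f);
    fclose(f);
    return pos;
}
```

Each call adds to the end of the existing contents regardless of where the position indicator started, so the returned position grows by the length of the text written, provided the text contains no newline characters that a text stream might translate.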

Whether a write on a text stream causes the associated file to be truncated beyond that point

A write of 0 bytes might or might not truncate the file, depending on how the file is buffered. It is safest to classify a zero-length write as having indeterminate behavior.

The characteristics of file buffering

Files can be fully buffered, line buffered, or unbuffered. If a file is buffered, a default buffer of 512 bytes is created upon opening the file.

Whether a zero-length file actually exists

Yes, it does.

Whether the same file can be open multiple times

Yes, it can.

The effect of the remove function on an open file

No special checking for an already open file is performed; the responsibility is left up to the programmer.

The effect if a file with the new name exists prior to a call to rename

The rename function returns –1 and sets errno to EEXIST.

The output for %p conversion in fprintf

The output is eight hex digits (XXXXXXXX), zero padded, uppercase letters (the same as %08lX).
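The equivalent %08lX format can be used to produce the same layout from an integer value. The sketch below is only an illustration; format_like_p is a hypothetical name, and the buffer should have room for at least 17 characters in case unsigned long is wider than 32 bits.

```c
#include <stdio.h>

/* Formats value as zero-padded uppercase hex, at least eight digits.
   Returns the number of characters written, not counting the '\0'. */
int format_like_p(unsigned long value, char *buf)
{
    return sprintf(buf, "%08lX", value);
}
```

For example, the value 0x1A2B is written as 00001A2B.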

The input for %p conversion in fscanf

See the section above.

The interpretation of a - (hyphen) character that is neither the first nor the last character in the scanlist for a %[ conversion in fscanf

See the description of scanf.

The value the macro errno is set to by the fgetpos or ftell function on failure

EBADF Bad file number.

The messages generated by perror

Messages generated in Win32

Arg list too big

Attempted to remove current directory

Bad address

Bad file number

Block device required

Broken pipe

Cross-device link

Error 0

Exec format error

Executable file in use

File already exists

File too large

Illegal seek

Inappropriate I/O control operation

Input/output error

Interrupted function call

Invalid access code

Invalid argument

Invalid data

Invalid environment

Invalid format

Invalid function number

Invalid memory block address

Is a directory

Math argument

Memory arena trashed

Name too long

No child processes

No more files

No space left on device

No such device

No such device or address

No such file or directory

No such process

Not a directory

Not enough memory

Not same device

Operation not permitted

Path not found

Permission denied

Possible deadlock

Read-only file system

Resource busy

Resource temporarily unavailable

Result too large

Too many links

Too many open files

The behavior of calloc, malloc, or realloc if the size requested is zero

calloc and malloc will ignore the request and return 0. realloc will free the block.
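Since other compilers may return a non-null pointer for a zero-size request, code that relies on the behavior above can make it explicit. A minimal sketch with a hypothetical wrapper name:

```c
#include <stdlib.h>

/* Hypothetical wrapper: returns NULL for a zero-size request on any
   compiler, matching the behavior described above for malloc. */
void *alloc_or_null(size_t size)
{
    if (size == 0)
        return NULL;
    return malloc(size);
}
```

Callers then treat NULL from a zero-size request as "nothing allocated" rather than as an out-of-memory error, and pass any non-null result to free as usual.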

The behavior of the abort function with regard to open and temporary files

The file buffers are not flushed and the files are not closed.

The status returned by exit if the value of the argument is other than zero, EXIT_SUCCESS, or EXIT_FAILURE

Nothing special. The status is returned exactly as it is passed. The status is represented as a signed char.

The set of environment names and the method for altering the environment list used by getenv

The environment strings are those defined in the operating system with the SET command. putenv can be used to change the strings for the duration of the current program, but the SET command must be used to change an environment string permanently.

The contents and mode of execution of the string by the system function

The string is interpreted as an operating system command. COMSPEC or CMD.EXE is used, and the argument string is passed as a command to execute. Any operating system built-in command, as well as batch files and executable programs, can be executed.

The contents of the error message strings returned by strerror

See the list of messages under "The messages generated by perror" above.

The local time zone and Daylight Saving Time

Defined as local PC time and date.

The era for clock

Represented as clock ticks, with the origin being the beginning of the program execution.
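Because clock counts ticks from program start, elapsed processor time is obtained by subtracting two readings and dividing by CLOCKS_PER_SEC. A minimal sketch (ticks_to_seconds is a hypothetical name):

```c
#include <time.h>

/* Converts the interval between two clock() readings to seconds. */
double ticks_to_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A typical use is to call clock() before and after a section of code and pass both readings to ticks_to_seconds.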

The formats for date and time