ANSI Implementation-specific Standards
Certain aspects of the ANSI C standard are not defined exactly by ANSI. Instead, each implementer of a C compiler is free to define these aspects individually. This topic describes how Embarcadero has chosen to define these implementation-specific standards. The section numbers refer to the February 1990 C ANSI/ISO Standard. Remember that there are differences between C and C++; this topic addresses C only.
How to identify a diagnostic
When the compiler runs with the correct combination of options, any messages it issues beginning with the words Fatal, Error, or Warning are diagnostics in the sense that ANSI specifies. The options needed to ensure this interpretation are as follows:
Option | Action |
---|---|
-A | Enable only ANSI keywords. |
-C- | No nested comments allowed. |
-i32 | At most 32 significant characters in identifiers. |
-p- | Use C calling conventions. |
-w- | Turn off all warnings. |
-wbei | Turn on warning about inappropriate initializers. |
-wbig | Turn on warning about constants being too large. |
-wcpt | Turn on warning about nonportable pointer comparisons. |
-wdcl | Turn on warning about declarations without type or storage class. |
-wdup | Turn on warning about duplicate nonidentical macro definitions. |
-wext | Turn on warning about variables declared both as external and as static. |
-wfdt | Turn on warning about function definitions using a typedef. |
-wrpt | Turn on warning about nonportable pointer conversion. |
-wstu | Turn on warning about undefined structures. |
-wsus | Turn on warning about suspicious pointer conversion. |
-wucp | Turn on warning about mixing pointers to signed and unsigned char. |
-wvrt | Turn on warning about void functions returning a value. |
Other options not specifically mentioned here can be set to whatever you want.
The semantics of the arguments to main
When the program is run on DOS, argv[0] points to the program name. The remaining argv strings point to each component of the DOS command-line arguments. Whitespace separating arguments is removed, and each sequence of contiguous non-whitespace characters is treated as a single argument. Quoted strings are handled correctly (that is, as one string containing spaces).
What constitutes an interactive device
An interactive device is any device that looks like the console.
The collation sequence of the execution character set
The collation sequence for the execution character set uses the value of the character in ASCII.
Members of the source and execution character sets
The source and execution character sets are the extended ASCII set supported by the IBM PC. Any character other than Ctrl+Z can appear in string literals, character constants, or comments.
Multibyte characters
Multibyte characters are supported in C++Builder.
The direction of printing
Printing is from left to right, the normal direction for the PC.
The number of bits in a character in the execution character set
There are 8 bits per character in the execution character set.
The number of significant initial characters in identifiers
The first 32 characters are significant, although you can use a command-line option (-i) to change that number. Both internal and external identifiers use the same number of significant characters. (The number of significant characters in C++ identifiers is unlimited.)
Whether case distinctions are significant in external identifiers
The compiler normally forces the linker to distinguish between uppercase and lowercase. You can use a command-line option (-lc-) to turn off case sensitivity.
The representations and sets of values of the various types of integers
Type | Minimum value | Maximum value |
---|---|---|
signed char | -128 | 127 |
unsigned char | 0 | 255 |
signed short | -32,768 | 32,767 |
unsigned short | 0 | 65,535 |
signed int | -2,147,483,648 | 2,147,483,647 |
unsigned int | 0 | 4,294,967,295 |
signed long | -2,147,483,648 | 2,147,483,647 |
unsigned long | 0 | 4,294,967,295 |
All char types use one 8-bit byte for storage. All short types use 2 bytes. All int types use 4 bytes. All long types use 4 bytes. If alignment is requested (-a), all non-char integer type objects will be aligned to even byte boundaries. If the requested alignment is -a4, the result is 4-byte alignment. Character types are never aligned.
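One way to check the table on a given build is to compare it with the constants in <limits.h>. The following is a minimal sketch, not part of the compiler documentation. The helper name matches_documented_int_ranges is hypothetical, and it checks only the char, short, and int rows, because long is wider than 4 bytes on some other platforms.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when char, short, and int on the
   current compiler match the ranges listed in the table above,
   and 0 otherwise. The long rows are deliberately not checked. */
int matches_documented_int_ranges(void)
{
    return CHAR_BIT == 8
        && SCHAR_MIN == -128 && SCHAR_MAX == 127 && UCHAR_MAX == 255
        && SHRT_MIN == -32768 && SHRT_MAX == 32767 && USHRT_MAX == 65535
        && INT_MIN == (-2147483647 - 1) && INT_MAX == 2147483647
        && UINT_MAX == 4294967295u;
}
```

A return value of 0 means at least one of the checked types differs from the table on the compiler in use.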
The representations and sets of values of the various types of floating-point numbers
The IEEE floating-point formats as used by the Intel 8086 are used for all C++Builder floating-point types. The float type uses 32-bit IEEE real format. The double type uses 64-bit IEEE real format. The long double type uses 80-bit IEEE extended real format.
The mapping between source and execution character sets
Any characters in string literals or character constants remain unchanged in the executing program. The source and execution character sets are the same.
The value of an integer character constant that contains a character or escape sequence not represented in the basic execution character set or the extended character set for a wide character constant
Wide characters are supported.
The current locale used to convert multibyte characters into corresponding wide characters for a wide character constant
Wide character constants are recognized.
The value of an integer constant that contains more than one character, or a wide character constant that contains more than one multibyte character
Character constants can contain one or two characters. If two characters are included, the first character occupies the low-order byte of the constant, and the second character occupies the high-order byte.
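The byte layout described above can be sketched as plain arithmetic. The helper two_char_constant_value is hypothetical and only computes the value that the rule implies; it does not use a two-character constant itself, because other compilers may order the bytes differently.

```c
/* Hypothetical helper: value of a two-character constant under the
   rule described above (first character in the low-order byte,
   second character in the high-order byte). */
unsigned int two_char_constant_value(unsigned char first,
                                     unsigned char second)
{
    return (unsigned int)first | ((unsigned int)second << 8);
}
```

Under this rule, and assuming ASCII, the constant 'AB' would have the value 0x4241.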
The result of converting an integer to a shorter signed integer, or the result of converting an unsigned integer to a signed integer of equal length, if the value cannot be represented
These conversions are performed by simply truncating the high-order bits. Signed integers are stored as two’s complement values, so the resulting number is interpreted as such a value. If the high-order bit of the smaller integer is nonzero, the value is interpreted as a negative value; otherwise, it is positive.
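As an illustration of the truncation rule, the sketch below narrows a long to a short and converts an unsigned int to a signed int. The function names are hypothetical, and the expected values assume a 16-bit short, a 32-bit int, and two's complement storage, as described above.

```c
/* Narrow a long to a short. With the behavior described above, only
   the low-order 16 bits survive, and bit 15 becomes the sign bit. */
short narrow_to_short(long value)
{
    return (short)value;
}

/* Convert an unsigned int to a signed int of equal length. The bit
   pattern is kept and reinterpreted as two's complement. */
int unsigned_to_signed(unsigned int value)
{
    return (int)value;
}
```

For example, narrow_to_short(0x18000L) keeps the bits 0x8000, which read as -32768, and unsigned_to_signed(4294967295u) yields -1.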
The direction of truncation when an integral number is converted to a floating-point number that cannot exactly represent the original value
The integer value is rounded to the nearest representable value. Thus, for example, the long value (2^31 - 1) is converted to the float value 2^31. Ties are broken according to the rules of IEEE standard arithmetic.
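The rounding can be observed with a small conversion helper. This is a sketch under the assumption of a 32-bit IEEE float and the default round-to-nearest mode; the name long_to_float is hypothetical, and the volatile store is there only to force the result to float precision.

```c
/* Convert a long to float. Storing through a volatile float forces
   the value to be rounded to float precision before it is returned. */
float long_to_float(long value)
{
    volatile float result = (float)value;
    return result;
}
```

With these assumptions, 2147483647 (2^31 - 1) converts to 2147483648.0f, and 16777217 (2^24 + 1), which lies exactly halfway between two floats, converts to 16777216.0f.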
The direction of truncation or rounding when a floating-point number is converted to a narrower floating-point number
The value is rounded to the nearest representable value. Ties are broken according to the rules of IEEE standard arithmetic.
The results of bitwise operations on signed integers
The bitwise operators apply to signed integers as if they were their corresponding unsigned types. The sign bit is treated as a normal data bit. The result is then interpreted as a normal two’s complement signed integer.
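A short sketch of what this means in practice, assuming two's complement integers as described above. The function names are hypothetical.

```c
/* Keep only the low-order byte of a signed int. The sign bit is
   masked off like any other data bit. */
int low_byte(int value)
{
    return value & 0xFF;
}

/* Invert every bit, including the sign bit. */
int invert_bits(int value)
{
    return ~value;
}
```

For example, low_byte(-1) is 255 because -1 has all bits set, and invert_bits(0) is -1 for the same reason.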
What happens when a member of a union object is accessed using a member of a different type
The access is allowed and the different type member will access the bits stored there. You’ll need a detailed understanding of the bit encodings of floating-point values to understand how to access a floating-type member using a different member. If the member stored is shorter than the member used to access the value, the excess bits have the value they had before the short member was stored.
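The sketch below reads the bits of a float through an integer member of the same union. It assumes a 32-bit unsigned int and the 32-bit IEEE float format; the union and function names are hypothetical.

```c
/* Both members occupy the same 4 bytes (assumed sizes). */
union float_bits {
    float as_float;
    unsigned int as_bits;
};

/* Store a float, then read the same storage as an unsigned int. */
unsigned int float_to_bits(float value)
{
    union float_bits u;
    u.as_float = value;
    return u.as_bits;
}
```

Under these assumptions, float_to_bits(1.0f) yields 0x3F800000: a zero sign bit, a biased exponent of 127, and a zero fraction.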
The type of integer required to hold the maximum size of an array
For a normal array, the type is unsigned int, and for huge arrays the type is signed long.
The result of casting a pointer to an integer or vice versa
When converting between integers and pointers of the same size, no bits are changed. When converting from a longer type to a shorter type, the high-order bits are truncated. When converting from a shorter integer type to a longer pointer type, the integer is first widened to an integer type the same size as the pointer type. Thus signed integers will sign-extend to fill the new bytes. Similarly, smaller pointer types being converted to larger integer types will first be widened to a pointer type as wide as the integer type.
The sign of the remainder on integer division
The sign of the remainder is negative when only one of the operands is negative. If neither or both operands are negative, the remainder is positive.
The type of integer required to hold the difference between two pointers to elements of the same array, ptrdiff_t
The type is signed int.
The result of a right shift of a negative signed integral type
A negative signed value is sign extended when right shifted.
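A minimal sketch of sign extension on right shift, assuming the behavior described above. The function name shift_right is hypothetical.

```c
/* Shift a signed int right. With sign extension, vacated high-order
   bits are filled with copies of the sign bit. */
int shift_right(int value, int count)
{
    return value >> count;
}
```

With sign extension, a negative value stays negative: shift_right(-8, 1) is -4, and shift_right(-1, 4) is still -1.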
The extent to which objects can actually be placed in registers by using the register storage-class specifier
Objects declared with any one, two, or four-byte integer or pointer types can be placed in registers. At least two and as many as seven registers are available. The number of registers actually used depends on what registers are needed for temporary values in the function.
Whether a plain int bit-field is treated as a signed int or as an unsigned int bit field
Plain int bit fields are treated as signed int bit fields.
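The effect of a signed bit field can be seen by storing a value with the top bit set. This is a sketch that assumes two's complement and the truncating conversion described earlier; the structure and function names are hypothetical.

```c
/* A 3-bit plain int field. When it is treated as signed, it holds
   values from -4 to 3. */
struct small_field {
    int value : 3;
};

/* Store an int in the 3-bit field and read it back. */
int store_in_3bit_field(int v)
{
    struct small_field f;
    f.value = v;
    return f.value;
}
```

Storing 7 (binary 111) reads back as -1, because the top bit of the field is interpreted as the sign. Declare the field as unsigned int if you need the range 0 to 7.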
The order of allocation of bit fields within an int
Bit fields are allocated from the low-order bit position to the high-order.
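The allocation order can be made visible by overlaying two bit fields on an integer. The sketch assumes a 32-bit unsigned int and the low-to-high allocation described above; all names are hypothetical.

```c
/* Two 4-bit fields. With low-to-high allocation, "low" occupies
   bits 0-3 and "high" occupies bits 4-7 of the containing int. */
struct nibbles {
    unsigned int low  : 4;
    unsigned int high : 4;
};

union nibble_view {
    struct nibbles fields;
    unsigned int raw;
};

/* Fill both fields and return the containing int. */
unsigned int pack_nibbles(unsigned int low, unsigned int high)
{
    union nibble_view v;
    v.raw = 0;              /* clear the bits not covered by the fields */
    v.fields.low = low;
    v.fields.high = high;
    return v.raw;
}
```

Under these assumptions, pack_nibbles(0x1, 0x2) returns 0x21. A compiler that allocates bit fields from the high-order end would produce a different value.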
The padding and alignment of members of structures
By default, no padding is used in structures. If you use the word alignment option (-a), structures are padded to even size, and any members that do not have character or character array type are aligned to an even multiple offset.
Whether a bit-field can straddle a storage-unit boundary
Depending upon the overall alignment and the type alignment, a bit field is allowed to straddle a storage unit.
The integer type chosen to represent the values of an enumeration type
By default, all enumerators are stored as full ints. Enumerations are stored in a long or unsigned long if the values do not fit into an int. This is the default behavior, as specified by the -b compiler option. With -b-, enumerations are stored in the smallest integer type that can represent the values. This includes all integral types, for example, signed char, unsigned char, signed short, unsigned short, signed int, unsigned int, signed long, and unsigned long.
For C++ compliance, -b- must be specified because it is not correct to store all enumerations as ints for C++.
What constitutes an access to an object that has volatile-qualified type
Any reference to a volatile object will access the object. Whether accessing adjacent memory locations will also access an object depends on how the memory is constructed in the hardware. For special device memory, such as video display memory, it depends on how the device is constructed. For normal PC memory, volatile objects are used only for memory that might be accessed by asynchronous interrupts, so accessing adjacent objects has no effect.
The maximum number of declarators that can modify an arithmetic, structure, or union type
There is no specific limit on the number of declarators. The number of declarators allowed is fairly large, but when nested deeply within a set of blocks in a function, the number of declarators will be reduced. The number allowed at file level is at least 50.
The maximum number of case values in a switch statement
There is no specific limit on the number of cases in a switch. As long as there is enough memory to hold the case information, the compiler will accept them.
Whether the value of a single-character character constant in a constant expression that controls conditional inclusion matches the value of the same character constant in the execution character set. Whether such a character constant can have a negative value
All character constants, even constants in conditional directives, use the same character set (execution). Single-character character constants will be negative if the character type is signed (the default, when -K is not requested).
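The following sketch compares how the preprocessor and the compiler proper see the constant '\xFF'. It is illustrative only; the macro and function names are hypothetical, and the result depends on whether plain char is signed on the compiler in use.

```c
#include <limits.h>

/* Evaluate the constant in a conditional directive. */
#if '\xFF' < 0
#define HIGH_CHAR_NEGATIVE_IN_DIRECTIVE 1
#else
#define HIGH_CHAR_NEGATIVE_IN_DIRECTIVE 0
#endif

/* Returns 1 when the directive above saw a negative value. */
int directive_sees_negative(void)
{
    return HIGH_CHAR_NEGATIVE_IN_DIRECTIVE;
}

/* Returns 1 when the same constant is negative in ordinary code. */
int high_char_constant_is_negative(void)
{
    return '\xFF' < 0;
}

/* Returns 1 when plain char is a signed type on this compiler. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

On a compiler that behaves as described above, all three functions return the same value: 1 when char is signed, 0 when it is unsigned.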
The method for locating includable source files
For include file names in angle brackets, if include directories are given in the command line, then the file is searched for in each of the include directories. Include directories are searched in this order:
- In directories specified on the command line.
- In directories specified in the configuration file of the command-line interface of the compiler. For example, in BCC32.CFG for BCC32.EXE.
- If no include directories are specified, search only the current directory.
The support for quoted names for includable source files
For quoted includable file names, the file is searched in the following order:
- In the same directory of the file that contains the #include statement.
- In the directories of files that include (#include) that file.
- The current directory.
- Along the path specified by the /I compiler option.
- Along paths specified by the INCLUDE environment variable.
The mapping of source file name character sequences
Backslashes in include file names are treated as distinct characters, not as escape characters. Case differences are ignored for letters.
The definitions for __DATE__ and __TIME__ when they are unavailable
The date and time are always available and will use the operating system date and time.
The decimal point character
The decimal point character is a period (.).
The type of the sizeof operator, size_t
The type size_t is unsigned.
The null pointer constant to which the macro NULL expands
NULL expands to an int zero or a long zero. Both are 32-bit signed numbers.
The diagnostic printed by and the termination behavior of the assert function
The diagnostic message printed is “Assertion failed: expression, file filename, line nn”, where expression is the asserted expression that failed, filename is the source file name, and nn is the line number where the assertion took place.
abort is called immediately after the assertion message is displayed.
The implementation-defined aspects of character testing and case-mapping functions
None, other than what is mentioned in the section below.
The sets of characters tested for by the isalnum, isalpha, iscntrl, islower, isprint and isupper functions
First 128 ASCII characters for the default C locale. Otherwise, all 256 characters.
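To see which characters a classification function accepts in the current locale, you can scan all unsigned char values. The sketch below does this for isalpha; the function name is hypothetical.

```c
#include <ctype.h>
#include <limits.h>

/* Count the unsigned char values for which isalpha() is true
   in the current locale. */
int count_alpha_characters(void)
{
    int c;
    int count = 0;

    for (c = 0; c <= UCHAR_MAX; c++) {
        if (isalpha(c))
            count++;
    }
    return count;
}
```

In the default C locale, only the 52 ASCII letters qualify. After a call to setlocale, the count can be higher.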
The values returned by the mathematics functions on domain errors
An IEEE NAN (not a number).
Whether the mathematics functions set the integer expression errno to the value of the macro ERANGE on underflow range errors
No, only for the other errors: domain, singularity, overflow, and total loss of precision.
Whether a domain error occurs or zero is returned when the fmod function has a second argument of zero
No; fmod(x,0) returns 0.
The set of signals for the signal function
SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, and SIGTERM.
The semantics for each signal recognized by the signal function
See the description of signal.
The default handling and the handling at program startup for each signal recognized by the signal function
See the description of signal.
If the equivalent of signal(sig, SIG_DFL); is not executed prior to the call of a signal handler, the blocking of the signal that is performed
The equivalent of signal(sig, SIG_DFL); is always executed.
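Because the disposition is reset to the default before the handler runs, a handler that must stay installed has to register itself again. The following is a minimal sketch of that pattern; the names on_sigint and count_handled_sigints are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t sigint_count = 0;

/* Handler that registers itself again first, so that the next
   SIGINT is also delivered here instead of to the default action. */
static void on_sigint(int sig)
{
    signal(sig, on_sigint);
    sigint_count++;
}

/* Raise SIGINT the given number of times and return how many times
   the handler ran. The default action is restored afterwards. */
int count_handled_sigints(int times)
{
    int i;

    sigint_count = 0;
    signal(SIGINT, on_sigint);
    for (i = 0; i < times; i++)
        raise(SIGINT);
    signal(SIGINT, SIG_DFL);
    return (int)sigint_count;
}
```

Without the call to signal inside the handler, the second SIGINT would get the default handling on an implementation that resets the disposition.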
Whether the default handling is reset if the SIGILL signal is received by a handler specified to the signal function
No, it is not.
Whether the last line of a text stream requires a terminating newline character
No, none is required.
Whether space characters that are written out to a text stream immediately before a newline character appear when read in
Yes, they do.
The number of null characters that may be appended to data written to a binary stream
None.
Whether the file position indicator of an append mode stream is initially positioned at the beginning or end of the file
The file position indicator of an append-mode stream is initially placed at the beginning of the file. It is reset to the end of the file before each write.
Whether a write on a text stream causes the associated file to be truncated beyond that point
A write of 0 bytes might or might not truncate the file, depending on how the file is buffered. It is safest to classify a zero-length write as having indeterminate behavior.
The characteristics of file buffering
Files can be fully buffered, line buffered, or unbuffered. If a file is buffered, a default buffer of 512 bytes is created upon opening the file.
Whether a zero-length file actually exists
Yes, it does.
Whether the same file can be open multiple times
Yes, it can.
The effect of the remove function on an open file
No special checking for an already open file is performed; the responsibility is left up to the programmer.
The effect if a file with the new name exists prior to a call to rename
rename returns -1 and errno is set to EEXIST.
The output for %p conversion in fprintf
The output is eight hex digits (XXXXXXXX), zero padded, uppercase letters (the same as %08lX).
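The equivalence with %08lX can be sketched without using %p at all, which keeps the example independent of how a particular library prints pointers. Both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Format a value the way the text describes %p output: eight
   uppercase hex digits, zero padded. The buffer must hold at least
   17 characters so that a wider unsigned long also fits. Returns
   the number of characters written. */
int format_like_pointer(char *out, unsigned long value)
{
    return sprintf(out, "%08lX", value);
}

/* Returns 1 when text consists of exactly eight uppercase hex digits. */
int is_documented_pointer_text(const char *text)
{
    return strlen(text) == 8
        && strspn(text, "0123456789ABCDEF") == 8;
}
```

For example, the value 0xBEEF is written as 0000BEEF.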
The input for %p conversion in fscanf
See the section above.
The interpretation of a - (hyphen) character that is neither the first nor the last character in the scanlist for a %[ conversion in fscanf
See the description of scanf.
The value the macro errno is set to by the fgetpos or ftell function on failure
EBADF Bad file number.
The messages generated by perror
Arg list too big | Attempted to remove current directory | Bad address |
Bad file number | Block device required | Broken pipe |
Cross-device link | Error 0 | Exec format error |
Executable file in use | File already exists | File too large |
Illegal seek | Inappropriate I/O control operation | Input/output error |
Interrupted function call | Invalid access code | Invalid argument |
Invalid data | Invalid environment | Invalid format |
Invalid function number | Invalid memory block address | Is a directory |
Math argument | Memory arena trashed | Name too long |
No child processes | No more files | No space left on device |
No such device | No such device or address | No such file or directory |
No such process | Not a directory | Not enough memory |
Not same device | Operation not permitted | Path not found |
Permission denied | Possible deadlock | Read-only file system |
Resource busy | Resource temporarily unavailable | Result too large |
Too many links | Too many open files |
The behavior of calloc, malloc, or realloc if the size requested is zero
calloc and malloc will ignore the request and return 0. realloc will free the block.
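Other compilers may return a non-null pointer for a zero-size request, so code that relies on the behavior described above is not portable. One option is a small wrapper that makes the zero-size case explicit. This is a sketch, and the name alloc_or_null is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical wrapper: returns NULL for a zero-size request on any
   compiler, mirroring the malloc behavior described above.
   Otherwise it forwards to malloc. */
void *alloc_or_null(size_t size)
{
    if (size == 0)
        return NULL;
    return malloc(size);
}
```

Callers can then treat NULL from a zero-size request as "nothing allocated" and still check for NULL as an out-of-memory condition when size is nonzero.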
The behavior of the abort function with regard to open and temporary files
The file buffers are not flushed and the files are not closed.
The status returned by exit if the value of the argument is other than zero, EXIT_SUCCESS, or EXIT_FAILURE
Nothing special. The status is returned exactly as it is passed. The status is represented as a signed char.
The set of environment names and the method for altering the environment list used by getenv
The environment strings are those defined in the operating system with the SET command. putenv can be used to change the strings for the duration of the current program, but the SET command must be used to change an environment string permanently.
The contents and mode of execution of the string by the system function
The string is interpreted as an operating system command. COMSPEC or CMD.EXE is used, and the argument string is passed as a command to execute. Any operating system built-in command, as well as batch files and executable programs, can be executed.
The contents of the error message strings returned by strerror
See the table of perror messages above.
The local time zone and Daylight Saving Time
Defined as local PC time and date.
The era for clock
Represented as clock ticks, with the origin being the beginning of the program execution.
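Because the value is a tick count, it is normally divided by CLOCKS_PER_SEC to get seconds. The helpers below are a minimal sketch with hypothetical names.

```c
#include <time.h>

/* Convert the difference between two clock() readings to seconds. */
double ticks_to_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Seconds counted by clock() so far. With the origin described above,
   this is the time since the program started executing. */
double seconds_since_origin(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}
```

To time a section of code, call clock() before and after it and pass both readings to ticks_to_seconds.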
The formats for date and time