HP C
Integer constants are used to represent whole numbers. An integer constant can be specified in decimal, octal, or hexadecimal radix, and can optionally include a prefix that specifies its radix and a suffix that specifies its type. An integer constant cannot include a period or an exponent part.
Follow these rules when specifying an integer constant:
Without explicit specification, the type of an integer constant defaults to the smallest possible type that can hold the constant's value, unless the value is suffixed with an L, l, LL (ALPHA, I64), ll (ALPHA, I64), U, or u.
The C99 standard introduced the type long long int (both signed and unsigned) as a standard integer type whose range of values requires at least 64 bits to represent. Although HP C on Alpha systems implemented the type long long as a language extension many releases ago, the compiler followed the C89 rules for determining the type of an integer constant. Those rules specified that an unsuffixed decimal integer with a value too large to be represented in a signed long would be given the type unsigned long if it would fit, and only be given a long long type if the value was too large for unsigned long. (Note: The long long data type is supported on Alpha and I64 systems only.)
In standardizing the long long type, the C99 standard regularized these rules and made them extensible to longer types. In particular, unsuffixed decimal integer constants are given the smallest signed integer type that will hold the value (the minimum type is still int). If the value is larger than the largest value of signed long long, it is given the next larger implementation-defined signed integer type (if there is one). Otherwise, C99 states that the behavior is undefined. HP C, however, uses the type unsigned long long next. The only portable way to specify a decimal constant that will be given an unsigned type is to use a suffix containing u or U.
HP C continues to use the C89 rules in VAXC, COMMON, and strict ANSI89 modes (including MIA), but uses the new C99 rules in all other modes. Table 1-5 shows the rules for determining the type of an integer constant. The type of an integer constant is the first of the corresponding list in which its value can be represented.
For example, the constant 59 is assigned the int data type; the constant 59L is assigned the long data type; the constant 59UL is assigned the unsigned long int data type.
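The suffix rules above can be checked directly on a compiler that supports the C11 _Generic selection. The following is a minimal sketch under that assumption; TYPE_NAME, suffix_examples_match, and print_constant_types are hypothetical names introduced only for this illustration and are not part of HP C.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper macro: yields the name of the type of an expression.
   Assumes a C11 compiler, because it relies on _Generic. */
#define TYPE_NAME(x) _Generic((x),                \
    int: "int",                                   \
    unsigned int: "unsigned int",                 \
    long: "long",                                 \
    unsigned long: "unsigned long",               \
    long long: "long long",                       \
    unsigned long long: "unsigned long long",     \
    default: "other")

/* Returns 1 when the three constants from the text have the types
   the text assigns to them. */
int suffix_examples_match(void)
{
    return strcmp(TYPE_NAME(59), "int") == 0
        && strcmp(TYPE_NAME(59L), "long") == 0
        && strcmp(TYPE_NAME(59UL), "unsigned long") == 0;
}

/* Prints the type chosen for each constant. */
void print_constant_types(void)
{
    printf("59   has type %s\n", TYPE_NAME(59));
    printf("59L  has type %s\n", TYPE_NAME(59L));
    printf("59UL has type %s\n", TYPE_NAME(59UL));
}
```

Calling print_constant_types from main prints one line per constant. The same macro can be applied to larger unsuffixed constants to see which type a given compiler mode selects.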
Integer constant values are always nonnegative; a preceding minus sign is interpreted as a unary operator, not as part of the constant. If the value exceeds the largest representable integer value (causing an overflow), the compiler issues a warning message and uses the greatest representable value for the integer type. Unsuffixed integer constants can have different types, because without explicit specification the constant is represented in the smallest possible integer type.
The new C99 rules for determining the type of an integer constant could lead to some constants in your program being interpreted as having a signed type when previous compiler versions gave them an unsigned type. This could affect your program's behavior in subtle ways. The new message intconstsigned can be enabled to report constants in your source code that are being treated differently under the C99 rules than they were in previous releases. This message is also part of the new message group NEWC99. If your program relied on unsigned treatment, the simple fix is to add the correct suffix including a "U" or "u" to force the constant to have the expected type. Such a change would be backward compatible and portable.
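The following sketch illustrates one way the signedness of a constant can change a program's result. It assumes a 32-bit int; the function names are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* With the U suffix the constant is unsigned, so -1 is converted to a
   large unsigned value before the comparison. The result is 0. */
int compare_with_suffix(void)
{
    return -1 < 4294967295U;
}

/* Without a suffix the constant is signed under the C99 rules (long or
   long long, depending on the width of long), so the result is 1.
   Under the C89 rules with a 32-bit long, the constant would be
   unsigned long and the result would be 0. */
int compare_without_suffix(void)
{
    return -1 < 4294967295;
}

/* Prints both results for the compiler mode in use. */
void report_comparisons(void)
{
    printf("-1 < 4294967295U : %d\n", compare_with_suffix());
    printf("-1 < 4294967295  : %d\n", compare_without_suffix());
}
```

Only the suffixed form gives the same answer under both sets of rules, which is why adding the suffix is the portable fix.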
A floating-point constant has a significand part that may be followed by an exponential part and an optional suffix that specifies its type (for example, 32.45E2).
The components of the significand part may include a digit sequence representing the whole number part, followed by a period (.), followed by a digit sequence representing the fractional part.
The components of the exponent part are an e, E, p, or P followed by an exponent consisting of an optionally signed digit sequence.
Either the whole-number part or the fraction part of the significand must be present. For decimal floating constants, either the period or the exponent part must be present.
The significand part of a floating-point constant is interpreted as a decimal or hexadecimal rational number; the digit sequence in the exponent is interpreted as a decimal integer. For decimal floating constants, the exponent indicates the power of 10 by which the significand part is to be scaled. For hexadecimal floating constants, the exponent indicates the power of 2 by which the significand part is to be scaled. For decimal floating constants, and for hexadecimal floating constants when FLT_RADIX is not a power of 2, the result is either the nearest representable value, or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in a platform-dependent manner. For hexadecimal floating constants when FLT_RADIX is a power of 2, the result is correctly rounded.
Floating-point constant values must be nonnegative; a preceding minus sign is interpreted as a unary operator, not as part of the constant.
Floating-point constants have the following type:
A word about hexadecimal floating-point constants... The C99 standard introduced a hexadecimal form of floating-point constants. This form of constant permits floating-point values to be specified reliably to the last bit of precision. It does not specify a bit pattern for the representation. Instead it is interpreted much like an ordinary decimal floating-point constant except that the significand is written in hexadecimal radix, and the exponent is expressed as a decimal integer indicating the power of two by which to multiply the significand. A "P" instead of an "E" separates the exponent from the significand. Thus, for example, 1/2 can be written as either 0x1P-1 or 0x.1P3.
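As a minimal sketch, assuming a compiler that accepts C99 hexadecimal floating-point constants, the two spellings of 1/2 given above can be compared with the decimal form. The function names are hypothetical.

```c
#include <stdio.h>

/* 0x1P-1 is 1 scaled by 2 to the power -1, that is, 0.5. */
double half_from_exponent(void)
{
    return 0x1P-1;
}

/* 0x.1P3 is 1/16 scaled by 2 to the power 3, which is also 0.5. */
double half_from_fraction(void)
{
    return 0x.1P3;
}

/* 0x1.8P1 is 1.5 scaled by 2 to the power 1, that is, 3.0. */
double three_from_hex(void)
{
    return 0x1.8P1;
}

/* Prints the three values in decimal notation. */
void print_hex_float_values(void)
{
    printf("%g %g %g\n", half_from_exponent(), half_from_fraction(),
           three_from_hex());
}
```

These values are exactly representable in binary floating point, so the comparisons with 0.5 and 3.0 are exact.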
The C99 standard also adds printf/scanf specifiers for this form of value, but that support will be in OpenVMS run-time libraries after OpenVMS Version 7.3.
Table 1-6 shows examples of valid notational options.
A character constant is any character from the source character set enclosed in apostrophes. Character constants are represented by objects of type int. For example:
char alpha = 'A';
Characters such as the new-line character, single quotation marks, double quotation marks, and backslash can be included in a character constant by using escape sequences as described in Section 1.9.3.3. All valid characters can also be included in a constant by using numeric escape sequences, as described in Section 1.9.3.4.
The value of a character constant containing a single character is the numeric value of the character in the current character set. Character constants containing multiple characters within the single quotation marks have a value determined by the compiler. The value of a character constant represented by an octal or hexadecimal escape sequence is the same as the octal or hexadecimal value of the escape sequence. The value of a wide character constant (discussed in Section 1.9.3.1) is determined by the mbtowc library function.
There is a limit of four characters for any one character constant. Enclosing more than four characters in single quotation marks (such as 'ABCDE') generates an overflow warning.
Note that the byte ordering of character constants is platform specific.
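Because a character constant has type int, its size differs from that of a char object initialized with it. The following sketch shows the difference; the function names are hypothetical, and the code must be compiled as C (C++ gives character constants type char).

```c
#include <stddef.h>

/* A character constant has type int, so this returns sizeof(int). */
size_t char_constant_size(void)
{
    return sizeof('A');
}

/* A char object always occupies exactly one byte, so this returns 1. */
size_t char_object_size(void)
{
    char alpha = 'A';
    return sizeof alpha;
}
```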
C provides for an extended character set through the use of wide characters. Wide characters are characters too large to fit in the char type. The wchar_t type is typically used to represent a character constant in a character set requiring more than 256 possible characters, because 8 bits can represent only 256 different values.
A character constant in the extended character set is written using a preceding L, and is called a wide-character constant. Wide-character constants have an integer type, wchar_t, defined in the <stddef.h> header file. Wide-character constants can be represented with octal or hexadecimal character escape sequences, just like normal character escape sequences, but with the preceding L.
Strings composed of wide characters can also be formed. The compiler allocates storage as if the string were an array of type wchar_t, and appends a wide null character (\0) to the end of the string. The array is just long enough to hold the characters in the string and the wide null character, and is initialized with the specified characters.
The following examples show valid wide-character constants and string literals:
wchar_t wc = L'A';
wchar_t wmc = L'ABCD';
wchar_t *wstring = L"Hello!";
wchar_t *x = L"Wide";
wchar_t z[] = L"wide string";
HP C stores wchar_t objects as unsigned long objects (OPENVMS) or unsigned int objects (TRU64 UNIX) in 32 bits of storage. The null character at the end of a wide-character string is 32 bits long.
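The storage rule for wide string literals can be observed with the standard wcslen function and the sizeof operator. This is a minimal sketch; the array z repeats the declaration from the example above, and the function names are hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

static const wchar_t z[] = L"wide string";

/* Number of wide characters before the terminating wide null: 11. */
size_t wide_length(void)
{
    return wcslen(z);
}

/* Number of array elements, including the wide null character: 12. */
size_t wide_elements(void)
{
    return sizeof z / sizeof z[0];
}
```

The total size in bytes is wide_elements() multiplied by sizeof(wchar_t), which depends on the platform.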
Some programmers requiring an extended character set have used shift-dependent encoding schemes to represent the non-ASCII characters in the normal char size of 8 bits. This encoding results in multibyte characters. ANSI C supports these encoding schemes, in addition to providing the wide-character type wchar_t.
In accordance with the ANSI standard, HP C recognizes multibyte characters in the following contexts:
For proper input and output of the multibyte character encodings, and to prevent conflicts with existing string processing routines, note the following rules governing the use of multibyte characters:
Transforming multibyte characters to wide-character constants and wide string literals eases the programmer's problems when dealing with shift-state encoding. There are several C library functions available for transforming multibyte characters to wide characters and back. See Chapter 9 for more information.
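As an illustration, the standard C library functions mbstowcs and wcstombs convert whole strings between the two forms. The sketch below assumes the default "C" locale and plain ASCII input; the wrapper names to_wide and to_multibyte are hypothetical, and whether these are the functions a given program should use depends on its locale and encoding.

```c
#include <stdlib.h>
#include <wchar.h>

/* Converts a multibyte string to wide characters.
   Returns the number of wide characters stored, not counting the
   terminating wide null, or (size_t)-1 on an invalid sequence. */
size_t to_wide(const char *mb, wchar_t *out, size_t capacity)
{
    return mbstowcs(out, mb, capacity);
}

/* Converts a wide string back to multibyte form.
   Returns the number of bytes stored, not counting the terminating
   null, or (size_t)-1 if a wide character cannot be converted. */
size_t to_multibyte(const wchar_t *wide, char *out, size_t capacity)
{
    return wcstombs(out, wide, capacity);
}
```

A caller passes a buffer and its capacity in elements, for example to_wide("Ready?", buffer, 16) with a wchar_t buffer of 16 elements.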
Characters that cannot be displayed on a standard terminal, or that have special meaning when used in character constants or string literals, can be entered as source characters by entering them as character escape sequences. A backslash (\) begins each character escape sequence. Each of the escape sequences is stored in a single char or wchar_t object. Table 1-7 lists the ANSI-defined escape sequences.
No other character escape sequences are valid. If another sequence is encountered in the source code, the compiler issues a warning and the backslash character is ignored.
An example of a character escape sequence use follows:
printf ("\t\aReady\?\n");
Upon execution, this results in an alert bell and the following prompt:
Ready?
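Because each escape sequence is stored in a single char object, the string in this example is shorter than its source spelling suggests. A minimal sketch, with hypothetical names:

```c
#include <stdio.h>
#include <string.h>

static const char prompt[] = "\t\aReady\?\n";

/* Counts the stored characters: tab, alert, the five letters of
   "Ready", the question mark, and new-line, for a total of 9. */
size_t prompt_length(void)
{
    return strlen(prompt);
}

/* Writes the prompt to standard output. */
void show_prompt(void)
{
    fputs(prompt, stdout);
}
```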
The compiler treats all characters as an integer representation, so it is possible to represent any character in the source code with its numeric equivalent. This is called a numeric escape sequence. The character is represented by typing a backslash (\), followed by the character's octal or hexadecimal integer equivalent from the current character set (see Appendix C for the ASCII equivalence tables). For example, using the ASCII character set, the character A can be represented as \101 (the octal equivalent) or \x41 (the hexadecimal equivalent). A preceding 0 in the octal example is not necessary because octal values are the default in numeric escape sequences. A lowercase x following the backslash indicates a hexadecimal representation. For example, \x5A is equivalent to the character Z.
An example of numeric escape sequences follows:
#define NUL '\0' /* Defines logical null character */
char x[] = {'\110','\145','\154','\154','\157','\41','\0'}; /* Initializes x with "Hello!" */
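The equivalences described above can be verified in code. The sketch below assumes the ASCII character set for the comparison with "Hello!"; the function names are hypothetical.

```c
#include <string.h>

static const char x[] = {'\110','\145','\154','\154','\157','\41','\0'};

/* Returns 1 when x holds the same characters as "Hello!".
   This assumes the execution character set is ASCII. */
int spells_hello(void)
{
    return strcmp(x, "Hello!") == 0;
}

/* Octal 101 and hexadecimal 41 both denote the value 65, so the two
   escape sequences always compare equal, whatever the character set. */
int octal_and_hex_agree(void)
{
    return '\101' == '\x41' && '\x41' == 65;
}
```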
The escape sequence extends to three octal digits, or the first character that is not an octal digit, whichever is first. Therefore, the string "\089" is interpreted as four characters: \0, 8, 9, and \0.
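The sizeof operator makes the three-digit rule visible, because it counts every stored character including the terminating null. A minimal sketch with hypothetical function names:

```c
#include <stddef.h>

/* "\089": the escape ends at 8, which is not an octal digit.
   Stored characters: \0, '8', '9', and the terminating \0, so 4. */
size_t size_of_089(void)
{
    return sizeof "\089";
}

/* "\101": all three digits belong to one escape.
   Stored characters: \101 and the terminating \0, so 2. */
size_t size_of_101(void)
{
    return sizeof "\101";
}

/* "\1011": the escape stops after three octal digits.
   Stored characters: \101, '1', and the terminating \0, so 3. */
size_t size_of_1011(void)
{
    return sizeof "\1011";
}
```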
With hexadecimal escape sequences, there is no limit to the number of characters in the escape sequence, but the result is not defined if the hexadecimal value exceeds the largest value representable by the unsigned char type for a normal character constant, or the largest value representable by the wchar_t type for a wide-character constant. For example, '\x777' is illegal.
In addition, hexadecimal escape sequences with more than three characters provoke a warning if the error-checking compiler option is used.
String concatenation can be used to specify a hexadecimal digit following a hexadecimal escape sequence. In the following example, a is initialized to the same value in both cases:
char a[] = "\xff" "f";
char a[] = {'\xff', 'f', '\0'};
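The two initializations can be compared byte for byte. In this sketch the arrays are given distinct hypothetical names, a1 and a2, so that both can exist in one translation unit.

```c
#include <stddef.h>
#include <string.h>

/* Concatenation ends the hexadecimal escape before the letter f. */
static const char a1[] = "\xff" "f";

/* The same three characters written out one by one. */
static const char a2[] = {'\xff', 'f', '\0'};

/* Returns 1 when both arrays have the same size and contents. */
int same_bytes(void)
{
    return sizeof a1 == sizeof a2
        && memcmp(a1, a2, sizeof a1) == 0;
}
```

Both arrays occupy three bytes: the character with value FF (hexadecimal), the letter f, and the terminating null.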
Using numeric escape sequences can result in a nonportable program if the executing machine uses a different character set. Another threat to portability exists if arithmetic operations are performed on the integer character values, because multicharacter constants (such as 'ABC') can be represented differently on different machines.