Standard C
Formatting Monetary Values
P.J. Plauger
P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is Standard C, which he co-authored with Jim Brodie. You can reach him at [email protected].
This is my fourth and last column in a series on the header <locale.h>. I have spent a lot of time on this topic because locales are new to many C programmers. You may have little use for them now, perhaps, but varying locales will become more and more important in the coming years. The pressures of competition in the international marketplace will see to that.
Supporting locales also requires a non-trivial amount of code. I spent one whole column describing the code that switches among locales already in memory. (See "Implementing Locales," Standard C, CUJ April 1991.) I devoted last month's column to describing one way to specify a new locale in a text file. (See "Build Your Own Locales," Standard C, CUJ May 1991.) I never got around to showing the code that reads and parses a locale file and I never will, in this column at least. It involves a lot of tedious detail.
Another bit of tedious detail becomes apparent when you actually try to use information from a locale. You call the function localeconv to get a pointer to a data object of type struct lconv. It tells you all sorts of interesting things about how you should format monetary values. (See "The Header <locale.h>," Standard C, CUJ March 1991.) It even tells you a thing or two about formatting non-monetary amounts. It does not, however, put all these details together for you in one convenient package.
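As a minimal sketch of that starting point, the function below calls localeconv and copies a few of the members discussed here into a caller-supplied buffer. The name describe_locale and its output layout are hypothetical, chosen only for illustration. It uses snprintf, which postdates the C Standard this column describes.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: summarize a few lconv members in buf
   (which holds n characters) and return buf. */
char *describe_locale(char *buf, size_t n)
{
    const struct lconv *lc = localeconv();

    assert(buf != NULL && n > 0);
    snprintf(buf, n, "point=[%s] sep=[%s] cur=[%s] intl=[%s]",
             lc->decimal_point, lc->thousands_sep,
             lc->currency_symbol, lc->int_curr_symbol);
    return buf;
}
```

The structure that localeconv points at can be overwritten by a later call to setlocale or localeconv, so copy out whatever you need before switching locales.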
I set about constructing an example of how to use the locale-specific information on formatting values. It turned out to be a non-trivial exercise. Hence, the result may prove of value to anyone who wants to format values by locale-specific rules. That's what this column is about.
Formatting Values
Two locale categories tell you how to format numbers to match local usage:
- Category LC_MONETARY suggests how to format monetary amounts, both by local custom and in accordance with international standards (ISO 4217).
- Category LC_NUMERIC dictates the decimal point character used by the Standard C library and suggests how to format non-monetary amounts.
Listing 1 shows the various ways you can format the monetary amount $-3.00 by local custom, depending upon the values stored in three members of struct lconv: n_cs_precedes, n_sep_by_space, and n_sign_posn.
The example assumes that the member currency_symbol points at "$", mon_decimal_point points at ".", negative_sign points at "-", and frac_digits has the value 2. The example does not show the effect of the members mon_grouping and mon_thousands_sep, which describe how to group and separate digits to the left of the decimal point.
Three additional members describe how to format positive monetary amounts. These are p_sep_by_space, p_sign_posn, and p_cs_precedes. For international monetary amounts, the member int_curr_symbol determines the currency symbol (instead of currency_symbol) and int_frac_digits determines the number of decimal places to display (instead of frac_digits). And if you want to format non-monetary amounts, you care about the members decimal_point, grouping, and thousands_sep.
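To make the interplay of these members concrete, here is a minimal sketch of how the sign, currency symbol, and digits might be assembled. The function place_sign and its parameter list are hypothetical. The three integer arguments stand for either the n_ or the p_ members, and the sign_posn codes follow the C Standard: 0 for parentheses, 1 and 2 for a sign before or after the whole quantity, 3 and 4 for a sign immediately before or after the currency symbol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: combine digits, currency symbol, and sign
   according to cs_precedes, sep_by_space, and sign_posn. */
char *place_sign(char *buf, size_t n, const char *digits,
                 const char *sym, const char *sign,
                 int cs_precedes, int sep_by_space, int sign_posn)
{
    char sympart[32];   /* currency symbol, plus sign if attached */
    char core[64];      /* symbol and digits in the right order */
    const char *gap = sep_by_space ? " " : "";

    assert(buf != NULL && n > 0);
    assert(strlen(sym) + strlen(sign) < sizeof sympart);
    if (sign_posn == 3)
        snprintf(sympart, sizeof sympart, "%s%s", sign, sym);
    else if (sign_posn == 4)
        snprintf(sympart, sizeof sympart, "%s%s", sym, sign);
    else
        snprintf(sympart, sizeof sympart, "%s", sym);

    assert(strlen(digits) + strlen(sympart) + 1 < sizeof core);
    if (cs_precedes)
        snprintf(core, sizeof core, "%s%s%s", sympart, gap, digits);
    else
        snprintf(core, sizeof core, "%s%s%s", digits, gap, sympart);

    switch (sign_posn) {
    case 0:  snprintf(buf, n, "(%s)", core); break;
    case 1:  snprintf(buf, n, "%s%s", sign, core); break;
    case 2:  snprintf(buf, n, "%s%s", core, sign); break;
    default: snprintf(buf, n, "%s", core); break; /* 3, 4, or unspecified */
    }
    return buf;
}
```

With "$", "-", and "3.00", the arguments (1, 0, 1) give -$3.00 and (1, 0, 0) give ($3.00). An unspecified sign_posn (CHAR_MAX) falls into the default case, which drops the sign in this sketch. A real formatter would have to pick a fallback.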
That's a lot of complexity to keep track of. Conceivably, you can use this information throughout an application, but probably not. The individual pieces are at a low level of detail. What you really want is some way to format numeric data that applies all of the relevant information in one place. Unfortunately, the C Standard does not define such a function.
Function _Fmtval
I decided to define the missing function. After several false starts, I ended up with the declaration:
    char *_Fmtval(char *buf, double val, int frac_digs);

You provide the character buffer buf to hold the formatted value. (The modern trend is to specify a maximum length for any such buffer. I found the function quite complicated enough without such checking, desirable as it may be.) As a convenience, the function returns the value of buf, which then holds the formatted value as a null-terminated string.
You also specify val, the value to be formatted, as a double. That provides for a fractional part and at least 16 decimal digits of precision. For a non-monetary value, frac_digits specifies the number of fraction digits to include in the formatted value. The members of struct lconv offer no guidance on this parameter.
Here's where the design gets clever. (I am willing to concede that it may be overly clever.) The locale information suggests four distinct formats for a value:
- an international monetary amount
- a local monetary amount
- a non-monetary amount with no decimal point or fraction
- a non-monetary amount with decimal point and fraction
Listing 1 shows the code for the function _Fmtval. It distinguishes the four formats by examining the value of frac_digits:
- A value of -2 (the macro FV_INT_CUR) tells the function to format an international monetary amount.
- A value of -1 (the macro FV_LCL_CUR) tells the function to format a local monetary amount.
- Any other value tells the function to format a non-monetary amount. The number of fraction digits, however determined, must be a non-negative value other than CHAR_MAX for the function to include a decimal point and fraction. So if you call _Fmtval with the value CHAR_MAX, or with any negative value other than -1 or -2 (such as -3, the macro FV_INTEGER), you tell it to format a non-monetary amount with no decimal point or fraction.
- By elimination, any non-negative value other than CHAR_MAX tells the function to format a non-monetary amount with a decimal point and fraction. The value specifies the number of fraction digits.
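The selection rules above can be sketched as a small helper. This is my reading of the rules, not the code from the listing. The name effective_frac_digits is hypothetical, and the macros are parenthesized here for safety. The function returns the number of fraction digits to display, or -1 when the result should have no decimal point or fraction.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>

#define FV_INTEGER (-3)   /* non-monetary, no fraction */
#define FV_INT_CUR (-2)   /* international monetary amount */
#define FV_LCL_CUR (-1)   /* local monetary amount */

/* Hypothetical helper: resolve the frac_digits argument against
   the locale, returning -1 if no fraction should be shown. */
int effective_frac_digits(int arg, const struct lconv *lc)
{
    int n;

    assert(lc != NULL);
    if (arg == FV_INT_CUR)
        n = lc->int_frac_digits;    /* from the locale */
    else if (arg == FV_LCL_CUR)
        n = lc->frac_digits;        /* from the locale */
    else
        n = arg;                    /* from the caller */
    return (n < 0 || n == CHAR_MAX) ? -1 : n;
}
```

In the "C" locale both frac_digits and int_frac_digits are CHAR_MAX, so either monetary request yields -1 under this sketch.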
The remaining logic then determines how many separators to insert between characters to the left of the decimal point and proceeds to do so. It uses the function memmove, declared in <string.h>, to move characters further along in the buffer. That guarantees a correct copy even if the source and destination areas overlap. Note that the function replaces the decimal point generated by sprintf (which itself can vary with locale) with a decimal point that depends on the format selected.
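The separator logic can be sketched on its own. This is an illustration of the technique, not the code from the listing, and group_digits is a hypothetical name. It takes a buffer holding only the digits to the left of the decimal point, plus strings in the form of the grouping and thousands_sep members. Each element of grouping gives the size of one group, working leftward. The last element repeats, and CHAR_MAX stops further grouping. The caller must make sure the buffer has room for the inserted separators.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper: insert sep between groups of digits in buf,
   in place, and return buf. */
char *group_digits(char *buf, const char *grouping, const char *sep)
{
    size_t end;             /* digits before this index are ungrouped */
    size_t seplen;
    const char *g = grouping;

    assert(buf != NULL && grouping != NULL && sep != NULL);
    end = strlen(buf);
    seplen = strlen(sep);
    if (seplen == 0)
        return buf;         /* no separator specified */
    while (*g > 0 && *g != CHAR_MAX && (size_t)*g < end) {
        end -= (size_t)*g;
        /* open a gap for the separator; the areas overlap */
        memmove(buf + end + seplen, buf + end, strlen(buf + end) + 1);
        memcpy(buf + end, sep, seplen);
        if (g[1] != '\0')
            ++g;            /* otherwise the last element repeats */
    }
    return buf;
}
```

Given "1340000", a grouping of "\3", and a separator of ",", the buffer ends up holding 1,340,000. An empty grouping string leaves the digits untouched.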
Using _Fmtval
To use _Fmtval, you must first declare it and define its associated macros in your program. You write something like:
    #define FV_INTEGER -3
    #define FV_INT_CUR -2
    #define FV_LCL_CUR -1
    char *_Fmtval(char *, double, int);

Put these lines at the top of your program, or in a separate header file that you include in your program. Now you are in a position to call the function in various ways. For example, the code in Listing 2 might produce the output:
    You ordered 1,340,000 sheets, each 1,204.787 square cm.
    Please remit USD 18,279 to New York office,
    (that's $18,278.85).

Imagine trying to produce this result by inspecting the contents of struct lconv directly. Function _Fmtval obviously has its uses, at least for people who care about locales.
Everyday Uses
You don't have to sell software to Serbo-Croatians to care about readability. I don't know about you, but I have trouble reading messages like:
    12 File(s) 2459648 bytes free

I tend to carve my MS-DOS disks into 20-25 Mb chunks. Then I tend to fill them up. When I see the above message, I have to think thrice about what it means. Is it 24 Mb, 2.4 Mb, or 240 Kb? Untrained civilians have enough sense to write values like this as 2,459,648. One of the most successful software companies in the world persists in leaving out the commas.
An obvious use for _Fmtval is to drop the commas in the right places in a value that you want to display. Compared to matching monetary conventions around the world, this operation is pretty lightweight. The job is just messy enough that you want to package it as a function.
Packaging it as a locale-dependent function has an added advantage. Not everyone in the world uses commas to separate digits to the left of the decimal point. Those who use a comma as the decimal point occasionally use dots as separators. Some folks use spaces. And despite the names thousands_sep and mon_thousands_sep, not everyone groups digits by threes.
There is one small problem with using _Fmtval more widely. In the "C" locale, it doesn't do anything. Both thousands_sep and grouping point at the empty string "" in the "C" locale, which leaves the separator and the group sizes unspecified. (CHAR_MAX plays the same role for numeric members such as frac_digits.) Ask _Fmtval to reformat a non-monetary value and it leaves it unchanged. Ask the function to reformat a monetary value, for that matter, and it doesn't do much that is useful. In the "C" locale, the currency symbols currency_symbol and int_curr_symbol both point at the empty string "".
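A program can detect this situation before it bothers to regroup anything. The check below is a minimal sketch. The name locale_groups_digits is hypothetical. It relies on the C Standard's definition of the "C" locale, in which thousands_sep and grouping are both empty strings.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>

/* Hypothetical helper: return nonzero if the locale says how to
   group the digits of a non-monetary value. */
int locale_groups_digits(const struct lconv *lc)
{
    assert(lc != NULL);
    return lc->thousands_sep[0] != '\0'
        && lc->grouping[0] != '\0'
        && lc->grouping[0] != CHAR_MAX;
}
```

Called as locale_groups_digits(localeconv()) while the "C" locale is in effect, it returns zero.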
I can mostly understand why. Remember that locales were added to Standard C as a way to appease the international community. These folks were unhappy that C contained so many cultural assumptions peculiar to the USA. They were prepared to alter the ISO C Standard in significant ways to accommodate the needs of other cultures. Those of us who wanted a common standard acceptable to all were not in a position to be presumptuous. It was hard to insist that the "C" locale contain any assumptions about a specific culture. We agreed on requiring the dot as a decimal point only because so much existing code depended on that assumption.
What I am hoping for, as a consequence, is a widespread use of the default locale. Perhaps one day it will be commonplace to put the statement setlocale(LC_ALL, ""); at the top of main. It should then also be commonplace that the default locale defines sensibly the various members of struct lconv. In the English-speaking world, that generally means that the member thousands_sep will point at "," and the member grouping will specify groups of three digits.
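Such a startup call might be wrapped as follows. This is a sketch, and use_native_locale is a hypothetical name. The call setlocale(LC_ALL, "") requests the implementation's native locale and returns a null pointer if the request cannot be honored, in which case the sketch falls back to the "C" locale.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical startup helper: select the native locale and
   return the name of the locale now in effect. */
const char *use_native_locale(void)
{
    const char *name = setlocale(LC_ALL, "");

    if (name == NULL)                   /* native locale unavailable */
        name = setlocale(LC_ALL, "C");  /* "C" is always available */
    assert(name != NULL);
    return name;
}
```

Call it once at the top of main, before any code that formats or parses numbers. What the native locale actually contains depends on the implementation and its environment.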
Programs that start off on this foot are in an excellent position to use _Fmtval (or its equivalent) to advantage. I can even hope that programmers will take pity on those of us who like to see commas in large values. Such programmers will make a habit of formatting displayed values by locale-specific rules. Maybe then I can tell how much disk space I have left.