ANSI C provides two functions for locale-dependent string comparison,
strcoll and strxfrm.
strcoll is analogous to strcmp
except that the two strings are compared according to
the LC_COLLATE category of the current locale.
(See the strcoll(3C) manual page
and strcmp on the string(3C) manual page.)
Conceptually, collation occurs in two passes
to obtain an appropriate ordering of accented characters,
two-character sequences that should be treated as one (the
Spanish character ch, for example), and single characters that should
be treated as two (the sharp s in German, for instance).
Because this comparison is not necessarily as inexpensive as strcmp,
the strxfrm function is provided to transform a string
into another string, such that any two transformed strings
can be passed to strcmp to get an ordering
identical to what strcoll would have returned if
passed the two original strings.
You are responsible for keeping track of the strings in both their
transformed and printable forms.
Generally, you should use strxfrm
when a string will be compared a number of times.
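The relationship between the two functions can be checked directly. The following is a minimal sketch, not part of the original text; the helper names sign, xfrm_dup, and orders_agree are hypothetical. It transforms two strings with strxfrm and tests whether strcmp on the results orders them the same way strcoll orders the originals.

```c
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Hypothetical helper: return a newly allocated transformed copy of s,
 * or NULL if malloc fails.  The caller frees the result.
 */
static char *xfrm_dup(const char *s)
{
    size_t n = strxfrm(NULL, s, 0) + 1;   /* length needed, plus the null */
    char *key = malloc(n);

    if (key != NULL)
        (void)strxfrm(key, s, n);
    return key;
}

/*
 * Return 1 if strcmp on the transformed strings orders a and b the same
 * way strcoll orders the originals, 0 if not, -1 on allocation failure.
 */
static int orders_agree(const char *a, const char *b)
{
    char *ka = xfrm_dup(a);
    char *kb = xfrm_dup(b);
    int agree = -1;

    if (ka != NULL && kb != NULL)
        agree = sign(strcoll(a, b)) == sign(strcmp(ka, kb));
    free(ka);
    free(kb);
    return agree;
}
```

Only the signs of the two results are compared, since the magnitudes need not match. A program that has not called setlocale runs in the "C" locale, where strcoll orders strings as strcmp does, so call setlocale(LC_ALL, "") first to exercise a locale's own collation rules.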
The following example uses qsort(3C) and strcoll(3C) to sort lines in a text file:
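#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>
#include <unistd.h>	/* gettxt; <nl_types.h> on some systems */

#define ELEMENTS 1000	/* assumed limits, not given in the original */
#define WIDTH 256

char table[ELEMENTS][WIDTH];

/* qsort passes const void *, so strcoll needs a wrapper */
static int collate(const void *a, const void *b)
{
    return strcoll(a, b);
}

int main(int argc, char **argv)
{
    FILE *fp;
    int nel, i;

    setlocale(LC_ALL, "");

    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, gettxt("progmsgs:2", "Can't open %s\n"), argv[1]);
        exit(2);
    }
    for (nel = 0; nel < ELEMENTS && fgets(table[nel], WIDTH, fp); ++nel)
        ;
    fclose(fp);

    if (nel >= ELEMENTS) {
        fprintf(stderr, gettxt("progmsgs:3", "File too large\n"));
        exit(3);
    }
    qsort(table, nel, WIDTH, collate);
    for (i = 0; i < nel; ++i)
        fputs(table[i], stdout);
    return 0;
}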
The next example does the same thing with a function that uses strxfrm:
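#include <stdlib.h>
#include <string.h>

int compare(const void *p1, const void *p2)
{
    const char *s1 = p1, *s2 = p2;
    char *tmp;
    int result;
    /* sizes of the transformed strings, including the terminating null */
    size_t n1 = strxfrm(NULL, s1, 0) + 1;
    size_t n2 = strxfrm(NULL, s2, 0) + 1;

    if ((tmp = malloc(n1 + n2)) == NULL)
        return strcmp(s1, s2);	/* fall back to a plain comparison */

    (void)strxfrm(tmp, s1, n1);
    (void)strxfrm(tmp + n1, s2, n2);	/* second string follows the first */

    result = strcmp(tmp, tmp + n1);
    free(tmp);
    return result;
}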
Assuming malloc succeeds, the return value of compare(s1, s2) should correspond to the return value of strcoll(s1, s2). Although it is too complicated to show here, it would probably be better to hold onto the transformed strings for subsequent comparisons rather than transforming them each time the function is called. See the strcoll(3C) and strxfrm(3C) manual pages.
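As a reduced sketch of holding onto the transformed strings, the fragment below transforms each string once and then sorts on the cached results. It is an illustration under stated assumptions, not the approach of the original examples: struct entry, entry_cmp, free_keys, and sort_by_collation are hypothetical names, and the caller is assumed to supply an array whose text members already point at the printable strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record pairing a printable string with its transformed key. */
struct entry {
    const char *text;   /* original, printable string (not owned) */
    char *key;          /* strxfrm output, allocated by sort_by_collation */
};

/* qsort callback: compare the cached keys with plain strcmp. */
static int entry_cmp(const void *a, const void *b)
{
    const struct entry *ea = a;
    const struct entry *eb = b;

    return strcmp(ea->key, eb->key);
}

/* Release the keys of the first n entries. */
static void free_keys(struct entry *e, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        free(e[i].key);
        e[i].key = NULL;
    }
}

/*
 * Transform each string once, then sort by the cached keys.
 * Returns 0 on success.  Returns -1 if a key could not be allocated;
 * the keys allocated so far are freed and the array is left unsorted.
 */
static int sort_by_collation(struct entry *e, size_t n)
{
    size_t i, len;

    for (i = 0; i < n; ++i) {
        len = strxfrm(NULL, e[i].text, 0) + 1;
        if ((e[i].key = malloc(len)) == NULL) {
            free_keys(e, i);
            return -1;
        }
        (void)strxfrm(e[i].key, e[i].text, len);
    }
    qsort(e, n, sizeof e[0], entry_cmp);
    return 0;
}
```

Each string is passed to strxfrm once, however many comparisons qsort makes. After a successful call the entries are in collation order, their text members are still available for printing, and the caller releases the keys with free_keys(e, n).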