Communication with arrays.
Strings are pieces of text which can be treated as values for variables. In C a string is represented as some characters enclosed by double quotes.
"This is a string"
A string may contain any character, including special control characters, such as \n, \r, \7 etc...
"Beep! \7 Newline \n..."
· Conventions and Declarations:
· Strings Arrays and Pointers:
· Arrays of Strings:
· Example 21:
· Strings from the user:
· Handling strings:
· Example 22:
· String Input/Output:
· Example 23:
· Questions 20:
Conventions and Declarations
There is an important distinction between a string and a single character in C. The convention is that single characters are enclosed by single quotes, e.g. '*', and have the type char. Strings, on the other hand, are enclosed by double quotes, e.g. "string...", and have the type "pointer to char" (char *) or array of char. Here are some declarations for strings which are given without immediate explanation.
/**********************************************************/
/* */
/* String Declaration */
/* */
/**********************************************************/
#define SIZE 10
char *global_string1;
char global_string2[SIZE];
main ()
{ char *auto_string;
char arraystr[SIZE];
static char *stat_strng;
static char statarraystr[SIZE];
}
Strings, Arrays and Pointers
A string is really an array of characters. It is stored at some place in the memory and is given an end marker which standard library functions can recognize as being the end of the string. The end marker is called the zero (or NULL) byte because it is just a byte which contains the value zero: \0. Programs rarely get to see this end marker, as most functions which handle strings use it or add it automatically.
Strings can be declared in two main ways: one of these is as an array of characters, the other is as a pointer to some pre-assigned array. Perhaps the simplest way of seeing how C stores arrays is to give an extreme example which would probably never be used in practice. Think of how a string called string might be used to store the message "Tedious!". The fact that a string is an array of characters might lead you to write something like:
#include <stdio.h>
#define LENGTH 9
main ()
{ char string[LENGTH];
string[0] = 'T';
string[1] = 'e';
string[2] = 'd';
string[3] = 'i';
string[4] = 'o';
string[5] = 'u';
string[6] = 's';
string[7] = '!';
string[8] = '\0';
printf ("%s", string);
}
This method of handling strings is perfectly acceptable, if there is time to waste, but it is so laborious that C provides a special initialization service for strings, which bypasses the need to assign every single character with a new assignment! There are six ways of assigning constant strings to arrays. (A constant string is one which is actually typed into the program, not one which is typed in by the user.) They are written into a short compilable program below. The explanation follows.
/**********************************************************/
/* */
/* String Initialization */
/* */
/**********************************************************/
char *global_string1 = "A string declared as a pointer";
char global_string2[] = "Declared as an array";
main ()
{ char *auto_string = "initializer...";
static char *stat_strng = "initializer...";
static char statarraystr[] = "initializer....";
/* char arraystr[] = "initializer...."; WAS ILLEGAL in */
/* pre-ANSI C, where an "auto" array could not be */
/* preinitialized. ANSI C (C89) allows it, but... */
char arraystr[20];
printf ("%s %s", global_string1, global_string2);
printf ("%s %s %s", auto_string, stat_strng, statarraystr);
}
/* end */
The details of what goes on with strings can be difficult to get to grips with. It is a good idea to revise pointers and arrays before reading the explanations below. Notice the diagrams too: they are probably more helpful than words.
The first of these assignments is a global, static variable. More correctly, it is a pointer to a global, static array. Static variables are assigned storage space in the body of a program when the compiler creates the executable code. This means that they are saved on disk along with the program code, so they can be initialized at compile time. That is the reason for the old (pre-ANSI) rule which said that only static arrays could be initialized with a constant expression in a declaration. The first statement allocates space for a pointer to an array. Notice that, because the string which is to be assigned to it is typed into the program, the compiler can allocate space for that in the executable file too. In fact the compiler stores the string, adds a zero byte to the end of it and assigns a pointer to its first character to the variable called global_string1.
The second statement works almost identically, with the exception that this time the compiler sees the declaration of a static array, which is to be initialized. Notice that there is no size declaration in the square brackets. This is quite legal: the compiler counts the number of characters in the initialization string and allocates just the right amount of space, filling the string into that space, along with its end marker, as it goes. Remember also that in an expression the name of the array is converted to a pointer to the first character, so the two forms can be used in almost the same way. They are not quite identical, though: the array holds its own copy of the characters, which the program may change, whereas the pointer refers to a string constant which should not be written to.
The third expression is the same kind of thing, only this time the declaration is inside the function main(), so the type is not static but auto. The difference between this and the other two declarations is that this pointer variable is created every time the function main() is called. It is new each time and the same thing holds for any other function in which it might have been defined: when the function is called, the pointer is created and when it ends, it is destroyed. The string which initializes it is stored in the executable file of the program (because it is typed into the text). The compiler returns a value which is a pointer to the string's first character and uses that as a value to initialize the pointer with. This is a slightly roundabout way of defining the string constant. The normal thing to do would be to declare the string pointer as being static, but this is just a matter of style. In fact this is what is done in the fourth example.
The fifth example is again identical, in practice, to the other static types, but is written as an `open' array with an unspecified size.
The sixth example was forbidden in the original (pre-ANSI) C. The reason for this might seem rather trivial, but it was made in the interests of efficiency. The array declared is of type auto: this means that the whole array is created when the function is called and destroyed afterwards. An auto array which is initialized with a string has to be re-initialized every time the array is created: that is, each time the function is called. ANSI C (C89) and every later standard allow the declaration, and the compiler simply copies the characters into the array on each call. The final example shows what an older compiler required instead. Here an auto array of characters is declared (with a size this time, because there is nothing for the compiler to count the size of). There is no single assignment which will fill this array with a string: the programmer has to do it character by character, or call a library function such as strcpy(), so that the cost is made as plain as possible!
Arrays of Strings
In the previous chapter we progressed from one dimensional arrays to two dimensional arrays, or arrays of arrays! The same thing works well for strings which are declared static. Programs can take advantage of C's easy assignment facilities to let the compiler count the size of the string arrays and define arrays of messages. For example here is a program which prints out a menu for an application program:
/*********************************************************/
/* */
/* MENU : program which prints out a menu */
/* */
/*********************************************************/
#include <stdio.h>
char *menutext(); /* declared before use: returns a pointer */
main ()
{ int str_number;
for (str_number = 0; str_number < 13; str_number++)
{
printf ("%s",menutext(str_number));
}
}
/*********************************************************/
char *menutext(n) /* return n-th string ptr */
int n;
{
static char *t[] =
{
" -------------------------------------- \n",
" | ++ MENU ++ |\n",
" | ~~~~~~~~~~~~ |\n",
" | (1) Edit Defaults |\n",
" | (2) Print Charge Sheet |\n",
" | (3) Print Log Sheet |\n",
" | (4) Bill Calculator |\n",
" | (q) Quit |\n",
" | |\n",
" | |\n",
" | Please Enter Choice |\n",
" | |\n",
" -------------------------------------- \n"
};
return (t[n]);
}
Notice the way in which the static declaration works. It is initialized once at compile time, so there is effectively only one statement in this function and that is the return statement. This function retains the pointer information from call to call. The Morse coder program could be rewritten more economically using static strings (see Example 15).
Example Listing
/************************************************/
/* */
/* static string array */
/* */
/************************************************/
/* Morse code program. Enter a number and */
/* find out what it is in Morse code */
#include <stdio.h>
#define CODE 0
/*************************************************/
main ()
{ short digit;
printf ("Enter any digit in the range 0..9");
scanf ("%hd",&digit);
if ((digit < 0) || (digit > 9))
{
printf ("Number was not in range 0..9");
return (CODE);
}
printf ("The Morse code of that digit is ");
Morse (digit);
}
/************************************************/
Morse (digit) /* print out Morse code */
short digit;
{
static char *code[] =
{
"-----", /* index starts at 0: code[0] is the digit 0 */
".----",
"..---",
"...--",
"....-",
".....",
"-....",
"--...",
"---..",
"----.",
};
printf ("%s\n",code[digit]);
}
Strings from the user
All the strings mentioned so far have been typed into a program by the programmer and stored in a program file, so it has not been necessary to worry about where they were stored. Often though we would like to fetch a string from the user and store it somewhere in the memory for later use. It might even be necessary to get a whole bunch of strings and store them all. But how will the program know in advance how much array space to allocate to these strings? The answer is that it won't, but that it doesn't matter at all!
One way of getting a simple, single string from the user is to define an array and to read the characters one by one. An example of this was the Game of Life program in the previous chapter:
· Define the array to be a certain size
· Check that the user does not type in too many characters.
· Use the string in that array.
Another way is to define a static array of characters, as in the following example. The function filename() asks the user to type in a filename, for loading or saving, and returns it to the calling function.
char *filename()
{ static char filenm[25]; /* 24 characters + zero byte */
do
{
printf ("Enter filename :");
scanf ("%24s",filenm);
skipgarb();
}
while (strlen(filenm) == 0);
return (filenm);
}
The array is made static so that it still exists after the function has returned, and it is given room for exactly 24 characters plus a zero byte. (Older programs sometimes obtained the same space by initializing a static pointer with a string of 24 dots and then writing over it. That relies on string constants being writable, which ANSI C does not guarantee.) Notice that the conversion string in scanf prevents the characters from spilling over the bounds of the string. The function strlen() is a standard library function which is described below; it returns the length of a string. skipgarb() is the function which was introduced in chapter 15.
Neither of the methods above is any good if a program is going to be fetching a lot of strings from a user. It just isn't practical to define lots of static strings and expect the user to type into the right size boxes! The next step in string handling is therefore to allocate memory for strings personally: in other words to be able to say how much storage is needed for a string while a program is running. C has special memory allocation functions which can do this, not only for strings but for any kind of object. Suppose then that a program is going to get ten strings from the user. Here is one way in which it could be done:
1. Define one large, static string (or array) for getting one string at a time. Call this a string buffer, or waiting place.
2. Define an array of ten pointers to characters, so that the strings can be recalled easily.
3. Find out how long the string in the string buffer is.
4. Allocate memory for the string.
5. Copy the string from the buffer to the new storage and place a pointer to it in the array of pointers for reference.
6. Release the memory when it is finished with.
The function which allocates memory in C is called malloc() and it works like this:
· malloc() is declared in the standard header stdlib.h, so a program which uses it should contain the statement:
· #include <stdlib.h>
· (Pre-ANSI compilers had no such header, and malloc() had to be declared by hand with char *malloc();)
· malloc() takes one argument, an unsigned integer value of type size_t, telling the function how many bytes of storage to allocate. It returns a generic pointer (type void *) to the first memory location in that storage, which can be assigned to a pointer of any type:
· char *ptr;
· size_t size;
·
· ptr = malloc(size);
·
· The pointer returned has the value NULL if there was no memory left to allocate. This should always be checked.
Because malloc() returns a generic pointer, it can be used for any type of data, not only for strings. The result may be assigned directly to a pointer of the required type, or converted explicitly with the cast operator (which pre-ANSI compilers, where malloc() returned a pointer to a character, required). This method is used for building data structures in C with "struct" types.
malloc() has a complementary function which does precisely the opposite: de-allocates memory. This function is called free(). It is also declared in stdlib.h and it returns no value.
· free() takes one argument: a pointer to a block of memory which has previously been allocated by malloc().
· free (ptr);
·
· The pointer should be declared:
· char *ptr;
·
· There is no return code to test. After the call the block must not be used again. Passing NULL to free() is allowed and does nothing.
An example of how strings can be created using malloc() and free() is given below. First of all, some explanation of Standard Library Functions is useful to simplify the program.
Handling strings
The C Standard Library commonly provides a number of very useful functions which handle strings. Here is a short list of some common ones which are immediately relevant (more are listed in the following chapter). Chances are, a good compiler will support a lot more than those listed below, but, again, it really depends upon the compiler.
strlen()
This function returns an unsigned integer value (of type size_t in ANSI C), which gives the length or number of characters in a string, not including the NULL byte end marker. An example is:
size_t len;
char *string;
len = strlen (string);
strcpy()
This function copies a string from one place to another. Use this function in preference to custom routines: it is set up to handle any peculiarities in the way data are stored. An example is
char *to,*from;
to = strcpy (to,from);
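Here to is a pointer to the place to which the string is to be copied and from is the place where the string is to be copied from. The destination must already have enough space for the whole string, including its end marker: strcpy() does not check.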
strcmp()
This function compares two strings and returns a value which indicates how they compared. An example:
int value;
char *s1,*s2;
value = strcmp(s1,s2);
The value returned is 0 if the two strings were identical. If the strings were not the same, this function indicates the (ASCII) alphabetical order of the two. If s1 > s2, alphabetically, then the value is > 0. If s1 < s2 then the value is < 0. Note that numbers come before letters in the ASCII code sequence and also that upper case comes before lower case.
strstr()
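Tests whether a substring is present in a larger string. It returns a pointer to the first occurrence of s2 inside s1, or NULL if there is none.
char *s1,*s2,*p;
if ((p = strstr(s1,s2)) != NULL)
{
printf("s2 is a substring of s1, starting at offset %d",(int)(p - s1));
}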
strncpy()
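This function is like strcpy, but limits the copy to no more than n characters. If the source string has n or more characters, no zero byte is added to the copy, so the program must add one itself.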
strncmp()
This function is like strcmp, but limits the comparison to no more than n characters.
More string functions are described in the next section along with a host of Standard Library Functions.
Example Listing
This program aims to get ten strings from the user. The strings may not contain any spaces or white space characters. It works as follows:
The user is prompted for a string which he/she types into a buffer. The length of the string is tested with strlen() and a block of memory is allocated for it using malloc(). (Notice that this block of memory is one byte longer than the value returned by strlen(), because strlen() does not count the end of string marker \0.) malloc() returns a pointer to the space allocated, which is then stored in the array called array. Finally the string is copied from the buffer to the new storage with the library function strcpy(). This process is repeated for each of the 10 strings. Notice that the program exits through a low level function called QuitSafely(). The reason for doing this is to exit from the program neatly, while at the same time remembering to perform all a programmer's duties, such as de-allocating the memory which is no longer needed. QuitSafely() uses the function exit() which should be provided as a standard library function. exit() allows a program to end at any point.
/******************************************************/
/* */
/* String storage allocation */
/* */
/******************************************************/
#include <stdio.h>
#include <stdlib.h> /* malloc(), free(), exit() */
#include <string.h> /* strlen(), strcpy() */
#define NOOFSTR 10
#define BUFSIZE 255
#define CODE 0
/******************************************************/
/* Level 0 */
/******************************************************/
main ()
{ char *array[NOOFSTR];
char buffer[BUFSIZE];
int i;
for (i = 0; i < NOOFSTR; i++)
{
array[i] = NULL; /* nothing allocated yet */
}
for (i = 0; i < NOOFSTR; i++)
{
printf ("Enter string %d :",i);
if (scanf ("%254s",buffer) != 1) /* leave room for \0 */
{
QuitSafely (array); /* no more input */
}
array[i] = malloc(strlen(buffer)+1);
if (array[i] == NULL)
{
printf ("Can't allocate memory\n");
QuitSafely (array);
}
strcpy (array[i],buffer);
}
for (i = 0; i < NOOFSTR; i++)
{
printf ("%s\n",array[i]);
}
QuitSafely(array);
}
/******************************************************/
/* Snakes & Ladders! */
/******************************************************/
QuitSafely (array) /* Quit & de-alloc memory */
char *array[NOOFSTR];
{ int i;
for (i = 0; i < NOOFSTR; i++)
{
free (array[i]); /* free(NULL) does nothing */
}
exit (CODE);
}
/* end */
String Input/Output
Because strings are recognized to be special objects in C, some special library functions for reading and writing are provided for them. These make it easier to deal with strings, without the need for special user-routines. There are four of these functions:
gets()
puts()
sprintf()
sscanf()
gets()
This function fetches a string from the standard input file stdin and places it into some buffer which the programmer must provide.
#define SIZE 255
char *strptr, buffer[SIZE];
strptr = gets(buffer);
If the routine is successful in getting a string, it returns the value buffer to the string pointer strptr. Otherwise it returns NULL (==0). The advantage of gets() over scanf("%s"..) is that it will read spaces in strings, whereas scanf() usually will not. gets() quits reading when it finds a newline character: that is, when the user presses RETURN.
NOTE: there are valid concerns about using this function. gets() has no way of knowing how large the buffer is, so a line which is longer than the buffer overruns it, and this can be exploited by system attackers to corrupt memory. For this reason gets() was removed from the C standard in 2011 (C11). In order to write more secure code, use fgets() instead.
puts()
puts() sends a string to the output file stdout, until it finds a NULL end of string marker. The NULL byte is not written to stdout, instead a newline character is written.
char *string;
int returncode;
returncode = puts(string);
puts() returns an integer value. The only value which is guaranteed is the one for an error: returncode == EOF if the string could not be written. On success the value is simply non-negative.
sprintf()
This is an interesting function which works in almost the same way as printf(), the exception being that it prints to a string! In other words it treats a string as though it were an output file. This is useful for creating formatted strings in the memory. On most systems it works in the following way:
int n;
char *sp;
n = sprintf (sp, "control string", parameters, values);
n is an integer which is the number of characters printed. sp is a pointer to the destination string or the string which is to be written to. Note carefully that this function does not perform any check on the output string to make sure that it is long enough to contain the formatted output. If the string is not large enough, then a crash could be in store! This can also be considered a potential security problem, since buffer overflows can be used to capture control of important programs. Note that on System V Unix systems the sprintf function returns a pointer to the start of the printed string, breaking the pattern of the other printf functions. To make such an implementation compatible with the usual form you would have to write:
n = strlen(sprintf(parameters......));
sscanf()
This function is the complement of sprintf(). It reads its input from a string, as though it were an input file.
int n;
char *sp;
n = sscanf (sp,"control string", pointers...);
sp is a pointer to the string which is to be read from. The string must be NULL terminated (it must have a zero-byte end marker '\0'). sscanf() returns an integer value which holds the number of items successfully matched or EOF if an end of file marker was read or an error occurred. The conversion specifiers are identical to those for scanf().
Example Listing
/************************************************/
/* */
/* Formatted strings */
/* */
/************************************************/
/* program rewrites s1 in reverse into s2 */
#include <stdio.h>
#include <stdlib.h> /* exit() */
#define SIZE 20
#define CODE 0
/************************************************/
main ()
{ static char *s1 = "string 2.3 55x";
static char s2[SIZE+1]; /* writable, with room for \0 */
char ch, string[SIZE];
int i,n;
float x;
sscanf (s1,"%19s %f %d %c", string, &x, &i, &ch);
n = sprintf (s2,"%c %d %f %s", ch, i, x, string);
if (n > SIZE)
{
printf ("Error: string overflowed!\n");
exit (CODE);
}
puts (s2);
}
Questions
1. What are the two main ways of declaring strings in a program?
2. How would you declare a static array of strings?
3. Write a program which gets a number between 0 and 9 and prints out a different message for each number. Use a pre-initialized array to store the strings.