.TH SPLIT 3 "8 Dec 1990"
.BY "Zoology"
.SH NAME
split \- split string into fields
.SH SYNOPSIS
.nf
.B "int split(string, fields, nfields, separators)"
.B "char *string;"
.B "char *fields[];"
.B "int nfields;"
.B "char *separators;"
.fi
.SH DESCRIPTION
.I Split
splits the
.I string
into
.I fields
according to the
.IR separators ,
much in the manner of
.IR awk (1)'s
field facilities.
.I String
has
\s-1NUL\s0s
written into it to do the splitting,
and pointers to the resulting strings are placed into the
.I fields
array, up to the number specified by
.IR nfields .
If the number of fields found exceeds
.IR nfields ,
the writing of
\s-1NUL\s0s
stops, so that the last entry in
.I fields
points to the remainder of the string.
.PP
The returned value is the number of fields present in
.IR string ,
which may exceed
.IR nfields .
If the returned value is less than
.IR nfields ,
the contents of the remaining entries in
.I fields
(up to the limit set by
.IR nfields )
are undefined.
(Note, in particular, that
.I split
does not terminate the list of fields with a
.SM NULL
pointer.)
.PP
If
.I separators
is a string of length 1, that character is the separator:
each occurrence of it separates the current field from a new field,
which may be of length 0 if another separator follows immediately.
.PP
If
.I separators
is a string of length >1, all the characters in it are separators:
any sequence of characters composed entirely of separators
separates the current field from a new field,
which is of length 0 only if it is at the beginning or end of the
.IR string .
Characters may be duplicated in
.IR separators ;
in particular, a
.I separators
string consisting of the same character twice
gives fields separated by sequences of that character.
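.PP
The difference between a one-character and a longer separators string can be sketched in C as follows.
This is a minimal illustration under stated assumptions, not the library's source:
the name split_sketch is hypothetical, it uses ANSI prototypes rather than the K&R declarations shown in the SYNOPSIS,
and it assumes that separators is non-empty.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch of the behaviour described above; not the
 * library's own code.  Handles only separators of length >= 1.
 */
int split_sketch(char *string, char *fields[], int nfields,
                 const char *separators)
{
    int runs;           /* nonzero: a run of separators counts as one */
    int count = 0;      /* fields seen so far, may exceed nfields */
    char *p = string;

    assert(separators[0] != '\0');  /* empty separators not covered here */
    runs = (strlen(separators) > 1);

    if (*p == '\0')
        return 0;                   /* an empty string has 0 fields */

    for (;;) {
        char *end;

        if (count < nfields)
            fields[count] = p;      /* record where this field starts */
        count++;

        end = p + strcspn(p, separators);   /* first separator, or end */
        if (*end == '\0')
            break;                  /* no separator left: last field */

        p = end + 1;
        if (runs)
            p += strspn(p, separators);     /* swallow the rest of the run */

        /* Write a NUL only while a slot remains for the next field, so
         * the last slot ends up pointing at the unsplit remainder. */
        if (count < nfields)
            *end = '\0';
    }
    return count;
}
```

Under these assumptions, splitting "a,,b" on "," gives three fields, the middle one of length 0, while splitting it on ",," gives two.
With nfields set to 2, splitting "x:y:z" on ":" returns 3 and leaves fields[1] pointing at "y:z".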
.PP
As a special case, if
.I separators
is a string of length 0,
sequences of white-space characters (here defined as blanks and tabs)
are the separators, and sequences of separators at the beginning or end of the
.I string
are ignored, so there can be no fields of length 0.
.PP
In all cases, an empty
.I string
(or, in the case of a
.I separators
of length 0, a
.I string
consisting only of white space) has 0 fields.
.SH SEE ALSO
awk(1), regexp(3), strtok(3)
.SH HISTORY
Written at University of Toronto by Henry Spencer,
with contributions from Geoff Collyer and Mark Moraes.
.SH BUGS
Too many different decisions are bundled into the single
.I separators
argument.
Arguably it should be a regular expression instead.
.PP
Specifying several characters as
.I separators
is rather inefficient;
if speed is a priority in such a case, it is better to custom-build code
that knows about the particular separators desired.
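.PP
As one possible reading of the advice above, a splitter built for a single fixed separator set might look like the sketch below.
The name split_ws and its three-argument interface are hypothetical.
It hard-wires blanks and tabs, ignores leading and trailing white space as the empty-separators case does,
and compares characters directly instead of scanning a separators string.
Whether this is actually faster on a given system is not measured here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical hand-built splitter, hard-wired to blanks and tabs.
 * A sketch only; not part of the library.
 */
int split_ws(char *string, char *fields[], int nfields)
{
    char *p = string;
    int count = 0;      /* fields seen so far, may exceed nfields */

    assert(string != NULL);

    while (*p == ' ' || *p == '\t')
        p++;                        /* ignore leading white space */

    while (*p != '\0') {
        char *end;

        if (count < nfields)
            fields[count] = p;      /* record where this field starts */
        count++;

        while (*p != '\0' && *p != ' ' && *p != '\t')
            p++;                    /* direct compares find the field's end */
        end = p;

        while (*p == ' ' || *p == '\t')
            p++;                    /* swallow the separating run */

        /* Terminate the field if a slot remains for the next one, or if
         * only trailing white space followed and nothing overflowed. */
        if (count < nfields || (*p == '\0' && count == nfields))
            *end = '\0';
    }
    return count;
}
```

Under these assumptions, "  a<tab>b  " yields the two fields "a" and "b", and a string of white space alone yields 0 fields.
With nfields set to 2, "a b c" returns 3 and leaves fields[1] pointing at "b c".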