The sort(1) utility sorts text files by lines.
Comparisons are based on one or more sort keys extracted from each
line of input, and are performed lexicographically. By default, if
keys are not given, sort(1) regards each input line as a
single field.
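As a minimal sketch of the default behavior, assuming a POSIX-style sort is on the PATH, the following sorts a few made-up lines with no keys given. The sample data and the LC_ALL=C setting (which pins byte-order comparison) are assumptions for illustration, not part of the text above.

```shell
# Hypothetical input lines; LC_ALL=C is an assumption that pins byte-order comparison.
sorted=$(printf 'pear\napple\nfig 2\nfig 10\n' | LC_ALL=C sort)

# Each whole line is the key, so "fig 10" precedes "fig 2" ('1' sorts before '2').
printf '%s\n' "$sorted"
```

Because the comparison is lexicographic rather than numeric, the digits are compared character by character.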
The following options are available:
-c
Check that the single input file is sorted. If the file is not
sorted, sort(1) produces the appropriate error messages and
exits with code 1; otherwise, sort(1) exits with code 0.
sort(1) -c produces no other output.
-m
Merge only; the input files are assumed to be pre-sorted.
-o output
The argument given is the name of an output file to be
used instead of the standard output. This file can be the same as
one of the input files.
-u
Unique: suppress all but one in each set of lines having equal
keys. If used with the -c option, check that there are no
lines with duplicate keys.
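The options above can be exercised with a short script. This is only a sketch: the file names and contents are hypothetical, LC_ALL=C is an assumption that pins byte-order comparison, and the exit codes shown are those described for -c above.

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf 'apple\ncherry\n' > "$dir/a.txt"
printf 'banana\ncherry\n' > "$dir/b.txt"

# -c: exit status 0 means the file is already sorted.
check_status=0
LC_ALL=C sort -c "$dir/a.txt" || check_status=$?

# -c on unsorted input: exit status 1, diagnostic on standard error.
printf 'b\na\n' > "$dir/d.txt"
unsorted_status=0
LC_ALL=C sort -c "$dir/d.txt" 2>/dev/null || unsorted_status=$?

# -m with -u: merge two pre-sorted files, keeping one copy of equal lines.
merged=$(LC_ALL=C sort -m -u "$dir/a.txt" "$dir/b.txt")

# -o: the output file may be the same as an input file.
printf 'pear\nfig\n' > "$dir/c.txt"
LC_ALL=C sort -o "$dir/c.txt" "$dir/c.txt"
in_place=$(cat "$dir/c.txt")

printf '%s\n' "$merged"
rm -rf "$dir"
```

Using -o rather than shell redirection matters when the output file is also an input, because a redirection would truncate the file before sort reads it.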
The following options override the default ordering rules. When
ordering options appear independent of key field specifications,
the requested field ordering rules are applied globally to all sort
keys. When attached to a specific key (see -k), the ordering
options override all global ordering options for that key.
-d
Only blank-space and alphanumeric characters are used in making
comparisons.
-f
Considers all lowercase characters that have uppercase
equivalents to be the same for purposes of comparison.
-i
Ignore all non-printable characters.
-n
An initial numeric string, consisting of optional blank space,
an optional minus sign, and zero or more digits (including a decimal
point), is sorted by arithmetic value. (The -n option no
longer implies the -b option.)
-r
Reverse the sense of comparisons.
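The ordering options, applied globally, can be compared side by side. The input lines below are made up for illustration, and LC_ALL=C is an assumption so that the unmodified ordering is plain byte order.

```shell
# Without -n, numbers compare character by character.
lexical=$(printf '10\n9\n-3\n2.5\n' | LC_ALL=C sort)

# -n: the leading numeric string is compared by arithmetic value.
numeric=$(printf '10\n9\n-3\n2.5\n' | LC_ALL=C sort -n)

# -r reverses the sense of the comparison; here, descending numeric order.
descending=$(printf '10\n9\n-3\n2.5\n' | LC_ALL=C sort -n -r)

# -f: lowercase and uppercase letters compare as the same.
folded=$(printf 'cherry\napple\nBanana\n' | LC_ALL=C sort -f)

# -d: only blanks and alphanumerics take part, so "a-c" compares as "ac".
dictionary=$(printf 'a-c\nab\n' | LC_ALL=C sort -d)

printf '%s\n' "$numeric"
```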
The treatment of field separators can be altered using the
following options:
-b
Ignores leading blank space when determining the start and end
of a restricted sort key. A -b option specified before the
first -k option applies globally to all -k options.
Otherwise, the -b option can be attached independently to
each field argument of the -k option (discussed later
in this topic). The -b option has no effect unless key
fields are specified.
-t char
In this case, char is used as the field separator
character. The initial char is not considered to be part of
a field when determining key offsets. Each occurrence of
char is significant (for example, two consecutive occurrences of
char delimit an empty field). If -t is not specified, blank-space
characters are used as default field separators, and any number of
contiguous blanks counts as a single field separator.
-T char
In this case, char is used as the record-separator
character. The default record separator is newline (one line is one
record). This should be used with discretion;
-T<alphanumeric> usually produces undesirable
results.
-k field1[,field2]
Designates the starting position, field1, and optional
ending position, field2, of a key field. Counting starts at
1. The -k option replaces the obsolescent options
+pos1 and -pos2.
You can specify more than one sort key by giving
multiple -k options. The second key is used only
to order records whose first keys compare equal, and
so on.
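A sketch of -t combined with several -k options follows. The colon-separated records (name, department, age) are hypothetical, and LC_ALL=C is an assumption that pins byte-order comparison. The -T option is not shown here.

```shell
# Hypothetical records: name:department:age
records='carol:ops:41
alice:dev:30
bob:dev:25
dave:ops:35'

# First key: field 2 only. Second key: field 3 only, compared numerically.
# The second key is consulted only where the first keys are equal.
by_dept_age=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2 -k 3,3n)

# Two adjacent separators delimit an empty field, which sorts before "z".
empty_first=$(printf 'b::1\na:z:2\n' | LC_ALL=C sort -t : -k 2,2)

printf '%s\n' "$by_dept_age"
```

Attaching n to the second key alone leaves the first key under the default lexicographic rule, as described for per-key ordering options.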
The following operands are available:
File
The path name of a file to be sorted, merged, or checked. If no
file operands are specified, or if a file operand is -, the
standard input is used.
Field
A field is defined as a minimal sequence of characters followed
by a field separator or a newline character. By default, the first
blank space of a sequence of blank spaces acts as the field
separator. All blank spaces in a sequence of blank spaces are
considered as part of the next field; for example, all blank spaces
at the beginning of a line are considered to be part of the first
field.
Fields are specified by the -k field1[,field2] argument. A
missing field2 argument defaults to the end of a line.
The arguments field1 and field2 have the form
m.n, optionally followed by one or more of the option letters
b, d, f, i, n, r. A
field1 position specified by m.n
(m, n > 0) is interpreted as the nth character
in the mth field. A missing .n in field1 means
.1, indicating the first character of the mth field. If the
-b option is in effect, n is counted from the first
non-blank character in the mth field; m.1b refers to
the first non-blank character in the mth field.
A field2 position specified by m.n is interpreted
as the nth character (including separators) of the
mth field. A missing .n indicates the last character
of the mth field; m = 0 designates the end of a line.
Thus the option -k v.x,w.y is synonymous with the obsolescent
option +v-1.x-1 -w-1.y; when y is omitted, -k v.x,w is
synonymous with +v-1.x-1 -w+1.0.
The obsolescent +pos1 -pos2 option is
still supported, except for -w.0b, which has no -k
equivalent.
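The field and character positions described above can be sketched as follows. The input lines are made up, LC_ALL=C is an assumption that pins byte-order comparison, and only the -k form is shown, not the obsolescent one.

```shell
# m.n positions: the key is characters 2 through 3 of field 1 ("bc" and "ab").
chars=$(printf 'xbc:1\nxab:2\n' | LC_ALL=C sort -t : -k 1.2,1.3)

# With default separators, leading blanks belong to the field that follows,
# so field 2 is "  b" on the first line and " a" on the second.
no_b=$(printf 'x  b\ny a\n' | LC_ALL=C sort -k 2,2)

# With b attached, the key starts at the first non-blank character ("b" and "a").
with_b=$(printf 'x  b\ny a\n' | LC_ALL=C sort -k 2b,2)

# A missing field2 extends the key to the end of the line (" 1 z" and " 1 y").
to_eol=$(printf 'a 1 z\nb 1 y\n' | LC_ALL=C sort -k 2)

printf '%s\n' "$with_b"
```

The difference between the second and third commands shows why the b modifier is usually wanted when keys follow runs of blanks.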
The current sort(1) command uses lexicographic radix
sorting, which requires that sort keys be kept in memory (as
opposed to previous versions which used quick and merge sorts and
did not). Performance therefore depends highly on the efficient
choice of sort keys, and the -b option and the field2
argument of the -k option should be used whenever possible.
Similarly, sort(1) -k1f is equivalent to
sort(1) -f and might take twice as long.
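The equivalence of the two forms can be checked on a small, made-up input. This sketch compares only the output, not the running time, and assumes LC_ALL=C for a fixed byte order.

```shell
# Global case folding versus the same rule attached to a key that
# spans the whole line; both should order the lines identically.
global=$(printf 'cherry\napple\nBanana\n' | LC_ALL=C sort -f)
keyed=$(printf 'cherry\napple\nBanana\n' | LC_ALL=C sort -k 1f)

[ "$global" = "$keyed" ] && echo "same order"
```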