The sort command does just what its name says: it sorts its input. Input can be a list of files, standard input, or files combined with standard input. The first example uses this simple file, shopping.txt, containing a list of items:
chicken
fish
sour cream
bread crumbs
milk
eggs
bread
sinkers
fishing hooks
Issuing the sort command on this file:
sort shopping.txt
produces the following output:
bread
bread crumbs
chicken
eggs
fish
fishing hooks
milk
sinkers
sour cream
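The three kinds of input mentioned at the start can be sketched as follows. This is a minimal sketch that recreates shopping.txt in a scratch directory; the second file, extra.txt, and its contents are made up for illustration. A lone - used as a file name tells sort to read standard input at that position.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1
printf '%s\n' chicken fish 'sour cream' 'bread crumbs' milk eggs bread sinkers 'fishing hooks' > shopping.txt
printf '%s\n' butter yeast > extra.txt   # hypothetical second list

# 1. One or more files: all of their lines are sorted together.
from_files=$(sort shopping.txt extra.txt)

# 2. Standard input only.
from_stdin=$(sort < shopping.txt)

# 3. A file plus standard input: "-" stands for standard input.
mixed=$(printf '%s\n' apples | sort - shopping.txt)

printf '%s\n' "$mixed"
```

In the third form the piped line "apples" is merged into the same sorted output as the lines from shopping.txt.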
Sort presents the items in alphanumeric order and takes case into account. Note that symbols rank first. So if this list is passed to sort:
flounder
2lb sinker
5 bobbers
Strike Caster Reel
swivels
Three minnows
#zee banjo minnow
The output would be:
#zee banjo minnow
2lb sinker
5 bobbers
Strike Caster Reel
Three minnows
flounder
swivels
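Notice that the output starts with symbols, then numbers, and finally moves to the alphabet, ranking upper case letters before lower case. This is the byte order used in the C (POSIX) locale; other locale settings may order the lines differently.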
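The symbols, digits, upper case, lower case order is plain byte order, which is what sort uses in the C locale. A minimal sketch, assuming a scratch file named tackle.txt (the name is made up), that pins the locale for a single command so the result does not depend on the system settings:

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' flounder '2lb sinker' '5 bobbers' 'Strike Caster Reel' swivels 'Three minnows' '#zee banjo minnow' > tackle.txt

# LC_ALL=C forces byte-order comparison for this one command only.
byte_order=$(LC_ALL=C sort tackle.txt)
printf '%s\n' "$byte_order"
```

Without LC_ALL=C the result depends on the locale of the shell, so the same file may sort differently on another machine.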
There are a number of options to control how sort behaves. The -d or --dictionary-order option sorts the output considering only blanks and alphanumeric characters; it ignores symbols. Take this list:
1000
#bannana
#apple
 zinger
02
20
A regular sort on this list produces the following output (note there is a space before the “z” in “zinger”):
 zinger
#apple
#bannana
02
1000
20
But sort executed with the -d option produces this output:
 zinger
02
1000
20
#apple
#bannana
Sort no longer ranks the symbols first. The -b or --ignore-leading-blanks option ignores leading blanks when sorting, which puts “ zinger” at the bottom of the list:
#apple
#bannana
02
1000
20
 zinger
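The three orderings above can be reproduced with the sketch below. It assumes the C locale, pinned with LC_ALL=C, because other locales may already ignore punctuation on their own. The file name mixed.txt is made up.

```shell
cd "$(mktemp -d)" || exit 1
# Note the leading space in ' zinger'.
printf '%s\n' 1000 '#bannana' '#apple' ' zinger' 02 20 > mixed.txt

# Plain sort: the space sorts before '#', which sorts before the digits.
plain=$(LC_ALL=C sort mixed.txt)

# -d: '#' is skipped, so the digits come before apple and bannana.
dict=$(LC_ALL=C sort -d mixed.txt)

# -b: the leading space is skipped, so zinger is compared by its 'z'.
noblank=$(LC_ALL=C sort -b mixed.txt)

printf '%s\n' "$noblank"
```

The options only change how lines are compared; the lines themselves are printed unchanged, so the leading space and the # characters still appear in the output.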
The -f or --ignore-case option sorts alphanumerically but, as its name states, ignores case. A regular sort on the following list:
bannana
Apple
Carrot
orange
Grape
produces the following output:
Apple
Carrot
Grape
bannana
orange
But with the -f option the list is sorted in this manner:
Apple
bannana
Carrot
Grape
orange
The entry “bannana” occurs after “Apple” as the case of the items is ignored.
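A short sketch of the case-insensitive sort, using the fruit list from above. The second command adds -u (--unique), an option not covered in this article, to show one practical consequence: with -f, lines that differ only in case count as duplicates.

```shell
# The fruit list from above, piped straight into sort.
folded=$(printf '%s\n' bannana Apple Carrot orange Grape | LC_ALL=C sort -f)
printf '%s\n' "$folded"

# With -u added, lines that differ only in case are treated as equal,
# so only one of the three spellings is printed.
unique=$(printf '%s\n' apple Apple APPLE | LC_ALL=C sort -f -u)
printf '%s\n' "$unique"
```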
The -r, or --reverse, option reverses the sort order. So this list:
Apple
Carrot
Grape
bannana
orange
With -r becomes:
orange
bannana
Grape
Carrot
Apple
Sort has a month sorting option, -M or --month-sort, that sorts a list of months into calendar order. Given this list:
April
Jun
May
January
Dec
february
Issuing sort -M produces the following output:
January
february
April
May
Jun
Dec
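The month sort can be tried with the sketch below. It assumes GNU sort in the C locale, where the first three letters of each line are matched against the English month abbreviations regardless of case. The extra line "unknown" is made up to show that a line that is not a month name sorts before all real months.

```shell
# The month list from above plus one line that is not a month.
months=$(printf '%s\n' April Jun May January Dec february unknown | LC_ALL=C sort -M)
printf '%s\n' "$months"
```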
There are a few other options to sort that determine the output:
- -h or --human-numeric-sort
- -g or --general-numeric-sort
- -n or --numeric-sort
- -i or --ignore-nonprinting
- -V or --version-sort
Human numeric sort first compares the sign of the number (negative, zero, or positive) and then looks at whether there is a suffix. Suffixes can be one of:
- K or k
- M
- G
- T
- P
- E
- Z
- Y
Note that the case of the suffix matters and that the suffix is compared before the numeric value. Given this list:
1G
1042M
15
-32P
The output of sort -h on this list would be:
-32P
15
1042M
1G
1G sorts after 1042M even though 1042M represents the larger value.
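The human numeric example can be reproduced with this sketch, which assumes GNU sort (the -h option is a GNU extension). The variable name sizes is only for illustration.

```shell
# Sign first, then suffix (none, M, G, ...), then the number itself.
sizes=$(printf '%s\n' 1G 1042M 15 -32P | LC_ALL=C sort -h)
printf '%s\n' "$sizes"
```

Because the suffix is compared before the number, this option is best suited to input where each value is already written with the largest fitting suffix, which is how tools such as du -h and ls -lh print sizes.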
General numeric sort, -g or --general-numeric-sort, follows a different rule set from standard numeric sort. It converts the leading part of each line to a long double-precision floating point number and treats lines that do not start with numbers as equal. Given this list:
15
+12
zeta
-32
5.8880
0
alpha
A regular numeric sort, sort -n, does not recognize a leading plus sign, so it groups +12 with 0 and the non-numeric lines and produces this list:
-32
+12
0
alpha
zeta
5.8880
15
While a general numeric sort, sort -g, produces this list:
alpha
zeta
-32
0
5.8880
+12
15
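Another place where the two numeric sorts differ is exponent notation. A minimal sketch, assuming GNU sort: -n reads 1e3 as the number 1 followed by text, while -g reads it as 1000.

```shell
# -n stops reading the number at the 'e', so 1e3 counts as 1.
n_sorted=$(printf '%s\n' 1e3 20 5 | LC_ALL=C sort -n)
printf '%s\n' "$n_sorted"

# -g parses the line as a floating point number, so 1e3 counts as 1000.
g_sorted=$(printf '%s\n' 1e3 20 5 | LC_ALL=C sort -g)
printf '%s\n' "$g_sorted"
```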
There is a random option to sort, -R or --random-sort:
sort -R some_file
This does exactly what you think: it randomizes the order of the output.
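One detail worth checking: GNU sort -R orders lines by a random hash of each line, so identical lines always end up next to each other. The sketch below demonstrates this with four lines that hold only two distinct values; the variable names are made up.

```shell
# Four lines, two distinct values.
shuffled=$(printf '%s\n' a b a b | sort -R)
printf '%s\n' "$shuffled"

# Identical lines stay adjacent, so uniq finds exactly two runs.
runs=$(printf '%s\n' "$shuffled" | uniq | wc -l)
printf '%s\n' "$runs"
```

When duplicates should be scattered as well, the separate shuf command from GNU coreutils shuffles lines without grouping them.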
Version sorting acts a bit differently from the previously mentioned sorts. A version sort compares embedded index and version numbers as numbers rather than character by character. For instance, given a directory listing of these files:
myapp-012.tar.gz
myapp-012b.tar.gz
myapp-013.tar.gz
myapp-0013b.tar.gz
A normal sort would produce the following list:
myapp-0013b.tar.gz
myapp-012.tar.gz
myapp-012b.tar.gz
myapp-013.tar.gz
Whereas sort -V would produce:
myapp-012.tar.gz
myapp-012b.tar.gz
myapp-013.tar.gz
myapp-0013b.tar.gz
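The two listings can be reproduced with the sketch below, which assumes GNU sort (the -V option is a GNU extension) and feeds the file names in with printf instead of reading a real directory. The digits are compared as numbers with leading zeros ignored, which is why 0013b lands after 013.

```shell
# Byte-order sort: "0013b" comes first because '0' sorts before '1'.
normal=$(printf '%s\n' myapp-012.tar.gz myapp-012b.tar.gz myapp-013.tar.gz myapp-0013b.tar.gz | LC_ALL=C sort)
printf '%s\n' "$normal"

# Version sort: 12 < 13, and a bare number sorts before the same number plus "b".
versioned=$(printf '%s\n' myapp-012.tar.gz myapp-012b.tar.gz myapp-013.tar.gz myapp-0013b.tar.gz | LC_ALL=C sort -V)
printf '%s\n' "$versioned"
```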
As shown earlier, a reverse sort is done with -r or --reverse. This option can be combined with any of the other options listed above to reverse the resulting order.
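Two small sketches of combining -r with another option. The input values are made up for illustration.

```shell
# Numeric sort, largest value first.
numbers=$(printf '%s\n' 3 10 2 | sort -n -r)
printf '%s\n' "$numbers"

# Month sort, latest month first.
months_rev=$(printf '%s\n' Jun Dec January | LC_ALL=C sort -M -r)
printf '%s\n' "$months_rev"
```

The single-letter options can also be written together, for example sort -nr.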
These are the basic options to sort. A last note about sort: the sort type can also be specified using the --sort=WORD switch, where WORD is one of the following:
- general-numeric
- human-numeric
- month
- numeric
- random
- version
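As a quick check that the long form selects the same sort type as the short option, this sketch runs the same made-up input through sort -n and sort --sort=numeric. It assumes GNU sort.

```shell
# Short option.
short_form=$(printf '%s\n' 10 9 100 | sort -n)

# Equivalent long form using --sort=WORD.
long_form=$(printf '%s\n' 10 9 100 | sort --sort=numeric)

printf '%s\n' "$long_form"
```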
Sort is a handy utility for managing lists. Combined with other commands like uniq, cut, and grep, it can produce pertinent data in a format that can be processed quickly.
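One way such a pipeline might look is sketched below. The file events.txt and its contents are made up for illustration: each line holds a user name and an action. The pipeline keeps the login lines, cuts out the user name, and counts how often each name occurs, busiest user first.

```shell
cd "$(mktemp -d)" || exit 1
# Hypothetical event log: "<user> <action>" per line.
cat > events.txt <<'EOF'
alice login
bob login
alice logout
alice login
carol login
bob login
alice login
EOF

# grep keeps the login lines, cut keeps the first field,
# sort groups equal names so uniq -c can count them,
# and the final sort -n -r puts the highest count first.
report=$(grep ' login$' events.txt | cut -d ' ' -f 1 | sort | uniq -c | sort -n -r)
printf '%s\n' "$report"
```

The first sort is needed because uniq only collapses lines that are next to each other.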