Frequency counts in a Shell

For command-line junkies, a useful idiom to learn is

  list-of-something | sort | uniq -c | sort -nr
              

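which gives a frequency count of the items in list-of-something, sorted in descending order of count. The first sort matters: uniq -c only counts adjacent identical lines, so the input has to be grouped before it is counted.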

For example, to show the most common four-letter prefixes of the words in the system dictionary:

  cat /usr/share/dict/words |\
     cut -c 1-4 |\
     sort | uniq -c | sort -nr |\
     head
              

which will show something like the following (the exact counts depend on the installed word list)

  2043 over
  1334 unde
  1323 inte
  1078 anti
  1000 supe
   951 semi
   731 unco
   700 poly
   648 para
   618 peri
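
If you use the idiom often, it can be wrapped in a small shell function. The sketch below is only an illustration: the name freq is made up here, not a standard command, and the six-word input is sample data chosen so the result is easy to check by eye.

```shell
# freq: hypothetical helper wrapping the idiom; reads lines on stdin.
freq() {
    # sort groups identical lines, uniq -c prefixes each group with its
    # count, and sort -nr orders the groups by count, largest first.
    sort | uniq -c | sort -nr
}

# Sample input: apple appears 3 times, pear twice, fig once.
printf '%s\n' apple pear apple fig pear apple | freq
```

This should list apple first with a count of 3, then pear with 2, then fig with 1. The amount of padding in front of each count differs between uniq implementations, so do not rely on the exact column width when parsing the output.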
              

The classic story about this idiom involves Doug McIlroy and Donald Knuth: see "More Shell, Less Egg".