4. Pipes

Earlier, we mentioned the possibility of redirecting output to a file. Another very powerful redirection method is the pipe, used like this:

$ cmd1 | cmd2 | cmd3 | cmd4

Instead of being redirected to a file, the output of cmd1 is used directly as the input of cmd2, the output of cmd2 as the input of cmd3, and so on.
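
To make the difference concrete, here is a minimal sketch that produces the same result twice: once through a temporary file and once through a pipe. The sample words and the variable names (tmpfile, via_file, via_pipe) are made up for illustration.

```shell
# Two-step version: write the output to a temporary file, then read it back.
tmpfile=$(mktemp)
printf 'pear\napple\nfig\n' > "$tmpfile"
via_file=$(sort < "$tmpfile")
rm -f "$tmpfile"

# Pipe version: sort reads the output of printf directly, no file needed.
via_pipe=$(printf 'pear\napple\nfig\n' | sort)

# Both approaches give the same sorted list.
[ "$via_file" = "$via_pipe" ] && echo "same output"
```

The pipe version needs no cleanup, which is one reason pipes are usually preferred for intermediate results.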

sort, grep, sed, awk, uniq, head, tail

In this section, you’ll see some commands that are useful in the middle of pipes. Remember to use the man pages to find options!

  1. Compare:

    $ echo -e "d\nb\nc\na"
    $ echo -e "d\nb\nc\na" | sort
    
  2. Sort these lines by numerical value:

    $ echo -e "3\n1\n2\n12"
    
  3. Copy the file /usr/share/dict/words to your working directory, then use grep to find out which words contain the string physics:

    $ grep physics words
    
  4. Alternatively, grep can also be used as part of a pipe:

    $ cat words | grep physics
    
  5. How many entries in words do not contain physics? Let grep do the count for you.

  6. Create a file called physicswords that contains the physics words twice. Run sort on it. Try inserting uniq into the pipe before or after the sort.

  7. Try:

    $ sort physicswords | uniq | head -3
    

    Do the same with tail -3.

  8. sed is a very powerful command, and worth spending more time on. It is most often used for text replacement using the s/// construct:

    $ cat physicswords | sed 's/physics/foo/'
    
  9. A similarly comprehensive command is awk. Again, we can only show some common uses. Compare:

    $ ls -l ~
    $ ls -l ~ | awk '{print $1 " abc " $3 " def " $5}'
    

    What is happening? Also try:

    $ ls -l ~ | awk '{ sum+=$5 ; print sum } END { print "===\ntotal: "sum }'
    

    If you have any PDF files in ~, also try matching specific lines:

    $ ls -l ~ | awk '/pdf/ { sum+=$5 ; print sum } END { print "===\ntotal: "sum }'
    
  10. List the three oldest files in your home directory.

  11. Use ls, grep, sort, head and awk to find the five largest files you have. Check all your directories, not just ~, and do not include directory sizes. The output should list only the file names, one per line, largest first. If you feel up to a challenge, try find instead of ls and grep.
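
The sed and awk constructs from steps 8 and 9 can also be tried on fixed input, so that the result does not depend on what happens to be in your home directory. The following is only a sketch: the listing is made up (it merely imitates the column layout of ls -l), and the variable names are chosen for illustration.

```shell
# Made-up sample in the layout of `ls -l` output (field 5 is the size in bytes).
listing='-rw-r--r-- 1 alice users 1200 Jan  5 10:00 notes.txt
-rw-r--r-- 1 alice users 3400 Feb  9 11:30 thesis.pdf
-rw-r--r-- 1 alice users  600 Mar  2 09:15 slides.pdf'

# sed: without a flag, s/// replaces only the first match on each line;
# the g flag replaces every match.
first_only=$(echo 'physics and astrophysics' | sed 's/physics/foo/')
every_match=$(echo 'physics and astrophysics' | sed 's/physics/foo/g')
echo "$first_only"     # foo and astrophysics
echo "$every_match"    # foo and astrofoo

# awk: $3, $9 and so on are the whitespace-separated fields of the current line.
echo "$listing" | awk '{ print $9 " belongs to " $3 }'

# awk: a /pattern/ restricts the action to matching lines; the END block
# runs once after the last line has been read.
pdf_total=$(echo "$listing" | awk '/pdf/ { sum += $5 } END { print sum }')
echo "$pdf_total"      # 4000
```

Quoting the sed and awk programs in single quotes keeps the shell from interpreting characters such as $ or spaces before the program reaches the command.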

date, cal

Two useful little tools to close this block (they would fit equally badly anywhere else):

$ date
$ cal
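
Like the other commands in this section, date can feed a pipe. A small sketch, assuming only the standard +FORMAT argument of date; the variable names are made up:

```shell
# Print today's date as YYYY-MM-DD; the leading + introduces a format string.
today=$(date +%Y-%m-%d)
echo "$today"

# Pipe the same output into awk, splitting on "-" to keep only the year.
year=$(date +%Y-%m-%d | awk -F- '{ print $1 }')
echo "$year"
```

See the man page of date for the full list of format specifiers.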