DEV Community

yonatan


'Everything Is a File in Unix': A Tour of Linux I/O Redirection

I've been working through The Linux Command Line and wanted to write up what I learned about I/O redirection, partly to cement it for myself and partly because I think the "why" behind it is more interesting than most explanations make it sound. In Unix, everything is a file. Your screen is a file. Your keyboard is a file. Even the connection between two running programs in a pipe is, underneath, just file descriptors pointing at things. Once you see redirection as just "moving where a file descriptor points" instead of a pile of separate symbols to memorize, the rest of this falls into place.
First post here, kinda nervous.
Constructive criticism is very welcome.


  • Any program that prints a result to the screen or terminal is actually writing that result into a special file called stdout (in line with the Unix idea that everything is a file).
  • By default, this file is tied to the screen.
  • For example, the command ls ~ lists the files and directories in the home directory. What's happening is that ls pours its output into stdout, and since stdout is linked to the screen, we see the output of the program.
  • There are two more files of this kind: stdin (by default tied to the keyboard) and stderr (where programs send their status and error messages, also tied to the screen by default).
  • We can change where input comes from and where output goes by redirecting them to a location of our choice.
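To see that "tied to the screen" is only a default, here is a small sketch that asks the shell whether file descriptor 1 currently points at a terminal. The function name where_is_stdout is made up for this example; [ -t 1 ] is the standard test for "is fd 1 a terminal".

```shell
# where_is_stdout is a hypothetical helper, not a real command.
# It reports whether fd 1 (stdout) currently points at a terminal.
where_is_stdout() {
    if [ -t 1 ]; then
        echo "stdout is a terminal"
    else
        echo "stdout is not a terminal"
    fi
}

where_is_stdout          # typed at an interactive prompt, fd 1 is the terminal
where_is_stdout | cat    # here fd 1 is a pipe, so the other branch runs
```

The function body is identical in both calls. The only thing that changed is where fd 1 points when it runs.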

Redirecting Standard Output

  • As we said before, by default the stdout is tied to the screen.
  • But we can change its destination using the > operator.
  • For example, ls ~ > dir_list.txt will put the result of the command into the text file instead of displaying it on the screen.
  • If we want to truncate a file (empty it without deleting it), we can use a trick like > file, with no command in front.
  • Mind you, > overwrites the file. If we want to append to a file instead of overwriting it, we should use >>.
  • If our program reports an error, the message won't end up in the destination file: stderr and stdout are separate streams, and > only captures stdout. The error still shows on the screen, and the destination file is still created (or truncated), just without the error in it.
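The points above can be tried safely in a scratch directory. This is just a sketch: out.txt and no_such_dir are placeholder names, and everything happens under a directory created by mktemp so no real file gets overwritten.

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo "first"  > out.txt     # > creates the file, or overwrites it
echo "second" > out.txt     # overwritten again: only "second" is left
echo "third" >> out.txt     # >> appends instead
cat out.txt                 # shows "second" then "third"

> out.txt                   # no command at all: truncates out.txt to zero bytes
wc -c < out.txt             # byte count is now 0

# no_such_dir does not exist, so ls writes an error to stderr.
# The message still appears on the screen and out.txt stays empty.
ls ./no_such_dir > out.txt || echo "ls failed, and out.txt is empty"
```

One caveat: the bare > file trick works in bash and sh, but zsh treats it differently by default and waits for input.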

Redirecting Standard Error

  • This is a great time to introduce file descriptors. So far we have referred to the streams as stdin, stdout and stderr.
  • The shell, however, refers to them by number: file descriptors 0, 1, and 2 respectively.
  • stderr doesn't have a dedicated operator of its own, so we put its file descriptor, 2, directly in front of >.
  • For example, ls /bin/usr/ 2> err.txt (the path /bin/usr normally doesn't exist, so ls fails) will put the error message of the program into err.txt instead of showing it on the screen.
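Here is a small sketch of 2> in action, again in a scratch directory. The names err.txt, err2.txt, out.txt and no_such_dir are placeholders I picked for the example.

```shell
tmp=$(mktemp -d)

# The path does not exist, so ls complains on stderr (fd 2).
# 2> sends that complaint into err.txt instead of the screen.
ls "$tmp/no_such_dir" 2> "$tmp/err.txt" || true
cat "$tmp/err.txt"          # the error message that ls wrote

# A command that succeeds writes nothing to stderr, so err2.txt stays empty
# while the normal listing lands in out.txt.
ls "$tmp" > "$tmp/out.txt" 2> "$tmp/err2.txt"
```

Writing 1> works the same way as plain >, since 1 is the default descriptor for the > operator.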

Redirecting both Standard Output and Standard Error

  • There are two ways to do so:
  • 1) The traditional way: first we redirect stdout to our desired file, then we redirect stderr to wherever stdout is pointing. In practice it looks like this: ls /directory > file.txt 2>&1.

It's worth noting why ls /directory 2>&1 > file.txt doesn't work as expected. The shell handles redirections from left to right. First, 2>&1 copies wherever file descriptor 1 is pointing at that moment into file descriptor 2. At that time stdout is still pointing at the screen, so stderr ends up pointing at the screen too. Only then does > file.txt redirect stdout into file.txt. stderr is left pointing at the screen, so the error still appears there.
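The two orderings can be compared side by side. This is a sketch with placeholder names (both.txt, only_stdout.txt, no_such_dir), using a command that produces only an error so the difference is easy to see.

```shell
tmp=$(mktemp -d)

# Correct order: fd 1 is pointed at the file first,
# then fd 2 is made a copy of fd 1 (which is now the file).
ls "$tmp/no_such_dir" > "$tmp/both.txt" 2>&1 || true

# Reversed order: fd 2 copies fd 1 while fd 1 still points at the screen,
# and only afterwards is fd 1 moved to the file.
# The error message shows up on the screen, not in the file.
ls "$tmp/no_such_dir" 2>&1 > "$tmp/only_stdout.txt" || true

# both.txt holds the error message; only_stdout.txt is empty.
wc -c "$tmp/both.txt" "$tmp/only_stdout.txt"
```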

  • 2) The more streamlined version: put & before >. In practice it looks like this: ls /directory &> file.txt. Like >, &> overwrites the file; to append instead, add an extra > so it becomes &>>. Note that &> and &>> are bash features (&>> needs bash 4 or later), not part of POSIX sh, so a script meant to run under plain sh should stick with the traditional form.

We can also use redirection to get rid of unwanted output or errors, by redirecting both stdout and stderr into /dev/null, a special file that discards everything written to it. In practice: ls ~ &> /dev/null. We often use this when we don't want to see the status or errors of a program or command.
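A sketch of &>, &>> and /dev/null together. Since &> is a bash extension, the example calls bash explicitly so it behaves the same even when the surrounding shell is a plain sh. The names all.txt and no_such_dir are placeholders.

```shell
tmp=$(mktemp -d)
export tmp    # make $tmp visible inside the bash -c commands below

# ls is given one real directory and one missing one,
# so it produces both normal output and an error.
bash -c 'ls "$tmp" "$tmp/no_such_dir" &>  "$tmp/all.txt"' || true   # overwrite
bash -c 'ls "$tmp" "$tmp/no_such_dir" &>> "$tmp/all.txt"' || true   # append

cat "$tmp/all.txt"    # listing and error message from both runs

# Discarding everything. This spelling also works in plain sh.
ls "$tmp" "$tmp/no_such_dir" > /dev/null 2>&1 || true
```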


Redirecting Standard Input

  • cat file.txt and cat < file.txt give the same output, which makes it tempting to think the filename argument and the < operator are just two spellings of the same thing.
  • They're not. Take wc -l file.txt: this counts the lines and prints something like 2 file.txt.
  • Now try wc -l < file.txt. Same count, but no filename in the output. That's because wc never received a filename at all.
  • When you use <, the shell opens the file itself and attaches it to the command's stdin, so the command gets a stream of bytes with no name attached. With cat file.txt, cat receives the string "file.txt" as an argument and opens the file on its own.
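The difference is easy to reproduce. A minimal sketch, with a two-line file.txt created in a scratch directory:

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/file.txt"

# wc is given a filename as an argument, opens it itself,
# and prints the name next to the count.
wc -l "$tmp/file.txt"

# The shell opens the file and connects it to wc's stdin.
# wc sees only bytes, so it prints the count with no name.
wc -l < "$tmp/file.txt"
```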

Piping

  • So far with redirecting, we redirected the output of a command into a file, but there might be times where we need to redirect it to another command.
  • So basically the output of one command is the input of another.
  • For that we can use the | operator, which connects the stdout of one command to the stdin of the next one.
  • For example ls ~ | less.
  • We can chain many commands like this. Such a chain is called a pipeline, and the commands used inside it are often called filters.
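A small pipeline sketch with fixed input, so the result doesn't depend on what is in anyone's home directory. The fruit names are just sample data.

```shell
# printf produces four lines; each | hands one command's stdout
# to the next command's stdin.
printf 'pear\napple\npear\nfig\n' |
    sort |      # put identical lines next to each other
    uniq |      # drop adjacent duplicates
    wc -l       # count what is left: 3 distinct lines
```

None of the commands in the middle know or care that they are talking to another program rather than a file or the screen.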

Remember that the redirection operators > and < connect commands to files, while the pipe operator connects commands to commands. Mixing up the two can be dangerous in some situations, as > overwrites files by default.
For example, if we happen to be in /usr/bin with root privileges and run something like ls > less, then instead of feeding the output of ls into less and seeing the result on the screen, we would overwrite the contents of the file less in that directory, since on most systems less is just an executable program located in /usr/bin. We can check this with type less, which will give an output like "less is /usr/bin/less".
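The mistake can be reproduced harmlessly in a scratch directory. In this sketch the file named less is only a stand-in text file I create for the demo, not the real program.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# A stand-in for /usr/bin/less: an ordinary file that happens to be called "less".
printf 'pretend this is a program\n' > less

ls > less     # meant "ls | less", but the shell opens ./less for writing
cat less      # the old contents are gone; the file now holds the ls listing
```

If this worries you, bash and POSIX sh have a noclobber option (set -o noclobber) that makes > refuse to overwrite an existing file.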


Stepping back, we can see that every operator in this post (>, <, 2>&1, |) is doing the same basic thing: pointing a file descriptor at something else. Standard output, standard error, standard input, even another program's input: none of them are special-cased by the kernel. They're all just file descriptors, and a file descriptor doesn't care whether it's pointing at your screen, a text file, or the input side of another process. That's the whole trick. cat file.txt > out.txt and ls | less look like different operations, but they're both just rewiring where a stream of bytes flows.
This is also why something like wc -l < file.txt behaves differently from wc -l file.txt even though they look almost identical: one hands wc a filename to open itself, the other hands it a raw stream with no name attached at all.

For reference, I used the book The Linux Command Line by William Shotts.
