UNIX tee in real life

tee is a pretty sweet little UNIX utility that copies its standard input to standard output and to one or more files. For example, let’s say you wanted to run a command, see the output it generated, and also save that output to a file. tee does just that:

shell> echo run code run | tee /tmp/output.txt
run code run
shell> cat /tmp/output.txt
run code run

If you run the above snippet you’ll see “run code run” in the terminal output as well as in the file we told tee to copy standard output to.
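Since the rest of this post drives the shell from Ruby, here is the same check as a minimal Ruby sketch. It assumes a Unix-like system with tee on the PATH, and the temp file is only a stand-in for /tmp/output.txt:

```ruby
require "tempfile"
require "shellwords"

# Hypothetical scratch file standing in for /tmp/output.txt.
file = Tempfile.new("tee-demo")

# Run the pipeline and capture what tee passes through to standard output.
command = "echo run code run | tee #{Shellwords.escape(file.path)}"
terminal_output = IO.popen(command, "r") { |io| io.read }

# tee also wrote the same text to the file.
file_output = File.read(file.path)

puts terminal_output
puts file_output
```

Both puts calls should print the same line, once from the pipe and once from the file.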

tee becomes ultra-useful when you’re doing something that you want to see, but also want to capture for later reference. This may happen when you’re logged into a remote box and you are running a command which produces a lot of output (perhaps compiling libxml2) or if you’re like me and have a nasty habit of clearing terminal output.

Recently, we found tee was the tool for the job to enhance our continuous integration server Integrity. Integrity had a few lines of code which actually ran a build:

def run
  cmd = "(cd #{repo.directory} && #{@build.project.command} 2>&1)"
  IO.popen(cmd, "r") { |io| @output = io.read }
  @status = $?.success?
end

The first line of the run method builds a command that runs the build and redirects STDERR to STDOUT. The second line opens that command as a subprocess and captures its output in the instance variable @output. The last line determines success by checking the exit status of the most recently executed process (similar to bash’s use of $?).

The problem we ran into is that io.read does not return until the subprocess has finished, so none of the output is available until the build is done. For a continuous integration server this can be problematic if the build takes a while. We had some issues with Integrity’s threaded Ruby builder (it kept hanging), and they were really hard to track down since our build took about 15 minutes and we never got any feedback from Integrity while it ran. We’d have to log into the CI server and manually inspect processes and log files to determine what was going on.
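To see the delay for yourself, here is a small sketch. The slow_build command is a made-up stand-in for a real build, and the one-second sleep is arbitrary:

```ruby
# Hypothetical stand-in for a slow build: print, wait, print again.
slow_build = "echo compiling; sleep 1; echo done"

started = Time.now
# io.read blocks until the subprocess closes its output, i.e. until it exits.
output = IO.popen(slow_build, "r") { |io| io.read }
elapsed = Time.now - started

puts output
puts "read returned after %.1f seconds" % elapsed
```

Nothing is available to the caller until the whole command has finished, which is fine for a one-second sleep and painful for a 15 minute build.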

We ended up updating the run method to copy the output to a file, which Integrity could then show in one of its web-accessible views. The first shot at this looked like:

def run
  cmd = "(cd #{repo.directory} && #{@build.project.command} 2>&1 | tee build.txt)"
  IO.popen(cmd, "r") { |io| @output = io.read }
  @status = $?.success?
end

The addition of the tee command copies the output of the build to the file build.txt. This worked really well and gave us a tighter feedback loop with the CI server, as we could simply refresh the page to see if the builds were progressing. It introduced a new problem though: every build was deemed successful! The exit status of a pipeline is the exit status of its last command, so with tee at the end the subprocess reports tee’s exit status, which is 0 (success) as long as tee itself has no trouble writing. We still need to capture the exit status of the first command and ensure that the subprocess uses that as its exit status.
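The masking is easy to reproduce. In this sketch the false command plays the part of a failing build, and /dev/null replaces build.txt so nothing is left behind:

```ruby
# `false` always exits with status 1, so it stands in for a failing build.
system("false | tee /dev/null")
masked = $?.success?     # the pipeline reports tee's status, not false's

system("false")
unmasked = $?.success?   # without tee, the failure is visible

puts "with tee:    success? = #{masked}"    # prints true
puts "without tee: success? = #{unmasked}"  # prints false
```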

Fortunately, this can be done with bash’s variable PIPESTATUS. PIPESTATUS is an array which contains a list of exit status values from processes in the most recently executed pipeline. Here’s an example:

shell> true | false | true | false ; echo ${PIPESTATUS[0]}
0

In the above snippet we execute four commands in a single pipeline. We then echo out the exit status of the very first command using PIPESTATUS[0]. It will print 0 because the first command we ran was “true” which always returns 0. If you change the array-index to 1 or 3 it will print 1 because “false” always returns 1. Likewise, if you index it with 2 you will get 0 again. PIPESTATUS can help us solve our issue with Integrity.
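Here is the same experiment from Ruby, printing the whole array at once. PIPESTATUS is a bash feature, and IO.popen with a plain string goes through /bin/sh, which is not bash on every system, so this sketch calls bash explicitly (assuming bash is installed):

```ruby
# Single quotes keep Ruby from touching the shell syntax.
script = 'true | false | true | false; echo "${PIPESTATUS[@]}"'

# The array form of IO.popen runs bash directly instead of going through /bin/sh.
raw = IO.popen(["bash", "-c", script], "r") { |io| io.read }
statuses = raw.split.map(&:to_i)

p statuses  # one exit status per command in the pipeline: [0, 1, 0, 1]
```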

Our first attempt was simple:

def run
  cmd = "(cd #{repo.directory} && #{@build.project.command} 2>&1 | tee build.txt ; echo ${PIPESTATUS[0]})"
  IO.popen(cmd, "r") { |io| @output = io.read }
  @status = $?.success?
end

If you’re thinking this won’t work, you’re right, it won’t. Echoing the exit status adds it to the output, but it doesn’t affect the exit status which determines if “success?” is true or false. We didn’t really want to have to grep the output and look for success or failure flags. We’d rather rely on the exit status of the build process since that is what it’s already there for. So what to do?
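A stripped-down sketch of that failure mode, with false as the pretend build and bash invoked explicitly (an assumption, since PIPESTATUS needs bash):

```ruby
# The failing "build" is piped through tee, then its status is echoed.
script = 'false | tee /dev/null; echo ${PIPESTATUS[0]}'

output = IO.popen(["bash", "-c", script], "r") { |io| io.read }
succeeded = $?.success?

puts output.inspect  # the real status shows up in the output: "1\n"
puts succeeded       # but echo ran last and exited 0, so this prints true
```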

Well, fortunately, we can use a bash function to return the exit status we’re looking for. We ended up replacing the first line of the run method with the following:

cmd = <<-CODE.gsub(/^\s+/, '')
  (
  cd #{repo.directory} && #{@build.project.command} 2>&1 | tee build.txt ;
  function return_exit_status { return ${PIPESTATUS[0]}; };
  return_exit_status
  )
 CODE

We create a function called “return_exit_status” which returns the exit status of the first command in the most recently executed pipeline. Then all we have to do is call it as the last command in the subshell, and the function’s return code becomes the exit status of the subprocess. This lets us take advantage of tee and still have our build properly treated as successful or not.
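To try the idea outside of Integrity, here is a self-contained sketch. The run_build helper, the temp directory and the two sample commands are all made up for illustration and are not Integrity’s code. It also differs from the snippet above in a few small ways: bash is invoked explicitly, the directory is passed as an argument, the build command is wrapped in parentheses so multi-part commands work, and the function is defined before the pipeline so the pipeline is the last thing run before the call.

```ruby
require "tmpdir"

# Hypothetical helper: run `command` inside `dir`, tee its output to
# build.txt, and report the command's own exit status rather than tee's.
def run_build(dir, command)
  script = <<~BASH
    function return_exit_status { return ${PIPESTATUS[0]}; }
    cd "$1" || exit 1
    ( #{command} ) 2>&1 | tee build.txt
    return_exit_status
  BASH
  # "bash" after the script becomes $0, and dir becomes $1.
  output = IO.popen(["bash", "-c", script, "bash", dir], "r") { |io| io.read }
  [output, $?.success?]
end

dir = Dir.mktmpdir
failed_output, failed_ok = run_build(dir, "echo building; exit 3")
passed_output, passed_ok = run_build(dir, "echo building")
log = File.read(File.join(dir, "build.txt"))

puts failed_ok  # the failing command is reported as a failure
puts passed_ok  # the passing command is reported as a success
puts log        # build.txt holds a copy of the last run's output
```

Inside a bash script like this one, ending with exit ${PIPESTATUS[0]} instead of defining a function should have the same effect, and set -o pipefail is another way to keep tee from hiding a failure.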

Good old UNIX tools to the rescue.
