Potentially malicious filenames

How can I rename all the files of a certain pattern to a different pattern in Bash?

This question comes up all the time. We’ll walk you through the common answers, how to make them bulletproof as possible, and how they fail. Then we’ll do it right, in Perl.

TL;DR – you cannot safely handle malicious filenames in bash scripts, you need a real language – like Perl – for that.

Simple, and wrong: for loops in Bash

Let’s say you have a simple set of files:

me@banshee:~/demo$ ls
test1.txt test2.txt test3.txt

Maybe you want to rename all the .txt files to .bin files. You could do this with a bash for loop:

for file in `ls test*.txt` ; do newfile=`echo $file | sed 's/\.txt$/.bin/'`; mv $file $newfile; done

For this simple set of files with no nasty pitfalls in the names, this works fine:

me@banshee:~/demo$ ls
test1.bin test2.bin test3.bin

But what if we want to make it ugly? Let’s add a couple of files with some really malicious names to our original list:

me@banshee:~/demo$ ls -l
total 2
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 $?!*--""'';:()[]{}.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:26 test 5 ; rm * ; .txt

Ah, our old friend Little Bobby Tables! What happens if we run our naive little for loop on that set of files?

me@banshee:~/demo$ for file in `ls test*.txt`; do newfile=`echo $file | sed 's/\.txt$/.bin/'`; mv $file $newfile; done
mv: cannot stat ‘test’: No such file or directory
mv: cannot stat ‘4’: No such file or directory
mv: cannot stat ‘lolwtf’: No such file or directory
mv: cannot stat ‘$?!*--;:(){}[].txt’: No such file or directory
mv: cannot stat ‘test’: No such file or directory
mv: cannot stat ‘5’: No such file or directory
mv: cannot stat ‘;’: No such file or directory
mv: cannot stat ‘rm’: No such file or directory
mv: cannot stat ‘test1.txt’: No such file or directory
mv: cannot stat ‘test2.txt’: No such file or directory
mv: cannot stat ‘test3.txt’: No such file or directory
mv: target ‘$?!*--;:(){}[].bin’ is not a directory
mv: target ‘.bin’ is not a directory
mv: cannot stat ‘;’: No such file or directory
mv: cannot stat ‘.txt’: No such file or directory
me@banshee:~/demo$ ls -l
total 2
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 lolwtf $?!*--""'';:()[]{}.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:26 test 5 ; rm * ; .txt

Well, it could have been worse. At least we didn’t end up actually executing that ; rm * ; we snuck in there. Still, the operation clearly didn’t complete successfully – and if it isn’t making you nervous, it damn well should be. So now what?

Complex, and wrong: while read loops in Bash

The for loop broke on spaces, meaning any filename with a space in it would get handled as separate pieces. Upgrading from a for loop to a while read loop will mitigate that. Adding some strategic encapsulation with “double quotes” will mitigate a few more expansion issues beyond that.

me@banshee:~/demo$ ls *.txt | while read file; do newfile=`echo "$file" | sed 's/\.txt$/.bin/'`; mv "$file" "$newfile"; done

Let’s look at each piece of this puzzle in order:

  • ls *.txt
    I think you can figure this one out.

  • while read file; do
    for each complete line of the input, put it in a variable named $file, then execute this loop.

  • newfile=`echo “$file” | sed ‘s/\.txt$/.bin/’`
    Actually we need to break this down further:

    • newfile=“ – execute the contents of the `backticks` as a command, and put the output in the variable $newfile.
    • echo “$file” | – dump the contents of the variable $echo into the input of the command after the pipe. Note how we encapsulated “$file” with “double quotes” here – this prevents bash from expanding the contents of $file, as well as expanding $file itself. This is worth a demonstration of its own:
      me@banshee:/tmp/tmp$ touch 1 2 3
      me@banshee:/tmp/tmp$ ls 
      1  2  3
      me@banshee:/tmp/tmp$ echo 1 * 3
      1 1 2 3 3
      me@banshee:/tmp/tmp$ echo "1 * 3"
      1 * 3
      

      See?

    • sed ‘s/\.txt$/.bin/’ – we’re asking sed to replace any instance of .txt at the end of the line only with .bin. The $ in \.txt$ tells us only to match if .txt is at the end of the line. The \ in \.txt$ tells sed to “escape” the period character, which otherwise would be a wildcard that matches ANY character in the input. We don’t need to escape the period character in .bin, since that’s in the replacement section, where wildcard characters would never be interpreted anyway.
    • mv “$file” “$newfile” – fairly self-explanatory, but note again the “double” “quotes” encapsulation of “$file” and “$newfile” – it’s necessary; this tells bash that the variables $file and $newfile are one single argument apiece, no matter how many spaces, wildcard characters, or anything else might be present once $file and $newfile are expanded to their literal text. You might think that the double quotes in that nasty test 4 lolwtf \$?!*–“””;:()[]{}.txt filename would still break this – but, thankfully, you’d be wrong.
    • done – this justcompletes the loop.

    But did it work?

    me@banshee:~/demo$ ls -l
    total 3
    -rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.bin
    -rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.bin
    -rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.bin
    -rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 lolwtf $?!*--""'';:()[]{}.bin
    -rw-rw-r-- 1 me me 0 Jun 20 17:27 test 5 ; rm * ; .bin
    

    Woohoo!

    There’s still one trick we haven’t thrown at our little bash loop, though, and that’s escape characters in the filenames. Behold, the mighty backslash:

    me@banshee:~/demo$ ls -l | grep 6
    -rw-rw-r-- 1 me me 0 Jun 20 18:17 test 6 \$?!*.txt
    

    Will it blend?

    me@banshee:~/demo$ ls *.txt | while read file; do newfile="`echo "$file" | sed 's/\.txt$/.bin/'`"; mv "$file" "$newfile"; done
    mv: cannot stat ‘test 6 $?!*.txt’: No such file or directory

    ‘Fraid not – we successfully handled wildcards, quotes, and shell specials, but the lowly backslash brought us down with ease. If there’s a safe and sane way to handle that in Bash, I don’t know it. Also, by now my brain hurts with all the double quoting and the tricky loop selection and the AAAAIIIIEEEE make it stop already!

    So how DO you REALLY safely handle insanely dirty filenames? With Perl. Duh.

    Simple, and correct: while loops in Perl

    #!/usr/bin/perl
    
    # demonstration - rename all files in the current directory ending in .txt
    #                 with the same filename, but ending in .bin
    
    # open the current directory, or die with a useful error message if we can't
    opendir (DIR, '.') or die $!;
    
    # loop through the current directory, file by file
    while (my $file = readdir(DIR)) {
    	# don't do anything unless this is a .txt file
    	if ($file =~ /\.txt$/) {
    		# first set $newfile to $file
    		my $newfile = $file;
    		# now replace .txt at the end of $newfile with .bin
    		$newfile =~ s/\.txt$/.bin/;
    		# now rename the actual file from .txt to .bin
    		rename $file, $newfile;
    	}
    }
    # close the current directory
    closedir (DIR);
    

    This is a lot easier to read and understand. There’s no crazy double and triple encapsulation, we can properly indent things, we can see what the hell we’re doing. Yes.

    But does it work?

    me@banshee:~/demo$ perl /tmp/test.pl
    me@banshee:~/demo$ ls -l
    total 3
    -rw-rw-r-- 1 me me   0 Jun 20 17:19 test1.bin
    -rw-rw-r-- 1 me me   0 Jun 20 17:19 test2.bin
    -rw-rw-r-- 1 me me   0 Jun 20 17:19 test3.bin
    -rw-rw-r-- 1 me me   0 Jun 20 17:24 test 4 lolwtf $?!*--;:(){}[].bin
    -rw-rw-r-- 1 me me   0 Jun 20 17:27 test 5 ; rm * ; .bin
    -rw-rw-r-- 1 me me   0 Jun 20 18:17 test 6 \$?!*.bin
    

    Damn right it does – they don’t call Perl the Swiss Army Chainsaw for nothin’.

    endfile TL;DR: don’t try to handle potentially-malicious filenames with Bash in the first place – handle them with Perl.

Published by

Jim Salter

Mercenary sysadmin, open source advocate, and frotzer of the jim-jam.

Leave a Reply

Your email address will not be published. Required fields are marked *