June 2016 – JRS Systems: the blog

Verifying copies

Most of the time, we explicitly trust that our tools are doing what we ask them to do. For example, if we rsync -a /source /target, we trust that the contents of /target will exactly match the contents of /source. That’s the whole point, right? And rsync is a tool with a long and impeccable lineage. So we trust it. And, really, we should.

But once in a while, we might get paranoid. And when we do, and we decide to empirically and thoroughly check our copied data… things might get weird. Today, we’re going to explore the weirdness, and look at how to interpret it.

So you want to make a copy of /usr/bin…

Well, let’s be honest, you probably don’t. But it’s a convenient choice, because it has plenty of files in it (but not so many that copies will take forever), it has folders beneath it, and it has a couple of other things in there that can throw you off balance.

On the machine I’m sitting in front of now, /usr/bin is on an ext4 filesystem – the root filesystem, in fact. The machine also has a couple of ZFS filesystems. Let’s start out by using rsync to copy it off to a ZFS dataset.

We’ll use the -a argument to rsync, which stands for archive, and is a shorthand for -rlptgoD. This means recursive, copy symlinks as symlinks, maintain permissions, maintain timestamps, maintain group ownership, maintain ownership, and preserve devices and other special files as devices and special files where possible. Seems fairly straightforward, right?

root@banshee:/# zfs create banshee/tmp
root@banshee:/# rsync -a /usr/bin /banshee/tmp/
root@banshee:/# du -hs /usr/bin ; du -hs /banshee/tmp/bin
353M	/usr/bin
215M	/banshee/tmp/bin

Hey! We’re missing data! What the hell? No, we’re not actually missing data – we just didn’t think about the fact that the du command reports space used on disk, which is related but not at all the same thing as amount of data stored. In this case, obvious culprit is obvious:

root@banshee:/# zfs get compression banshee/tmp
NAME         PROPERTY     VALUE     SOURCE
banshee/tmp  compression  lz4       inherited from banshee
root@banshee:/# zfs get compressratio banshee/tmp
NAME         PROPERTY       VALUE  SOURCE
banshee/tmp  compressratio  1.55x  -

So, compression tricked us.

I had LZ4 compression enabled on my ZFS dataset, which means that the contents of /usr/bin – much of which are highly compressible – do not take up anywhere near as much space on disk when stored there as they do on the original ext4 filesystem. For the purposes of our exercise, let’s just wipe it out, create a new dataset that doesn’t use compression, and try again.

root@banshee:/# zfs destroy -r banshee/tmp
root@banshee:/# zfs create -o compression=off banshee/tmp
root@banshee:/# rsync -ha /usr/bin /banshee/tmp/
root@banshee:/# du -hs /usr/bin ; du -hs /banshee/tmp/bin
353M	/usr/bin
365M	/tmp/bin

Wait, what the hell?! Compression’s off now, so why am I still seeing discrepancies – and in this case, more space used on the ZFS target than the ext4 source? Well, again – du measures on-disk space, and these are different filesystems. The blocksizes may or may not be different, and in this case, ZFS stores redundant copies of the metadata associated with each file where ext4 does not, so this adds up to measurable discrepancies in the output of du.

Semi-hidden, arcane du arguments

You won’t find any mention at all of it in the man page, but info du mentions an argument called –apparent-size that sounds like the answer to our prayers.

`--apparent-size'
     Print apparent sizes, rather than disk usage.  The apparent size
     of a file is the number of bytes reported by `wc -c' on regular
     files, or more generally, `ls -l --block-size=1' or `stat
     --format=%s'.  For example, a file containing the word `zoo' with
     no newline would, of course, have an apparent size of 3.  Such a
     small file may require anywhere from 0 to 16 KiB or more of disk
     space, depending on the type and configuration of the file system
     on which the file resides.

Perfect! So all we have to do is use that, right?

root@banshee:/# du -hs --apparent-size /usr/bin ; du -hs --apparent-size /banshee/tmp/bin
349M	/usr/bin
353M	/banshee/tmp/bin

We’re still consuming 4MB more space on the target than the source, dammit. WHY?

Counting files and comparing notes

So we’ve tried everything we could think of, including some voodoo arguments to du that probably cost the lives of many Bothans, and we still have a discrepancy in the amount of data present in source and target. Next step, let’s count the files and folders present on both sides, by using the find command to enumerate them all, and piping its output to wc -l, which counts them once enumerated:

root@banshee:/# find /usr/bin | wc -l ; find /banshee/tmp/bin | wc -l
2107
2107

Huh. No discrepancies there. OK, fine – we know about the diff command, which compares the contents of files and tells us if there are discrepancies – and handily, diff -r is recursive! Problem solved!

root@banshee:/# diff -r /usr/bin /banshee/tmp/bin 2>&1 | head -n 5
diff: /banshee/tmp/bin/arista-gtk: No such file or directory
diff: /banshee/tmp/bin/arista-transcode: No such file or directory
diff: /banshee/tmp/bin/dh_pypy: No such file or directory
diff: /banshee/tmp/bin/dh_python3: No such file or directory
diff: /banshee/tmp/bin/firefox: No such file or directory

What’s wrong with diff -r?

Actually, problem emphatically not solved. I cheated a little, there – I knew perfectly well we were going to get a ton of errors thrown, so I piped descriptor 2 (standard error) to descriptor 1 (standard output) and then to head -n 5, which would just give me the first five lines of errors – otherwise we’d be staring at screens full of crap. Why are we getting “no such file or directory” on all these? Let’s look at one:

root@banshee:/# ls -l /banshee/tmp/bin/arista-gtk
lrwxrwxrwx 1 root root 26 Mar 20  2014 /banshee/tmp/bin/arista-gtk -> ../share/arista/arista-gtk

Oh. Well, crap – /usr/bin is chock full of relative symlinks, which rsync -a copied exactly as they are. So now our target, /banshee/tmp/bin, is full of relative symlinks that don’t go anywhere, and diff -r is trying to follow them to compare contents. /usr/share/arista/arista-gtk is a thing, but /banshee/share/arista/arista-gtk isn’t. So now what?

find, and xargs, and md5sum, oh my!

Luckily, we have a tool called md5sum that will generate a pretty strong checksum of any arbitrary content we give it. It’ll still try to follow symlinks if we let it, though. We could just ignore the symlinks, since they don’t have any data in them directly anyway, and just sorta assume that they’re targeted to the same relative target that they were on the source… yeah, sure, let’s try that.

What we’ll do is use the find command, with the -type f argument, to only find actual files in our source and our target, thereby skipping folders and symlinks entirely. We’ll then pipe that to xargs, which turns the list of output from find -type f into a list of arguments for md5sum, which we can then redirect to text files, and then we can diff the text files. Problem solved! We’ll find that extra 4MB of data in no time!

root@banshee:/# find /usr/bin -type f | xargs md5sum > /tmp/sourcesums.txt
root@banshee:/# find /banshee/tmp/bin -type f | xargs md5sum > /tmp/targetsums.txt

… okay, actually I’m going to cheat here, because you guessed it: we failed again. So we’re going to pipe the output through grep sudo to let us just look at the output of a couple of lines of detected differences.

root@banshee:/# diff /tmp/sourcesums.txt /tmp/targetsums.txt | grep sudo
< 866713ce6e9700c3a567788334be7e25  /usr/bin/sudo
< e43c40cfbec06f191191120f4332f60f  /usr/bin/sudoreplay
> e43c40cfbec06f191191120f4332f60f  /banshee/tmp/bin/sudoreplay
> 866713ce6e9700c3a567788334be7e25  /banshee/tmp/bin/sudo

Ah, crap – absolute paths. Every single file failed this check, since we’re seeing it referenced by absolute path, and /usr/bin/sudo doesn’t match /banshee/tmp/bin/sudo, even though the checksums match! OK, we’ll fix that by using relative paths when we generate our sums… and I’ll save you a little more time; that might fail too, because there’s not actually any guarantee that find will return the files in the same sort order both times. So we’ll also pipe find‘s output through sort to fix that problem, too.

root@banshee:/banshee/tmp# cd /usr ; find ./bin -type f | sort | xargs md5sum > /tmp/sourcesums.txt
root@banshee:/usr# cd /banshee/tmp ; find ./bin -type f | sort | xargs md5sum > /tmp/targetsums.txt
root@banshee:/banshee/tmp# diff /tmp/sourcesums.txt /tmp/targetsums.txt
root@banshee:/banshee/tmp#

Yay, we finally got verification – all of our files exist on both sides, are named the same on both sides, are in the same folders on both sides, and checksum out the same on both sides! So our data’s good! Hey, wait a minute… if we have the same number of files, in the same places, with the same contents, how come we’ve still got different totals?!

Macabre, unspeakable use of ls

We didn’t really get what we needed out of find. We do at least know that we have the right total number of files and folders, and that things are in the right places, and that we got the same checksums out of the files that we could check. So where the hell are those missing 4MB of data coming from?! We’re using our arcane invocation of du -s –apparent-size to make sure we’re looking at bytes of data and not bytes of allocated disk space, so why are we still seeing differences?

This time, let’s try using ls. It’s rarely used, but ls does have a recursion argument, -R. We’ll use that along with -l for long form and -a to pick up dotfiles, and we’ll pipe it though grep -v ‘\.\.’ to get rid of the entry for the parent directory (which obviously won’t be the same for /usr/bin and for /banshee/tmp/bin, no matter how otherwise perfect matches they are), and then, once more, we’ll have something we can diff.

root@banshee:/banshee/tmp# ls -laR /usr/bin > /tmp/sourcels.txt
root@banshee:/banshee/tmp# ls -laR /banshee/tmp/bin > /tmp/targetls.txt
root@banshee:/banshee/tmp# diff /tmp/sourcels.txt /tmp/targetls.txt | grep which
< lrwxrwxrwx  1 root   root         10 Jan  6 14:16 which -> /bin/which
> lrwxrwxrwx 1 root   root         10 Jan  6 14:16 which -> /bin/which

Once again, I cheated, because there was still going to be a problem here. ls -l isn’t necessarily going to use the same number of formatting spaces when listing the source or the target – so our diff, again, frustratingly fails on every single line. I cheated here by only looking at the output for the which file so we wouldn’t get overwhelmed. The fix? We’ll pipe through awk, forcing the spacing to be constant. Grumble grumble grumble…

root@banshee:/banshee/tmp# ls -laR /usr/bin | awk '{print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11}' > /tmp/sourcels.txt
root@banshee:/banshee/tmp# ls -laR /banshee/tmp/bin | awk '{print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11}' > /tmp/targetls.txt
root@banshee:/banshee/tmp# diff /tmp/sourcels.txt /tmp/targetls.txt | head -n 25
1,4c1,4
< /usr/bin:          
< total 364620         
< drwxr-xr-x 2 root root 69632 Jun 21 07:39 .  
< drwxr-xr-x 11 root root 4096 Jan 6 15:59 ..  
---
> /banshee/tmp/bin:          
> total 373242         
> drwxr-xr-x 2 root root 2108 Jun 21 07:39 .  
> drwxr-xr-x 3 root root 3 Jun 29 18:00 ..  
166c166
< -rwxr-xr-x 2 root root 36607 Mar 1 12:35 c2ph  
---
> -rwxr-xr-x 1 root root 36607 Mar 1 12:35 c2ph  
828,831c828,831
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-check  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-create  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-remove  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-touch  
---
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-check  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-create  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-remove  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-touch  
885,887c885,887

FINALLY! Useful output! It makes perfect sense that the special files . and .., referring to the current and parent directories, would be different. But we’re still getting more hits, like for those four lockfile- commands. We already know they’re identical from source to target, both because we can see the filesizes are the same and because we’ve already md5sum‘ed them and they matched. So what’s up here?

And you thought you’d never need to use the ‘info’ command in anger

We’re going to have to use info instead of man again – this time, to learn about the ls command’s lesser-known evils.

Let’s take a quick look at those mismatched lines for the lockfile- commands again:

< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-check  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-create  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-remove  
< -rwxr-xr-x 4 root root 14592 Dec 3 2012 lockfile-touch  
---
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-check  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-create  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-remove  
> -rwxr-xr-x 1 root root 14592 Dec 3 2012 lockfile-touch

OK, ho hum, we’ve seen this a million times before. File type, permission modes, owner, group, size, modification date, filename, and in the case of the one symlink shown, the symlink target. Hey, wait a minute… how come nobody ever thinks about the second column? And, yep, that second column is the only difference in these lines – it’s 4 in the source, and 1 in the target.

But what does that mean? You won’t learn the first thing about it from man ls, which doesn’t tell you any more than that -l gives you a “long listing format”. And there’s no option to print column headers, which would certainly be helpful… info ls won’t give you any secret magic arguments to print column headers either. But however grudgingly, it will at least name them:

`-l'
`--format=long'
`--format=verbose'
     In addition to the name of each file, print the file type, file
     mode bits, number of hard links, owner name, group name, size, and
     timestamp

So it looks like that mysterious second column refers to the number of hardlinks to the file being listed, normally one (the file itself) – but on our source, there are four hardlinks to each of our lockfile- files.

Checking out hardlinks

We can explore this a little further, using find‘s –samefile test:

root@banshee:/banshee/tmp# find /usr/bin -samefile /usr/bin/lockfile-check
/usr/bin/lockfile-create
/usr/bin/lockfile-check
/usr/bin/lockfile-remove
/usr/bin/lockfile-touch
root@banshee:/banshee/tmp# find /banshee/tmp/bin -samefile /usr/bin/lockfile-check
root@banshee:/banshee/tmp#

Well, that explains it – on our source, lockfile-create, lockfile-check, lockfile-remove, and lockfile-touch are all actually just hardlinks to the same inode – meaning that while they look like four different files, they’re actually just four separate pointers to the same data on disk. Those and several other sets of hardlinks add up to the reason why our target, /banshee/tmp/bin, is larger than our source, /usr/bin, even though they appear to contain the same data exactly.

Could we have used rsync to duplicate /usr/bin exactly, including the hardlink structures? Absolutely! That’s what the -H argument is there for. Why isn’t that rolled into the -a shortcut argument, you might ask? Well, probably because hardlinks aren’t supported on all filesystems – in particular, they won’t work on FAT32 thumbdrives, or on NFS links set up by sufficiently paranoid sysadmins – so rather than have rsync jobs fail, it’s left up to you to specify if you want to recreate them.

Simpler verification with tar

This was cumbersome as hell – could we have done it more simply? Well, of course! Actually… maybe.

We could have used the venerable tar command to generate a single md5sum for the entire collection of data in one swell foop. We do have to be careful, though – tar will encode the entire path you feed it into the file structure, so you can’t compare tars created directly from /usr/bin and /banshee/tmp/bin – instead, you need to cd into the parent directories, create tars of ./bin in each case, then compare those.

Let’s start out by comparing our current copies, which differ in how hardlinks are handled:

root@banshee:/tmp# cd /usr ; tar -c ./bin | md5sum ; cd /banshee/tmp ; tar -c ./bin | md5sum
4bb9ffbb20c11b83ca616af716e2d792  -
2cf60905cb7870513164e90ac82ffc3d  -

Yep, they differ – which we should be expecting ; after all, the source has hardlinks and the target does not. How about if we fix up the target so that it also uses hardlinks?

root@banshee:/tmp# rm -r /banshee/tmp/bin ; rsync -haH /usr/bin /banshee/tmp/
root@banshee:/tmp# cd /usr ; tar -c ./bin | md5sum ; cd /banshee/tmp ; tar -c ./bin | md5sum
4bb9ffbb20c11b83ca616af716e2d792  -
f654cf7a1c4df8482d61b59993bb92f7  -

Well, crap. That should’ve resulted in identical md5sums, shouldn’t it? It would have, if we’d gone from one ext4 filesystem to another:

root@banshee:/banshee/tmp# rsync -haH /usr/bin /tmp/
root@banshee:/banshee/tmp# cd /usr ; tar -c ./bin | md5sum ; cd /tmp ; tar -c ./bin | md5sum
4bb9ffbb20c11b83ca616af716e2d792  -
4bb9ffbb20c11b83ca616af716e2d792  -

So why, exactly, didn’t we get the same md5sum values for a tar on ZFS, when we did on ext4? The metadata gets included in those tarballs, and it’s just plain different between different filesystems. Different metadata means different contents of the tar means different checksums. Ugly but true. Interestingly, though, if we dump /banshee/tmp/bin back through tar and out to another location, the checksums match again:

root@banshee:/tmp/fromtar# tar -c /banshee/tmp/bin | tar -x
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
root@banshee:/tmp/fromtar# cd /tmp/fromtar/banshee/tmp ; tar -c ./bin | md5sum ; cd /usr ; tar -c ./bin | md5sum
4bb9ffbb20c11b83ca616af716e2d792  -
4bb9ffbb20c11b83ca616af716e2d792  -

What this tells us is that ZFS is storing additional metadata that ext4 can’t, and that, in turn, is what was making our tarballs checksum differently… but when you move that data back onto a nice dumb ext4 filesystem again, the extra metadata is stripped again, and things match again.

Picking and choosing what matters

Properly verifying data and metadata on a block-for-block basis is hard, and you need to really, carefully think about what, exactly, you do – and don’t! – want to verify. If you want to verify absolutely everything, including folder metadata, tar might be your best bet – but it probably won’t match up across filesystems. If you just want to verify the contents of the files, some combination of find, xargs, md5sum, and diff will do the trick. If you’re concerned about things like hardlink structure, you have to check for those independently. If you aren’t just handwaving things away, this stuff is hard. Extra hard if you’re changing filesystems from source to target.

Shortcut for ZFS users: replication!

For those of you using ZFS, and using strictly ZFS, though, there is a shortcut: replication. If you use syncoid to orchestrate ZFS replication at the dataset level, you can be sure that absolutely everything, yes everything, is preserved:

root@banshee:/usr# syncoid banshee/tmp banshee/tmp2
INFO: Sending oldest full snapshot banshee/tmp@autosnap_2016-06-29_19:33:01_hourly (~ 370.8 MB) to new target filesystem:
 380MB 0:00:01 [ 229MB/s] [===============================================================================================================================] 102%            
INFO: Updating new target filesystem with incremental banshee/tmp@autosnap_2016-06-29_19:33:01_hourly ... syncoid_banshee_2016-06-29:19:39:01 (~ 4 KB):
2.74kB 0:00:00 [26.4kB/s] [======================================================================================>                                         ] 68%           
root@banshee:/banshee/tmp2# cd /banshee/tmp ; tar -c . | md5sum ; cd /banshee/tmp2 ; tar -c . | md5sum
e2b2ca03ffa7ba6a254cefc6c1cffb4f  -
e2b2ca03ffa7ba6a254cefc6c1cffb4f  -

Easy, peasy, chicken greasy: replication guarantees absolutely everything matches.

TL;DR

There isn’t one, really! Verification of data and metadata, especially across filesystems, is one of the fundamental challenges of information systems that’s generally abstracted away from you almost entirely, and it makes for one hell of a deep rabbit hole to fall into. Hopefully, if you’ve slogged through all this with me so far, you have a better idea of what tools are at your fingertips to test it when you need to.

Potentially malicious filenames

How can I rename all the files of a certain pattern to a different pattern in Bash?

This question comes up all the time. We’ll walk you through the common answers, how to make them bulletproof as possible, and how they fail. Then we’ll do it right, in Perl.

TL;DR – you cannot safely handle malicious filenames in bash scripts, you need a real language – like Perl – for that.

Simple, and wrong: for loops in Bash

Let’s say you have a simple set of files:

me@banshee:~/demo$ ls
test1.txt test2.txt test3.txt

Maybe you want to rename all the .txt files to .bin files. You could do this with a bash for loop:

for file in `ls test*.txt` ; do newfile=`echo $file | sed 's/\.txt$/.bin/'`; mv $file $newfile; done

For this simple set of files with no nasty pitfalls in the names, this works fine:

me@banshee:~/demo$ ls
test1.bin test2.bin test3.bin

But what if we want to make it ugly? Let’s add a couple of files with some really malicious names to our original list:

me@banshee:~/demo$ ls -l
total 2
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 $?!*--""'';:()[]{}.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:26 test 5 ; rm * ; .txt

Ah, our old friend Little Bobby Tables! What happens if we run our naive little for loop on that set of files?

me@banshee:~/demo$ for file in `ls test*.txt`; do newfile=`echo $file | sed 's/\.txt$/.bin/'`; mv $file $newfile; done
mv: cannot stat ‘test’: No such file or directory
mv: cannot stat ‘4’: No such file or directory
mv: cannot stat ‘lolwtf’: No such file or directory
mv: cannot stat ‘$?!*--;:(){}[].txt’: No such file or directory
mv: cannot stat ‘test’: No such file or directory
mv: cannot stat ‘5’: No such file or directory
mv: cannot stat ‘;’: No such file or directory
mv: cannot stat ‘rm’: No such file or directory
mv: cannot stat ‘test1.txt’: No such file or directory
mv: cannot stat ‘test2.txt’: No such file or directory
mv: cannot stat ‘test3.txt’: No such file or directory
mv: target ‘$?!*--;:(){}[].bin’ is not a directory
mv: target ‘.bin’ is not a directory
mv: cannot stat ‘;’: No such file or directory
mv: cannot stat ‘.txt’: No such file or directory
me@banshee:~/demo$ ls -l
total 2
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 lolwtf $?!*--""'';:()[]{}.txt
-rw-rw-r-- 1 me me 0 Jun 20 17:26 test 5 ; rm * ; .txt

Well, it could have been worse. At least we didn’t end up actually executing that ; rm * ; we snuck in there. Still, the operation clearly didn’t complete successfully – and if it isn’t making you nervous, it damn well should be. So now what?

Complex, and wrong: while read loops in Bash

The for loop broke on spaces, meaning any filename with a space in it would get handled as separate pieces. Upgrading from a for loop to a while read loop will mitigate that. Adding some strategic encapsulation with “double quotes” will mitigate a few more expansion issues beyond that.

me@banshee:~/demo$ ls *.txt | while read file; do newfile=`echo "$file" | sed 's/\.txt$/.bin/'`; mv "$file" "$newfile"; done

Let’s look at each piece of this puzzle in order:

ls *.txt
I think you can figure this one out.
while read file; do
for each complete line of the input, put it in a variable named $file, then execute this loop.
newfile=`echo “$file” | sed ‘s/\.txt$/.bin/’`
Actually we need to break this down further:
- newfile=“ – execute the contents of the `backticks` as a command, and put the output in the variable $newfile.
- echo “$file” | – dump the contents of the variable $echo into the input of the command after the pipe. Note how we encapsulated “$file” with “double quotes” here – this prevents bash from expanding the contents of $file, as well as expanding $file itself. This is worth a demonstration of its own:
```
me@banshee:/tmp/tmp$ touch 1 2 3
me@banshee:/tmp/tmp$ ls 
1  2  3
me@banshee:/tmp/tmp$ echo 1 * 3
1 1 2 3 3
me@banshee:/tmp/tmp$ echo "1 * 3"
1 * 3
```
  See?
- sed ‘s/\.txt$/.bin/’ – we’re asking sed to replace any instance of .txt at the end of the line only with .bin. The $ in \.txt$ tells us only to match if .txt is at the end of the line. The \ in \.txt$ tells sed to “escape” the period character, which otherwise would be a wildcard that matches ANY character in the input. We don’t need to escape the period character in .bin, since that’s in the replacement section, where wildcard characters would never be interpreted anyway.
- mv “$file” “$newfile” – fairly self-explanatory, but note again the “double” “quotes” encapsulation of “$file” and “$newfile” – it’s necessary; this tells bash that the variables $file and $newfile are one single argument apiece, no matter how many spaces, wildcard characters, or anything else might be present once $file and $newfile are expanded to their literal text. You might think that the double quotes in that nasty test 4 lolwtf \$?!*–“””;:()[]{}.txt filename would still break this – but, thankfully, you’d be wrong.
- done – this justcompletes the loop.
But did it work?
```
me@banshee:~/demo$ ls -l
total 3
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test1.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test2.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:19 test3.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:24 test 4 lolwtf $?!*--""'';:()[]{}.bin
-rw-rw-r-- 1 me me 0 Jun 20 17:27 test 5 ; rm * ; .bin
```
Woohoo!

There’s still one trick we haven’t thrown at our little bash loop, though, and that’s escape characters in the filenames. Behold, the mighty backslash:
```
me@banshee:~/demo$ ls -l | grep 6
-rw-rw-r-- 1 me me 0 Jun 20 18:17 test 6 \$?!*.txt
```
Will it blend?
```
me@banshee:~/demo$ ls *.txt | while read file; do newfile="`echo "$file" | sed 's/\.txt$/.bin/'`"; mv "$file" "$newfile"; done
mv: cannot stat ‘test 6 $?!*.txt’: No such file or directory
```
‘Fraid not – we successfully handled wildcards, quotes, and shell specials, but the lowly backslash brought us down with ease. If there’s a safe and sane way to handle that in Bash, I don’t know it. Also, by now my brain hurts with all the double quoting and the tricky loop selection and the AAAAIIIIEEEE make it stop already!

So how DO you REALLY safely handle insanely dirty filenames? With Perl. Duh.

Simple, and correct: while loops in Perl
```
#!/usr/bin/perl

# demonstration - rename all files in the current directory ending in .txt
#                 with the same filename, but ending in .bin

# open the current directory, or die with a useful error message if we can't
opendir (DIR, '.') or die $!;

# loop through the current directory, file by file
while (my $file = readdir(DIR)) {
	# don't do anything unless this is a .txt file
	if ($file =~ /\.txt$/) {
		# first set $newfile to $file
		my $newfile = $file;
		# now replace .txt at the end of $newfile with .bin
		$newfile =~ s/\.txt$/.bin/;
		# now rename the actual file from .txt to .bin
		rename $file, $newfile;
	}
}
# close the current directory
closedir (DIR);
```
This is a lot easier to read and understand. There’s no crazy double and triple encapsulation, we can properly indent things, we can see what the hell we’re doing. Yes.

But does it work?
```
me@banshee:~/demo$ perl /tmp/test.pl
me@banshee:~/demo$ ls -l
total 3
-rw-rw-r-- 1 me me   0 Jun 20 17:19 test1.bin
-rw-rw-r-- 1 me me   0 Jun 20 17:19 test2.bin
-rw-rw-r-- 1 me me   0 Jun 20 17:19 test3.bin
-rw-rw-r-- 1 me me   0 Jun 20 17:24 test 4 lolwtf $?!*--;:(){}[].bin
-rw-rw-r-- 1 me me   0 Jun 20 17:27 test 5 ; rm * ; .bin
-rw-rw-r-- 1 me me   0 Jun 20 18:17 test 6 \$?!*.bin
```
Damn right it does – they don’t call Perl the Swiss Army Chainsaw for nothin’.

endfile TL;DR: don’t try to handle potentially-malicious filenames with Bash in the first place – handle them with Perl.

PSA: Snapshots are better than ZVOLs

A lot of people new to ZFS, and even a lot of people not-so-new to ZFS, like to wax ecstatic about ZVOLs. But they never seem to mention the very real pitfalls ZVOLs present.

What’s a ZVOL?

Well, if you know what LVM is, a ZVOL is like an LV, but for ZFS. If you don’t know what LVM is, you can think of a ZVOL as, basically, a dynamically allocated “raw partition” inside ZFS. Unlike a normal dataset, a ZVOL doesn’t have a filesystem of its own. And you can access it by a raw devicename, like /dev/zvol/poolname/zvolname. This looks ideal for those use-cases where you want to nest a legacy filesystem underneath ZFS – for example, virtual machine images. Once you have the ZVOL, you have a raw block storage device to interact with – think mkfs.ext4 /dev/zvol/poolname/zvolname, for example – but you still get all the normal ZFS features along with it, like data integrity, compression, snapshots, and so forth. Plus you don’t have to mess with a loopback device, so that should be higher performance, right? What’s not to love?

ZVOLs perform better, though, right?

AFAICT, the increased performance is pretty much a lie. I’ve benchmarked ZVOLs pretty extensively against raw disk partitions, raw LVs, raw files, and even .qcow2 files and there really isn’t much of a performance difference to be seen. A partially-allocated ZVOL isn’t going to perform any better than a partially-allocated .qcow2 file, and a fully-allocated ZVOL isn’t going to perform any better than a fully-allocated .qcow2 file. (Raw disk partitions or LVs don’t really get any significant boost, either.)

Let’s talk about snapshots.

If snapshots aren’t one of the biggest reasons you’re using ZFS, they should be, and ZVOLs and snapshots are really, really tricky and weird. If you have a dataset that’s occupying 85% of your pool, you can snapshot that dataset any time you like. If you have a ZVOL that’s occupying 85% of your pool, you cannot snapshot it, period. This is one of those things that both tyros and vets tend to immediately balk at – I must be misunderstanding something, right? Surely it doesn’t work that way? Afraid it does.

Ooh, is it demo-in-a-VM-time again?! =)

root@xenial:~# zfs create target/dataset -o compress=off -o quota=15G
root@xenial:~# pv < /dev/zero > /target/dataset/0.bin
  15GiB 0:01:13 [10.3MiB/s] [            <=>                            ]
pv: write failed: Disk quota exceeded
root@xenial:~# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
target          15.3G  3.93G    19K  /target
target/dataset  15.0G      0  15.0G  /target/dataset
root@xenial:~# zfs snapshot target/dataset@1
root@xenial:~#

Above, we created a dataset on a 20G pool, we dumped 15G of data into it, and we snapshotted the dataset. No surprises here, this is exactly what we expect.

But what happens when we try the same thing with a ZVOL?

root@xenial:~# zfs create target/zvol -V 15G -o compress=off
root@xenial:~# pv < /dev/zero > /dev/zvol/target/zvol
  15GiB 0:03:22 [57.3MiB/s] [========================================>  ] 99% ETA 0:00:00
pv: write failed: No space left on device
NAME          USED  AVAIL  REFER  MOUNTPOINT
target       15.8G  3.46G    19K  /target
target/zvol  15.5G  3.90G  15.0G  -
root@xenial:~# zfs snapshot target/zvol@1
cannot create snapshot 'target/zvol@1': out of space

Despite having 3.9G free on our pool, we can’t snapshot the zvol. If you don’t have at least as much free space in a pool as the REFER of a ZVOL on that pool, you can’t snapshot the ZVOL, period. This means for our little baby demonstration here we’d need 15G free to snapshot our 15G ZVOL. In a real-world situation with VM images, this could easily be a case where you can’t snapshot your 15TB VM image without 15 terabytes of free space available – where if you’d stuck with standard datasets, you’d be able to snapshot that same 15TB VM even with just a few hundred megabytes of AVAIL at your disposal.

TL;DR:

Think long and hard before you implement ZVOLs. Then, you know… don’t.

Could not complete SSL handshake, Socket timeout after 10 seconds with Nagios and NSClient++ 0.4x

Adding a couple of new Windows hosts to my monitoring network this morning, my NRPE plugin checks against them were failing.

me@nagios:~$ /usr/lib/nagios/plugins/check_nrpe -H monitoredwindowsserver.vpn
CHECK_NRPE: Error - could not complete SSL handshake.

The usual culprits – password set but not being used, or hosts_allowed not including the host doing the checking – were already set correctly.

Turns out the NSClient++ folks changed up the default configs. In order to get it working again with a relatively vanilla Nagios server on the other end, I needed to set two new directives under [/settings/NRPE/server]:

verify mode = none
insecure = true

With those directives set and a restart to the nsclient service on the Windows end, manual tests to NRPE worked properly:

me@nagios:~$ /usr/lib/nagios/plugins/check_nrpe -H monitoredwindowsserver.vpn
I (0.4.4.19 2015-12-08) seem to be doing fine...

One of these days I should figure out how to get the default modes, which use peer-to-peer certificate checks, working… but for the moment, I’m only allowing traffic over a VPN tunnel anyway, so it was more important to get it working in its existing (secured by VPN) configuration than to blow a few hours untangling the new defaults.

Wasn’t out of the woods yet, though – that got NRPE working, but not NSClientServer, which is what I’m actually using to monitor these Windows hosts for the most part. So I was still seeing “CRITICAL – Socket timeout after 10 seconds” on a lot of tests against the new hosts in Nagios. Doing a netstat -an on the Windows hosts themselves showed that they were listening on 5666 – the NRPE port – but nothing was listening on 12489, the NSClientServer port. This required another fix in the nsclient.ini file. Just underneath [/modules], you’ll need to add (not just uncomment!) this line:

NSClientServer = enabled

And restart the NSClient++ service again, either from Services applet or with net stop nscp ; net start nscp. Now we test the check_nt plugin against the host…

me@nagios:~$ /usr/lib/nagios/plugins/check_nt -H monitoredwindowsserver.vpn -v UPTIME -p 12489
System Uptime - 0 day(s) 1 hour(s) 34 minute(s)

Now, finally, all of my tests are working. Hope this saves somebody else from having the kind of morning I had!

Month: June 2016