Will ZFS and non-ECC RAM kill your data?

This comes up far too often, so rather than continuing to explain it over and over again, I’m going to try to do a really good job of it once and link to it here.

What’s ECC RAM? Is it a good idea?

ECC stands for Error Correcting Code. In a nutshell, ECC RAM is a special kind of server-grade memory that can detect and repair some of the most common kinds of in-memory corruption. For more detail on how ECC RAM does this, and which types of errors it can and cannot correct, the rabbit hole’s over here.

Now that we know what ECC RAM is, is it a good idea? Absolutely. In-memory errors, whether due to faults in the hardware or to the impact of cosmic radiation (yes, really), are a thing. They do happen. And if one happens in a particularly strategic place, you will lose data to it. Period. There’s no arguing this.

What’s ZFS? Is it a good idea?

ZFS is, among other things, a checksumming filesystem. This means that for every block committed to storage, a strong hash (somewhat misleadingly AKA checksum) of the contents of that block is also written. (The validation hash is written into the pointer to the block itself, which is in turn checksummed in the pointer leading to it, and so on and so forth. It’s turtles all the way down. Rabbit hole begins over here for this one.)

Is this a good idea? Absolutely. Combine ZFS checksumming with redundancy or parity, and now you have a self-healing array. If a block is corrupt on disk, the next time it’s read, ZFS will see that it doesn’t match its checksum and will load a redundant copy (in the case of mirror vdevs or multiple copy storage) or rebuild a parity copy (in the case of RAIDZ vdevs), and assuming that copy of the block matches its checksum, will silently feed you the correct copy instead, and log a checksum error against the first block that didn’t pass.

ZFS also supports scrubs, which will become important in the next section. When you tell ZFS to scrub storage, it reads every block that it knows about – including redundant copies – and checks them versus their checksums. Any failing blocks are automatically overwritten with good blocks, assuming that a good (passing) copy exists, either redundant or as reconstructed from parity. Regular scrubs are a significant part of maintaining a ZFS storage pool against long term corruption.
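
In practice, running a scrub and checking on it is a one-liner each way (a minimal sketch – “tank” is a hypothetical pool name, substitute your own):

# kick off a scrub of the pool named "tank"
zpool scrub tank
# watch progress, and see any checksum errors found and repaired
zpool status -v tank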

Is ZFS and non-ECC worse than not-ZFS and non-ECC? What about the Scrub of Death?

OK, it’s pretty easy to demonstrate that a flipped bit in RAM means data corruption: if you write that flipped bit back out to disk, congrats, you just wrote bad data. There’s no arguing that. The real issue here isn’t whether ECC is good to have, it’s whether non-ECC is particularly problematic with ZFS. The scenario usually thrown out is the much-dreaded Scrub Of Death.

TL;DR version of the scenario: ZFS is on a system with non-ECC RAM that has a stuck bit, its user initiates a scrub, and as a result of in-memory corruption good blocks fail checksum tests and are overwritten with corrupt data, thus instantly murdering an entire pool. As far as I can tell, this idea originates with a very prolific user on the FreeNAS forums named Cyberjock, and he lays it out in this thread here. It’s a scary idea – what if the very thing that’s supposed to keep your system safe kills it? A scrub gone mad! Nooooooo!

The problem is, the scenario as written doesn’t actually make sense. For one thing, even if you have a particular address in RAM with a stuck bit, you aren’t going to have your entire filesystem run through that address. That’s not how memory management works, and if it were how memory management worked, you wouldn’t even have managed to boot the system: it would have crashed and burned horribly when it failed to load the operating system in the first place. So no, you might corrupt a block here and there, but you’re not going to wring the entire filesystem through a shredder block by precious block.

But we’re being cheap here. Say you only corrupt one block in 5,000 this way. That would still be hellacious. So let’s examine the more reasonable idea of corrupting some data due to bad RAM during a scrub. And let’s assume that we have RAM that not only isn’t working 100% properly, but is actively goddamn evil and trying its naive but enthusiastic best to specifically kill your data during a scrub:

First, you read a block. This block is good. It is perfectly good data written to a perfectly good disk with a perfectly matching checksum. But that block is read into evil RAM, and the evil RAM flips some bits. Perhaps those bits are in the data itself, or perhaps those bits are in the checksum. Either way, your perfectly good block now does not appear to match its checksum, and since we’re scrubbing, ZFS will attempt to actually repair the “bad” block on disk. Uh-oh! What now?

Next, you read a copy of the same block – this copy might be a redundant copy, or it might be reconstructed from parity, depending on your topology. The redundant copy is easy to visualize – you literally stored another copy of the block on another disk. Now, if your evil RAM leaves this block alone, ZFS will see that the second copy matches its checksum, and so it will overwrite the first block with the same data it had originally – no data was lost here, just a few wasted disk cycles. OK. But what if your evil RAM flips a bit in the second copy? Since it doesn’t match the checksum either, ZFS doesn’t overwrite anything. It logs an unrecoverable data error for that block, and leaves both copies untouched on disk. No data has been corrupted. A later scrub will attempt to read all copies of that block and validate them just as though the error had never happened, and if this time either copy passes, the error will be cleared and the block will be marked valid again (with any copies that don’t pass validation being overwritten from the one that did).

Well, huh. That doesn’t sound so bad. So what does your evil RAM need to do in order to actually overwrite your good data with corrupt data during a scrub? Well, first it needs to flip some bits during the initial read of every block that it wants to corrupt. Then, on the second read of a copy of the block from parity or redundancy, it needs to not only flip bits, it needs to flip them in such a way that you get a hash collision. In other words, random bit-flipping won’t do – you need some bit flipping in the data (with or without some more bit-flipping in the checksum) that adds up to the corrupt data correctly hashing to the value in the checksum. ZFS uses 256-bit validation hashes (fletcher4 by default, or SHA-256 if you select it), which means that a random bit-flip has roughly a 1 in 2^256 chance of giving you a corrupt block which still matches its checksum. To be fair, we’re using evil RAM here, so it’s probably going to do lots of experimenting, and it will try flipping bits in both the data and the checksum itself, and it will do so multiple times for any single block. However, that’s multiple 1 in 2^256 (aka roughly 1 in 10^77) chances, which still makes it vanishingly unlikely to actually happen… and if your RAM is that damn evil, it’s going to kill your data whether you’re using ZFS or not.

But what if I’m not scrubbing?

Well, if you aren’t scrubbing, then your evil RAM will have to wait for you to actually write to the blocks in question before it can corrupt them. Fortunately for it, though, you write to storage pretty much all day long… including to the metadata that organizes the whole kit and kaboodle. First time you update the directory that your files are contained in, BAM! It’s gotcha! If you stop and think about it, in this evil RAM scenario ZFS is incredibly helpful, because your RAM now needs to not only be evil but be bright enough to consistently pull off collision attacks. So if you’re running non-ECC RAM that turns out to be appallingly, Lovecraftianishly evil, ZFS will mitigate the damage, not amplify it.

If you are using ZFS and you aren’t scrubbing, by the way, you’re setting yourself up for long term failure. If you have on-disk corruption, a scrub can fix it only as long as you really do have a redundant or parity copy of the corrupted block which is good. Once you corrupt all copies of a given block, it’s too late to repair it – it’s gone. Don’t be afraid of scrubbing. (Well, maybe be a little wary of the performance impact of scrubbing during high demand times. But don’t be worried about scrubbing killing your data.)
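
If you’d rather not rely on remembering to scrub, a simple cron entry handles it (a sketch – “tank” is again a hypothetical pool name; pick a low-demand window that suits your environment):

# /etc/cron.d/zfs-scrub – scrub the pool "tank" every Sunday at 3 AM
0 3 * * 0 root /sbin/zpool scrub tank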

I’ve constructed a doomsday scenario featuring RAM evil enough to kill my data after all! Mwahahaha!

OK. But would using any other filesystem that isn’t ZFS have protected that data? ‘Cause remember, nobody’s arguing that you can lose data to evil RAM – the argument is about whether evil RAM is more dangerous with ZFS than it would be without it.

I really, really want to use the Scrub Of Death in a movie or TV show. How can I make it happen?

What you need here isn’t evil RAM, but an evil disk controller. Have it flip one bit per block read or written from disk B, but leave the data from disk A alone. Now scrub – every block on disk B will be overwritten with a copy from disk A, but the evil controller will flip bits on write, so now, all of disk B is written with garbage blocks. Now start flipping bits on write to disk A, and it will be an unrecoverable wreck pretty quickly, since there’s no parity or redundancy left for any block. Your choice here is whether to ignore the metadata for as long as possible, giving you the chance to overwrite as many actual data blocks as you can before the jig is up as they are written to by the system, or whether to pounce straight on the metadata and render the entire vdev unusable in seconds – but leave the actual data blocks intact for possible forensic recovery.

Alternately, you could just skip the scrub theatrics entirely and start flipping bits as data is written to any or all individual devices, and you’ll produce real data loss quickly enough. But you specifically wanted a scrub of death, not just bad hardware, right?

I don’t care about your logic! I wish to appeal to authority!

OK. “Authority” in this case doesn’t get much better than Matthew Ahrens, one of the cofounders of ZFS at Sun Microsystems and current ZFS developer at Delphix. In the comments to one of my filesystem articles on Ars Technica, Matthew said “There’s nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem.”

Hope that helps. =)

Three Step Guide to X11 Forwarding

Got a graphical application you want to run on a Linux box, but display on a Windows box? It’s stupidly easy. I can’t believe how long it took me to learn how to do this, even though I knew it was possible. Hopefully, this will save some other sysadmin from not having this trick in the toolbox. (It’s particularly useful for running virt-manager when you don’t have a Linux machine to sit in front of.)

Step 1: download and install Xming (probably from Softpedia, since Sourceforge is full of malware and BS misleading downloads now)

Step 2: in PuTTY’s configs on your Windows box, Connection –> SSH –> X11 –> check the “Enable X11 Forwarding” box.

Step 3: SSH into a Linux box, and run a GUI application from the command line. Poof, the app shows up on your Windows desktop!
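
(Incidentally, if the machine you’re sitting at is itself running Linux or anything else with an X server, you don’t even need Xming or PuTTY – plain old ssh does the forwarding. A quick sketch, with a hypothetical hostname and virt-manager as the example app:)

me@mydesktop:~$ ssh -X me@linuxbox
me@linuxbox:~$ virt-manager &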

Avahi killed my server :'(

Avahi is the equivalent to Apple’s “Bonjour” zeroconf network service. It installs by default with the ubuntu-desktop meta-package, which I generally use to get, you guessed it, a full desktop on virtualization host servers. This never caused me any issues until today.

Today, though – on a server with dual network interfaces, both used as bridge ports on its br0 adapter – Avahi apparently decided “screw the configuration you specified in /etc/network/interfaces, I’m going to give your production virt host bridge an autoconf address. Because I want to be helpful.”

When it did so, the host dropped off the network, I got alarms on my monitoring service, and I couldn’t so much as arp the host, much less log into it. So I drove down to the affected office and did an ifconfig br0, which showed me the following damning bit of evidence:

me@box:~$ ifconfig br0
br0       Link encap:Ethernet  HWaddr 00:0a:e4:ae:7e:4c
         inet6 addr: fe80::20a:e4ff:feae:7e4c/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:11 errors:0 dropped:0 overruns:0 frame:0
         TX packets:96 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:3927 (3.8 KB)  TX bytes:6970 (6.8 KB)

br0:avahi Link encap:Ethernet  HWaddr 00:0a:e4:ae:7e:4c
         inet addr:169.254.6.229  Bcast:169.254.255.255  Mask:255.255.0.0
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

Oh, Avahi, you son-of-a-bitch. Was there anything wrong with the actual NIC? Certainly didn’t look like it – had link lights on the NIC and on the switch, and sure enough, ifdown br0 ; ifup br0 brought it right back online again.

Can we confirm that avahi really was the culprit?

/var/log/syslog:Jan  9 09:10:58 virt0 avahi-daemon[1357]: Withdrawing address record for [redacted IP] on br0.
/var/log/syslog:Jan  9 09:10:58 virt0 avahi-daemon[1357]: Leaving mDNS multicast group on interface br0.IPv4 with address [redacted IP].
/var/log/syslog:Jan  9 09:10:58 virt0 avahi-daemon[1357]: Interface br0.IPv4 no longer relevant for mDNS.
/var/log/syslog:Jan  9 09:10:59 virt0 avahi-autoipd(br0)[12460]: Found user 'avahi-autoipd' (UID 111) and group 'avahi-autoipd' (GID 121).
/var/log/syslog:Jan  9 09:10:59 virt0 avahi-autoipd(br0)[12460]: Successfully called chroot().
/var/log/syslog:Jan  9 09:10:59 virt0 avahi-autoipd(br0)[12460]: Successfully dropped root privileges.
/var/log/syslog:Jan  9 09:10:59 virt0 avahi-autoipd(br0)[12460]: Starting with address 169.254.6.229
/var/log/syslog:Jan  9 09:11:03 virt0 avahi-autoipd(br0)[12460]: Callout BIND, address 169.254.6.229 on interface br0
/var/log/syslog:Jan  9 09:11:03 virt0 avahi-daemon[1357]: Joining mDNS multicast group on interface br0.IPv4 with address 169.254.6.229.
/var/log/syslog:Jan  9 09:11:03 virt0 avahi-daemon[1357]: New relevant interface br0.IPv4 for mDNS.
/var/log/syslog:Jan  9 09:11:03 virt0 avahi-daemon[1357]: Registering new address record for 169.254.6.229 on br0.IPv4.
/var/log/syslog:Jan  9 09:11:07 virt0 avahi-autoipd(br0)[12460]: Successfully claimed IP address 169.254.6.229

I know I said this already, but – oh, avahi, you worthless son of a bitch!

Next step was to kill it and disable it.

me@box:~$ sudo stop avahi-daemon
me@box:~$ echo manual | sudo tee /etc/init/avahi-daemon.override
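
(Those are Upstart-era commands; on newer, systemd-based releases, the rough equivalent – untested here – would be:)

me@box:~$ sudo systemctl stop avahi-daemon.service avahi-daemon.socket
me@box:~$ sudo systemctl disable avahi-daemon.service avahi-daemon.socket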

Grumble grumble grumble. Now I’m just wondering why I’ve never had this problem before… I suspect it’s something to do with having dual NICs on the bridge, and one of them not being plugged in (I only added them both so it wouldn’t matter which one actually got plugged in if the box ever got moved somewhere).

MSE Install fails with 0x8004FF91

Well, this was an annoying one, and it’s hard to find the one thread that actually addresses it amongst all the ones conflating it with an off-by-one error code (subtract one from 0x8004FF91; I’m not going to actually spell it out here, to avoid poisoning Google).

TL;DR if you can’t install Microsoft Security Essentials – even on a brand new install of Win7 64 bit – it’s probably due to Windows Update KB3004394. Uninstall that update, and MSE will install just fine.

UPDATE: KB3004394 has been acknowledged as bad by MS. And the problems are actually a lot more far-reaching than just MSE installs; the KB botched an implementation of root certificate checking that causes all code signing checks to fail. Affected systems (Win7 SP1 and Win2008 R2 SP1 at least) will not be able to install signed device drivers, will not be able to install MSE, will get unexpected UAC prompts in weird places (due to signed code suddenly appearing unsigned and therefore untrusted)… oh, yeah, and Windows Update will fail, meaning that they’ll have to be manually fixed by either uninstalling the bad KB (at which point Windows Update will work again) or by manually downloading and installing KB3024777. I repeat: manually – you can’t get it from Windows Update until Windows Update actually works, so…
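
For what it’s worth, both halves of that fix can be scripted from an elevated command prompt (a sketch – the .msu path is a placeholder for wherever you saved the manually downloaded KB3024777 package):

rem remove the bad update
wusa /uninstall /kb:3004394 /quiet /norestart
rem install the fix you downloaded manually (hypothetical path)
wusa C:\temp\KB3024777.msu /quiet /norestart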

Get it all in one sock, Microsoft.

The SSLv3 “POODLE” attack in a (large) nutshell

A summary of the POODLE sslv3 vulnerability and attack:

A vulnerability has been discovered in a decrepit-but-still-widely-supported version of SSL, SSLv3, which gives an attacker a good chance at determining the true value of a single byte of encrypted traffic. This is of limited use in most applications, but in HTTPS (e.g. your web browser, many mobile applications, etc) an attacker in an MITM (Man-In-The-Middle) position, such as someone operating a wireless router you connect to, can capture and resend the traffic repeatedly until they manage to get a valuable chunk of it assembled in the clear. (This is done by manipulating cleartext traffic, to the same or any other site, injecting some Javascript into that traffic to get your browser to run it. The rogue JS function is what reloads the secure site, offscreen where you can’t see it happening, until the attacker gets what s/he needs out of it.)

That “valuable chunk” is the cookie that validates your user login on whatever secure website you happen to be browsing – your bank, webmail, ebay or amazon account, etc. By replaying that cookie, the attacker can now hijack your logged in session directly on his/her own device, and from there can do anything that you would be able to do – make purchases, transfer funds, change the password, change the associated email account, et cetera.

It reportedly takes 60 seconds or less for an attacker in a MITM position (again, typically someone in control of a router your traffic is being directed through, which is most often going to be a wireless router – maybe even one you don’t realize you’ve connected to) to replay traffic enough to capture the cookie using this attack.

Worth noting: SSLv3 is hopelessly obsolete, but it’s still widely supported in part because IE6/Windows XP need it, and so many large enterprises STILL are using IE6. Many sites and servers have proactively disabled SSLv3 for quite some time already, and for those, you’re fine. However, many large sites still have not – a particularly egregious example being Citibank, to whom you can still connect with SSLv3 today. As long as both your client application (web browser) and the remote site (web server) both support SSLv3, a MITM can force a downgrade dance, telling each side that the OTHER side only supports SSLv3, forcing that protocol even though it’s strongly deprecated.

I’m an end user – what do I do?

Disable SSLv3 in your browser. If you use IE, there’s a checkbox in Internet Options you can uncheck to remove SSLv3 support. If you use Firefox, there’s a plugin for that. If you use Chrome, you can start Chrome with a command-line option that disables SSLv3 for now, but that’s kind of a crappy “fix”, since you’d have to make sure to start Chrome either from the command line or from a particular shortcut every time (and, for example, clicking a link in an email that started up a new Chrome instance would fail to do so).
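
(For reference, the Chrome flag in question is the same one used in the Ubuntu fix below – something like this on the command line or in a shortcut target:)

chrome.exe --ssl-version-min=tls1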

Instructions, with screenshots, are available at https://zmap.io/sslv3/ and I won’t try to recreate them here; they did a great job.

I will note specifically here that there’s a fix for Chrome users on Ubuntu that does fairly trivially mitigate even use-cases like clicking a link in an email with the browser not already open:


* Open /usr/share/applications/google-chrome.desktop in a text editor
* For any line that begins with "Exec", add the argument --ssl-version-min=tls1
* For instance, the line "Exec=/usr/bin/google-chrome-stable %U" should become "Exec=/usr/bin/google-chrome-stable --ssl-version-min=tls1 %U"

You can test to see if your fix for a given browser worked by visiting https://zmap.io/sslv3/ again afterwards – there’s a banner at the top of the page which will warn you if you’re vulnerable. WARNING: caching is enabled on that page, meaning you will have to force-refresh to make certain that you aren’t seeing the old cached version with the banner intact – on most systems, pressing ctrl-F5 in your browser while on the page will do the trick.

I’m a sysadmin – what do I do?

Disable SSLv3 support in any SSL-enabled service you run – Apache, nginx, postfix, dovecot, etc. Worth noting – there is currently no known way to usefully exploit the POODLE vulnerability with IMAPS or SMTPS or any other arbitrary SSL-wrapped protocol; currently HTTPS is the only known protocol that allows you to manipulate traffic in a useful enough way. I would not advise banking on that, though. Disable this puppy wherever possible.

The simplest way to test if a service is vulnerable (at least, from a real computer – Windows-only admins will need to do some more digging):

openssl s_client -connect mail.jrs-s.net:443 -ssl3

The above snippet would check my mailserver. The correct (sslv3 not available) response begins with a couple of error lines:

CONNECTED(00000003)
140301802776224:error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure:s3_pkt.c:1260:SSL alert number 40
140301802776224:error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure:s3_pkt.c:596:

What you DON’T want to see is a return with a certificate chain in it:

CONNECTED(00000003)
depth=1 C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = PositiveSSL CA 2
verify error:num=20:unable to get local issuer certificate
verify return:0
---
Certificate chain
0 s:/OU=Domain Control Validated/OU=PositiveSSL/CN=mail.jrs-s.net
i:/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=PositiveSSL CA 2
1 s:/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=PositiveSSL CA 2
i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root

On Apache on Ubuntu, you can edit /etc/apache2/mods-available/ssl.conf and find the SSLProtocol line and change it to the following:

SSLProtocol all -SSLv2 -SSLv3

Then restart Apache with /etc/init.d/apache2 restart, and you’re golden.
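
nginx users have it just as easy – one directive does the trick (a quick sketch; exact file layout varies by distro):

# in the http or server block of your nginx config
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;

Reload nginx afterwards, same idea as with Apache.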

I haven’t had time to research Postfix or Dovecot yet, which are my other two big concerns (even though they theoretically shouldn’t be vulnerable since there’s no way for the attacker to manipulate SMTPS or IMAPS clients into replaying traffic repeatedly).

Possibly also worth noting – I can’t think of any way for an attacker to exploit POODLE without the ability both to manipulate your cleartext traffic and to run Javascript in your browser, so if you want to disable Javascript completely (which is pretty useless advice, since it would break the vast majority of the web), or if you’re using a command-line tool like wget for something, you should be safe.

Allowing traceroutes to succeed with iptables

There is a LOT of bogus half-correct information about traceroutes and iptables floating around out there.  It took me a bit of sifting through it all to figure out the real deal and the best way to allow traceroutes without negatively impacting security this morning, so here’s some documentation in case I forget before the next time.

Traceroute from Windows machines typically uses ICMP Type 8 packets.  Traceroute from Unixlike machines typically uses UDP packets with sequentially increasing destination ports, from 33434 to 33534.  So your server (the traceroute destination) must not drop incoming ICMP Type 8 or UDP 33434:33534.

Here’s where it gets tricky: it really doesn’t need to accept those packets either, which is what the vast majority of sites addressing this issue recommend.  It just needs to be able to reject them, which won’t happen if they’re being dropped.  If you implement the typical advice – accepting those packets – traceroute basically ends up sort of working by accident: those ports shouldn’t be in use by any running applications, and since nothing is listening on them, the server will issue an ICMP Type 3 response (destination unreachable).  However, if you’re accepting packets to these ports, then a rogue application listening on those ports also becomes reachable – which is the sort of thing your firewall should be preventing in the first place.

The good news is, DROP and ACCEPT aren’t your only options – you can REJECT these packets instead, which will do exactly what we want here: allow traceroutes to work properly without also potentially enabling some rogue application to listen on those UDP ports.

So all you really need on your server to allow incoming traceroutes to work properly is:

# allow ICMP Type 8 (ping, ICMP traceroute)
-A INPUT -p icmp --icmp-type 8 -j ACCEPT
# enable UDP traceroute rejections to get sent out
-A INPUT -p udp --dport 33434:33523 -j REJECT

Note: you may very well need and/or want more ICMP functionality than this in general – but this is all you need for incoming traceroutes to complete properly.
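
Once the rules are in, it’s easy to verify from outside (yourserver.example.com being a placeholder, obviously):

# UDP-based traceroute from a Linux/Unix client
traceroute yourserver.example.com
# ICMP-based traceroute from a Windows client
tracert yourserver.example.com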

Selectively disabling Windows UAC for individual applications

Today a client emailed me to report that since installing Quickbooks “Enterprise” (note the scare quotes there. they are used with malice), her users (who are, sensibly, not Administrators) were faced with a User Account Control prompt (“Do you want to allow the following program to make changes to your computer?”) every time they opened the new version of Quickbooks.  A little further investigation showed that “DBManagerExe.exe” was the actual file throwing the UAC dialog.  Absolutely no information from Intuit is available whatsoever about how or why this program wants Administrator privileges, ways to nerf it, etc – apparently this “Enterprise” product is just supposed to be run in “Enterprises” by users who are allowed full Administrator privileges.  Because, you know, that’s what “Enterprises” do.  Delightful.

I chased the issue around and around trying to figure out what DBManagerExe.exe actually wanted access to, so I could just grant that to the users… but eventually I was forced to give up and just disable UAC selectively for that one program.  Luckily, while the process is rather arcane, it’s not actually HARD.  So let’s document it here.

1. Download the Microsoft Application Compatibility Toolkit.  I won’t link it here, to avoid creating stale links – just Google it, it should come right up.  Pick the latest version available (currently, 5.6).  Run the installer.

2. start –> all programs –> Microsoft Application Compatibility Toolkit –> Compatibility Administrator (32-bit) or Compatibility Administrator (64-bit), as appropriate. Note: just because your system is 64-bit does not necessarily mean that’s the Compatibility Administrator you want here – this needs to match the application you want to selectively allow UAC-less admin privileges for, not the system as a whole!  For DBManagerEXE.exe, I needed to select 32-bit.  Further note: if you are not logged in as the actual Administrator account, you should right-click and “Run As Administrator” to open the Compatibility Manager.  Otherwise, your “fix” won’t fix anything.

3. Click the “Fix” icon on the top toolbar.  Click “Browse” to find the executable you want to enable – for me, it was C:\Program Files (x86)\Intuit\QuickBooks Enterprise Solutions 14.0\DBManagerExe.exe.  Now, enter the name of the program and vendor in the two text boxes above the location in the dialog – this will make it easier to manage later, if you ever need to figure out what you’ve done and to whom.  Click Next.

4. Under Compatibility Modes, click none.  You don’t want this.  (Unless you do, of course, but Compatibility Modes aren’t needed for nerfing UAC dialogs, they’re for something COMPLETELY different and certainly aren’t applicable to running Quickbooks Enterprise 2014, in this case.)  Click Next.

5. Find RunAs Invoker on the list of Compatibility Fixes.  Check it.  Don’t mess with anything else.  Click Next, then click Finish.

6. Save your database (from the button on the toolbar).  Give it a name that makes sense, and save it in C:\Windows\System32.

7. File –> Install from the top menu.  You’ll get a dialog box confirming that you’ve installed your fix.  You should be done now.

Log in as an unprivileged user and test – in my case, for enabling non-Administrators to open Quickbooks “Enterprise” 2014, it worked flawlessly – no more UAC prompt; the user now goes straight to the new setup wizard, as they should.
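
Incidentally, if you want to sanity-check that RunAs Invoker will actually solve your problem before building a whole compatibility database, the __COMPAT_LAYER environment variable does the same thing for a single launch – a quick sketch in a throwaway .bat file, using the Quickbooks path from above:

@echo off
rem apply the RunAsInvoker compatibility layer to whatever we launch next
set __COMPAT_LAYER=RunAsInvoker
start "" "C:\Program Files (x86)\Intuit\QuickBooks Enterprise Solutions 14.0\DBManagerExe.exe"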

Note: for this particular diabolically badly written application, just disabling UAC probably won’t be enough: QuickBooks also tends to fail miserably at starting its database manager service, due to not placing its service user into the local Administrators group. Each year of QB will create its own service user, in the form QBDataServiceUser24 or similar. If you’re here specifically for Quickbooks and you still get a nasty – this time non-Windows – “you need to be administrator” prompt when you launch QB, you’ll need to find your local service user for the year of Quickbooks in question and add it to the local Administrators group on your machine. Yay, Intuit.

Fascinating insight into IT and non-profits

If you’ve ever wondered what the typical non-profit looks like in terms of IT budget – everything from salaries, to head count, to non-salary budget per staff, to non-salary budget per category (project management, outsourced services, hardware, software, more…) – NTEN has got you covered.

http://www.nten.org/research/download_it_staffing_2012

This is a free download, though they do ask for your name and email address before you can click through to the PDF.  Fascinating, very deep dive, and totally worth it whether you’re a decision maker in a non profit yourself, a service provider with non-profit customers, or even just somebody curious about how organizations function.

OpenVPN on BeagleBone Black

This is my new Beaglebone Black. Enormous, isn’t it?

I needed an inexpensive embedded device for OpenVPN use, and my first thought (actually, my tech David’s first thought) was the obvious in this day and age: “Raspberry Pi.”

Unfortunately, the Pi didn’t really fit the bill.  Aside from the unfortunate fact that my particular Pi arrived with a broken ethernet port, doing some quick network-less testing of OpenSSL gave me very disappointing numbers – 5 Mbps or so, running flat out, for the encryption alone, let alone any actual routing.  This matched up with some reviews I found online, so I had to give up on the Pi as an embedded solution for OpenVPN use.

Luckily, that didn’t mean I was sunk yet – enter the Beaglebone Black.  Beaglebone doesn’t get as much press as the Pi does, but it’s an interesting device with an interesting history – it’s been around longer than the Pi (the BeagleBoard family dates back to 2008), it’s fully open source where the Pi is not (hardware plans are published online, and other vendors are not only allowed but encouraged to build bit-for-bit identical devices!), and although it doesn’t have the video chops of the Pi (no 1080p resolution supported), it has a much better CPU – a 1GHz Cortex-A8, vs the Pi’s 700MHz ARM11.  If all that isn’t enough, the Beaglebone also has built-in 2GB eMMC flash with a preloaded installation of Angstrom Linux, and – again unlike the Pi – directly supports being powered from plain old USB connected to a computer.  Pretty nifty.

The only real hitch I had with my Beaglebone was not realizing that if I had an SD card in, it would attempt to boot from the SD card, not from the onboard eMMC.  Once I disconnected my brand new Samsung MicroSD card and power cycled the Beaglebone, though, I was off to the races.  It boots into Angstrom pretty quickly, and thanks to the inclusion of the Avahi daemon in the default installation, you can discover the device (from linux at least – haven’t tested Windows) by just pinging beaglebone.local.  Once that resolves, ssh root@beaglebone.local with a default password, and you’re embedded-Linux-ing!

Angstrom doesn’t have any prebuilt packages for OpenVPN, so I downloaded the source from openvpn.net and did the usual ./configure ; make ; make install fandango.  I did have one minor hitch – the system clock wasn’t set, so ./configure bombed out complaining about files in the future.  Easily fixed – ntpdate us.pool.ntp.org updated my clock, and this time the package built without incident, needing somewhere south of 5 minutes to finish.  After that, it was time to test OpenVPN’s throughput – which, spoiler alert, was a total win!
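
(For reference, the whole build, clock fix included, boils down to something like this – a sketch, using whatever the current source tarball from openvpn.net happens to be:)

# set the clock first, or ./configure will complain about files in the future
ntpdate us.pool.ntp.org
# then the usual source-build fandango
tar xzf openvpn-*.tar.gz && cd openvpn-*/
./configure && make && make install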

root@beaglebone:~# openvpn --genkey --secret beagle.key ; scp beagle.key me@locutus:/tmp/
root@beaglebone:~# openvpn --secret beagle.key --port 666 --ifconfig 10.98.0.1 10.98.0.2 --dev tun
me@locutus:/tmp$ sudo openvpn --secret beagle.key --remote beaglebone.local --port 666 --ifconfig 10.98.0.2 10.98.0.1 --dev tun

Now I have a working tunnel between locutus and my beaglebone.  Opening a new terminal on each, I ran iperf to test throughput.  To run iperf (which was already available on Angstrom), you just run iperf -s on the server machine, and run iperf -c [ip address] on the client machine to connect to the server.  I tested connectivity both ways across my OpenVPN tunnel:

me@locutus:~$ iperf -c 10.98.0.1
------------------------------------------------------------
Client connecting to 10.98.0.1, TCP port 5001
TCP window size: 21.9 KByte (default)
------------------------------------------------------------
[ 3] local 10.98.0.2 port 55873 connected with 10.98.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 46.2 MBytes 38.5 Mbits/sec
me@locutus:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.98.0.2 port 5001 connected with 10.98.0.1 port 32902
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 47.0 MBytes 39.2 Mbits/sec

38+ mbps from an inexpensive embedded device?  I’ll take it!

Apache 2.4 / Ubuntu Trusty problems

Found out the hard way today that there’ve been SIGNIFICANT changes in configuration syntax and requirements since Apache 2.2, when I tried to set up a VERY simple couple of vhosts on Apache 2.4.7 on a brand new Ubuntu Trusty Tahr install.

First – the a2ensite/a2dissite scripts refuse to work unless your vhost config files end in .conf. BE WARNED. Example:

you@trusty:~$ ls /etc/apache2/sites-available
000-default.conf
default-ssl.conf
testsite.tld
you@trusty:~$ sudo a2ensite testsite.tld
ERROR: Site testsite.tld does not exist!

The solution is a little annoying; you MUST end the filename of your vhost configs in .conf – after that, a2ensite and a2dissite work as you’d expect.

you@trusty:~$ sudo mv /etc/apache2/sites-available/testsite.tld /etc/apache2/sites-available/testsite.tld.conf
you@trusty:~$ sudo a2ensite testsite.tld
Enabling site testsite.tld
To activate the new configuration, you need to run:
  service apache2 reload

After that, I had a more serious problem. The “site” I was trying to enable was nothing other than a simple exposure of a directory (a local ubuntu mirror I had set up) – no php, no cgi, nothing fancy at all. Here was my vhost config file:

<VirtualHost *:80>
        ServerName us.archive.ubuntu.com
        ServerAlias us.archive.ubuntu.local 
        Options Includes FollowSymLinks MultiViews Indexes
        DocumentRoot /data/apt-mirror/mirror/us.archive.ubuntu.com
	<Directory /data/apt-mirror/mirror/us.archive.ubuntu.com/>
	        Options Indexes FollowSymLinks
	        AllowOverride None
	</Directory>
</VirtualHost>

Can’t get much simpler, right? This would have worked fine in any previous version of Apache, but not in Apache 2.4.7, the version supplied with Trusty Tahr 14.04 LTS.

Every attempt to browse the directory gave me a 403 Forbidden error, which confused me to no end, since the directories were chmod 755 and chgrp www-data. Checking Apache’s error log gave me pages on pages of lines like this:

[Mon Jun 02 10:45:19.948537 2014] [authz_core:error] [pid 27287:tid 140152894646016] [client 127.0.0.1:40921] AH01630: client denied by server configuration: /data/apt-mirror/mirror/us.archive.ubuntu.com/ubuntu/

What I eventually discovered was that since 2.4, Apache not only requires explicit authorization configuration for every directory to be browsed, the syntax has changed as well. The old “Order allow,deny” and “Allow from all” won’t cut it – you now need “Require all granted”. Here is my final working vhost .conf file:

<VirtualHost *:80>
        ServerName us.archive.ubuntu.com
        ServerAlias us.archive.ubuntu.local 
        Options Includes FollowSymLinks MultiViews Indexes
        DocumentRoot /data/apt-mirror/mirror/us.archive.ubuntu.com
	<Directory /data/apt-mirror/mirror/us.archive.ubuntu.com/>
	        Options Indexes FollowSymLinks
	        AllowOverride None
                Require all granted
	</Directory>
</VirtualHost>
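
After saving that, a quick syntax check and a reload pick up the change:

you@trusty:~$ sudo apache2ctl configtest
Syntax OK
you@trusty:~$ sudo service apache2 reload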

Hope this helps someone else – this was a frustrating start to the morning for me.