520 byte sectors and Ubuntu

I recently bought a server which came with Samsung PM1643 SSDs. Trying to install Ubuntu on them didn’t work at first try, because the drives had 520 byte sectors instead of 512 byte.

Luckily, there’s a fix–get the drive(s) to a WORKING Ubuntu system, plug them in, and use the sg_format utility to convert the sector size!

root@ubuntu:~# sg_format -v --format --size=512 /dev/sdwhatever

Yep, it’s really that easy. Be warned, this is a destructive, touch-every-sector operation–so it will take a while, and your drives might get a bit warm. The 3.84TB drives I needed to convert took around 10 minutes apiece.

On the plus side, this also fixes any drive slowdowns due to a lack of TRIM, since it’s a destructive sector-level format.

I’ve heard stories of drives that refused to sg_format initially; if you encounter stubborn drives, you might be able to unlock them by dding a gigabyte or so to them–or you might need to first sg_format them with --size=520, then immediately and I mean immediately again with --size=512.

WSL2, keychain, /etc/hosts and you

There unfortunately are still a few stumbling blocks toward getting a properly, fully-working virt-manager setup running under WSL2 on Windows 11.

apt install virt-manager just works, of course–but getting WSL2 to properly handle hostnames and SSH key passphrases takes a bit of tweaking.

First up, install a couple of additional packages:

apt install keychain ssh-askpass

The keychain package allows WSL2 to cache the passphrases for your SSH keys, and ssh-askpass allows virt-manager to bump requests up to you when necessary.

If you haven’t already done so, first generate yourself an SSH key and give it a passphrase:

me@my-win11:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (~/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in ~/.ssh/id_rsa
Your public key has been saved in ~/.ssh/id_rsa.pub

You will also need to configure keychain itself, by adding the following to the end of your .bashrc:

# For Loading the SSH key
/usr/bin/keychain -q --nogui $HOME/.ssh/id_rsa
source $HOME/.keychain/$HOSTNAME-sh

Now, you’ll enter in your SSH key passphrase each time you open a WSL2 terminal, and it will remember it for SSH sessions opened via that terminal (or via apps opened from that terminal, eg if you type in virt-manager).

If you like to set hostnames in /etc/hosts to make your virt-manager connections look more reasonable, there’s one more step necessary. By default, for some reason WSL2 clobbers /etc/hosts each time it’s started.

You can defang this by creating /etc/wsl.conf and inserting this stanza:

[network]
generateHosts = false

Presto, you can now have a nice, secure, and well-working virt-manager under your Windows 11 WSL2 instance!

screenshot of virt-manager under WSLg
I also edited this screenshot with Ubuntu GiMP installed under WSL2 with apt install gimp. Because of course I did.

One final caveat: I do not recommend trying to create a shortcut in Windows to open virt-manager directly.

You can do that… but if you do, you’re liable to break things badly enough to require a Windows reboot. Windows 11 really doesn’t like launching WSL2 apps directly from a batch file, rather than from within a fully-launched WSL2 terminal!

When NOTHING else will remove a half-installed .deb package

WARNING: ALL WARRANTIES NULL AND VOID

With that important disclaimer out of the way… when you’re stuck in the world’s worst apt -f install loop and can’t figure out any other way to get the damn thing unwedged when there’s a half-installed package (eg if you’ve removed an /etc directory for a package you installed before, and this breaks an installer script—or the installer script “knows you already have it” and refuses to replace a removed config directory), this is the nuclear option:

sudo nano /var/lib/dpkg/status

Remove the offending package (and any packages that depend on it) entirely from this file, then apt install the offending package again.

If you’re still broken after that… did I mention all warranties null and void? This is an extremely nuclear option, and I really wouldn’t recommend it outside a throwaway test environment; you’re probably better off just nuking the whole server and reinstalling.

With that said, the next thing I had to do to clean out the remnants of the mysql/mariadb coinstall debacle that inspired this post was:

find / -iname “*mysql*” | grep -v php | grep -v snap | xargs rm -r

This got me out of a half-broken state on a machine that somebody had installed both mariadb-server and mysql-server on, leaving neither working properly.

Rebalancing data on ZFS mirrors

One of the questions that comes up time and time again about ZFS is “how can I migrate my data to a pool on a few of my disks, then add the rest of the disks afterward?”

If you just want to get the data moved and don’t care about balance, you can just copy the data over, then add the new disks and be done with it. But, it won’t be distributed evenly over the vdevs in your pool.

Don’t fret, though, it’s actually pretty easy to rebalance mirrors. In the following example, we’ll assume you’ve got four disks in a RAID array on an old machine, and two disks available to copy the data to in the short term.

Step one: create the new pool, copy data to it

First up, we create a simple temporary zpool with the two available disks.

zpool create temp -oashift=12 mirror /dev/disk/by-id/wwn-disk0 /dev/disk/by-id/disk1

Simple. Now you’ve got a ZFS mirror named temp, and you can start copying your data to it.

Step two: scrub the pool

Do not skip this step!

zpool scrub temp

Once this is done, do a zpool status temp to make sure you don’t have any errors. Assuming you don’t, you’re ready to proceed.

Step three: break the mirror, create a new pool

zpool detach temp /dev/disk/by-id/disk1

Now, your temp pool is down to one single disk vdev, and you’ve freed up one of its original disks. You’ve also got a known good copy of all your data on disk0, and you’ve verified it’s all good by using a zpool scrub command in step two. So, destroy the old machine’s storage, freeing up its four disks for use. 

zpool create tank /dev/disk/by-id/disk1 mirror /dev/disk/by-id/disk2 /dev/disk/by-id/disk3 mirror /dev/disk/by-id/disk4 /dev/disk/by-id/disk5

Now you’ve got your original temporary pool named temp, and a new permanent pool named tank. Pool “temp” is down to one single-disk vdev, and pool “tank” has one single-disk vdev, and two mirror vdevs.

Step four: copy your data from temp to tank

Copy all your data one more time, from the single-disk pool “temp” to the new pool “tank.” You can use zfs replication for this, or just plain old cp or rsync. Your choice.

Step five: scrub tank, destroy temp

Do not skip this step.

zpool scrub tank

Once this is done, do a zpool status tank to make sure you don’t have any errors. Assuming you don’t, now it’s time to destroy your temporary pool to free up its disk.

zpool destroy temp

Almost done!

Step six: attach the final disk from temp to the single-disk vdev in tank

zpool attach tank /dev/disk/by-id/disk0 /dev/disk/by-id/disk1

That’s it—you now have all of your data imported to a six-disk pool of mirrors, and all of the data is evenly distributed (according to disk size, at least) across all vdevs, not all clumped up on the first one to be added.

You can obviously adjust this formula for larger (or smaller!) pools, and it doesn’t necessarily require importing from an older machine—you can use this basic technique to redistribute data across an existing pool of mirrors, too, if you add a new mirror vdev. 

The important concept here is the idea of breaking mirror vdevs using zpool detach, and creating mirror vdevs from single-disk vdevs using zpool attach.

 

Heads up—Let’s Encrypt and Dovecot

Let’s Encrypt certificates work just dandy not only for HTTPS, but also for SSL/TLS on IMAP and SMTP services in mailservers. I deployed Let’s Encrypt to replace manually-purchased-and-deployed certificates on a client server in 2019, and today, users started reporting they were getting certificate expiration errors in mail clients.

When I checked the server using TLS checking tools, they reported that the certificate was fine; both the tools and a manual check of the datestamp on the actual .pem file showed that it had been updating just fine, with the most recent update happening in January and extending the certificate validation until April. WTF?

As it turns out, the problem is that Dovecot—which handles IMAP duties on the server—doesn’t notice when the certificate has been updated on disk; it will cheerfully keep using an in-memory cached copy of whatever certificate was present when the service started until time immemorial.

The way to detect this was to use openssl on the command line to connect directly to the IMAPS port:

you@anybox:~$ openssl s_client -showcerts -connect mail.example.com:993 -servername example.com

Scrolling through the connect data produced this gem:

---
Server certificate
subject=CN = mail.example.com

issuer=C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA
Server Temp Key: ECDH, P-384, 384 bits
---
SSL handshake has read 3270 bytes and written 478 bytes
Verification error: certificate has expired

So obviously, the Dovecot service hadn’t reloaded the certificate after Certbot-auto renewed it. One /etc/init.d/dovecot restart later, running the same command instead produced (among all the other verbiage):

---
Server certificate
subject=CN = mail.example.com

issuer=C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA
Server Temp Key: ECDH, P-384, 384 bits
---
SSL handshake has read 3269 bytes and written 478 bytes
Verification: OK
---

With the immediate problem resolved, the next step was to make sure Dovecot gets automatically restarted frequently enough to pick new certs up before they expire. You could get fancy and modify certbot’s cron job to include a Dovecot restart; you can find certbot’s cron job with grep -ir certbot /etc/crontab and add a –deploy-hook argument to restart after new certificates are obtained (and only after new certificates are obtained).

But I don’t really recommend doing it that way; the cron job might get automatically updated with an upgraded version of certbot at some point in the future. Instead, I created a new root cron job to restart Dovecot once every Sunday at midnight:

# m h dom mon dow   command
0 0 * * Sun /etc/init.d/dovecot restart

Since Certbot renews any certificate with 30 days or less until expiration, and the Sunday restart will pick up new certificates within 7 days of their deployment, we should be fine with this simple brute-force approach rather than a more efficient—but also more fragile—approach tying the update directly to restarting Dovecot using the –deploy-hook argument.

 

Importing WireGuard configs on mobile

I learned something new today—you can use an app called qrencode to create plain-ASCII QR codes on Ubuntu. This comes in super handy if you need to set up WireGuard tunnels on an Android phone or tablet, which otherwise tends to be a giant pain in the ass.

If you haven’t already, you’ll need to install qrencode itself; on Ubuntu that’s simply apt install qrencode and you’re ready. After that, just feed a tunnel config into the app, and it’ll display the QR code in the terminal. Your WireGuard mobile app has “from QR code” as an option in the tunnel import section; pick that, allow it to use the camera, and you’re off to the races!

Just like that, your WireGuard tunnel is ready to import into your phone or tablet.

 

 

zfs set sync=disabled

While benchmarking the Ars Technica Hot Rod server build tonight, I decided to empirically demonstrate the effects of zfs set sync=disabled on a dataset.

In technical terms, sync=disabled tells ZFS “when an application requests that you sync() before returning, lie to it.” If you don’t have applications explicitly calling sync(), this doesn’t result in any difference at all. If you do, it tremendously increases write performance… but, remember, it does so by lying to applications that specifically request that a set of data be safely committed to disk before they do anything else. TL;DR: don’t do this unless you’re absolutely sure you don’t give a crap about your applications’ data consistency safeguards!

In the below screenshot, we see ATTO Disk Benchmark run across a gigabit LAN to a Samba share on a RAIDz2 pool of eight Seagate Ironwolf 12TB disks. On the left: write cache is enabled (meaning, no sync() calls). In the center: write cache is disabled (meaning, a sync() call after each block written). On the right: write cache is disabled, but zfs set sync=disabled has been set on the underlying dataset.

L-R: no sync(), sync(), lying in response to sync().

The effect is clear and obvious: zfs set sync=disabled lies to applications that request sync() calls, resulting in the exact same performance as if they’d never called sync() at all.

Continuously updated iostat

Finally, after I don’t know HOW many years, I figured out how to get continuously updated stats from iostat that don’t just scroll up the screen and piss you off.

For those of you who aren’t familiar, iostat gives you some really awesome per-disk reports that you can use to look for problems. Eg, on a system I’m moving a bunch of data around on at the moment:

root@dr0:~# iostat --human -xs
Linux 4.15.0-45-generic (dr0) 06/04/2019 x86_64 (16 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.1% 0.0% 2.6% 5.8% 0.0% 91.5%
Device tps kB/s rqm/s await aqu-sz areq-sz %util
loop0 0.00 0.0k 0.00 0.00 0.00 1.6k 0.0%
sda 214.70 17.6M 0.14 2.25 0.48 83.9k 29.8%
sdb 364.41 38.1M 0.63 9.61 3.50 107.0k 76.5%
sdc 236.49 22.2M 4.13 2.13 0.50 96.3k 20.4%
sdd 237.14 22.2M 4.09 2.14 0.51 95.9k 20.4%
md0 12.09 221.5k 0.00 0.00 0.00 18.3k 0.0%

In particular, note that %util column. That lets me see that /dev/sdb is the bottleneck on my current copy operation. (I expect this, since it’s a single disk reading small blocks and writing large blocks to a two-vdev pool, but if this were one big pool, it would be an indication of problems with sdb.)

But what if I want to see a continuously updated feed? Well, I can do iostat –human -xs 1 and get a new listing every second… but it just scrolls up the screen, too fast to read. Yuck.

OK, how about using the watch command instead? Well, normally, when you call iostat, the first output is a reading that averages the stats for all devices since the first boot. This one won’t change visibly very often unless the system was JUST booted, and almost certainly isn’t what you want. It also frustrates the heck out of any attempt to simply use watch.

The key here is the -y argument, which skips that first report which always gives you the summary of history since last boot, and gets straight to the continuous interval reports – and knowing that you need to specify an interval, and a count for iostat output. If you get all that right, you can finally use watch -n 1 to get a running output of iostat that doesn’t scroll up off the screen and drive you insane trying to follow it:

root@dr0:~# watch -n 1 iostat -xy --human 1 1

Have fun!

Ubuntu 18.04 hung at update-grub 66%

I’ve encountered this two or three times now, and it’s always a slog figuring out how to fix it. When doing a fresh install of Ubuntu 18.04 to a new system, it hangs forever (never times out, no matter how long you wait) at 66% running update-grub.

The problem is a bug in os-prober. The fix is to ctrl-alt-F2 into a new BusyBox shell, ps and grep for the offending process, and kill it:

BusyBox v1.27.2 (Ubuntu 1:1.27.2-2ubuntu3.1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

# ps wwaux | grep dmsetup | grep -v grep
6114   root   29466 S    dmsetup create -r osprober-linux-sdc9

# kill 6114

Now ctrl-alt-F1 back into your installer session. After a moment, it’ll kick back into high gear and finish your Ubuntu 18.04 installation… but you’re unfortunately not done yet; killing os-prober got the install to complete, but it didn’t get GRUB to actually install onto your disks.

You can get a shell and chroot into your new install environment right now, but if you’re not intimately familiar with that process, it may be easier to just reboot using the same Ubuntu install media, but this time select “Rescue broken system”. Once you’ve made your way through selecting your keyboard layout and given your system a bogus name (it only persists for this rescue environment; it doesn’t change on-disk configuration) you’ll be asked to pick an environment to boot into, with a list of disks and partitions.

If you installed root to a simple partition, pick that partition. If, like me, you installed to an mdraid array, you should see that array listed as “md127”, which is Ubuntu’s default name for an array it knows is there but otherwise doesn’t know much about. Choose that, and you’ll get a shell with everything already conveniently mounted and chrooted for you.

(If you didn’t have the option to get into the environment the simple way, you can still do it from a standard installer environment: find your root partition or array, mount it to /mnt like mount /dev/md127 /mnt ; then chroot into it like chroot /mnt and you’ll be caught up and ready to proceed.)

The last part is easy. First, we need to get the buggy os-prober module out of the execution path.

root@ubuntu:~# cd /etc/grub.d
root@ubuntu:~/etc/grub.d# mkdir nerfed
root@ubuntu:~/etc/grub.d# mv 30_os-prober/nerfed

OK, that got rid of our problem module that locked up on us during the install. Now we’re ready to run update-grub and grub-install. I’m assuming here that you have two disks which should be bootable, /dev/sda and /dev/sdb; if that doesn’t match your situation, adjust accordingly. (If you’re using an mdraid array, mdadm –detail /dev/md127 to tell you for sure which disks to make bootable.)

root@ubuntu:~# update-grub
root@ubuntu:~# grub-install /dev/sda
root@ubuntu:~# grub install /dev/sdb

That’s it; now you can shutdown the system, pull the USB installer, and boot from the actual disks!

I’m stuck at update-grub, but it times out and errors!

If your update-grub process hangs for quite a while (couple full minutes?) at 50% but then falls to an angry error screen with a red background, you’ve got a different problem. If you’re trying to install with an mdraid root directory on a disk 4TiB or larger, you need to do a UEFI-style install – which requires EFI boot partitions available on each of your bootable disks.

You’re going to need to start the install process over again; this time when you partition your disks, make sure to create a small partition of type “EFI System Partition”. This is not the same partition you’ll use for your actual root; it’s also not the same thing as /boot – it’s a special snowflake all to itself, and it’s mandatory for systems booting from a drive or drives 4 TiB or larger. (You can still boot in BIOS mode, with no boot partition, from 2 TiB or smaller drives. Not sure about 3 TiB drives; I’ve never owned one IIRC.)

Installing WordPress on Apache the modern way

It’s been bugging me for a while that there are no correct guides to be found about using modern Apache 2.4 or above with the Event or Worker MPMs. We’re going to go ahead and correct that lapse today, by walking through a brand-new WordPress install on a new Ubuntu 18.04 VM (grab one for $5/mo at Linode, Digital Ocean, or your favorite host).

Installing system packages

Once you’ve set up the VM itself, you’ll first need to update the package list:

root@VM:~# apt update

Once it’s updated, you’ll need to install Apache itself, along with PHP and the various extras needed for a WordPress installation.

root@VM:~# apt install apache2 mysql-server php-fpm php-common php-mbstring php-xmlrpc php-soap php-gd php-xml php-intl php-mysql php-cli php-ldap php-zip php-curl

The key bits here are Apache2, your HTTP server; MySQL, your database server; and php-fpm, which is a pool of PHP worker processes your HTTP server can connect to in order to build WordPress dynamic content as necessary.

What you absolutely, positively do not want to do here is install mod_php. If you do that, your nice modern Apache2 with its nice modern Event process model gets immediately switched back to your granddaddy’s late-90s-style prefork, loading PHP processors into every single child process, and preventing your site from scaling if you get any significant traffic!

Enable the proxy_fcgi module

Instead – and this is the bit none of the guides I’ve found mention – you just need to enable one module in Apache itself, and enable the already-installed PHP configuration module. (You will need to figure out which version of php-fpm is installed: dpkg –get-selections | grep fpm can help here if you aren’t sure.)

root@VM:~# a2enmod proxy_fcgi
root@VM:~# a2enconf php7.4-fpm.conf
root@VM:~# systemctl restart apache2

Your Apache2 server is now ready to serve PHP applications, like WordPress. (Note for more advanced admins: if you’re tuning for larger scale, don’t forget that it’s not only about the web server connections anymore; you also want to keep an eye on how many PHP worker processes you have in your pool. You’ll do that in /etc/php/[version]/fpm/pool.d/www.conf.)

Download and extract WordPress

We’re going to keep things super simple in this guide, and just serve WordPress from the existing default vhost in its standard location, at /var/www/html.

root@VM:~# cd /var/www
root@VM:/var/www# wget https://wordpress.org/latest.tar.gz
root@VM:/var/www# tar zxvf latest.tar.gz
root@VM:/var/www# chown -R www-data.www-data wordpress
root@VM:/var/www# mv html html.dist
root@VM:/var/www# mv wordpress html

Create a database for WordPress

The last step before you can browse to your new WordPress installation is creating the database itself.

root@VM:/var/www# mysql -u root

mysql> create database wordpress;
Query OK, 1 row affected (0.01 sec)

mysql> grant all on wordpress.* to 'wordpress'@'localhost' identified by 'superduperpassword';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> exit;

This created a database named wordpress, with a user named wordpress, and a password superduperpassword. That’s a bad password. Don’t actually use that password. (Also, if mysql -u root wanted a password, and you don’t have it – cat /etc/mysql/debian.cnf, look for the debian-sys-maint password, and connect to mysql using mysql -u debian-sys-maint instead. Everything else will work fine.)

note for ubuntu 20.04 / mysql 8.0 users:

MySQL changed things a bit with 8.0. grant all on db.* to ‘user’@’localhost’ identified by ‘password’; no longer works all in one step. Instead, you’ll need first to create user ‘user’@’localhost’ identified by ‘password’; then you can grant all on db.* to ‘user’@’localhost’; —you no longer need to (or can) specify password on the actual grant line itself.

All done – browser time!

Now that you’ve set up Apache, dropped the WordPress installer in its default directory, and created a mysql database – you’re ready to run through the WordPress setup itself, by browsing directly to http://your.servers.ip.address/. Have fun!