520 byte sectors and Ubuntu

I recently bought a server which came with Samsung PM1643 SSDs. Trying to install Ubuntu on them didn’t work at first try, because the drives had 520 byte sectors instead of 512 byte.

Luckily, there’s a fix–get the drive(s) to a WORKING Ubuntu system, plug them in, and use the sg_format utility to convert the sector size!

root@ubuntu:~# sg_format -v --format --size=512 /dev/sdwhatever

Yep, it’s really that easy. Be warned, this is a destructive, touch-every-sector operation–so it will take a while, and your drives might get a bit warm. The 3.84TB drives I needed to convert took around 10 minutes apiece.

On the plus side, this also fixes any drive slowdowns due to a lack of TRIM, since it’s a destructive sector-level format.

I’ve heard stories of drives that refused to sg_format initially; if you encounter stubborn drives, you might be able to unlock them by dding a gigabyte or so to them–or you might need to first sg_format them with --size=520, then immediately and I mean immediately again with --size=512.

ZFS write allocation in 0.7.x

In an earlier post, I demonstrated why you shouldn’t mix rust and SSDs – reads on your pool bind at the speed of the slowest vdev; effectively making SSDs in a pool containing rust little more than extremely small, expensive rust disks themselves.  That post was a follow-up to an even earlier post demonstrating that – as of 0.6.x – ZFS did not allocate writes to the lowest latency vdev.

An update to the Storage Pool Allocator (SPA) has changed the original write behavior; as of 0.7.0 (and Ubuntu Bionic includes 0.7.5) writes really are allocated to the lowest-latency vdev in the pool. To test this, I created a throwaway pool on a system with both rust and SSD devices on board. This isn’t the cleanest test possible – the vdevs are actually sparse files created on, respectively, an SSD mdraid1 and another pool consisting of on rust mirror vdev. It’s good enough for government work, though, so let’s see how small-block random write operations are allocated when you’ve got one rust vdev and one SSD vdev:

root@demo0:/tmp# zpool create -oashift=12 test /tmp/rust.bin /tmp/ssd.bin
root@demo0:/tmp# zfs set compression=off test

root@demo0:/tmp# fio --name=write --ioengine=sync --rw=randwrite \
--bs=16K --size=1G --numjobs=1 --end_fsync=1

[...]

Run status group 0 (all jobs):
WRITE: bw=204MiB/s (214MB/s), 204MiB/s-204MiB/s (214MB/s-214MB/s),
io=1024MiB (1074MB), run=5012-5012msec

root@demo0:/tmp# du -h /tmp/ssd.bin ; du -h /tmp/rust.bin
1.8M /tmp/ssd.bin
237K /tmp/rust.bin

Couldn’t be much clearer – 204 MB/sec is higher throughput than a single rust mirror can manage for 16K random writes, and almost 90% of the write operations were committed to the SSD side. So the SPA updates in 0.7.0 work as intended – when pushed to the limit, ZFS will now allocate far more of its writes to the fastest vdevs available in the pool.

I italicized that for a reason, of course. When you don’t push ZFS hard with synchronous, small-block writes like we did with fio above, it still allocates according to free space available. To demonstrate this, we’ll destroy and recreate our hybrid test pool – and this time, we’ll write a GB or so of random data sequentially and asynchronously, using openssl to rapidly generate pseudo-random data, which we’ll pipe through pv into a file on our pool.

root@demo0:/tmp# zpool create -oashift=12 test /tmp/rust.bin /tmp/ssd.bin root@demo0:/tmp# zfs set compression=off test 

root@demo0:~# openssl enc -aes-256-ctr -pass \
              pass:"$(dd if=/dev/urandom bs=128 \
                    count=1 2>/dev/null | base64)" \
              -nosalt < /dev/zero | pv > /test/randomfile.bin

 1032MiB 0:00:04 [ 370MiB/s] [    <=>                 ] ^C

root@demo0:~# du -h /tmp/*bin
root@demo0:~# du -h /tmp/*bin
571M /tmp/rust.bin
627M /tmp/ssd.bin

Although we wrote our pseudorandom data very rapidly to the pool, in this case we did so sequentially and asynchronously, rather than in small random access blocks and synchronously. And in this case, our writes were committed near-equally to each vdev, despite one being immensely faster than the other.

Please note that this describes the SPA’s behavior when allocating writes at the pool level – it has nothing at all to do with the behavior of individual vdevs which have both rust and SSD member devices. My recent test of half-rust/half-SSD mirror vdevs was also run on Bionic with ZFS 0.7.5, and demonstrated conclusively that even read behavior inside a vdev doesn’t favor lower-latency devices, let alone write behavior.

The new SPA code is great, and it absolutely does improve write performance on IOPS-saturated pools. However, it is not intended to enable the undying dream of mixing rust and SSD storage willy-nilly, and if you try to do so anyway, you’re gonna have a bad time.

I still do not recommend mixing SSDs and rust in the same pool, or in the same vdev.

ZFS does NOT favor lower latency devices. Don’t mix rust disks and SSDs!

In an earlier post, I addressed the never-ending urban legend that ZFS writes data to the lowest-latency vdev. Now the urban legend that never dies has reared its head again; this time with someone claiming that ZFS will issue read operations to the lowest-latency disk in a given mirror vdev.

TL;DR – this, too, is a myth. If you need or want an empirical demonstration, read on.

I’ve got an Ubuntu Bionic machine handy with both rust and SSD available; /tmp is an ext4 filesystem on an mdraid1 SSD mirror and /rust is an ext4 filesystem on a single WD 4TB black disk. Let’s play.

root@box:~# truncate -s 4G /tmp/ssd.bin
root@box:~# truncate -s 4G /rust/rust.bin
root@box:~# mkdir /tmp/disks
root@box:~# ln -s /tmp/ssd.bin /tmp/disks/ssd.bin ; ln -s /rust/rust.bin /tmp/disks/rust.bin
root@box:~# zpool create -oashift=12 test /tmp/disks/rust.bin
root@box:~# zfs set compression=off test

Now we’ve got a pool that is rust only… but we’ve got an ssd vdev off to the side, ready to attach. Let’s run an fio test on our rust-only pool first. Note: since this is read testing, we’re going to throw away our first result set; they’ll largely be served from ARC and that’s not what we’re trying to do here.

root@box:~# cd /test
root@box:/test# fio --name=read --ioengine=sync  --rw=randread --bs=16K --size=1G --numjobs=1 --end_fsync=1

OK, cool. Now that fio has generated its dataset, we’ll clear all caches by exporting the pool, then clearing the kernel page cache, then importing the pool again.

root@box:/test# cd ~
root@box:~# zpool export test
root@box:~# echo 3 > /proc/sys/vm/drop_caches
root@box:~# zpool import -d /tmp/disks test
root@box:~# cd /test

Now we can get our first real, uncached read from our rust-only pool. It’s not terribly pretty; this is going to take 5 minutes or so.

root@box:/test# fio --name=read --ioengine=sync  --rw=randread --bs=16K --size=1G --numjobs=1 --end_fsync=1
[ ... ]
Run status group 0 (all jobs):
  READ: bw=17.6MiB/s (18.5MB/s), 17.6MiB/s-17.6MiB/s (18.5MB/s-18.5MB/s), io=1024MiB (1074MB), run=58029-58029msec

Alright. Now let’s attach our ssd and make this a mirror vdev, with one rust and one SSD disk.

root@box:/test# zpool attach test /tmp/disks/rust.bin /tmp/disks/ssd.bin
root@box:/test# zpool status test
  pool: test
 state: ONLINE
  scan: resilvered 1.00G in 0h0m with 0 errors on Sat Jul 14 14:34:07 2018
config:

    NAME                     STATE     READ WRITE CKSUM
    test                     ONLINE       0     0     0
      mirror-0               ONLINE       0     0     0
        /tmp/disks/rust.bin  ONLINE       0     0     0
        /tmp/disks/ssd.bin   ONLINE       0     0     0

errors: No known data errors

Cool. Now that we have one rust and one SSD device in a mirror vdev, let’s export the pool, drop all the kernel page cache, and reimport the pool again.

root@box:/test# cd ~
root@box:~# zpool export test
root@box:~# echo 3 > /proc/sys/vm/drop_caches
root@box:~# zpool import -d /tmp/disks test
root@box:~# cd /test

Gravy. Now, do we see massively improved throughput when we run the same fio test? If ZFS favors the SSD, we should see enormously improved results. If ZFS does not favor the SSD, we’ll not-quite-doubled results.

root@box:/test# fio --name=read --ioengine=sync  --rw=randread --bs=16K --size=1G --numjobs=1 --end_fsync=1
[...]
Run status group 0 (all jobs):
   READ: bw=31.1MiB/s (32.6MB/s), 31.1MiB/s-31.1MiB/s (32.6MB/s-32.6MB/s), io=1024MiB (1074MB), run=32977-32977msec

Welp. There you have it. Not-quite-doubled throughput, matching half – but only half – of the read ops coming from the SSD. To confirm, we’ll do this one more time; but this time we’ll detach the rust disk and run fio with nothing in the pool but the SSD.

root@box:/test# cd ~
root@box:~# zpool detach test /tmp/disks/rust.bin
root@box:~# zpool export test
root@box:~# zpool import -d /tmp/disks test
root@box:~# cd /test

Moment of truth… this time, fio runs on pure solid state:

root@box:/test# fio --name=read --ioengine=sync  --rw=randread --bs=16K --size=1G --numjobs=1 --end_fsync=1
[...]
Run status group 0 (all jobs):
  READ: bw=153MiB/s (160MB/s), 153MiB/s-153MiB/s (160MB/s-160MB/s), io=1024MiB (1074MB), run=6710-6710msec

Welp, there you have it.

Rust only: reads 18.5 MB/sec
SSD only: reads 160 MB/sec
Rust + SSD: reads 32.6 MB/sec

No, ZFS does not read from the lowest-latency disk in a mirror vdev.

Please don’t perpetuate the myth that ZFS favors lower latency devices.

Review: ASUS “Transformer” Tablet

ASUS Transformer (tablet only)
Tablet only - 1.6 lbs

ASUS has entered the Android tablet market with a compelling new contender – the Eee TF-101 “Transformer.”  Featuring an Nvidia Tegra dual-core CPU at 1.0GHz, the device feels “snappier” than most relatively high-end desktop PCs – nothing lags; when you open an app, it pops onto the screen smartly.  If you’re accustomed to browsing on smartphones, you’ll feel the performance difference immediately – even notoriously heavy pages like CNN or ESPN render as quickly as they would if you were using a high-end desktop computer.

I first got my hands on a TF101 that one of my clients had purchased, sans docking station.  After I’d played with it for a few minutes, I knew I wanted one, but that left the $150 question – what will it be like when it’s docked?  The answer is “there’s a lot of potential here” – but there are problems to be worked through before you can give it an unqualified “hey, awesome!

CNET complains that the TF101 feels cheap, with poorly-rounded corners and a flimsy backplate.  After a week or so of ownership and something like 20 hours of active use, I do not agree on either issue.  I find the tablet nicely balanced, easy to grip, and solid feeling.  It weighs in at 1.6 pounds (tablet only) and 2.9 lbs (tablet and docking station), which puts it pretty much dead center in standard weight for both tablets and netbooks.  However, while the weight of the docked TF101 is an ounce heavier than the weight of my Dell Mini 10v, the TF101 feels much less cumbersome – probably because even though it’s slightly heavier, it’s much, much slimmer.

Docking the tablet feels easy and intuitive; line up the edges of the tablet with the edges of the docking station, and you’re in the right position for the sockets to mate. Pressing down first gently, then firmly produces good tactile feedback for whether it’s lined up properly, and whether it’s “clicked” all the way in. The hinge itself is very solid and doesn’t feel “loose” or sloppy at all – in fact, it’s stiff enough that most people would have trouble moving it at all without the tablet already inserted. Undocking the tablet is easy; there’s a release toggle that slides to the left (marked with an arrow POINTING to the left, which is a nice touch); the release toggle also has a solid, not-too-sloppy but not-too-stiff action.

The docking station offers more than just the keyboard.  There are also two USB ports (a convenience which is missing on the tablet itself), a full-size SD card slot, and an internal battery pack, roughly comparable to the battery in the tablet itself.  The extra battery life is a great feature; the tablet itself gets 9 hours or so of fully active use, and the docking station roughly doubles that.  In practical terms, most people will be able to go away for a long weekend with a fully-charged TF101 sans charger, use it for 4 hours a day without ever bothering to turn it off, and come home with a significant fraction of the battery left – especially if they’ve taken the time to set the “disconnect from wireless when screen is off” option in the Power settings. I used the docked tablet 2 to 4 hours a day for a full seven-day week, playing games, emailing, and browsing; at the end of the week I was at 15% charge remaining.

ASUS Transformer - with docking station
With docking station - 2.9 lbs

Polaris Office, the office suite shipped with the TF101, was a pleasant surprise – a client asked if I could display PowerPoint presentations on the tablet, and the answer turned out to be “yes, I certainly can.” I’ve only tried a few of them, none of which had any particularly fancy animations; but 40MB slideshows load and display just fine. Paired with an HDMI projector, the TF101 should make a pretty solid little presentation device, particularly since it feels just as “fast” running slideshows as it does browsing and playing games from the Market.

Moving on to the docking station itself, I quite liked the way the touchpad is integrated into the Android OS – instead of an arrow cursor, you get a translucent “bubble” roughly the size of a fingertip press, which felt much more intuitive to me. With the arrow, I tend to try to be just as precise as I would with a mouse – which can be frustrating. The “fingertip bubble” made it easier for me to relax and just “get what you want inside the circle” without trying to be overly finicky. Sensitivity for both tracking and tapping was also very good; the touchpad feels slick and responsive to use.

The keyboard, unfortunately, is a mixed bag – it’s better-suited to large hands than many netbook keyboards, but you won’t ever mistake it for the full-size keyboard on your desk.  The dimensions are almost exactly the same as the keyboard on my Dell Mini 10v; but I find that it feels significantly more cramped and awkward – probably because ASUS elected to go the trendy new route of “raised keys with space between them”, where the Mini’s keys are literally edge-to-edge with one another.  This should make the TF101 less likely to collect crumbs, skin flakes, and other kinds of “yuck” than the Mini 10v, but I personally would rather deal with more cleaning than less roominess.

Several applications don’t really play well with the keyboard – ConnectBot, which I use as an SSH client to operate remote servers, becomes completely unusable due to handling the shift key wrong – you can’t type anything from ! through + without resorting to re-enabling the onscreen keyboard.  In the Android Browser, typing URLs in the address bar works fine, but if you do any significant amount of typing in a form – for example, writing this post in the TinyMCE control WordPress uses – the up and down arrow keys frequently map to the wrong thing.  Sometimes up/down arrow would scroll through the text I was typing, sometimes they would tab me to different controls on the page, and some OTHER times they would simply scroll the entire page up and down.

In Polaris Office (the office suite shipped with the TF101), the keyboard itself worked perfectly – but the touchpad was too sensitive and placed too closely. It was difficult to type more than one sentence at a time without the heel of my hand brushing the touchpad and registering as a “tap”, causing the last half of a sentence to appear in the midst of the sentence before it.

Can these problems be mitigated?  Probably.  The ConnectBot issue was solved pretty simply by Googling “connectbot transformer”, which immediately leads you to a Transformer-specific fork of ConnectBot – after uninstalling the original ConnectBot, temporarily enabling off-Market app installation, and downloading and installing the fork directly from GitHub, my shift-key problems there were solved.  Presumably either Google or ASUS will eventually deal with the arrow key behavior in the Android Browser.  I tried using the Dolphin HD Browser in the meantime, but had no better luck with it – it is at least consistent in how it handles arrow key usage, but unfortunately it’s consistently wrong – it always scrolls the entire page up and down when you press up or down arrow keys, no matter where the focus on the page is.   Finally, you can toggle the touchpad completely on or off by using a function button at the top of the keyboard – but it would be nice to simply change the sensitivity instead, or automatically disable it for half a second or so after keypresses, the way you can on a traditional (non-Android) netbook.

In the end, though, you can’t really fix all the problems by yourself with “tweaking”; some of the frustrations with the poor integration of the physical keyboard into the Android environment are going to keep ambushing you until Google itself addresses them. ASUS and individual app developers can and likely will continue working to mitigate these issues, but it will be a never-ending game of whack-a-mole until Android itself takes adapting to the “netbook” environment more seriously.

Final verdict: The tablet looks, feels, and performs incredibly well; in most cases it “feels faster” than even high-end desktop computers. Even though my Atom-powered Dell Mini 10v has a Crucial C300 SSD (Solid State Drive), the TF101 spanks it thoroughly in pretty much every performance category possible and sends it home crying. Battery life is also phenomenal, at 9-ish active hours undocked or 18-ish hours docked. It looks and feels, on first blush, like it would make a truly incredible netbook when docked – but Android 3.2 and its apps clearly haven’t come to the party well-prepared for a physical keyboard – and it shows, which knocks the initial blush well off the device as a netbook competitor. If you really need physical keyboard and conventional data entry, this is probably not going to be the device for you – at least, not until the rest of the OS and its apps evolve to support it better.

If you want a tablet, I can recommend the TF-101 without reservation.  If you want a netbook, though, you should probably give the TF-101 a pass unless and until Google starts taking the idea of “Android Netbooks” seriously.

More virtualization: multiple Win7 guests on a single Debian host

WordPress › Error

There has been a critical error on this website.

Learn more about troubleshooting WordPress.