Troubleshooting ZFS – Common Issues and How to Fix Them (klarasystems.com)
68 points by zdw 4 days ago | 15 comments
aborsy 20 hours ago [-]
I think zfs send -w should be the default. You'd enter a flag if you want to send plaintext.

Also, something like sanoid needs to be built into ZFS. Find all the newer snapshots at the source with zfs list and send from the first to the last.
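
A minimal sketch of that loop (tank/data, bank/data, and the @first/@last snapshot names are placeholders), with -w for a raw send:

    # oldest-to-newest snapshot listing at the source
    zfs list -H -t snapshot -o name -s creation -d 1 tank/data
    # raw (-w) incremental stream covering every snapshot from @first through @last
    zfs send -w -I tank/data@first tank/data@last | ssh backup zfs recv -u bank/data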

Nice website btw. Does anyone know what tools were used to build it?

totetsu 21 hours ago [-]
What about: you mounted your system disk and an external drive on another system, and now that you've put it back in your laptop it won't boot because some pool property got changed.
anotherhue 21 hours ago [-]
You might have changed the hostid, and it refuses to mount it as a safety precaution. IIRC it's a quick fix.

https://wiki.lustre.org/Protecting_File_System_Volumes_from_...

E39M5S62 16 hours ago [-]
Use ZFSBootMenu; it automatically corrects hostid issues each boot by passing spl.spl_hostid to the kernel, matching the value set on the pool itself.
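
Without ZFSBootMenu, the manual fix is roughly this (tank is a placeholder pool name), from a rescue shell:

    zgenhostid -f     # write a stable hostid to /etc/hostid
    zpool import -f tank   # force past the "last accessed by another system" check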
webdevver 21 hours ago [-]
damn i thought it was klarna, the fabled burrito borrowing service. "wow they're self-hosting zfs? fascinating..."
3np 21 hours ago [-]
Remote send/recv can be frustrating due to the need for root.

It can be helpful to remember that, while less efficient, for scenarios where ssh as root is a no-go you can ship snapshot syncs (including incremental ones) as files:

    zfs send [...] tank@snap > tank.zfssnap
    zfs recv [...] bank < tank.zfssnap
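
For the incremental case, a sketch (snapshot names are placeholders; bank must already have @snap1):

    # only the delta between @snap1 and @snap2 lands in the file
    zfs send -i tank@snap1 tank@snap2 > tank-incr.zfssnap
    zfs recv bank < tank-incr.zfssnap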
xoa 20 hours ago [-]
>It can be helpful to remember that, while less efficient, for scenarios where ssh as root is a no-go you can ship snapshot syncs (including incremental ones) as files:

This capability can also be extremely helpful for bootstrapping a big pool sync from a site with a mediocre WAN (very common in many places, particularly for upload vs download). Plenty of individuals and orgs have accumulated quite a sizable amount of data by this point but aren't generating new data at a prodigious clip. So if you can get the initial replication done, ongoing syncing from there can happen over a fairly narrow pipe.

Latency might not be quite the best, but sometimes the greatest bandwidth to be had is a big fat drive or set of them in the trunk of a car :)

blahlabs 19 hours ago [-]
"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." - Andrew S. Tanenbaum

Computer Networks, 3rd ed., p. 83. (paraphrasing Dr. Warren Jackson, Director, University of Toronto Computing Services (UTCS) circa 1985)

toast0 16 hours ago [-]
> Remote send/recv can be frustrating due to the need for root.

You can delegate with zfs allow. Relevant permissions are send, receive, maybe create?
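
A sketch of that delegation (user and dataset names are placeholders; the exact permission set depends on your layout):

    # source side: let backupuser snapshot and send
    zfs allow backupuser send,snapshot tank/data
    # target side: let backupuser receive into the backup tree
    zfs allow backupuser receive,create,mount bank/backups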

tinco 19 hours ago [-]
What if you need to delete 200TB of data, and you forgot to put your metadata on an SSD?
GauntletWizard 19 hours ago [-]
You can add a metadata device to the pool at any time - it's probably faster to add that vdev and wait for it to populate than to delete all that data with metadata on spinning rust.
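
Adding one looks like this (device paths are placeholders; note that only newly written metadata lands on the special vdev):

    # mirrored special vdev for metadata -- mirror it, since losing it loses the pool
    zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1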
tinco 18 hours ago [-]
It was a while ago. I think I needed to move the data one directory at a time to get the metadata onto the SSDs.

Since that would still have taken months, I think I simply waited for the next upgrade, moved all the data to a new pool, and deleted the old one.

curt15 22 hours ago [-]
Does ZFS suffer the equivalent of the dreaded btrfs ENOSPC?
nelox 21 hours ago [-]
ZFS can hit “No space left on device” (ENOSPC) errors if the pool fills up. But unlike btrfs’s infamous ENOSPC woes, ZFS was designed to handle these situations much more gracefully. ZFS actually keeps a bit of “slop space” reserved. So, as you approach full, it stops writes early and gives you a chance to clean things up, instead of running into unpredictable issues or impossible snapshot removals like btrfs sometimes does. You can even tweak how much safety space ZFS reserves, though most users don’t need to touch it.

[https://news.ycombinator.com/item?id=11696657]
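
The reserve is governed by the spa_slop_shift module parameter (a sketch for OpenZFS on Linux; the values are examples):

    # slop = pool_size / 2^spa_slop_shift; the default of 5 reserves 1/32nd
    cat /sys/module/zfs/parameters/spa_slop_shift
    # raising the shift shrinks the reserve, e.g. to 1/64th
    echo 6 > /sys/module/zfs/parameters/spa_slop_shift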

When you run out of space in ZFS, you get a clear error for write attempts, but the system doesn’t end up fragmented beyond repair or force you into tricky multi-step recovery processes. Freeing up space (by deleting files or snapshots, or expanding the pool) typically makes things happy again.

[https://bobcares.com/blog/zfs-no-space-left-on-device-how-to...]
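
When the pool is truly full and even rm can fail (copy-on-write deletes need a little space), reclaiming snapshot space is the usual way out (snapshot name is a placeholder):

    # find the hungriest snapshots, then drop one
    zfs list -t snapshot -o name,used -s used
    zfs destroy tank@oldest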

namibj 21 hours ago [-]
I thought Btrfs included a fix for that by reserving about half a gig up front?