This is great news, and we're already using ZFS in production on Ubuntu in a few areas at Netflix (not widespread yet).
Ubuntu 16.04 also comes with enhanced BPF, the new Linux tracing & programming framework that is builtin to the kernel, and is a huge leap forward for Linux tracing. Eg, we can start using tools like these: https://github.com/iovisor/bcc#tracing
It's really two questions: Why choose Ubuntu for the cloud, and, why choose FreeBSD for the CDN. We believe that's the best choice for both environments. I was trying to type in an explanation here, but that's really something that will take a lot to explain (maybe a Netflix tech blog post).
If you do write that blog post, it would be cool if you not only covered the FreeBSD vs. Ubuntu aspects of the choice, but also the Ubuntu vs. other Linux aspects (particularly Debian).
If you browse some of the _example.txt files in https://github.com/iovisor/bcc/tree/master/tools , you'll see it's solving the same problems we used to solve with DTrace, plus a few extra. Here's a couple of the ZFS examples (since we're talking ZFS):
The current BPF interface we're using (bcc) is Python for the frontend, and C for the backend. It's currently much more verbose than DTrace, and involves writing 10x the lines of code. For some immediate use cases at Netflix, that's not a big problem, as staff will be using BPF via a GUI (Vector), not writing these tools directly.
There are also high-level features it's still missing (like tracepoints and sampling), so what will be in Ubuntu 16.04 won't do everything, but it will do a fair amount: most of those _example.txt's. Some use a newer BPF interface (Linux 4.5), and we've been putting the legacy versions in an /old directory specifically for Ubuntu 16.04 users.
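To give a rough idea of what running these looks like (assuming the zfsslower and zfsdist tools from the bcc repo; the threshold and interval here are arbitrary examples):

    # from the tools/ directory of a https://github.com/iovisor/bcc checkout
    $ sudo ./zfsslower 10     # trace ZFS reads/writes/opens/fsyncs slower than 10 ms
    $ sudo ./zfsdist 5        # print latency histograms of ZFS operations every 5 seconds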
Does the current publicly released version of Vector support BPF? Or is there perhaps a PMDA that allows BPF support?
I'm following along with all of this pretty excitedly, and crossing my fingers for a Linux tracing book with BPF, ftrace, perf, etc. to read through and keep on my shelf next to your performance and dtrace books ;)
ZFS is nice but as far as I understand the Linux version does not yet have support for copy-on-write clones using e.g. "cp --reflink=always", which to me was reason enough to choose BTRFS instead. Apart from this the two systems seem to be quite comparable (from my limited user perspective), with BTRFS having quite good Linux support as well. Maybe someone more experienced with the COW functionality could comment on that as it would be very interesting to hear how other people deal with this.
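For anyone who hasn't used it, the workflow is just this (file names made up):

    $ cp --reflink=always vm.qcow2 vm-clone.qcow2   # instant copy; extents are shared until either file is modified
    $ cp --reflink=auto   vm.qcow2 vm-copy.qcow2    # falls back to a normal copy on filesystems without reflink support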
I've used ZFS on both BSD and OpenIndiana, and I've used btrfs on Linux as recently as 4.0.5.
As of 4.0.5 btrfs was IMO completely unusable as a daily file system. Some examples of issues I ran into:
1) System became unbootable with the version of btrfs I had installed and I had to use either an older or newer kernel to recover
2) I have a periodic backup of my mailbox that runs, and when it runs my system becomes completely unusable until it completes. The same script running on zfs on bsd and with ext4 or reiser3 on linux would show I/O slowdowns, but I could still use my machine.
3) In general I would run into other minor issues and the consensus in #btrfs was that since my kernel was more than 3 months old, it was probably fixed in the latest version, and why would somebody using an experimental filesystem not be tracking mainline more closely?
[edit]
To be fair, here's some issues with ZFS:
1) Do not ever accidentally enable dedupe on commodity hardware; it will slowly consume all your RAM unless you're on a Sun-class server (one where 64GB of RAM counts as a resource-constrained environment), and there is no effective way to undo dedupe other than copying all the data onto a different pool.
2) You can't shrink a pool. Hugely annoying, apparently non-trivial to solve.
3) Do not allow a pool to exceed 90% capacity ... and probably don't let it exceed 85%.
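For 3), the usual mitigation is to watch capacity and fragmentation and keep some slack reserved so you can never actually hit the wall; roughly something like this (pool and dataset names made up):

    $ zpool list -o name,size,allocated,free,capacity,fragmentation tank   # CAP/FRAG show how close you are to trouble
    $ zfs create -o refreservation=200G tank/slack                         # emergency reserve you can shrink later to free space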
ZFS does not have a defrag utility and it badly needs one. You can permanently wreck zpool performance by running it up past 90% capacity - even if you later reduce capacity back down to 75-80%. You can sort of fix it if you add additional top level vdevs to the zpool, thus farming out some IO to the new set of disks, but it's still going to be performance constrained forever. The only solution is to create a new pool and export the data to it.
This is unacceptable, by the way.
It is not at all reasonable to require a filesystem to stay below 80% capacity (our target "full" number at rsync.net) nor is it acceptable that hitting 90% is a (performance) death sentence.
When you consider that you might have already sacrificed 25% of your physical disk just for the raidz2/raidz3 overhead, being constrained to 80% means you're only using 60% or so of your physical disks that you bought.
If gang blocks are generated, you get more I/Os than necessary and an extra level of indirection, but the ZFS code base tries very hard to avoid that situation by switching the allocator behavior at the metaslab level to best fit, such that most data written to the pool would not have gang blocks used at all. Gang blocks are understandably terrible for IOPS, and they are likely the source of the performance degradation that you saw. You can probe for the zio_gang_* functions during a scrub to see if any gang blocks exist. On my pool, which has exceeded 90% on multiple occasions, there are zero gang blocks and consequently no permanent degradation from them. The only other problem that you might have (which tends to be caused by the order of writes rather than the fullness of the pool) is lower sequential performance from nonlinear block placement (one ZoL user measured this as cutting sequential reads in half on a pool filled with files made by BitTorrent), but that is a much less severe problem, especially on solid state drives. If you want to fix placement, you can do a file copy or send/recv. The new locations should have blocks picked in sequence whenever possible.
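For example, using the bcc tools mentioned elsewhere in this thread, counting hits on those functions during a scrub would look roughly like this (assuming funccount can attach to the zfs module's symbols on your kernel):

    $ sudo zpool scrub tank
    $ sudo ./funccount 'zio_gang_*'    # Ctrl-C once the scrub finishes; zero counts suggests no gang blocks on the pool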
For ZFS to get defrag, someone with enough ability and knowledge would need to bite the bullet and go through the fairly massive and difficult task of adding in block pointer rewrite, which doesn't seem to be something anyone has been willing to do, and I've seen a lot of concerns about the actual feasibility of it from some very smart people that are knowledgeable with the codebase.
I wonder how much money you'd need to pay a very talented engineer to do the work. This is yet another thing I didn't read about before building a home NAS on ZFS.
I had this happen to me... I accidentally allowed a pool to fill up to capacity and couldn't do anything with it because deleting files wouldn't free space due to snapshots and the commands to delete snapshots wouldn't work.
Then I added a disk to it to try to recover. That worked, but only after adding the disk did I realize that I couldn't shrink the pool down again. I ended up moving the whole thing off to a new disk cluster and back again. Really painful.
The main factor seems to be the best-fit allocator, which tends to go into action sooner than you might expect because individual metaslabs cross the 94% threshold earlier than the pool as a whole and still get selected due to LBA weighting, a trick to increase throughput on rotational media. Disabling LBA weighting ought to help prevent best-fit allocation from occurring earlier than necessary on SSDs.
That said, Delphix made changes to weight metaslab selection by fragmentation rather than space usage when the spacemap histogram is in use, so the performance before best-fit behavior goes into effect is better than with outright selection of metaslabs by free space.
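For illustration, assuming your ZFS on Linux build exposes the metaslab_lba_weighting_enabled module parameter, turning LBA weighting off for an all-SSD pool would look like this:

    # runtime toggle
    $ echo 0 | sudo tee /sys/module/zfs/parameters/metaslab_lba_weighting_enabled
    # or persistently, in /etc/modprobe.d/zfs.conf
    options zfs metaslab_lba_weighting_enabled=0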
I've been using btrfs for two years (on Fedora) and the only problem I've ever run into was "no space left on device", solvable with a rebalance. btrfs has also survived many hard poweroffs.
I'll pitch in with a neutral position. I've been using btrfs for four years, and in that time I've had unrecoverable fs corruption probably three times. This is on Arch, on bleeding edge kernels, where new releases are prone to regressions that break the filesystem.
But there has been a tangible progression from instability towards increasing stability. I haven't had one lick of an issue with btrfs in about a dozen kernel releases. I'm close to saying I'd trust it in a production environment, since I use it everywhere else as a daily driver; I would just use an LTS release to be safe.
It is not all sunshine and roses, though. While Facebook employs several major btrfs developers, a lot of features that have been talked about for years still have not seen the light of day or any development whatsoever: lz4 compression, better checksum algorithms, per-subvolume encryption, online filesystem checking. And the RAID 5/6 support is still kind of garbage a year later. I worry that btrfs is suffering from a lack of interest in making the last legitimate pushes it needs (code audits, integration testing) to become truly trustworthy.
But at the end of the day checksum integrity and COW are basically a game changer for me in terms of data integrity.
I tried to use the mirror functionality when it was new. I tested booting with one disk missing. Errors all the way. I went into IRC and chatted with btrfs folk about the "bug". Their response?
"Booting without all members of your mirror is unsupported."
If you use ZFS you only have snapshots for this. cp --reflink also has some gotchas on btrfs: if you do a balance the sharing is not preserved, and there are some odd problems with snapshots (I have no link at hand, take a look at the mailing list).
ZFS is not comparable to btrfs at the moment. Everything device-related is missing in btrfs: no detection of missing or broken devices at the moment, no hotspare functionality, and btrfs RAID1 uses the pid to decide which disk to read from. RAID5/6 is still experimental and there are some odd behaviours.
Using btrfs for production is a risky bet and may very well bite you. The tooling is terrible at the moment (IMHO) and benchmarks favour ZFS most of the time.
No ZFS version has support for that. It has been discussed and it might be implemented at some point.
That said, I would like to point out that ZFS' dataset level operations are more powerful than reflinks. ZFS' dataset level operations give separate independent snapshot and clone capabilities. They also provide the ability to rollback without killing things on top (which is useful in some cases). You cannot do that with reflinks. I suppose the immutable bit could be used to fix a reflink so that it retains the state at creation, but that is racy. In the case of virtual machines which seems to be a major application of reflinks, zvols are lower overhead and support incremental send/recv.
One benefit of reflinks would be that regular users can use it, but regular users should be able to snapshot, clone and rollback when delegation support is implemented.
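Concretely, the dataset-level operations I'm comparing against reflinks are just these (names made up):

    $ zfs snapshot tank/vm@golden            # immutable point-in-time state
    $ zfs clone tank/vm@golden tank/vm-test  # writable clone that shares blocks with the snapshot
    $ zfs rollback tank/vm@golden            # throw away everything written since the snapshot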
Running hourly, daily, weekly and monthly snapshots is reason enough to choose ZFS on Linux for me. And ironically, it runs better on Linux than it did for me on Solaris - I used to get occasional pauses every few minutes when streaming media. Memory utilization for cache purposes isn't fantastic since it doesn't interact well with the rest of the kernel's logic, but everything else is pretty good.
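The rotation itself can be as simple as a few cron entries that recreate fixed snapshot names (a minimal hand-rolled sketch; tools like zfs-auto-snapshot do the same thing more robustly):

    # /etc/cron.d/zfs-snapshots
    0 * * * * root /sbin/zfs destroy -r tank@hourly-$(date +\%H) 2>/dev/null; /sbin/zfs snapshot -r tank@hourly-$(date +\%H)
    0 0 * * * root /sbin/zfs destroy -r tank@daily-$(date +\%a)  2>/dev/null; /sbin/zfs snapshot -r tank@daily-$(date +\%a)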
I've had the misfortune of using btrfs in production with a few hundred machines on Ubuntu 14.04. It's one of the most finicky filesystems I've ever used. It's probably better in newer kernels, but if you have a lot of churn it requires constant care and feeding and tends to cause kernel softlockups fairly commonly.
At CoreOS we tried really hard to make btrfs happen, but it really came down to how differently it operated from other file systems. It was mainly a UX issue and thus fell into my lap.
The major issue is that regular debugging tools that folks have been using forever like `df -h` aren't just non-functional, they actively misrepresent the state of the file system. The most common example is indicating that you have plenty of free space when in fact you're out. We had to write a lot of documentation to teach people how btrfs works and how to debug it: https://coreos.com/os/docs/latest/btrfs-troubleshooting.html
The second major issue is that rebalancing requires free space, which is the problem that most folks are trying to fix with a rebalance operation. Catch-22 in the worst way. Containers vary in size and can restart frequently, churning through the btrfs chunks without filling them up, leaving around a lot of empty space that needs to be rebalanced.
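The short version of what we ended up teaching people (paths and thresholds are only illustrative):

    $ btrfs filesystem show /             # what each member device has actually allocated
    $ btrfs filesystem df /               # data vs. metadata chunk allocation and usage -- this, not df -h, tells the truth
    $ btrfs balance start -dusage=5 /     # repack only data chunks that are <5% used, which needs the least free space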
I hit that rebalancing needs space (and therefore can ENOSPC) issue at work when trying to compile ZFS on CoreOS on Digital Ocean before CoreOS switched from btrfs to ext4 and overlayfs. Getting ENOSPC on btrfs rebalance when you are seeing regular writes return ENOSPC is a really annoying problem.
> I've had the misfortune of using btrfs in production with a few hundred machines on Ubuntu 14.04
You are not alone. btrfs seems to be kind of stable - as in does not corrupt itself anymore - with 4.2 but it's been a nasty ride.
It's an experimental filesystem that is neither complete nor stable yet. I wish this would be better communicated.
It's needlessly frustrating: if you search for btrfs you come across a few slide decks that tell you it's fine, you can use it... then after the first strange problems you'll subscribe to the mailing list, and every other day there is some post that shines a light on strange behavior and stuff that is not implemented.
If you want checksumming on your single-HDD backup disk, btrfs is fine. For everything else you are in for some surprises... basically everything around volume management and RAID is pretty much experimental and has strange behavior.
Performance is not even a topic. I remember the ML discussion of that OLTP blog post, and the majority of responses were: don't run databases on btrfs, stupid! I'd rather read a technical discussion of the problems, but from reading the ML it seems like it's too complex and few understand the complexity.
@bcantrill called it a shit-show in some podcast, and while that may not technically be true, it sure does look like it.
If you want peace of mind use mdraid+ext4 (or xfs) - ZFS on Linux has a lot of problems for heavy usage but the community is IMHO more invested in making it a good Linux citizen.
On the other hand: This stuff is complicated and everyone expects miracles. I'm just looking at it from sysadmin perspective and on Linux both suck at the moment. But ZFS won't eat your data and has far better tooling.
If you need something that works for high load on Linux I'd use neither.
Nothing. It's the same as Nvidia's non-GPL kernel modules. The simple fact is they will never be accepted upstream, but that matters little to distributors of Ubuntu's scale.
In the case of Nvidia's modules, Nvidia's proprietary licensing disallows distribution of a prebuilt nvidia.ko (as that implies distributing a modified version). Coincidentally, their license terms for the OpenSolaris driver have no such restriction and the OpenSolaris descendants distribute the prebuilt module without potentially violating Nvidia's license terms.
Amazingly, their Linux licensing used to be worse. They used to claim you were only permitted to install the driver on one computer within an organization.
Someone hired better lawyers. All these ridiculous EULA and click through licences and idiotic mandatory registration systems we see, I can't help think many companies would benefit from hiring better lawyers. Get rid of the timid who default to 'no' in order to protect their own arse, hire people who help you get where you want to go.
In this case, I can't even see any real liability issues - even if Canonical did get taken to court there are no damages since the software is free of charge.
People decided to listen to lawyers who read the licenses instead of listening to statements by people claiming to know how things work without actually reading either license or asking a lawyer about it. That is quite literally the only change.
No, I've seen talks by the engineers behind Solaris (I don't recall who at the moment) that strongly indicated the Sun lawyers didn't go out of their way to be incompatible with the GPL -- they just wanted a license that allowed them to split proprietary and open code (as they didn't have the right to open up all of Solaris due to licensing agreements with third parties etc) -- and still being able to distribute both open and traditional closed Solaris. This led to the "per file" license nature of the CDDL -- and unfortunately to the "additional limitation"-bit that makes it incompatible with the GPL.
If it was done again today, they might have gone for the Apache license as I recall -- and avoided some of the unfortunate issues.
Pretty sure Canonical saw an opportunity with container management, plus interest in ZFS, decided to get over the unfortunate licensing issue and support the module.
I check in on btrfs every 6-12 months or so. To date it has always seemed too unreliable compared to ZFS. Lack of decent RAID 5/6 support is another major difference.
I have been using btrfs for a couple of years. The nicest thing I can say about it is that it is both conceptually elegant and has made my backup practises bulletproof.
Granted, it's been a couple of years, but I tried BTRFS when openSUSE made it their default and I had filesystem issues that I've never had on anything else. I'm sure it's progressed a lot since then, but I expect it will be behind ZFS for a long time in terms of stability.
SUSE seems to recommend XFS for production data and btrfs for the rootfs that is a read mostly workload that is unlikely to trigger ENOSPC. It is not what I would consider a great endorsement of btrfs.
ZFS on Linux (i.e. kernel module, not with fuse) is very stable in my experience. I have not heard the same about btrfs. I tried ZFS on fuse once, but the performance was abysmal.
I heard from a former Sun executive that Apple wanted indemnification following NetApp's lawsuit. They had spent a long time negotiating over that before they had something mutually acceptable, and it was supposed to be signed the day Oracle's acquisition of Sun finished. That left it up to Larry Ellison, who refused to sign it, and Apple decided to try its luck improving HFS+.
BTRFS is still not mature, and there's a license incompatibility as well. And it's controlled by Oracle all the same. If Apple replaces the filesystem they'll likely roll their own.
btrfs is not controlled by Oracle for one (its principal developers are employed by Facebook, but it's still regular Linux GPLv2 code), but I did check and the APL is incompatible with the GPL.
And they obviously aren't going to make a new filesystem. That doesn't get them sales like higher resolution screens or changing the color theme... again.
> And they obviously aren't going to make a new filesystem. That doesn't get them sales like higher resolution screens or changing the color theme... again.
Introducing a new filesystem would be a big decision for Apple. There would doubtless be all sorts of migration and compatibility issues, even aside from the work it would take. Especially given where we are in the maturity of desktop clients, it makes a lot more sense to incrementally improve the current filesystem. I'm not sure how snarky you intended to be, but no, there aren't many sales in a complex undertaking that is far more likely to cause data corruption and migration issues than concrete benefits for 99.9% of users.
I'm not sure it's accurate to say it was "controlled" by Oracle but a lot of--though certainly not all--active development came out of Oracle at one point, notably by Chris Mason (who is now with Facebook).
Dragonfly BSD's HAMMER2, when it is even half done (that is, stable for one node), is probably a much better (technical) choice than BTRFS or improving HFS+, and probably a much better legal choice than ZFS.
Sorry in advance if this is a stupid question: my main Linux system is a laptop with a small SSD drive. I would like to organize my entire digital life on a 2 TB external USB drive, and be able to maintain a clone of everything on at least one other 2 TB USB drive.
I read some time ago that ZFS is definitely NOT the right tool for laptop/external storage, unless you actually have a zpool with mirroring/raidz (which means you have to always keep the devices connected).
The reason is that when ZFS detects corruption, it'll lock down the whole fs... and prevent reading/recovering data from it, as recovering data from raidz is the expected solution in that case.
I tried to google again for the description of this issue, but I couldn't find it... I found this otoh:
> Even without redundancy and "zfs set copies", ZFS stores two copies of all metadata far apart on the disk, and three copies of really important zpool wide metadata.
Which means that this might not actually be a problem after all.
> The reason is that when ZFS detects corruption, it'll lock down the whole fs... and prevent reading/recovering data from it, as recovering data from raidz is the expected solution in that case.
ZFS has duplicate metadata by default, so it can recover from corrupted metadata blocks unless too much is gone. If the data blocks are corrupted and there is no redundancy, you should get EIO. There is no code to "lock down the FS", although if you have severe damage (like zeroing all copies of important metadata or losing all members of a mirror), it will die and you will see faulted status. That is a situation no storage stack can survive and is why people should have backups.
> The reason is that when ZFS detects corruption, it'll lock down the whole fs... and prevent reading/recovering data from it
Depends on what exactly is corrupt, but for file corruption it's generally just a case of warnings in logs/zpool status (which will suggest restoring the file from backup), and IO errors trying to access that specific file. The pool itself should remain intact and online.
It's less clear cut if it's important metadata that's damaged, but as you mention, ZFS is quite aggressive about maintaining multiple copies even on standalone devices.
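On a single device you can also raise the data redundancy per dataset and verify it periodically, e.g. (names made up):

    $ zfs set copies=2 tank/important   # two copies of each data block from now on (only affects newly written data)
    $ zpool scrub tank                  # read and verify everything, repairing from the duplicate copies where possible
    $ zpool status -v tank              # lists any files with errors that could not be repaired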
I have backups stored on a double mirror of USB drives. The USB interface is fragile, but it does work. I cannot say that I recommend USB drives, but if you are using USB, ZFS is not at any disadvantage versus other filesystems.
If you're talking about a "static" setup where you attach both at the same time or not at all, yes. ZFS export before unplugging, ZFS import when plugged in, I can see it working very nicely.
If you're talking about using one of them most of the time and syncing occasionally then any filesystem will do, you'll want a user-level tool for doing the sync (probably - sibling did mention zfs send which I don't have any experience with).
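For reference, the attach/detach cycle is just:

    $ zpool export usbpool                      # flush everything and mark the pool safe to unplug
    $ zpool import -d /dev/disk/by-id usbpool   # after reattaching; by-id paths survive sdX renumbering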
Let's see how this works out. It's probably better and more stable than btrfs, but that's not a high bar...
ZFS on Linux had issues with ARC (especially fast reclaim) and some deadlocks and AFAIK cgroups are not really supported - e.g. blkio throttling does not work.
Would be great if they got this ironed out, but I would be wary. Still great news!
Additional problem is that in-kernel latencies of both btrfs and ZFS are on the high end. Essentially a show stopper for professional audio work and maybe some kinds of video streaming.
Trying to completely escape disk IO in those uses is very limiting.
A comparable solution using LVM and/or mdraid with ext4 on top has much better latency behavior.
Sorry for no benches for you, but feel free to run a quick check using latencytop and ftrace. Phoronix has some performance comparisons if you want them.
> Essentially a show stopper for professional audio work (...). Trying to completely escape disk IO in those uses is very limiting.
Could you expand on that?
I mean, an hour of mono uncompressed 192 kHz/24bit audio is almost exactly 2 GB. Compared to professional audio equipment, 128 GB of RAM isn't very expensive ( < $2000), and that would let you keep 64 one-hour maximum-def tracks in memory. Why do you need to read from the disk with any frequency?
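Checking that arithmetic:

    $ echo $((192000 * 3 * 3600))   # samples/s * bytes/sample * seconds
    2073600000                      # ~1.93 GiB per mono track-hour, i.e. "almost exactly 2 GB"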
This is great news. Among other incentives, ZFS has some truly excellent features for improving reliability. ZFS's built-in checksums, for example, can result in much happier stories during the onset of disk failures: where a RAID array can quietly return incorrect sector contents without noticing, and be unable to correctly differentiate between the correct and not-so-correct sectors in the event of disk loss followed by disagreements discovered during rebuilds, ZFS simply does the right thing by making checks during normal operations, and uses the same checks to confidently do the right thing during recovery. And snapshotting. Oh, snapshotting.
On the other hand, I've always wished we could get a modern re-take on ZFS. As anyone who's tried it will tell you: dedup in ZFS essentially doesn't work. ZFS, internally, is not built on content-addressable storage (or, it is, but since splitting of large files into blocks doesn't take any special actions to make similar blocks align perfectly, it doesn't have anywhere near the punch that it should). As a result, dedup operations that should be constant-time and zero memory overhead... aren't. Amazing though ZFS is, we've learned a lot about designing distributed and CAS storage since that groundwork was laid in ZFS. A new system that gets this right at heart would be monumental.
Transporting snapshots (e.g. to other systems for backups... or to "resume" them (think pairing with CRIU containers)) could similarly be so much more powerful if only ZFS (or subsequent systems) could get content-addressing right on the same level that e.g. git does. `zfs send` can transport snapshots across the network to other storage pools -- amazing, right? It even has an incremental mode -- magic! In theory, this should be just like `git push` and `git fetch`: I should even be able to have, say, n=3 machines, and have them all push snapshots of their filesystems to each other, and it should all dedup, right? And yet... as far as I can tell [1], the entire system is a footgun. Many operations break the ability to receive incremental updates; if you thought you could make things topology agnostic... Mmm, may the force be with you.
[1] https://gist.github.com/heavenlyhash/109b0b18df65579b498b -- These were my research notes on what kind of snapshot operations work, how they transport, etc. If you try to build anything using zfs send/recv, you may find these useful... and if anyone can find a shuffle of these commands with better outcomes I'd love to hear about it of course :)
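For anyone who hasn't played with it, the basic full-plus-incremental flow I was testing looks like this (host and dataset names made up):

    $ zfs snapshot tank/data@monday
    $ zfs send tank/data@monday | ssh backuphost zfs recv -F pool/data            # initial full copy
    $ zfs snapshot tank/data@tuesday
    $ zfs send -i @monday tank/data@tuesday | ssh backuphost zfs recv pool/data   # incremental; needs @monday intact on both sides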
The deduplication code works, but each deduplication operation requires 3 serial IOs to look up the information needed to check whether deduplication is possible, and if those lookups are not already cached, that becomes painful fast on storage with low IOPS. On my workstation, where I have enough memory that the results of all of the lookups naturally fit in cache, plus high-IOPS storage, the deduplication code runs well. You would have a similar problem designing a system that perfectly deduplicates data at the record level if you tried.
I was thinking about this. To reduce both the huge RAM usage and the serial IOs, you could use something similar to a Bloom filter to quickly test whether you should attempt to dedup a new block. If the filter says it's not a duplicate, then completely skip the standard (slow) dedup path.
Bloom filters specifically have issues: they don't permit removing entries for one, and they're not really that efficient. But there's a paper about Cuckoo Filters which seems to solve both of these problems. For example:
The "semi-sort" variant of the cuckoo filter benchmarked in the paper has a size of 192 MB and holds 128M items.
So for 8kb blocks, it can dedup 1TB of blocks. More if you increase the block size or the size of the table.
It has a 0.09% false-positive rate (!). I.e. unique blocks would use the slow path to test for duplication in vain only once in 1111 writes.
The algorithm can perform 6 million lookups per second on the benchmark hardware. (2x Xeons at 2.27GHz, 12MB L3, 32 GB DRAM)
This is assuming that the majority of writes are actually unique, and dedup is more of a "it would be nice" thing than essential. But for that case something like this would be a lot easier to implement and use far fewer resources. Just stick it in front of the existing dedup lookup and early-exit if the filter says it's not duplicate.
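A rough back-of-the-envelope check on those numbers:

    $ echo $((192 * 8 / 128))   # MB of filter * bits per byte / millions of items
    12                          # ~12 bits of filter per tracked block
    $ echo $((128 * 8))         # 128M blocks * 8 KB per block
    1024                        # ~1 TB of unique data covered, matching the figure above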
Note that ZFS isn't magic. Even with ZFS's checksums on read, you should still be doing regular scrubs, just like you should be with LVM or btrfs. And once you have regular scrubs, checksums on read don't really add much.
Agreed, turning on more safety features will always make you... safer :)
But it's worth noting that I've debugged corruptions in prod systems where:
- corrupted data was read from disk -- a bit flip, with no error code at the time -- by an application
- the application operated on it
- and the application then wrote the result -- still carrying the bit flip -- back to a new file on disk.
Ouch. The bitflip is now baked in and even looks like a legit block as far as the disk is concerned. The disk failed not long after, of course -- SMART status caught up, etc. But that was days later.
Checksums on read address this. I never want to run a system without them again.
I don't understand. If the bit was flipped before or during the read, the scrub would catch it. If it was flipped by the application then no file system can help. How do read checksums help you?
There is still the chance that data get corrupted between the time the scrub is performed and the time you read the data, so I don't consider scrubs sufficient.
In any case you are right, they should be performed even with ZFS, especially to test data that is rarely or never read back.
Sure, there can be one error between scrub and read. But assuming RAID, you need errors on two disks. That can happen in a week or however often you scrub, but that's going to be pretty low probability.
You assume that your RAID implementation is going to actually read all of its parity bits from each of the disks, and check them for agreement, before returning a value to you.
And what about machines without ECC RAM? I thought this is the idea for using ZFS in the first place.
Or is the ECC "requirement" only important for raidz?
The whole "ECC is required by ZFS" thing is a bit of a misunderstanding.
ZFS guarantees that your data will be safe on disk, but it has no power to help you if your data gets corrupted in memory.
ECC is the last piece needed to guarantee data safety.
So even if you don't have ECC, your data is still safer with ZFS than with traditional filesystems; ECC just increases the safety further.
How does corrupted memory affect ZFS's behavior? Much of the replication state is stored in memory; is it possible you could lose data from a single bit being flipped?
Could you be more specific? You seem to just be linking to random posts about ECC vs. non-ECC. I don't see anything specifically there about the root of the file system.
(I'll happily grant that this scenario is so unlikely as to be impossible for all practical purposes, but having skimmed the stuff you linked to I don't see why it couldn't happen theoretically.)
Without ECC RAM, you're far more likely to get uncorrectable / unnoticeable corruption.
This is not unique to ZFS, and it doesn't make ZFS worse than other filesystems. But since the reason you'd use ZFS is often to avoid any corruption, it's tradition to advise the use of ECC.
I learned this the hard way on an old server that did not have ECC.
I had a file server happily ticking away using ext4.
Converted it to ZFS - and a week later got file system corruption reported. Ran a very extensive memory test - and sure enough I had bad RAM (but it took 2-3 days for the errors to show up).
In the wild there has to be a ton of corruption that just never gets discovered without end to end checking.
If you have large JPGs or MKVs, a flipped bit here or there is not going to be apparent.
Why would ZFS be worse than ext4 or anything in this way if you don't have ECC?
Genuine question, I don't understand this claim. As far as I can see, ZFS provides protection against some types of failures on disk, which ext4 doesn't. ECC has no impact on that, it protects another dimension.
One reason is that you generally perform scrubbing, thus you are potentially rewriting data which would otherwise be at rest. If your memory is bad, this could replace good data with bad. FS that doesn't scrub doesn't have this issue.
No, it is worse with ZFS because it doesn't have an fsck tool. If you have a bit flip in the ZFS metadata you have to export and re-import your whole pool to get it to a writable state again.
Meanwhile fsck on a traditional filesystem will gleefully mangle data that's actually fine in face of transient corruptions.
I've had this happen more than once, both with bad RAM and bad IO controllers - previously fine static data suddenly being detached from the filesystem and appearing in little bits in lost+found, because bit flips effectively causes it to hallucinate problems to "fix".
Resilvering (which is basically a global data verify, similar to fsck) will fix bits flipped the wrong way via error correction, assuming you've set ZFS up that way. Are you saying this doesn't apply to the metadata?
A simple scrub will repair blocks that have checksum failures, but there is no guarantee that the checksum was calculated before the bit flipped, if the flip occurred in a buffer being written.
Scrubbing corrects on-disk bit flips. An in-memory bit flip (which is rarer than on-disk bit flips, even with non-ECC memory) can corrupt an in-memory data structure which is later written to disk to all replicas, i.e. scrub will not detect it. If this corrupted data structure is later loaded and used, this may cause all kinds of problems, and there is no tooling to correct it.
No, no one is saying that ZFS is more prone to defects without ECC. Lack of ECC increases the risk of corruption for any filesystem. The reason you hear more about ECC in the context of ZFS is that data integrity is a key feature for many who choose to use ZFS.
I don't understand why this is so often misunderstood.
The parent poster already stated the opposite.
ECC and ZFS are orthogonal. ECC ensures that data in your RAM is not corrupted (or rather detects corruption) it helps whether you use ZFS, EXT4, NTFS etc.
ZFS increases your data safety whether you use ECC or not, but if you have to have maximum assurance that data is fine you should use ZFS and ECC.
This is correct. He considers ECC so important that he is willing to spread FUD about non-ECC behavior to try to scare people into using ECC. I think the truth is scary enough to convince people, but he does not agree.
No, all are equally prone to errors if a bit in RAM is unexpectedly flipped. My understanding is that ZFS requires more RAM and possibly more CPU than other filesystems and those costs aren't worth it if you're going to use RAM that can't detect errors anyway.
They do. There are now Xeon E3 notebooks that have ECC. It's something new they added with Skylake for that reason. They're designed as mobile workstations, so you're gonna be looking at at least $1700 for a laptop, but good quality's always expensive.
They might not own the OpenZFS patches, but they definitely own all of the original ZFS code. They had all contributors sign CLAs to assign copyrights. That's how they were able to end OpenSolaris.
From my understanding, Sun back in the day, and Oracle now, cannot release everything under the GPL due to contractual obligations. There's a reason they wrote the CDDL in the first place.
I can't find the talks now, but I believe Cantrill and others have spoken about this previously.
My memory is somewhat fuzzy, so I might be wrong on this.
That was Solaris and that was regarding putting all of it under an OSS license, not necessarily the GPL. There were a few tiny bits that they just could not open source.
Oracle could release their fork of ZFS under any license they wish.
I'm a conservative user, so I don't change my filesystem until my preferred distribution (Ubuntu) supports installing an FS to the root with a provided, supported kernel module. This is a huge deal for me; I will probably install a new FS on my main file server and move from ext4 to zfs.
Anyone know if this applies to Lubuntu as well? I use Lubuntu on my recently bought desktop for its default LXDE. I intend to upgrade that desktop, on which I put Lubuntu 15.10 (I bought a computer without an OS so as not to pay the Windows tax), to Lubuntu 16.04, because I understand it'll be based on LXQt, the successor to LXDE (and it's not a matter of newer is better -- I am a fan of Qt), and also because I think Lubuntu 16.04 will be an LTS release and I've been very happy with the stability of Lubuntu 14.04 LTS, which my previous main computer was and is running.
P.S. not the "solution", but may help in case you fill the disk by accident, and need to make some room before a remount cycle.
P.S.2. This could also help when your disk is already 100% full, without enough space even to delete files (not enough space for new inodes). I tested that case on a ZFS NAS with no space left at all, and it worked.
I'm currently setting up a couple servers using LXC with btrfs.
I ended up choosing LXC (as opposed to LXD, docker, rkt, etc.) because I wanted something relatively straight-forward. I just wanted some containers I could create, log in to and configure.
If this was a bigger deployment, I'd take the time to use docker or something else. But for now, just being able to get going quickly is nice. For backup / failover, I can btrfs send / receive the containers to another host and start them there.
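The backup/failover step is roughly this (paths and host name made up; btrfs send requires a read-only snapshot):

    $ btrfs subvolume snapshot -r /var/lib/lxc/web1 /var/lib/lxc/web1-backup
    $ btrfs send /var/lib/lxc/web1-backup | ssh standby btrfs receive /var/lib/lxc/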
Yeah, that's all fine and good. Nothing to announce with lxc/lxd + btrfs because it already works fine :) I do like the wizard to easily set up the ZFS backend, rather than you needing to manually replace /var/lib/lxc with a btrfs partition or however you are doing it.
I've been using lxc + btrfs daily for quite a while, setting up and tearing down hundreds of containers on a busy day. I stopped using lxc snapshots after I had a btrfs subvolume that would crash the system when I mounted it. After that, no problems.
> I stopped using lxc snapshots after I had a btrfs subvolume that would crash the system when I mounted it. After that, no problems.
That's unfortunate. What operating system version were you using at the time?
I've actually switched to using Ubuntu 15.10 for the container hosts so that I can get a more recent version of the btrfs tools. The intention is to upgrade to 16.04 as soon as is reasonable, and leave them there for a long time.
How did you determine that? The two should use about the same amount of memory. The only difference is that ZFS uses ARC and btrfs uses the page cache. ARC is not reported the same way as the page cache though, which might give the appearance of requiring more.
Docker switched to Alpine Linux by default, so there's that hurdle... Not to mention the giant legal question mark of loading CDDL code into a GPLv2 kernel.
One of the ideas of containers / virtualization is that the host operating system (Ubuntu, in this case) has as little to do as possible with the VM / container (Alpine, in this case).
Running an Alpine container using Docker on an Ubuntu host will work just fine.
For those who missed it, Debian Project Leader Neil McGovern gives details [1] about how licensing issues were resolved so that ZFS can be in Debian now. It is distributed as a source-only DKMS module.
This appears to be inaccurate. They appear to be distributing it as a binary module, outside of the kernel tree, just like many other binary-only kernel modules.
The binary kernel image is a separate issue from the kernel packages (deb); they could include multiple files in a kernel package (deb) that are licensed under different licenses.
Regardless of how the binary is shipped (which could be legal), their aggregation of the source is likely not, since it almost certainly is a derivative work at that point. The fact they have a public git repo where the two codebases touch is probably enough to bait a lawsuit from Oracle, in that they're distributing CDDL code in a way that is against its license.
There's a reason LLNL developed their branch out-of-tree - it's just not worth the legal headaches to aggregate the source like Canonical just did.
The CDDL does not prevent ZFS from being used in Linux. It's the GPL that prevents using CDDL code in the kernel. Oracle doesn't have any grounds to sue based on their IP. They would only be able to sue on behalf of Linux, and violation of the GPL. Although I wouldn't rule that out, it's somewhat less likely to happen.
On the other hand, I suspect RMS isn't too happy with this turn of events. The sfconservancy may be the more likely party to bring a lawsuit. I'm curious to see either of them comment on the situation.
The GPL doesn't even really prevent it; the only relevant clause is about "derived works", and it's quite a stretch that a court would find a module to be such a thing.
How would you explain to a non-technical person that a kernel module is not a derived work?
Can you remove the Linux kernel and still have a complete and working program? What happens if one removes all function calls to the Linux kernel or uses of internal kernel variables? As a module, does it work with any other kernels, like Windows or Apple's, and what was the programmer's intention when writing it?
There are some arguments in favor of fair use in regard to compatibility, where derived works are infringing but still deemed legal. The courts have historically been rather split on this subject when it comes to software, in particular with several cases ruling in favor of unlicensed modules for consoles. It would be quite a big bet either way.
If I take a chapter of a textbook, modify it to be a standalone volume in a collection of books and start distributing it, I am distributing a derived work of the original book, not a derived work of the collection of books. The latter constitutes an aggregation and unless there is some license (superseding doctrine of first sale in the case of books) that prevents it from being redistributed with such things, it is perfectly okay to do that.
Similarly, the original code was taken from OpenSolaris and was adapted for Linux. No matter how we change it, it is a derived work of Solaris. Furthermore, it is distributed as part of a mere aggregation, which is okay with OSS under the OSD and also okay with the GPL under the GPL FAQ. The only time you can claim a combined work is formed is when the module is loaded into a running kernel, but the GPL does not restrict non-distribution and the kernel with the module loaded into it is not being distributed.
As for removing it from the Linux kernel, given that it is an entire storage stack between the block layer and VFS, you would need to replace everything there (including the disk format), but yes, you would have a working system.
As for all calls to Linux kernel symbols, those are provided to LKM so that they can function and they cannot function without it. There are symbols not provided at all, symbols provided only to GPL software and symbols provided to everyone. ZFS only uses the last group, which is intended for use by non-GPL software.
You can design software to load an LKM from an arbitrary kernel. FreeBSD had done that with Windows kernel modules for wireless drivers at one point. Wine does that for certain Windows drivers that do copy protection. There is nothing stopping you from creating a kernel under a different license that loads modules in the LKM format of a given Linux kernel, although the usual case is to port the code to another kernel's own LKM implementation. Attorneys with whom I (and apparently Canonical too) have spoken think this is okay.
> Can you remove the Linux kernel and still have a complete and working program? What happens if one removes all function calls to the Linux kernel or uses of internal kernel variables? As a module, does it work with any other kernels, like Windows or Apple's, and what was the programmer's intention when writing it?
ZFS was developed on another operating system, Solaris, back in the early 2000s, and continues to be actively developed on illumos, FreeBSD, OS X and Linux today. However, the bulk of new code seems to come from the illumos and FreeBSD communities. ZFS also runs in userspace to allow for easier testing and development. So if you remove Linux you still have a working program, i.e. it's a working kernel module for illumos, FreeBSD and Mac OS X, as well as a userspace program.
As for the intentions of Jeff Bonwick and Matt Ahrens, it was to make administration of file systems much easier. The video posted below is about the history of ZFS and is presented by one of the creators. The first person talking is the other founder of ZFS.
> How would you explain to a non-technical person that a kernel module is not a derived work?
GPLv2 does not use the term "derived work" anywhere. It uses "work [...] derived from the Program", and does not define this term [1].
I'd start out by explaining that before we even get to the question of whether or not the module is a "work [...] derived from the Program", we have to ask the question of whether or not the license even applies. GPLv2 only applies if the module does something that requires permission under copyright law. The copyright law question that needs to be asked is whether or not the module is a "derivative work" of the kernel.
> Can you remove the Linux kernel and still have a complete and working program? What happens if one removes all function calls to the Linux kernel or uses of internal kernel variables? As a module, does it work with any other kernels, like Windows or Apple's, and what was the programmer's intention when writing it?
None of these questions are actually relevant to the copyright law question of whether or not it is a derivative work. They are relevant to the question of whether or not it is useful when not used in conjunction with a Linux kernel but that's not a copyright law question.
To answer the copyright law question of whether or not some program P [2] is a derivative work of some other program Q, you only need to look at the source code to P and Q. If P and Q interact with each other (unilaterally or bilaterally, directly or indirectly) some people get hung up on the mechanism of that interaction, but that's not relevant to the question of whether or not P is a derivative work.
Whether or not a program P that uses function names, function argument ordering, and data structures of program Q, but does not copy algorithmic code from Q, is a derivative work of Q is going to essentially come down to whether or not the interface (I'm including data structures as part of the interface) of Q is copyrightable.
If program interfaces are copyrightable, then programs that interact with other programs will be derivative works of those programs, regardless of whether they interface by static linking, dynamic linking, system calls from a user process P to kernel code Q, IPC from process P to process Q, RPC from process P across a network to process Q on another machine and so on.
If program interfaces are not copyrightable, then as long as all P incorporates from Q are interfaces P won't be a derivative work.
Generally, courts have held that program interfaces are not copyrightable (with the notable exception of the Court of Appeals for the Federal Circuit in the Oracle vs. Google case, which does not set copyright precedent).
Thus we arrive at the major question for kernel modules: what copyrightable kernel elements do they incorporate?
If they just incorporate non-copyrightable interfaces then a kernel module would not be a derivative work of the kernel.
That's not the end of the inquiry though. It would be if some third party were making and distributing the module. E.g., if I were to write a kernel module that does not incorporate any copyrightable kernel elements and distribute it stand alone, for others to download if they want and use it with their kernels, we'd be done.
In the case of a distribution vendor distributing a kernel module along with a kernel, then even though the module itself might not be a derivative work their distribution as a whole is. Questions might arise as to just what constitutes a "work". If they statically link the module to the kernel, the resulting binary is clearly a work, and it is a derivative work of both the kernel and the module, and so the module would have to be GPL. It is important to note in this case that this is because the combined work is a derivative work of the kernel...the module itself is still not a derivative work of the kernel.
How about if the module is dynamically linked, but the configuration they ship automatically loads it at boot time? Might one argue that the kernel, init scripts, and dynamic modules together are all one work that the vendor is distributing?
[1] For completeness, GPLv3 does not use "derive" or "derived" or any similar terms at all. It uses the term "covered work", which is defined as the original program or a "work based on the Program", and it defines that as basically a work that requires copyright permission.
[2] I'm going to use the term "program" expansively to include modules, applications, plug-ins, and so on.
The CDDL was crafted with the GPL already in existence and was, according to the person responsible for creating it, deliberately made incompatible with GPLv2. This is not hard to understand given that Sun had no reason whatsoever to hand over their prized technology (ZFS, DTrace) to the competitor which was killing them in the market.
>On the other hand, I suspect RMS isn't too happy with this turn of events.
Why not ?
>The sfconservancy may be the more likely party to bring a lawsuit.
That would require that a Linux copyright holder wants to sue, and why would they? OpenZFS is open source, and previous suits have been about source code compliance.
It is probably more accurate to claim the GPL was designed to be incompatible with an entire class of licenses that includes the CDDL, and the MPL on which it was based and any future licenses similar to or based on licenses in that class (of which the CDDL was given that it was made after the GPL).
There is no clause in the CDDL that places restrictions on other files in a combined work, but there is one in the GPL. There are people out there who dislike the GPL for that, and some who explicitly go out of their way to avoid GPL compatibility because of it; I am sure that some of those people existed at Sun, but I really doubt that the design of a license by a huge organization with many people giving input can be simplified to one guy thinking GPL incompatibility is a good feature.
I also think this happened years ago and there really is no point to living in the past. People cannot distribute a vmlinux file with ZFS linked into it (i.e. not a kernel module, but part of the binary itself) because of that, but that does not stop people from distributing it as a kernel module and that is how filesystem code is loaded these days, so it is a non-issue.
>It is probably more accurate to claim the GPL was designed to be incompatible with an entire class of licenses that includes the CDDL,
It was designed to give and preserve rights for end users, it's not really a big mystery, and the actual rights which are given and preserved perfectly mirror that.
I don't see anything that would substantiate your claim of them being 'deliberately' incompatible with any other licenses (anything you can point to?); in fact they've fixed incompatibility problems with other licenses in GPLv3.
And of course both MPL and CDDL came along much later than GPLv2, with which they were incompatible (MPL 2.0 in turn rectified this).
>can be simplified to one guy thinking GPL incompatibility is a good feature.
No, I don't think for a second that it was 'one guy', again Sun management had absolutely zero reason to allow Linux to incorporate ZFS and DTrace and every business reason not to, in fact from a business standpoint it would have been crazy to hand over ZFS and DTrace to their main competitor.
>but that does not stop people from distributing it as a kernel module and that is how filesystem code is loaded these days, so it is a non-issue.
I'm not at all sure it's a non-issue: this is a Linux kernel module running in Linux kernel space, and I'm pretty sure there is a strong case for this being considered a derivative. That said, I hope it won't be an issue, since having ZFS in a native capacity with minimal effort is a boon for Linux.
> It is probably more accurate to claim the GPL was designed to be incompatible with an entire class of licenses that includes the CDDL, and the MPL on which it was based and any future licenses similar to or based on licenses in that class (of which the CDDL was given that it was made after the GPL).
Given that work was done to make GPLv3 more compatible with other open source licenses and that GPLv2 predates both of the licenses you mention by quite a bit I'm inclined to think that's nonsense.
If compatibility with everything were the goal, the FSF would have opted for the CC0 license. Since the GPL is not compatible with things on that level, it is designed to be incompatible with certain things. Some subset of possible open source licenses definitely were excluded as part of that.
No, Ubuntu may have put the ZFS source in the kernel tree, but they still ship it to end users as a separate kernel module and separate Ubuntu package (edit: the separate package is "zfsutils-linux" for the userspace code).
AFAIK to violate the GPL they would have to ship ZFS compiled code in the kernel image, but this is not what they are doing.
If I take a chapter of a textbook, modify it to be a standalone volume in a collection of books and start distributing it, I am distributing a derived work of the original book, not a derived work of the collection of books. The latter constitutes an aggregation and unless there is some license (superseding doctrine of first sale in the case of books) that prevents it from being redistributed with such things, it is perfectly okay to do that.
Similarly, the original code was taken from OpenSolaris and was adapted for Linux. No matter how we change it, it is a derived work of Solaris. Furthermore, it is distributed as part of a mere aggregation, which is okay with OSS under the OSD and also okay with the GPL under the GPL FAQ. The only time you can claim a combined work is formed is when the module is loaded into a running kernel, but the GPL does not restrict non-distribution and the kernel with the module loaded into it is not being distributed.
You can argue that GPL advocates did not intend to support a license that allows any of this. However, I expect that you would have trouble finding an attorney that will interpret what the copyright holder thought the terms said to supersede the legal meaning of the terms unless explicitly stated.
If you make a license for the kernel that does not allow derived works of other platforms' software to be distributed as ports, you would violate #9 of the OSD and could not call it an open source license:
If you take the plot from an episode of Star Trek and modify it such that it fits into the Dr Who storyline, you've created a work that's derivative of both Star Trek and Dr Who. Similarly, if you take code from Solaris and modify it such that it tightly integrates with Linux, you've created a work that's derivative of both Solaris and Linux. Since ZFS can only be distributed under the CDDL and since GPLv2 requires all derived works to be distributed under the GPL, you can't satisfy the license.
> If you take the plot from an episode of Star Trek and modify it such that it fits into the Dr Who storyline, you've created a work that's derivative of both Star Trek and Dr Who. Similarly, if you take code from Solaris and modify it such that it tightly integrates with Linux, you've created a work that's derivative of both Solaris and Linux. Since ZFS can only be distributed under the CDDL and since GPLv2 requires all derived works to be distributed under the GPL, you can't satisfy the license.
That is analogous to writing a new piece of software intended to be similar to an existing piece of software rather than a port of software under license. Examples of the former include the Linux kernel (meant to be similar to UNIX SVR4) and the wine project (meant to be similar to Windows). If that argument is valid:
1. Oracle is in an excellent position to sue every Linux user not using Oracle Linux, because they own rights to UNIX SVR4, which they inherited from Sun.
2. Microsoft is in an excellent position to sue wine users.
3. James Cameron and 20th Century Fox would also be in trouble with Disney for Avatar's similarities to Pocahontas.
4. Probably plenty of other bad things.
However, this argument does not apply to ZoL because the code originated in OpenSolaris and is under license and exists as a discrete module, rather than a whole program.
So far, the only thing that you have concretely stated is that you met some attorneys who were unwilling to make a decision on legality. You are not an attorney (unless you have obtained a bar number since I last asked) and I have yet to hear of anyone with a bar number who agrees with you.
If you want to prohibit people from using software you write with things that you consider to be derivatives when the law does not recognize them as such, you need a license that makes that explicit. Such a license could not be called an open source license under clause #9 of the Open Source Definition.
> That is analogous to writing a new piece of software intended to be similar to an existing piece of software rather than a port of software under license.
I take ZFS from Solaris. I rewrite it to work with Linux. In which sense is this not equivalent to my analogy? The examples you're giving are not equivalent, because in each case the work was written without deriving from the other copyrighted work.
> However, this argument does not apply to ZoL because the code originated in OpenSolaris and is under license and exists as a discrete module, rather than a whole program.
That's an entirely arbitrary distinction.
> So far, the only thing that you have concretely stated is that you met some attorneys who were unwilling to make a decision on legality.
> If you want to prohibit people from using software you write with things that you consider to be derivatives when the law does not recognize them as such
> I take ZFS from Solaris. I rewrite it to work with Linux. In which sense is this not equivalent to my analogy? The examples you're giving are not equivalent, because in each case the work was written without deriving from the other copyrighted work.
I take it that you never actually read the ZFSOnLinux source code.
It is not really rewritten. There is a compatibility layer in place to avoid the need to rewrite much of the code, and only a very small percentage of the original kernel code actually changed to support Linux, but what did change was made to use interfaces that are provided by the kernel to allow proprietary modules to be ported, which suggests any license is fine.
However, to claim that writing a brand new TV show script inspired by another forms a derivative work is to claim that writing things from scratch forms a derivative work.
Do you have bar numbers of these lawyers? Is there any reason to think that they were thinking that zfs.ko somehow used GPL-exported symbols, or some other thing that is not actually true, that does not involve taking your word for it? I did have one person in law school tell me that it was a derivative work because of that. He did not think he could maintain the claim after it was explained that the code does not do that.
Given that your legal views are so incredibly divorced from those of the actual lawyers with whom I have talked, I am not inclined to believe you when you say that they had no misunderstanding, especially when it seems that you have never actually read the code and so could not be sure of that.
It has several direct calls into Linux functionality that don't go via the SPL (Solaris Porting Layer), but it's also unclear that simply adding an abstraction layer is a meaningful mechanism to avoid derivation.
> what did change was made to use interfaces that are provided by the kernel to allow proprietary modules to be ported
There are no such interfaces in Linux.
> to claim that writing a brand new TV show script inspired by another forms a derivative work is to claim that writing things from scratch forms a derivative work.
I didn't make that claim. The analogy in question involves taking an existing work and modifying it such that it includes components of another work.
> It is the distinction lawyers are making.
It's the distinction a lawyer that you've spoken to is making.
> Do you have bar numbers of these lawyers?
Yes.
> Is there any reason to think that they were thinking that zfs.ko somehow used GPL-exported symbols, or some other thing that is not actually true, that does not involve taking your word for it?
No.
> Your claims are inconsistent with that.
My claim is that I have reason to believe that, under copyright law, ZoL is a derivative work of Linux and as such is subject to the terms of the GPL. If the final legal determination is that it's not a derivative work then the GPL is irrelevant.
> If I take a chapter of a textbook, modify it to be a standalone volume in a collection of books and start distributing it, I am distributing a derived work of the original book, not a derived work of the collection of books. The latter constitutes an aggregation and unless there is some license (superseding doctrine of first sale in the case of books) that prevents it from being redistributed with such things, it is perfectly okay to do that.
I should elaborate that you need the original to be under license. Otherwise, you have a problem.
> Well, Ubuntu may have put the ZFS source in the kernel tree, but they still ship it to end users as a separate kernel module and a separate Ubuntu package.
No, they aren't using a separate Ubuntu package, it's gone straight into the main kernel repo.
> AFAIK to violate the GPL they would have to ship ZFS compiled code in the kernel image, but this is not what they are doing.
You can violate the GPL inside a kernel module that you distribute.
> No, they aren't using a separate Ubuntu package, it's gone straight into the main kernel repo.
How Ubuntu packages it is irrelevant. What matters under the GPL is how the module is linked into the kernel.
> You can violate the GPL inside a kernel module that you distribute.
Of course, but they're not doing that. For example, you could violate the GPL by including GPL'ed code in a kernel module under a more restrictive license.
What matters under the Copyright law (and thus the GPL) is whether the module is a derivative work of the Linux kernel or not.
ZFS was originally created for Solaris, and works on multiple operating systems. So ZFS itself is obviously not a Linux derivative. If the original ZFS could be directly linked with the Linux kernel without modifications, it still wouldn't be a Linux derivative.
But ZFS had to be modified to work with Linux. It can be argued that those modifications are Linux derivatives. We haven't had a definitive ruling on this yet.
ZFS from Solaris / BSD --> not a Linux derivative, even if it was directly linked into Linux.
ZFS with trivial modifications to work with Linux --> not a Linux derivative
ZFS with extensive modifications to work with Linux --> judge's ruling required
The only reason that linking matters is because Linus's statement that binary modules are OK would have some weight with the judge. However, Linus is not the only copyright holder of the Linux kernel, and other copyright holders have disagreed with Linus on this statement.
It's a Linux kernel module running in the Linux kernel's address space; I'd say there is reason to assume it can be considered a derivative work, and thus a license incompatibility.
Do you think that there is reason to assume that every program that ran on MS-DOS on an 8086 was a derivative work of MS-DOS? The programs and MS-DOS all ran in the same address space on the 8086.
The GPLv2 does not restrict placing things under GPLv2-incompatible licenses in the same tree. It only restricts distribution of binaries that are derivative works under copyright law.
OpenZFS! I can't see why this has blown up into claims of a violation when people didn't actually read the announcement.
EDIT:
ZFS is licensed under the Common Development and Distribution License (CDDL), and the Linux kernel is licensed under the GNU General Public License Version 2 (GPLv2). While both are free open source licenses they are restrictive licenses. The combination of them causes problems because it prevents using pieces of code exclusively available under one license with pieces of code exclusively available under the other in the same binary. In the case of the kernel, this prevents us from distributing ZFS as part of the kernel binary. However, there is nothing in either license that prevents distributing it in the form of a binary module or in the form of source code. http://open-zfs.org/wiki/Main_Page
"We at Canonical have conducted a legal review, including discussion with the industry's leading software freedom legal counsel, of the licenses that apply to the Linux kernel and to ZFS.
And in doing so, we have concluded that we are acting within the rights granted and in compliance with their terms of both of those licenses."
"And zfs.ko, as a self-contained file system module, is clearly not a derivative work of the Linux kernel but rather quite obviously a derivative work of OpenZFS and OpenSolaris. Equivalent exceptions have existed for many years, for various other stand alone, self-contained, non-GPL and even proprietary (hi, nvidia.ko) kernel modules."
This would be true if the resulting work were not a derivative work of the GPLed kernel. There's plenty of solid legal opinion that it is, and if you accept that then the GPL absolutely prevents distributing it in the form of a binary module.
Shame how most of the conversation devolved into licensing rubbish. Almost none of us are qualified to speak on that; leave it to the lawyers - which I assure you Canonical did too.
With that out of the way, ZFS is far and away the best filesystem for container workloads. Hopefully we will get deeper quota and I/O throttling support soon.
I have been using ZoL in production for many years now thanks mostly to the work of Brian Behlendorf and Richard Yao. So if you find yourselves here, thanks for all the work you have put into making ZoL awesome.
> Shame how most of the conversation devolved into licensing rubbish. Almost none of us are qualified to speak on that; leave it to the lawyers - which I assure you Canonical did too.
This, a million times. It will be nice to have the illumos community, the FreeBSD community, and now the Linux community contributing to one piece of core software. It's especially amazing considering most open source operating system projects don't share major kernel subsystems.
This can be game-changing for the NAS/SAN industry.
I'm surprised their lawyers gave an OK when the FSF, SFLC, and friends have given a thumbs down. If their interpretation holds, suddenly the large AIX/Solaris-dominated storage boxes open up to a LOT of Ubuntu-based/Ubuntu-derived competition.
> I'm surprised their lawyers gave an OK when the FSF, SFLC, and friends have given a thumbs down.
I'm not. The FSF and SFLC have institutional incentives to support the maximum remotely defensible interpretation of the scope of copyright holders' rights, since they are ideological organizations who rely on the maximum amount of code possible being subject to the restrictions of the GPL.
They are among the least likely organizations on Earth to publicly present a balanced view of the scope of copyright law particularly as it addresses coverage of derivative works.
They certainly have reasons to be biased, but saying they are among the least likely is unnecessary hyperbole. I'd say they're at most as likely as the lawyer of any copyright holder is when discussing whether something is a derived work of their property.
The other party here has their own interests and biases here as well, of course. Let's not forget how many companies in the mobile and embedded space have repeatedly chosen to violate the GPL even when their noncompliance has been obvious.
I'm curious exactly what the quality of Canonical's legal advice is and how much those lawyers understand open source licensing and IP law in general. It took "two years of negotiations" for them to state that, for packages under the GPL, their GPL-incompatible license on Ubuntu as a whole did not apply.
(It's still the case that non-GPL binary packages in Ubuntu, that is, stuff under MIT, BSD, etc. licenses, may not be redistributed. This is legal for the same reason that using that code in proprietary software is legal.)
This is my #1 question. Can we use it for the root FS? If so, that's amazing, as there are already btrfs-based tools for snapshotting every time you run apt, etc.
I expect those issues to be resolved before 16.04 is released. Even with those fixes, the interactive installer doesn't support ZFS yet, so you will still need to drop to a shell to actually set up your zpool and your partitions.
How is this possible, legally? Based on my basic understanding of the ZFS license, it's not possible to legally distribute ZFS and GPL code (linux kernel) together.
That's what they say - others claim that even a dynamically loaded module produces a derivative work and thus you're not allowed to distribute a non-GPL'ed binary module.
Matthew Garrett (a kernel developer and thus a shared copyright holder of the kernel) is of the opinion that linking a binary ZFS module is not legal:
I know zealots are necessary to keep a balance. I've typically appreciated the utility of people like RMS to the free software movement.
That said, Matt Garrett's Captain Ahab-like zeal for keeping one of the most useful pieces of open-source code away from Linux, while taking potshots at Ubuntu, is really off-putting. I guess I'm not so pure.
Which is why I run my file server with BSD.
I'm really excited to see ZFS functional in 16.04, and in fact, that got me to install the pre-beta just to mess with it.
I can understand how you disagree with Matthew (and I also would prefer for ZFS to be universally available under Linux), but that's not the point here.
The GPL states in clear terms what's allowed and what isn't.
It doesn't matter whether you believe a specific use case should make it ok to violate the license or not.
It's like laws. Whether you personally believe they are just or not is not a reason why they should or should not apply to you.
In my heart, I know. As a user, it's just frustrating to see so much awesome technology artificially limited by silly licenses on both sides of this debate.
> That said, Matt Garrett's Captain Ahab-like zeal for keeping one of the most useful pieces of open-source code away from Linux, while taking potshots at Ubuntu, is really off-putting.
Your response to a large company violating the license that Linux is distributed under is to blame Matthew Garrett for pointing it out?
That's a little presumptuous. Violating? On one hand we have some opinions from people, some lawyers, some not, saying they think this could be a violation.
On the other, there are just as many opinions that this (or the way they did this) is -not- a violation.
So I don't think it's particularly fair to reach for your pitchfork, either.
Besides, his response had almost nothing to do with Ubuntu; it was about "Garrett's zeal in trying to keep ZFS off Linux" regardless of distro (which is true), "while taking potshots at Ubuntu" (which is also true, and on issues far wider than including ZFS).
Can you provide bar numbers of lawyers who would make that claim?
So far, Matthew Garrett has yet to claim that any attorney said this is a problem. The only claim he has made (after I got him to clarify what was said) is that he met some attorneys who said that they were not absolutely sure that there is no problem. There are likely attorneys out there that make similar claims about the GPL software in general, so I really am not that concerned that he found a few attorneys that said that they were not sure.
Indeed. ZFS, supported by Canonical. It's Canonical's considered legal opinion that there isn't a problem with ZFS, and if you disagree you can sue them. They put their balls on the table. Dare you to try cutting them off. You need a real lawyer to try that, not an armchair lawyer.
People keep saying that zfs was "merged into the kernel tree," but so far as I know the GPL doesn't dictate that things can't be stored in the same location together. There are no official GPL-certified directory structures, etc.
Those comments are very different from Matthew's previous comments regarding Oracle's dtrace LKM for Linux, where the only definitive remark he had was that bypassing the GPL symbol export like they did was not okay:
His argument about CDDL Linux kernel modules using non-GPL exported symbols being a problem is clearly FUD. Specifically, Fear of a violation; Uncertainty of a violation; and Doubt that there is no violation.
Does Garrett hold the copyright on anything that the ZFS module could be considered a derivative work of? I mean, presumably the notion that the ZFS module is a derivative work is based on the ZFS module containing code that was written to work with particular parts of the Linux kernel - but it's not going to touch a lot of the kernel API. So who owns the copyrights on the parts that it does touch?
On Twitter[0], he is suggesting that since the binary module is a derivative work (of the Linux kernel, due to linking to it), according to the GPL the source code to ZFS must be licensed under a GPL-compatible license.
However, since Canonical cannot relicense the ZFS code to a GPL-compatible license (since they are not the copyright holder), if they distribute the ZFS module, they would be in violation of the GPL (and thus lose their rights under the GPL to the kernel code).
Whether that's actually true appears to be up for debate, depending on whether or not shipping a binary module counts as distribution, which is why he's suggesting he'll talk to the FSF about possible recourse.
Given his track record of trying to get Oracle to stop using a GPL-exported symbol in their CDDL DTrace module, it is unlikely he is going to do anything here. Oracle has committed an actual potential violation as far as the lawyers with whom I have talked are concerned. Canonical has not.
ZFS has been available on Linux for a long time; the licence restrictions hold it back from being included in the kernel, but it's available as a FUSE filesystem.
EDIT: Looking through the comments in the article, it's being suggested this isn't using the FUSE implementation of ZFS, and that somehow it's part of the kernel. Not sure how they've legally managed to do that!
EDIT 2: Looks like it's a kernel module, as other comments here suggest.
It looks like they got around it by distributing OpenZFS as a kernel module. If that's permitted for closed source kernel blobs, it's probably fine for this code too.
> It is not permitted for closed source kernel blobs; those are violations too.
They are not permitted by the license; for them to be violations, they would have to be derivative works that require a license.
If you have a reference to copyright case law in any jurisdiction holding that to be the case, that would be interesting.
I think it's clear that certain parties (including the FSF) would like this to be perceived as a violation. It also seems fairly clear that this type of act has been a fairly established practice around Linux, and that those holding that view have not taken action to vindicate it in court. Perhaps that is because, while they'd like it to be perceived to be the case, they have little confidence that courts would agree with them, and the one thing they'd like even less than the current disagreement, with some people engaging in a practice they don't like, is a black-and-white ruling vindicating the ongoing practice and rejecting the FSF view on the legal requirements.
Does a single patch to any part of the Linux kernel give you standing to sue over these things, or do you have to hold copyright on a part that is clearly relevant?
So using a kernel module for the closed source NVIDIA driver for Linux is a GPL violation?
As far as I understand, the GPL only applies to code that is directly linked with it. A call to an external library (e.g. a kernel module) wouldn't necessarily be covered. The code is being delivered as separate binaries.
> So using a kernel module for the closed source NVIDIA driver for Linux is a GPL violation?
Yes. Your understanding isn't correct.
With NVIDIA, there's this complicated dance where you download source code from nVidia for the kernel module, then compile it on your own machine and use it -- you aren't violating copyright because you don't distribute the kernel module that you compiled for yourself on your own machine.
But that kernel module does indeed violate GPLv2, and you can't distribute it legally, and neither could Canonical or nVidia (which is why they do the dance above instead).
> "But that kernel module does indeed violate GPLv2, and you can't distribute it legally, and neither could Canonical or nVidia (which is why they do the dance above instead)."
If that's the case, then why doesn't the FSF sue Linux kernel developers over licence violations? There are clearly pre-compiled binary blobs distributed along with the mainline kernel (otherwise there would be no need for the Linux-libre project to exist: https://en.wikipedia.org/wiki/Linux-libre ). There's little point having a licence if there are no consequences for breaking it.
I suspect they don't because it's not a simple case, and that such a measure would be somewhat counterproductive for their cause.
> If that's the case, then why doesn't the FSF sue Linux kernel developers over licence violations?
I don't understand. The Linux kernel developers hold the copyright on the kernel. If someone sues someone else, it's them -- the kernel developers -- who have standing to do the suing. They didn't give their copyright to the FSF merely as a result of choosing to use the FSF's license.
So if none of the Linux kernel developers sues Canonical for using a OpenZFS kernel module, Canonical can carry on with it and nothing of value was lost.
This is true, but most companies would want better assurance of their work's legality than "well, as long as none of the tens of thousands of people we just gave grounds to sue us actually do so, we'll be fine".
My understanding is that it isn't permitted under the theory of what counts as a derivative work requiring a license (and, thus, what is subject to GPLv2 in the first place) espoused by the FSF. However, I believe the accuracy of that view under US copyright law (at least) has been hotly disputed for about as long as the GPL has existed, and it has never been tested in court.
The FSF, in general -- as is unsurprising for an entity that relies on maximally leveraging copyright protections to achieve its ends -- holds to a fairly maximalist view of the legal rights of copyright owners.
"The majority of the code in the ZFS on Linux port comes from OpenSolaris which has been released under the terms of the CDDL open source license. This includes the core ZFS code, libavl, libnvpair, libefi, libunicode, and libutil."
Comment by Richard Yao, Gentoo dev and ZFS On Linux contributor [1]:
"... there is no legal issue preventing the sources from being combined because neither the CDDL nor the GPL place restrictions on aggregations of source code, which is what putting ZFS into the same tree as Linux would be. Binary modules built from such a tree could be distributed with the kernel's GPL modules under what the GPL considers to be an aggregate. These concepts have passed legal review by many parties."
This was settled with the Andrew filesystem, which is a matter that I do not believe Linus considers up for debate. ZFS, being a port from another system just like the Andrew filesystem, is naturally in the same boat.
ZFS on Linux doesn't work unless it's linked against interfaces provided by Linux. This is a derivative work, and thus the combined work must be distributed under GPLv2.
The kernel copyright file expressly states that using the standard system call interface does not create a derivative work. There is no such statement about in-kernel symbols.
That's a good assurance that would probably estop a Linux copyright holder from later claiming that a binary using only the system call interface is an infringing derivative work.
It doesn't have the power to define a binary module using an internal interface as a derivative work; that can only be done by a court interpreting copyright law in a particular jurisdiction. In the United States, different Federal Circuit courts have different views of what constitutes a derivative work in software.
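To make the carve-out in the kernel's COPYING file concrete, here is a minimal sketch (my own illustration, not anything from the kernel or ZFS sources) of the blessed case: a userspace program that talks to the kernel only across the system call boundary.

    /* Userspace: crosses only the system call interface, the case the
     * kernel's COPYING file says does not create a derivative work. */
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello from userspace\n";
        write(1, msg, sizeof(msg) - 1);  /* write(2) system call */
        return 0;
    }

A loadable module such as zfs.ko sits on the other side of that boundary: at load time its unresolved symbols are linked against whatever the running kernel exports, which is exactly the interface the two sides of this thread disagree about.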
> The kernel copyright file expressly states that using the standard system call interface does not create a derivative work. There is no such statement about in-kernel symbols.
There is no legal precedent suggesting that was actually necessary either. Or are you aware of a court case that says otherwise?
Linking does not matter for the GPL. What matters is the legal concept of a derivative work. It just so happens that linking to a dynamic library bears a strong resemblance to a derivative work. LKMs and plugins in general are an entirely different matter. GPL software that supports plugins implicitly allows proprietary software to be loaded into it. That is why the FSF had been opposed to allowing plugins in GCC for years, until competition from Clang required that they become more tolerant or face irrelevance.
That being said, would anyone who believes this "linking to the kernel" argument please explain what linking actually means and how it is related to the GPL when the term "link" is not even present in the GPL?
Kernel modules are not derivative works of the kernel under copyright law (which was originally designed around literary works). Some might argue that kernel modules that were developed on Linux are, but ZFS itself is ported from another platform, so that is a moot point.
The only case where you cannot distribute ZFS is if you link it into the vmlinux binary that your bootloader loads. In that case, it is no longer a LKM and you can claim the binary is a derived work of Linux. That is what I believe shmeri meant.
Building the kernel module on the fly when deploying ZFS is an acceptable workaround. I think that's exactly how it was planned to be used in Debian. I really have no clue what Canonical are planning though.
In the case of Linux specifically, Linus endorsed the practice of providing so-called GPL-only entry points and symbols (EXPORT_SYMBOL vs. EXPORT_SYMBOL_GPL).
This is explicitly allowed by the GPL, since copyright holders can relax its provisions with exception clauses (such as the ones in glibc or GCC).
If the ZFS kernel module uses only the symbols that are available to non-GPL modules, it's likely fine from a legal standpoint. If not, there is a problem.
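For anyone who has not looked at how that works, here is a minimal sketch of the export mechanism. The helper functions are invented for illustration, but EXPORT_SYMBOL, EXPORT_SYMBOL_GPL and MODULE_LICENSE are the real kernel macros:

    /* Illustrative only: built as part of some in-tree code or module,
     * these two helpers would become visible to other modules. */
    #include <linux/module.h>

    int example_helper(int x)
    {
        return 2 * x;
    }
    /* Resolvable by modules under any license (CDDL, proprietary, ...) */
    EXPORT_SYMBOL(example_helper);

    int example_gpl_only_helper(int x)
    {
        return x + 1;
    }
    /* Only resolvable by modules whose MODULE_LICENSE is GPL-compatible */
    EXPORT_SYMBOL_GPL(example_gpl_only_helper);

    MODULE_LICENSE("GPL");

When a module that declares a non-GPL license (the CDDL, say) is loaded, the module loader will refuse to resolve any EXPORT_SYMBOL_GPL symbols for it while happily resolving plain EXPORT_SYMBOL ones, which is the line being drawn in the comment above.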
The FSF is referring to the creation of executable files, shared libraries, and executables dynamically linked to those libraries. They do not mean LKMs, which is what a ZFS kernel module is.
Ubuntu is always slightly behind and slightly ahead of the curve. Why are they digging their heels in with LXD when Docker and rkt make so much more sense?