Port ZFS to Haiku
Ankur Sethi
Short description: Port ZFS -- the combined copy-on-write filesystem and logical volume manager available on OpenSolaris, FreeBSD and Linux (through FUSE) -- to Haiku.
Basic Information
- Full name: Ankur Sethi
- Timezone: GMT+5:30 (Asia/Kolkata)
- Primary email address: gm@GnrlMxms.in
- Secondary email address: getmeankur@gmail.com
- Trac username: GeneralMaximus
- IRC username: GeneralMaximus
- Will you treat Google Summer of Code as full time employment? Yes.
- How many hours per day will you work? At least six hours on weekdays.
- List all obligations (and their dates) that may take time away from GSoC: My end-term exams end on June 8. The new semester begins on August 1.
- Are you using Google Summer of Code to fulfill a university requirement? Yes.
Personal Information
I'm a third-year Information Technology student at GGS Indraprastha University, New Delhi. I'm primarily a Python and C/Objective-C (Cocoa) programmer. I worked on the index_server as part of the Haiku Code Drive 2009. In the two years since, I've mostly worked on Mac OS X and iOS applications. I've also done some web work using Python. On weekends I like to tinker with Common Lisp.
I have spent the three weeks leading up to the GSoC application period studying the intricacies of writing a file system module for Haiku. In addition, I've done a fair bit of reading on file systems in general. To get familiar with the file system API, I am hacking on Michael Pfeiffer's zipfs, using the BFS and PackageFS sources as references when I don't understand something.
I see this GSoC as a chance to hack on the innards of the Haiku kernel -- or at least the file system interface.
Project Proposal
Title: Port ZFS to Haiku.
Motivation: ZFS, allegedly the "last word in file systems", gives us all the features we could possibly want from a file system, and then some. Some interesting ZFS features are support for large files, a near-paranoid focus on data integrity, integrated volume management, encryption, and easy snapshots and rollbacks. Both Linux and FreeBSD now support ZFS, and OS X might follow. The interoperability advantages are huge.
Timeline
Note: it would be a good idea to skim this article and this FAQ for a high-level overview of the ZFS codebase. This source code tour may also be useful. What follows is a rough roadmap based on the experiences of people who have ported ZFS to Linux[1] and FreeBSD[2].
Community bonding period
- Get a clear picture of how the ZFS source code is organized and how the userland tools talk to the kernel.
- Read the ZFS administrator's guide[3] and the guide to ZFS on-disk structures[4].
- Set up an OpenSolaris/FreeBSD VM and play with the ZFS admin tools.
- Talk to the mentor(s) and draw up a detailed roadmap of the porting effort.
June 8 - June 13 (Quarter-term)
- Finish porting dependencies (libavl[5], libnvpair[6], libuutil[7], libumem[8]).
- Begin writing an OpenSolaris compatibility layer[9]. This involves studying how threads, mutexes, condition variables, etc. work on Solaris and then writing wrappers around the Haiku API that behave the same way.
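To make the compatibility-layer idea concrete, here is a minimal sketch of Solaris-style mutex calls wrapped around a host threading API. It is an illustration under assumptions, not the planned implementation: POSIX threads stand in for Haiku's native primitives, and mutex_init() is simplified to take a single argument where the Solaris kernel version takes more.

```c
/*
 * Sketch only: Solaris-style mutex calls layered on POSIX threads.
 * Assumption: pthreads stand in for whatever Haiku primitive the real
 * compatibility layer would wrap. mutex_init() is simplified.
 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct kmutex {
	pthread_mutex_t	m_lock;		/* underlying host lock */
	pthread_t	m_owner;	/* meaningful only while m_held != 0 */
	int		m_held;
} kmutex_t;

static void
mutex_init(kmutex_t *mp)
{
	pthread_mutex_init(&mp->m_lock, NULL);
	mp->m_held = 0;
}

static void
mutex_enter(kmutex_t *mp)
{
	pthread_mutex_lock(&mp->m_lock);
	mp->m_owner = pthread_self();
	mp->m_held = 1;
}

/* Nonzero if the calling thread holds the lock (intended for assertions). */
static int
mutex_owned(kmutex_t *mp)
{
	return (mp->m_held && pthread_equal(mp->m_owner, pthread_self()));
}

static void
mutex_exit(kmutex_t *mp)
{
	assert(mutex_owned(mp));	/* only the owner may release */
	mp->m_held = 0;
	pthread_mutex_unlock(&mp->m_lock);
}

static void
mutex_destroy(kmutex_t *mp)
{
	assert(!mp->m_held);		/* destroying a held lock is a bug */
	pthread_mutex_destroy(&mp->m_lock);
}
```

Tracking the owner separately is what lets ported code keep its ownership assertions, which a bare host mutex would not support.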
June 14 - July 11 (Mid-term)
- Finish work on the compatibility layer. At this point, libzpool should build correctly on Haiku. libzpool is a userland implementation of the SPA and DMU designed to allow testing with zdb and ztest, the ZFS testing tools.
- Port zdb and ztest and start fixing regressions. We'll know libzpool is working if we can run ztest for a day without any problems.
- Make the work done so far available as an easy-to-install package so people can run ztest on different kinds of machines.
July 12 - August 1 (Three-quarter-term)
- Create a /dev/zfs and add the ZFS userland <-> kernel ioctl() interface to Haiku. This allows userland applications to talk to the ZFS kernel module through ioctl() calls on /dev/zfs.
- Port libzfs. libzfs is the primary interface for management apps to talk to the ZFS kernel module (through the aforementioned ioctl() interface). zfs and zpool, the ZFS administration utilities, are front-ends to libzfs.
- Port zfs and zpool. Creating zpools, checking their status etc. should now be possible.
- Start porting the ZFS POSIX Layer (ZPL) to Haiku. The ZPL is the layer that will act as a bridge between the Haiku VFS and the ZFS DMU. It will translate common filesystem calls to a format expected by the ZFS DMU. This is probably the hardest part of the entire porting effort.
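The ioctl() interface on /dev/zfs mentioned above can be pictured as a command number plus an argument structure, dispatched through a table of handlers. The sketch below shows only that dispatch pattern. Every name with a demo_ or DEMO_ prefix is hypothetical, the "kernel state" is a single static buffer, and the real ZFS command structure and command set are far larger.

```c
/*
 * Sketch only: table-driven dispatch behind a /dev/zfs-style ioctl hook.
 * All demo_/DEMO_ names are hypothetical and exist only for illustration.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_IOC_POOL_CREATE, DEMO_IOC_POOL_STATS, DEMO_IOC_COUNT };

typedef struct demo_cmd {
	char	dc_name[64];	/* pool name passed in from userland */
	int	dc_exists;	/* result filled in by the stats handler */
} demo_cmd_t;

static char demo_pool[64];	/* stand-in for kernel state: at most one pool */

static int
demo_pool_create(demo_cmd_t *dc)
{
	if (dc->dc_name[0] == '\0')
		return (EINVAL);
	if (demo_pool[0] != '\0')
		return (EEXIST);
	/* demo_pool is zero-initialized, so the last byte stays a terminator. */
	strncpy(demo_pool, dc->dc_name, sizeof (demo_pool) - 1);
	return (0);
}

static int
demo_pool_stats(demo_cmd_t *dc)
{
	dc->dc_exists = (demo_pool[0] != '\0' &&
	    strncmp(demo_pool, dc->dc_name, sizeof (demo_pool) - 1) == 0);
	return (0);
}

typedef int (*demo_ioc_func_t)(demo_cmd_t *);

static const demo_ioc_func_t demo_ioc_vec[DEMO_IOC_COUNT] = {
	[DEMO_IOC_POOL_CREATE] = demo_pool_create,
	[DEMO_IOC_POOL_STATS] = demo_pool_stats,
};

/* What the device's ioctl hook would do once the argument is copied in. */
static int
demo_zfs_ioctl(int cmd, demo_cmd_t *dc)
{
	if (dc == NULL)
		return (EINVAL);
	if (cmd < 0 || cmd >= DEMO_IOC_COUNT)
		return (ENOTTY);	/* conventional errno for an unknown ioctl */
	assert(demo_ioc_vec[cmd] != NULL);	/* no holes in the table */
	return (demo_ioc_vec[cmd](dc));
}
```

In this picture, a library such as libzfs would fill in the command structure, issue the ioctl() on the open device, and translate the returned error code for its callers.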
Pencils down date (August 22)
- At this point, the ZPL should be working correctly. Mounting a ZFS filesystem should now be possible.
After Google Summer of Code (long-term goals)
- Test, test, test. And squash bugs.
- Benchmarks. Everybody loves pretty graphs!
- Get Haiku to boot off a ZFS volume.
- GUI front-end to libzfs. This should replace zfs and zpool as the standard utility to manage ZFS pools and filesystems.
- Get index_server to index attributes stored on ZFS volumes.
Possible Issues
This section lists possible issues that might arise during the porting process. I will update this section as I discover new information.
- The main area where incompatibilities with Haiku will arise is the ZPL. The ZPL is where all the FS-related system calls (open, creat etc.) are mapped to ZFS calls. I'll take a look at the FreeBSD port and zfs-fuse to see how they handle the issue.
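As a rough illustration of the mapping the ZPL performs, the sketch below wires one VFS-style read hook to a DMU-style object read. It is a toy model under stated assumptions: all demo_ names are hypothetical, the "object" is a flat byte array, and the hook signature is only loosely modeled on a VFS read hook with an in/out length. Real hooks and DMU calls take more arguments.

```c
/*
 * Sketch only: a VFS-style hook table whose read hook forwards to a
 * DMU-style backend. All demo_ names are hypothetical.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a DMU object: a flat, read-only byte array. */
typedef struct demo_object {
	const char	*do_data;
	size_t		do_size;
} demo_object_t;

/* Stand-in for the DMU read path: strict about ranges. */
static int
demo_dmu_read(const demo_object_t *obj, size_t off, size_t len, void *buf)
{
	if (off > obj->do_size || len > obj->do_size - off)
		return (EINVAL);
	memcpy(buf, obj->do_data + off, len);
	return (0);
}

/* Stand-in for a VFS vnode and its table of hooks. */
typedef struct demo_vnode {
	demo_object_t	*dv_private;	/* file-system private data */
} demo_vnode_t;

typedef struct demo_vnode_ops {
	int (*read)(demo_vnode_t *vn, size_t pos, void *buf, size_t *len);
} demo_vnode_ops_t;

/*
 * ZPL-style hook: apply file semantics (short reads at end of file),
 * then forward the clamped request to the object layer.
 */
static int
demo_zpl_read(demo_vnode_t *vn, size_t pos, void *buf, size_t *len)
{
	demo_object_t *obj = vn->dv_private;

	assert(obj != NULL);
	if (pos >= obj->do_size) {
		*len = 0;		/* reading at or past EOF returns 0 bytes */
		return (0);
	}
	if (*len > obj->do_size - pos)
		*len = obj->do_size - pos;
	return (demo_dmu_read(obj, pos, *len, buf));
}

static const demo_vnode_ops_t demo_zpl_ops = { .read = demo_zpl_read };
```

The point of the split is that file semantics live in the hook while the backend stays a plain object store. That boundary is where the host VFS's expectations are likely to differ from those of Solaris.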
Expectations from Mentors
I've spent some time with the Haiku filesystem API. I'm also currently working on making Michael Pfeiffer's zipfs usable on Haiku. Even so, I lack kernel-level experience, so I would almost certainly need some guidance there.
Footnotes
[1] http://zfs-fuse.net
[2] http://wiki.freebsd.org/ZFS
[3] http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf
[4] http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf
[5] libavl implements binary search trees. See: http://adtinfo.org/
[6] libnvpair implements name-value pairs.
[7] libuutil provides a doubly linked list and an AVL tree implementation.
[8] libumem is a userland slab allocator used to discover memory management bugs. See: https://secure.wikimedia.org/wikipedia/en/wiki/Libumem
[9] The OS-specific routines are abstracted away in zfs_context.h. Writing a compatibility layer would involve implementing this header.
External Links
- The ZFS Porting Guide
- The ZFS On FUSE Blog (old, but still useful)
- OpenSolaris Source Code
- Porting ZFS to FreeBSD (paper)
