gjournal: FreeBSD GEOM Journaling Layer
Name: Ivan Voras
Contact: [email protected]
School: University of Zagreb
Major: Electrical Engineering and Computing
Project: gjournal
Project Page: http://wikitest.freebsd.org/moin.cgi/gjournal/
Mentors: Pawel Jakub Dawidek and Poul-Henning Kamp
Mentoring Organization: The FreeBSD Project (http://www.freebsd.org/)
The aim of the gjournal project is to create a data journaling layer for FreeBSD's GEOM storage device layer. The idea of gjournal was born from the observation that FreeBSD doesn't currently have a journaling filesystem, but in an early phase the specification was extended to include copy-on-write (COW) functionality.
The GEOM subsystem is a modern kernel-based framework that manages pretty much all aspects of usage and control of storage devices. It's based on the concept of classes. A GEOM class can be a source of data or it can implement data transformations in a completely transparent way. All classes can be arbitrarily combined in a hierarchy in the form of a directed acyclic graph. Examples of existing GEOM classes are gmirror, which consumes two or more underlying class instances (called "geoms") and provides one that duplicates and distributes I/O requests to them (a RAID 1 layer); and geom_dev, which consumes all disk device geoms and creates entries in the /dev filesystem hierarchy for them.
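The stacking idea can be illustrated with a small user-space sketch. This is not the GEOM kernel API: the class names RamDisk and Mirror and their read/write methods are hypothetical stand-ins, chosen only to show how a transforming layer in the spirit of gmirror can consume several devices and present one.

```python
class RamDisk:
    """Hypothetical leaf device: a source of data backed by a bytearray."""

    def __init__(self, size):
        self.data = bytearray(size)

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload


class Mirror:
    """Hypothetical transforming layer in the spirit of gmirror (RAID 1)."""

    def __init__(self, *consumed):
        if len(consumed) < 2:
            raise ValueError("a mirror needs two or more underlying devices")
        self.consumed = consumed

    def read(self, offset, length):
        # Every copy holds the same data; this sketch always reads the first.
        return self.consumed[0].read(offset, length)

    def write(self, offset, payload):
        # Duplicate each write request to all underlying devices.
        for dev in self.consumed:
            dev.write(offset, payload)


disk_a, disk_b = RamDisk(16), RamDisk(16)
mirror = Mirror(disk_a, disk_b)
mirror.write(4, b"geom")
```

Because Mirror exposes the same read/write interface it consumes, another layer could be stacked on top of it without knowing what lies underneath, which is the transparency the paragraph above describes.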
The gjournal is implemented as a GEOM class that consumes two geoms and produces one. The first of the two consumed geoms is designated as a "data device" and the second as a "journal device." The basic idea is to transform write requests to the produced geom into sequential writes to the journal device. The class implements two kernel threads: A main worker thread to which I/O requests are delegated, and a helper thread used to asynchronously commit data from the journal to the data device.
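The transformation of scattered writes into sequential journal writes can be sketched as follows. This is a simplified user-space model under stated assumptions, not the gjournal code: the class name JournalSketch, the in-memory bytearrays standing in for the two devices, and the synchronous commit() (which replaces the helper thread) are all hypothetical.

```python
class JournalSketch:
    """Toy model: writes land sequentially in a journal, then get committed."""

    def __init__(self, data_size, journal_size):
        self.data = bytearray(data_size)        # stands in for the data device
        self.journal = bytearray(journal_size)  # stands in for the journal device
        self.head = 0                           # next free journal offset
        self.pending = []                       # (data_offset, journal_offset, length)

    def write(self, offset, payload):
        # Assumption: if the journal has no room left, commit before continuing.
        if self.head + len(payload) > len(self.journal):
            self.commit()
        start = self.head
        # Wherever the write was aimed, it is appended at the journal head.
        self.journal[start:start + len(payload)] = payload
        self.pending.append((offset, start, len(payload)))
        self.head += len(payload)

    def commit(self):
        # Replay journaled records onto the data device in arrival order.
        for data_off, j_off, length in self.pending:
            self.data[data_off:data_off + length] = self.journal[j_off:j_off + length]
        self.pending.clear()
        self.head = 0


j = JournalSketch(data_size=32, journal_size=16)
j.write(20, b"AA")    # aimed at offset 20 of the data device
j.write(3, b"BBB")    # aimed at offset 3 of the data device
journal_before_commit = bytes(j.journal[:5])
data_before_commit = bytes(j.data)
j.commit()
```

The two writes target distant offsets on the data device, yet they occupy adjacent bytes in the journal; the data device is touched only when commit() runs.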
In regular mode, the journal device is divided into two areas, one of which is used to record data until it's filled, at which point it's scheduled for asynchronous commit. A timed callout is scheduled that periodically triggers the swap/commit process. Two journal formats are implemented: one optimized for speed that emphasizes sequentiality of writes to the journal device, and another that conserves space by keeping metadata for the journal in one place.
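The two-area scheme might look roughly like the sketch below. It is a toy model under several assumptions: the names TwoAreaJournal, swap_and_commit, and tick are hypothetical, the commit runs inline rather than asynchronously, tick() merely stands in for the timed callout, and neither of the two on-disk journal formats is modeled.

```python
class TwoAreaJournal:
    """Toy model of a journal split into two areas that swap roles."""

    def __init__(self, data_size, area_capacity):
        self.data = bytearray(data_size)  # stands in for the data device
        self.areas = [[], []]             # each area holds (offset, payload) records
        self.used = [0, 0]                # bytes recorded in each area
        self.active = 0                   # index of the area currently recording
        self.area_capacity = area_capacity
        self.swaps = 0

    def write(self, offset, payload):
        # When the active area cannot take the record, swap areas first.
        if self.used[self.active] + len(payload) > self.area_capacity:
            self.swap_and_commit()
        self.areas[self.active].append((offset, bytes(payload)))
        self.used[self.active] += len(payload)

    def swap_and_commit(self):
        full = self.active
        self.active = 1 - self.active     # the other area takes over recording
        self.swaps += 1
        # Inline here; the real commit is described as asynchronous.
        for offset, payload in self.areas[full]:
            self.data[offset:offset + len(payload)] = payload
        self.areas[full].clear()
        self.used[full] = 0

    def tick(self):
        # Stand-in for the timed callout: flush even a partly filled area.
        if self.areas[self.active]:
            self.swap_and_commit()


j = TwoAreaJournal(data_size=16, area_capacity=4)
j.write(0, b"ab")
j.write(2, b"cd")             # area 0 is now exactly full
j.write(4, b"ef")             # forces a swap; area 0 is committed
after_third_write = bytes(j.data[:6])
j.tick()                      # periodic trigger commits area 1 as well
after_tick = bytes(j.data[:6])
```

Keeping two areas means new writes always have somewhere to go while the previously filled area is being drained to the data device.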
Unfortunately, the most used FreeBSD filesystem, UFS, cannot be used with gjournal, because this layer doesn't distinguish metadata (for example, information about deleted but still referenced files), so a fsck run is still required to correct references. The COW facility is functional and can be used for experimentation with filesystems.