Sunday, December 16, 2007

My ideal filesystem

I have a file server. It has multiple disks of various sizes. Some are old and likely to fail soon, others are brand new and will hopefully fail less soon. The file server contains nearly half a terabyte of data. Some data are big, others are small. Some are important, others I could do without.

The problem: it takes a lot of manual labour to manage all this. I need to decide which data goes where, keep an eye on the free space of each drive, make sure backups are made regularly, shuffle around data when I add a new disk, etcetera. Highly inconvenient.

The solution: My Ideal Filesystem, MIFS for short. Unlike other filesystems, MIFS is not stored on a single disk (or partition, if you like): it is spread out over multiple disks or partitions. Unlike a filesystem on top of RAID or LVM, MIFS actually has knowledge of the underlying structure of its disks (or partitions) and uses this knowledge to its advantage.

MIFS presents itself to the operating system simply as one filesystem. You can therefore mount it at a single mount point. There is only one small extension to the interface that normal filesystems expose to the OS: you can tag a file with a number that indicates the ‘importance’ of the file, that is, how bad it would be if the file got lost. So I can tag, for example, a thesis that I'm working on as very important, whereas a television series that I downloaded can easily be downloaded again and is therefore less important. There is also a number that specifies a ‘minimum redundancy’ for the file: the smallest number of disks that must hold a copy of it. If no number is specified, it is inherited from the parent directory.
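To make the inheritance rule concrete, here is a minimal sketch in Python. Since MIFS does not exist, everything in it is hypothetical: the function name `effective_tag`, the idea of keeping tags in a plain dictionary keyed by path, and the fallback default of 1 are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def effective_tag(path: str, tags: dict[str, int], default: int = 1) -> int:
    """Return the tag that applies to `path`.

    `tags` maps absolute paths to explicitly set numbers (hypothetical
    storage; a real filesystem would keep this in its metadata).
    The file's own tag wins; otherwise the nearest tagged ancestor
    directory is used; otherwise `default` (an assumed fallback).
    """
    p = PurePosixPath(path)
    # Check the path itself first, then each parent up to the root.
    for candidate in (p, *p.parents):
        value = tags.get(str(candidate))
        if value is not None:
            return value
    return default


# Hypothetical 'minimum redundancy' tags.
min_redundancy = {"/": 1, "/home/me/thesis": 3}

print(effective_tag("/home/me/thesis/chapter1.tex", min_redundancy))  # 3
print(effective_tag("/srv/tv/episode01.avi", min_redundancy))         # 1
```

The same lookup would work for the ‘importance’ number; only the dictionary differs.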

Additionally, the disks comprising the filesystem each have a tag with their relative reliability, so you can indicate which disks are likely to fail soon. This number might be extracted from the SMART data that the disk itself presents, combined with a database of reliabilities of different disk models, if it is possible to build a database like that.

Now when I write a file to this filesystem, MIFS will decide what to do with it, depending on its importance. When the array is mostly empty, MIFS can afford to write files to every one of the disks, achieving maximum redundancy and complete recovery even if all disks but one fail. When the array fills up, copies of the less important files will be erased from some of the disks to make room for more important files. With a ‘minimum redundancy’ of three, my important thesis will always be on at least three of the disks. The filesystem is only full when all files are at their minimum redundancy level.
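The policy described here can be sketched roughly as follows. This is only an illustration under strong simplifying assumptions, not a design: it treats the array as one pool of `total_capacity` bytes and merely counts copies, ignoring which disk holds which copy and the fact that disks differ in size. The names `File` and `plan_copies` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    size: int         # bytes per copy
    importance: int   # higher means worse to lose
    min_copies: int   # the 'minimum redundancy' tag


def plan_copies(files: list[File], n_disks: int, total_capacity: int) -> dict[str, int]:
    """Decide how many copies of each file to keep.

    Start with one copy per disk (maximum redundancy), then drop copies
    of the least important files until everything fits. Raises OSError
    when every file is already at its minimum and the data still does
    not fit, i.e. the filesystem is full.
    """
    for f in files:
        if f.min_copies > n_disks:
            raise ValueError(f"{f.name}: needs more copies than there are disks")

    copies = {f.name: n_disks for f in files}
    used = sum(f.size * n_disks for f in files)

    # Least important first; among equals, shed the biggest file first
    # (the tie-break is an assumption, the post does not specify one).
    order = sorted(files, key=lambda f: (f.importance, -f.size))

    while used > total_capacity:
        victim = next((f for f in order if copies[f.name] > f.min_copies), None)
        if victim is None:
            raise OSError("filesystem full: all files at minimum redundancy")
        copies[victim.name] -= 1
        used -= victim.size

    return copies


files = [
    File("thesis.tex", size=1, importance=10, min_copies=3),
    File("series.avi", size=100, importance=1, min_copies=1),
]
print(plan_copies(files, n_disks=4, total_capacity=210))
# {'thesis.tex': 4, 'series.avi': 2}
```

In this example the thesis keeps a copy on all four disks, while the television series gives up two of its copies to make everything fit. A real implementation would also have to pick the disks for each copy, presumably favouring the more reliable ones for important files.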

One could even go a step further, and put some of the disks in a machine across a network or even the internet. That would essentially give you automatic, real-time backups in case one of the machines gets fried along with all of its disks.

MIFS has only one huge drawback: it does not exist. Of course there are many technical difficulties to be overcome when implementing MIFS; I am not blind to that. But I think it should be possible. Anyone who writes this filesystem will earn my eternal gratitude.

2 comments:

Isaac said...

I dreamed of this file system too.

And surprisingly it shares many features with MIFS!

I think it's time for implementing MIFS. Maybe in next version of Windows ;)

Unknown said...

ZFS or BTRFS?