$3FileSystems It's Crap !

#1 Sunday 6th April 2008 19:35:24 Week 14

BMX

Why we care about file systems

Computer platform advocacy can bubble up in the strangest places. In a recent interview at a conference in Australia, Linux creator Linus Torvalds got the Macintosh community in an uproar when he described Mac OS X's file system as "complete and utter crap, which is scary."

What did he mean? What is a "file system" anyway, and why would we care why one is better than another? At first glance, it might seem that file systems are boring technical widgetry that would never impact our lives directly, but in fact, the humble file system has a huge influence on how we use and interact with computers.

This article starts by defining what a file system is and what it does. Then we'll look back at how various file systems evolved and why new ones were introduced, including the file systems used by individual operating systems. Finally, we'll take a brief glance into our temporal vortex and see how file systems might change in the future.

What is a file system?

Briefly put, a file system is a clearly defined method that the computer's operating system uses to store, catalog, and retrieve files. Files are central to everything we use a computer for: all applications, images, movies, and documents are files, and they all need to be stored somewhere. For most computers, this place is the hard disk drive, but files can exist on all sorts of media: flash drives, CD and DVD discs, or even tape backup systems.

A file system needs to keep track not only of the bits that make up the file itself and where they are logically placed on the hard drive, but also of information about the file. The most important thing it has to store is the file's name. Without the name, it would be nearly impossible for humans to find the file again. The file system also has to know how to organize files in a hierarchy, again for the benefit of those pesky humans. This hierarchy is usually called a directory structure. The last thing the file system has to worry about is metadata.

Metadata

Metadata literally means "data about data" and that's exactly what it is. While metadata may sound relatively recent and modern, all file systems right from the very beginning had to store at least some metadata along with the file and file name. One important bit of metadata is the file's modification date—not always necessary for the computer, but again important for those humans to know so that they can be sure they are working on the latest version of a file. A bit of metadata that is unimportant to people—but crucial to the computer—is the exact physical location (or locations) of the file on the storage device.

Other examples of metadata include attributes, such as hidden or read-only, that the operating system uses to decide how to display the file and who gets to modify it. Multiuser operating systems store file permissions as metadata. Modern file systems go absolutely nuts with metadata, adding all sorts of crazy attributes that can be tailored for individual types of files: artist and album names for music files, or tags for photos that make them easier to sort later.
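
To make the idea concrete, here is a minimal Python sketch that asks the operating system for some of the metadata described above: name, size, modification date, and permissions. The `describe` helper and the scratch file are made up for illustration; only the standard library calls (`os.stat`, `os.chmod`, `stat.filemode`) are real, and the exact permission string will vary by platform.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a few pieces of metadata the file system keeps for `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
        "permissions": stat.filemode(info.st_mode),
        # True when the owner's write bit is cleared.
        "read_only": not (info.st_mode & stat.S_IWUSR),
    }


# Create a scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    scratch_path = handle.name

os.chmod(scratch_path, stat.S_IRUSR)  # mark the file read-only for its owner
metadata = describe(scratch_path)
for key, value in metadata.items():
    print(f"{key}: {value}")

# Restore write access and clean up the scratch file.
os.chmod(scratch_path, stat.S_IRUSR | stat.S_IWUSR)
os.remove(scratch_path)
```

None of these values live inside the file's five bytes of content; the file system stores them alongside it.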

Advanced file system features

As operating systems have matured, more and more features have been added to their file systems. More metadata options are one such improvement, but there have been others, such as the ability to index files for faster searches, new storage designs that reduce file fragmentation, and more robust error-correction abilities. One of the biggest advances in file systems has been the addition of journaling, which keeps a log of changes that the computer is about to make to each file. This means that if the computer crashes or the power goes out halfway through the file operation, it will be able to check the log and either finish or abandon the operation quickly without corrupting the file. This makes restarting the computer much faster, as the operating system doesn't have to scan the entire file system to find out if anything is out of sync.
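
The journaling idea can be sketched in a few lines of Python. This is a toy model, not how any real file system lays out its journal: the `JournaledStore` class, its in-memory "disk," and the `crash_after` switch are all invented here to simulate a power cut at different moments.

```python
class JournaledStore:
    """Toy 'disk' with a write-ahead journal (illustrative only)."""

    def __init__(self):
        self.disk = {}     # stands in for file contents on the drive
        self.journal = []  # stands in for the on-disk log

    def write(self, name, data, crash_after=None):
        entry = {"name": name, "data": data, "committed": False}
        self.journal.append(entry)      # 1. log the intended change
        if crash_after == "log":
            return                      # simulated crash: intent never committed
        entry["committed"] = True       # 2. mark the log entry as complete
        if crash_after == "commit":
            return                      # simulated crash: change never applied
        self.disk[name] = data          # 3. apply the change to the "disk"
        self.journal.remove(entry)      # 4. retire the log entry

    def recover(self):
        """Finish committed operations and abandon incomplete ones.

        Only the journal is scanned, never the whole disk.
        """
        for entry in self.journal:
            if entry["committed"]:
                self.disk[entry["name"]] = entry["data"]
        self.journal.clear()


store = JournaledStore()
store.write("a.txt", "v1")
store.write("a.txt", "v2", crash_after="commit")   # logged, but not applied
store.write("b.txt", "partial", crash_after="log")  # never committed
store.recover()
print(store.disk)  # {'a.txt': 'v2'}
```

After recovery, the committed update to `a.txt` has been finished and the half-logged `b.txt` has been abandoned, which is the "either finish or abandon" behavior described above.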

Most people remember the 1984 Macintosh with a kind of romantic haze, forgetting the huge limitations of the original unit. The Mac came with a single floppy drive back when PC users were starting to get used to hard disks. The original file system was called the Macintosh File System, or MFS, and had a limit of 20MB and 4,096 files. It both had directories and didn't have them. The user could create graphical "folders" and drag files into them, but they would not show up in the Open/Save dialog boxes of applications. Instead, all the file and directory information was stored in a single "Empty Folder" that would disappear if modified in any way, only to be replaced by a new "Empty Folder." This worked well for floppy disks, but really slowed down performance with hard drives. File names could be 63 characters long.

[Image: The 1984 Macintosh. The original 128K Macintosh, shown with optional second floppy drive]

MFS was replaced in 1985 by a system with proper hierarchical directories. Because of this, it was called the Hierarchical File System, or HFS. For some reason, file names were now limited to 31 characters, which was just short enough to be annoying. The file, directory, and free space information was stored in a B-Tree, a type of balanced tree structure that allows for fast sorting and retrieval of information. HFS used 512-byte clusters with a 16-bit pointer, so the maximum size of a drive was 32MB (512 bytes times 65,536). Later versions upped the pointer to 32 bits and could thus access 2TB at once.

MFS and HFS introduced an innovative way of handling files, called "forks." Instead of storing metadata in a separate place (such as the place directories are stored), HFS made each file into two files: the file itself (the "data fork") and an invisible "resource fork" that contained structured data, including information about the file, such as its icon. Resource forks were used for far more than metadata, though; for example, they held details of an application's interface and executable code in pre-PowerPC Macs. Like prongs on a fork, the data and resource forks traveled around together all the time, until the file was sent to another type of computer that didn't know about forks. Fortunately, back then computers were very snobby and never talked to each other, so this was rarely a problem.

Instead of using a puny three-letter file extension to determine the file type, HFS used a massively huge four-letter "type code" and another creator code, which were stored in the file system's metadata, treated as a peer to information such as the file's creation date.

HFS didn't mess around with slashes or backslashes to separate directory names. Instead, it used a colon (:) and then made sure that the humans would never get to see this character anywhere in the system, until they tried to include one in a file name.

All kidding aside, HFS was the first instance in history where a file system was designed specifically to adapt to the needs of the then-new graphical user interface. The whole philosophy of the Macintosh's GUI design was to hide unimportant details from the user. This "human-centric" design was intended to help people focus more on their work than the technical details of the file system.

Of course, nothing is perfect, and all systems that try to abstract away the nasty technical bits occasionally run afoul of what Joel Spolsky calls the Law of Leaky Abstractions. When something broke, such as the loss of the resource fork when a file was sent to another computer and back to the Macintosh, it was not always clear what to do to fix the problem.

HFS had some other technical limitations that could occasionally leak out. All the records for files and directories were stored in a single location called the Catalog File, and only one program could access this file at once. There were some benefits of this approach, like very fast file searches. The first Macintoshes did not have multitasking, so this was not a problem, but when multitasking was added later, it caused problems with certain programs "hogging the system." The file could also become corrupt, which could potentially render the entire file system unusable. Other file systems stored file and directory information in separate places, so even if one directory became corrupt the rest of the system was still accessible.

As was the case for other file systems covered so far, the number of clusters (known as "blocks") was fixed to a 16-bit number, so there could only be 65,535 blocks on a single partition, no matter what size it was. Because HFS (like most file systems) could only store files in whole blocks, the larger blocks required on bigger drives meant a lot of wasted space: even a tiny 1KB file would take up a full 16KB on a 1GB drive. For an 8GB drive the problem got eight times worse, and so on.
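
The arithmetic behind those figures can be checked with a short Python sketch. It makes one simplifying assumption that is not in the text: the block count is rounded up to 65,536 (2 to the 16th power) so that the numbers come out round, and the block size is taken to be the volume size divided by that count. Real volumes would round somewhat differently. The function names are made up for illustration.

```python
BLOCK_COUNT = 2 ** 16  # assumption: 16-bit limit rounded to 65,536 for round numbers
KB = 1024
GB = 1024 ** 3


def block_size(volume_bytes):
    """Size of one block when the volume is split into BLOCK_COUNT blocks."""
    return volume_bytes // BLOCK_COUNT


def space_on_disk(file_bytes, volume_bytes):
    """Bytes a file really occupies once rounded up to whole blocks."""
    size = block_size(volume_bytes)
    blocks_needed = max(1, -(-file_bytes // size))  # ceiling division
    return blocks_needed * size


for volume_gb in (1, 8):
    used = space_on_disk(1 * KB, volume_gb * GB)
    print(f"{volume_gb}GB volume: a 1KB file occupies {used // KB}KB")
```

Under that assumption, the 1KB file occupies 16KB on the 1GB volume and 128KB on the 8GB volume, matching the "eight times worse" claim.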

Apple fixed this last problem in 1998 with HFS+, which shipped with Mac OS 8.1. HFS+ used a 32-bit number for numbering blocks, and also allowed 255-character file names, although versions of "classic" Mac OS (9.2.2 and earlier) only supported 31-character file names. Oddly, Microsoft Office for the Mac would not support names longer than 31 characters until 2004.

source : arste…future file systems ars 2
