Introducing YAFFS, the first NAND-specific flash file system
by Charles Manning (Sept. 20, 2002)
YAFFS stands for 'yet another flash file system' (see footnote 1). As far as I am aware, YAFFS is the only file system, under any operating system, that has been designed specifically for use with NAND flash. YAFFS is thus designed to work within the constraints of, and exploit the features of, NAND flash to maximize performance. YAFFS uses journaling, error correction, and verification techniques tuned to the way NAND typically fails to enhance robustness. The result is a file system that exploits low-cost NAND chips and is both fast and robust. YAFFS is highly portable and runs under Linux, uClinux, and Windows CE. YAFFS is an open source project.
Why NAND flash?
Embedded and mobile systems are increasingly using NAND flash for storage because it has various advantages over other storage technologies. As always though, life is a compromise and those advantages come with some limitations that need to be addressed to provide a robust flash file system.
Hard disks are not a viable storage option for many embedded and handheld systems because they are too big, too fragile, and use too much power. For some years now, people have been using plain old NOR flash for file system storage. JFFS and JFFS2 do an excellent job of this for Linux. For storage applications, however, NOR flash is not ideal because it is not very dense (i.e., not much storage per chip), is costly, and is slow to write. NAND flash, on the other hand, is low cost, dense, and fast to write, but it has other limitations. The most common consumer usage of NAND flash is in the form of SmartMedia cards, which are simply NAND chips bonded to a carrier card. Consumer use of SmartMedia is driving down NAND prices, while driving up densities.
The standard SmartMedia file system is FAT-based and is thus susceptible to corruption due to power failures, crashes, and other acts of demonic forces. This raises robustness concerns, particularly in embedded systems where a corrupted file system can kill the device (or do even worse collateral damage). What's really needed is a journaling file system that is able to work around the limitations of NAND and exploit it for best performance.
The lead-up to YAFFS started with an investigation into modifying the JFFS2 flash file system to work with NAND flash (see footnote 2) for some Aleph One customers. At first, it seemed reasonable that the best way to get a file system for NAND flash would be just 'tweaking' an existing flash file system. On deeper investigation, though, it became apparent that designing a new file system specifically for NAND might have some benefits.
Comparison of NOR and NAND Flash technologies
Designing an effective flash file system is a difficult task, and you need to jump through many hoops of fire to get a useful system. Although both are called 'Flash', NOR and NAND flash have very different properties. Thus, a flash file system that works with NOR incorporates various mechanisms that are not required for NAND, and NAND needs extra mechanisms not required for NOR.
For example, garbage collection performance is largely determined by the erasure time. NOR erases very slowly, so an effective NOR garbage collection strategy is relatively complex and limits the design options. In comparison, NAND erases very quickly, so these limitations don't apply.
Another major difference is that NAND is shipped with marked bad blocks on the device, while NOR chips are shipped defect free. Thus, one expects to encounter some failures in NAND and should design accordingly.
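To make "design accordingly" concrete, here is a minimal sketch of block allocation that skips blocks marked bad and retires blocks that fail later. It is not taken from the YAFFS source: the names `block_table`, `alloc_block`, and `retire_block`, and the eight-block geometry, are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tiny device: 8 erase blocks. */
enum { EMU_NUM_BLOCKS = 8 };

/* Per-block state kept in RAM (an assumption of this sketch). */
struct block_table {
    bool bad[EMU_NUM_BLOCKS];    /* factory-marked or retired at run time */
    bool in_use[EMU_NUM_BLOCKS]; /* already handed out by alloc_block()   */
};

/* Return the first block that is neither bad nor in use, or -1 if none. */
int alloc_block(struct block_table *t)
{
    assert(t != NULL);
    for (int b = 0; b < EMU_NUM_BLOCKS; b++) {
        if (!t->bad[b] && !t->in_use[b]) {
            t->in_use[b] = true;
            return b;
        }
    }
    return -1;
}

/* Take a block out of service, e.g. after a failed write or erase. */
void retire_block(struct block_table *t, int b)
{
    assert(t != NULL && b >= 0 && b < EMU_NUM_BLOCKS);
    t->bad[b] = true;
    t->in_use[b] = false;
}
```

The point of the sketch is only that bad blocks are treated as a normal condition handled by the allocator, rather than as an exceptional error.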
On top of this, we also realized that typical NAND flash arrays would be far larger than typical NOR flash arrays (see footnote 3). This raised concerns as to how well JFFS2 would scale up, particularly with regard to RAM usage, mounting/scanning time, and garbage collection time.
This led me to believe that it would be worth designing a file system specifically for NAND flash, exploiting the features of NAND to simplify the design, and catering to NAND specific limitations directly -- rather than trying to 'bend' NAND to work with an existing file system architecture, or vice versa.
The first conceptual design for YAFFS was drawn up in late December 2001. Once the customers bought in to the idea, things progressed rapidly. The first line of code was cut in January 2002. By May 2002 we had the first publicly available CVS code. We now have at least six different companies using or experimenting with YAFFS in embedded or handheld systems. YAFFS is already very stable and is rapidly becoming a very useful file system. We expect to see YAFFS-based products shipped before the end of the year.
Rather than develop and debug code inside the Linux kernel, I took the approach of developing the file system 'guts' algorithms -- the most complex part -- in a user space program. To achieve this, the code is split into four sections:
- yaffs_guts.c: The file system algorithms. This code is fully portable C.
- yaffs_fs.c: Interface layer to the Linux VFS. Can be replaced with a test harness for user space development of yaffs_guts.c.
- NAND interface: wrapper layer between yaffs_guts and the NAND memory access functions, e.g., it calls the Linux mtd layer or the RAM emulation layer.
- Portability functions: wrapper functions for services such as memory allocation.
This modular design has provided a lot of flexibility for testing and development. A user space program with DDD is a far more rapid development environment than continually hanging the Linux kernel. Also, a RAM-based NAND emulation layer provides an excellent way to debug the file system in-kernel without special hardware.
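The layering described above can be sketched roughly as follows. This is an illustrative outline, not the actual YAFFS interface: the `nand_ops` structure, the `ram_nand` emulator, the function names, and the geometry constants are all assumptions. The file system core would call only through `nand_ops`, so the same core code can sit on top of real hardware access functions or this RAM-backed emulation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical geometry, chosen only to keep the example small. */
enum {
    EMU_PAGE_SIZE = 512,
    EMU_PAGES_PER_BLOCK = 32,
    EMU_NUM_BLOCKS = 4,
    EMU_NUM_PAGES = EMU_PAGES_PER_BLOCK * EMU_NUM_BLOCKS
};

/* Hypothetical NAND access interface: the core sees only these calls. */
struct nand_ops {
    int (*read_page)(void *ctx, int page, unsigned char *buf);
    int (*write_page)(void *ctx, int page, const unsigned char *buf);
    int (*erase_block)(void *ctx, int block);
    void *ctx; /* backend-private state */
};

/* RAM-backed emulation of a small NAND device. */
struct ram_nand {
    unsigned char data[EMU_NUM_PAGES][EMU_PAGE_SIZE];
};

static int ram_read_page(void *ctx, int page, unsigned char *buf)
{
    struct ram_nand *dev = ctx;
    if (page < 0 || page >= EMU_NUM_PAGES)
        return -1;
    memcpy(buf, dev->data[page], EMU_PAGE_SIZE);
    return 0;
}

static int ram_write_page(void *ctx, int page, const unsigned char *buf)
{
    struct ram_nand *dev = ctx;
    if (page < 0 || page >= EMU_NUM_PAGES)
        return -1;
    /* Programming flash can only clear bits (1 -> 0), so AND the data in. */
    for (int i = 0; i < EMU_PAGE_SIZE; i++)
        dev->data[page][i] &= buf[i];
    return 0;
}

static int ram_erase_block(void *ctx, int block)
{
    struct ram_nand *dev = ctx;
    if (block < 0 || block >= EMU_NUM_BLOCKS)
        return -1;
    /* Erasing sets every bit in the block back to 1. */
    memset(&dev->data[block * EMU_PAGES_PER_BLOCK], 0xFF,
           sizeof dev->data[0] * EMU_PAGES_PER_BLOCK);
    return 0;
}

/* Start with a fully erased device and hook up the emulation backend. */
void ram_nand_init(struct ram_nand *dev, struct nand_ops *ops)
{
    assert(dev != NULL && ops != NULL);
    memset(dev->data, 0xFF, sizeof dev->data);
    ops->read_page = ram_read_page;
    ops->write_page = ram_write_page;
    ops->erase_block = ram_erase_block;
    ops->ctx = dev;
}
```

A hardware backend would fill in the same three function pointers with calls into the real NAND driver, leaving the code above the interface unchanged.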
A further spin-off has been that YAFFS is portable to other operating systems. For example, a port of YAFFS for Windows CE was readily achieved by writing a new wrapper layer to hook up to the WinCE file system manager and new NAND access layer. yaffs_guts.c remains 100% portable (see footnote 4). YAFFS is thus likely to be easily portable to other OSes too.
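Footnote 4 mentions one such portability hook: a configurable name comparison. A minimal sketch of that idea, under assumptions, might look like the following. The names `fs_config`, `name_cmp_fn`, `name_cmp_nocase`, and `names_match` are hypothetical, and the case-insensitive comparison is written out by hand here rather than relying on a platform-specific library routine.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Same shape as strcmp: negative, zero, or positive result. */
typedef int (*name_cmp_fn)(const char *a, const char *b);

/* Portable case-insensitive comparison (ASCII semantics via tolower). */
int name_cmp_nocase(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
}

/* Hypothetical per-port configuration: each OS wrapper picks a comparator. */
struct fs_config {
    name_cmp_fn name_cmp;
};

/* Core code compares names only through the configured function. */
int names_match(const struct fs_config *cfg, const char *a, const char *b)
{
    assert(cfg != NULL && cfg->name_cmp != NULL);
    return cfg->name_cmp(a, b) == 0;
}
```

In this sketch, a Linux port would set `name_cmp` to `strcmp`, while a Windows CE port would set it to the case-insensitive function.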
Removing surplus constraints has led to a simpler design that has improved performance and robustness, and has also reduced development time.
YAFFS reads and writes pretty fast. There is no significant hit due to garbage collection -- i.e., in most situations garbage collection overheads are not likely to impact usage.
It is difficult to give meaningful benchmarks given non-standard hardware. However, here are some results achieved on a StrongARM-based board with 128MB of NAND flash: write speed is approximately 800KB/sec, and read speed is approximately 1.5MB/sec. In a weekend stress test I wrote, triple-verified, then deleted 25GB of file data without a single bit of file data being corrupted. Note that 25GB is more data than most embedded or handheld devices will write in their design lifetime.
Although YAFFS already performs well, there is still a lot of room for improvement. Development is continuing to enhance performance and robustness as well as add support software such as utilities, bootloader support, and analysis tools.
I expect to see ports to other OSes too. Increasing the user base of YAFFS is an important way to get more eyes over the code and test it under a wider set of scenarios.
The YAFFS development community is growing well, with good involvement from people all over the world. YAFFS is realizing all the benefits open source projects hope for and will, we expect, continue to grow.
Want more info about YAFFS?
The YAFFS project home page provides further investigation reports, YAFFS documentation, and instructions for accessing CVS and the YAFFS mailing list.
Footnotes
1. The YAFFS name came from the first concept document proposing yet another flash file system. Somehow that name just stuck.
2. NAND support has since been added to JFFS2 by others. The YAFFS, JFFS2, and mtd communities cooperate to provide a rich, flexible set of solutions to draw from. As is the nature of open source projects, JFFS2 code was used as a reference for some parts of the YAFFS development, particularly for figuring out the VFS interface. Google for further info on JFFS2 and mtd.
3. A 32MB NOR array would be considered large. Single NAND chips of 256MB are already available. YAFFS is undergoing testing on NAND arrays of 512MB or larger.
4. Some features were added to the guts to enhance WinCE usage. For example, the string comparison function was made configurable to be either strcmp or stricmp because Windows file names are case insensitive.
About the author: Charles Manning (aka The Embedded Janitor) lives near Christchurch, New Zealand, and has been developing and mopping up embedded systems for twenty years. His hobbies include fly tying and spinning wool.