Welcome to YAFFS, the first file system developed specifically for NAND flash.

It is now YAFFS2 - the original YAFFS (YAFFS1) only supports 512-byte
page NAND and is now deprecated. YAFFS2 supports 512b page NAND in
'YAFFS1 compatibility' mode (CONFIG_YAFFS_YAFFS1) and 2K or larger
page NAND in YAFFS2 mode (CONFIG_YAFFS_YAFFS2).


A note on licensing
-------------------
YAFFS is available under the GPL and via alternative licensing
arrangements with Aleph One. If you're using YAFFS as a Linux kernel
file system then it will be under the GPL. For use in other situations
you should discuss licensing issues with Aleph One.


Terminology
-----------
Page - NAND addressable unit (normally 512b or 2Kbyte size) - can
       be read, written, marked bad. Has associated OOB.
Block - Erasable unit. 64 Pages. (128K on 2K NAND, 32K on 512b NAND)
OOB - 'spare area' of each page for ECC, bad block markers and YAFFS
      tags. 16 bytes per 512b - 64 bytes for 2K page size.
Chunk - Basic YAFFS addressable unit. Same size as Page.
Object - YAFFS Object: File, Directory, Link, Device etc.

YAFFS design
------------

YAFFS is a log-structured filesystem. It is designed particularly for
NAND (as opposed to NOR) flash, to be flash-friendly, robust due to
journalling, and to have low RAM and boot time overheads. File data is
stored in 'chunks'. Chunks are the same size as NAND pages. Each page
is marked with file id and chunk number. These marking 'tags' are
stored in the OOB (or 'spare') region of the flash. The chunk number
is determined by dividing the file position by the chunk size. Each
chunk has a number of valid bytes, which equals the page size for all
except the last chunk in a file.

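As a rough illustration of the chunk arithmetic described above, here
is a minimal C sketch. The names (chunk_pos, locate_chunk,
valid_bytes) are made up for this example and are not taken from the
YAFFS sources. The sketch also assumes that data chunks are numbered
from 1, because chunk ID 0 is described elsewhere in this document as
the file header; treat that offset as an assumption.

```c
#include <stdint.h>

/* Where a byte of file data lives, expressed in chunk terms. */
struct chunk_pos {
    uint32_t chunk_id;  /* assumed 1-based: chunk ID 0 is the header */
    uint32_t offset;    /* byte offset inside that chunk */
};

/* Map a file position to a chunk. chunk_size equals the NAND page size. */
struct chunk_pos locate_chunk(uint64_t file_pos, uint32_t chunk_size)
{
    struct chunk_pos p;
    p.chunk_id = (uint32_t)(file_pos / chunk_size) + 1;
    p.offset   = (uint32_t)(file_pos % chunk_size);
    return p;
}

/* Number of valid bytes in data chunk chunk_id of a file_len-byte file. */
uint32_t valid_bytes(uint64_t file_len, uint32_t chunk_size,
                     uint32_t chunk_id)
{
    uint64_t start, left;

    if (chunk_id == 0)
        return 0;               /* the header chunk carries no file data */
    start = (uint64_t)(chunk_id - 1) * chunk_size;
    if (start >= file_len)
        return 0;               /* chunk lies beyond the end of the file */
    left = file_len - start;
    /* Every chunk is full except possibly the last one. */
    return left < chunk_size ? (uint32_t)left : chunk_size;
}
```

For a 1300-byte file on 512b pages this gives two full chunks and a
last chunk holding the remaining 276 bytes.
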
File 'headers' are stored as the first page in a file, marked as a
different type to data pages. The same mechanism is used to store
directories, device files, links etc. The first page describes which
type of object it is.

YAFFS2 never re-writes a page, because the spec of NAND chips does not
allow it. (YAFFS1 used to mark a block 'deleted' in the OOB). Deletion
is managed by moving deleted objects to the special, hidden 'unlinked'
directory. These records are preserved until all the pages containing
the object have been erased (we know when this happens by keeping a
count of chunks remaining on the system for each object - when it
reaches zero the object really is gone).

When data in a file is overwritten, the relevant chunks are replaced
by writing new pages to flash containing the new data but the same
tags.

Pages are also marked with a short (2 bit) serial number that
increments each time the page at this position is replaced. The
reason for this is that if power loss/crash/other act of demonic
forces happens before the replaced page is marked as discarded, it is
possible to have two pages with the same tags. The serial number is
used to arbitrate.

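One way the 2-bit serial number could be used to arbitrate is
sketched below in C. This is an illustration only: serial_compare is
a hypothetical helper, and it assumes that two pages with the same
tags are never more than one replacement apart, so the newer serial
is always the older one plus 1, modulo 4.

```c
#include <stdint.h>

/*
 * Compare two 2-bit serial numbers.
 * Returns  1 if a is the newer copy,
 *         -1 if b is the newer copy,
 *          0 if they are equal or two steps apart (cannot decide).
 */
int serial_compare(uint8_t a, uint8_t b)
{
    /* Difference modulo 4, so that 0 counts as the successor of 3. */
    unsigned diff = ((a & 3u) - (b & 3u)) & 3u;

    if (diff == 1)
        return 1;
    if (diff == 3)
        return -1;
    return 0;
}
```

With this rule a page carrying serial 0 beats a page carrying serial
3, since the counter has wrapped around.
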
A block containing only discarded pages (termed a dirty block) is an
obvious candidate for garbage collection. Otherwise valid pages can be
copied off a block thus rendering the whole block discarded and ready
for garbage collection.

In theory you don't need to hold the file structure in RAM... you
could just scan the whole flash looking for pages when you need them.
In practice though you'd want better file access times than that! The
mechanism proposed here is to have a list of __u16 page addresses
associated with each file. Since there are 2^18 pages in a 128MB NAND,
a __u16 is insufficient to uniquely identify a page but it does
identify a group of 4 pages - a small enough region to search
exhaustively. This mechanism is clearly expandable to larger NAND
devices - within reason. The RAM overhead with this approach is approx
2 bytes per page - 512kB of RAM for a whole 128MB NAND.

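The numbers above can be checked with a small C sketch: 2^18 pages
addressed through 16-bit entries leaves 2 bits unresolved, i.e. a
group of 4 pages to search. Everything named here (page_tags,
page_to_group, find_page, table_bytes) is hypothetical, and the oob
array merely stands in for reading tags from the spare area of real
NAND pages.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP_SHIFT 2u                  /* 18 address bits - 16 stored bits */
#define GROUP_SIZE  (1u << GROUP_SHIFT) /* 4 pages per group */

/* Simplified tags: just enough to recognise a page. */
struct page_tags {
    uint32_t obj_id;
    uint32_t chunk_id;
};

/* The 16-bit value kept in RAM for a given physical page. */
uint16_t page_to_group(uint32_t page)
{
    return (uint16_t)(page >> GROUP_SHIFT);
}

/*
 * Search the 4 pages of a group for the wanted object/chunk pair.
 * oob[i] holds the tags of physical page i (a stand-in for NAND reads).
 * Returns the physical page number, or -1 if no page in the group matches.
 */
int32_t find_page(const struct page_tags *oob, uint16_t group,
                  uint32_t obj_id, uint32_t chunk_id)
{
    uint32_t first = (uint32_t)group << GROUP_SHIFT;
    uint32_t i;

    for (i = 0; i < GROUP_SIZE; i++) {
        const struct page_tags *t = &oob[first + i];
        if (t->obj_id == obj_id && t->chunk_id == chunk_id)
            return (int32_t)(first + i);
    }
    return -1;
}

/* Worst-case RAM used by the lists: one 16-bit entry per page. */
size_t table_bytes(size_t n_pages)
{
    return n_pages * sizeof(uint16_t);
}
```

For a 128MB device with 512b pages, table_bytes(1 << 18) comes to
524288 bytes, which matches the 512kB figure quoted above.
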
Boot-time scanning to build the file structure lists only requires
one pass reading NAND. On a proper shutdown the current RAM summary
of the filesystem status is saved to flash; this is called
'checkpointing'. This saves re-scanning the flash on startup, and gives
huge boot/mount time savings.

YAFFS regenerates its state by 'replaying the tape' - i.e. by
scanning the chunks in their allocation order (i.e. block sequence ID
order), which is usually different from the media block order. Each
block is still only read once - starting from the end of the media and
working back.

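A minimal sketch of putting blocks into scan order is shown below. It
assumes the sequence ID of every block has already been read into an
array, and it sorts newest first so that the scan works backwards
through the log. The struct and function names are invented for this
example and are not the ones used by YAFFS itself.

```c
#include <stdint.h>
#include <stdlib.h>

struct block_info {
    uint32_t block;   /* physical block number on the media */
    uint32_t seq;     /* sequence ID recorded when the block was allocated */
};

/* qsort comparator: highest sequence ID (most recently allocated) first. */
static int by_seq_desc(const void *a, const void *b)
{
    const struct block_info *x = a;
    const struct block_info *y = b;

    if (x->seq < y->seq)
        return 1;
    if (x->seq > y->seq)
        return -1;
    return 0;
}

/* Reorder the array so that blocks can be scanned from newest to oldest. */
void order_for_scan(struct block_info *blocks, size_t n)
{
    qsort(blocks, n, sizeof blocks[0], by_seq_desc);
}
```

After sorting, each block is visited exactly once, in an order that
generally differs from the physical block order.
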
YAFFS tags in YAFFS1 mode:

18-bit Object ID (2^18 files, i.e. > 260,000 files). File id 0 is not
  valid and indicates a deleted page. File id 0x3ffff is also not valid.
  Synonymous with inode.
2-bit serial number
20-bit Chunk ID within file. Limit of 2^20 chunks/pages per file (i.e.
  > 500MB max file size). Chunk ID 0 is the file header for the file.
10-bit counter of the number of bytes used in the page.
12-bit ECC on tags

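The YAFFS1 fields listed above add up to 62 bits, so they fit in 8
bytes. The C sketch below packs and unpacks them using explicit
shifts. The bit order chosen here is an assumption made purely for
illustration and should not be read as the real on-flash layout; the
struct and function names are likewise hypothetical.

```c
#include <stdint.h>

struct yaffs1_tags {
    uint32_t obj_id;    /* 18 bits */
    uint8_t  serial;    /*  2 bits */
    uint32_t chunk_id;  /* 20 bits */
    uint16_t n_bytes;   /* 10 bits */
    uint16_t ecc;       /* 12 bits */
};

/* Pack the five fields into the low 62 bits of a 64-bit word.
 * Assumed order, low to high: obj_id, serial, chunk_id, n_bytes, ecc. */
uint64_t tags_pack(const struct yaffs1_tags *t)
{
    uint64_t w = 0;

    w |= (uint64_t)(t->obj_id   & 0x3FFFFu);
    w |= (uint64_t)(t->serial   & 0x3u)     << 18;
    w |= (uint64_t)(t->chunk_id & 0xFFFFFu) << 20;
    w |= (uint64_t)(t->n_bytes  & 0x3FFu)   << 40;
    w |= (uint64_t)(t->ecc      & 0xFFFu)   << 50;
    return w;
}

/* Inverse of tags_pack. */
struct yaffs1_tags tags_unpack(uint64_t w)
{
    struct yaffs1_tags t;

    t.obj_id   = (uint32_t)(w & 0x3FFFFu);
    t.serial   = (uint8_t)((w >> 18) & 0x3u);
    t.chunk_id = (uint32_t)((w >> 20) & 0xFFFFFu);
    t.n_bytes  = (uint16_t)((w >> 40) & 0x3FFu);
    t.ecc      = (uint16_t)((w >> 50) & 0xFFFu);
    return t;
}

/* Object IDs 0 and 0x3ffff are reserved, as described above. */
int obj_id_valid(uint32_t id)
{
    return id > 0 && id < 0x3FFFFu;
}
```

Packing and then unpacking a set of tags returns the original field
values, provided each value fits in its field width.
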
YAFFS tags in YAFFS2 mode:
 4 bytes  32-bit chunk ID
 4 bytes  32-bit object ID
 2 bytes  Number of data bytes in this chunk
 4 bytes  Sequence number for this block
 3 bytes  ECC on tags
12 bytes  ECC on data (3 bytes per 256 bytes of data)


Page allocation and garbage collection

Pages are allocated sequentially from the currently selected block.
When all the pages in the block are filled, another clean block is
selected for allocation. At least two or three clean blocks are
reserved for garbage collection purposes. If there are insufficient
clean blocks available, then a dirty block (i.e. one containing only
discarded pages) is erased to free it up as a clean block. If no dirty
blocks are available, then the dirtiest block is selected for garbage
collection.

Garbage collection is performed by copying the valid data pages into
new data pages thus rendering all the pages in this block dirty and
freeing it up for erasure. I also like the idea of selecting a block
at random some small percentage of the time - thus reducing the chance
of wear differences.

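The 'dirtiest block' choice described above can be sketched as a
simple scan for the fully written block with the fewest valid pages.
This is only an illustration of the selection rule: the block
structure, its field names and pick_gc_block are assumptions, and the
random selection idea is left out.

```c
#include <stdint.h>

enum block_state {
    BLOCK_CLEAN,   /* erased and available for allocation */
    BLOCK_FULL     /* every page has been written */
};

struct block {
    enum block_state state;
    uint16_t valid_pages;   /* pages that still hold live data */
};

/*
 * Return the index of the full block with the fewest valid pages,
 * or -1 if there is no full block. A block with zero valid pages is
 * a dirty block and can be erased without copying anything; otherwise
 * valid_pages is the number of pages that must be copied out first.
 */
int pick_gc_block(const struct block *blocks, int n)
{
    int best = -1;
    int i;

    for (i = 0; i < n; i++) {
        if (blocks[i].state != BLOCK_FULL)
            continue;
        if (best < 0 || blocks[i].valid_pages < blocks[best].valid_pages)
            best = i;
    }
    return best;
}
```

Because a dirty block has zero valid pages it always wins this
comparison, which matches the order of preference given above.
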
YAFFS is single-threaded. Garbage-collection is done as a parasitic
task of writing data. So each time some data is written, a bit of
pending garbage collection is done. More pages are garbage-collected
when free space is tight.


Flash writing

YAFFS only ever writes each page once, complying with the requirements
of the most restrictive NAND devices.

Wear levelling

This comes as a side-effect of the block-allocation strategy. Data is
always written on the next free block, so they are all used equally.
Blocks containing data that is written but never erased will not get
back into the free list, so wear is levelled over only blocks which
are free or become free, not blocks which never change.


Some helpful info
-----------------

Formatting a YAFFS device is simply done by erasing it.

Making an initial filesystem can be tricky because YAFFS uses the OOB
and thus the bytes that get written depend on the YAFFS data (tags),
and the ECC bytes and bad block markers which are dictated by the
hardware and/or the MTD subsystem. The data layout also depends on the
device page size (512b or 2K). Because YAFFS is only responsible for
some of the OOB data, generating a filesystem offline requires
detailed knowledge of what the other parts (MTD and NAND
driver/hardware) are going to do.

To make a YAFFS filesystem you have 3 options:

1) Boot the system with an empty NAND device mounted as YAFFS and copy
   stuff on.

2) Make a filesystem image offline, then boot the system and use
   MTDutils to write an image to flash.

3) Make a filesystem image offline and use some tool like a bootloader to
   write it to flash.

Option 1 avoids a lot of issues because the parts
(YAFFS/MTD/hardware) all take care of their own bits and (if you have
put things together properly) it will 'just work'. YAFFS just needs to
know how many bytes of the OOB it can use. However sometimes it is not
practical.

Option 2 lets MTD/hardware take care of the ECC so the filesystem
image just has to know which bytes to use for YAFFS Tags.

Option 3 is hardest as the image creator needs to know exactly what
ECC bytes, endianness and algorithm to use as well as which bytes are
available to YAFFS.

mkyaffs2image creates an image suitable for option 3 for the
particular case of yaffs2 on 2K page NAND with default MTD layout.

mkyaffsimage creates an equivalent image for 512b page NAND (i.e.
yaffs1 format).

Bootloaders
-----------

A bootloader using YAFFS needs to know how MTD is laying out the OOB
so that it can skip bad blocks.

YAFFS Tracing
-------------