Welcome to YAFFS, the first file system developed specifically for NAND flash.

It is now YAFFS2. The original YAFFS (YAFFS1) only supports 512-byte page
NAND and is now deprecated. YAFFS2 supports 512b page NAND in 'YAFFS1
compatibility' mode (CONFIG_YAFFS_YAFFS1) and 2K or larger page NAND
in YAFFS2 mode (CONFIG_YAFFS_YAFFS2).

A note on licensing
-------------------
YAFFS is available under the GPL and via alternative licensing
arrangements with Aleph One. If you're using YAFFS as a Linux kernel
file system then it will be under the GPL. For use in other situations
you should discuss licensing issues with Aleph One.

Terminology
-----------
Page - NAND addressable unit (normally 512b or 2Kbyte size) - can
  be read, written, marked bad. Has associated OOB.
Block - Erasable unit. 64 Pages. (128K on 2K NAND, 32K on 512b NAND)
OOB - 'spare area' of each page for ECC, bad block marker and YAFFS
  tags. 16 bytes per 512b - 64 bytes for 2K page size.
Chunk - Basic YAFFS addressable unit. Same size as Page.
Object - YAFFS Object: File, Directory, Link, Device etc.

YAFFS design
------------

YAFFS is a log-structured filesystem. It is designed particularly for
NAND (as opposed to NOR) flash, to be flash-friendly, robust due to
journalling, and to have low RAM and boot time overheads. File data is
stored in 'chunks'. Chunks are the same size as NAND pages. Each page
is marked with file id and chunk number. These marking 'tags' are
stored in the OOB (or 'spare') region of the flash. The chunk number
is determined by dividing the file position by the chunk size. Each
chunk has a number of valid bytes, which equals the page size for all
except the last chunk in a file.

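The chunk arithmetic above can be sketched in a few lines of Python. This
is an illustration only, not YAFFS code: the name chunk_layout and its
parameters are made up for this example, and the chunk index is zero-based.
The tag layout later in this document reserves chunk ID 0 for the file
header, so the numbering actually stored on flash may be offset from this.

```python
def chunk_layout(file_size, chunk_size=2048):
    """Return a list of (chunk_index, valid_bytes) for a file of file_size bytes.

    chunk_index is the file position divided by the chunk size (zero-based).
    valid_bytes equals chunk_size for every chunk except possibly the last.
    """
    layout = []
    pos = 0
    while pos < file_size:
        index = pos // chunk_size
        valid = min(chunk_size, file_size - pos)
        layout.append((index, valid))
        pos += chunk_size
    return layout


if __name__ == "__main__":
    # A 5000-byte file on 2K chunks: two full chunks and one partial chunk.
    print(chunk_layout(5000, 2048))
```
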
File 'headers' are stored as the first page in a file, marked as a
different type to data pages. The same mechanism is used to store
directories, device files, links etc. The first page describes which
type of object it is.

YAFFS2 never re-writes a page, because the spec of NAND chips does not
allow it. (YAFFS1 used to mark a block 'deleted' in the OOB). Deletion
is managed by moving deleted objects to the special, hidden 'unlinked'
directory. These records are preserved until all the pages containing
the object have been erased (we know when this happens by keeping a
count of chunks remaining on the system for each object - when it
reaches zero the object really is gone).

When data in a file is overwritten, the relevant chunks are replaced
by writing new pages to flash containing the new data but the same
tags.

Pages are also marked with a short (2 bit) serial number that
increments each time the page at this position is replaced. The
reason for this is that if power loss/crash/other act of demonic
forces happens before the replaced page is marked as discarded, it is
possible to have two pages with the same tags. The serial number is
used to arbitrate.

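A minimal sketch of how a 2-bit serial number could arbitrate between two
pages with the same tags. The function names are hypothetical, and the
sketch assumes the two copies are exactly one increment apart (the newer
one was written as a replacement for the older one), with the counter
wrapping from 3 back to 0.

```python
SERIAL_BITS = 2
SERIAL_MASK = (1 << SERIAL_BITS) - 1  # 0b11, so serials run 0..3


def next_serial(serial):
    """Serial number a replacement page would carry (wraps 3 -> 0)."""
    return (serial + 1) & SERIAL_MASK


def newer_serial(a, b):
    """Return whichever of two serials is the newer one.

    Assumes a and b belong to an old page and its direct replacement,
    i.e. they differ by exactly one increment modulo 4.
    """
    if next_serial(a) == b:
        return b
    if next_serial(b) == a:
        return a
    raise ValueError("serial numbers are not adjacent; cannot arbitrate")


if __name__ == "__main__":
    print(newer_serial(3, 0))  # wrap-around case: 0 replaced 3
```
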
A block containing only discarded pages (termed a dirty block) is an
obvious candidate for garbage collection. Otherwise valid pages can be
copied off a block thus rendering the whole block discarded and ready
for garbage collection.

In theory you don't need to hold the file structure in RAM... you
could just scan the whole flash looking for pages when you need them.
In practice though you'd want better file access times than that! The
mechanism proposed here is to have a list of __u16 page addresses
associated with each file. Since there are 2^18 pages in a 128MB NAND,
a __u16 is insufficient to uniquely identify a page but it does
identify a group of 4 pages - a small enough region to search
exhaustively. This mechanism is clearly expandable to larger NAND
devices - within reason. The RAM overhead with this approach is approx
2 bytes per page - 512kB of RAM for a whole 128MB NAND.

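The figures in the paragraph above can be checked with a short
calculation. The constant and function names below are chosen for this
example only. The 512-byte page size is the one that gives 2^18 pages on
a 128MB device.

```python
NAND_BYTES = 128 * 1024 * 1024  # 128MB device
PAGE_BYTES = 512                # 512b pages
ENTRY_BITS = 16                 # one __u16 entry per page

pages = NAND_BYTES // PAGE_BYTES       # number of pages on the device
addressable = 1 << ENTRY_BITS          # distinct values a __u16 can hold
group_size = pages // addressable      # pages sharing one __u16 value
ram_bytes = pages * (ENTRY_BITS // 8)  # 2 bytes of RAM per page


def group_for_page(page):
    """__u16 value that would identify the group containing this page."""
    return page // group_size


def pages_in_group(group):
    """Pages that must be searched exhaustively for a given __u16 value."""
    start = group * group_size
    return list(range(start, start + group_size))


if __name__ == "__main__":
    print(pages, group_size, ram_bytes // 1024)
```
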
Boot-time scanning to build the file structure lists only requires
one pass reading NAND. If proper shutdowns happen the current RAM
summary of the filesystem status is saved to flash, called
'checkpointing'. This saves re-scanning the flash on startup, and gives
huge boot/mount time savings.

YAFFS regenerates its state by 'replaying the tape' - i.e. by
scanning the chunks in their allocation order (i.e. block sequence ID
order), which is usually different from the media block order. Each
block is still only read once - starting from the end of the media and
working back.

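The 'replaying the tape' idea can be sketched as follows. This is a
simplified model, not the YAFFS scanner: the block dictionaries, their
keys and the replay function are all invented for this example. It walks
blocks from oldest to newest sequence number and lets later writes
override earlier ones. Walking newest-first and keeping only the first
copy seen of each chunk would give the same result.

```python
def replay(blocks):
    """Rebuild {(object_id, chunk_id): (block_no, page_no)} from scanned tags.

    Each block is a dict with:
      'block_no' - position of the block on the media
      'seq'      - block sequence number (allocation order)
      'tags'     - per-page (object_id, chunk_id) tuples, or None if unwritten
    """
    current = {}
    # Allocation order is sequence order, not media order.
    for block in sorted(blocks, key=lambda b: b["seq"]):
        for page_no, tag in enumerate(block["tags"]):
            if tag is None:
                continue  # page was never written
            # A later write of the same (object, chunk) replaces the old one.
            current[tag] = (block["block_no"], page_no)
    return current


if __name__ == "__main__":
    media = [
        {"block_no": 0, "seq": 7, "tags": [(1, 1), (1, 2)]},
        {"block_no": 1, "seq": 5, "tags": [(1, 1), None]},
    ]
    print(replay(media))
```
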
YAFFS tags in YAFFS1 mode:

18-bit Object ID (2^18 files, i.e. > 260,000 files). File id 0 is not
  valid and indicates a deleted page. File id 0x3ffff is also not valid.
  Synonymous with inode.
2-bit serial number
20-bit Chunk ID within file. Limit of 2^20 chunks/pages per file (i.e.
  > 500MB max file size). Chunk ID 0 is the file header for the file.
10-bit counter of the number of bytes used in the page.
12-bit ECC on tags

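The field widths listed above add up to 62 bits. The sketch below packs
and unpacks them into a single integer to show how the fields fit
together. The field order within the word, the field names and both
function names are assumptions made for this example. The real on-flash
layout is defined by the YAFFS source, not by this sketch, and no check
is made here for the invalid object ids 0 and 0x3ffff.

```python
# (name, width in bits), in the order listed in the text above.
FIELDS = [
    ("object_id", 18),
    ("serial", 2),
    ("chunk_id", 20),
    ("byte_count", 10),
    ("ecc", 12),
]


def pack_tags(**values):
    """Pack the tag fields into one integer, lowest field first (assumed order)."""
    word = 0
    shift = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_tags(word):
    """Inverse of pack_tags: split the integer back into named fields."""
    out = {}
    for name, width in FIELDS:
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out


if __name__ == "__main__":
    tags = dict(object_id=0x1234, serial=3, chunk_id=5, byte_count=512, ecc=0xABC)
    print(unpack_tags(pack_tags(**tags)) == tags)
```
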
YAFFS tags in YAFFS2 mode:
 4 bytes  32-bit chunk ID
 4 bytes  32-bit object ID
 2 bytes  Number of data bytes in this chunk
 4 bytes  Sequence number for this block
 3 bytes  ECC on tags
12 bytes  ECC on data (3 bytes per 256 bytes of data)

Page allocation and garbage collection

Pages are allocated sequentially from the currently selected block.
When all the pages in the block are filled, another clean block is
selected for allocation. At least two or three clean blocks are
reserved for garbage collection purposes. If there are insufficient
clean blocks available, then a dirty block (i.e. one containing only
discarded pages) is erased to free it up as a clean block. If no dirty
blocks are available, then the dirtiest block is selected for garbage
collection.

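The selection policy described above might be sketched like this. It is
a toy model under stated assumptions, not the YAFFS allocator: blocks
are plain dictionaries holding counts of valid and discarded pages, the
reserve of clean blocks is a parameter, and the name choose_block and
the returned action strings are invented for this example.

```python
def choose_block(blocks, reserve=2):
    """Decide what to do when a new allocation block is needed.

    blocks: list of dicts with 'valid' and 'discarded' page counts.
    Returns (action, block_index), where action is one of
    'allocate', 'erase', 'collect' or 'none'.
    """
    clean = [i for i, b in enumerate(blocks)
             if b["valid"] == 0 and b["discarded"] == 0]
    if len(clean) > reserve:
        # Enough clean blocks beyond the reserve: just take one.
        return ("allocate", clean[0])

    dirty = [i for i, b in enumerate(blocks)
             if b["valid"] == 0 and b["discarded"] > 0]
    if dirty:
        # A block holding only discarded pages can be erased directly.
        return ("erase", dirty[0])

    candidates = [i for i, b in enumerate(blocks) if b["discarded"] > 0]
    if not candidates:
        return ("none", None)
    # Otherwise garbage collect the block with the most discarded pages.
    dirtiest = max(candidates, key=lambda i: blocks[i]["discarded"])
    return ("collect", dirtiest)


if __name__ == "__main__":
    print(choose_block([
        {"valid": 0, "discarded": 0},
        {"valid": 60, "discarded": 4},
        {"valid": 10, "discarded": 54},
    ]))
```
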
Garbage collection is performed by copying the valid data pages into
new data pages thus rendering all the pages in this block dirty and
freeing it up for erasure. I also like the idea of selecting a block
at random some small percentage of the time - thus reducing the chance
of wear differences.

YAFFS is single-threaded. Garbage-collection is done as a parasitic
task of writing data. So each time some data is written, a bit of
pending garbage collection is done. More pages are garbage-collected
when free space is tight.

Flash writing

YAFFS only ever writes each page once, complying with the requirements
of the most restrictive NAND devices.

Wear levelling

This comes as a side-effect of the block-allocation strategy. Data is
always written on the next free block, so they are all used equally.
Blocks containing data that is written but never erased will not get
back into the free list, so wear is levelled over only blocks which
are free or become free, not blocks which never change.

Some helpful info
-----------------

Formatting a YAFFS device is simply done by erasing it.

Making an initial filesystem can be tricky because YAFFS uses the OOB
and thus the bytes that get written depend on the YAFFS data (tags),
and the ECC bytes and bad block markers which are dictated by the
hardware and/or the MTD subsystem. The data layout also depends on the
device page size (512b or 2K). Because YAFFS is only responsible for
some of the OOB data, generating a filesystem offline requires
detailed knowledge of what the other parts (MTD and NAND
driver/hardware) are going to do.

To make a YAFFS filesystem you have 3 options:

1) Boot the system with an empty NAND device mounted as YAFFS and copy
   stuff on.

2) Make a filesystem image offline, then boot the system and use
   MTDutils to write an image to flash.

3) Make a filesystem image offline and use some tool like a bootloader to
   write it to flash.

Option 1 avoids a lot of issues because the parts
(YAFFS/MTD/hardware) all take care of their own bits and (if you have
put things together properly) it will 'just work'. YAFFS just needs to
know how many bytes of the OOB it can use. However sometimes it is not
practical.

Option 2 lets MTD/hardware take care of the ECC so the filesystem
image just has to know which bytes to use for YAFFS Tags.

Option 3 is hardest as the image creator needs to know exactly what
ECC bytes, endianness and algorithm to use as well as which bytes are
available to YAFFS.

mkyaffs2image creates an image suitable for option 3 for the
particular case of yaffs2 on 2K page NAND with default MTD layout.

mkyaffsimage creates an equivalent image for 512b page NAND (i.e.
yaffs1 format).

Bootloaders
-----------

A bootloader using YAFFS needs to know how MTD is laying out the OOB
so that it can skip bad blocks.

YAFFS Tracing
-------------