John's
spec of the second extended filesystem
This information has been compiled
from resources published on the internet, the Linux
kernel source and information gained from
experimentation.
Introduction
minix -> extend fs -> 2nd
extended
Structure
The file system is created from a
sequential collection of blocks. These blocks can be
either 1k, 2k or 4k in size. These blocks are divided up
into groups for various reasons.
The starting point for the
filesystem is the superblock. This is a structure 1024
bytes in length and is always located at an offset of
1024 bytes from the start of the filesystem.
The following is a list of the
structure used by Linux. Note that other OS's may use a
different structure. For details see ???.h
field name |
type |
comment |
s_inodes_count |
ULONG |
Count of inodes in the
filesystem |
s_blocks_count |
ULONG |
Count of blocks in the
filesystem |
s_r_blocks_count |
ULONG |
Count of the number of
reserved blocks |
s_free_blocks_count |
ULONG |
Count of the number of
free blocks |
s_free_inodes_count |
ULONG |
Count of the number of
free inodes |
s_first_data_block |
ULONG |
The first block which
contains data |
s_log_block_size |
ULONG |
Indicator of the block
size |
s_log_frag_size |
LONG |
Indicator of the size of
the fragments |
s_blocks_per_group |
ULONG |
Count of the number of
blocks in each block group |
s_frags_per_group |
ULONG |
Count of the number of
fragments in each block group |
s_inodes_per_group |
ULONG |
Count of the number of
inodes in each block group |
s_mtime |
ULONG |
The time that the
filesystem was last mounted |
s_wtime |
ULONG |
The time that the
filesystem was last written to |
s_mnt_count |
USHORT |
The number of times the
file system has been mounted |
s_max_mnt_count |
SHORT |
The number of times the
file system can be mounted |
s_magic |
USHORT |
Magic number indicating
ex2fs |
s_state |
USHORT |
Flags indicating the
current state of the filesystem |
s_errors |
USHORT |
Flags indicating the
procedures for error reporting |
s_pad |
USHORT |
padding |
s_lastcheck |
ULONG |
The time that the
filesystem was last checked |
s_checkinterval |
ULONG |
The maximum time
permissable between checks |
s_creator_os |
ULONG |
Indicator of which OS
created the filesystem |
s_rev_level |
ULONG |
The revision level of the
filesystem |
s_reserved |
ULONG[235] |
padding to 1024 bytes |
s_r_blocks_count :
this is the number of blocks which are reserved for the
super user
s_first_data_block :
Depending on the block size, the first data block will be
either 0 or 1. See diagram below
s_log_block_size :
0 = 1k block size
1 = 2k
2 = 4k
s_log_frag_size :
At the moment, it seems that fragments are not
implemented. In the future I may have to find out how
they work.
s_blocks_per_group
The filesystem is divided up into block groups. Note that
the last block group may not be complete
s_inodes_per_group
Each block group has space reserved for a number of
inodes.
s_mtime, s_wtime
This may be the mount time, or the umount time. I am not
sure which.
s_mnt_count, s_max_mnt_count
Once this count reaches the maximum, the filesystem must
be checked, the count is then reset.
s_magic
This should contain the magic number 0xEF53
s_state
This contains a set of flags which indicate wether the
filesystem is clean etc.
s_errors
This contains falgs which indicate how the filesystem
should be treated if errors are found.
s_creator_os
For Linux this is ???
s_rev_level
The current revision is ???
The information in the superblock
is used to access the rest of the data on the disk.
The number of block groups = the
number of blocks / the number of blocks per group; //
rounded up
All block and inode addresses start
at 1. The first block on the disk is block 1. 0 is used
to indicate no block.
Each block group can be found at
the block address ((group number - 1)* blocks per group)
and is of course blocks per group long. Group numbers are
1 based aswell
Each group is just a series of
blocks, however the first blocks in the group have a
special purpose. The remainder are used for storing data.
|Superblock|Group
Descriptors|Block Bitmap|INode Bitmap|INode Table|Data
block|
|----------------------------|------------------------------------------------|
|This is the same for all groups | this is specific to
each group
The superblock is stored in the first data block (except
for group 1)
The Group Descriptors contains
information on the block groups. This data is covers all
the groups and is stored in all the groups for
rudundency. This is an array of the following structure
field name |
type |
comment |
bg_block_bitmap |
ULONG |
The address of the block
containing the block bitmap for this group |
bg_inode_bitmap |
ULONG |
The address of the block
containing the inode bitmap for this group |
bg_inode_table |
ULONG |
The address of the block
containing the inode table for this group |
bg_free_blocks_count |
USHORT |
The count of free blocks
in this group |
bg_free_inodes_count |
USHORT |
The count of free inodes
in this group |
bg_used_dirs_count |
USHORT |
The number inodes in this
group which are directories |
bg_pad |
USHORT |
padding |
bg_reserved |
ULONG[3] |
padding |
The size of the descriptors
can be calculated as (sizeof(ext2_group) * number of
groups) / block size; // rounded up if necessary
The information in this structure
us used to locate the block and inode bitmaps and inode
table.
Remember that the first entry
corresponds to block group 1.
The block bitmap is a bitmap
indicating which blocks in the group have been allocated.
If the bit is set then the block is allocated. The size
of the bitmap is (blocks per group / 8) / block size;//
with both divisions rounded up.
It is necessary to find out which
group a particular group is in to be able to look up the
bitmap. The group = ((Block number - 1) / Blocks per
group) + 1; // rounded up
The block in that group is then
Block Number - (Group * Blocks per group)
The inode bitmap is essentaly the
same as the block bitmap, but indicates which inodes are
allocated. The size of the inode bitmpap is (inodes per
group / 8) / block size;// with both divisions rounded
up.
The same calculations can be used
for finding the group of a particular inode. The group =
((INode number - 1) / INodes per group) + 1; // rounded
up
The inode in that group is then
INode Number - (Group * INodes per group)
The inode table is an array of the
inodes for that particular group. Again, the first entry
is for the first inode in that group.
field name |
type |
description |
i_mode |
USHORT |
File mode |
i_uid |
USHORT |
Owner Uid |
i_size |
ULONG |
Size in bytes |
i_atime |
ULONG |
Access time |
i_ctime |
ULONG |
Creation time |
i_mtime |
ULONG |
Modification time |
i_dtime |
ULONG |
Deletion Time |
i_gid |
USHORT |
Group Id |
i_links_count |
USHORT |
Links count |
i_blocks |
ULONG |
Blocks count |
i_flags |
ULONG |
File flags |
i_reserved1 |
ULONG |
OS dependent |
i_block |
ULONG[15] |
Pointers to blocks |
i_version |
ULONG |
File version (for NFS) |
i_file_acl |
ULONG |
File ACL |
i_dir_acl |
ULONG |
Directory ACL |
i_faddr |
ULONG |
Fragment address |
i_frag |
UCHAR |
Fragment number |
i_fsize |
UCHAR |
Fragment size |
i_pad1 |
USHORT |
|
i_reserved2 |
ULONG[2] |
|
The file mode is a set of
flags that specify the type of file and the access
permissions
identifier |
value |
comment |
S_IFMT |
F000 |
format mask |
S_IFSOCK |
A000 |
socket |
S_IFLNK |
C000 |
symbolic link |
S_IFREG |
8000 |
regular file |
S_IFBLK |
6000 |
block device |
S_IFDIR |
4000 |
directory |
S_IFCHR |
2000 |
character device |
S_IFIFO |
1000 |
fifo |
|
|
|
S_ISUID |
0800 |
SUID |
S_ISGID |
0400 |
SGID |
S_ISVTX |
0200 |
sticky bit |
|
|
|
S_IRWXU |
01C0 |
user mask |
S_IRUSR |
0100 |
read |
S_IWUSR |
0080 |
write |
S_IXUSR |
0040 |
execute |
|
|
|
S_IRWXG |
0038 |
group mask |
S_IRGRP |
0020 |
read |
S_IWGRP |
0010 |
write |
S_IXGRP |
0008 |
execute |
|
|
|
S_IRWXO |
0007 |
other mask |
S_IROTH |
0004 |
read |
S_IWOTH |
0002 |
write |
S_IXOTH |
0001 |
execute |
The i_block entry is an array
of block addresses. The first EXT2_NDIR_BLOCKS (12) are
direct block addresses. The data in these blocks is the
content of the file. The next block EXT2_IND_BLOCK in the
indirect block. This is the address of a block which
contains a list of addresses of blocks which contain the
data. There are block size / sizeof(ULONG) addresses in
this block.
The EXT2_DIND_BLOCK is simalar, but
it is a double indirect block. It countains the address
of a block which has a list of indirect block addresses.
Each indirect block then has another list is blocks.
The EXT2_TIND_BLOCK is simalar
again, but it is the tripple indirect block. It contains
a list of double indirect blocks etc.
Now that you know how to find and
read inodes, you can start to read the files. There are a
set of special inodes which are reserved for certain
puposes. These include
indetifier |
value |
description |
EXT2_BAD_INO |
1 |
Bad blocks inode |
EXT2_ROOT_INO |
2 |
Root inode |
EXT2_ACL_IDX_INO |
3 |
ACL inode |
EXT2_ACL_DATA_INO |
4 |
ACL inode |
EXT2_BOOT_LOADER_INO |
5 |
Boot loader inode |
EXT2_UNDEL_DIR_INO |
6 |
Undelete directory inode |
EXT2_FIRST_INO |
11 |
First non reserved inode |
The most important inode here
is the root inode. This is the inode at the root of the
file system. This inode is a directory, which like all
directories has the following structure:
field name |
type |
description |
inode |
ULONG |
address if inode |
rec_len |
USHORT |
length of this record |
name_len |
USHORT |
length of file name |
name |
CHAR[0] |
the file name |
A directory is a list of
these structures. The structures can not pass over a
block boundry, so the last record is extended to fill the
block. And entry with an inode of 0 should be ignored.
|