
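Most modern file systems trace their ancestry to BSD FFS (the Fast File System, released with 4.2BSD in 1983), so no introduction to file systems can skip it. Forty years on, many of its design ideas are still worth studying, and its influence on modern file systems has been profound. Its on-disk layout has a superblock, a block bitmap, an inode bitmap, and preallocated inode tables, and traces of this design can be found in many modern file systems.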

4.2BSD (August 1983) would take over two years to implement and contained several major overhauls. Before its official release came three intermediate versions: 4.1a from April 1982 incorporated a modified version of BBN's preliminary TCP/IP implementation; 4.1b from June 1982 included the new Berkeley Fast File System, implemented by Marshall Kirk McKusick; and 4.1c in April 1983 was an interim release during the last few months of 4.2BSD's development. Back at Bell Labs, 4.1cBSD became the basis of the 8th Edition of Research Unix, and a commercially supported version was available from mt Xinu.
Source: History of the Berkeley Software Distribution - Wikipedia

I. Physical Structure

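Now to the main topic: the physical structure of ext2. ext2 closely resembles FFS. It divides the disk into fixed-size block groups, and each block group is like a miniature file system with its own superblock, block group descriptors, block bitmap, inode bitmap, and inode table (some groups lack the first two). As a result, even when most of the disk is damaged, a file system checker can still recover some of the files: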
[Figure: ext2 physical block structure]
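The figure above shows the physical structure of ext2. Every ext2 file system consists of a boot area and block groups: there is only one boot area, but there can be many block groups. The table below describes each area in detail: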

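| Area | Meaning | Notes |
| --- | --- | --- |
| Boot Sector | Disk boot area | 1) Reserved for the operating system; the file system skips it. 2) Normally 1024 bytes (1 KB), which is one logical block when the block size is 1 KB. 3) Everything after the boot area belongs to block groups, so with 1 KB blocks block group 0 starts at byte offset 1024. |
| Block Group | | The rows below describe the parts of a block group. |
| Super Block | File system attributes and control information | 1) The first block of a block group; the superblock structure is 1024 bytes, i.e. one logical block at a 1 KB block size. 2) The first block group must have a superblock. 3) With the sparse superblock feature enabled, not every block group has a superblock: only groups 0 and 1 and groups whose number is a power of 3, 5, or 7 do, for example 3, 5, 7, 9, 25, 27, 49, 81. |
| Block Group Descriptor | Block group descriptors | 1) Record, for each block group, the location of the data block bitmap, the inode bitmap, and the inode table, plus the number of free blocks, free inodes, and directories. 2) A "location" here is a block number. 3) A block group that has a superblock also has the block group descriptors; a group has either both or neither. |
| Block Bitmap | Data block bitmap | 1) A sequence of bits. 2) 0 means free, 1 means in use. |
| Inode Bitmap | Inode bitmap | 1) A sequence of bits. 2) 0 means free, 1 means in use. |
| Inode Table | Inode table | Stores the persistent on-disk data of each inode. |
| Data Block | Data blocks | Store the actual data, including directory contents. |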
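The sparse superblock rule from the table can be written as a small predicate. This is a minimal sketch, not the kernel's code: the helper names `is_power_of` and `group_has_superblock` are made up for illustration, and the rule assumed here is the usual one in which groups 0 and 1 keep a copy in addition to groups numbered by powers of 3, 5, and 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if n == base^k for some k >= 0 (so n == 1 always qualifies). */
static bool is_power_of(uint32_t n, uint32_t base)
{
    assert(base > 1);
    if (n == 0)
        return false;
    while (n % base == 0)
        n /= base;
    return n == 1;
}

/*
 * Returns true if block group `group` carries a superblock copy (and with
 * it a copy of the block group descriptors) when the sparse superblock
 * feature is enabled. Hypothetical helper for illustration only.
 */
static bool group_has_superblock(uint32_t group)
{
    if (group <= 1)
        return true;
    return is_power_of(group, 3) || is_power_of(group, 5) ||
           is_power_of(group, 7);
}
```

Calling `group_has_superblock` for groups 0 through 10 marks 0, 1, 3, 5, 7, and 9 as carrying a copy; every other group in that range starts directly with its bitmaps.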

II. Structure Examples

1. Floppy disk block layout example

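Suppose a floppy disk has a capacity of 1.44 MB (1440 KB). Formatted as ext2 with 1 KB blocks, it holds one boot area and one block group: the boot area takes 1 KB and block group 0 takes the remaining 1439 KB (a block group can hold up to 8192 KB by default, so one group is enough). The layout is shown below: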
[Figure: ext2 layout of the floppy disk]

2. 20 MB block device layout example

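A 20 MB block device formatted as ext2 with 1 KB blocks has 20480 blocks: one boot area and three block groups. The boot area is again 1 KB (block 0). A full block group holds 8192 blocks, so group 0 covers blocks 1 to 8192, group 1 covers blocks 8193 to 16384, and group 2 holds the remaining 4095 blocks (16385 to 20479).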
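The arithmetic behind both examples can be checked with a short sketch. The struct `group_layout` and the function `compute_layout` are hypothetical names introduced here, and the calculation assumes a plain split of the blocks after the boot area, ignoring any adjustments a real formatting tool might make.

```c
#include <assert.h>
#include <stdint.h>

struct group_layout {
    uint32_t group_count;       /* number of block groups */
    uint32_t last_group_blocks; /* blocks in the final (possibly short) group */
};

/*
 * total_blocks:     device size in blocks
 * first_data_block: block where group 0 starts (1 when the 1 KB boot
 *                   area occupies block 0)
 * blocks_per_group: 8192 in both examples
 */
static struct group_layout compute_layout(uint32_t total_blocks,
                                          uint32_t first_data_block,
                                          uint32_t blocks_per_group)
{
    struct group_layout l;
    uint32_t usable;

    assert(blocks_per_group > 0 && total_blocks > first_data_block);
    usable = total_blocks - first_data_block;
    /* Round up: a partial group at the end still counts as a group. */
    l.group_count = (usable + blocks_per_group - 1) / blocks_per_group;
    l.last_group_blocks = usable - (l.group_count - 1) * blocks_per_group;
    return l;
}
```

With these assumptions, `compute_layout(1440, 1, 8192)` describes the floppy (one group of 1439 blocks) and `compute_layout(20480, 1, 8192)` describes the 20 MB device (three groups, the last with 4095 blocks).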

III. Data Structures

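The superblock holds the file system's attributes and control information and is the core of the whole file system; once you understand how the superblock is managed, you have grasped the essentials of the file system. The code below is the on-disk structure of the ext2 superblock: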

/*
 * Structure of the super block
 */
struct ext2_super_block {
    __le32    s_inodes_count;        /* Inodes count */
    __le32    s_blocks_count;        /* Blocks count */
    __le32    s_r_blocks_count;    /* Reserved blocks count */
    __le32    s_free_blocks_count;    /* Free blocks count */
    __le32    s_free_inodes_count;    /* Free inodes count */
    __le32    s_first_data_block;    /* First Data Block */
    __le32    s_log_block_size;    /* Block size */
    __le32    s_log_frag_size;    /* Fragment size */
    __le32    s_blocks_per_group;    /* # Blocks per group */
    __le32    s_frags_per_group;    /* # Fragments per group */
    __le32    s_inodes_per_group;    /* # Inodes per group */
    __le32    s_mtime;        /* Mount time */
    __le32    s_wtime;        /* Write time */
    __le16    s_mnt_count;        /* Mount count */
    __le16    s_max_mnt_count;    /* Maximal mount count */
    __le16    s_magic;        /* Magic signature */
    __le16    s_state;        /* File system state */
    __le16    s_errors;        /* Behaviour when detecting errors */
    __le16    s_minor_rev_level;     /* minor revision level */
    __le32    s_lastcheck;        /* time of last check */
    __le32    s_checkinterval;    /* max. time between checks */
    __le32    s_creator_os;        /* OS */
    __le32    s_rev_level;        /* Revision level */
    __le16    s_def_resuid;        /* Default uid for reserved blocks */
    __le16    s_def_resgid;        /* Default gid for reserved blocks */
    /*
     * These fields are for EXT2_DYNAMIC_REV superblocks only.
     *
     * Note: the difference between the compatible feature set and
     * the incompatible feature set is that if there is a bit set
     * in the incompatible feature set that the kernel doesn't
     * know about, it should refuse to mount the filesystem.
     * 
     * e2fsck's requirements are more strict; if it doesn't know
     * about a feature in either the compatible or incompatible
     * feature set, it must abort and not try to meddle with
     * things it doesn't understand...
     */
    __le32    s_first_ino;         /* First non-reserved inode */
    __le16    s_inode_size;         /* size of inode structure */
    __le16    s_block_group_nr;     /* block group # of this superblock */
    __le32    s_feature_compat;     /* compatible feature set */
    __le32    s_feature_incompat;     /* incompatible feature set */
    __le32    s_feature_ro_compat;     /* readonly-compatible feature set */
    __u8    s_uuid[16];        /* 128-bit uuid for volume */
    char    s_volume_name[16];     /* volume name */
    char    s_last_mounted[64];     /* directory where last mounted */
    __le32    s_algorithm_usage_bitmap; /* For compression */
    /*
     * Performance hints.  Directory preallocation should only
     * happen if the EXT2_COMPAT_PREALLOC flag is on.
     */
    __u8    s_prealloc_blocks;    /* Nr of blocks to try to preallocate*/
    __u8    s_prealloc_dir_blocks;    /* Nr to preallocate for dirs */
    __u16    s_padding1;
    /*
     * Journaling support valid if EXT3_FEATURE_COMPAT_HAS_JOURNAL set.
     */
    __u8    s_journal_uuid[16];    /* uuid of journal superblock */
    __u32    s_journal_inum;        /* inode number of journal file */
    __u32    s_journal_dev;        /* device number of journal file */
    __u32    s_last_orphan;        /* start of list of inodes to delete */
    __u32    s_hash_seed[4];        /* HTREE hash seed */
    __u8    s_def_hash_version;    /* Default hash version to use */
    __u8    s_reserved_char_pad;
    __u16    s_reserved_word_pad;
    __le32    s_default_mount_opts;
    __le32    s_first_meta_bg;     /* First metablock block group */
    __u32    s_reserved[190];    /* Padding to the end of the block */
};
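To make the structure more concrete, here is a rough sketch of how a user-space tool might pull a few of these fields out of a raw disk image. It is not the kernel's code. The names `sb_summary`, `read_sb_summary`, `le16_at`, `le32_at`, and `EXT2_SUPER_OFFSET` are assumptions made for this example; the byte offsets are counted from the field order in the structure above, and 0xEF53 is the ext2 magic number. Note that `s_log_block_size` stores a shift count rather than a size: the block size is `1024 << s_log_block_size`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT2_SUPER_OFFSET 1024u   /* primary superblock follows the boot area */
#define EXT2_SUPER_MAGIC  0xEF53u /* expected value of s_magic */

struct sb_summary {
    uint32_t blocks_count;
    uint32_t first_data_block;
    uint32_t block_size;       /* in bytes */
    uint32_t blocks_per_group;
    uint32_t group_count;
};

/* Decode little-endian on-disk integers independent of host byte order. */
static uint16_t le16_at(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32_at(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the image is too short or looks invalid. */
static int read_sb_summary(const uint8_t *image, size_t len,
                           struct sb_summary *out)
{
    const uint8_t *sb;
    uint32_t log_size, usable;

    assert(image != NULL && out != NULL);
    if (len < EXT2_SUPER_OFFSET + 1024u)
        return -1;
    sb = image + EXT2_SUPER_OFFSET;
    if (le16_at(sb + 56) != EXT2_SUPER_MAGIC)        /* s_magic */
        return -1;

    log_size = le32_at(sb + 24);                     /* s_log_block_size */
    out->blocks_count = le32_at(sb + 4);             /* s_blocks_count */
    out->first_data_block = le32_at(sb + 20);        /* s_first_data_block */
    out->blocks_per_group = le32_at(sb + 32);        /* s_blocks_per_group */
    /* Sanity limits chosen for this sketch only. */
    if (log_size > 16 || out->blocks_per_group == 0 ||
        out->blocks_count <= out->first_data_block)
        return -1;

    out->block_size = 1024u << log_size;
    usable = out->blocks_count - out->first_data_block;
    out->group_count = usable / out->blocks_per_group +
                       (usable % out->blocks_per_group != 0);
    return 0;
}
```

A caller would read at least the first 2048 bytes of the device or image file into a buffer and pass it to `read_sb_summary`; for the 20 MB example above the summary would report a 1024-byte block size and three block groups.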

IV. Summary

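The physical structure of an ext2 file system is divided into multiple block groups. At the 1024-byte block size used in this article, a block group holds 8192 blocks by default. A block group contains a superblock, block group descriptors, a block bitmap, an inode bitmap, an inode table, and data blocks (with sparse superblocks, only some groups carry the first two). Together these make up a complete ext2 file system.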
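As a closing illustration of how block groups and bitmaps fit together, the sketch below maps an absolute block number to its group and to its bit in that group's block bitmap. The names `block_location`, `locate_block`, and `bitmap_test` are hypothetical. The bit order assumed is the little-endian one, where bit 0 is the least significant bit of byte 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct block_location {
    uint32_t group; /* block group that owns the block */
    uint32_t index; /* bit index inside that group's block bitmap */
};

static struct block_location locate_block(uint32_t block,
                                          uint32_t first_data_block,
                                          uint32_t blocks_per_group)
{
    struct block_location loc;

    assert(blocks_per_group > 0 && block >= first_data_block);
    loc.group = (block - first_data_block) / blocks_per_group;
    loc.index = (block - first_data_block) % blocks_per_group;
    return loc;
}

/* Returns true if the bit is 1 (in use), false if it is 0 (free). */
static bool bitmap_test(const uint8_t *bitmap, uint32_t index)
{
    return (bitmap[index >> 3] >> (index & 7)) & 1u;
}
```

For the 20 MB example, block 8193 is the first block of group 1, so `locate_block(8193, 1, 8192)` yields group 1 and index 0, and `bitmap_test` on group 1's block bitmap at that index tells whether the block is in use.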

Reposted from the original article: "03 ext2文件系统物理结构剖析" (03: An Analysis of the ext2 File System's Physical Structure)


毕辞数据Aaron

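More than 10 years of R&D experience at Huawei, with long-term work in distributed systems and expertise in distributed file systems and storage system architecture.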