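To allocate a new inode (for example, when creating a file), xv6 calls ialloc (5204). Ialloc is similar to balloc: it loops over the inode structures on the disk, one block at a time, looking for one that is marked free. When it finds one, it claims it by writing the new type to the disk and then returns an entry from the inode cache with the tail call to iget (5218). The correct operation of ialloc depends on the fact that only one process at a time can be holding a reference to bp: ialloc can be sure that no other process simultaneously sees that the inode is available and tries to claim it.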
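The scan-and-claim loop can be sketched as follows. This is a minimal, single-threaded illustration and not the xv6 source: the names toy_dinode, disk_inodes, NINODES, and toy_ialloc are hypothetical, the "disk" is an in-memory array, and the buffer-cache locking that makes the claim safe in xv6 is left out. The sketch returns the inode number, where xv6 returns the result of iget.

```c
#include <assert.h>
#include <stddef.h>

#define NINODES 8   /* hypothetical size of the toy on-disk inode table */

enum { T_FREE = 0, T_DIR = 1, T_FILE = 2 };

/* Toy stand-in for an on-disk inode; a real on-disk inode holds more fields. */
struct toy_dinode {
    short type;     /* T_FREE means the inode is unallocated */
    short nlink;
};

static struct toy_dinode disk_inodes[NINODES];   /* simulated disk */

/* Scan for a free inode and claim it by writing a nonzero type.
 * Returns the inode number, or 0 if no inode is free
 * (inode number 0 is never handed out in this sketch). */
static unsigned toy_ialloc(short type)
{
    assert(type != T_FREE);
    for (unsigned inum = 1; inum < NINODES; inum++) {
        struct toy_dinode *dip = &disk_inodes[inum];
        if (dip->type == T_FREE) {
            dip->type = type;   /* this write is what claims the inode */
            dip->nlink = 0;
            return inum;
        }
    }
    return 0;
}
```

Because the claim is a single write of the type field, whoever holds the block exclusively during the check and the write is guaranteed to be the only claimant.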
Iget (5254) looks through the inode cache for an active entry (ip->ref > 0) with the desired device and inode number. If it finds one, it returns a new reference to that inode (5263-5267). As iget scans, it records the position of the first empty slot (5268-5269), which it uses if it needs to allocate a cache entry.
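A minimal sketch of that lookup, assuming a small fixed-size cache. The names toy_inode, cache, NINODE, and toy_iget are illustrative, and the spinlock that protects the cache in xv6 is omitted, so this version is only safe single-threaded.

```c
#include <assert.h>
#include <stddef.h>

#define NINODE 4   /* hypothetical cache size, kept small for illustration */

struct toy_inode {
    unsigned dev;
    unsigned inum;
    int ref;     /* number of C pointers to this entry; 0 means the slot is empty */
    int valid;   /* nonzero once the inode has been read from disk */
};

static struct toy_inode cache[NINODE];

static struct toy_inode *toy_iget(unsigned dev, unsigned inum)
{
    struct toy_inode *empty = NULL;

    for (struct toy_inode *ip = cache; ip < cache + NINODE; ip++) {
        if (ip->ref > 0 && ip->dev == dev && ip->inum == inum) {
            ip->ref++;      /* active entry found: hand out a new reference */
            return ip;
        }
        if (empty == NULL && ip->ref == 0)
            empty = ip;     /* remember the first empty slot */
    }

    assert(empty != NULL);  /* cache full; this sketch simply aborts */
    empty->dev = dev;
    empty->inum = inum;
    empty->ref = 1;
    empty->valid = 0;       /* contents are loaded later, when the inode is locked */
    return empty;
}
```

Two calls with the same device and inode number return the same pointer with ref raised to 2, which is what lets several holders share one in-memory copy.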
Code must lock the inode using ilock before reading or writing its metadata or content. Ilock (5303) uses a sleep-lock for this purpose. Once ilock has exclusive access to the inode, it reads the inode from disk (more likely, the buffer cache) if needed. The function iunlock (5331) releases the sleep-lock, which may cause any processes sleeping to be woken up.
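The "read from disk if needed" step can be illustrated with a valid flag. This is a single-threaded sketch under stated assumptions: the locked field is only a stand-in for a sleep-lock (a real one would put the caller to sleep rather than assert), and toy_read_from_disk, disk_reads, and the field values it fills in are made up for the example.

```c
#include <assert.h>

struct toy_inode {
    int locked;      /* stand-in for the sleep-lock; single-threaded only */
    int valid;       /* nonzero once the fields below have been loaded */
    short type;
    unsigned size;
};

static int disk_reads;   /* counts simulated disk reads */

/* Pretend to copy the on-disk inode into the in-memory copy. */
static void toy_read_from_disk(struct toy_inode *ip)
{
    disk_reads++;
    ip->type = 2;
    ip->size = 512;
}

static void toy_ilock(struct toy_inode *ip)
{
    assert(!ip->locked);   /* a real sleep-lock would sleep here instead */
    ip->locked = 1;
    if (!ip->valid) {      /* load lazily, on the first lock only */
        toy_read_from_disk(ip);
        ip->valid = 1;
    }
}

static void toy_iunlock(struct toy_inode *ip)
{
    assert(ip->locked);
    ip->locked = 0;        /* a real sleep-lock would also wake up waiters */
}
```

Locking the same inode a second time finds valid already set and skips the read, so the disk is touched at most once per cached entry.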
Iput (5358) releases a C pointer to an inode by decrementing the reference count (5376). If this is the last reference, the inode’s slot in the inode cache is now free and can be re-used for a different inode.
If iput sees that there are no C pointer references to an inode and that the inode has no links to it (occurs in no directory), then the inode and its data blocks must be freed. Iput calls itrunc to truncate the file to zero bytes, freeing the data blocks; sets the inode type to 0 (unallocated); and writes the inode to disk (5366).
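The two cases of iput, a plain decrement versus freeing the inode, can be sketched like this. It is a simplified, lock-free illustration: toy_inode, toy_itrunc, toy_iupdate, and disk_writes are hypothetical stand-ins, and the truncation and disk write are reduced to a field update and a counter.

```c
#include <assert.h>

struct toy_inode {
    int ref;        /* in-memory references (C pointers) */
    int valid;      /* nonzero once loaded from disk */
    short type;     /* 0 means unallocated */
    short nlink;    /* number of directory entries naming this inode */
    unsigned size;
};

static int disk_writes;   /* counts simulated inode writes */

/* Stand-in for truncation; the real operation also frees the data blocks. */
static void toy_itrunc(struct toy_inode *ip) { ip->size = 0; }

/* Stand-in for writing the inode back to disk. */
static void toy_iupdate(struct toy_inode *ip) { (void)ip; disk_writes++; }

static void toy_iput(struct toy_inode *ip)
{
    assert(ip->ref > 0);
    if (ip->valid && ip->nlink == 0 && ip->ref == 1) {
        /* Last reference and no links: free the inode on disk. */
        toy_itrunc(ip);
        ip->type = 0;
        toy_iupdate(ip);
        ip->valid = 0;
    }
    ip->ref--;   /* once this reaches 0 the cache slot can be reused */
}
```

With two references to an unlinked file, the first toy_iput only decrements ref; the second one is the call that truncates the file and marks the inode unallocated.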
The locking protocol in iput in the case in which it frees the inode deserves a closer look. One danger is that a concurrent thread might be waiting in ilock to use this inode (e.g. to read a file or list a directory), and won’t be prepared to find that the inode is no longer allocated. This can’t happen because there is no way for a system call to get a pointer to a cached inode if it has no links to it and ip->ref is one. That one reference is the reference owned by the thread calling iput. It’s true that iput checks that the reference count is one outside of its icache.lock critical section, but at that point the link count is known to be zero, so no thread will try to acquire a new reference.
The other main danger is that a concurrent call to ialloc might choose the same inode that iput is freeing. This can only happen after the iupdate writes the disk so that the inode has type zero. This race is benign; the allocating thread will politely wait to acquire the inode’s sleep-lock before reading or writing the inode, at which point iput is done with it.
iput() can write to the disk. This means that any system call that uses the file system may write the disk, because the system call may be the last one holding a reference to the file. Even calls like read() that appear to be read-only may end up calling iput(). This, in turn, means that even read-only system calls must be wrapped in transactions if they use the file system.
There is a challenging interaction between iput() and crashes. iput() doesn’t truncate a file immediately when the link count for the file drops to zero, because some process might still hold a reference to the inode in memory: a process might still be reading and writing to the file, because it successfully opened it. But, if a crash happens before the last process closes the file descriptor for the file, then the file will be marked allocated on disk but no directory entry points to it.
File systems handle this case in one of two ways. The simple solution is that on recovery, after reboot, the file system scans the whole file system for files that are marked allocated, but have no directory entry pointing to them. If any such file exists, then it can free those files.
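A recovery scan of this kind could look roughly like the sketch below. It is a toy model under several assumptions: all directory entries are flattened into one array, the root inode is skipped because nothing points to it in this model, and the names toy_dinode, toy_dirent, is_referenced, and toy_recover_scan are invented for the example. Freeing is reduced to clearing the type field.

```c
#include <assert.h>
#include <stdbool.h>

#define NINODES 8   /* hypothetical inode table size */
#define NDIRENT 4   /* hypothetical total number of directory entries */

struct toy_dinode { short type; };      /* 0 means free */
struct toy_dirent { unsigned inum; };   /* 0 means empty entry */

static struct toy_dinode inodes[NINODES];
static struct toy_dirent dirents[NDIRENT];   /* every directory entry, flattened */

/* Does any directory entry name this inode? */
static bool is_referenced(unsigned inum)
{
    for (int i = 0; i < NDIRENT; i++)
        if (dirents[i].inum == inum)
            return true;
    return false;
}

/* Free every allocated inode that no directory entry names.
 * Returns the number of inodes freed. */
static int toy_recover_scan(unsigned root_inum)
{
    int freed = 0;
    for (unsigned inum = 1; inum < NINODES; inum++) {
        if (inum == root_inum)
            continue;   /* the root has no parent entry in this toy model */
        if (inodes[inum].type != 0 && !is_referenced(inum)) {
            inodes[inum].type = 0;
            freed++;
        }
    }
    return freed;
}
```

The cost is visible in the loop structure: every inode is checked against every directory entry, which is why this approach requires scanning the whole file system.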
The second solution doesn’t require scanning the file system. In this solution, the file system records on disk (e.g., in the super block) the inode number of a file whose link count drops to zero but whose reference count isn’t zero. If the file system removes the file when its reference count reaches 0, then it updates the on-disk list by removing that inode from the list. On recovery, the file system frees any file in the list.
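The list-based scheme can be sketched with three operations: add on unlink-while-open, remove on normal free, and free-everything on recovery. The fixed-size array, its capacity MAXORPHANS, and the function names orphan_add, orphan_remove, and orphan_recover are assumptions for illustration; the text does not specify a layout, and xv6 itself does not implement this.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXORPHANS 4   /* hypothetical capacity of the on-disk list */
#define NINODES 8      /* hypothetical inode table size */

static unsigned orphans[MAXORPHANS];   /* simulated on-disk list; 0 means empty slot */
static short inode_type[NINODES];      /* simulated on-disk inode types; 0 means free */

/* Record inum when its link count hits zero while it is still open. */
static bool orphan_add(unsigned inum)
{
    assert(inum > 0 && inum < NINODES);
    for (int i = 0; i < MAXORPHANS; i++) {
        if (orphans[i] == 0) {
            orphans[i] = inum;
            return true;
        }
    }
    return false;   /* list full */
}

/* Drop inum from the list once the file has been freed normally. */
static void orphan_remove(unsigned inum)
{
    for (int i = 0; i < MAXORPHANS; i++)
        if (orphans[i] == inum)
            orphans[i] = 0;
}

/* After a crash: free whatever is still on the list. Returns the count freed. */
static int orphan_recover(void)
{
    int freed = 0;
    for (int i = 0; i < MAXORPHANS; i++) {
        if (orphans[i] != 0) {
            inode_type[orphans[i]] = 0;
            orphans[i] = 0;
            freed++;
        }
    }
    return freed;
}
```

Recovery work is proportional to the length of the list rather than to the size of the file system, which is the advantage over a full scan.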
Xv6 implements neither solution, which means that inodes may be marked allocated on disk, even though they are not in use anymore. This means that over time xv6 runs the risk that it may run out of disk space.