Michael

Posted on May 29, 2024 • Edited on Jun 1, 2024

File Descriptors and Redirection in Bash

#bash #redirection #descriptor

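Preface

There is plenty of material online about file descriptors in Linux and redirections in Bash. Some of it restates the documentation, and some goes deep but stays theoretical. This article combines theory with practice and uses a series of examples to examine what a Bash redirection actually does.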

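File Descriptors

Underlying Data Structures

Everything is a file, so file descriptors matter a great deal. Their relationships are usually drawn as a classic three-column diagram.

The diagram shows clearly how file descriptors work underneath. The leftmost column is per-process data: every process has its own file descriptor table (struct fdtable), and each entry holds the close_on_exec flag and a pointer to an OFD (open file description). The second column (the OFD table) and the third column (the inode table) are system-wide. An OFD entry holds the state of one open file, namely its file offset, status flags, and a pointer to the inode; an inode entry holds the file's on-disk information.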

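Every process has its own descriptor table. ls -l /proc/$$/fd lists the descriptors open in the current shell, while ls -l /dev/fd lists those of the ls process itself, which it inherits from the shell. As the listing below shows, standard input, standard output and standard error all point to the current terminal, $(tty):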

lrwx------ 1 username usergroup 64 10月 26 14:01 0 -> /dev/pts/0
lrwx------ 1 username usergroup 64 10月 26 14:01 1 -> /dev/pts/0
lrwx------ 1 username usergroup 64 10月 26 14:01 2 -> /dev/pts/0
lrwx------ 1 username usergroup 64 10月 26 14:01 255 -> /dev/pts/0

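To filter by descriptor number, besides extending the commands above, you can also use something like lsof -P -n -p $$ -a -d 0,1,2,10.

To inspect the open file description behind a descriptor, read its fdinfo entry, for example cat /proc/$$/fdinfo/2: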

pos:    0
flags:  02000002
mnt_id: 26
ino:    3
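
The flags field is an octal bit mask, so it can be decoded with shell arithmetic. The following is a minimal sketch, not a complete decoder: describe_fd is a hypothetical helper name, it assumes Linux with /proc mounted, and the O_APPEND value 02000 is the one used on x86 and ARM (an assumption on other architectures).

```shell
# Hypothetical helper: print the access mode and O_APPEND state of a descriptor.
# Assumes Linux with /proc mounted; O_APPEND = 02000 is an assumption outside x86/ARM.
describe_fd() {
    pid=$1
    fd=$2
    # Extract the octal value from the "flags:" line of fdinfo
    flags=$(awk '$1 == "flags:" { print $2 }' "/proc/$pid/fdinfo/$fd")
    case $(( 0$flags & 3 )) in        # low two bits hold the access mode
        0) mode="read-only" ;;
        1) mode="write-only" ;;
        2) mode="read-write" ;;
        *) mode="unknown" ;;
    esac
    if [ $(( 0$flags & 02000 )) -ne 0 ]; then
        append="append"
    else
        append="no append"
    fi
    echo "fd $fd: $mode, $append"
}

tmp=$(mktemp)
exec 3>>"$tmp"        # >> opens the file write-only with O_APPEND
describe_fd $$ 3
exec 3>&-
```

Only the access mode and the append bit are decoded here; the remaining bits are left uninterpreted because their values vary between architectures.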

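Effects of Various Operations

dup works on descriptors inside a single process: it creates a new descriptor as a copy of an existing one. In the diagram this adds a row to Process A's column, and the old and new file pointers refer to the same OFD entry. For example, fd 1 and fd 20 both point to OFD 23, so they share the same file offset and other attributes.
fork creates a child process, which by default inherits the parent's descriptor table. For example, fd 2 of Process A and fd 2 of Process B both point to OFD 73.
open calls in different processes each create their own OFD entry, but those entries can point to the same inode. For example, fd 0 of Process A and fd 3 of Process B both end up at inode 1976, so they have independent file offsets.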
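
The difference between a duplicated descriptor and a second open can be observed from the shell. This is a minimal sketch under the assumption that descriptors 3, 4 and 5 are free; the file contents and descriptor numbers are chosen only for illustration.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

exec 3<"$tmp"   # open: fd 3 gets its own open file description
exec 4<&3       # dup: fd 4 shares fd 3's description, and so its offset
exec 5<"$tmp"   # second open: separate description, same inode

read -r a <&3   # advances the offset shared by fd 3 and fd 4
read -r b <&4   # continues where fd 3 stopped
read -r c <&5   # starts from the beginning of the file

echo "fd3=$a fd4=$b fd5=$c"   # prints: fd3=one fd4=two fd5=one
exec 3<&- 4<&- 5<&-
```

fd 4 picks up at the second line because it shares an offset with fd 3, while fd 5 has its own offset and reads the first line again.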

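The following example demonstrates dup: open a file, duplicate the descriptor, change the flags and offset through the duplicate, then inspect the original. Because both descriptors share one OFD, the second show(fd1) call on a regular file reports O_APPEND and an offset of 3, even though only fd2 was modified: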

// gcc -Wall -Wextra -pedantic -o example example.c
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>

void show(int fd1)
{
    int flags;
    long int offset = 0;
    flags = fcntl(fd1, F_GETFL);

    if (flags & O_APPEND)
    {
        fprintf(stdout, "%d has O_APPEND\n", fd1);
    }
    else
    {
        fprintf(stdout, "%d doesn't have O_APPEND attribute\n", fd1);
    }

    offset = lseek(fd1, 0, SEEK_CUR);
    if (offset == -1)
    {
        fprintf(stderr, "lseek failed: %s\n", strerror(errno));
    }
    fprintf(stdout, "file offset: %ld\n", offset);
    fprintf(stdout, "------------\n");
}

int main(int argc, char *argv[])
{
    int fd1, fd2;
    int flags;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s file_path\n", argv[0]);
        exit(1);
    }

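    fd1 = open(argv[1], O_RDWR);
    if (fd1 == -1)
    {
        fprintf(stderr, "open failed: %s\n", strerror(errno));
        exit(1);
    }

    /* fd2 shares fd1's open file description: same offset, same status flags */
    fd2 = dup(fd1);
    if (fd2 == -1)
    {
        fprintf(stderr, "dup failed: %s\n", strerror(errno));
        exit(1);
    }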

    printf("fd1: %d, fd2: %d\n", fd1, fd2);

    show(fd1);

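    /* Change the status flags and the offset through fd2 only */
    flags = fcntl(fd2, F_GETFL);
    flags |= O_APPEND;
    if (fcntl(fd2, F_SETFL, flags) == -1)
    {
        fprintf(stderr, "fcntl F_SETFL failed: %s\n", strerror(errno));
        exit(1);
    }

    if (lseek(fd2, 3, SEEK_SET) == -1)
    {
        fprintf(stderr, "lseek set failed: %s\n", strerror(errno));
        exit(1);
    }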

    show(fd1);

    close(fd1);
    close(fd2);
    return 0;
}

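Redirections in Bash

Redirection in Bash manipulates the relationship between files and descriptors. As I see it, it is file manipulation turned into symbols, which is a brilliant design. In a single short command line, the descriptors are set up before the command itself runs.

Order Matters

The official documentation on redirections gives this pair of commands: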

ls > dirlist 2>&1
ls 2>&1 > dirlist

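These two commands show why the order of redirections matters: the second one does not send both 1 and 2 to the file dirlist. The documentation explains why. The first, correct command performs > dirlist first, which points standard output at dirlist, and then points standard error at whatever standard output refers to. Since standard output already refers to dirlist, both 1 and 2 end up in the file. As shown earlier, descriptors 0, 1 and 2 start out as links to the tty. The second command first points 2 at what 1 currently refers to, which is the tty, and only then points 1 at dirlist, so standard error never reaches the file. Let's verify the theory: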

ls -l /dev/fd/ > test1.txt 2>&1
ls -l /dev/fd/ 2>&1 > test2.txt

cat test1.txt
lrwx------ 1 username usergroup 64 10月 26 20:18 0 -> /dev/pts/0
l-wx------ 1 username usergroup 64 10月 26 20:18 1 -> /home/shouhua/test.txt
l-wx------ 1 username usergroup 64 10月 26 20:18 2 -> /home/shouhua/test.txt
lr-x------ 1 username usergroup 64 10月 26 20:18 3 -> /proc/7756/fd

cat test2.txt
lrwx------ 1 username usergroup 64 10月 26 20:18 0 -> /dev/pts/0
l-wx------ 1 username usergroup 64 10月 26 20:18 1 -> /home/shouhua/test.txt
lrwx------ 1 username usergroup 64 10月 26 20:18 2 -> /dev/pts/0
lr-x------ 1 username usergroup 64 10月 26 20:18 3 -> /proc/7759/fd

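Note that the commands above use /dev/fd/ rather than /proc/$$/fd/. On Linux, /dev/fd is a symlink to /proc/self/fd, so it lists the descriptors of the ls process itself, which is where the redirections were applied. $$ expands to the PID of the shell, and a redirection attached to a single command does not change the shell's own descriptors, so listing /proc/$$/fd/ would show no change.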
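
The ordering rule can also be checked without /dev/fd by looking at where each stream's text lands. This is a minimal sketch: emit is a hypothetical helper that writes one line to each stream, and the file names are made up for the illustration.

```shell
# Hypothetical helper: one line to stdout, one line to stderr
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

dir=$(mktemp -d)

# 1 is pointed at the file first, then 2 is pointed at whatever 1 refers to (the file)
emit > "$dir/both.txt" 2>&1

# 2 is pointed at whatever 1 refers to now (the original stdout), then 1 is pointed at the file.
# The outer redirection captures that original stdout so the result can be inspected.
{ emit 2>&1 > "$dir/out_only.txt"; } > "$dir/leaked.txt"

cat "$dir/both.txt"       # both lines
cat "$dir/out_only.txt"   # only "to stdout"
cat "$dir/leaked.txt"     # only "to stderr"
```

In the second form the stderr line escapes to wherever standard output pointed before the command's own redirections were processed.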

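Effect of the File Offset

When working with a file through a descriptor, keep the file offset (pos) in mind. In the example below the ',' is written at offset 5, the sixth byte, which is where the read stopped, so it overwrites the space: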

echo hello world > test.txt
exec 10<> test.txt
read -n 5 -u 10
echo $REPLY
cat /proc/$$/fdinfo/10 # check the pos field in the output
echo -n ',' >&10
cat test.txt
exec 10>&-

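Redirection Operations

The & separates > or < from a descriptor number. Without it the shell treats the word after the operator as a file name, so 1>2 writes to a file called 2 instead of duplicating descriptor 2. Whenever both sides of the operator are numbers, think of &. There is one special case: echo hello >& test.txt, where the target is not a number, sends both standard output and standard error to the file, and it can also be written as echo hello &> test.txt. Descriptors of 10 and above should be used with care, because the shell may use them internally.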
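
The role of & is easy to see by leaving it out. A minimal sketch, run in a scratch directory so that the stray file is easy to spot:

```shell
dir=$(mktemp -d)
(
    cd "$dir" || exit 1
    echo hello 1>2     # no `&`: stdout goes to a regular file named "2"
    echo world 1>&2    # with `&`: stdout is duplicated from descriptor 2 (stderr)
    ls                 # lists the file "2"
    cat 2              # prints: hello
)
```

Only the first echo ends up in the file; the second one is written to standard error.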

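# Basic file operations
echo hello >test.txt
echo hello >>test.txt
echo hello &>test.txt
cat < test.txt
# Here document
cat <<EOF
hello
EOF
# Here string
cat <<< hello
# Open a file for reading and writing (done first so that fd 10 exists below)
exec 10<>test.txt
# Duplicate a file descriptor; note the `&`. Both lines make fd 11 a copy of fd 10.
# They differ only when the number on the left is omitted: `<&` then means fd 0, `>&` means fd 1.
exec 11<&10
exec 11>&10
# Move a file descriptor (the two forms are alternatives, since 10 is closed afterwards)
exec 11>&10- # copy 10 to 11, then close 10
# exec 11<&10-
# Let the shell allocate the descriptor
exec {fd}<>test.txt
echo "$fd"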
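
A descriptor allocated with {fd} can be used and closed through the same variable. This is a minimal sketch that assumes Bash 4.1 or later; the variable name fd and the temporary file are chosen only for illustration.

```shell
tmp=$(mktemp)
exec {fd}<>"$tmp"                   # Bash picks a free descriptor (10 or above) and stores it in $fd
echo "allocated fd: $fd"
echo "written via fd $fd" >&"$fd"   # use the variable wherever a number would go
exec {fd}>&-                        # close it again through the same variable
cat "$tmp"
```

Letting the shell choose the number avoids clashing with descriptors that are already in use.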

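How Redirections Map to File Descriptor Operations

exec 11>&10 duplicates a descriptor and is equivalent to dup. After the copy, both descriptors refer to the same OFD (open file description) entry and therefore share one offset.
When different terminals open the same file under the same descriptor number, the situation is like the third case above: separate OFD entries that point to the same inode table entry.
ls -l /dev/fd/ > test.txt is like the second case above, fork: ls runs in a child process that receives a copy of the parent's descriptor table, which is why test.txt shows the descriptors inherited from the parent.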
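
The fork case can be sketched with a subshell, which is a forked child that inherits the parent's descriptors. The example assumes descriptor 3 is free; the file contents are made up for the illustration.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
exec 3<"$tmp"

# The subshell is a forked child: it inherits fd 3 and shares its offset
( read -r line <&3; echo "child read: $line" )    # prints: child read: one

# The parent continues where the child stopped
read -r line <&3
echo "parent read: $line"                         # prints: parent read: two
exec 3<&-
```

The parent sees the second line because parent and child refer to the same open file description, so the child's read moved the offset for both.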
