
Teppei Fukuda


Determine a File Type of io.Reader

#go

Overview

It isn't easy to determine the file type of an io.Reader because the interface has no Seek() or Peek() method: once we read bytes from it, we can't put them back. For Gzip, I tried calling gzip.NewReader and checking the returned error, but gzip.NewReader consumes at least the 10-byte header while checking it and doesn't put those bytes back. If you then hand the same io.Reader to another NewReader, it will most likely fail because those leading bytes are missing.

I couldn't come up with a smart way myself, but I found one while reading the source code of containers/image.

You can see the actual example here.

Detail

I will show an example for Gzip. The same approach works if you need to determine other file types.

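// isGzip reports whether input starts with a Gzip header and returns a reader
// that still yields the complete stream. Requires the "bytes" and "io" packages.
func isGzip(input io.Reader) (io.Reader, bool, error) {
    buf := [3]byte{}

    // Read exactly len(buf) bytes; shorter input results in an error.
    n, err := io.ReadAtLeast(input, buf[:], len(buf))
    if err != nil {
        return nil, false, err
    }

    // 0x1F 0x8B is the Gzip magic number; 0x08 is the deflate compression method.
    isGzip := buf[0] == 0x1F && buf[1] == 0x8B && buf[2] == 0x08
    // Put the consumed bytes back in front of the rest of the input.
    return io.MultiReader(bytes.NewReader(buf[:n]), input), isGzip, nil
}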

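First, it reads only 3 bytes from the io.Reader. The example uses io.ReadAtLeast, but io.ReadFull works just as well here, because io.ReadFull(r, buf) is equivalent to io.ReadAtLeast(r, buf, len(buf)). Note that both return an error if the input is shorter than 3 bytes.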

Then we check whether those 3 bytes match the Gzip header: the magic number 0x1F 0x8B followed by 0x08, the compression method for deflate. Of course, other file types need their own check.
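For other file types, the check can be a lookup against a table of magic numbers. The following is only a sketch: the signature type, the signatures table, and detectType are names I made up, and the table is far from complete. Note that longer signatures such as PNG need more than 3 bytes, so the buffer you read must be at least as long as the longest signature you want to match.

```go
package main

import (
	"bytes"
	"fmt"
)

// signature pairs a format name with the magic bytes found at the
// start of such a file. The type is hypothetical, for illustration.
type signature struct {
	name  string
	magic []byte
}

// signatures is an illustrative, incomplete table of magic numbers.
var signatures = []signature{
	{"gzip", []byte{0x1F, 0x8B}},
	{"bzip2", []byte("BZh")},
	{"xz", []byte{0xFD, '7', 'z', 'X', 'Z', 0x00}},
	{"zstd", []byte{0x28, 0xB5, 0x2F, 0xFD}},
	{"zip", []byte("PK\x03\x04")},
	{"png", []byte("\x89PNG\r\n\x1a\n")},
}

// detectType returns the name of the first signature that header
// starts with, or "unknown" if none matches.
func detectType(header []byte) string {
	for _, s := range signatures {
		if bytes.HasPrefix(header, s.magic) {
			return s.name
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(detectType([]byte{0x1F, 0x8B, 0x08, 0x00})) // gzip
	fmt.Println(detectType([]byte("PK\x03\x04rest")))       // zip
	fmt.Println(detectType([]byte("plain text")))           // unknown
}
```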

Finally, we combine buf[:n] and input with io.MultiReader. The resulting io.Reader first returns the 3 bytes from the bytes.NewReader and then the remaining bytes from the original input, so it yields the same byte stream as the original io.Reader.

I wrote an example using the above technique.
https://play.golang.org/p/VH19FHQnqdr

Caution

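If you plan to pass the result to io.Copy, this approach may not be ideal for performance. When this was written, the reader returned by io.MultiReader did not implement WriteTo, which io.Copy prefers when the source provides it; newer Go releases may differ, so check the io package for your version. In that case, you can wrap the input in a bufio.Reader instead: its Peek method looks at the leading bytes without consuming them, and it implements WriteTo.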

Summary

You can use io.MultiReader when you need to peek at the first few bytes of an io.Reader without losing them.
