Wednesday, March 9, 2011

GZip pushback input stream

When dealing w/ Java's GZIPInputStream, it has an annoying habit of eating bits past the end of the gzip trailer. Normally when reading a file, this isn't a problem, but when reading concatenated GZIP'd files or gzip data embedded in other inputstreams, you would really like to have those overread bits back. Using this inputstream, you can.
public class GzipPushbackStream extends GZIPInputStream {

    private boolean pushed = false;
    private static final int TRAILER_LEN = 8; // 8 byte tailer

    public GzipPushbackStream(PushbackInputStream pis) throws IOException {
        super(pis);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int read = super.read(buf, off, len);
        if (eos && !pushed) {
            int n = inf.getRemaining();
            if (n > TRAILER_LEN) {
                int offset = super.len - (n -TRAILER_LEN);
                ((PushbackInputStream) in).unread(super.buf, offset, n - TRAILER_LEN);
            }
            pushed = true;
        }
        return read;
    }
}