Compression Streams

Draft Community Group Report

This version:
https://ricea.github.io/compression/
Issue Tracking:
GitHub
Editors:
Canon Mukai (Google)
Adam Rice (Google)

Abstract

This document defines a set of JavaScript APIs to compress and decompress streams of binary data.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This section is non-normative.

The APIs defined in this specification compress and decompress streams of data. They support the "deflate" and "gzip" compression formats, both of which are widely used by web developers.

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MUST and SHOULD are to be interpreted as described in [RFC2119].

This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

Implementations that use ECMAScript to implement the APIs defined in this specification MUST implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL-1], as this specification uses that specification and terminology.

3. Terminology

A chunk is a piece of data. In the case of CompressionStream and DecompressionStream, the output chunk type is Uint8Array. They accept any BufferSource type as input.

A stream represents an ordered sequence of chunks. The terms ReadableStream and WritableStream are defined in [WHATWG-STREAMS].

Deflate is a compression format defined in [RFC1950]. It is referred to there as "ZLIB", but in this specification we call it "deflate" to match HTTP (see [RFC7230] section 4.2.2). Gzip is another compression format, defined in [RFC1952], that is also based on the deflate algorithm.
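
As a non-normative illustration, the two formats can be told apart by their first two bytes. The sketch below assumes an environment that exposes CompressionStream as a global. The helper names readAll, compressBytes and describeHeader are invented for this example and are not part of the API. The checks rely only on the header layouts given in [RFC1950] and [RFC1952].

```javascript
// Drain a ReadableStream of Uint8Array chunks into a single Uint8Array.
async function readAll(readable) {
  const reader = readable.getReader();
  const pieces = [];
  let total = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    pieces.push(value);
    total += value.byteLength;
  }
  const output = new Uint8Array(total);
  let offset = 0;
  for (const piece of pieces) {
    output.set(piece, offset);
    offset += piece.byteLength;
  }
  return output;
}

// Compress a Uint8Array with the given format and return all output bytes.
async function compressBytes(format, input) {
  const cs = new CompressionStream(format);
  const writer = cs.writable.getWriter();
  // Writing and reading run concurrently so backpressure cannot stall either side.
  const writing = writer.write(input).then(() => writer.close());
  const [output] = await Promise.all([readAll(cs.readable), writing]);
  return output;
}

// Classify compressed data by its first two bytes.
function describeHeader(bytes) {
  // RFC 1952: a gzip member starts with the magic bytes 0x1f 0x8b.
  if (bytes[0] === 0x1f && bytes[1] === 0x8b) return 'gzip';
  // RFC 1950: the low nibble of CMF is 8 (deflate) and CMF * 256 + FLG
  // is a multiple of 31.
  const looksLikeZlib =
    (bytes[0] & 0x0f) === 8 && ((bytes[0] << 8) | bytes[1]) % 31 === 0;
  return looksLikeZlib ? 'deflate' : 'unknown';
}
```

Calling describeHeader on the result of compressBytes('gzip', data) is expected to report 'gzip', and on the result of compressBytes('deflate', data) to report 'deflate'.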

4. Interface Mixin GenericTransformStream

The GenericTransformStream interface mixin represents the concept of a transform stream in IDL. It is not a TransformStream, though it has the same interface and it delegates to one.

interface mixin GenericTransformStream {
  readonly attribute ReadableStream readable;
  readonly attribute WritableStream writable;
};

An object that includes GenericTransformStream has an associated transform of type TransformStream.

4.1. Attributes

readable, of type ReadableStream, readonly

The readable attribute’s getter, when invoked, must return this object’s transform.[[readable]].

writable, of type WritableStream, readonly

The writable attribute’s getter, when invoked, must return this object’s transform.[[writable]].
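
The delegation pattern described above can be sketched in script. This is a non-normative illustration: UppercaseTextStream and shout are hypothetical names, and the class is not a TransformStream itself. It only exposes the readable and writable ends of a private one, which is all that pipeThrough() requires.

```javascript
// Hypothetical class, not part of this specification. It has the shape of the
// GenericTransformStream mixin and delegates to a private TransformStream.
class UppercaseTextStream {
  #transform;

  constructor() {
    this.#transform = new TransformStream({
      transform(chunk, controller) {
        controller.enqueue(String(chunk).toUpperCase());
      },
    });
  }

  // Both getters hand out the two ends of the underlying transform.
  get readable() {
    return this.#transform.readable;
  }

  get writable() {
    return this.#transform.writable;
  }
}

// pipeThrough() only needs an object with readable and writable properties.
async function shout(text) {
  const source = new ReadableStream({
    start(controller) {
      controller.enqueue(text);
      controller.close();
    },
  });
  const reader = source.pipeThrough(new UppercaseTextStream()).getReader();
  const { value } = await reader.read();
  return value;
}
```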

5. Interface CompressionStream

[Exposed=(Window,Worker)]
interface CompressionStream {
  constructor(DOMString format);
};
CompressionStream includes GenericTransformStream;

The CompressionStream(format) constructor, when invoked, must run these steps:

  1. If format is unsupported in CompressionStream, then throw a TypeError.

  2. Let cs be a new CompressionStream object.

  3. Set cs's format to format.

  4. Let startAlgorithm be an algorithm that takes no arguments and returns nothing.

  5. Let transformAlgorithm be an algorithm which takes a chunk argument and runs the compress and enqueue a chunk algorithm with cs and chunk.

  6. Let flushAlgorithm be an algorithm which takes no argument and runs the compress flush and enqueue algorithm with cs.

  7. Let transform be the result of calling CreateTransformStream(startAlgorithm, transformAlgorithm, flushAlgorithm).

  8. Set cs's transform to transform.

  9. Return cs.
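
Because step 1 throws a TypeError for an unsupported format, script can probe for support by attempting construction. The following non-normative sketch assumes CompressionStream is available as a global. The function name isCompressionFormatSupported is invented for this example.

```javascript
// Returns true if `new CompressionStream(format)` succeeds and false if the
// constructor rejects the format with a TypeError (step 1 above).
function isCompressionFormatSupported(format) {
  try {
    new CompressionStream(format);
    return true;
  } catch (error) {
    if (error instanceof TypeError) return false;
    throw error; // Anything else is unexpected, so it is not hidden.
  }
}
```

A caller might use this to fall back to sending data uncompressed when a format is not available.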

The compress and enqueue a chunk algorithm, given a CompressionStream object cs and a chunk, runs these steps:

  1. If chunk is not a BufferSource type, then throw a TypeError.

  2. Let buffer be the result of compressing chunk with cs's format. If this throws an exception, then return a promise rejected with that exception.

  3. Let controller be cs's transform.[[TransformStreamController]].

  4. If buffer is empty, return a new promise resolved with undefined.

  5. Split buffer into one or more non-empty pieces and convert them into Uint8Arrays.

  6. For each Uint8Array array, call TransformStreamDefaultControllerEnqueue(controller, array).

  7. Return a new promise resolved with undefined.
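
The steps above can be approximated in script with an ordinary TransformStream. This is a non-normative sketch and not how a user agent is required to implement the algorithm. The encode parameter is a hypothetical stand-in for "compressing chunk with cs's format", and toBytes, splitIntoPieces, createChunkTransform and the pieceSize default are likewise assumptions made for illustration.

```javascript
// Step 1: accept only BufferSource chunks and view them as bytes.
function toBytes(chunk) {
  if (chunk instanceof ArrayBuffer) return new Uint8Array(chunk);
  if (ArrayBuffer.isView(chunk)) {
    return new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  }
  throw new TypeError('chunk is not a BufferSource');
}

// Step 5: split a buffer into non-empty Uint8Array pieces of at most
// pieceSize bytes. An empty buffer yields no pieces, which covers step 4.
function splitIntoPieces(buffer, pieceSize) {
  const pieces = [];
  for (let start = 0; start < buffer.byteLength; start += pieceSize) {
    pieces.push(buffer.slice(start, start + pieceSize));
  }
  return pieces;
}

// `encode` is a hypothetical function from Uint8Array to Uint8Array.
function createChunkTransform(encode, pieceSize = 16384) {
  return new TransformStream({
    transform(chunk, controller) {
      const buffer = encode(toBytes(chunk)); // steps 1 and 2
      for (const piece of splitIntoPieces(buffer, pieceSize)) {
        controller.enqueue(piece); // step 6
      }
    },
  });
}
```

With an identity encode function and a pieceSize of 4, a single 10-byte chunk is enqueued as pieces of 4, 4 and 2 bytes. An exception thrown inside transform() errors the stream, so a write of a non-BufferSource chunk is rejected.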

The compress flush and enqueue algorithm, which handles the end of data from the input ReadableStream object, given a CompressionStream object cs, runs these steps:

  1. Let buffer be the result of compressing an empty input with cs's format, with the finish flag.

  2. Let controller be cs's transform.[[TransformStreamController]].

  3. If buffer is empty, return a new promise resolved with undefined.

  4. Split buffer into one or more non-empty pieces and convert them into Uint8Arrays.

  5. For each Uint8Array array, call TransformStreamDefaultControllerEnqueue(controller, array).

  6. Return a new promise resolved with undefined.
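
One observable consequence of the flush algorithm is that closing the writable side can produce output even when no chunks were written. A gzip stream with no payload still carries a header and a trailer ([RFC1952]). The non-normative sketch below assumes CompressionStream is available as a global. The function name compressNothing is invented for this example.

```javascript
// Close the writable side without writing anything, then collect every
// chunk that appears on the readable side.
async function compressNothing(format) {
  const cs = new CompressionStream(format);
  const writer = cs.writable.getWriter();
  const closing = writer.close(); // Triggers the flush algorithm.
  const reader = cs.readable.getReader();
  const pieces = [];
  const reading = (async () => {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      pieces.push(value);
    }
  })();
  await Promise.all([reading, closing]);
  return pieces;
}
```

The returned array is expected to hold at least one non-empty Uint8Array for both 'gzip' and 'deflate', since each format frames even an empty payload.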

6. Interface DecompressionStream

[Exposed=(Window,Worker)]
interface DecompressionStream {
  constructor(DOMString format);
};
DecompressionStream includes GenericTransformStream;

The DecompressionStream(format) constructor, when invoked, must run these steps:

  1. If format is unsupported in DecompressionStream, then throw a TypeError.

  2. Let ds be a new DecompressionStream object.

  3. Set ds's format to format.

  4. Let startAlgorithm be an algorithm that takes no arguments and returns nothing.

  5. Let transformAlgorithm be an algorithm which takes a chunk argument and runs the decompress and enqueue a chunk algorithm with ds and chunk.

  6. Let flushAlgorithm be an algorithm which takes no argument and runs the decompress flush and enqueue algorithm with ds.

  7. Let transform be the result of calling CreateTransformStream(startAlgorithm, transformAlgorithm, flushAlgorithm).

  8. Set ds's transform to transform.

  9. Return ds.

The decompress and enqueue a chunk algorithm, given a DecompressionStream object ds and a chunk, runs these steps:

  1. If chunk is not a BufferSource type, then throw a TypeError.

  2. Let buffer be the result of decompressing chunk with ds's format. If this throws an exception, then return a promise rejected with that exception.

  3. Let controller be ds's transform.[[TransformStreamController]].

  4. If buffer is empty, return a new promise resolved with undefined.

  5. Split buffer into one or more non-empty pieces and convert them into Uint8Arrays.

  6. For each Uint8Array array, call TransformStreamDefaultControllerEnqueue(controller, array).

  7. Return a new promise resolved with undefined.
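
Because an individual chunk may decompress to an empty buffer (step 4), input chunk boundaries do not need to line up with anything in the compressed format. The non-normative sketch below compresses some bytes and then feeds the result to a DecompressionStream one byte per chunk. It assumes both interfaces are available as globals. The helper names readAll, runThrough and roundTripByteByByte are invented for this example.

```javascript
// Drain a ReadableStream of Uint8Array chunks into a single Uint8Array.
async function readAll(readable) {
  const reader = readable.getReader();
  const pieces = [];
  let total = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    pieces.push(value);
    total += value.byteLength;
  }
  const output = new Uint8Array(total);
  let offset = 0;
  for (const piece of pieces) {
    output.set(piece, offset);
    offset += piece.byteLength;
  }
  return output;
}

// Write every chunk into a CompressionStream or DecompressionStream while
// reading its output concurrently, then return the collected bytes.
async function runThrough(stream, chunks) {
  const writer = stream.writable.getWriter();
  const writing = (async () => {
    for (const chunk of chunks) await writer.write(chunk);
    await writer.close();
  })();
  const [output] = await Promise.all([readAll(stream.readable), writing]);
  return output;
}

// Compress `input`, then decompress the result one byte per chunk.
async function roundTripByteByByte(input) {
  const compressed = await runThrough(new CompressionStream('gzip'), [input]);
  const singleBytes = Array.from(compressed, (byte) => Uint8Array.of(byte));
  return runThrough(new DecompressionStream('gzip'), singleBytes);
}
```

The bytes returned by roundTripByteByByte are expected to equal the original input, however the compressed data was split.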

The decompress flush and enqueue algorithm, which handles the end of data from the input ReadableStream object, given a DecompressionStream object ds, runs these steps:

  1. Let buffer be the result of decompressing an empty input with ds's format, with the finish flag.

  2. Let controller be ds's transform.[[TransformStreamController]].

  3. If buffer is empty, return a new promise resolved with undefined.

  4. Split buffer into one or more non-empty pieces and convert them into Uint8Arrays.

  5. For each Uint8Array array, call TransformStreamDefaultControllerEnqueue(controller, array).

  6. Return a new promise resolved with undefined.

7. Privacy and Security Considerations

The API doesn’t add any new privileges to the web platform.

However, web developers should take care in situations where an attacker can observe the length of compressed data. Because the compressed length depends on the contents of the input, an attacker who can observe it may be able to guess those contents.
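
The following non-normative sketch illustrates why length matters. It places an attacker-chosen guess next to a secret in the same compressed message and compares the output sizes. The scenario, the message layout and the function names compressedLength and compareGuesses are assumptions made for illustration only. The sketch assumes CompressionStream is available as a global.

```javascript
// Return the number of bytes produced by deflate-compressing `text`.
async function compressedLength(text) {
  const cs = new CompressionStream('deflate');
  const writer = cs.writable.getWriter();
  const bytes = new TextEncoder().encode(text);
  const writing = writer.write(bytes).then(() => writer.close());
  const reader = cs.readable.getReader();
  let total = 0;
  const counting = (async () => {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      total += value.byteLength;
    }
  })();
  await Promise.all([counting, writing]);
  return total;
}

// Hypothetical scenario: a secret and an attacker-supplied guess are
// compressed together. A guess that repeats the secret can be encoded as a
// back-reference, so the output is shorter than for an unrelated guess.
async function compareGuesses(secret, rightGuess, wrongGuess) {
  const prefix = `token=${secret};note=`;
  return {
    right: await compressedLength(prefix + rightGuess),
    wrong: await compressedLength(prefix + wrongGuess),
  };
}
```

When the right guess equals the secret and the wrong guess is an unrelated string of the same length, the right field is expected to be smaller than the wrong field. Keeping secrets and attacker-influenced data out of the same compressed stream avoids exposing that difference.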

8. Examples

8.1. Gzip-compress a stream

const compressedReadableStream
    = inputReadableStream.pipeThrough(new CompressionStream('gzip'));

8.2. Deflate-compress an ArrayBuffer to a Uint8Array

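async function compressArrayBuffer(input) {
  const cs = new CompressionStream('deflate');
  const writer = cs.writable.getWriter();
  // The write and close are deliberately not awaited: their promises may not
  // settle until the readable side is read below.
  writer.write(input);
  writer.close();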
  const out = [];
  const reader = cs.readable.getReader();
  let totalSize = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done)
      break;
    out.push(value);
    totalSize += value.byteLength;
  }
  const concatenated = new Uint8Array(totalSize);
  let offset = 0;
  for (const array of out) {
    concatenated.set(array, offset);
    offset += array.byteLength;
  }
  return concatenated;
}

8.3. Gzip-decompress a Blob to Blob

async function DecompressBlob(blob) {
  const ds = new DecompressionStream('gzip');
  const decompressedStream = blob.stream().pipeThrough(ds);
  return await new Response(decompressedStream).blob();
}

9. Acknowledgments

The editors wish to thank Domenic Denicola and Yutaka Hirano for their support.

References

Normative References

[RFC1950]
P. Deutsch; J-L. Gailly. ZLIB Compressed Data Format Specification version 3.3. May 1996. Informational. URL: https://tools.ietf.org/html/rfc1950
[RFC1952]
P. Deutsch. GZIP file format specification version 4.3. May 1996. Informational. URL: https://tools.ietf.org/html/rfc1952
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[WebIDL]
Boris Zbarsky. Web IDL. 15 December 2016. ED. URL: https://heycam.github.io/webidl/
[WEBIDL-1]
Cameron McCormack. WebIDL Level 1. 15 December 2016. REC. URL: https://www.w3.org/TR/2016/REC-WebIDL-1-20161215/
[WHATWG-STREAMS]
Adam Rice; Domenic Denicola; 吉野剛史 (Takeshi Yoshino). Streams Standard. Living Standard. URL: https://streams.spec.whatwg.org/

Informative References

[RFC7230]
R. Fielding, Ed.; J. Reschke, Ed.. Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. June 2014. Proposed Standard. URL: https://httpwg.org/specs/rfc7230.html

IDL Index

interface mixin GenericTransformStream {
  readonly attribute ReadableStream readable;
  readonly attribute WritableStream writable;
};

[Exposed=(Window,Worker)]
interface CompressionStream {
  constructor(DOMString format);
};
CompressionStream includes GenericTransformStream;

[Exposed=(Window,Worker)]
interface DecompressionStream {
  constructor(DOMString format);
};
DecompressionStream includes GenericTransformStream;