Network Working Group M. Speer Request for Comment: 2029 D. Hoffman Category: Standards Track Sun Microsystems, Inc. October 1996
RTP Payload Format of Sun's CellB Video Encoding
Status of this Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Abstract
This memo describes a packetization scheme for the CellB video encoding. The scheme proposed allows applications to transport CellB video flows over protocols used by RTP. This document is meant for implementors of video applications that want to use RTP and CellB.
The Cell image compression algorithm is a variable bit-rate video coding scheme. It provides "high" quality, low bit-rate image compression at low computational cost. The bytestream that is produced by the Cell encoder consists of instructional codes and information about the compressed image.
For futher information on Cell compression technology, refer to [1]. Currently, there are two versions of the Cell compression technology: CellA and CellB. CellA is primarily designed for the encoding of stored video intended for local display, and will not be discussed in this memo.
CellB, derived from CellA, has been optimized for network-based video applications. It is computationally symmetric in both encode and decode. CellB utilizes a fixed colormap and vector quantization techniques in the YUV color space to achieve compression.
The RTP timestamp is in units of 90KHz. The same timestamp value is used for all packet payloads of a frame. The RTP maker bit denotes the end of a frame.
The packetization of the CellB bytestream is designed to make the resulting packet stream robust to packet loss. To achieve this goal, an additional header is added to each RTP packet to uniquely identify the location of the first cell of the packet within the current frame. In addition, the width and height of the frame in pixels is carried in each CellB packet header. Although the size can only change between frames, it is carried in every packet to simplify the packet encoding.
All fields are 16-bit unsigned integers in network byte order, and are placed at the beginning of the payload for each RTP packet. The Cell X and the Cell Y Location coordinates are expressed as cell coordinates, not pixel coordinates. Since cells represent 4x4 blocks of pixels, the X or Y dimension of the cell coordinates range in value from 0 through 1/4 of the of the same dimension in pixel coordinates.
A packet can be of any size chosen by the implementor, up to a full frame. All multi-byte codes must be completely contained within a packet. In general, the implementor should avoid packet sizes that result in fragmentation by the network.
Cell codes are 4 bytes in length, and describe a 4x4 pixel cell. There are two possible luminance (Y) levels for each cell, but only one pair of chrominance (UV) values. The CellB cell code is shown below:
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0 M M M M M M M M M M M M M M M|U V U V U V U V|Y Y Y Y Y Y Y Y| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4x4 Bitmask U/V Code Y/Y Code
The first two bytes of the cell code are a bitmask. Each bit in the mask represents a pixel in a 16-pixel cell. Bit 0 represents the value of the upper right-hand pixel of the cell, and subsequent bits represent the pixels in row-major order. If a pixel's bit is set in the 4x4 Bitmask, then the pixel will be rendered with the pixel value <Y(1), U, V>. If the pixel's bit is not set in the bitmask, then the pixel's value will be rendered with the value <Y(0), U, V>. The bitmask for the cell is normalized so that the most significant bit is always 0 (i.e., corresponding to <Y(0), U, V>).
The U/V field of the cell code represents the chrominance component. This code is in index into a table of vectors that represents two independent components of chrominance.
The Y/Y field of the cell code represents two luminance values (Y(0) and Y(1)). This code is an index into a table of two-compoment luminance vectors.
The derivation of the U/V and Y/Y tables is outside the scope of this memo. A complete discussion can be found in [1].
The single byte CellB skip code tells the CellB decoder to skip the next S+1 cells in the current video frame being decoded. The maximum number of cells that can be skipped is 32. The format of the skip code is shown below.
0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |1 0 0 S S S S S| +-+-+-+-+-+-+-+-+
The single byte "new Y/Y table" code is used to tell the decoder that the next 512 bytes are a new Y/Y quantization table. The code and the representation of the table are shown below. The sample encoder/decoder pair in this document do not implement this feature of the CellB compression. However, future CellB codecs may implement this feature.
The single byte "new U/V table" code is used to tell the decoder that the next 512 bytes represent a new U/V quantization table. The code is shown below. The sample encoder/decoder pair provided in this document do not implement this feature of the CellB compression. However, future CellB codecs may implement this feature.