Video Compression

VideoNerd

How to Encode/Decode/Transcode HEVC/H.264 with Intel Quick Sync Video HW (QSV) via ffmpeg

Contents: Encode HEVC/H.265 stream; Encode H264/AVC; Decode HEVC and H.264 streams; Transcode from H.264/AVC to HEVC/H.265; Appendix: What is the gain of HEVC over H.264/AVC? First of all, please check that your Intel GPU supports HEVC/H.265 encoding and decoding by visiting the following site: https://software.intel.com/en-us/articles/encode-and-decode-capabilities-for-7th-generation-intel-core-processors-and-newer Check whether your ffmpeg supports QSV with ‘-hwaccels’ […]

High-Level HEVC Parser (Python Script)

The script ParseHevcHighLevelSyntax.py parses HEVC high-level syntax (SPS, PPS and slice headers) and prints the syntax elements as plain text. The script is adapted to work in real time with an input fifo and with a latency of one frame duration (therefore the last frame is not processed). Two versions of the script are supplied: a) adapted […]

Start Code Emulation Prevention in H264 and HEVC

AVC/H.264 and HEVC/H.265 elementary video streams contain a particular bit pattern (called a start code), 0x000001, which delimits NALUs (e.g. frames or slices). The problem is that neither AVC/H.264 nor HEVC/H.265 coding by itself guarantees that the start code is not emulated within NAL data. Let’s imagine a decoder looks for the start code to […]
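The usual remedy is the emulation prevention byte: whenever two zero bytes would be followed by a byte in the range 0x00..0x03, the encoder inserts 0x03 between them, and the decoder strips it again. The sketch below illustrates this on raw byte strings. The function names are hypothetical, and the special case of a payload that ends in a zero byte is deliberately not handled.

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Insert 0x03 after two zero bytes when the next byte is <= 0x03."""
    out = bytearray()
    zeros = 0  # consecutive 0x00 bytes written so far
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)  # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


def remove_emulation_prevention(payload: bytes) -> bytes:
    """Drop every 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # skip the emulation prevention byte
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


raw = b"\x25\x00\x00\x01\x80"           # payload that emulates a start code
escaped = add_emulation_prevention(raw)  # 25 00 00 03 01 80
restored = remove_emulation_prevention(escaped)
```

After escaping, the pattern 0x000001 no longer occurs inside the payload, so a decoder that scans for start codes cannot be fooled by it.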

Picture Width and Height Shall Be a Multiple of 8

The HEVC/H.265 standard requires that the picture luma dimensions (both width and height) be multiples of MinCbSizeY (this parameter is configurable, but its typical value is 8). The standard says: pic_height_in_luma_samples and pic_width_in_luma_samples “shall be an integer multiple of MinCbSizeY”, where ‘MinCbSizeY’ is derived from the SPS parameter ‘log2_min_luma_coding_block_size_minus3’. What to […]

What’s max_bytes_per_pic_denom and how does it impact frame sizes?

The purpose of the parameter ‘max_bytes_per_pic_denom’ (which is present in the SPS VUI) is to signal the maximal frame size to a decoder. The decoder can use this information to allocate its input buffer without the risk of overflowing it. If the parameter max_bytes_per_pic_denom is not present, its default value is 2; if […]
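To give a feeling for the numbers, here is a minimal sketch of the bound as it is defined for H.264 (Annex E): a coded picture shall not exceed (PicSizeInMbs * RawMbBits) / (8 * max_bytes_per_pic_denom) bytes, and a value of 0 means no limit is indicated. The sketch assumes 4:2:0 chroma and 8-bit samples; the function names are hypothetical, and HEVC uses an analogous formula based on minimum coding blocks rather than macroblocks.

```python
def raw_mb_bits(bit_depth_luma: int = 8, bit_depth_chroma: int = 8,
                mb_width_c: int = 8, mb_height_c: int = 8) -> int:
    """Bits of one uncompressed macroblock (defaults assume 4:2:0, 8 bit)."""
    return 256 * bit_depth_luma + 2 * mb_width_c * mb_height_c * bit_depth_chroma


def max_bytes_per_picture(width: int, height: int,
                          max_bytes_per_pic_denom: int = 2):
    """Upper bound on the coded picture size in bytes, or None if unlimited."""
    if max_bytes_per_pic_denom == 0:
        return None  # no limit is indicated
    pic_size_in_mbs = ((width + 15) // 16) * ((height + 15) // 16)
    return pic_size_in_mbs * raw_mb_bits() // (8 * max_bytes_per_pic_denom)


limit_default = max_bytes_per_picture(1920, 1080)     # denominator 2
limit_tight = max_bytes_per_picture(1920, 1080, 4)    # denominator 4
```

Under these assumptions a 1920x1080 picture is bounded by 1566720 bytes with the default denominator of 2, and by half of that with a denominator of 4.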

Cons and Pros of Successive Non-Reference B-frames

The GOP structure with two consecutive B-frames that are not used for reference (IPbbPbbPbb…) is widely used in various applications. In the figure above lines denote “referring to”. What are the pros and cons of such GOPs? Cons: 1) Two successive P-frames are separated by two B-frames. Consequently, coding efficiency deteriorates. 2) One of the references of each […]
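The relation between display order and coding order in such a GOP can be sketched as follows. The sketch assumes that the pattern is given in display order (I b b P b b P) and that every run of b-frames is coded right after the reference frame that follows it in display order; the function name is hypothetical.

```python
def decode_order(display_types: str) -> list:
    """Map frame types in display order to (display_index, type) in coding order."""
    order = []
    pending_b = []  # b-frames waiting for their future reference frame
    for idx, frame_type in enumerate(display_types):
        if frame_type == "b":
            pending_b.append((idx, frame_type))
        else:
            # The reference frame is coded first, then the b-frames before it.
            order.append((idx, frame_type))
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing b-frames with no future reference
    return order


coding = decode_order("IbbPbbP")
pattern = "".join(frame_type for _, frame_type in coding)
```

For the display sequence IbbPbbP this yields the coding pattern IPbbPbb, with each P-frame coded two positions earlier than it is displayed.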

Sliced Streams and 360 (3DoF) Video

This page was inspired by “Tiling of Panorama Video for Interactive Virtual Cameras: Overheads and Potential Bandwidth Requirement Reduction”, by Vamsidhar Reddy Gaddam et al. Delivering the entire 360-degree video (at least 8K) to the end user or to the client device (e.g. to a head-mounted device) is infeasible due to the huge bandwidth requirement and even […]

Openh264 Codec (Free SW, BSD Licence)

Openh264 is a free software codec (encoder and decoder); for details see www.openh264.org. Installation of Openh264 on macOS: 1. Clone the Openh264 project (the openh264 folder is created automatically to hold the sources): git clone https://github.com/cisco/openh264.git 2. Enter the openh264 folder. 3. Run mktargets.py to change the Makefile to generate the ‘h264dec’ binary (i.e. the decoder): […]

HEVC CRA and RASL Frames: Definitions and Relationship

Terminology: CRA (Clean Random Access) is a compromise between open-GOP coding efficiency (note that closed-GOP coding is less efficient due to temporal prediction discontinuities) and random access ability (sometimes ‘seekability’ is used instead of ‘random access ability’). Leading pictures: pictures that follow the CRA picture in decoding order but precede it in presentation order. Leading pictures are divided into […]

On QP-Modulation (Adaptive Quantization)

QP-Modulation (or Adaptive Quantization) methods are based on the well-known empirical fact that the human visual system (HVS) is less sensitive to distortions in high-complexity areas than in homogeneous regions. Moreover, high complexity often corresponds to fast motion, so a decrease in quality will be less noticeable. By the way, this property has […]
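A minimal sketch of the idea, not the formula of any particular encoder: each block gets a QP offset proportional to how far its log-variance lies from the picture average, so flat blocks receive a lower QP and textured blocks a higher one. The function names and the `strength` parameter are hypothetical, and variance is used here only as a simple stand-in for block complexity.

```python
import math


def block_variance(block):
    """Variance of the luma samples of one block (given as a flat list)."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n


def qp_offsets(blocks, strength=1.0):
    """Per-block QP offsets: negative for flat blocks, positive for busy ones."""
    energies = [math.log2(block_variance(b) + 1.0) for b in blocks]
    average = sum(energies) / len(energies)
    return [strength * (e - average) for e in energies]


flat = [128] * 64          # homogeneous 8x8 block
textured = [0, 255] * 32   # high-complexity 8x8 block
offsets = qp_offsets([flat, textured])
```

The offsets sum to zero across the picture, so in this sketch the modulation redistributes quantization between blocks around the base QP instead of shifting it.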