Video Compression

VideoNerd

This page was inspired by “Tiling of Panorama Video for Interactive Virtual Cameras: Overheads and Potential Bandwidth Requirement Reduction”, by Vamsidhar Reddy Gaddam et al.

Delivering an entire 360-degree video (at least 8K) to the end user or to the client device (e.g. a head-mounted device) is infeasible because of the huge bandwidth requirement, and even if bandwidth is not an issue, the decoder probably cannot keep up with real-time decoding. On the other hand, the client usually does not need the whole picture: at any moment a viewer of 360 video sees only a small spatial portion of the entire coded video. So only that small part (the viewport) needs to be delivered.

Idea: divide the 360 video uniformly into sub-pictures (slices or tiles; tiles are relevant for HEVC/H.265), and make sure that deblocking is off across slice/tile boundaries and that motion vectors do not cross slice/tile boundaries (this mode is called constrained motion). In that case each slice/tile is completely self-contained, and the client receives only the slices/tiles relevant to the selected viewport:

[Figure: Tiled Streams – HEVC/H.265]

[Figure: Sliced Streams – AVC/H.264]
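The selection of tiles for a viewport can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the panorama is split into a uniform grid whose dimensions divide the picture size evenly, the viewport is given as a pixel rectangle, and the function name and parameters are hypothetical (they do not come from the paper or from any library).

```python
def tiles_for_viewport(pano_w, pano_h, cols, rows, vp_x, vp_y, vp_w, vp_h):
    """Return sorted (row, col) indices of the tiles that overlap the viewport.

    (vp_x, vp_y) is the top-left corner of the viewport in panorama pixels.
    The viewport may wrap around horizontally, because a 360-degree
    panorama is cyclic along the x axis.
    """
    tile_w = pano_w // cols   # assumes pano_w is a multiple of cols
    tile_h = pano_h // rows   # assumes pano_h is a multiple of rows

    first_col = vp_x // tile_w
    last_col = (vp_x + vp_w - 1) // tile_w          # may exceed cols - 1 when wrapping
    first_row = max(0, vp_y // tile_h)
    last_row = min(rows - 1, (vp_y + vp_h - 1) // tile_h)

    selected = set()
    for r in range(first_row, last_row + 1):
        for c in range(first_col, last_col + 1):
            selected.add((r, c % cols))             # wrap the column index
    return sorted(selected)
```

For an assumed 7680x3840 panorama split into an 8x4 grid, a 1920x1080 viewport whose top-left corner is at (7000, 1000) selects 6 of the 32 tiles, so only those six tile streams would have to be sent.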

 

However, we are faced with the following problem:

When the user turns his or her head and changes the viewport, the server sends a new tile stream covering the new viewport. But that tile stream is inter-predicted (temporally predicted) and cannot be decoded correctly until an IDR arrives, because the reference pictures are not available (the client holds only the references of the previous viewport).

One solution: transmit IDRs at high frequency, e.g. every third super-frame is an IDR. At 60 fps the viewport-switch latency is then up to about 50 ms (three frame periods of about 16.7 ms each) plus network delay.
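The latency figure is simple arithmetic: three frame periods at 60 fps come to 3 × 16.67 ms = 50 ms (or 48 ms if the frame period is rounded down to 16 ms). A small sketch, assuming the switch request arrives just after an IDR was sent so that a full IDR period passes before the next one; the function name is made up for illustration:

```python
def worst_case_idr_wait_ms(fps, idr_period_frames):
    """Upper bound on the wait for the next IDR, in milliseconds.

    Assumes the viewport change happens right after an IDR has been sent,
    so a whole IDR period elapses before the new tile stream becomes
    decodable. Network delay is not included.
    """
    return 1000.0 * idr_period_frames / fps


for period in (1, 3, 30, 60):
    wait = worst_case_idr_wait_ms(60, period)
    print(f"IDR every {period:2d} frames at 60 fps: up to {wait:.1f} ms")
```

The trade-off is bitrate: IDR frames are much larger than inter-predicted frames, so a shorter IDR period buys lower switching latency at the cost of bandwidth.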

   

Typically 3DOF video is composed using a cubemap projection. Because deblocking and other in-loop filters are turned off across tile boundaries, faint virtual edges might be observed.

The above figure is taken from the paper “Overview of the Versatile Video Coding (VVC) Standard and Its Applications”, by Benjamin Bross et al.

 

Extracting a sliced stream from the super-stream (comprising dozens of horizontal slices per frame) is not challenging. You need to change the resolution in the SPS (to match the sliced stream) and set first_mb_in_slice to 0 in each extracted slice.
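To give a feel for the first step (finding which NAL units belong to the wanted slice), here is a minimal sketch. It is not the script described on this page; the helper names are hypothetical. It assumes an H.264 Annex B byte stream, equal-height horizontal slices, and that no emulation-prevention byte falls inside the first_mb_in_slice field (the sketch does not remove such bytes). Rewriting the SPS and the slice headers is left out.

```python
def split_annexb(data):
    """Yield NAL units (without start codes) from an Annex B byte stream."""
    starts = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = data.find(b"\x00\x00\x01", i + 3)
    for n, s in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(data)
        # Trailing zeros belong to the next (4-byte) start code, not to this NAL.
        yield data[s:end].rstrip(b"\x00")


def read_ue(payload, bit_pos):
    """Decode one unsigned Exp-Golomb value; return (value, next_bit_pos)."""
    def bit(p):
        return (payload[p >> 3] >> (7 - (p & 7))) & 1

    zeros = 0
    while bit(bit_pos) == 0:      # count leading zero bits
        zeros += 1
        bit_pos += 1
    bit_pos += 1                  # skip the terminating 1 bit
    value = 0
    for _ in range(zeros):        # read the suffix bits
        value = (value << 1) | bit(bit_pos)
        bit_pos += 1
    return (1 << zeros) - 1 + value, bit_pos


def slice_index(nal, pic_width_mbs, slice_height_mbs):
    """Return the horizontal-slice index of a coded slice NAL, else None."""
    nal_type = nal[0] & 0x1F
    if nal_type not in (1, 5):    # 1 = non-IDR slice, 5 = IDR slice
        return None
    # first_mb_in_slice is the first field after the 1-byte NAL header.
    first_mb, _ = read_ue(nal, 8)
    return first_mb // (pic_width_mbs * slice_height_mbs)
```

For example, with a picture 120 macroblocks wide and slices 30 macroblocks high, a slice whose first_mb_in_slice is 3600 maps to slice index 1. An extractor would keep the parameter sets plus the slice NAL units with the wanted index, then patch the SPS height and first_mb_in_slice as described above.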

I wrote a Python script, ExtractSlicedStream.py, for this.

Usage:

   -i    input h264 fifo or file
   -o    output fifo of first sliced stream
   -s    slice height in MBs
   -n    slice number to extract (default 0)
   -g    number of gops to process, if 0 then all (default 0)
   -v    whether to print naltypes and offsets (default false)

Example:
   python ExtractSlicedStream.py -i multi-sliced.h264 -s 30 -n 1 -g 3 -o test.h264
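The script itself is not reproduced on this page. As a rough sketch only, the options listed above could be declared with argparse roughly like this; which options are mandatory is an assumption, and the real script may be organized differently.

```python
import argparse


def build_parser():
    """Hypothetical command-line definition mirroring the usage text above."""
    p = argparse.ArgumentParser(prog="ExtractSlicedStream.py")
    # Marking -i, -o and -s as required is an assumption, not taken from the script.
    p.add_argument("-i", required=True, help="input h264 fifo or file")
    p.add_argument("-o", required=True, help="output fifo of first sliced stream")
    p.add_argument("-s", type=int, required=True, help="slice height in MBs")
    p.add_argument("-n", type=int, default=0, help="slice number to extract")
    p.add_argument("-g", type=int, default=0, help="number of gops to process, 0 = all")
    p.add_argument("-v", action="store_true", help="print naltypes and offsets")
    return p


if __name__ == "__main__":
    example = "-i multi-sliced.h264 -s 30 -n 1 -g 3 -o test.h264".split()
    print(build_parser().parse_args(example))
```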

 

