Introduction

Scalable video coding encodes a video in multiple layers, where each layer is a different-quality representation of the same scene.

The base layer (BL) is the lowest-quality representation. One or more enhancement layers (ELs) are coded with reference to lower layers and improve video quality by exploiting inter-layer redundancy (i.e. prediction from collocated data in a lower layer).

Example of inter-layer redundancy: enhancement-layer motion vectors and modes can be predicted from the base layer.

Scalable video streams degrade more gracefully than non-scalable video, where a reduction in bitrate typically causes a more severe drop in quality. When the network is congested, a middle box (MANE, Media-Aware Network Element) removes some of the enhancement layers and sends the trimmed stream to the client. However, the MANE cannot discard upper layers at an arbitrary point (or frame), only at key frames. The key-frame interval therefore determines how quickly the stream can respond to changes in network conditions.
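As a rough illustration of the key-frame constraint, the sketch below computes when a MANE could actually drop layers. It assumes key frames sit at exact multiples of the intra period; the function names are made up for this example and are not part of any SHVC tool.

```python
def next_switch_frame(request_frame: int, intra_period: int) -> int:
    """First key frame at or after the frame where congestion was detected."""
    # Assumption: key frames are at frame numbers 0, P, 2P, ... (P = intra period).
    return -(-request_frame // intra_period) * intra_period  # round up to a multiple of P


def worst_case_wait_s(intra_period: int, frame_rate: float) -> float:
    """Longest wait (in seconds) between detecting congestion and the next key frame."""
    # Worst case: congestion is detected one frame after a key frame.
    return (intra_period - 1) / frame_rate


# Congestion detected at frame 61 with a key frame every 60 frames.
switch_at = next_switch_frame(61, 60)
wait_s = worst_case_wait_s(60, 60)
print(switch_at, round(wait_s, 3))
```

With an intra period of 60 at 60 fps the MANE may have to wait almost a full second before it can trim the stream, which is why a shorter key-frame interval gives faster reaction at the cost of more bits spent on key frames.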

HEVC/H.265 has a special scalability extension called SHVC.

Scalable video is tailored for internet streaming with changing bandwidth or network conditions.

Note that alternative solutions such as HLS or MPEG-DASH are more popular because of their low complexity, but they require a huge amount of disk space on the server side to keep many replicas of the same source at different bitrates, resolutions and bit depths (SDR/HDR).

In a scalable solution a video encoder generates several compressed bitstreams: a base-layer and enhancement-layers. Each layer uses the previous one as a reference, thus inter-layer redundancy is exploited.

There are three main types of video scalability, followed here by bit-depth scalability as a fourth (the figures are taken from the paper “Scalable Internet video using MPEG-4” by Hayder Radha, 1999):

SNR scalability – each layer is encoded with a progressively smaller quantization step size, so video quality improves progressively. Progressive JPEG was the first practical use of SNR scalability.

SNR (quality) scalability is less time-consuming than spatial scalability for both the encoder and the decoder, since no image scaling is applied.

Example (two-layer): frames in the enhancement layer exploit both inter-layer and inter-frame redundancy.
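The idea of refining quality with a smaller quantization step can be sketched with a toy example. This works directly on sample values with step sizes chosen for illustration; a real codec quantizes transform coefficients and derives the step from QP, so treat it only as a picture of the principle.

```python
def quantize(values, step):
    """Uniform quantizer: map each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]


def max_error(reference, reconstructed):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(a - b) for a, b in zip(reference, reconstructed))


samples = [3, 17, 42, 58, 91, 120]   # made-up source samples
base_step, enh_step = 16, 4          # coarse step for the BL, finer step for the EL

# Base layer: coarse quantization of the source.
base = quantize(samples, base_step)

# Enhancement layer: quantize what the base layer missed (inter-layer residual).
residual = [s - b for s, b in zip(samples, base)]
enhanced = [b + r for b, r in zip(base, quantize(residual, enh_step))]

print("base error:", max_error(samples, base))
print("enhanced error:", max_error(samples, enhanced))
```

The reconstruction error of each layer is bounded by half of its step size, so decoding the enhancement layer on top of the base layer tightens the bound from 8 to 2 in this example.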

Spatial scalability – each layer is encoded with progressively increasing resolution.
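A one-dimensional toy version of spatial scalability is sketched below. The pair averaging and sample repetition used here are assumptions chosen for brevity; SHVC defines its own resampling filters, which are not reproduced.

```python
def downsample2(row):
    """Halve the resolution by averaging neighbouring pairs (integer average)."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row) - 1, 2)]


def upsample2(row):
    """Double the resolution by repeating every sample (nearest neighbour)."""
    return [v for v in row for _ in range(2)]


full_res = [10, 12, 40, 44, 90, 94, 20, 22]   # made-up row of full-resolution samples

base_layer = downsample2(full_res)            # what the base layer would code
prediction = upsample2(base_layer)            # inter-layer prediction for the EL
residual = [f - p for f, p in zip(full_res, prediction)]

# The enhancement layer only has to carry the (small) residual.
reconstructed = [p + r for p, r in zip(prediction, residual)]

print(base_layer)
print(residual)
```

The residual values are much smaller than the samples themselves, which is the redundancy the enhancement layer exploits.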

Temporal scalability – each layer adds frame rate.

Example (two-layer): the enhancement layer contains non-reference B-frames that use the base layer as reference. If the base layer runs at 30 fps, the stream with the enhancement layer runs at 60 fps.
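Dropping the temporal enhancement layer amounts to filtering frames by their temporal ID. The sketch below assumes a simple pattern where even frames belong to the base layer (TId 0) and odd frames to the enhancement layer (TId 1); the data structure is invented for the example.

```python
def extract_sub_stream(frames, max_tid):
    """Keep only the frames whose temporal ID does not exceed max_tid."""
    return [f for f in frames if f["tid"] <= max_tid]


# Assumed layout: even POCs are base-layer frames, odd POCs are enhancement frames.
frames = [{"poc": poc, "tid": poc % 2} for poc in range(8)]
full_rate_fps = 60

base_only = extract_sub_stream(frames, max_tid=0)
base_rate_fps = full_rate_fps * len(base_only) / len(frames)

print([f["poc"] for f in base_only], base_rate_fps)
```

Because the dropped frames are not used as references, the remaining base-layer frames still decode correctly at half the frame rate.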

Bit-depth scalability – different layers are coded with different bit depths. The base layer has the lowest bit depth (usually 8 bits per sample).
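For bit-depth scalability, a minimal sketch is shown below, assuming a 10-bit enhancement layer over an 8-bit base layer and a plain bit shift as the inter-layer mapping. Real systems may use more elaborate mappings between SDR and HDR; the shift is only an assumption for illustration.

```python
SHIFT = 2  # 10-bit enhancement layer minus 8-bit base layer


def to_base_layer(sample10: int) -> int:
    """Reduce a 10-bit sample to 8 bits by dropping the two least significant bits."""
    return sample10 >> SHIFT


def predict_from_base(sample8: int) -> int:
    """Inter-layer prediction: scale the 8-bit sample back to the 10-bit range."""
    return sample8 << SHIFT


source10 = [0, 5, 512, 1023]                       # made-up 10-bit samples
base8 = [to_base_layer(s) for s in source10]
prediction10 = [predict_from_base(b) for b in base8]
residual = [s - p for s, p in zip(source10, prediction10)]

print(base8, residual)
```

With this mapping the enhancement layer only needs to code the two dropped bits per sample, i.e. residuals in the range 0 to 3.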

SHM Reference Codec

Download & Build

To download the SHM software you need to install an SVN client (I prefer TortoiseSVN).

Then take the url of SHM:

https://hevc.hhi.fraunhofer.de/trac/shvc/browser

or clone with ‘git’:

git clone https://vcgit.hhi.fraunhofer.de/jvet/SHM.git

In the ‘build’ sub-folder of the root SHM-dev folder you can find Visual Studio solutions for different VS versions.

Compile TAppEncoder and TAppDecoder projects in x64 Release mode.

You may need to change the ‘cstdint’ file to avoid errors such as: 'uintmax_t': is not a member of '`global namespace''

I use cstdint located here to build Encoder and Decoder

Encode and Decode Base Layer

Example Encode Base Layer in Low Latency Mode

No B-frames; the source is a 384×320 sequence in yuv420p format, frame rate 60, single reference.

Encode the base layer:

TAppEncoder.exe -c ipp.cfg -i0 test_384x320.yuv -o0 NUL -wdt0 384 -hgt0 320 -ip0 60 -fr0 60 -c single_layer.cfg -f 10000 -b test_bl.h265

-ip0 – intra period for base layer (in our case each 60-th frame is IDR).
-fr0 – the frame rate of base layer
-f – number of frames to encode
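When the number of layers grows, the per-layer options become tedious to type. The helper below is a hypothetical convenience wrapper (not part of SHM) that assembles the argument list from the options shown above; the dictionary keys are my own naming.

```python
# Per-layer options used in this post: flag prefix -> key in the layer dictionary.
PER_LAYER_OPTIONS = [("i", "input"), ("o", "recon"), ("wdt", "width"),
                     ("hgt", "height"), ("ip", "intra_period"), ("fr", "frame_rate")]


def build_encoder_args(main_cfg, layer_cfg, layers, frames, bitstream,
                       exe="TAppEncoder.exe"):
    """Assemble a TAppEncoder argument list for one or more layers."""
    args = [exe, "-c", main_cfg]
    for flag, key in PER_LAYER_OPTIONS:
        for idx, layer in enumerate(layers):
            args += [f"-{flag}{idx}", str(layer[key])]
    args += ["-c", layer_cfg, "-f", str(frames), "-b", bitstream]
    return args


base = {"input": "test_384x320.yuv", "recon": "NUL", "width": 384,
        "height": 320, "intra_period": 60, "frame_rate": 60}

cmd = build_encoder_args("ipp.cfg", "single_layer.cfg", [base], 10000, "test_bl.h265")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True); passing two layer dictionaries produces the interleaved -i0 -i1 -o0 -o1 form used for the dual-layer encode.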

ipp.cfg – the main config file, which specifies the GOP structure (in our case IPPPP), the number of references, the motion estimation mode, the deblocking filter parameters, and switches for tools such as PCM and AMP.

Many parameters in ipp.cfg have the same meaning as in HM; documentation is available at https://hevc.hhi.fraunhofer.de/shvc.

single_layer.cfg – a second cfg-file that specifies the parameters of each layer. The cfg-file below enables rate control for layer 0 and sets TargetBitrate0 to 10000000 bps (10 Mbps):

NumLayers : 1
NonHEVCBase : 0
ScalabilityMask1 : 0 # Multiview
ScalabilityMask2 : 1 # Scalable
ScalabilityMask3 : 0 # Auxiliary pictures
AdaptiveResolutionChange : 0 # Resolution change frame (0: disable)
SkipPictureAtArcSwitch : 0 # Code higher layer picture as skip at ARC switching (0: disable (default), 1: enable)
MaxTidRefPresentFlag : 1 # max_tid_ref_present_flag (0=not present, 1=present(default))
CrossLayerPictureTypeAlignFlag: 1 # Picture type alignment across layers
CrossLayerIrapAlignFlag : 1 # Align IRAP across layers
SEIpictureDigest : 0

#============= LAYER 0 ==================
QP0 : 30
MaxTidIlRefPicsPlus10 : 1 # max_tid_il_ref_pics_plus1 for layer0
#============ Rate Control ==============
RateControl0 : 1 # Rate control: enable rate control for layer 0
TargetBitrate0 : 10000000 # Rate control: target bitrate for layer 0, in bps
KeepHierarchicalBit0 : 1 # Rate control: keep hierarchical bit allocation for layer 0 in rate control algorithm
LCULevelRateControl0 : 1 # Rate control: 1: LCU level RC for layer 0; 0: picture level RC for layer 0
RCLCUSeparateModel0 : 1 # Rate control: use LCU level separate R-lambda model for layer 0
InitialQP0 : 25 # Rate control: initial QP for layer 0
RCForceIntraQP0 : 0 # Rate control: force intra QP to be equal to initial QP for layer 0
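Since the cfg-files use a simple "Name : value # comment" layout, it is easy to read them back and double-check values such as the target bitrate before starting a long encode. The parser below is a minimal sketch of my own, not the parser SHM uses; the sample text copies a few lines from the file above.

```python
def parse_cfg(text):
    """Parse 'Name : value # comment' lines into a dictionary of strings."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and separator lines
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        params[key.strip()] = value.strip()
    return params


SAMPLE = """\
NumLayers : 1
#============= LAYER 0 ==================
QP0 : 30
#============ Rate Control ==============
RateControl0 : 1 # Rate control: enable rate control for layer 0
TargetBitrate0 : 10000000 # Rate control: target bitrate for layer 0, in bps
"""

cfg = parse_cfg(SAMPLE)
target_mbps = int(cfg["TargetBitrate0"]) / 1_000_000
print(cfg["RateControl0"], target_mbps)
```

Printing the bitrate in Mbps makes it easy to spot a value that is off by a factor of ten from what you intended.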

Example Decode Base Layer

TAppDecoder.exe -b test_bl.h265 -o0 base_layer.yuv

-o0 – output file for the base layer

You can play the decoded yuv file:

ffplay -f rawvideo -pixel_format yuv420p -video_size 384x320 base_layer.yuv
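A raw yuv file has no header, so the player must be told the frame size. The same arithmetic lets you check how many frames a decoded file contains. The sketch below assumes 8-bit yuv420p with even width and height.

```python
def yuv420p_frame_bytes(width: int, height: int) -> int:
    """Size in bytes of one 8-bit yuv420p frame."""
    luma = width * height
    chroma = (width // 2) * (height // 2)   # each chroma plane is subsampled 2x2
    return luma + 2 * chroma


def frame_count(file_size_bytes: int, width: int, height: int) -> int:
    """Number of complete frames in a raw yuv420p file of the given size."""
    return file_size_bytes // yuv420p_frame_bytes(width, height)


frame_bytes = yuv420p_frame_bytes(384, 320)
print(frame_bytes)
print(frame_count(10 * frame_bytes, 384, 320))
```

In practice you would pass os.path.getsize("base_layer.yuv") as the file size; a result that does not match the number of encoded frames usually means a wrong resolution or pixel format.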

Encode and Decode Dual SNR

Example Encode Two SNR Layers

Encoding two layers in SNR mode: the base layer is coded with constant QP=30 and the enhancement layer with QP=20.

TAppEncoder.exe -c ipp.cfg -i0 test_384x320.yuv -i1 test_384x320.yuv -o0 NUL -o1 NUL -wdt0 384 -wdt1 384 -hgt0 320 -hgt1 320 -ip0 60 -ip1 60 -fr0 60 -fr1 60 -c dual_layer.cfg -f 200 -b dual_layer.h265

POC 9 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 1544 bits [Y 37.5578 dB U 42.4246 dB V 43.4092 dB] [ET 1 ] [L0 8c ] [L1 ]

POC 9 LId: 1 TId: 0 ( P-SLICE STSA_R, nQP 21 QP 21 ) 6656 bits [Y 44.8422 dB U 47.5345 dB V 48.1982 dB] [ET 1 ] [L0 8 9(0, {1.00, 1.00}x)c ] [L1 ]

POC 10 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 1864 bits [Y 37.5857 dB U 42.4312 dB V 43.4382 dB] [ET 1 ] [L0 9c ] [L1 ]

POC 10 LId: 1 TId: 0 ( P-SLICE STSA_R, nQP 21 QP 21 ) 13376 bits [Y 44.8429 dB U 47.5713 dB V 48.2687 dB] [ET 1 ] [L0 9 10(0, {1.00, 1.00}x)c ] [L1 ]

POC 11 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 2400 bits [Y 37.5694 dB U 42.4481 dB V 43.2989 dB] [ET 1 ] [L0 10c ] [L1 ]

POC 11 LId: 1 TId: 0 ( P-SLICE STSA_R, nQP 21 QP 21 ) 12704 bits [Y 44.8439 dB U 47.5875 dB V 48.0735 dB] [ET 1 ] [L0 10 11(0, {1.00, 1.00}x)c ] [L1 ]

POC 12 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 2336 bits [Y 37.5577 dB U 42.4618 dB V 42.9799 dB] [ET 1 ] [L0 11c ] [L1 ]

POC 12 LId: 1 TId: 0 ( P-SLICE STSA_R, nQP 21 QP 21 ) 11512 bits [Y 44.8213 dB U 47.4915 dB V 47.9293 dB] [ET 1 ] [L0 11 12(0, {1.00, 1.00}x)c ] [L1 ]

POC 13 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 2448 bits [Y 37.5799 dB U 42.4388 dB V 42.6812 dB] [ET 1 ] [L0 12c ] [L1 ]
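The per-picture lines printed by the encoder are regular enough to be parsed with a regular expression, which is handy for comparing bits and PSNR per layer. The pattern below is my own and was written against the lines shown above; other SHM versions may print a slightly different format.

```python
import re

# Captures POC, layer id, slice QP, bit count and luma PSNR from one log line.
LOG_LINE = re.compile(
    r"POC\s+(?P<poc>\d+)\s+LId:\s*(?P<lid>\d+).*?"
    r"\bQP\s+(?P<qp>\d+)\s*\)\s*(?P<bits>\d+)\s+bits\s+"
    r"\[Y\s+(?P<y_psnr>[\d.]+)\s+dB"
)


def parse_log(text):
    """Return one dictionary per picture line found in the encoder output."""
    records = []
    for line in text.splitlines():
        m = LOG_LINE.search(line)
        if m:
            records.append({"poc": int(m["poc"]), "lid": int(m["lid"]),
                            "qp": int(m["qp"]), "bits": int(m["bits"]),
                            "y_psnr": float(m["y_psnr"])})
    return records


LOG = """\
POC 9 LId: 0 TId: 0 ( P-SLICE TRAIL_R, nQP 31 QP 31 ) 1544 bits [Y 37.5578 dB U 42.4246 dB V 43.4092 dB] [ET 1 ] [L0 8c ] [L1 ]
POC 9 LId: 1 TId: 0 ( P-SLICE STSA_R, nQP 21 QP 21 ) 6656 bits [Y 44.8422 dB U 47.5345 dB V 48.1982 dB] [ET 1 ] [L0 8 9(0, {1.00, 1.00}x)c ] [L1 ]
"""

records = parse_log(LOG)
psnr_gain = records[1]["y_psnr"] - records[0]["y_psnr"]
print(records[0]["bits"], records[1]["bits"], round(psnr_gain, 4))
```

For POC 9 the enhancement layer spends 6656 bits on top of the 1544 bits of the base layer and raises the luma PSNR by about 7.3 dB.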

Because this is SNR mode, the input to each layer is the same: -i0 test_384x320.yuv -i1 test_384x320.yuv

ipp.cfg remains the same, but instead of the single-layer cfg-file we use dual_layer.cfg:

NumLayers : 2
NonHEVCBase : 0
ScalabilityMask1 : 0 # Multiview
ScalabilityMask2 : 1 # Scalable
ScalabilityMask3 : 0 # Auxiliary pictures
AdaptiveResolutionChange : 0 # Resolution change frame (0: disable)
SkipPictureAtArcSwitch : 0 # Code higher layer picture as skip at ARC switching (0: disable (default), 1: enable)
MaxTidRefPresentFlag : 1 # max_tid_ref_present_flag (0=not present, 1=present(default))
CrossLayerPictureTypeAlignFlag: 1 # Picture type alignment across layers
CrossLayerIrapAlignFlag : 1 # Align IRAP across layers
SEIpictureDigest : 0

#============= LAYER 0 ==================
QP0 : 30
MaxTidIlRefPicsPlus10 : 1 # max_tid_il_ref_pics_plus1 for layer0
#============ Rate Control ==============
RateControl0 : 0 # Rate control: enable rate control for layer 0
TargetBitrate0 : 1000000 # Rate control: target bitrate for layer 0, in bps
KeepHierarchicalBit0 : 1 # Rate control: keep hierarchical bit allocation for layer 0 in rate control algorithm
LCULevelRateControl0 : 1 # Rate control: 1: LCU level RC for layer 0; 0: picture level RC for layer 0
RCLCUSeparateModel0 : 1 # Rate control: use LCU level separate R-lambda model for layer 0
InitialQP0 : 0 # Rate control: initial QP for layer 0
RCForceIntraQP0 : 0 # Rate control: force intra QP to be equal to initial QP for layer 0

#============ WaveFront ================
WaveFrontSynchro0 : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.

#============= LAYER 1 ==================
QP1 : 20
NumSamplePredRefLayers1 : 1 # number of sample pred reference layers
SamplePredRefLayerIds1 : 0 # reference layer id
NumMotionPredRefLayers1 : 1 # number of motion pred reference layers
MotionPredRefLayerIds1 : 0 # reference layer id
NumActiveRefLayers1 : 1 # number of active reference layers
PredLayerIds1 : 0 # inter-layer prediction layer index within available reference layers

#============ Rate Control ==============
RateControl1 : 0 # Rate control: enable rate control for layer 1
TargetBitrate1 : 1000000 # Rate control: target bitrate for layer 1, in bps
KeepHierarchicalBit1 : 1 # Rate control: keep hierarchical bit allocation for layer 1 in rate control algorithm
LCULevelRateControl1 : 1 # Rate control: 1: LCU level RC for layer 1; 0: picture level RC for layer 1
RCLCUSeparateModel1 : 1 # Rate control: use LCU level separate R-lambda model for layer 1
InitialQP1 : 0 # Rate control: initial QP for layer 1
RCForceIntraQP1 : 0 # Rate control: force intra QP to be equal to initial QP for layer 1

#============ WaveFront ================
WaveFrontSynchro1 : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.

NumLayerSets : 2 # Include default layer set, value of 0 not allowed
NumLayerInIdList1 : 2 # 0-th layer set is default, need not specify LayerSetLayerIdList0 or NumLayerInIdList0
LayerSetLayerIdList1 : 0 1

NumAddLayerSets : 0
NumOutputLayerSets : 2 # Include default OLS, value of 0 not allowed
DefaultTargetOutputLayerIdc : 1
NumLayersInOutputLayerSet : 1 # The number of layers in the 0-th OLS should not be specified,
# ListOfOutputLayers0 need not be specified
ListOfOutputLayers1 : 1

Notes:

In Layer 1 (enhancement) of dual_layer.cfg:

- NumSamplePredRefLayers1 = 1, since there is only one layer below (the base layer)

- SamplePredRefLayerIds1 = 0, the reference layer for sample prediction is 0 (the base layer)

- NumMotionPredRefLayers1 = 1, the number of layers used for motion prediction; in our case only the base layer is available

- MotionPredRefLayerIds1 = 0, the reference layer for motion data prediction is 0 (the base layer)
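The relationship between these four parameters can be expressed as a small consistency check. This is a sanity check of my own, not something SHM performs in this form: it assumes each id list must have as many entries as the matching Num... value and that a layer may only reference lower layers.

```python
def check_ref_layers(cfg, num_layers):
    """Return a list of problems found in the inter-layer reference settings."""
    problems = []
    for layer in range(1, num_layers):
        for kind in ("Sample", "Motion"):
            num_key = f"Num{kind}PredRefLayers{layer}"
            ids_key = f"{kind}PredRefLayerIds{layer}"
            ids = [int(x) for x in cfg.get(ids_key, "").split()]
            if len(ids) != int(cfg.get(num_key, "0")):
                problems.append(f"{ids_key}: expected {cfg.get(num_key, '0')} ids, got {len(ids)}")
            for ref in ids:
                if ref >= layer:
                    problems.append(f"{ids_key}: layer {layer} cannot reference layer {ref}")
    return problems


# Values taken from the Layer 1 section of dual_layer.cfg.
dual_layer = {
    "NumSamplePredRefLayers1": "1", "SamplePredRefLayerIds1": "0",
    "NumMotionPredRefLayers1": "1", "MotionPredRefLayerIds1": "0",
}

print(check_ref_layers(dual_layer, num_layers=2))
```

An empty list means the enhancement layer only points at the base layer, as intended.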

Example Decode Layers

The file dual_layer.h265 contains two layers. To get the yuv of the base layer use:

TAppDecoder.exe -b dual_layer.h265 -o0 base_layer.yuv

To get the yuv of the enhancement layer use -o1 and -ls 2:

TAppDecoder.exe -b dual_layer.h265 -ls 2 -o1 enh_layer.yuv

You can play the decoded yuv file:

ffplay -f rawvideo -pixel_format yuv420p -video_size 384x320 enh_layer.yuv

Performance Results

I measured the encoding time of SHM (Scalable HEVC) in dual-layer SNR mode with PowerShell’s Measure-Command.

Measure-Command {.\TAppEncoder.exe -c ipp.cfg -i0 Fifa17_1920x1080.yuv -i1 Fifa17_1920x1080.yuv -o0 NUL -o1 NUL -wdt0 1920 -wdt1 1920 -hgt0 1080 -hgt1 1080 -ip0 60 -ip1 60 -fr0 60 -fr1 60 -c dual_layer.cfg -f 5 -b snr_layer_1080p.h265}

Encoding 5 frames of 1080p takes SHM about 208 s, i.e. roughly 42 s per frame:

Days : 0
Hours : 0
Minutes : 3
Seconds : 28
Milliseconds : 304
Ticks : 2083040816
TotalDays : 0.00241092687037037
TotalHours : 0.0578622448888889
TotalMinutes : 3.47173469333333
TotalSeconds : 208.3040816
TotalMilliseconds : 208304.0816
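The per-frame figure follows directly from the Ticks value above, since one .NET tick is 100 ns. A short check:

```python
TICKS_PER_SECOND = 10_000_000   # one .NET TimeSpan tick is 100 ns

ticks = 2083040816              # Ticks value reported by Measure-Command
frames_encoded = 5              # value of -f in the command

total_seconds = ticks / TICKS_PER_SECOND
seconds_per_frame = total_seconds / frames_encoded

print(round(total_seconds, 1), round(seconds_per_frame, 1))
```

This gives about 208.3 s in total and about 41.7 s per frame for the two 1080p layers together.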

Slava

23+ years’ programming and theoretical experience in the computer science fields such as video compression, media streaming and artificial intelligence (co-author of several papers and patents).


Scalable Coding of SHVC with SHM Reference Codec
