Hierarchical B-Frames or B-Pyramid

General

What is Hierarchical B-Frame mode, also called B-pyramid (in my opinion, B-pyramid is a poor term)?

If a run of consecutive B-frames contains B-frames that are themselves used as references by other B-frames in the run, the mode is called Hierarchical B-Frames coding, or B-pyramid.

The following figure, taken from the paper “ANALYSIS OF HIERARCHICAL B PICTURES AND MCTF” by Heiko Schwarz, Detlev Marpe, and Thomas Wiegand, illustrates the concept of B-pyramid:

Let’s display the first GOP from the above figure slightly differently:

So a geometric shape is revealed, but it is not a pyramid. Therefore, in my opinion, the term B-pyramid is not a good choice.

To exploit the B-pyramid feature fully, the GOP size (in frames) should be a power of two (2^n), e.g. a GOP size of 16 or 32 frames.
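Why a power of two fits so well can be seen from a small sketch. It assumes the classic dyadic arrangement, in which every interval between two already coded pictures is halved recursively; the function name hierarchical_order and the level numbering are my own illustration, and a real encoder may order pictures differently.

```python
def hierarchical_order(gop_size):
    """Return (display_index, temporal_level) pairs in one possible coding order.

    Display index 0 is the previous key picture (already coded);
    display index gop_size is the key picture that closes the GOP.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")

    order = [(gop_size, 0)]  # the key picture of the GOP is coded first

    def split(lo, hi, level):
        # Stop when there is no picture left between lo and hi.
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        order.append((mid, level))   # B-frame predicted from lo and hi
        split(lo, mid, level + 1)    # left half, one level deeper
        split(mid, hi, level + 1)    # right half, one level deeper

    split(0, gop_size, 1)
    return order


if __name__ == "__main__":
    for display_index, level in hierarchical_order(8):
        print(display_index, level)
```

With a power-of-two GOP size every interval splits evenly, so each temporal level is fully populated and only the B-frames of the deepest level are left unreferenced.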

According to the results of the above-mentioned paper “ANALYSIS OF HIERARCHICAL B PICTURES AND MCTF”, using Hierarchical B-Frames commonly improves coding efficiency (e.g. on Football CIF 30Hz the improvement is about 0.5 dB Y-PSNR).

Pros and Cons of Hierarchical B-frames

Pros: better exploitation of temporal redundancy.

Cons: long coding latency (not suitable for low-latency applications)

How to Detect Hierarchical B-Frames or B-Pyramid in a Stream?

For each frame we check all of the following four conditions:

Current frame is B
Previous frame (in decoding order) is also B (i.e. successive number of B frames is greater than one)
Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous B-frame is used for reference)
POC of current B frame is smaller than that of the previous one

If all of the above conditions are met, B-pyramid is detected.
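The four conditions can be sketched as a small function. The Frame tuple and the name has_b_pyramid are hypothetical; the input is assumed to be a list of already parsed frames in decoding order, with the field order holding either POC or PTS.

```python
from collections import namedtuple

# Hypothetical per-frame record: slice_type is "I", "P" or "B",
# order is the POC (or PTS) of the frame.
Frame = namedtuple("Frame", "slice_type nal_ref_idc order")


def has_b_pyramid(frames):
    """Return True if the four conditions hold for some frame (decoding order)."""
    prev = None
    for cur in frames:
        if (prev is not None
                and cur.slice_type == "B"       # current frame is B
                and prev.slice_type == "B"      # previous frame is also B
                and prev.nal_ref_idc != 0       # previous B is used for reference
                and cur.order < prev.order):    # current B precedes it in display
            return True
        prev = cur
    return False


if __name__ == "__main__":
    # Display order I0 b1 B2 b3 P4, decoding order I0 P4 B2 b1 b3.
    gop = [Frame("I", 3, 0), Frame("P", 2, 4), Frame("B", 1, 2),
           Frame("B", 0, 1), Frame("B", 0, 3)]
    print(has_b_pyramid(gop))
```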

If the elementary stream is encapsulated in an MPEG-TS container, we can use PTS instead of POC. It’s worth mentioning that PTS values are easily picked from the PES header, while the derivation of POC is a complicated process. Indeed, to obtain the POC value it’s necessary to dive into the SPS and pick up log2_max_pic_order_cnt_lsb_minus4 (for pic_order_cnt_type=0) or, in the case of pic_order_cnt_type=1, a dozen other parameters.

B-Pyramid versus non-reference B-frames

What is the gain of the B-pyramid GOP structure IPbBbPbBb…. over IPbbbPbbb…. (three consecutive non-reference B-frames)? Here ‘B’ denotes a B-frame used for reference and ‘b’ denotes a B-frame not used for reference. I use x264 in constant QP mode (QP=25) with a closed GOP of 30 frames.

On the test YUV sequence “container” (384×320, 300 frames), the bit-size saving is ~0.7%.

On the test YUV sequence “akiyo” (384×320, 300 frames), the bit-size saving is ~1.7%.

Working with x264

IPbbbPbbb…

x264 --input-res 384x320 --fps 30 --b-adapt 0 --bframes 3 --b-pyramid none --ref 1 --no-scenecut --keyint 30 --min-keyint 30 --qp 25 --output test_ibbb.h264 container_384x320.yuv

IPbBbPbBb…

x264 --input-res 384x320 --fps 30 --b-adapt 0 --bframes 3 --b-pyramid strict --ref 1 --no-scenecut --keyint 30 --min-keyint 30 --qp 25 --output test_ibBb.h264 container_384x320.yuv

How to Detect B-Pyramid if the Elementary Stream is Encapsulated in an MPEG-TS or MPEG4 Container?

MPEG TS Container

When the elementary stream is encapsulated in an MPEG-TS container, we look for video frame boundaries to pick up the PTS. We get the PTS from the PES header, and the frame start is always indicated by an AUD (nal_type=9), which is mandatory in the transport packet payload. Notice that if the PES header carries only a PTS (no DTS) then DTS=PTS; if this holds for every frame, no B-pyramid can exist in the stream. Picture data (or slice data in case of multiple slices per picture) is contained in a NAL unit with nal_type = 1 or 5 (IDR). The slice data may be absent from the current transport packet and appear only in the next video packet or the one after it (e.g. if the SPS is long).
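Reading the timestamps from a PES header can be sketched as follows. This is a minimal illustration, assuming the input starts at the PES start code (00 00 01) of a video stream and that the header is not truncated; the function name parse_pes_timestamps is hypothetical.

```python
def parse_pes_timestamps(pes):
    """Return (pts, dts) in 90 kHz ticks, or (None, None) if none are signaled."""
    if len(pes) < 9 or pes[0:3] != b"\x00\x00\x01":
        raise ValueError("not the start of a PES packet")

    pts_dts_flags = pes[7] >> 6  # two top bits of the second flags byte

    def read_ts(b):
        # 33-bit timestamp spread over 5 bytes, interleaved with marker bits.
        return ((((b[0] >> 1) & 0x07) << 30) | (b[1] << 22)
                | ((b[2] >> 1) << 15) | (b[3] << 7) | (b[4] >> 1))

    if pts_dts_flags == 0b10:            # PTS only, so DTS equals PTS
        if len(pes) < 14:
            raise ValueError("truncated PES header")
        pts = read_ts(pes[9:14])
        return pts, pts
    if pts_dts_flags == 0b11:            # both PTS and DTS are present
        if len(pes) < 19:
            raise ValueError("truncated PES header")
        return read_ts(pes[9:14]), read_ts(pes[14:19])
    return None, None
```

A frame whose returned PTS and DTS differ has been reordered, which is a precondition for any B-frame structure.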

Once a NAL unit with nal_type 1 or 5 is found, we need to extract nal_ref_idc from the NAL header and the first two parameters from the slice header: first_mb_in_slice and slice_type.

The NAL unit of each slice consists of:

Start code (0x000001 or 0x00000001), NAL header (1 byte), slice header and slice data.
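Locating NAL units by their start codes can be sketched like this. It is a simplified illustration, assuming the whole byte stream (for example, the reassembled PES payload) is available in memory; the name iter_nal_units is hypothetical.

```python
def iter_nal_units(data):
    """Yield NAL units (header byte first) from a start-code delimited stream."""
    starts = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)                      # first byte after the start code
        i = data.find(b"\x00\x00\x01", i + 3)

    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(data)
        # Trailing zero bytes belong to the next 4-byte start code or to padding;
        # a NAL unit itself never ends with 0x00.
        nal = data[begin:end].rstrip(b"\x00")
        if nal:
            yield nal


if __name__ == "__main__":
    stream = b"\x00\x00\x00\x01\x09\xf0\x00\x00\x01\x41\x9a"
    for nal in iter_nal_units(stream):
        print(nal.hex())
```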

nalType = nal_header & 0x1f

nal_ref_idc = ( nal_header & 0x60 )>>5
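The two masks above, written as a small helper (the function name parse_nal_header is my own; the forbidden bit is returned only for completeness):

```python
def parse_nal_header(nal_header):
    """Split the one-byte H.264 NAL header into its three fields."""
    forbidden_zero_bit = nal_header >> 7
    nal_ref_idc = (nal_header & 0x60) >> 5   # non-zero: used for reference
    nal_type = nal_header & 0x1F             # 1 = non-IDR slice, 5 = IDR slice
    return forbidden_zero_bit, nal_ref_idc, nal_type


if __name__ == "__main__":
    print(parse_nal_header(0x41))   # reference non-IDR slice
    print(parse_nal_header(0x01))   # non-reference non-IDR slice
```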

To determine first_mb_in_slice and slice_type we need to read the first byte of the slice header, slh[0], and perform the following operations:

Get the first bit: first_bit = slh[0]>>7. Both first_mb_in_slice and slice_type are Exp-Golomb coded, and the Exp-Golomb code of the value 0 is the single bit ‘1’.

If first_bit==1 then first_mb_in_slice=0, i.e. the current slice is the first slice in a picture and it actually is the start of picture data (in such a case the next step is to determine whether the slice type is B or not).
If first_bit==0 then first_mb_in_slice>0, i.e. the current slice is not the first one in a picture and the picture type has already been determined.

If the current slice is the first one in a picture, we have to determine whether the slice type is B or not. The slice type code corresponding to B has two values, 1 or 6. The Exp-Golomb bit representation of 1 is ‘010’ and of 6 is ‘00111’.

Hence if the current slice is the first slice in a picture (i.e. first_mb_in_slice=0, so the MSbit is ‘1’) and the picture type is B, then one of the following two bit patterns is transmitted at the start of the first byte slh[0] of the slice header:

1010 or 100111

Based on the above patterns we derive the following rules to determine whether the picture type is B or not:

if (slh[0]>>4)=0xA then the current slice is the first slice and the picture type is B

if ( slh[0] & 0xFC ) = 0x9C then the current slice is the first slice and the picture type is B
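The two byte-level rules can be cross-checked against a generic Exp-Golomb decoder. The sketch below is only an illustration: the names read_ue, first_slice_is_b and first_slice_is_b_generic are hypothetical, and the generic reader assumes the bytes passed in are long enough to hold both code words.

```python
def read_ue(data, bitpos):
    """Decode one unsigned Exp-Golomb value; return (value, next_bitpos)."""
    zeros = 0
    while not (data[bitpos >> 3] >> (7 - (bitpos & 7))) & 1:
        zeros += 1                      # count leading zero bits
        bitpos += 1
    value = 0
    for _ in range(zeros + 1):          # the '1' marker plus `zeros` info bits
        value = (value << 1) | ((data[bitpos >> 3] >> (7 - (bitpos & 7))) & 1)
        bitpos += 1
    return value - 1, bitpos


def first_slice_is_b(slh0):
    """Shortcut from the text: test only the first slice-header byte."""
    return (slh0 >> 4) == 0xA or (slh0 & 0xFC) == 0x9C


def first_slice_is_b_generic(slh):
    """Same decision made by decoding first_mb_in_slice and slice_type."""
    first_mb_in_slice, pos = read_ue(slh, 0)
    slice_type, _ = read_ue(slh, pos)
    return first_mb_in_slice == 0 and slice_type in (1, 6)


if __name__ == "__main__":
    for byte in (0xA8, 0x9E, 0x88, 0x28):
        print(hex(byte), first_slice_is_b(byte))
```

The shortcut needs only one byte per slice, which is why it is attractive for a quick stream scan; the generic reader is useful when further slice header fields are needed.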

For each frame we check all of the following four conditions:

Current frame is B
Previous frame (in decoding order) is also B (i.e. successive number of B frames is greater than one)
Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous frame is used for reference)
PTS of current B frame is smaller than that of the previous one

If all above conditions are met then B-pyramid is detected.

MPEG4 Container (non-fragmented)

With the ‘stco’ and ‘stsz’ tables in the metadata (together with ‘stsc’, which maps samples to chunks) we can visit all access units successively in decoding order.

For each access unit we skip over non-VCL NAL units (e.g. SEI) until the first slice data NAL unit is found (nal_type=1 or 5).

Then we read the NAL header (to determine nal_ref_idc) and the following byte (which corresponds to the first byte of the slice header) to determine the slice type (B or not B). Slice type and nal_ref_idc are determined exactly as in the previous section. Alternatively, whether a frame is used for reference can be derived from the ‘sdtp’ box, provided that this box is present in the metadata (notice that signaling ‘sdtp’ is not mandatory).
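Skipping to the first slice NAL unit inside one MP4 sample can be sketched as below. It assumes the usual storage for AVC tracks, where each NAL unit is preceded by a big-endian length field instead of a start code; the length field size comes from the ‘avcC’ configuration and is taken here as 4 bytes. The name first_vcl_nal is hypothetical.

```python
def first_vcl_nal(sample, length_size=4):
    """Return the first NAL unit with nal_type 1 or 5 in a sample, or None."""
    pos = 0
    while pos + length_size <= len(sample):
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            break                               # truncated sample, stop scanning
        nal = sample[pos:pos + nal_len]
        pos += nal_len
        if nal and (nal[0] & 0x1F) in (1, 5):   # slice data (non-IDR or IDR)
            return nal
    return None
```

On the returned unit, byte 0 is the NAL header and byte 1 is the first byte of the slice header, so the same checks as in the MPEG-TS case apply.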

With the ‘ctts’ table in the metadata we derive the PTS of each access unit (if ‘ctts’ is not present then PTS = DTS and no B-pyramid can exist in such a stream).

For each frame we check all of the following four conditions:
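Deriving PTS from the tables can be sketched as follows. The sketch assumes the ‘stts’ and ‘ctts’ boxes have already been parsed into lists of (sample_count, value) pairs and ignores edit lists; the function name sample_times is hypothetical.

```python
def sample_times(stts, ctts=None):
    """Return (dts_list, pts_list) in track timescale units, decoding order.

    stts: list of (sample_count, sample_delta) entries
    ctts: list of (sample_count, sample_offset) entries, or None if absent
    """
    dts = []
    t = 0
    for count, delta in stts:
        for _ in range(count):
            dts.append(t)        # DTS is the running sum of the deltas
            t += delta

    if not ctts:
        return dts, list(dts)    # no 'ctts': PTS equals DTS for every sample

    offsets = [offset for count, offset in ctts for _ in range(count)]
    pts = [d + o for d, o in zip(dts, offsets)]   # PTS = DTS + composition offset
    return dts, pts


if __name__ == "__main__":
    # Decoding order I P B b b with display order 0 4 2 1 3 (shifted by 2).
    dts, pts = sample_times([(5, 1)], [(1, 2), (1, 5), (1, 2), (1, 0), (1, 1)])
    print(dts, pts)
```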

Current frame is B
Previous frame (in decoding order) is also B (i.e. successive number of B frames is greater than one)
Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous frame is used for reference)
PTS of current B frame is smaller than that of the previous one

If all above conditions are met then B-pyramid is detected.

Slava

23+ years’ programming and theoretical experience in the computer science fields such as video compression, media streaming and artificial intelligence (co-author of several papers and patents).


