Video Compression

VideoNerd

The kvazaar HEVC encoder can be considered an alternative to x265, and in some respects kvazaar is even better than x265.

 

Content

Compilation of kvazaar

Running of kvazaar

Outline of Rate Control

Tiling Video

Scalable kvazaar

 

Getting Kvazaar Source

Official site:   http://ultravideo.cs.tut.fi/#encoder 

git clone https://github.com/ultravideo/kvazaar.git

Compilation of kvazaar

Open the Visual Studio solution located in the folder ‘build’.

Download vsyasm.exe from http://ultravideo.cs.tut.fi/vsyasm.exe and put it in any folder on the PATH (e.g. C:\Windows); vsyasm.exe is used to compile the assembly sources.

In Visual Studio, build the ‘kvazaar_lib’ project and then ‘kvazaar_cli’. The executable ‘kvazaar.exe’ is created in the directory ‘bin’ (e.g. for x64 Release mode: \bin\x64-Release\kvazaar.exe).

 

Running

Example: Encode 100 frames at 500 kbps, intra period 30 frames, 2×2 tiling grid

kvazaar -i test_1920x1080.yuv --input-res 1920x1080 --input-fps 30 --tiles 2x2 -p 30 --mv-constraint frametile --bitrate 500000 -n 100 -o test1.h265

Notes:

  • ‘-p 30’ specifies an intra period (GOP length in the conventional sense) of 30 frames. Note that in the kvazaar encoder the term GOP means the number of pictures in the B-pyramid (by default the encoder builds a pyramid of 4 frames), i.e. GOP = B-pyramid. If B frames are not used, the GOP size is 1, since the pyramid consists of a single P frame.
  • ‘--tiles 2x2’ divides each frame uniformly into a grid of 2×2 tiles. An alternative way to request uniform tiles is with --tiles-width-split and --tiles-height-split, e.g. for 2×2 tiling: --tiles-width-split u2 --tiles-height-split u2
  • ‘--mv-constraint frametile’: motion vectors don’t cross frame or tile boundaries.
  • By default each frame is accompanied by a ‘checksum’ hash value carried in a suffix SEI (21 bytes long). MD5 hash values can be generated instead with ‘--hash md5’.
  • By default an info SEI message is generated in the first frame, in which the encoding parameters are given in binary format.

To disable generation of the info SEI use ‘--no-info’.

 

To activate frame-level parallelism use ‘--owf N’.

By default kvazaar adds a hash SEI at the end of each frame; to disable this feature use --hash none.

Note: Tile boundaries can be visible, especially if --mv-constraint frametile is set.
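To see what a uniform grid such as ‘--tiles 2x2’ means in CTUs, here is a minimal Python sketch. It assumes 64×64 CTUs and the uniform-spacing rule of the HEVC specification; the function names are made up for illustration and are not part of kvazaar.

```python
CTU_SIZE = 64  # assumption: 64x64 CTUs


def uniform_tile_sizes(size_in_ctus, num_tiles):
    """Tile sizes in CTUs under the HEVC uniform-spacing rule."""
    return [
        (i + 1) * size_in_ctus // num_tiles - i * size_in_ctus // num_tiles
        for i in range(num_tiles)
    ]


def uniform_tile_grid(width, height, cols, rows):
    """Return (column widths, row heights) in CTUs for a cols x rows grid."""
    # Round the picture size up to whole CTUs.
    width_in_ctus = (width + CTU_SIZE - 1) // CTU_SIZE
    height_in_ctus = (height + CTU_SIZE - 1) // CTU_SIZE
    return (
        uniform_tile_sizes(width_in_ctus, cols),
        uniform_tile_sizes(height_in_ctus, rows),
    )


if __name__ == "__main__":
    print(uniform_tile_grid(1920, 1080, 2, 2))  # ([15, 15], [8, 9])
    print(uniform_tile_grid(352, 288, 2, 2))    # ([3, 3], [2, 3])
```

Since 1080 and 288 are not multiples of 64, the two tile rows are not equally tall even in a "uniform" grid.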

 

For low-delay IPPPP coding use '--gop lp' (with '--no-open-gop -p 60' the IDR interval is 60 frames):

kvazaar -i battlefield_384x320.yuv --input-res 384x320 --input-fps 60 -p 60 --no-open-gop --gop lp --vps-period 1 --mv-constraint frametile --bitrate 500000 --hash none  -o battlefield.h265

To generate a stream with B frames (B-pyramid) use --gop 8 or --gop 16:

kvazaar -i akiyo_cif.yuv --input-res 352x288 --no-info --input-fps 30 -p 32 --no-open-gop --gop 8 --vps-period 1 --tiles-width-split 64,128,192 --tiles-height-split u2 --bitrate 500000 --hash none --owf 1 --no-wpp -o akiyo-b-frames.h265

When tiles are not used (as in the low-delay example above), --mv-constraint frametile prevents motion vectors from pointing outside the picture (motion_vectors_over_pic_boundaries_flag = 0).

--vps-period 1 instructs the encoder to add VPS, SPS and PPS before each IDR frame (i.e. at the start of each GOP) to enable random access. With --vps-period 0, VPS, SPS and PPS are sent only with the very first frame; IDRs in the middle of the stream are not accompanied by VPS, SPS and PPS, so random access is not possible.

By default WPP is on; to disable WPP use '--no-wpp'.

--full-intra-search  instructs the encoder to check all 35 intra modes for each prediction unit. Setting this parameter inevitably increases encoding time; the penalty is about 7%.

--ml-pu-depth-intra  instructs the encoder to apply a machine learning algorithm to determine the CTU partition.

--rd 3    instructs the encoder to check all 35 intra prediction modes and select the best; otherwise a kind of logarithmic search is applied (every N-th mode is checked, followed by fine tuning). The ‘--rd 3’ mode significantly slows down encoding.

--lossless  invokes the lossless mode, i.e. the decoded frame is identical to the input frame.

--aud     instructs the kvazaar encoder to put an AUD (Access Unit Delimiter) at the start of each frame. AUDs are required for MPEG-TS.
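A quick way to check the effect of options such as --vps-period, --aud and --hash none is to list the NAL unit types in the produced stream. The following Python sketch assumes an Annex B byte stream (start-code prefixed, as in the .h265 files above). The helper name and the table of names are my own; the type numbers are those of the HEVC specification.

```python
# A few HEVC nal_unit_type values (from the HEVC specification).
NAL_NAMES = {
    19: "IDR_W_RADL", 20: "IDR_N_LP",
    32: "VPS", 33: "SPS", 34: "PPS", 35: "AUD",
    39: "PREFIX_SEI", 40: "SUFFIX_SEI",
}


def nal_unit_types(data):
    """Return the nal_unit_type of every NAL unit in an Annex B stream."""
    types = []
    pos = 0
    while True:
        # Every NAL unit is preceded by the 3-byte start code 00 00 01
        # (a 4-byte start code simply has one more leading zero byte).
        pos = data.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(data):
            break
        first_header_byte = data[pos + 3]
        # Header layout: 1 forbidden bit, then 6 bits of nal_unit_type.
        types.append((first_header_byte >> 1) & 0x3F)
        pos += 3
    return types


if __name__ == "__main__":
    # Synthetic stream: VPS, SPS, PPS, IDR slice, suffix SEI (headers only).
    stream = (b"\x00\x00\x00\x01\x40\x01" b"\x00\x00\x00\x01\x42\x01"
              b"\x00\x00\x00\x01\x44\x01" b"\x00\x00\x01\x26\x01"
              b"\x00\x00\x01\x50\x01")
    print([NAL_NAMES.get(t, t) for t in nal_unit_types(stream)])
```

For a real file, pass open("test1.h265", "rb").read() to nal_unit_types. With --vps-period 1 every IDR should be preceded by VPS, SPS and PPS, and with --hash none no SUFFIX_SEI entries should appear.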

  • How to specify the allowed prediction unit (PU) sizes. The HEVC standard allows PU sizes from 4×4 to 64×64. The kvazaar encoder restricts PU sizes to a smaller range, because checking all PU sizes is time-consuming. By default, the PU sizes 16×16 and 8×8 are allowed for both intra and inter modes. In the fast preset, for example, the PU sizes are 32×32, 16×16 and 8×8 for both intra and inter modes:
    "pu-depth-intra", "1-3",
    "pu-depth-inter", "1-3",

The CLI parameters --pu-depth-intra and --pu-depth-inter specify the PU size range. E.g. to enable 32×32 intra and inter PUs, add

               ‘--pu-depth-intra 1-3  --pu-depth-inter 1-3’.

kvazaar.exe -i akiyo_cif.y4m --qp 22 --tiles 2x2 --input-res 352x288 --no-info --intra-qp-offset 1 --no-wpp  --threads 0 --input-fps 30 -p 30 --bitrate 500000 --no-open-gop --pu-depth-intra 1-3 --pu-depth-inter 1-3 --gop lp-g4d1t1 --vps-period 1 --hash none --owf 1 -o akiyo.h265
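The depth values translate into PU sizes by halving the CTU width once per depth level. A minimal Python sketch of this mapping follows, assuming a 64×64 CTU and a maximum depth of 4 (which gives the 4×4 lower bound mentioned above); the function name is hypothetical.

```python
CTU_SIZE = 64   # assumption: 64x64 CTUs
MAX_DEPTH = 4   # assumption: depth 4 corresponds to 4x4


def pu_sizes(depth_range):
    """Map a depth range such as '1-3' to the list of square PU widths."""
    low, high = (int(part) for part in depth_range.split("-"))
    if not 0 <= low <= high <= MAX_DEPTH:
        raise ValueError("invalid depth range: " + depth_range)
    # Each depth level halves the block width: 64, 32, 16, 8, 4.
    return [CTU_SIZE >> depth for depth in range(low, high + 1)]


if __name__ == "__main__":
    print(pu_sizes("1-3"))  # [32, 16, 8]
    print(pu_sizes("0-3"))  # [64, 32, 16, 8]
```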

 

 

Outline of Rate Control

Rate control determines the number of bits at each level of a hierarchical structure: GOP (Group of Pictures), picture and CTU. The GOP size can range from 1 (GOP = single picture) to 32 frames.

Allocation of bits for GOP is performed at the start of each GOP by the function:  gop_allocate_bits

The number of bits per picture is computed by the following statement (if GOP is “flattened” then pic_weight is constant for each frame in the GOP):

const double pic_target_bits =
    state->frame->cur_gop_target_bits * pic_weight - pic_header_bits(state);

The number of bits per CTU is calculated by the function: lcu_allocate_bits
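The picture-level statement above can be illustrated with a small Python sketch. This is not kvazaar's code: the GOP budget is simply the average bitrate share (the real gop_allocate_bits may also account for bits spent so far), and the weights and header size are hypothetical numbers chosen for illustration.

```python
def gop_target_bits(bitrate_bps, fps, gop_len):
    """Average bit budget of one GOP (no feedback from earlier GOPs)."""
    return bitrate_bps / fps * gop_len


def picture_target_bits(gop_bits, pic_weights, header_bits):
    """Apply pic_target_bits = gop_bits * pic_weight - header_bits per picture."""
    return [gop_bits * weight - header_bits for weight in pic_weights]


if __name__ == "__main__":
    # 500 kbps at 30 fps with a GOP of 4 pictures.
    gop_bits = gop_target_bits(500_000, 30, 4)

    # "Flattened" GOP: every picture has the same weight.
    flat = picture_target_bits(gop_bits, [0.25] * 4, header_bits=200)

    # Hypothetical hierarchical weights: the last picture gets the most bits.
    skewed = picture_target_bits(gop_bits, [0.1, 0.2, 0.2, 0.5], header_bits=200)

    print([round(bits) for bits in flat])
    print([round(bits) for bits in skewed])
```

With a flattened GOP all pictures receive the same target, while with unequal weights the budget is shifted towards the pictures with the larger weight.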

 

 

Tiling Video

Tiling is an essential feature for 360 video. To cope with the bandwidth demands of 360 video and to reduce transmission delay, a new form of coding and packing, Tiled Video Streaming, has been adopted by the MPEG committee as an amendment to HEVC/ISOBMFF. This solution requires the HEVC encoder to divide frames into a fixed grid of tiles (not necessarily uniform). Each tile is encoded independently, with tile-constrained motion prediction and without deblocking filtering across tile boundaries.

 

 

The kvazaar encoder supports both tiles and a constrained motion vector mode, in which motion vectors do not cross picture or tile boundaries. Each tile can be encapsulated in its own slice (tile = slice, so no two tiles share a slice). For a 3×3 tiling grid the following arguments are used:

--tiles 3x3   --slices tiles    --mv-constraint frametilemargin

 

 Notes: 

  • More information on the kvazaar codec can be found in the paper “Kvazaar: Open-Source HEVC/H.265 Encoder” by Marko Viitanen et al.
  • kvazaar supports the y4m format as input. If the input file is y4m, the resolution and frame rate can be omitted from the command line, since this information is stored in the y4m file header.
  • To insert VPS/SPS/PPS at the start of each IDR use '--vps-period 1'
  • kvazaar supports the monochrome input format (yuv400p): --input-format P400.  To convert yuv420p into gray (monochrome) format I use the ffmpeg tool to extract only the Y plane:     ffmpeg -video_size 384x320 -i swbf_384x320.yuv -vf extractplanes=y  swbf_384x320_mono.y4m , then I apply kvazaar as follows:

kvazaar -i swbf_384x320_mono.y4m --input-format P400 -p 60 --vps-period 1 --no-open-gop --gop lp-g32d1t1 --mv-constraint frametile --bitrate 1000000 --hash none -o swbf_mono.h265

         By default kvazaar generates suffix hash SEIs at the end of each frame; to disable generation of hashes I use ‘--hash none’.

  • kvazaar supports symmetric (non-square) binary motion partitions: '--smp' enables the partitions 2NxN and Nx2N. By default these partitions are disabled.

  Example:

kvazaar -i  swbf_384x320.y4m -p 60 --vps-period 1 --no-open-gop --gop lp --mv-constraint frametile --bitrate 1000000  --smp --hash none -o swbf_smp.h265

 

  • kvazaar supports asymmetric binary motion partitions with the flag '--amp'; by default this mode is disabled:

 

kvazaar -i  swbf_384x320.y4m -p 60 --vps-period 1 --no-open-gop --gop lp --mv-constraint frametile --bitrate 1000000  --smp --amp --hash none -o swbf_smp.h265
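Regarding the y4m note above: the resolution and frame rate live in the first text line of the file, which starts with the magic word YUV4MPEG2 followed by space-separated tagged fields (W, H, F, C and others). A minimal Python sketch of reading those fields is given below; the function names are my own and only a few tags are handled.

```python
from fractions import Fraction


def parse_y4m_header(header_line):
    """Extract width, height, frame rate and chroma tag from a y4m header line."""
    tokens = header_line.strip().split(" ")
    if tokens[0] != "YUV4MPEG2":
        raise ValueError("not a y4m stream header")
    info = {}
    for token in tokens[1:]:
        if not token:
            continue
        tag, value = token[0], token[1:]
        if tag == "W":
            info["width"] = int(value)
        elif tag == "H":
            info["height"] = int(value)
        elif tag == "F":
            # Frame rate is stored as a ratio, e.g. F30000:1001.
            numerator, denominator = value.split(":")
            info["fps"] = Fraction(int(numerator), int(denominator))
        elif tag == "C":
            info["chroma"] = value
    return info


def read_y4m_info(path):
    """Read and parse the header line of a y4m file."""
    with open(path, "rb") as y4m_file:
        return parse_y4m_header(y4m_file.readline().decode("ascii"))


if __name__ == "__main__":
    print(parse_y4m_header("YUV4MPEG2 W384 H320 F60:1 Ip A1:1 C420jpeg\n"))
```

For example, read_y4m_info("swbf_384x320.y4m") would return the values that otherwise have to be passed with --input-res and --input-fps.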

 

Non-uniform tiling

Let’s suppose we wish to divide a CIF (352×288) image into 4 tiles: two columns split at pixel 256 and two uniform rows.

I use ‘--tiles-width-split 256 --tiles-height-split u2’, where ‘--tiles-width-split 256’ means that the second tile column starts at pixel 256 (the position must be divisible by 64) and ‘--tiles-height-split u2’ means that the two tile rows are uniform.

Multiple vertical tiles in the form of CTU columns:

kvazaar -i akiyo_cif.yuv --input-res 352x288 --no-info --input-fps 30 -p 30 --no-open-gop --gop lp --vps-period 1 --tiles-width-split 64,128,192 --tiles-height-split u2 --bitrate 500000 --hash none --owf 1 --no-wpp -o akiyo.h265
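The resulting tile widths follow directly from the split positions. Here is a minimal Python sketch, taking the positions as pixel offsets that must be multiples of 64, as described above; the function name is hypothetical.

```python
CTU_SIZE = 64  # assumption: split positions must be multiples of the CTU width


def tile_sizes_from_splits(total, splits):
    """Return tile sizes in pixels for the given split positions."""
    sizes = []
    previous = 0
    for position in splits:
        if position % CTU_SIZE != 0:
            raise ValueError("split %d is not a multiple of %d" % (position, CTU_SIZE))
        if not previous < position < total:
            raise ValueError("split %d is out of order or out of range" % position)
        sizes.append(position - previous)
        previous = position
    # The last tile takes whatever remains, even if it is not CTU-aligned.
    sizes.append(total - previous)
    return sizes


if __name__ == "__main__":
    print(tile_sizes_from_splits(352, [256]))           # [256, 96]
    print(tile_sizes_from_splits(352, [64, 128, 192]))  # [64, 64, 64, 160]
```

So ‘--tiles-width-split 64,128,192’ on a CIF frame gives three one-CTU-wide columns and a fourth column of 160 pixels.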

 

 

Scalable kvazaar

Kvazaar has a real-time implementation of SHVC (Scalable HEVC, with spatial and SNR scalability). The paper “REAL-TIME IMPLEMENTATION OF SCALABLE HEVC ENCODER” by Jaakko Laitinen, Ari Lemmetti and Jarno Vanne reports:

“On an 8-core Xeon W-2145 processor, the proposed spatially scalable Kvazaar can encode two-layer 1080p video above 50 fps with scaling ratios of 1.5 and 2. The respective coding gains are 18.4% and 9.9% over Kvazaar simulcast coding at similar speed.”
Note: The coding gain (up to 18%) is achieved by exploiting inter-layer redundancy.
        
To clone the scalable kvazaar, type the following:
git clone https://github.com/ultravideo/scalable-kvazaar
         
Enter the directory ‘scalable-kvazaar’, then the sub-folder ‘build’, and open kvazaar_VS2015.sln.  After building this Visual Studio solution, for example in x64-Release, the binary kvazaar.exe is created in scalable-kvazaar\bin\x64-Release
       
Example:  Encoding 1080p video as a dual-layer SNR-scalable stream (the base layer is coded with QP=30 and the enhancement layer with QP=20)
       
kvazaar.exe --input Fifa17_1920x1080.yuv --input-fps 50 --input-res 1920x1080 --preset=ultrafast --threads=8 --owf=2 -q 30 --layer --input  Fifa17_1920x1080.yuv --input-res 1920x1080 --input-fps 50 --preset=ultrafast --threads=8 --owf=2 -q 20 -o fifa1080p_dual_snr.h265
     
Notes
  • All parameters after '--layer' belong to the enhancement layer

 

  • Scalable video (in both SNR and spatial modes) provides better coding efficiency than simulcast owing to the exploitation of inter-layer redundancy. The paper “REAL-TIME IMPLEMENTATION OF SCALABLE HEVC ENCODER” by Jaakko Laitinen, Ari Lemmetti and Jarno Vanne reports that the kvazaar scalable encoder provides 8-9% of BD-rate gain in dual-layer mode (i.e. you can reduce the bitrate by 8% without compromising visual quality).
