The kvazaar HEVC encoder can be considered an alternative to x265, and in some aspects kvazaar is even better than x265.
Getting Kvazaar Source
Official site: http://ultravideo.cs.tut.fi/#encoder
git clone https://github.com/ultravideo/kvazaar.git
Compilation of kvazaar
Open the Visual Studio solution located in the folder ‘build’.
Download vsyasm.exe from http://ultravideo.cs.tut.fi/vsyasm.exe and place it in any folder on the PATH (e.g. in C:\Windows); vsyasm.exe is used to compile the assembly sources.
In Visual Studio, build the ‘kvazaar_lib’ project and then ‘kvazaar_cli’. The executable ‘kvazaar.exe’ is created in the directory ‘bin’ (e.g. for x64 Release mode: \bin\x64-Release\kvazaar.exe).
Running
Example: Encode 100 frames at 500 kbps (--bitrate is given in bits per second), GOP length 30 frames, 2×2 tiling grid
kvazaar -i test_1920x1080.yuv --input-res 352x288 --input-fps 30 --tiles 2x2 -p 30 --mv-constraint frametile --bitrate 500000 -n 100 -o test1.h265
Notes:
- ‘-p 30’ sets the intra period, i.e. a GOP length of 30 frames (closed GOP). Notice that in the kvazaar encoder the term GOP means the number of pictures in the B-pyramid (by default the encoder builds a pyramid of 4 frames), i.e. GOP = B-pyramid. If B frames are not used, the GOP size is 1, since the pyramid consists of a single P frame.
- ‘--tiles 2x2’ divides each frame uniformly into a grid of 2×2 tiles. An alternative way to request uniform tiles is with --tiles-width-split and --tiles-height-split, e.g. for 2×2 tiling: --tiles-width-split u2 --tiles-height-split u2
- ‘--mv-constraint frametile’: motion vectors don’t cross tile boundaries.
- By default each frame is accompanied by a ‘checksum’ hash value carried in a suffix SEI (21 bytes long). To generate MD5 hash values instead use ‘--hash md5’; to disable the hash SEI altogether use ‘--hash none’.
- By default an info SEI message is generated with the first frame; it carries the encoding parameters in binary format. To disable generation of the info SEI use ‘--no-info’.
- To activate frame-level parallelism use ‘--owf N’.
- Note: tile boundaries can be visible, especially if --mv-constraint frametile is set.
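As a rough illustration of what a uniform split such as ‘--tiles 2x2’ or ‘u2’ produces, the sketch below applies the uniform-spacing rule of the HEVC specification to a CIF frame. It assumes a 64×64 CTU and that the encoder follows this rule exactly; the function name uniform_tile_sizes is hypothetical and is not part of kvazaar.

```python
import math


def uniform_tile_sizes(pic_size_px, num_tiles, ctu_size=64):
    """Return tile sizes in pixels along one dimension for uniform spacing.

    Tiles are made of whole CTUs; the last tile is clipped to the picture edge.
    ctu_size=64 is an assumption of this sketch.
    """
    size_in_ctus = math.ceil(pic_size_px / ctu_size)
    sizes_px = []
    start = 0
    for i in range(num_tiles):
        # HEVC uniform-spacing rule, expressed in CTUs
        ctus = ((i + 1) * size_in_ctus) // num_tiles - (i * size_in_ctus) // num_tiles
        end = min(start + ctus * ctu_size, pic_size_px)
        sizes_px.append(end - start)
        start = end
    return sizes_px


cols = uniform_tile_sizes(352, 2)  # tile column widths for CIF
rows = uniform_tile_sizes(288, 2)  # tile row heights for CIF
print(cols, rows)
```

Under these assumptions a 2×2 grid on a 352×288 frame is not four equal quarters: the tiles are aligned to CTU boundaries, so the columns and rows differ in size.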
For low-delay coding (IPPPP) use '--gop lp' (with '--no-open-gop -p 60' the IDR interval is 60 frames):
kvazaar -i battlefield_384x320.yuv --input-res 384x320 --input-fps 60 -p 60 --no-open-gop --gop lp --vps-period 1 --mv-constraint frametile --bitrate 500000 --hash none -o battlefield.h265
To generate a stream with B frames (B-pyramid) use --gop 8 or --gop 16:
kvazaar -i akiyo_cif.yuv --input-res 352x288 --no-info --input-fps 30 -p 32 --no-open-gop --gop 8 --vps-period 1 --tiles-width-split 64,128,192 --tiles-height-split u2 --bitrate 500000 --hash none --owf 1 --no-wpp -o akiyo-b-frames.h265
When tiles are not used, --mv-constraint frametile prevents motion vectors from pointing outside the picture (motion_vectors_over_pic_boundaries_flag = 0).
--vps-period 1 adds VPS, SPS and PPS before each IDR frame (i.e. at the start of each GOP) to enable random access. With --vps-period 0 the VPS, SPS and PPS are sent only with the very first frame; IDRs in the middle of the stream are not accompanied by VPS, SPS and PPS, so random access is not possible.
By default WPP is on; to disable WPP use '--no-wpp'.
--full-intra-search instructs the encoder to check all 35 intra modes for each prediction unit. This inevitably slows encoding; the penalty is an increase of ~7% in encoding time.
--ml-pu-depth-intra instructs the encoder to apply a machine learning algorithm to determine the CTU partition.
--rd 3 instructs the encoder to check all 35 intra prediction modes and select the best one; otherwise a kind of logarithmic search is applied (every N-th mode is checked, followed by fine tuning). The ‘--rd 3’ mode slows down encoding significantly.
--lossless invokes the lossless mode, i.e. the decoded frame is identical to the input frame.
--aud instructs the kvazaar encoder to put an AUD (Access Unit Delimiter) at the start of each frame. AUDs are required for MPEG-TS.
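To give a feeling for the difference between checking all intra modes and the "every N-th mode plus fine tuning" idea mentioned above, here is a generic coarse-to-fine search sketch. It is not kvazaar's actual search: the function name, the step of 4, the restriction to the angular modes 2..34 and the toy cost function are all assumptions made for illustration.

```python
def coarse_to_fine_mode_search(cost, modes=range(2, 35), step=4):
    """Check every `step`-th mode, then refine around the best one.

    `cost` is any callable returning the cost of a mode (lower is better).
    Returns (best_mode, number_of_modes_evaluated).
    """
    modes = list(modes)
    evaluated = {}

    def cost_at(index):
        # Cache costs so that each mode is evaluated at most once
        if index not in evaluated:
            evaluated[index] = cost(modes[index])
        return evaluated[index]

    # Coarse pass over every `step`-th mode
    best = min(range(0, len(modes), step), key=cost_at)

    # Fine tuning: halve the step and look at both neighbours of the best mode
    while step > 1:
        step //= 2
        for cand in (best - step, best + step):
            if 0 <= cand < len(modes) and cost_at(cand) < cost_at(best):
                best = cand
    return modes[best], len(evaluated)


# Toy cost with a single minimum at mode 21 (hypothetical, for illustration only)
mode, checked = coarse_to_fine_mode_search(lambda m: abs(m - 21))
print(mode, checked)
```

With a well-behaved cost the search finds the same mode as a full search while evaluating far fewer candidates; with a cost that has several local minima it can miss the best mode, which is why a full search can give better quality at the price of encoding time.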
- How to specify the allowable prediction unit (PU) sizes. The HEVC standard allows PU sizes from 4×4 to 64×64. The kvazaar encoder restricts PU sizes to a smaller range (checking all PU sizes is time-consuming). By default, PU sizes 16×16 and 8×8 are allowed for both intra and inter modes. In the fast preset, for example, the PU sizes are 32×32, 16×16 and 8×8 for both intra and inter modes:
"pu-depth-intra", "1-3",
"pu-depth-inter", "1-3",
The CLI parameters --pu-depth-intra and --pu-depth-inter specify the PU size range. E.g. to enable 32×32 intra and inter PUs one needs to add ‘--pu-depth-intra 1-3 --pu-depth-inter 1-3’.
kvazaar.exe -i akiyo_cif.y4m --qp 22 --tiles 2x2 --input-res 352x288 --no-info --intra-qp-offset 1 --no-wpp --threads 0 --input-fps 30 -p 30 --bitrate 500000 --no-open-gop --pu-depth-intra 1-3 --pu-depth-inter 1-3 --gop lp-g4d1t1 --vps-period 1 --hash none --owf 1 -o akiyo.h265
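The relation between a PU depth range and the PU sizes can be sketched as follows. The sketch assumes a 64×64 CTU with depth 0 corresponding to 64×64 and each further depth halving the block size, which is consistent with the "1-3" range giving 32×32, 16×16 and 8×8; the helper name pu_sizes_for_depth_range is hypothetical.

```python
def pu_sizes_for_depth_range(depth_range, ctu_size=64):
    """Map a 'min-max' depth range string to the list of square PU sizes.

    Assumes depth 0 is the full CTU (64x64) and depth 4 is 4x4.
    """
    lo, hi = (int(part) for part in depth_range.split("-"))
    if not 0 <= lo <= hi <= 4:
        raise ValueError("depth range must satisfy 0 <= min <= max <= 4")
    return [ctu_size >> depth for depth in range(lo, hi + 1)]


print(pu_sizes_for_depth_range("1-3"))  # the range used in the command above
print(pu_sizes_for_depth_range("2-3"))
```

Widening the range lets the encoder try more block sizes, which usually improves compression but costs encoding time.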
Rate control determines the number of bits for each level of a hierarchical structure: GOP (Group of Pictures), picture and CTU. The GOP size can range from 1 (GOP = single picture) to 32 frames.
Bits are allocated to a GOP at the start of each GOP by the function: gop_allocate_bits
The number of bits per picture is computed by the following statement (if the GOP is “flattened”, pic_weight is the same for every frame in the GOP):
const double pic_target_bits =
state->frame->cur_gop_target_bits * pic_weight - pic_header_bits(state);
The number of bits per CTU is calculated by the function: lcu_allocate_bits
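A simplified numeric sketch of the picture-level formula is given below. It assumes that the GOP budget is simply the per-frame budget multiplied by the GOP length and that the picture weights within a GOP sum to 1; a real rate controller also corrects for bits already spent, which is ignored here. The function names are hypothetical and do not come from the kvazaar sources.

```python
def gop_target_bits(bitrate_bps, fps, gop_len):
    """Bit budget of one GOP, ignoring any correction for past over/undershoot."""
    return bitrate_bps / fps * gop_len


def picture_target_bits(gop_bits, pic_weight, header_bits=0.0):
    """Mirror of: cur_gop_target_bits * pic_weight - pic_header_bits."""
    return gop_bits * pic_weight - header_bits


# 500 kbps at 30 fps with a GOP of 4 pictures
gop_bits = gop_target_bits(500_000, 30, 4)

# "Flattened" GOP: every picture gets the same weight
flat = [picture_target_bits(gop_bits, 1 / 4) for _ in range(4)]
print(round(gop_bits), [round(b) for b in flat])
```

With non-flat weights the same GOP budget is redistributed, typically giving more bits to the pictures that other pictures reference.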
Tiling Video
Tiling is an essential feature for 360 video. To cope with the bandwidth problem of 360 video and to reduce transmission delay, a new form of coding and packing, Tiled Video Streaming, has been adopted by the MPEG committee as an amendment to HEVC/ISOBMFF. This solution requires the HEVC encoder to divide frames into a fixed grid of tiles (not necessarily uniform); each tile is encoded independently, with tile-constrained motion prediction and without deblocking filtering across tile boundaries.
The kvazaar encoder supports both tiles and a constrained motion vector mode, i.e. motion vectors do not cross picture/tile boundaries. Each tile can be encapsulated in a single slice (tile = slice, no two tiles share a slice). For a 3×3 tiling grid the following arguments are applied:
--tiles 3x3 --slices tiles --mv-constraint frametilemargin
Notes:
- More information on the kvazaar codec can be found in the paper “Kvazaar: Open-Source HEVC/H.265 Encoder”.
- kvazaar supports the y4m format as input. If the input file is y4m, the resolution and frame rate can be omitted from the command line, since this information is stored in the y4m file header.
- To insert VPS/SPS/PPS at the start of each IDR use '--vps-period 1'.
- kvazaar supports monochrome input (yuv400p) with --input-format P400. To convert yuv420p into gray (monochrome) format I use ffmpeg to extract only the Y plane:
ffmpeg -video_size 384x320 -i swbf_384x320.yuv -vf extractplanes=y swbf_384x320_mono.y4m
Then I apply kvazaar as follows:
kvazaar -i swbf_384x320_mono.y4m --input-format P400 -p 60 --vps-period 1 --no-open-gop --gop lp-g32d1t1 --mv-constraint frametile --bitrate 1000000 --hash none -o swbf_mono.h265
By default kvazaar generates suffix hash SEIs at the end of each frame; to disable generation of hashes I use ‘--hash none’.
- kvazaar supports symmetric (non-square) binary motion partitions: '--smp' enables the partitions 2NxN and Nx2N. By default these partitions are disabled.
Example:
kvazaar -i swbf_384x320.y4m -p 60 --vps-period 1 --no-open-gop --gop lp --mv-constraint frametile --bitrate 1000000 --smp --hash none -o swbf_smp.h265
- kvazaar supports asymmetric binary motion partitions with the flag '--amp'; by default this mode is disabled:
kvazaar -i swbf_384x320.y4m -p 60 --vps-period 1 --no-open-gop --gop lp --mv-constraint frametile --bitrate 1000000 --smp --amp --hash none -o swbf_smp.h265
Non-uniform tiling
Let’s suppose we wish to divide CIF (352×288) image into 4 tiles as follows:
I use ‘--tiles-width-split 256 --tiles-height-split u2’, where '--tiles-width-split 256' means that the second tile column starts at pixel 256 (the position must be divisible by 64) and ‘--tiles-height-split u2’ means that the two tile rows are uniform.
Multiple vertical tiles in the form of CTU columns:
kvazaar -i akiyo_cif.yuv --input-res 352x288 --no-info --input-fps 30 -p 30 --no-open-gop --gop lp --vps-period 1 --tiles-width-split 64,128,192 --tiles-height-split u2 --bitrate 500000 --hash none --owf 1 --no-wpp -o akiyo.h265
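The split positions used in this section can be checked with a small helper before running the encoder. This is only a sketch: the function name tile_sizes_from_splits is hypothetical, and the divisibility check uses the 64-pixel rule stated above.

```python
def tile_sizes_from_splits(pic_size_px, splits, ctu_size=64):
    """Return tile sizes in pixels given the start positions of tiles 2..N.

    Raises ValueError if a split is not CTU-aligned, not increasing,
    or not strictly inside the picture.
    """
    sizes = []
    prev = 0
    for pos in splits:
        if pos % ctu_size != 0:
            raise ValueError(f"split {pos} is not divisible by {ctu_size}")
        if not prev < pos < pic_size_px:
            raise ValueError(f"split {pos} is out of order or outside the picture")
        sizes.append(pos - prev)
        prev = pos
    sizes.append(pic_size_px - prev)  # the last tile takes the remainder
    return sizes


print(tile_sizes_from_splits(352, [256]))           # --tiles-width-split 256
print(tile_sizes_from_splits(352, [64, 128, 192]))  # --tiles-width-split 64,128,192
```

For a CIF frame the last tile column absorbs whatever is left after the last split, so it is generally not a multiple of 64 pixels wide.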
Kvazaar has a real-time implementation of SHVC (Scalable HEVC, with spatial and SNR scalability). In the paper “REAL-TIME IMPLEMENTATION OF SCALABLE HEVC ENCODER” by Jaakko Laitinen, Ari Lemmetti and Jarno Vanne it’s reported:
kvazaar.exe --input Fifa17_1920x1080.yuv --input-fps 50 --input-res 1920x1080 --preset=ultrafast --threads=8 --owf=2 -q 30 --layer --input Fifa17_1920x1080.yuv --input-res 1920x1080 --input-fps 50 --preset=ultrafast --threads=8 --owf=2 -q 20 -o fifa1080p_dual_snr.h265
- All parameters after '--layer' belong to the enhancement layer.
- Scalable video (in both SNR and spatial modes) gives better coding efficiency than simulcast because it exploits inter-layer redundancy. The paper “REAL-TIME IMPLEMENTATION OF SCALABLE HEVC ENCODER” by Jaakko Laitinen, Ari Lemmetti and Jarno Vanne reports that the kvazaar scalable encoder provides a BD-rate gain of 8-9% in dual-layer mode (i.e. you can reduce the bitrate by about 8% without compromising visual quality).