How Perform Scaling with ffmpeg

Scaling with JSVM DownConvertStatic.exe

There are many techniques to scale images, most popular are Nearest Neighbor, Bilinear, Bicubic, Spline, and Lanczos.

The Nearest Neighbor considers the closest pixel, Bilinear – the closest 2×2 pixels, Bicubic the closest 4 × 4 pixels.  Spline and Lanczos consider more surrounding pixels. If we denote ‘–>’ as less complex the above-mentioned techniques are ordered as follows:

Nearest Neighbor –> Bilinear  –> Bicubic –> Spline  –> Lanczos

The following quote is taken from the book “OpenCV 3.x with Python By Example – Second Edition“, the section “Image Scaling”:

If we are enlarging an image, it’s preferable to use linear or cubic interpolation. If we are shrinking an image, it’s preferable to use area-based interpolation.

Similar results are reported in the paper  – there is a difference in best up-scaling and down-scaling methods.

 

  • For up-scaling  1024×768  to 1920×1080  of 24-bit RGB  ranging of LibAv scaling methods (a part of ffmpeg):

So, SWS_BILINEAR is best method for up-scaling – no visual distortions and high processing rate . BICUBIC is apparently better in quality but the processing rate is relatively low – 78fps.

 

  • For down-scaling  the results are different

SWS_BILINEAR is not best solution for down-scaling, the best solution is SWS_AREA , no visual differences and high processing rate.

Notice, in the paper “Comparison gallery of image scaling algorithms”  bi-cubic method seems a good, although and not the best.

How Perform Scaling with ffmpeg

1) Decode and scale with Lanczos method to the resolution 352×288, output is stored in y4m raw video format

ffmpeg -i crowd.h264  -pix_fmt yuv420p  -vsync 0  -s 352×288 -sws_flags lanczos -y crowd_cif.y4m

here, ‘-vsync 0’ is necessary, otherwise ffmpeg might discard frames

Note: instead of Lanczos technique you can choose: 

-sws_flags bilinear’ , ‘-sws_flags bicubic’, ‘-sws_flags neighbor’ (the simplest method), ‘-sws_flags area’ , ‘-sws_flags spline’ etc.

 

Scaling with JSVM DownConvertStatic.exe

The repository https://vcgit.hhi.fraunhofer.de/jvet/jsvm.git   contains JSVM (scalable h264) codec plus several tools. One of the tools is DownConvertStatic.exe   – scaler of yuv files.

To build JSVM reference codec

1) clone JSVM

git clone https://vcgit.hhi.fraunhofer.de/jvet/jsvm.git

2) go to the folder jsvm\JSVM\H264Extension\build\windows  and open H264AVCVideoEncDec_vc10.sln (if you have more advanced version of Visual Studio make retarget)

3) Choose Release/x64 and built the solution (some projects apparently would fail to be built).

4) exe-files are created at the folder jsvm\bin64

 

Usage:

DownConvertStatic   input-width  input-height  input-yuv   output-width  output-height  output-yuv   method

method –  0 standard scaling method, 1 for dyadic scaling

 

 

Example: downscale 1920×1080 to 352×288

DownConvertStatic.exe  1920 1080 crowd_1920x1080_50fps.yuv 352 288 crowd_cif.yuv 0

Resampler

   500 frames converted

in 9.64 seconds => 19 ms/frame

 

 

Note:

Upscaling interpolation-based methods (bilinear, bicubic, and Lanczos etc.), which are available in ffmpeg tool, update pixel magnitudes by weighted averaging of neighboring pixel values from a low-resolution original frame.  These algorithms upscale pretty good smooth regions due to local similarities but they fail at abrupt spatial discontinuities (e.g. edges).

There are a class of upscaling methods, called “single-image or video super-resolution”, which takes into account edges and high-detailed areas. The goal of these super-resolution methods is to magnify an image while maintaining the sharpness of the edges and the details in the image.

11 Responses

Leave a Reply

Your email address will not be published. Required fields are marked *