Video Compression

VideoNerd

The presentation Motion Estimation for Video Coding covers the following topics:

1) General intro to motion prediction models – translational and affine

2) Taxonomy of Motion Estimation Methods

3) Use case: Block Matching Motion Estimation

4) Use cases of motion estimation for HEVC/H.265 and AV1

 

Notes:

  • The simplest motion model in Motion Estimation is block translation. It assumes that a frame is composed of moving blocks and that the motion of each block can be characterized by a single translation vector V. This model is effective for small motions.
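As an illustration of the translational model, here is a minimal sketch of exhaustive (full-search) block matching. The SAD cost, the function names and the toy frames are assumptions made for this example, not something prescribed by the presentation. The vector returned here points from the block in the current frame to its best match in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a displaced block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return the (dx, dy) within +/-search_range that minimises SAD for one block."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost


# Toy frames (hypothetical data): a deterministic texture, with the current
# frame equal to the reference shifted 2 pixels right and 1 pixel down.
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(16)] for y in range(16)]
cur = [[ref[(y - 1) % 16][(x - 2) % 16] for x in range(16)] for y in range(16)]

mv, cost = full_search(cur, ref, bx=4, by=4, size=4, search_range=3)
print(mv, cost)
```

Full search is the most expensive strategy; faster search patterns trade some accuracy for fewer SAD evaluations.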

 

  • The orthographic projection of the 3D rigid motion of a planar surface can be approximated by a six-parameter affine model, which is applied to each block (located at (x, y)) as a whole:

x_new = a1·x + a2·y + a3

y_new = a4·x + a5·y + a6

The affine approximation is good if the distance of the planar surface from the camera is large enough (in that case all rays from the planar object to the camera can be assumed parallel).

Examples:

Isotropic scaling (Zoom)

x_new = k·x

y_new = k·y
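The six-parameter model and the zoom example can be sketched in a few lines. The helper names (`affine_map`, `zoom`, `translation`) are hypothetical and chosen only for this illustration. Zoom corresponds to a1 = a5 = k with all other parameters zero, and pure translation is the special case a1 = a5 = 1, a2 = a4 = 0.

```python
def affine_map(x, y, a):
    """Apply the six-parameter affine model a = (a1, ..., a6) to the point (x, y)."""
    a1, a2, a3, a4, a5, a6 = a
    return a1 * x + a2 * y + a3, a4 * x + a5 * y + a6


def zoom(k):
    """Isotropic scaling: a1 = a5 = k, all other parameters zero."""
    return (k, 0.0, 0.0, 0.0, k, 0.0)


def translation(tx, ty):
    """Pure translation: identity matrix plus an offset (a3, a6) = (tx, ty)."""
    return (1.0, 0.0, tx, 0.0, 1.0, ty)


corners = [(0, 0), (8, 0), (0, 8), (8, 8)]  # corners of an 8x8 block
zoomed = [affine_map(x, y, zoom(1.5)) for x, y in corners]
shifted = [affine_map(x, y, translation(3, -2)) for x, y in corners]
print(zoomed)
print(shifted)
```

Unlike the translational model, every pixel of the block can get a different displacement here, which is what lets the model follow zoom.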

 

  • Noise can cause Motion Estimation (ME) to lock onto local minima, so false motion vectors are generated (it’s like a false memory).
    To cope with noise in ME, it’s not uncommon to apply hierarchical motion estimation:
    1) perform a coarse motion estimation on a filtered image (noise-reduced, often decimated)
    2) perform a fine motion estimation on the original image around the candidate motion vectors found in the coarse ME phase.
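The two phases above can be sketched as a two-level search. This is a simplified illustration under stated assumptions: 2x2 averaging stands in for the noise filter and decimation, the cost is SAD, the refinement radius is 1 pixel, and all function names and the toy frames are hypothetical.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """SAD between a block in `cur` and the block displaced by (dx, dy) in `ref`."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def search(cur, ref, bx, by, size, center, radius):
    """Best (dx, dy) within `radius` of `center`, ignoring out-of-frame candidates."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            if 0 <= bx + dx <= w - size and 0 <= by + dy <= h - size:
                cost = block_sad(cur, ref, bx, by, dx, dy, size)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best[1]


def decimate(frame):
    """Average each 2x2 neighbourhood: a crude low-pass filter plus 2:1 decimation."""
    return [[(frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) / 4
             for x in range(0, len(frame[0]) - 1, 2)]
            for y in range(0, len(frame) - 1, 2)]


def hierarchical_me(cur, ref, bx, by, size, coarse_radius):
    # Phase 1: coarse search on the filtered, decimated frames.
    cdx, cdy = search(decimate(cur), decimate(ref),
                      bx // 2, by // 2, size // 2, (0, 0), coarse_radius)
    # Phase 2: refine by +/-1 pixel on the original frames,
    # around the coarse candidate scaled back to full resolution.
    return search(cur, ref, bx, by, size, (2 * cdx, 2 * cdy), 1)


# Toy frames (hypothetical data): current frame is the reference
# shifted 4 pixels right and 2 pixels down.
n = 32
ref = [[x * x + 3 * y * y for x in range(n)] for y in range(n)]
cur = [[ref[(y - 2) % n][(x - 4) % n] for x in range(n)] for y in range(n)]

mv = hierarchical_me(cur, ref, bx=12, by=12, size=8, coarse_radius=3)
print(mv)
```

The coarse phase covers an effective range of +/-6 pixels at full resolution while evaluating far fewer candidates than a full-resolution search of the same range.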

 

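  • When an object is moving fast, it can leave the search window between consecutive frames. Therefore 60 fps video is more favorable for Motion Estimation than 30 fps: the shorter frame interval halves the displacement per frame.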
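A quick back-of-the-envelope check of the frame-rate argument. The object speed and the search range below are hypothetical numbers picked only to illustrate the effect.

```python
def displacement_per_frame(speed_px_per_s, fps):
    """Pixels travelled by the object between two consecutive frames."""
    return speed_px_per_s / fps


def stays_in_window(speed_px_per_s, fps, search_range):
    """True if the per-frame displacement fits inside a +/-search_range window."""
    return displacement_per_frame(speed_px_per_s, fps) <= search_range


speed = 1200.0     # hypothetical object speed, in pixels per second
search_range = 32  # hypothetical +/-32 pixel search window

for fps in (30, 60):
    print(fps, displacement_per_frame(speed, fps), stays_in_window(speed, fps, search_range))
```

With these numbers the object moves 40 pixels per frame at 30 fps (outside the window) and 20 pixels per frame at 60 fps (inside it).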

 

  • Motion in natural pictures varies slowly from one macroblock to another. In other words, the motion vectors of neighboring macroblocks are closely correlated. Motion Estimation takes this assumption into account by favoring motion vectors aligned with those of neighboring macroblocks.

 

  • Because a significant amount of correlation exists between neighboring blocks’ motion vectors, the motion vectors are themselves predicted from already transmitted motion vectors, and only the motion vector prediction error is encoded.
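A minimal sketch of motion vector prediction. The notes do not say which predictor is used; the component-wise median of the left, top and top-right neighbors (in the style of H.264) is an assumption made here for illustration, and the function names are hypothetical.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, top, top_right):
    """Component-wise median of three already transmitted neighboring vectors."""
    return (median3(left[0], top[0], top_right[0]),
            median3(left[1], top[1], top_right[1]))


def encode_mvd(mv, predictor):
    """Prediction error (motion vector difference) that would be entropy coded."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


def decode_mv(mvd, predictor):
    """Decoder side: rebuild the vector from the same predictor plus the difference."""
    return (mvd[0] + predictor[0], mvd[1] + predictor[1])


# Hypothetical neighboring vectors and the vector found for the current block.
left, top, top_right = (4, -1), (5, 0), (3, -1)
mv = (5, -1)

pred = predict_mv(left, top, top_right)
mvd = encode_mvd(mv, pred)
print(pred, mvd, decode_mv(mvd, pred))
```

When neighboring vectors are correlated the difference stays close to zero, which is cheaper to code than the vector itself.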

 

  • According to the paper “Motion-Compensating Prediction with Fractional-Pel Accuracy”, B. Girod, 1993:

For sequences with high motion activity, quarter-pixel accuracy seems to be sufficient, while for sequences containing lower motion activity the half-pixel solutions are typically sufficient.

 

  • Motion Activity of the k-th frame is the average magnitude of all motion vectors |vi(k)| (N in total), normalized by the magnitude of the maximal possible motion vector |vmax| (which equals half of the motion search range), in percent:

MA(k) = 100% · (1/N) · Σ_{i=1..N} |vi(k)| / |vmax|

The motion activity ranges from 0 to 100%.
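The definition above translates directly into code. In this sketch the function name and the sample vectors are hypothetical, and `v_max` is passed in as the magnitude of the maximal possible motion vector; the result stays within 0 to 100% as long as no vector is longer than `v_max`.

```python
import math


def motion_activity(motion_vectors, v_max):
    """Average motion vector magnitude of one frame, normalized by v_max, in percent."""
    mean_magnitude = sum(math.hypot(vx, vy) for vx, vy in motion_vectors) / len(motion_vectors)
    return 100.0 * mean_magnitude / v_max


# Hypothetical motion vectors of one frame (N = 4) and a maximal magnitude of 16 pixels.
vectors = [(3, 4), (0, 0), (6, 8), (0, 5)]
activity = motion_activity(vectors, v_max=16)
print(activity)
```

Here the magnitudes are 5, 0, 10 and 5, so the average is 5 pixels and the activity is 5 / 16 of the maximum.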
