QP-Modulation (or Adaptive Quantization) methods are based on the well-known empirical fact that the human visual system (HVS) is less sensitive to distortions in high-complexity areas than to distortions in homogeneous regions. Moreover, high complexity often corresponds to fast motion, where a decrease in quality is less noticeable.
By the way, this property has been widely exploited in many image and video coding standards via custom quantization matrices (beginning with JPEG).
Generally speaking, QP-Modulation (Adaptive Quantization) changes QP according to the following rules:
1) In high-detail (high-complexity) regions QP is increased (i.e. quantization becomes coarser, quantization errors grow and fewer bits are consumed). Note that increasing QP too much can cause blurring due to heavy suppression of high-frequency components.
2) In homogeneous (or “smooth”) regions QP is decreased (i.e. quantization becomes finer, quantization errors shrink and more bits are consumed).
Note that QP-modulation itself costs bits, because the per-block QP values must be signaled.
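The two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm of any particular encoder: block variance is used as the complexity measure, and the names spatial_qp_offsets, strength and max_offset, as well as the 0..51 QP range, are chosen here for the example.

```python
import math


def block_variance(block):
    """Population variance of a flat list of luma samples."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n


def spatial_qp_offsets(blocks, strength=1.0, max_offset=6):
    """Return one integer QP offset per block (hypothetical helper).

    Blocks that are more complex than the frame average get a positive
    offset (coarser quantization), smoother blocks get a negative one.
    """
    # Log-domain energy; the +1 avoids log2(0) for perfectly flat blocks.
    energies = [math.log2(block_variance(b) + 1.0) for b in blocks]
    avg = sum(energies) / len(energies)
    offsets = []
    for e in energies:
        raw = strength * (e - avg)
        # Clamp so that a single block cannot swing QP too far.
        offsets.append(max(-max_offset, min(max_offset, round(raw))))
    return offsets


def modulate_qp(base_qp, offsets, qp_min=0, qp_max=51):
    """Apply the offsets to the frame QP, clamped to an assumed QP range."""
    return [max(qp_min, min(qp_max, base_qp + d)) for d in offsets]


if __name__ == "__main__":
    flat = [128] * 64          # homogeneous 8x8 block
    textured = [0, 255] * 32   # high-contrast 8x8 block
    offsets = spatial_qp_offsets([flat, textured])
    print(offsets, modulate_qp(30, offsets))
```

The clamp on the offset reflects the warning in rule 1: without a limit, a very busy block could be pushed to a QP high enough to cause visible blurring.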
The QP-modulation method described above is sometimes called “Spatial QP-modulation”. x264 and x265 support Adaptive Quantization; it is controlled by the switches aq-mode and aq-strength.
In addition to Spatial QP-Modulation, there is “Temporal QP-modulation” (NVIDIA supports temporal Adaptive Quantization on some devices). Temporal QP-modulation (temporal Adaptive Quantization) is based on the observation that HVS detection thresholds for blockiness and other artifacts rise dramatically in the vicinity of a temporal discontinuity (a scene cut or flash lights) and taper off quickly as the distance from the temporal edge increases.
Example of flash lights:
Most interestingly, the rise in visual impairment detection thresholds is observed already slightly before the temporal edge (backward masking). This suggests that the visual cortex uses a kind of look-ahead processing. By the way, a similar phenomenon is observed in audio perception, where it is called ‘backward temporal masking’.
Practical hint: due to temporal masking, at a scene cut you can safely increase QPs even in smooth areas. Under some circumstances you can even discard a frame around a scene cut.
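The practical hint can be illustrated with a small Python sketch that raises the frame QP around each scene cut and lets the boost taper off with distance. Everything specific here is an assumption made for the example: the function name temporal_qp_offsets, the linear taper, the peak boost of 4, and the reach of 3 frames after and 1 frame before the cut (the latter modeling backward masking). Real masking thresholds would have to be tuned perceptually.

```python
def temporal_qp_offsets(num_frames, scene_cuts, peak=4,
                        forward_reach=3, backward_reach=1):
    """Return a per-frame QP boost around scene cuts (hypothetical helper).

    scene_cuts holds the index of the first frame of each new scene.
    The boost is largest at the cut and decays linearly to zero just
    beyond the given reach in each direction.
    """
    offsets = [0] * num_frames
    for cut in scene_cuts:
        first = max(0, cut - backward_reach)
        last = min(num_frames - 1, cut + forward_reach)
        for f in range(first, last + 1):
            dist = f - cut
            # Masking is assumed to reach further forward than backward.
            reach = forward_reach if dist >= 0 else backward_reach
            boost = round(peak * (1 - abs(dist) / (reach + 1)))
            # Where the windows of two cuts overlap, keep the larger boost.
            offsets[f] = max(offsets[f], boost)
    return offsets


if __name__ == "__main__":
    print(temporal_qp_offsets(10, [5]))
```

These offsets would be added on top of the frame QP, so they also affect smooth areas, as the hint above suggests.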
23+ years’ programming and theoretical experience in the computer science fields such as video compression, media streaming and artificial intelligence (co-author of several papers and patents).