Detection of Shot Boundaries and Gradual Transitions
Introduction: This paper was inspired by the book “Content-Based Analysis of Digital Video” by Alan Hanjalic (2004). Reliable detection of scene cuts and gradual transitions (fades, wipes, dissolves) can improve coding efficiency (e.g. after a scene cut the next frame can be coded as an I-frame). In addition, a video index or scene index can be […]
Flat/Texture Block Classification
The method of detecting flat/texture/detail areas in the frequency domain is well known and is described in many variations in the literature (therefore I omit references to specific articles). I provide a rather schematic description of the method. Knowing the type of a block (flat or detailed) makes it possible to adjust quantization step sizes (it is well known that the human […]
Discontinuity Indicators
The main purpose of the discontinuity_indicator flag (which is signaled in the adaptation field of a TS packet) is to make an excerpt/segment easy to concatenate or insert into an MPEG-2 Systems stream (TS stream). Notice that a TS stream commonly comprises a number of elementary media streams (video and audio). If you wish to concatenate your TS stream to another one you […]
How to Count AVC/H.264 Video Frames in an MPEG-2 Systems (Transport Stream) File?
A transport stream commonly comprises several elementary (“atomic”) streams (e.g. video and audio), and each elementary stream is identified by a unique number, the PID, which is present in the header of each transport packet. To find the PID of the video stream I suggest using the tsinfo utility from the open-source tstools library. The official site of tstools […]
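As a rough illustration of where the PID sits in the packet header, the following Python sketch reads the 13-bit PID from each transport packet and counts packets per PID. It assumes plain 188-byte packets aligned at the start of the buffer (no 192- or 204-byte variants, no resynchronization); the function names are hypothetical and are not part of tstools.

```python
TS_PACKET_SIZE = 188   # assumption: plain 188-byte transport packets
SYNC_BYTE = 0x47       # first byte of every transport packet


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID stored in bytes 1-2 of the packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    # Low 5 bits of byte 1 are the high PID bits; byte 2 holds the low 8 bits.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def count_packets_per_pid(data: bytes) -> dict:
    """Count transport packets per PID in a packet-aligned buffer."""
    counts = {}
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pid = packet_pid(data[off:off + TS_PACKET_SIZE])
        counts[pid] = counts.get(pid, 0) + 1
    return counts
```

Counting packets is not the same as counting frames; the sketch only shows how packets of one elementary stream can be singled out by PID before any further parsing.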
Why Is Scalability Not Widely Used?
Both AVC/H.264 and HEVC/H.265 support quality (SNR) and spatial scalability. Quality (SNR) layering means that an encoder employs coarse quantization (high QPs) for the base layer and successively finer quantization (smaller QPs) for the enhancement layers, so the quality improves successively from lower layers to higher ones. The Quality […]
Edge Activity or Percentage of Edge Pixels
Spatial activity, or edge activity, of a frame or video sequence is an important parameter for the encoder’s settings (e.g. for determining the frame quantization level). On the one hand, high edge activity in a scene means that the scene is difficult to encode (more bits are required for edge retention); on the other hand, dense edge patterns mask […]
Flattening Region of the Quality-Bitrate Curve
Schematically, the Quality-Rate curve can be illustrated by the following graph: It’s obvious that the Quality-Rate curve depends on the video content (e.g. significant motion), the video resolution and your encoder’s settings. However, it’s important to detect whether your encoder, at a predefined bitrate, places your video in the Flat Region or not. You need to find a bitrate threshold, […]
Outline of Multiplexor
Basic Assumptions: video fps is constant; the video stream carries PCRs; PES = frame; an audio frame contains 1024 samples (e.g. AAC). The DTS of the k-th video frame is specified as VDTS(k) = VDTS(0) + k / fps (1), where VDTS(0) is the initial DTS of the very first video frame (usually DTS(0)=PTS(0)), ‘fps‘ is the frame […]
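Formula (1) can be sketched in a few lines of Python. The sketch assumes that time stamps are expressed in 90 kHz ticks and wrap at 33 bits, as PTS/DTS fields do in MPEG-2 Systems; the function and parameter names are hypothetical, and the frame rate is best passed as an exact fraction (e.g. 30000/1001) so that rounding error does not accumulate.

```python
from fractions import Fraction

CLOCK_HZ = 90_000        # assumption: DTS measured in 90 kHz ticks
DTS_MODULO = 1 << 33     # PTS/DTS fields are 33 bits wide


def video_dts_ticks(k, fps, first_dts_ticks=0):
    """VDTS(k) = VDTS(0) + k / fps, returned in 90 kHz ticks."""
    # Exact rational arithmetic; truncate to whole ticks only at the end.
    ticks = first_dts_ticks + Fraction(k) * CLOCK_HZ / Fraction(fps)
    return int(ticks) % DTS_MODULO


# Example: the first three frames of a 25 fps stream starting at tick 1000.
first_frames = [video_dts_ticks(k, 25, 1000) for k in range(3)]
```

Computing each DTS from k directly, rather than adding a rounded per-frame increment, keeps fractional frame rates from drifting over long streams.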
How to Generate DTS/PTS from a Video Elementary Stream?
An elementary AVC/H.264 stream does not contain timing information (except in rare cases when presentation times are signaled in picture-timing SEI messages). Therefore, in order to encapsulate an elementary stream into an MP4 container or an MPEG-2 Systems program, it’s required to compute DTS/PTS (according to the predefined frame rate, fps) such that the following conditions are met: 1) DTS<=PTS […]
Outline of Exponential Frame-Level Adaptive Rate Control
I briefly outline an exponential frame-level Rate Control model, where the QP (quantization parameter) is constant (or near-constant) within a single frame. Notice that MB-level rate control can achieve a more accurate coded bitrate than frame-level control because the QP may vary between MBs. Exponential Adaptive Rate Control is based on the exponential R-Q model: […]