I’m Dr. Krishna Rao Vijayanagar, and I have worked on Video Compression (AVC, HEVC, MultiView Plus Depth), ABR streaming, and Video Analytics (QoE, …).
The concept of I-frames, P-frames, and B-frames is fundamental to the field of video compression. These three frame types are used in specific situations to improve the codec’s compression efficiency, the compressed stream’s video quality, and the resilience of the stream to transmission and storage errors & failures.
In this tutorial, we look at how I-frames, P-frames, and B-frames work and what they are used for. If you are into video compression, do read our
tutorial on the discrete cosine transform,
why video compression is important,
and a layman’s explanation of what a video codec is and how it’s created.
Okay, with that, let’s get started with a couple of fundamental aspects of modern-day video compression – Intra and Inter prediction.
Table of Contents
Inter and Intra Prediction
What is an I-frame?
What is a P-frame?
What is a B-frame?
Reference B-frame and Non-Reference B-frames
Use of I, P, and B-frames in Video Compression & Streaming
Where do you use I-frames?
Refreshing Video Quality
Recovery from Bitstream Errors
Trick Modes (Seeking Forward and Back)
Where do you use P and B frames?
Conclusion
I won’t do a deep dive into Intra and Inter prediction in this article, but I’ll give you an idea of why these techniques exist and what they are meant for.
Take, for example, the image below. It shows two video frames (adjacent to each other) with a rectangular block of black pixels. In frame 1, the block is on the left-hand side of the image, and in the second frame, it has moved to the right.
If I want to compress Frame #2 using a modern video codec like H.264 or HEVC, I would do something as follows –
Break the video into blocks of pixels (macroblocks) and compress them one at a time.
To compress each macroblock, the first step is to find a similar macroblock by searching in the current frame or in previously encoded (past or future) frames.
The best-match macroblock’s location is recorded (which frame and its position in that frame). Then, the two macroblocks’ difference is compressed and sent to the decoder along with the location information.
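The three steps above can be sketched as a toy block-matching search. The function names and the SAD (Sum of Absolute Differences) cost metric below are illustrative assumptions, not any codec’s actual API – real encoders restrict the search to a small window around the block’s position and use far more efficient search patterns.

```python
import numpy as np

def sad(a, b):
    # Sum of Absolute Differences: a common block-matching cost metric
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def find_best_match(block, ref_frame, block_size=4):
    """Exhaustively search ref_frame for the candidate block with the
    lowest SAD. A full-frame search is shown only for clarity."""
    h, w = ref_frame.shape
    best_cost, best_pos = None, None
    for y in range(0, h - block_size + 1):
        for x in range(0, w - block_size + 1):
            cand = ref_frame[y:y + block_size, x:x + block_size]
            cost = sad(block, cand)
            if best_cost is None or cost < best_cost:
                best_cost, best_pos = cost, (y, x)
    return best_pos, best_cost

# Toy frames: a 4x4 black block moves from the left to the right.
frame1 = np.full((8, 12), 255, dtype=np.uint8)
frame1[2:6, 0:4] = 0                       # block on the left in frame 1
frame2 = np.full((8, 12), 255, dtype=np.uint8)
frame2[2:6, 8:12] = 0                      # block on the right in frame 2

block = frame2[2:6, 8:12]                  # macroblock to compress
pos, cost = find_best_match(block, frame1)
mv = (pos[0] - 2, pos[1] - 8)              # motion vector relative to the block
residual = block.astype(int) - frame1[pos[0]:pos[0]+4, pos[1]:pos[1]+4]
print(pos, cost, mv, int(residual.sum()))  # → (2, 0) 0 (0, -8) 0
```

Because the match is perfect, the residual is all zeros and compresses to almost nothing; only the motion vector and the reference-frame index need to be signaled.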
With me so far? Good!
Take a look at the image below. If I want to compress the macroblock in Frame #2 (that I’ve marked with a red square), what do you think is the best option? Or how should it be done?
First, I can look in frame #1 and find the matching block. It appears to have moved by roughly the frame’s width (a little less, I know) while staying at approximately the same height. Good, we have the motion vector now.
Second, I can search within the same frame and quickly realize that the block above the one marked in red is IDENTICAL to it. So, I can tell the decoder to copy that one instead of hunting in another frame. The displacement (if any) is also minimal.
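The “copy from the same frame” option can be sketched in the same toy style. Note this is a simplification for illustration: H.264/HEVC intra prediction actually extrapolates from neighboring pixel rows and columns rather than copying whole blocks (though a true intra block copy mode does exist in HEVC’s screen-content extensions).

```python
import numpy as np

# Toy frame #2: two identical 4x4 black blocks stacked vertically on a
# white background (the red-marked block sits below its twin).
frame2 = np.full((12, 12), 255, dtype=np.uint8)
frame2[0:4, 4:8] = 0    # the block above (already decoded)
frame2[4:8, 4:8] = 0    # the block we want to compress

target = frame2[4:8, 4:8]
above = frame2[0:4, 4:8]  # candidate: copy from 4 rows up, same frame

# SAD cost of copying the block above. The decoder may only copy from
# pixels it has already reconstructed (the "causal" area of the frame).
cost = int(np.abs(target.astype(int) - above.astype(int)).sum())
print(cost)  # → 0: identical blocks, so the residual vanishes
```

Since the cost is zero, the encoder can signal “copy from 4 rows up in this frame” and skip motion search in other frames entirely.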
Now take a look at the next example. We want to compress the macroblock containing the blue sphere in frame #2. How should we go about doing this? Search within the same frame or search in previously encoded frames?
First, I can look in frame #1 and find the matching sphere. It appears to have moved by roughly the frame’s width (a little less, I know) and up a little. This gives us the motion vector. The difference between the two blocks containing the spheres appears to be very small (guesstimate!).
Second, I can search within the same frame and realize no other block contains a sphere. So, bad luck searching for a match within the same frame!
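In the end, the encoder’s choice between the two options boils down to a cost comparison. The numbers below are made up purely to illustrate the sphere example, where intra search finds no good match and inter prediction wins.

```python
# Illustrative mode decision: pick whichever prediction source yields
# the cheaper residual. These costs are invented stand-ins for the
# SAD values an encoder would actually measure.
cost_inter = 12    # good match for the sphere found in frame #1
cost_intra = 4080  # no sphere anywhere in frame #2's causal area

mode = "inter" if cost_inter < cost_intra else "intra"
print(mode)  # → inter
```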
So, what did we learn from these toy examples?