Understanding Motion Vectors #37

Open
chanwutk opened this issue Jun 29, 2024 · 1 comment

Comments

@chanwutk

Hello,

I am trying to understand the motion vectors returned from Method :: retrieve() described in the readme.md.

  • For 1, 2: I assume these are the w and h of the macroblock according to AVMotionVector. Would you mind confirming whether the size of the macroblock is for the current frame or the reference frame?
  • For 5, 6: I am confused about the x/y-center coordinate. Does this mean that the dst_x and dst_y represent the center of the current frame's macroblock instead of the top-left? And, do src_x and src_y represent the center of the reference frame's macroblock, as well?
  • For 7, 8: I am unsure how to use the equations for motion_x and motion_y. Are these equations for calculating each individual pixel within the macroblock?
  • For 9: Is this the scale of the current frame's macroblock relative to the reference frame's macroblock? For example, assume that w, h refer to the size of the current frame's macroblock and that dst_x, dst_y, src_x, and src_y refer to the top-left positions of their corresponding macroblocks. If motion_scale = 2, src_x, src_y = 1, 1, dst_x, dst_y = 2, 3, and w, h = 16, 16, is the macroblock x1, x2, y1, y2 = 2, (2+16), 3, (3+16) of the current frame then reconstructed from the macroblock x1, x2, y1, y2 = 1, (1+16/2), 1, (1+16/2) of the reference frame?

I appreciate any help, thank you!

@LukasBommes
Owner

LukasBommes commented Oct 31, 2024

Hello @chanwutk,
thanks for your questions. They are good ones, so it took me some research to figure out the answers. Here they are:

  • Correct, fields 1 and 2 are the width and height of the macroblock, respectively. For some reason, the description was deleted from the README when I rephrased the text a few years ago. I have added it back. In H.264, a macroblock or one of its partitions can be 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, or 4x4. Importantly, the size of the MB is the same in the current and reference frame because the encoder subtracts the predicted MB from the current MB and only considers the difference within that region for the next step (discrete cosine transform).

  • I looked at the motion vectors of one frame in the MPEG-4 Part 2 encoded video for simplicity because here the macroblocks are all 16 x 16. src_x ranges from 8 to 1272 and src_y from 8 to 712. Similarly, dst_x takes values between 8 and 1272 and dst_y between 8 and 712. Since the smallest value is 8, i.e., half the macroblock size, rather than 0, I deduce that the motion vector "base" and "tip" are centered in the macroblock and do not refer to the top-left corner. But beware that H.264 uses quarter-pixel motion estimation, i.e., the src_x and src_y coordinates are actually fractional coordinates that are calculated as src_x = dst_x + motion_x / motion_scale and src_y = dst_y + motion_y / motion_scale.

  • The previous answer contains the equations relating source position, destination position, motion, and motion scale. I took the equations from the documentation of the AVMotionVector struct, which is what the motion vector extractor extracts. I don't know why the AVMotionVector struct additionally stores src_x and src_y as integers, since this information is redundant and less accurate (integer resolution) than the representation via motion and motion scale (1/4 pixel resolution). You can also see that src_x and src_y are no longer used in the drawing function (since this PR was merged). Anyway, I noticed that the formulas for motion_x and motion_y in the README are incorrect and I will update them accordingly.

  • I hope the previous two answers clarify this question. If you are still unsure, I recommend section 3.3.16 "sub-pixel motion compensation" in this book.
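
The relations above can be sketched in a few lines of Python. This is only an illustration, not code from the motion vector extractor: the MotionVector class and both helper functions are hypothetical, the field names are borrowed from AVMotionVector, and the example values are made up. It also assumes, as deduced above, that dst_x/dst_y and the computed source position refer to block centers.

```python
from dataclasses import dataclass


@dataclass
class MotionVector:
    """Hypothetical container; field names borrowed from AVMotionVector."""
    w: int             # block width in pixels
    h: int             # block height in pixels
    dst_x: int         # block position in the current frame
    dst_y: int
    motion_x: int      # motion in units of 1 / motion_scale pixels
    motion_y: int
    motion_scale: int  # e.g. 4 for quarter-pixel resolution


def fractional_source(mv: MotionVector) -> tuple[float, float]:
    """Sub-pixel source position: src = dst + motion / motion_scale."""
    return (mv.dst_x + mv.motion_x / mv.motion_scale,
            mv.dst_y + mv.motion_y / mv.motion_scale)


def block_top_left(cx: float, cy: float, w: int, h: int) -> tuple[float, float]:
    """Top-left corner of a w x h block, ASSUMING (cx, cy) is its center."""
    return cx - w / 2, cy - h / 2


# Made-up example: a 16 x 16 block centered at (24, 24) in the current frame.
mv = MotionVector(w=16, h=16, dst_x=24, dst_y=24,
                  motion_x=-6, motion_y=3, motion_scale=4)

src = fractional_source(mv)                                  # (22.5, 24.75)
dst_corner = block_top_left(mv.dst_x, mv.dst_y, mv.w, mv.h)  # (16.0, 16.0)
src_corner = block_top_left(src[0], src[1], mv.w, mv.h)      # (14.5, 16.75)
```

Note that w and h are used unchanged for both the current and the reference block. In this sketch motion_scale only converts motion_x and motion_y into pixels and does not resize the block.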
