Thanks for your excellent work. In your `models/latte.py`, there is the following code:
```python
def unpatchify(self, x):
    """
    x: (N, T, patch_size**2 * C)
    imgs: (N, H, W, C)
    """
    c = self.out_channels
    p = self.x_embedder.patch_size[0]
    h = w = int(x.shape[1] ** 0.5)
    assert h * w == x.shape[1]

    x = x.reshape(shape=(x.shape[0], h, w, p, p, c))
    x = torch.einsum('nhwpqc->nchpwq', x)
    imgs = x.reshape(shape=(x.shape[0], c, h * p, h * p))
    return imgs
```
For the input `x`, why can't it be reshaped to `(x.shape[0], h, w, c, p, p)`? Is there any particular reason for this specific reshaping order?
The order has to match the way the channel (C) and spatial (p, p) dimensions were laid out by the original patchify process. If you change the reshape order without keeping everything else consistent, the model may no longer generate valid videos; you can simply try it and see.
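To make the layout issue concrete, here is a small self-contained check with a toy patchify that flattens each patch in (p, q, c) order, which is the layout the `unpatchify` above assumes. The shapes are made up for the example; in the real model the tokens come from the final linear layer, so the layout is whatever convention the network was trained with:

```python
import torch

# Toy sizes for illustration only (not taken from Latte's configs).
N, C, H, W, p = 2, 4, 16, 16, 2
h, w = H // p, W // p

imgs = torch.randn(N, C, H, W)

# Toy patchify: split the image into non-overlapping p x p patches and
# flatten each patch in (p, q, c) order, mirroring what unpatchify expects.
x = imgs.reshape(N, C, h, p, w, p)
x = torch.einsum('nchpwq->nhwpqc', x)
tokens = x.reshape(N, h * w, p * p * C)

# Original unpatchify order: reshape to (..., p, p, c) matches the memory
# layout written above, so the image is recovered exactly.
y = tokens.reshape(N, h, w, p, p, C)
y = torch.einsum('nhwpqc->nchpwq', y)
rec = y.reshape(N, C, h * p, w * p)
print(torch.allclose(rec, imgs))  # True

# Reshaping to (..., c, p, p) reinterprets the same flat vector in a
# different order, so even with an einsum adjusted to that labelling the
# values end up in the wrong positions.
z = tokens.reshape(N, h, w, C, p, p)
z = torch.einsum('nhwcpq->nchpwq', z)
wrong = z.reshape(N, C, h * p, w * p)
print(torch.allclose(wrong, imgs))  # False
```

In other words, `(h, w, p, p, c)` and `(h, w, c, p, p)` read the same flat per-token vector in different orders. Either convention could work in principle, but only if patchify, the final projection, and unpatchify all agree, and a pretrained checkpoint already fixes that convention.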
I noticed that your patchify operation uses a patch embedding similar to that of Vision Transformers (ViT), with the specific structure as follows:
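The exact snippet from `latte.py` is not reproduced here, but a ViT-style patch embedding usually follows the timm `PatchEmbed` pattern; below is a minimal sketch for reference, with layer and size choices that are illustrative rather than copied from `latte.py`:

```python
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Minimal ViT-style patch embedding sketch: a Conv2d with
    kernel_size == stride == patch_size, followed by flattening the
    spatial grid into a sequence of tokens. Sizes are illustrative."""
    def __init__(self, img_size=32, patch_size=2, in_chans=4, embed_dim=1152):
        super().__init__()
        self.patch_size = (patch_size, patch_size)
        self.num_patches = (img_size // patch_size) ** 2
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                 # x: (N, C, H, W)
        x = self.proj(x)                  # (N, embed_dim, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)  # (N, num_patches, embed_dim)
        return x
```

Because the kernel size equals the stride, each output position of the Conv2d corresponds to exactly one non-overlapping patch, which is what lets the later `unpatchify` be a pure reshape-and-permute.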