Can pyramid-flow utilize kv-cache to reduce computation? #227
It seems that the history condition latents are recomputed for every frame prediction, but I think a KV cache could be used to eliminate this redundant computation. Is that true?

Comments
Yes, but due to the complex temporal compression design, I'm afraid a KV cache wouldn't save much computation here.
Thanks. So the model will run slower and slower for later frames, since more and more condition frames are used. Is that true?
Yes, unless you apply some method to truncate the history condition, such as a sliding window.
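A minimal sketch of the sliding-window idea mentioned above, assuming the history is kept as a list of per-frame latent tensors; the names `history_latents`, `window_size`, and `model.predict_next_frame` are illustrative placeholders, not Pyramid-Flow's actual API:

```python
import torch

def truncate_history(history_latents: list[torch.Tensor], window_size: int) -> list[torch.Tensor]:
    """Keep only the most recent `window_size` frames of history latents.

    Bounding the number of condition tokens per step keeps the per-frame
    cost roughly constant instead of growing with video length.
    """
    return history_latents[-window_size:]

# Illustrative use inside an autoregressive generation loop:
# history = []
# for step in range(num_frames):
#     cond = truncate_history(history, window_size=4)
#     new_latent = model.predict_next_frame(cond, ...)  # hypothetical call
#     history.append(new_latent)
```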
Could you please explain in detail why a KV cache wouldn't save much computation? I am confused about that.
If you compress the history context aggressively, then most of the compute is spent on self-attention among the new frame tokens rather than on cross-attention between the new tokens and the history tokens. A KV cache can only reduce the latter part of the compute.
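To make this concrete, here is a back-of-the-envelope sketch of how the per-layer attention cost splits between the self part and the cross part that a KV cache could help with. The token counts are made-up assumptions for illustration, not Pyramid-Flow's actual dimensions:

```python
def attention_cost(n_new: int, n_hist: int) -> dict:
    """Rough per-layer attention score cost (constant factors ignored).

    New-frame queries attend to themselves (self part) and to the
    compressed history tokens (cross part). A KV cache avoids
    re-encoding the history, so it can only shrink the cross part.
    """
    self_part = n_new * n_new      # new tokens attending to new tokens
    cross_part = n_new * n_hist    # new tokens attending to history tokens
    return {
        "self": self_part,
        "cross": cross_part,
        "cross_fraction": cross_part / (self_part + cross_part),
    }

# Example: 4096 tokens for the new frame vs. 1024 tokens of aggressively
# compressed history.
print(attention_cost(4096, 1024))
# cross attention is only ~20% of the attention cost here, which bounds
# how much a KV cache could save.
```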