I am wondering whether you have run any further experiments with vector quantization. DCAE-f128 compresses a 256x256 image into a 2x2 feature map, which would be only 4 tokens with VQ. Such short token sequences could significantly speed up LLM training and inference, and might pave the way for real-time video generation.
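For reference, here is how I arrive at the 4-token figure. This is only a minimal sketch: it assumes that "f128" means a 128x spatial downsampling factor per side and that VQ assigns exactly one codebook index to each spatial position of the latent map. The helper name `latent_tokens` is hypothetical and not part of the repository, and the f8 value is included only as an illustrative comparison.

```python
def latent_tokens(image_size: int, downsample: int) -> int:
    """Count latent positions (one VQ token each, by assumption) for a square image."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by the downsampling factor")
    side = image_size // downsample  # side length of the latent feature map
    return side * side


# Assumed f128 setting: 256 / 128 = 2, so a 2x2 map.
print(latent_tokens(256, 128))  # 4

# Illustrative comparison with an 8x downsampling factor: 32x32 map.
print(latent_tokens(256, 8))  # 1024
```

Under these assumptions the sequence length per image drops by a factor of 256 relative to an f8 tokenizer, which is where the expected speedup would come from.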