/baofff/ All are Worth Words: A ViT Backbone for Diffusion Models

Code Link
https://github.com/baofff/U-ViT
Description
In particular, a latent diffusion model with a small U-ViT achieves a record-breaking FID of 5.48 in text-to-image generation on MS-COCO, among methods that do not access large external datasets during the training of generative models. Code: https://github.com/baofff/U-ViT
Retrieved
2022/11/22
Stars
48