[1] Kim, T., Song, G., Lee, S., Kim, S., Seo, Y., Lee, S., ... & Bae, K. (2021). L-Verse: Bidirectional Generation Between Image and Text. arXiv preprint arXiv:2111.11133.
[2] Van Den Oord, A., & Vinyals, O. (2017). Neural discrete representation learning. Advances in neural information processing systems, 30.
[3] Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.
[4] Van Den Oord, A., & Vinyals, O. (2017). Neural discrete representation learning. Advances in neural information processing systems, 30.
[5] Kim, T., Song, G., Lee, S., Kim, S., Seo, Y., Lee, S., ... & Bae, K. (2021). L-Verse: Bidirectional Generation Between Image and Text. arXiv preprint arXiv:2111.11133.