References

[1] Rajpurkar, Pranav, et al. "SQuAD: 100,000+ Questions for Machine Comprehension of Text." Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016.

[2] Kwiatkowski, Tom, et al. "Natural Questions: a Benchmark for Question Answering Research." Transactions of the Association for Computational Linguistics 7 (2019): 452-466.

[3] Xu, Fangyuan, Junyi Jessy Li, and Eunsol Choi. "How Do We Answer Complex Questions: Discourse Structure of Long-form Answers." Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

[4] Raffel, Colin, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer." Journal of Machine Learning Research 21 (2020): 1-67.

[5] Sanh, Victor, et al. "Multitask Prompted Training Enables Zero-Shot Task Generalization." International Conference on Learning Representations. 2022.

[6] Taylor, Ross, et al. "Galactica: A Large Language Model for Science." arXiv preprint arXiv:2211.09085 (2022).

[7] Chung, Hyung Won, et al. "Scaling Instruction-Finetuned Language Models." arXiv preprint arXiv:2210.11416 (2022).

[8] Lee, Yoonjoo, et al. "QASA: Advanced Question Answering on Scientific Articles." Proceedings of the 40th International Conference on Machine Learning, PMLR 202:19036-19052. 2023.