References

[1] LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding, ACL, 2024.

[2] MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources, ACL-Findings, 2024.

[3] EXAONE 3.0 7.8B Instruction Tuned Language Model, arXiv, 2024.

[4] LLaMA: Open and Efficient Foundation Language Models, arXiv, 2023.

[5] Language Models are Few-Shot Learners, NeurIPS, 2020.

[6] FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference, ACL-Findings, 2023.

[7] An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks, EMNLP, 2022.

[8] Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering, EACL, 2023.