A Survey of Multimodal LLMs (2021-2024)
This survey reviews multimodal large language models published between 2021 and 2024, covering encoder-only models, encoder-decoder architectures, decoder-only models, and specialized applications for documents and screens.