Computer Vision
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
Ziyi Lin*, Dongyang Liu*, Renrui Zhang*, Peng Gao*, Longtian Qiu*, Han Xiao, Han Qiu, Wenqi Shao, Keqin Chen, Jiaming Han, Siyuan Huang, Yichi Zhang, Xuming He, Yu Qiao, Hongsheng Li
European Conference on Computer Vision, 2024