Publications
2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Samir Khaki*, Xiuyu Li*, Junxian Guo*, Ligeng Zhu, Konstantinos N. Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu
ICML 2025
NVILA: Efficient Frontier Visual Language Models
Zhijian Liu*, Ligeng Zhu*, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu Yin†, Song Han†, Yao Lu†
LServe: Efficient Long-Sequence LLM Serving with Unified Sparse Attention
Shang Yang*, Junxian Guo*, Haotian Tang, Qinghao Hu, Guangxuan Xiao, Jiaming Tang, Yujun Lin, Zhijian Liu, Yao Lu, Song Han
2024
LongLoRA: Efficient Fine-Tuning of Long-Context Large Language Models
Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia