Projects
2026
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Yesheng Liang, Haisheng Chen, Zihan Zhang, Song Han, Zhijian Liu
DFlash: Block Diffusion for Flash Speculative Decoding
Jian Chen, Yesheng Liang, Zhijian Liu
2025
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference
Jiaming Tang*, Yufei Sun*, Yilong Zhao, Shang Yang, Yujun Lin, Zhuoyang Zhang, James Hou, Yao Lu, Zhijian Liu, Song Han
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference
Samir Khaki, Junxian Guo, Jiaming Tang, Shang Yang, Yukang Chen, Konstantinos N. Plataniotis, Yao Lu, Song Han, Zhijian Liu
ICCV 2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Samir Khaki*, Xiuyu Li*, Junxian Guo*, Ligeng Zhu, Chenfeng Xu, Konstantinos N. Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu
NVILA: Efficient Frontier Visual Language Models
Zhijian Liu*, Ligeng Zhu*, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu Yin†, Song Han†, Yao Lu†
Z Lab