Publications (*, †, and ‡ indicate equal contribution, corresponding author, and project leader, respectively.)
My current research focuses on Embodied Agents,
which sit at the intersection of Multimodal Large Language Models and Embodied AI.
I am particularly interested in high-level planning and low-level control with spatio-temporal intelligence,
working towards a generalist agent for complex real-world environments.
Representative works are highlighted.
|
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Enshen Zhou *,
Jingkun An *,
Cheng Chi *‡,
Yi Han,
Shanyu Rong,
Chi Zhang,
Pengwei Wang,
Zhongyuan Wang,
Tiejun Huang,
Lu Sheng†,
Shanghang Zhang†
Paper /
Project /
Code
TL;DR: From words to exactly where you mean using RoboRefer!
arXiv 2025
|
|
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An *,
Yinghao Zhu *,
Zongjian Li *,
Enshen Zhou,
Haoran Feng,
Xijie Huang,
Bohua Chen,
Yemin Shi,
Chengwei Pan†
Paper /
Project /
Code
TL;DR: Train T2I Diffusion model with AI-Generated Feedback for DPO!
AAAI 2025
|
|
Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models
Xijie Huang *,
Xinyuan Wang *,
Haotao Zhang *,
Yinghao Zhu *,
Jiawen Xi,
Jingkun An,
Hao Wang,
Hao Liang,
Chengwei Pan†
Paper /
Project /
Code
TL;DR: Medical MLLM is Vulnerable!
AAAI 2025
|
|
M3Fair: Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive Attribute Reweighting Method
Yinghao Zhu *,
Jingkun An *,
Enshen Zhou,
Hao Li,
Haoran Feng
Paper /
Project /
Code
TL;DR: BeFair (A Bias Detection and Mitigation Tool)!
2023 NIH Bias Detection Third Prize (Top 5)
|
Selected Awards and Honors
2024: Outstanding Graduate of Beihang University.
2023: Grand Prize (Top 1) in "Challenge Cup" Competition of Science Achievement in China.
|
|