Hi there! My name is Hang Wu (吴杭); you can also call me by my English name, Laurent.
I am currently a first-year PhD student at the University of California, Merced, conducting research under the guidance of Prof. Yiwei Wang, with Prof. Ming-Hsuan Yang as my senior advisor. Additionally, I work closely with Prof. Yujun Cai. My main research interests are in vision-language models and large multimodal models, with a focus on improving their performance and specific applications. I received my bachelor’s degree from Tongji University, where I worked on image processing tasks in the low-level vision field.
You can find my CV here: Hang Wu’s Curriculum Vitae. If you are interested in my work, please feel free to drop me an email.
🔥 News
- 2025.09: Three papers submitted to ICLR 2026.
- 2025.08: 🎉🎉 Our paper DiMo-GUI has been accepted to EMNLP 2025 Main Conference!
- 2025.08: Officially joined the UC Merced NLP Lab and started my PhD journey.
- 2025.05: Two papers submitted to EMNLP 2025.
- 2025.04: Thrilled to accept a PhD offer from UC Merced. Looking forward to working and living in CA!
- 2025.03: Joined vivo as a Research Intern!
- 2024.11: One paper submitted to CVPR 2025.
📝 Publications
RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
Hang Wu, Yujun Cai, Haonan Ge, Hongkai Chen, Ming-Hsuan Yang, Yiwei Wang$^{\dagger}$
arXiv 2025
- Benchmark refinement: Enforces consistent option granularity, unified evaluation dimensions, and mutual exclusivity for greater dataset reliability.
- Baseline analysis: Thoroughly evaluates ShotVL, revealing weaknesses in reasoning, prompt adherence, and output consistency.
- Evaluation expansion: Adds a protocol assessing both task-specific performance and core model competencies, enabling more balanced and robust comparisons.
FrameMind: Frame-Interleaved Video Reasoning via Reinforcement Learning
Haonan Ge, Yiwei Wang, Kai-Wei Chang, Hang Wu, Yujun Cai$^{\dagger}$
arXiv 2025
- We introduce FiCOT, a reasoning paradigm enabling dynamic visual evidence gathering during inference.
- We propose DRFS, a training methodology for learning adaptive sampling policies. We also develop DRFS-GRPO, an efficient reinforcement learning algorithm for training complex perception-reasoning policies from sparse rewards.
DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
Hang Wu, Hongkai Chen$^{\dagger}$, Yujun Cai, Chang Liu, Qingwen Ye, Ming-Hsuan Yang, Yiwei Wang
EMNLP 2025 Main Conference
- We propose DiMo-GUI, a training-free framework that can be seamlessly integrated as a plug-and-play component into any GUI agent.
- Without requiring additional training or external data, DiMo-GUI effectively enhances grounding performance across various GUI tasks.
Structured Attention Matters to Multimodal LLMs in Document Understanding
Chang Liu, Hongkai Chen$^{\dagger}$, Yujun Cai, Hang Wu, Qingwen Ye, Ming-Hsuan Yang, Yiwei Wang
arXiv 2025
- Our work investigates a fundamental yet overlooked aspect: how input format influences document comprehension performance.
- We propose a novel structure-preserving approach that encodes document elements using the LaTeX paradigm, maintaining the hierarchical organization and spatial relationships critical for comprehension.
🎓 Education
- 2025.08 - Present, PhD student, University of California, Merced.
- 2021.09 - 2025.06, Undergraduate student, Tongji University.
💻 Internships
- 2025.03 - 2025.07, Research Intern, vivo@Shenzhen, China.
- 2025.01 - 2025.07, Research Intern, UC Merced NLP Lab@University of California-Merced, Remote.
- 2023.09 - 2025.03, Research Intern, Ni’s Group@Tongji University, Shanghai, China.
📚 Projects

Research on Perception-oriented High Dynamic Range Imaging Systems
National-level Innovation Project
- We treat artifacts in HDR images as detectable entities, explicitly detecting and suppressing them to enhance HDR quality.
- National-level innovation project at Tongji University, with funding of 10,000 RMB.
📖 Services
Conference Reviewer
- The IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026
Teaching
- CSE-022: Introduction to Programming, Teaching Assistant (Fall 2025, UC Merced).
🎨 About Me
- I’m an ESFJ.
- I come from Maanshan, Anhui Province, China.
- I’m a huge sports fan and enjoy playing many kinds of sports in my spare time, including basketball🏀, soccer⚽️, water polo🤽‍♂️, swimming🏊, badminton🏸…
- I love pop music and R&B, with Ed Sheeran and The Weeknd as my favorite English artists, and David Tao and Khalil Fong as my favorite Chinese artists.