Shuang Chen
Undergraduate @ ZJU '26

About

Hi! I am a junior undergraduate student majoring in Computer Science and Technology at Chu Kochen Honors College, Zhejiang University. I am currently a research assistant at the State Key Laboratory of CAD&CG, Zhejiang, China, supervised by Xiaofei He and Boxi Wu.
My research interests lie in Web Agents, Reinforcement Learning, Video Generation, Vision-Language Learning, and Computer Vision.
Currently, I am focusing on self-training autonomous generalist virtual agents that understand open-ended natural-language and multimodal instructions, observe wild GUI environments with grounded knowledge, iteratively generate executable actions, and autonomously refine their capabilities through experiential learning to complete complex tasks.

Experiences

  • Jun 2024 -- Sep 2024, Ant Research, Alibaba, Hangzhou, China
         Research Intern, Worked on web agent research and the development of agent plug-ins.
  • Jul 2023 -- Dec 2023, Westlake University, Hangzhou, China
         Visiting Research Student, Worked on building a benchmark to robustly evaluate the reasoning capability of LLMs.
         Supervisor: Prof. Yue Zhang
  • Jan 2024 -- Mar 2024, Imperial College London, London, UK
         Visiting Student, Data Science Winter Session.
         Awards: Best CV Project & Best Individual Award
  • Mar 2023 -- Present, State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
         Research Assistant, Working on embodied agents and multimodal machine learning.
         Supervisor: Prof. Yueting Zhuang and Prof. Xiaofei He

News

  • One paper, VidSketch: Hand-drawn Sketch-Driven Video Generation, is now available on the website and arXiv.
  • One paper, HuViDPO: Enhancing Video Generation using DPO, is now available on the website and arXiv.
  • One paper accepted to ACM MM 2024.

Selected Publications and Manuscripts (Full List: here)

VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control
Under Review (ICML 2025)

Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales
Accepted by ACM MM 2024, Melbourne, Australia

HuViDPO: Enhancing Video Generation through Direct Preference Optimization for Human-Centric Alignment
Under Review (CVPR 2025)

Selected Honors & Awards
