Kaiyang Ji

jiky2024 AT shanghaitech DOT edu DOT cn

2-505, VDI, SIST

393 Middle Huaxia Road, 201210

Shanghai, China

About Me

I am Kaiyang Ji, a second-year master's student at the Visual & Data Intelligence Center (VDI) at ShanghaiTech University, advised by Prof. Jingya Wang and Prof. Ye Shi. Previously, I graduated from ShanghaiTech University with a major in computer science, advised by Prof. Jingya Wang and Prof. Jingyi Yu.

Research Interest

My research interests lie broadly in computer vision, machine learning, and robotics. In particular, my current research focuses on Human-Centered 3D Vision, Generative Models, and Embodied AI.

I am looking for collaborators and friends. Feel free to contact me if you are interested in these fantastic topics!

Email / Google Scholar / GitHub

news

May 01, 2026 Our paper DiscoForcing has been accepted by ICML 2026! :sparkles:
Jan 27, 2026 Our paper VLM-RMD has been accepted by ICLR 2026! :sparkles:
Jul 29, 2025 We have organized the ICCV 2025 Workshop Challenge “Human-Robot-Scene Interaction and Collaboration”! :raised_hands:
Jun 26, 2025 Our paper Human-X has been accepted by ICCV 2025 as a Highlight! :tada::tada::tada:
Sep 01, 2024 I joined VDI in Fall 2024 as a CS master's student!
Feb 27, 2024 Our paper S2Fusion has been accepted by CVPR 2024! :sparkles:

selected publications

  1. ICML 2026
    DiscoForcing: A Unified Framework for Real-Time Audio-Driven Character Control with Diffusion Forcing
    Kaiyang Ji*, Bingsheng Qian*, Binghuan Wu, and 3 more authors
    In Forty-third International Conference on Machine Learning (ICML), 2026
  2. ICCV 2025 Highlight
    Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
    Kaiyang Ji, Ye Shi, Zichen Jin, and 5 more authors
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  3. ICLR 2026
    Human-Object Interaction via Automatically Designed VLM-Guided Motion Policy
    Zekai Deng, Ye Shi, Kaiyang Ji, and 3 more authors
    In The Fourteenth International Conference on Learning Representations (ICLR), 2026
  4. CVPR 2024
    A Unified Diffusion Framework for Scene-Aware Human Motion Estimation from Sparse Signals
    Jiangnan Tang, Jingya Wang, Kaiyang Ji, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024