| 2026 |
-
Shengqiong Wu, Bobo Li, Xinkai Wang, Xiangtai Li, Lei Cui, Furu Wei, Shuicheng Yan, Hao Fei, Tat-Seng Chua.  Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking. ICLR. 2026.  [pdf]
-
Jundong Xu, Hao Fei, Huichi Zhou, Xin Quan, Qijun Huang, Shengqiong Wu, William Yang Wang, Mong-Li Lee, Wynne Hsu.  LogicReward: Incentivizing LLM Reasoning via Step-Wise Logical Supervision. ICLR. 2026.  [Project][pdf]
-
Kai Liu, Wei Li, Lai Chen, Shengqiong Wu, Yanhao Zheng, Jiayi Ji, Fan Zhou, Jiebo Luo, Ziwei Liu, Hao Fei, Tat-Seng Chua.  JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization. ICLR. 2026.  [Project][pdf]
-
Kai Liu, Yanhao Zheng, Kai Wang, Shengqiong Wu, Rongjunchen Zhang, Jiebo Luo, Dimitrios Hatzinakos, Ziwei Liu, Hao Fei, Tat-Seng Chua.  JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation. ICLR. 2026.  [Project][pdf]
|
| 2025 |
-
Shengqiong Wu, Weicai Ye, Yuanxing Zhang, Jiahao Wang, Quande Liu, Xintao Wang, Pengfei Wan, Kun Gai, Hao Fei, Tat-Seng Chua.  A Reason-then-Describe Instruction Interpreter for Controllable Video Generation. arXiv. 2025.  [Project][pdf]
-
Zhengyang Liang, Daoan Zhang, Huichi Zhou, Rui Huang, Bobo Li, Yuechen Zhang, Shengqiong Wu, Xiaohan Wang, Jiebo Luo, Lizi Liao, Hao Fei.  UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist. arXiv. 2025.  [Project][pdf]
-
Shengqiong Wu, Weicai Ye, Jiahao Wang, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Shuicheng Yan, Hao Fei, Tat-Seng Chua.  Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation. arXiv. 2025.  [Project][pdf]
-
Hao Fei, Yuan Zhou, Juncheng Li, Xiangtai Li, Qingshan Xu, Bobo Li, Shengqiong Wu, Yaoting Wang, Junbao Zhou, et al.  On Path to Multimodal Generalist: General-Level and General-Bench. ICML. 2025.  [Project][pdf][Huggingface]
-
Yaoting Wang, Shengqiong Wu, Yuechen Zhang, William Wang, Ziwei Liu, Jiebo Luo, Hao Fei.  Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey. arXiv. 2025.  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Tat-Seng Chua, Shuicheng Yan.  Universal Scene Graph Generation. CVPR. 2025.  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Jingkang Yang, Xiangtai Li, Juncheng Li, Hanwang Zhang, Tat-Seng Chua.  Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene. CVPR. 2025.  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Xiangtai Li, Jiayi Ji, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan.  Towards Semantic Equivalence of Tokenization in Multimodal LLM. ICLR. 2025.  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Liangming Pan, William Yang Wang, Shuicheng Yan, Tat-Seng Chua.  Combating Multimodal LLM Hallucination via Bottom-up Holistic Reasoning. In Proceedings of AAAI. 2025.   [pdf]
|
| 2024 |
-
Hao Fei, Shengqiong Wu, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan.  VITRON: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing. In Proceedings of NeurIPS. 2024.  [Code][pdf]
-
Meng Luo, Hao Fei*, Bobo Li, Shengqiong Wu, Qian Liu, Soujanya Poria, Erik Cambria, Mong-Li Lee, Wynne Hsu.  PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis. In Proceedings of ACM MM. 2024.   (Oral) [Code][pdf]
-
Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua.  NExT-GPT: Any-to-Any Multimodal Large Language Model. In Proceedings of ICML. 2024.   (Oral) [Code | 3.6k stars][pdf]
-
Hao Fei, Shengqiong Wu, Wei Ji, Hanwang Zhang, Meishan Zhang, Mong-Li Lee, Wynne Hsu.  Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition. In Proceedings of ICML. 2024.   (Oral) [Code][pdf]
-
Hao Fei, Shengqiong Wu, Wei Ji, Hanwang Zhang, Tat-Seng Chua.  Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs. In Proceedings of CVPR. 2024.  [Code][pdf]
|
| 2023 |
-
Shengqiong Wu, Hao Fei, Hanwang Zhang, Tat-Seng Chua.  Imagine That! Abstract-to-Intricate Text-to-Image Synthesis with Scene Graph Hallucination Diffusion. In Proceedings of NeurIPS. 2023.  (long, poster)  [Code][pdf]
-
Leigang Qu*, Shengqiong Wu*, Hao Fei, Liqiang Nie, Tat-Seng Chua.  LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation. In Proceedings of ACM MM. 2023.  (*: equal contribution, long)  [Code][pdf]
-
Bobo Li, Hao Fei, Yuhan Wu, Jinsong Zhang, Shengqiong Wu, Jingye Li, Yijiang Liu, Lizi Liao, Tat-Seng Chua, Fei Li, Donghong Ji.  DiaASQ: A benchmark of conversational aspect-based sentiment quadruple analysis. In Proceedings of ACL. 2023.  (long, poster)  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Yixin Cao, Lidong Bing, Tat-Seng Chua.  Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling. In Proceedings of ACL. 2023.  (long, poster, paper award nomination, 1.6%)  [Code][pdf]
-
Shengqiong Wu, Hao Fei, Wei Ji, Tat-Seng Chua.  Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment. In Proceedings of ACL. 2023.  (long, oral)  [pdf]
|
| 2022 |
-
Hao Fei, Shengqiong Wu, Jingye Li, Bobo Li, Fei Li, Libo Qin, Meishan Zhang, Min Zhang, Tat-Seng Chua.  LasUIE: Unifying information extraction with latent adaptive structure-aware generative language model. In Proceedings of NeurIPS. 2022.  (long, poster) [Code][pdf]
-
Hu Cao, Jingye Li, Fangfang Su, Fei Li, Hao Fei, Shengqiong Wu, Bobo Li, Liang Zhao, Donghong Ji.  OneEE: A One-Stage Framework for Fast Overlapping and Nested Event Extraction. In Proceedings of COLING. 2022.  (long, oral) [Code][pdf]
-
Shengqiong Wu, Hao Fei, Fei Li, Meishan Zhang, Yijiang Liu, Chong Teng, Donghong Ji.  Mastering the Explicit Opinion-Role Interaction: Syntax-Aided Neural Transition System for Unified Opinion Role Labeling. In Proceedings of AAAI. 2022.  (long, online) [Code][pdf]
-
Jingye Li, Hao Fei, Jiang Liu, Shengqiong Wu, Meishan Zhang, Chong Teng, Donghong Ji, Fei Li.  Unified named entity recognition as word-word relation classification. In Proceedings of AAAI. 2022.  (long, online) [Code][pdf]
|
| 2021 |
-
Shengqiong Wu, Hao Fei, Yafeng Ren, Donghong Ji, Jingye Li.  Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Rich Syntactic Knowledge. In Proceedings of IJCAI. 2021.  (long, online) [Code][pdf]
|