📝 Publications

TPAMI 2025 Fine-Grained Visual Text Prompting
Lingfeng Yang, Xiang Li, Yueze Wang, Xinlong Wang, Jian Yang

Paper | Code

  • Proposes fine-grained multimodal prompting that strengthens the localization and grounding capability of large multimodal models, improving referring expression comprehension performance.
  • Our work has been adopted by the research group of Prof. Philip H. S. Torr (University of Oxford, Marr Prize laureate), which uses the proposed Fine-Grained Visual Text Prompting (FGVTP) as the core target extractor in its weakly supervised referring segmentation framework.
  • Our work has inspired follow-up studies and has been applied in several domains, including egocentric action recognition and compositional action recognition for embodied-intelligence perception.

NeurIPS 2023 Fine-Grained Visual Prompting
Lingfeng Yang, Yueze Wang, Xiang Li, Xinlong Wang, Jian Yang

Paper | Code | Chinese write-up | Chinese video

  • Proposes a visual prompting technique that improves referring expression comprehension by blurring the background around finely segmented regions of interest so that those regions stand out.
  • Achieves an improvement of more than 5 points over state-of-the-art methods while keeping inference fast.
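
The background-blurring idea in the first bullet can be illustrated with a small sketch. This is not the authors' implementation: it assumes a grayscale image stored as nested lists, a binary mask that would in practice come from a segmentation model, and a simple mean filter standing in for whatever blur the paper uses. The names `box_blur` and `blur_background` are hypothetical.

```python
def box_blur(image, radius=1):
    """Blur a grayscale image (list of rows of floats) with a mean filter.

    Assumption: a (2*radius+1) x (2*radius+1) mean filter is used here only
    as a stand-in for the blur applied in the actual method.
    """
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Average over the window, clipped at the image borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            blurred[y][x] = sum(window) / len(window)
    return blurred


def blur_background(image, mask, radius=1):
    """Keep pixels where mask is 1; replace all others with blurred values."""
    blurred = box_blur(image, radius)
    return [
        [image[y][x] if mask[y][x] else blurred[y][x]
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


if __name__ == "__main__":
    # Toy 4x4 image with a bright 2x2 object; the mask marks the object.
    image = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 9.0, 9.0, 0.0],
        [0.0, 9.0, 9.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    for row in blur_background(image, mask):
        print([round(v, 2) for v in row])
```

The masked object keeps its original pixel values while everything outside the mask is smoothed; the resulting image is what would be passed to the vision-language model as the visual prompt.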

NeurIPS 2022 RecursiveMix: Mixed Learning with History (Spotlight, Top 12.8%)
Lingfeng Yang, Xiang Li, Borui Zhao, Renjie Song, Jian Yang

Paper | Code

  • Proposes a simple yet effective mixed-data augmentation technique for image classification.
  • Improves the performance of the resulting pretrained models on object detection and semantic segmentation tasks.

CVPR 2022 Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information (Oral, Top 3.3%)
Lingfeng Yang, Xiang Li, Renjie Song, Borui Zhao, Juntian Tao, Shihao Zhou, Jiajun Liang, Jian Yang

Paper | Code

  • Proposes a dynamic MLP fusion framework that incorporates geographical and temporal information into fine-grained image classification.
  • Improves classification accuracy on multiple fine-grained datasets.