张皓泉
Haoquan Zhang

Hi, I'm Haoquan Zhang, a final-year undergraduate majoring in Data Science at South China University of Technology (SCUT) and an incoming PhD student at The Chinese University of Hong Kong (CUHK). I am currently working with my advisor, Weiyang Liu, on Vision-Language Representation Learning. I am also collaborating with Kaipeng Zhang @ Shanghai AI Lab to address challenges in Generative Synthesis.

Access more info / Contact me through the following links:


News 📰

2024-10-31 💼 Glad to join Westlake University as a visiting student, advised by Yandong Wen.
2024-10-25 💼 Glad to receive an internship offer from Shanghai AI Lab! See you in Shanghai!
2024-06-26 🎓 PhD offer from CUHK!
2024-06-14 🚀 Mask4Align is now released!
2024-02-27 📄 My first paper was accepted to CVPR 2024! See you in Seattle!

Research 💡

Mask4Align
Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie, Huaidong Zhang

Pretrained VLMs excel at recognizing and localizing entities in VQA tasks. However, in visual scenes with multiple entities, textual descriptions struggle to distinguish entities of the same category. Consequently, existing VQA datasets cannot adequately cover scenarios involving multiple entities. We therefore introduce Mask for Align (Mask4Align), a method that determines the position of the entity in a given image that best matches the user's question. The method overlays colored masks on the image, enabling the VQA model to handle the discrimination and localization challenges that arise with multiple entities.

[Paper] [Submission Journey]

CVPR 2024
Mask4Align
Asymmetric Image Retrieval with Semi-Collaborative Distillation

Yi Xie*, Haoquan Zhang*, Xuandi Luo, Huaidong Zhang, Xuemiao Xu, Shengfeng He
* Co-first authorship

In asymmetric image retrieval systems, there is a significant capacity gap between the query network and the gallery network. The low-capacity query network struggles to store and understand knowledge from the high-capacity teacher network. We therefore introduce a simple yet effective semi-collaborative distillation (SCD) framework that additionally adjusts the gallery network, which has redundant capacity to carry specific knowledge from the query network. Specifically, as the query network converges, we incrementally unfreeze the gallery network to smoothly align its feature space with that of the query network.

[Paper] [Code]

Under Review

Projects 📂

BEM 2023 Contest
Design of Auxiliary Diagnosis Algorithm for Schizophrenia Based on Feature Fusion of EEG and ECG
[Entry (Chinese)]

Entry, 2023, National Biomedical Engineering Innovation Design Competition for College Students

Computed brain functional network features, heart rate variability features, and heart-brain coupling features to build machine learning models for automatic diagnosis; also built ResNet-based deep learning models on the raw EEG and ECG signals.

Second Prize (6%).

Perfect GunMayhem Remake: A 2D Shooting PVP Game Based on Cocos2d-x
[Github] · [Project Page] · [Original Game] · [Art Assets (.ai)]

Course design, 2022, Advanced Language Programming (C++)

GunMayhem Remake was completed independently by our team, including all source code, game artwork, and music assets. You can play the game using our executable.

Shoutout to Kevin Gu for creating this incredible game!

Final Score: 99, 4.0/4.0 (1%).

Awards 🏆

TaiHu Innovation Prize (1%)
Highest scholarship, awarded by the Wuxi government, 2024

Second Prize (6%), *Medical AI Track
The National BME Innovation Design Competition, China Society of Biomedical Engineering, 2023

Meritorious Winner (6%), *Prior to the release of ChatGPT 😏
The Interdisciplinary Contest in Modeling (ICM), COMAP, 2021


Experiences 🌍

Shanghai AI Lab
Research Intern
Collaborating with Dr. Kaipeng Zhang
Westlake University
Visiting Student
Advised by Yandong Wen

Presentations 🗣️

A Brief Intro to Visual Prompting
* With my friends and mentors @ Shanghai AI Lab

An introduction to Visual Prompting, a technique for enhancing the performance of pre-trained vision-language models. The talk draws on several research papers that explore different aspects of Visual Prompting.

[Slides (.pptx)] [Slides (.pdf)]


Do something insightful.

© 2024 Haoquan Zhang  
