My name is Xiaoyi Han.
My research lies at the intersection of Computer Vision and Natural Language Processing, with a particular focus on Multi-Modal Learning (Scene Graph Generation). My interests are broad, ranging from fundamental Computer Vision tasks, such as Object Detection and Image Segmentation, to frontier problems in Cross-Modal understanding and generation, such as Image Captioning and Video Question Answering.