XIAO LING (シヨウ リン)

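Faculty of Information Science and Technology, Division of Computer Science and Information Technology, Mathematical Science, Associate Professor
Last Updated: 2025/06/07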

■Basic Researcher Information

Researchmap personal page

Researcher number

  • 40946787

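Research keywords

  • Large multimodal models
  • Multimodal processing
  • Recommender systems
  • Defect detection
  • Artificial intelligence
  • Computer vision

Research areas

  • Manufacturing technology (mechanical, electrical/electronic, and chemical engineering), Design engineering
  • Informatics, Intelligent informatics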

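■Career

Career history

  • 2025/04 - Present
    Hokkaido University, Faculty of Information Science and Technology, Division of Computer Science and Information Technology, Associate Professor
  • 2023/10 - 2025/03
    The University of Tokyo, Graduate School of Information Science and Technology, Department of Information and Communication Engineering, Project Assistant Professor, Japan
  • 2023/11 - 2024/03
    The University of Tokyo, Institute for AI and Beyond, Adjunct Project Assistant Professor, Japan
  • 2022/07 - 2023/10
    The University of Tokyo, Institute for AI and Beyond, Adjunct Project Researcher, Japan
  • 2021/06 - 2023/10
    The University of Tokyo, Graduate School of Information Science and Technology, Department of Information and Communication Engineering, Project Researcher, Japan
  • 2014/07 - 2015/08
    BYD, Design Department II, Mechanical Engineer, People's Republic of China

Education

  • 2015/09 - 2020/12, Huazhong University of Science and Technology, School of Mechanical Science and Engineering, Department of Mechatronics, Ph.D., People's Republic of China
  • 2018/10 - 2019/11, The University of Queensland, School of Mechanical and Mining Engineering, Visiting Scholar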

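■Research Activities

Awards

  • 2024/02, Technical Committee on Image Engineering (IE), IE Award
    Improving Adversarial Robustness in Continual Learning
    Koki Mukai (向井皇喜); Soichiro Kumano (熊野創一郎); Nicolas Michel; Ling Xiao; Toshihiko Yamasaki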

Papers

  • Multi-level knowledge distillation for fine-grained fashion image retrieval
    Ling Xiao, Toshihiko Yamasaki
    Knowledge-Based Systems, 310, 112955, 112955, Elsevier BV, 2025/02
    Research paper (scientific journal)
  • A Multimodal Dataset and Benchmark for Tourism Review Generation
    Hiromasa Yamanishi, Ling Xiao, Toshihiko Yamasaki
    Proceedings of the ACM International Conference on Recommender Systems Workshops, 2025/01/03
  • Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video
    Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki
    Proceedings of the 6th ACM International Conference on Multimedia in Asia, abs/2405.08890, 1, 1, ACM, 2024/12/03
    Research paper (international conference proceedings), Current video summarization methods primarily depend on supervised computer
    vision techniques, which demand time-consuming manual annotations. Further,
    the annotations are always subjective, which makes this task more challenging. To
    address these issues, we analyzed the feasibility of transforming video
    summarization into a text summarization task and leveraging Large Language Models
    (LLMs) to boost video summarization. This paper proposes a novel
    self-supervised framework for video summarization guided by LLMs. Our method
    begins by generating captions for video frames, which are then synthesized into
    text summaries by LLMs. Subsequently, we measure semantic distance between the
    frame captions and the text summary. Notably, we propose a novel
    loss function to optimize our model according to the diversity of the video.
    Finally, the summarized video can be generated by selecting the frames whose
    captions are similar to the text summary. Our model achieves competitive
    results against other state-of-the-art methods and paves a novel pathway in
    video summarization.
  • SCOMatch: Alleviating Overtrusting in Open-Set Semi-supervised Learning
    Zerun Wang, Liuyu Xiang, Lang Huang, Jiafeng Mao, Ling Xiao, Toshihiko Yamasaki
    Proceedings of the European Conference on Computer Vision, 217, 233, Springer Nature Switzerland, 2024/10/29
    Part of collection (book)
  • E-ReaRev: Adaptive Reasoning for Question Answering over Incomplete Knowledge Graphs by Edge and Meaning Extensions
    Xiaotong Ye, Ling Xiao, Chi Zhang, Toshihiko Yamasaki
    Proceedings of the International Conference on Applications of Natural Language to Information Systems, 85, 95, Springer Nature Switzerland, 2024/09/20
    Part of collection (book)
  • Boosting Fine-grained Fashion Retrieval with Relational Knowledge Distillation
    Ling Xiao, Toshihiko Yamasaki
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 8229, 8234, 2024
  • HetSpot: Analyzing Tourist Spot Popularity with Heterogeneous Graph Neural Network
    Hiromasa Yamanishi, Ling Xiao, Toshihiko Yamasaki
    Proceedings of the 6th International Conference on Image, Video and Signal Processing, 111, 120, 2024
    Research paper (international conference proceedings)
  • LiFSO-Net: A lightweight feature screening optimization network for complex-scale flat metal defect detection
    Hao Zhong, Ling Xiao, Haifeng Wang, Xin Zhang, Chenhui Wan, Youmin Hu, Bo Wu
    Knowledge-Based Systems, 304, 112520, 112520, 2024
    Research paper (scientific journal)
  • STFE-Net: A multi-stage approach to enhance statistical texture feature for defect detection on metal surfaces
    Hao Zhong, Daxing Fu, Ling Xiao, Fang Zhao, Jie Liu, Youmin Hu, Bo Wu
    Advanced Engineering Informatics, 61, 102437, 102437, 2024
    Research paper (scientific journal)
  • Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
    Ling Xiao, Toshihiko Yamasaki
    IEEE Access, 12, 48068, 48080, 2024
    Research paper (scientific journal), Fine-grained fashion retrieval searches for items that share a similar
    attribute with the query image. Most existing methods use a pre-trained feature
    extractor (e.g., ResNet 50) to capture image representations. However, a
    pre-trained feature backbone is typically trained for image classification and
    object detection, which are fundamentally different tasks from fine-grained
    fashion retrieval. Therefore, existing methods suffer from a feature gap
    problem when directly using the pre-trained backbone for fine-tuning. To solve
    this problem, we introduce an attribute-guided multi-level attention network
    (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to
    capture multi-level image embedding, thereby enriching the low-level features
    within these representations. Then, we propose a classification scheme where
    images with the same attribute, albeit with different values, are categorized
    into the same class. This can further alleviate the feature gap problem by
    perturbing object-centric feature learning. Moreover, we propose an improved
    attribute-guided attention module for extracting more accurate
    attribute-specific representations. Our model consistently outperforms existing
    attention based methods when assessed on the FashionAI (62.8788% in MAP),
    DeepFashion (8.9804% in MAP), and Zappos50k datasets (93.32% in Prediction
    accuracy). In particular, our model improves the most typical ASENet_V2 model by 2.12,
    0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets,
    respectively. The source code is available at
    https://github.com/Dr-LingXiao/AG-MAN.
  • Online Open-set Semi-supervised Object Detection via Semi-supervised Outlier Filtering
    Zerun Wang, Ling Xiao, Liuyu Xiang, Zhaotian Weng, Toshihiko Yamasaki
    CoRR, abs/2305.13802, 2023
    Research paper (scientific journal)
  • Learning Fashion Compatibility with Color Distortion Prediction
    Ling Xiao, Xiaofeng Zhang, Toshihiko Yamasaki
    MIPR, 81, 84, 2023
    Research paper (international conference proceedings)
  • Toward a More Robust Fine-Grained Fashion Retrieval
    Ling Xiao, Xiaofeng Zhang, Toshihiko Yamasaki
    MIPR, 1, 4, 2023
    Research paper (international conference proceedings)
  • Bridging the Capacity Gap for Online Knowledge Distillation
    Maorong Wang, Hao Yu, Ling Xiao, Toshihiko Yamasaki
    MIPR, 1, 4, 2023
    Research paper (international conference proceedings)
  • Missing Small Fastener Detection Using Deep Learning
    Ling Xiao, Bo Wu, Youmin Hu
    IEEE Transactions on Instrumentation and Measurement, 70, 1, 9, Institute of Electrical and Electronics Engineers (IEEE), 2021
    Research paper (scientific journal)
  • Surface Defect Detection Using Image Pyramid
    Ling Xiao, Bo Wu, Youmin Hu
    IEEE Sensors Journal, 20, 13, 7181, 7188, Institute of Electrical and Electronics Engineers (IEEE), 2020/07/01
    Research paper (scientific journal)
  • A Hierarchical Features-Based Model for Freight Train Defect Inspection
    Ling Xiao, Bo Wu, Youmin Hu, Jie Liu
    IEEE Sensors Journal, 1, 1, Institute of Electrical and Electronics Engineers (IEEE), 2020/03/01
    Research paper (scientific journal)
  • OSED: Object-specific edge detection
    Ling Xiao, Bo Wu, Youmin Hu
    Journal of Visual Communication and Image Representation, 72, 102918, 102918, 2020
    Research paper (scientific journal)
  • Detection of powder bed defects in selective laser sintering using convolutional neural network
    Ling Xiao, Mingyuan Lu, Han Huang
    The International Journal of Advanced Manufacturing Technology, 107, 2485, 2496, Springer, 2020
    Research paper (scientific journal)
  • Surface Defect Detection using Hierarchical Features
    Ling Xiao, Tao Huang, Bo Wu, Youmin Hu, Jiehan Zhou
    15th IEEE International Conference on Automation Science and Engineering (CASE), 1592, 1596, IEEE, 2019
    Research paper (international conference proceedings)

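Other Activities and Achievements

  • Self-Supervised Video Summarization Leveraging Large Language Models
    Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki, Proceedings of the 86th National Convention, 2024, 1, 653, 654, 2024/03/01
    Existing video summarization methods rely on computer vision techniques to extract important scenes and require large amounts of annotated data. However, manual annotation is both subjective and costly, which makes training data difficult to create. This study therefore proposes a new framework based on self-supervised learning that leverages recent advances in large language models. Specifically, captions are generated from frames to put the video into words, and a large language model then produces a summary of the video. Using this summary as training data, we realized a new video summarization method based on natural language processing. This work points to a new direction for the field of video summarization and contributes to solving existing problems., Japanese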
  • Rethinking Momentum Knowledge Distillation in Online Continual Learning
    Nicolas Michel, Maorong Wang, Ling Xiao, Toshihiko Yamasaki, Forty-first International Conference on Machine Learning, abs/2309.02870, 2024
    Online Continual Learning (OCL) addresses the problem of training neural
    networks on a continuous data stream where multiple classification tasks emerge
    in sequence. In contrast to offline Continual Learning, data can be seen only
    once in OCL, which is a very severe constraint. In this context, replay-based
    strategies have achieved impressive results and most state-of-the-art
    approaches heavily depend on them. While Knowledge Distillation (KD) has been
    extensively used in offline Continual Learning, it remains under-exploited in
    OCL, despite its high potential. In this paper, we analyze the challenges in
    applying KD to OCL and give empirical justifications. We introduce a direct yet
    effective methodology for applying Momentum Knowledge Distillation (MKD) to
    many flagship OCL methods and demonstrate its capabilities to enhance existing
    approaches. In addition to improving existing state-of-the-art accuracy by more
    than 10 percentage points on ImageNet100, we shed light on MKD internal mechanics and
    impacts during training in OCL. We argue that similar to replay, MKD should be
    considered a central component of OCL. The code is available at
    https://github.com/Nicolas1203/mkd_ocl.
  • Improving Plasticity in Online Continual Learning via Collaborative Learning
    Maorong Wang, Nicolas Michel, Ling Xiao, Toshihiko Yamasaki, CVPR, abs/2312.00600, 23460, 23469, 2023/12/01
    Online Continual Learning (CL) solves the problem of learning the
    ever-emerging new classification tasks from a continuous data stream. Unlike
    its offline counterpart, in online CL, the training data can only be seen once.
    Most existing online CL research regards catastrophic forgetting (i.e., model
    stability) as almost the only challenge. In this paper, we argue that the
    model's capability to acquire new knowledge (i.e., model plasticity) is another
    challenge in online CL. While replay-based strategies have been shown to be
    effective in alleviating catastrophic forgetting, there is a notable gap in
    research attention toward improving model plasticity. To this end, we propose
    Collaborative Continual Learning (CCL), a collaborative learning based strategy
    to improve the model's capability in acquiring new concepts. Additionally, we
    introduce Distillation Chain (DC), a collaborative learning scheme to boost the
    training of the models. We adapt CCL-DC to existing representative online CL
    works. Extensive experiments demonstrate that even if the learners are
    well-trained with state-of-the-art online CL methods, our strategy can still
    improve model plasticity dramatically, and thereby improve the overall
    performance by a large margin. The source code of our work is available at
    https://github.com/maorong-wang/CCL-DC.
  • Online Open-set Semi-supervised Object Detection with Dual Competing Head
    Zerun Wang, Ling Xiao, Liuyu Xiang, Zhaotian Weng, Toshihiko Yamasaki, 2023/05/23
    Open-set semi-supervised object detection (OSSOD) task leverages practical
    open-set unlabeled datasets that comprise both in-distribution (ID) and
    out-of-distribution (OOD) instances for conducting semi-supervised object
    detection (SSOD). The main challenge in OSSOD is distinguishing and filtering
    the OOD instances (i.e., outliers) during pseudo-labeling since OODs will
    affect the performance. The only OSSOD work employs an additional offline OOD
    detection network trained solely with labeled data to solve this problem.
    However, the limited labeled data restricts the potential for improvement.
    Meanwhile, the offline strategy results in low efficiency. To alleviate these
    issues, this paper proposes an end-to-end online OSSOD framework that improves
    performance and efficiency: 1) We propose a semi-supervised outlier filtering
    method that more effectively filters the OOD instances using both labeled and
    unlabeled data. 2) We propose a threshold-free Dual Competing OOD head that
    further improves the performance by suppressing the error accumulation during
    semi-supervised outlier filtering. 3) Our proposed method is an online
    end-to-end trainable OSSOD framework. Experimental results show that our method
    achieves state-of-the-art performance on several OSSOD benchmarks compared to
    existing methods. Moreover, additional experiments show that our method is more
    efficient and can be easily applied to different SSOD frameworks to boost their
    performance.
  • MetaMixer: A Regularization Strategy for Online Knowledge Distillation
    Maorong Wang, Ling Xiao, Toshihiko Yamasaki, CoRR, abs/2303.07951, 2023/03/14
    Online knowledge distillation (KD) has received increasing attention in
    recent years. However, while most existing online KD methods focus on
    developing complicated model structures and training strategies to improve the
    distillation of high-level knowledge like probability distribution, the effects
    of the multi-level knowledge in the online KD are greatly overlooked,
    especially the low-level knowledge. Thus, to provide a novel viewpoint to
    online KD, we propose MetaMixer, a regularization strategy that can strengthen
    the distillation by combining the low-level knowledge that impacts the
    localization capability of the networks, and high-level knowledge that focuses
    on the whole image. Experiments under different conditions show that MetaMixer
    can achieve significant performance gains over state-of-the-art methods.
  • Semi-supervised Fashion Compatibility Prediction by Color Distortion Prediction
    Ling Xiao, Toshihiko Yamasaki, CoRR, abs/2212.14680, 2022/12/27
    Supervised learning methods have been suffering from the fact that a
    large-scale labeled dataset is mandatory, which is difficult to obtain. This
    has been a more significant issue for fashion compatibility prediction because
    compatibility aims to capture people's perception of aesthetics, which are
    sparse and changing. Thus, the labeled dataset may become outdated quickly due
    to fast fashion. Moreover, labeling the dataset always needs some expert
    knowledge; at least they should have a good sense of aesthetics. However, there
    are limited self/semi-supervised learning techniques in this field. In this
    paper, we propose a general color distortion prediction task forcing the
    baseline to recognize low-level image information to learn more discriminative
    representation for fashion compatibility prediction. Specifically, we first
    propose to distort the image by adjusting the image color balance, contrast,
    sharpness, and brightness. Then, we propose adding Gaussian noise to the
    distorted image before passing them to the convolutional neural network (CNN)
    backbone to learn a probability distribution over all possible distortions. The
    proposed pretext task is adopted in the state-of-the-art methods in fashion
    compatibility and shows its effectiveness in improving these methods' ability
    in extracting better feature representations. Applying the proposed pretext
    task to the baseline can consistently outperform the original baseline.
  • SAT: Self-adaptive training for fashion compatibility prediction
    Ling Xiao, Toshihiko Yamasaki, 2022 IEEE International Conference on Image Processing (ICIP), abs/2206.12622, 2431, 2435, 2022/06/25
    This paper presents a self-adaptive training (SAT) model for fashion
    compatibility prediction. It focuses on the learning of some hard items, such
    as those that share similar color, texture, and pattern features but are
    considered incompatible due to the aesthetics or temporal shifts. Specifically,
    we first design a method to define hard outfits and a difficulty score (DS) is
    defined and assigned to each outfit based on the difficulty in recommending an
    item for it. Then, we propose a self-adaptive triplet loss (SATL), where the DS
    of the outfit is considered. Finally, we propose a very simple conditional
    similarity network combining the proposed SATL to achieve the learning of hard
    items in the fashion compatibility prediction. Experiments on the publicly
    available Polyvore Outfits and Polyvore Outfits-D datasets demonstrate our
    SAT's effectiveness in fashion compatibility prediction. Besides, our SATL can
    be easily extended to other conditional similarity networks to improve their
    performance., IEEE

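Presentations and Talks

  • Exploring New Frontiers in Artificial Intelligence
    Ling Xiao
    Forum for HUASHAN Scholars (IFHS2024), 2024/10/27, Chinese, Public lecture, seminar, tutorial, course, or lecture
    2024/10/25 - 2024/10/28, [Invited talk]
  • Recommender Systems Leveraging Large Multimodal Models
    Ling Xiao
    The 4th International Computational Imaging Conference (CITA 2024), 2024/09/22, English, Oral presentation (invited, special)
    2024/09/20 - 2024/09/22, [Invited talk]
  • The Past, Present, and Future of Fashion Retrieval: Toward User Orientation
    Ling Xiao
    MVE, 2024/03/13, English, Oral presentation (invited, special)
    2024/03/13 - 2024/03/15, [Invited talk]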

Courses Taught

  • Visual Media
    The University of Tokyo
    2024/04 - 2024/07

Research Projects (Joint Research and Competitive Funding)

Industrial Property Rights

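  • A Fast FCM Image Segmentation Method for Optical Flow Maps
    Patent, Youmin Hu, Zhongxu Hu (胡中旭), Bo Wu, Minjian Wu (武敏健), Jie Liu, Ling Xiao, Shijie Wang (王诗杰), Xuelian Li (李雪莲)
    Patent application ZL201710530461.3
  • Multifunctional Fixture for a Weld Pool Online Monitoring Platform
    Utility model, Youmin Hu, Song Tang (唐松), Ling Xiao, Yong Gu (谷勇), Jie Liu
    Patent application ZL201620434683.6
  • System and Method for Online Monitoring of the Dynamic Process of a Weld Pool
    Patent, Youmin Hu, Jie Liu, Ling Xiao, Song Tang (唐松), Yong Gu (谷勇)
    Patent application ZL201610288460.8
  • Visualized Crane Hoisting Positioning System
    Patent, Youmin Hu, Ling Xiao, Bo Wu, Jie Liu
    Patent application ZL201611246219.5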