XIAO LING (シヨウ リン)
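Faculty of Information Science and Technology, Division of Computer Science and Information Technology, Mathematical Science | Associate Professor
Last Updated: 2025/06/07
■Basic Researcher Information
Researchmap personal page
Homepage URL
Researcher number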
- 40946787
J-Global ID
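■Career
Career history
- 2025/04 - Present
Hokkaido University, Faculty of Information Science and Technology, Division of Computer Science and Information Technology, Associate Professor
- 2023/10 - 2025/03
The University of Tokyo, Graduate School of Information Science and Technology, Department of Information and Communication Engineering, Project Assistant Professor, Japan
- 2023/11 - 2024/03
The University of Tokyo, Institute for AI and Beyond, Adjunct Project Assistant Professor, Japan
- 2022/07 - 2023/10
The University of Tokyo, Institute for AI and Beyond, Adjunct Project Researcher, Japan
- 2021/06 - 2023/10
The University of Tokyo, Graduate School of Information Science and Technology, Department of Information and Communication Engineering, Project Researcher, Japan
- 2014/07 - 2015/08
BYD, Design Department No. 2, Mechanical Engineer, China
Education
■Research Activities
Papers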
- Multi-level knowledge distillation for fine-grained fashion image retrieval
Ling Xiao, Toshihiko Yamasaki
Knowledge-Based Systems, 310, 112955, 112955, Elsevier BV, 2025/02
Research paper (scientific journal)
- A Multimodal Dataset and Benchmark for Tourism Review Generation
Hiromasa Yamanishi, Ling Xiao, Toshihiko Yamasaki
Proceedings of the ACM International Conference on Recommender Systems Workshops, 2025/01/03
- Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video
Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki
Proceedings of the 6th ACM International Conference on Multimedia in Asia, abs/2405.08890, 1, 1, ACM, 2024/12/03
Research paper (international conference proceedings)
Current video summarization methods primarily depend on supervised computer
vision techniques, which demand time-consuming manual annotations. Further,
the annotations are always subjective, which makes this task more challenging. To
address these issues, we analyzed the feasibility of transforming video
summarization into a text summarization task and leveraging Large Language Models
(LLMs) to boost video summarization. This paper proposes a novel
self-supervised framework for video summarization guided by LLMs. Our method
begins by generating captions for video frames, which are then synthesized into
text summaries by LLMs. Subsequently, we measure the semantic distance between the
frame captions and the text summary. Notably, we propose a novel
loss function to optimize our model according to the diversity of the video.
Finally, the summarized video can be generated by selecting the frames whose
captions are similar to the text summary. Our model achieves competitive
results against other state-of-the-art methods and paves a novel pathway in
video summarization.
- SCOMatch: Alleviating Overtrusting in Open-Set Semi-supervised Learning
Zerun Wang, Liuyu Xiang, Lang Huang, Jiafeng Mao, Ling Xiao, Toshihiko Yamasaki
Proceedings of the European Conference on Computer Vision, 217, 233, Springer Nature Switzerland, 2024/10/29
Paper in collection (book)
- E-ReaRev: Adaptive Reasoning for Question Answering over Incomplete Knowledge Graphs by Edge and Meaning Extensions
Xiaotong Ye, Ling Xiao, Chi Zhang, Toshihiko Yamasaki
Proceedings of the International Conference on Applications of Natural Language to Information Systems, 85, 95, Springer Nature Switzerland, 2024/09/20
Paper in collection (book)
- Boosting Fine-grained Fashion Retrieval with Relational Knowledge Distillation
Ling Xiao, Toshihiko Yamasaki
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 8229, 8234, 2024
- HetSpot: Analyzing Tourist Spot Popularity with Heterogeneous Graph Neural Network
Hiromasa Yamanishi, Ling Xiao, Toshihiko Yamasaki
Proceedings of the 6th International Conference on Image, Video and Signal Processing, 111, 120, 2024
Research paper (international conference proceedings)
- LiFSO-Net: A lightweight feature screening optimization network for complex-scale flat metal defect detection
Hao Zhong, Ling Xiao, Haifeng Wang, Xin Zhang, Chenhui Wan, Youmin Hu, Bo Wu
Knowledge-Based Systems, 304, 112520, 112520, 2024
Research paper (scientific journal)
- STFE-Net: A multi-stage approach to enhance statistical texture feature for defect detection on metal surfaces
Hao Zhong, Daxing Fu, Ling Xiao, Fang Zhao, Jie Liu, Youmin Hu, Bo Wu
Advanced Engineering Informatics, 61, 102437, 102437, 2024
Research paper (scientific journal)
- Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
Ling Xiao, Toshihiko Yamasaki
IEEE Access, 12, 48068, 48080, 2024
Research paper (scientific journal)
Fine-grained fashion retrieval searches for items that share a similar
attribute with the query image. Most existing methods use a pre-trained feature
extractor (e.g., ResNet 50) to capture image representations. However, a
pre-trained feature backbone is typically trained for image classification and
object detection, which are fundamentally different tasks from fine-grained
fashion retrieval. Therefore, existing methods suffer from a feature gap
problem when directly using the pre-trained backbone for fine-tuning. To solve
this problem, we introduce an attribute-guided multi-level attention network
(AG-MAN). Specifically, we first enhance the pre-trained feature extractor to
capture multi-level image embedding, thereby enriching the low-level features
within these representations. Then, we propose a classification scheme where
images with the same attribute, albeit with different values, are categorized
into the same class. This can further alleviate the feature gap problem by
perturbing object-centric feature learning. Moreover, we propose an improved
attribute-guided attention module for extracting more accurate
attribute-specific representations. Our model consistently outperforms existing
attention-based methods when assessed on the FashionAI (62.8788% in MAP),
DeepFashion (8.9804% in MAP), and Zappos50k datasets (93.32% in Prediction
accuracy). In particular, ours improves the most typical ASENet_V2 model by 2.12,
0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets,
respectively. The source code is available at
https://github.com/Dr-LingXiao/AG-MAN.
- Online Open-set Semi-supervised Object Detection via Semi-supervised Outlier Filtering
Zerun Wang, Ling Xiao, Liuyu Xiang, Zhaotian Weng, Toshihiko Yamasaki
CoRR, abs/2305.13802, 2023
Research paper (scientific journal)
- Learning Fashion Compatibility with Color Distortion Prediction
Ling Xiao, Xiaofeng Zhang, Toshihiko Yamasaki
MIPR, 81, 84, 2023
Research paper (international conference proceedings)
- Toward a More Robust Fine-Grained Fashion Retrieval
Ling Xiao, Xiaofeng Zhang, Toshihiko Yamasaki
MIPR, 1, 4, 2023
Research paper (international conference proceedings)
- Bridging the Capacity Gap for Online Knowledge Distillation
Maorong Wang, Hao Yu, Ling Xiao, Toshihiko Yamasaki
MIPR, 1, 4, 2023
Research paper (international conference proceedings)
- Missing Small Fastener Detection Using Deep Learning
Ling Xiao, Bo Wu, Youmin Hu
IEEE Transactions on Instrumentation and Measurement, 70, 1, 9, Institute of Electrical and Electronics Engineers (IEEE), 2021
Research paper (scientific journal)
- Surface Defect Detection Using Image Pyramid
Ling Xiao, Bo Wu, Youmin Hu
IEEE Sensors Journal, 20, 13, 7181, 7188, Institute of Electrical and Electronics Engineers (IEEE), 2020/07/01
Research paper (scientific journal)
- A Hierarchical Features-Based Model for Freight Train Defect Inspection
Ling Xiao, Bo Wu, Youmin Hu, Jie Liu
IEEE Sensors Journal, 1, 1, Institute of Electrical and Electronics Engineers (IEEE), 2020/03/01
Research paper (scientific journal)
- OSED: Object-specific edge detection
Ling Xiao, Bo Wu, Youmin Hu
Journal of Visual Communication and Image Representation, 72, 102918, 102918, 2020
Research paper (scientific journal)
- Detection of powder bed defects in selective laser sintering using convolutional neural network
Ling Xiao, Mingyuan Lu, Han Huang
The International Journal of Advanced Manufacturing Technology, 107, 2485, 2496, Springer, 2020
Research paper (scientific journal)
- Surface Defect Detection using Hierarchical Features
Ling Xiao, Tao Huang, Bo Wu, Youmin Hu, Jiehan Zhou
15th IEEE International Conference on Automation Science and Engineering (CASE), 1592, 1596, IEEE, 2019
Research paper (international conference proceedings)
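Other activities and achievements
- Video Summarization via Self-Supervised Learning Using Large Language Models (大規模言語モデルを活用した自己教師あり学習によるビデオ要約)
Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki, Proceedings of the 86th National Convention, 2024, 1, 653, 654, 2024/03/01
Existing video summarization methods rely on computer vision techniques to extract important scenes and require large amounts of annotated data. However, manual annotation is subjective and costly, which makes training data difficult to create. This study therefore proposes a new framework based on self-supervised learning that takes advantage of recent progress in large language models. Specifically, captions are generated from frames to verbalize the video, and a large language model then produces a summary of the video. Using this summary as supervision, we realized a new video summarization method based on natural language processing. This work points to a new direction for the field of video summarization and contributes to solving existing problems.
Language: Japanese
- Rethinking Momentum Knowledge Distillation in Online Continual Learning
Nicolas Michel, Maorong Wang, Ling Xiao, Toshihiko Yamasaki, Forty-first International Conference on Machine Learning, abs/2309.02870, 2024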
Online Continual Learning (OCL) addresses the problem of training neural
networks on a continuous data stream where multiple classification tasks emerge
in sequence. In contrast to offline Continual Learning, data can be seen only
once in OCL, which is a very severe constraint. In this context, replay-based
strategies have achieved impressive results and most state-of-the-art
approaches heavily depend on them. While Knowledge Distillation (KD) has been
extensively used in offline Continual Learning, it remains under-exploited in
OCL, despite its high potential. In this paper, we analyze the challenges in
applying KD to OCL and give empirical justifications. We introduce a direct yet
effective methodology for applying Momentum Knowledge Distillation (MKD) to
many flagship OCL methods and demonstrate its capabilities to enhance existing
approaches. In addition to improving existing state-of-the-art accuracy by more
than 10 percentage points on ImageNet100, we shed light on MKD internal mechanics and
impacts during training in OCL. We argue that similar to replay, MKD should be
considered a central component of OCL. The code is available at
https://github.com/Nicolas1203/mkd_ocl.
- Improving Plasticity in Online Continual Learning via Collaborative Learning
Maorong Wang, Nicolas Michel, Ling Xiao, Toshihiko Yamasaki, CVPR, abs/2312.00600, 23460, 23469, 2023/12/01
Online Continual Learning (CL) solves the problem of learning the
ever-emerging new classification tasks from a continuous data stream. Unlike
its offline counterpart, in online CL, the training data can only be seen once.
Most existing online CL research regards catastrophic forgetting (i.e., model
stability) as almost the only challenge. In this paper, we argue that the
model's capability to acquire new knowledge (i.e., model plasticity) is another
challenge in online CL. While replay-based strategies have been shown to be
effective in alleviating catastrophic forgetting, there is a notable gap in
research attention toward improving model plasticity. To this end, we propose
Collaborative Continual Learning (CCL), a collaborative learning based strategy
to improve the model's capability in acquiring new concepts. Additionally, we
introduce Distillation Chain (DC), a collaborative learning scheme to boost the
training of the models. We adapt CCL-DC to existing representative online CL
works. Extensive experiments demonstrate that even if the learners are
well-trained with state-of-the-art online CL methods, our strategy can still
improve model plasticity dramatically, and thereby improve the overall
performance by a large margin. The source code of our work is available at
https://github.com/maorong-wang/CCL-DC.
- Online Open-set Semi-supervised Object Detection with Dual Competing Head
Zerun Wang, Ling Xiao, Liuyu Xiang, Zhaotian Weng, Toshihiko Yamasaki, 2023/05/23
The open-set semi-supervised object detection (OSSOD) task leverages practical
open-set unlabeled datasets that comprise both in-distribution (ID) and
out-of-distribution (OOD) instances for conducting semi-supervised object
detection (SSOD). The main challenge in OSSOD is distinguishing and filtering
the OOD instances (i.e., outliers) during pseudo-labeling since OODs will
affect the performance. The only OSSOD work employs an additional offline OOD
detection network trained solely with labeled data to solve this problem.
However, the limited labeled data restricts the potential for improvement.
Meanwhile, the offline strategy results in low efficiency. To alleviate these
issues, this paper proposes an end-to-end online OSSOD framework that improves
performance and efficiency: 1) We propose a semi-supervised outlier filtering
method that more effectively filters the OOD instances using both labeled and
unlabeled data. 2) We propose a threshold-free Dual Competing OOD head that
further improves the performance by suppressing the error accumulation during
semi-supervised outlier filtering. 3) Our proposed method is an online
end-to-end trainable OSSOD framework. Experimental results show that our method
achieves state-of-the-art performance on several OSSOD benchmarks compared to
existing methods. Moreover, additional experiments show that our method is more
efficient and can be easily applied to different SSOD frameworks to boost their
performance.
- MetaMixer: A Regularization Strategy for Online Knowledge Distillation
Maorong Wang, Ling Xiao, Toshihiko Yamasaki, CoRR, abs/2303.07951, 2023/03/14
Online knowledge distillation (KD) has received increasing attention in
recent years. However, while most existing online KD methods focus on
developing complicated model structures and training strategies to improve the
distillation of high-level knowledge like probability distribution, the effects
of the multi-level knowledge in the online KD are greatly overlooked,
especially the low-level knowledge. Thus, to provide a novel viewpoint to
online KD, we propose MetaMixer, a regularization strategy that can strengthen
the distillation by combining the low-level knowledge that impacts the
localization capability of the networks, and high-level knowledge that focuses
on the whole image. Experiments under different conditions show that MetaMixer
can achieve significant performance gains over state-of-the-art methods.
- Semi-supervised Fashion Compatibility Prediction by Color Distortion Prediction
Ling Xiao, Toshihiko Yamasaki, CoRR, abs/2212.14680, 2022/12/27
Supervised learning methods have been suffering from the fact that a
large-scale labeled dataset is mandatory, which is difficult to obtain. This
has been a more significant issue for fashion compatibility prediction because
compatibility aims to capture people's perception of aesthetics, which are
sparse and changing. Thus, the labeled dataset may become outdated quickly due
to fast fashion. Moreover, labeling the dataset always requires some expert
knowledge; annotators should at least have a good sense of aesthetics. However, there
are limited self/semi-supervised learning techniques in this field. In this
paper, we propose a general color distortion prediction task forcing the
baseline to recognize low-level image information to learn more discriminative
representation for fashion compatibility prediction. Specifically, we first
propose to distort the image by adjusting the image color balance, contrast,
sharpness, and brightness. Then, we propose adding Gaussian noise to the
distorted images before passing them to the convolutional neural network (CNN)
backbone to learn a probability distribution over all possible distortions. The
proposed pretext task is adopted in the state-of-the-art methods in fashion
compatibility and shows its effectiveness in improving these methods' ability
in extracting better feature representations. Applying the proposed pretext
task to the baseline can consistently outperform the original baseline.
- SAT: Self-adaptive training for fashion compatibility prediction
Ling Xiao, Toshihiko Yamasaki, 2022 IEEE International Conference on Image Processing (ICIP), abs/2206.12622, 2431, 2435, 2022/06/25
This paper presents a self-adaptive training (SAT) model for fashion
compatibility prediction. It focuses on the learning of some hard items, such
as those that share similar color, texture, and pattern features but are
considered incompatible due to the aesthetics or temporal shifts. Specifically,
we first design a method to define hard outfits and a difficulty score (DS) is
defined and assigned to each outfit based on the difficulty in recommending an
item for it. Then, we propose a self-adaptive triplet loss (SATL), where the DS
of the outfit is considered. Finally, we propose a very simple conditional
similarity network combining the proposed SATL to achieve the learning of hard
items in the fashion compatibility prediction. Experiments on the publicly
available Polyvore Outfits and Polyvore Outfits-D datasets demonstrate our
SAT's effectiveness in fashion compatibility prediction. Besides, our SATL can
be easily extended to other conditional similarity networks to improve their
performance.
Publisher: IEEE
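As an illustration of the idea described in the SAT abstract above, the following is a minimal sketch of a triplet loss that takes an outfit's difficulty score into account. The abstract does not state the form of SATL, so everything here is an assumption made for illustration: the function names, the `base_margin` default, the assumption that the difficulty score lies in [0, 1], and in particular the choice to scale the margin by `(1 + difficulty)`. It should not be read as the published formulation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def difficulty_scaled_triplet_loss(anchor, positive, negative,
                                   difficulty, base_margin=0.2):
    """Triplet hinge loss whose margin grows with a difficulty score.

    ASSUMPTION: scaling the margin by (1 + difficulty) is an illustrative
    choice, not the SATL formula from the paper. `difficulty` is assumed
    to lie in [0, 1]; `base_margin` is a hypothetical default.
    """
    margin = base_margin * (1.0 + difficulty)
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Same embeddings, scored once as an easy outfit and once as a hard one.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [0.0, 1.1]
easy = difficulty_scaled_triplet_loss(anchor, positive, negative, difficulty=0.0)
hard = difficulty_scaled_triplet_loss(anchor, positive, negative, difficulty=1.0)
print(easy, hard)
```

Under this assumed form, a harder outfit produces a larger loss for the same embeddings, so training pushes its negative item further away than it would for an easy outfit.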
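Presentations and talks
- Exploring New Frontiers of Artificial Intelligence
Ling Xiao
Forum for HUASHAN Scholars (IFHS2024), 2024/10/27, Chinese, Public lecture, seminar, tutorial, course, lecture, etc.
2024/10/25 - 2024/10/28, [Invited talk]
- Recommender Systems Using Large Multimodal Models
Ling Xiao
The 4th International Computational Imaging Conference (CITA 2024), 2024/09/22, English, Oral presentation (invited/special)
2024/09/20 - 2024/09/22, [Invited talk]
- The Past, Present, and Future of Fashion Retrieval: Toward User Orientation
Ling Xiao
MVE, 2024/03/13, English, Oral presentation (invited/special)
2024/03/13 - 2024/03/15, [Invited talk]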
Research projects (joint research and competitive funding)
- Efficient and accurate scaling Graph Neural Networks for giant graphs
Grants-in-Aid for Scientific Research
2024/04/01 - 2026/03/31
Japan Society for the Promotion of Science, Grant-in-Aid for Early-Career Scientists, The University of Tokyo, 24K20787