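Masataka Sawayama (澤山 正貴)
Faculty of Information Science and Technology, Division of Media and Network Technologies, Information Media Science and Technology | Associate Professor
■Basic Researcher Information
Profile
2010 - 2013 Doctoral Program, Division of Information Sciences, Graduate School of Advanced Integration Science, Chiba University
2013 - 2016 Research Associate, Sensory Representation Research Group, Human Information Science Laboratory, NTT Communication Science Laboratories
2016 - 2018 Researcher, Sensory Representation Research Group, Human Information Science Laboratory, NTT Communication Science Laboratories
2018 - 2020 Senior Researcher, Sensory Representation Research Group, Human Information Science Laboratory, NTT Communication Science Laboratories
2021 - 2022 Postdoctoral Researcher, Inria, Team Flowers
2022 - 2023 Project Lecturer, Graduate School of Information Science and Technology, The University of Tokyo
2023 - 2025 Lecturer, Graduate School of Information Science and Technology, The University of Tokyo
2025 - present Associate Professor, Faculty of Information Science and Technology, Hokkaido University
Researchmap personal page
J-Global ID
■Research Activities
Papers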
- Probing the link between vision and language in material perception using psychophysics and unsupervised learning
Chenxi Liao, Masataka Sawayama, Bei Xiao
PLOS Computational Biology, 20, 10, e1012481, e1012481, Public Library of Science (PLoS), 2024-10-03, [Peer-reviewed]
English, Research paper (scientific journal), We can visually discriminate and recognize a wide range of materials. Meanwhile, we use language to describe what we see and communicate relevant information about the materials. Here, we investigate the relationship between visual judgment and language expression to understand how visual features relate to semantic representations in human cognition. We use deep generative models to generate images of realistic materials. Interpolating between the generative models enables us to systematically create material appearances in both well-defined and ambiguous categories. Using these stimuli, we compared the representations of materials from two behavioral tasks: visual material similarity judgments and free-form verbal descriptions. Our findings reveal a moderate but significant correlation between vision and language on a categorical level. However, analyzing the representations with an unsupervised alignment method, we discover structural differences that arise at the image-to-image level, especially among ambiguous materials morphed between known categories. Moreover, visual judgments exhibit more individual differences compared to verbal descriptions. Our results show that while verbal descriptions capture material qualities on the coarse level, they may not fully convey the visual nuances of material appearances. Analyzing the image representation of materials obtained from various pre-trained deep neural networks, we find that similarity structures in human visual judgments align more closely with those of the vision-language models than purely vision-based models. Our work illustrates the need to consider the vision-language relationship in building a comprehensive model for material perception. Moreover, we propose a novel framework for evaluating the alignment and misalignment between representations from different modalities, leveraging information from human behaviors and computational models.
- Temporal and Spatial Analysis of Event-Related Potentials in Response to Color Saliency Differences Among Various Color Vision Types
Naoko Takahashi, Masataka Sawayama, Xu Chen, Yuki Motomura, Hiroshige Takeichi, Satoru Miyauchi, Chihiro Hiramatsu
Frontiers in Human Neuroscience, 18, 1, 17, 2024-10-02, [Peer-reviewed], [International journal]
English, Research paper (scientific journal), Introduction: Human color vision exhibits significant diversity that cannot be fully explained by categorical classifications. Understanding how individuals with different color vision phenotypes perceive, recognize, and react to the same physical stimuli provides valuable insights into sensory characteristics. This study aimed to identify behavioral and neural differences between different color visions, primarily classified as typical trichromats and anomalous trichromats, in response to two chromatic stimuli, blue-green and red, during an attention-demanding oddball task.
Methods: We analyzed the P3 component of event-related potentials (ERPs), associated with attention, and conducted a broad spatiotemporal exploration of neural differences. Behavioral responses were also analyzed to complement neural data. Participants included typical trichromats (n = 13) and anomalous trichromats (n = 5), and the chromatic stimuli were presented in an oddball paradigm.
Results: Typical trichromats exhibited faster potentiation from the occipital to parietal regions in response to the more salient red stimulus, particularly in the area overlapping with the P3 component. In contrast, anomalous trichromats revealed faster potentiation to the expected more salient blue-green stimulus in the occipital to parietal regions, with no other significant neural differences between stimuli. Comparisons between the color vision types showed no significant overall neural differences.
Discussion: The large variability in red-green sensitivity among anomalous trichromats, along with neural variability not fully explained by this sensitivity, likely contributed to the absence of clear neural distinctions based on color saliency. While reaction times were influenced by red-green sensitivity, neural signals showed ambiguity regarding saliency differences. These findings suggest that factors beyond red-green sensitivity influenced neural activity related to color perception and cognition in minority color vision phenotypes. Further research with larger sample sizes is needed to more comprehensively explore these neural dynamics and their broader implications.
- Stick to your role! Stability of personal values expressed in large language models
Grgur Kovač, Rémy Portelas, Masataka Sawayama, Peter Ford Dominey, Pierre-Yves Oudeyer
PLOS ONE, 19, 8, e0309114, e0309114, Public Library of Science (PLoS), 2024-08-26, [Peer-reviewed], [International co-authorship], [International journal]
English, Research paper (scientific journal), The standard way to study Large Language Models (LLMs) through benchmarks or psychology questionnaires is to provide many different queries from similar minimal contexts (e.g. multiple choice questions). However, due to LLMs’ highly context-dependent nature, conclusions from such minimal-context evaluations may tell us little about the model’s behavior in deployment (where it will be exposed to many new contexts). We argue that context-dependence should be studied as another dimension of LLM comparison alongside others such as cognitive abilities, knowledge, or model size. In this paper, we present a case-study about the stability of value expression over different contexts (simulated conversations on different topics), and as measured using a standard psychology questionnaire (PVQ) and behavioral downstream tasks. We consider 21 LLMs from six families. Reusing methods from psychology, we study Rank-order stability on the population (interpersonal) level, and Ipsative stability on the individual (intrapersonal) level. We explore two settings: with and without instructing LLMs to simulate particular personalities. We observe similar trends in the stability of models and model families (Mixtral, Mistral, GPT-3.5 and Qwen families being more stable than LLaMa-2 and Phi) over those two settings, two different simulated populations, and even on three downstream behavioral tasks. When instructed to simulate particular personas, LLMs exhibit low Rank-Order stability, and this stability further diminishes with conversation length. This highlights the need for future research directions on LLMs that can coherently simulate a diversity of personas, as well as how context-dependence can be studied in more thorough and efficient ways. This paper provides a foundational step in that direction, and, to our knowledge, it is the first study of value stability in LLMs.
The project website with code is available at https://sites.google.com/view/llmvaluestability.
- Decoding time-resolved neural representations of orientation ensemble perception
Ryuto Yashiro, Masataka Sawayama, Kaoru Amano
Frontiers in Neuroscience, 18, 1, 13, Frontiers Media SA, 2024-08-01, [Peer-reviewed], [International journal]
English, Research paper (scientific journal), The visual system can compute summary statistics of several visual elements at a glance. Numerous studies have shown that an ensemble of different visual features can be perceived over 50–200 ms; however, the time point at which the visual system forms an accurate ensemble representation associated with an individual’s perception remains unclear. This is mainly because most previous studies have not fully addressed time-resolved neural representations that occur during ensemble perception, particularly lacking quantification of the representational strength of ensembles and their correlation with behavior. Here, we conducted orientation ensemble discrimination tasks and electroencephalogram (EEG) recordings to decode orientation representations over time while human observers discriminated an average of multiple orientations. We modeled EEG signals as a linear sum of hypothetical orientation channel responses and inverted this model to quantify the representational strength of orientation ensemble. Our analysis using this inverted encoding model revealed stronger representations of the average orientation over 400–700 ms. We also correlated the orientation representation estimated from EEG signals with the perceived average orientation reported in the ensemble discrimination task with adjustment methods. We found that the estimated orientation at approximately 600–700 ms significantly correlated with the individual differences in perceived average orientation. These results suggest that although ensembles can be quickly and roughly computed, the visual system may gradually compute an orientation ensemble over several hundred milliseconds to achieve a more accurate ensemble representation.
- Unsupervised learning reveals interpretable latent representations for translucency perception
Liao C, Sawayama M, Xiao B
PLoS Computational Biology, 19(2), e1010878, 1, 31, 2023-01, [Peer-reviewed]
English, Research paper (scientific journal)
- An Open-Source Cognitive Test Battery to Assess Human Attention and Memory
Maxime Adolphe, Masataka Sawayama, Denis Maurel, Alexandra Delmas, Pierre-Yves Oudeyer, Hélène Sauzéon
Frontiers in Psychology, 13, 1, 16, Frontiers Media SA, 2022-06-10, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal), Cognitive test batteries are widely used in diverse research fields, such as cognitive training, cognitive disorder assessment, or brain mechanism understanding. Although they need flexibility according to their usage objectives, most test batteries are not available as open-source software and cannot be tuned by researchers in detail. The present study introduces an open-source cognitive test battery to assess attention and memory, using a JavaScript library, p5.js. Because of the ubiquitous nature of dynamic attention in our daily lives, it is crucial to have tools for its assessment or training. For that purpose, our test battery includes seven cognitive tasks (multiple-objects tracking, enumeration, go/no-go, load-induced blindness, task-switching, working memory, and memorability), common in cognitive science literature. By using the test battery, we conducted an online experiment to collect the benchmark data. Results collected on 2 separate days showed high cross-day reliability. Specifically, the task performance did not largely change with the different days. Besides, our test battery captures diverse individual differences and can evaluate them based on the cognitive factors extracted from latent factor analysis. Since we share our source code as open-source software, users can expand and manipulate experimental conditions flexibly. Our test battery is also flexible in terms of the experimental environment, i.e., it is possible to experiment either online or in a laboratory environment.
- Language-biased image classification: evaluation based on semantic representations
Lemesle*, Y, Sawayama*, M, Valle-Perez, G, Adolphe, M, Sauzéon, H, Oudeyer, P. Y
International Conference on Learning Representations (ICLR) 2022, 1, 19, 2022-04, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (international conference proceedings)
- Visual discrimination of optical material properties: a large-scale study.
Sawayama, M, Dobashi, Y, Okabe, M, Hosokawa, K, Koumura, T, Saarela, T, Olkkonen, M, Nishida, S
Journal of Vision, 22(2), 17, 1, 24, 2022-02, [Peer-reviewed], [Lead author, Corresponding author], [International journal]
English, Research paper (scientific journal), Complex visual processing involved in perceiving the object materials can be better elucidated by taking a variety of research approaches. Sharing stimulus and response data is an effective strategy to make the results of different studies directly comparable and can assist researchers with different backgrounds to jump into the field. Here, we constructed a database containing several sets of material images annotated with visual discrimination performance. We created the material images using physically based computer graphics techniques and conducted psychophysical experiments with them in both laboratory and crowdsourcing settings. The observer's task was to discriminate materials on one of six dimensions (gloss contrast, gloss distinctness of image, translucent vs. opaque, metal vs. plastic, metal vs. glass, and glossy vs. painted). The illumination consistency and object geometry were also varied. We used a nonverbal procedure (an oddity task) applicable for diverse use cases, such as cross-cultural, cross-species, clinical, or developmental studies. Results showed that the material discrimination depended on the illuminations and geometries and that the ability to discriminate the spatial consistency of specular highlights in glossiness perception showed larger individual differences than in other tasks. In addition, analysis of visual features showed that the parameters of higher order color texture statistics can partially, but not completely, explain task performance. The results obtained through crowdsourcing were highly correlated with those obtained in the laboratory, suggesting that our database can be used even when the experimental conditions are not strictly controlled in the laboratory. Several projects using our dataset are underway.
- Crystal or Jelly? Effect of Color on the Perception of Translucent Materials with Photographs of Real-world Objects.
Liao, C, Sawayama, M, Xiao, B
Journal of Vision, 22(2), 6, 1, 23, 2022-02, [Peer-reviewed]
English, Research paper (scientific journal)
- The roles of lower- and higher-order surface statistics in tactile texture perception
Scinob Kuroki, Masataka Sawayama, Shin’ya Nishida
Journal of Neurophysiology, 126, 1, 95, 111, American Physiological Society, 2021-07-01, [Peer-reviewed], [International journal]
English, Research paper (scientific journal), Humans can discriminate subtle differences in spatial patterns in the surrounding world through their hands, but the underlying computation remains poorly understood. Here, we 3-D-printed textured surfaces and analyzed the tactile discrimination performance regarding the sensitivity to surface statistics. The results suggest that observers are sensitive to lower-order statistics but not to higher-order statistics. That is, touch differs from vision not only in spatiotemporal resolution but also in (in)sensitivity to high-level surface statistics.
- Discounting mechanism underlies extinction illusion
Lana Okubo, Kazuhiko Yokosawa, Masataka Sawayama, Takahiro Kawabe
Consciousness and Cognition, 90, 103100, 103100, Elsevier BV, 2021-04, [Peer-reviewed]
English, Research paper (scientific journal)
- A computational mechanism for seeing dynamic deformation.
Kawabe, T, Sawayama, M
eNeuro, 7, 2, 1, 14, 2020-03, [Peer-reviewed], [International journal]
English, Research paper (scientific journal), Human observers perceptually discriminate the dynamic deformation of materials in the real world. However, the psychophysical and neural mechanisms responsible for the perception of dynamic deformation have not been fully elucidated. By using a deforming bar as the stimulus, we showed that the spatial frequency of deformation was a critical determinant of deformation perception. Simulating the response of direction-selective units (i.e., MT pattern motion cells) to stimuli, we found that the perception of dynamic deformation was well explained by assuming a higher-order mechanism monitoring the spatial pattern of direction responses. Our model with the higher-order mechanism also successfully explained the appearance of a visual illusion wherein a static bar apparently deforms against a tilted drifting grating. In particular, it was the lower spatial frequencies in this pattern that strongly contributed to the deformation perception. Finally, by manipulating the luminance of the static bar, we observed that the mechanism for the illusory deformation was more sensitive to luminance than contrast cues.
- Image-based translucency transfer through correlation analysis over multi-scale spatial color distribution
Todo, H., Yatagawa, T., Sawayama, M., Dobashi, Y., & Kakimoto, M.
The Visual Computer, 35, 6-8, 811, 822, 2019-06, [Peer-reviewed]
English, Research paper (scientific journal)
- Motion Perception: From Detection to Interpretation
Nishida, S., Kawabe, T., Sawayama, M., & Fukiage, T.
Annual Review of Vision Science, 4, 1, 501, 523, 2018-09, [Peer-reviewed]
English, Research paper (scientific journal)
- Material and shape perception based on two types of intensity gradient information
Masataka Sawayama, Shin'ya Nishida
PLoS Computational Biology, 14, 4, 1, 40, Public Library of Science, 2018-04-01, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
- Haptic texture perception on 3D-printed surfaces transcribed from visual natural textures
Scinob Kuroki, Masataka Sawayama, Shin’ya Nishida
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10893, 102, 112, Springer Verlag, 2018, [Peer-reviewed]
English, Research paper (international conference proceedings)
- Animating static objects by illusion-based projection mapping
Taiki Fukiage, Takahiro Kawabe, Masataka Sawayama, Shin'ya Nishida
Journal of the Society for Information Display, 25, 7, 434, 443, 2017-07, [Peer-reviewed]
English, Research paper (scientific journal)
- Visual wetness perception based on image color statistics
Masataka Sawayama, Edward H. Adelson, Shin'ya Nishida
Journal of Vision, 17, 5, 1, 24, 2017-05, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
- Human perception of subresolution fineness of dense textures based on image intensity statistics
Masataka Sawayama, Shin'ya Nishida, Mikio Shinya
Journal of Vision, 17, 4, 1, 18, 2017-04, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
- Deformation Lamps: A Projection Technique to Make Static Objects Perceptually Dynamic
Takahiro Kawabe, Taiki Fukiage, Masataka Sawayama, Shin'ya Nishida
ACM Transactions on Applied Perception, 13, 2, Article No. 10, 2016-03, [Peer-reviewed]
English, Research paper (scientific journal)
- Deformation Lamps: A projection technique to make a static picture dynamic
Takahiro Kawabe, Masataka Sawayama, Shin'ya Nishida
ACM SIGGRAPH 2015 Emerging Technologies, SIGGRAPH 2015, Association for Computing Machinery, Inc, 2015-07-31, [Peer-reviewed]
English, Research paper (international conference proceedings)
- Stain on texture: Perception of a dark spot having a blurred edge on textured backgrounds
Masataka Sawayama, Eiji Kimura
Vision Research, 109, 209, 220, 2015-04, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
- Spatial organization affects lightness perception on articulated surrounds
Masataka Sawayama, Eiji Kimura
Journal of Vision, 13, 5, 1, 14, 2013, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
- Local computation of lightness on articulated surrounds
Masataka Sawayama, Eiji Kimura
i-Perception, 3, 8, 505, 514, 2012, [Peer-reviewed], [Lead author, Corresponding author]
English, Research paper (scientific journal)
Other Activities and Achievements
- Experimental methods for material perception research: generating three-dimensional shapes by texture synthesis (質感認知研究のための実験手法:テクスチャ合成による3次元形状の生成)
Masataka Sawayama, Makoto Okabe, Shin'ya Nishida, Yoshinori Dobashi, The Japanese Journal of Psychonomic Science, 36, 1, 56, 65, 2017
This research note reviews experimental methods to elucidate the visual processing underlying material perception, and considers how to generate experimental stimuli of three-dimensional shapes for the experiments. It is widely known that, when generating a computer graphics image of a three-dimensional object, its shape features can affect the material appearance of the object. However, it has not been established how to systematically control the shape features to investigate the effect. Here we suggest utilizing texture synthesis algorithms. Specifically, we used a height map of a three-dimensional object as a source image, and synthesized a novel height map by using a texture synthesis algorithm. We tested three algorithms to generate the height maps: i) synthesis based on image statistics, ii) example-based synthesis, and iii) synthesis using a convolutional neural network. We discuss how effective the texture synthesis algorithms are to investigate the effect of the shape features on the material perception., The Japanese Psychonomic Society, Japanese, Rapid communication, short report, research note, etc. (scientific journal)
- R&D Hot Corner, Solutions: Hengentou (Deformation Lamps): a light projection technique that makes static objects appear to move through illusion: NTT Communication Science Laboratories (R&Dホットコーナー ソリューション 変幻灯 : 止まっている対象を錯覚的に動かす光投影技術 : NTTコミュニケーション科学基礎研究所)
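Takahiro Kawabe, Taiki Fukiage, Masataka Sawayama, NTT Technical Journal, 27, 9, 87, 90, 2015-09
The Telecommunications Association, Japanese
- Science and control of material perception: exploring the motion information in video that produces a liquid appearance (Feature: New developments in communication science) (質感認識の科学と制御 : 液体質感をもたらす映像中の動き情報の探求 (特集 コミュニケーション科学の新展開))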
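Takahiro Kawabe, Masataka Sawayama, Kazushi Maruya, NTT Technical Journal, 26, 9, 27, 31, 2014-09
The Telecommunications Association, Japanese
- 5-3 A light projection method that illusorily deforms static two-dimensional real objects using motion information (Section 5: Demo session) (5-3 静止した2次元実対象を運動情報によって錯覚的に変形させる光投影手法(第5部門 デモセッション))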
Takahiro Kawabe, Masataka Sawayama, Kazushi Maruya, Shin'ya Nishida, Proceedings of the ITE Annual Convention, 2014, 5-3-1 to 5-3-2, 2014-09-01
We report a light projection method to illusorily deform static objects in printed materials by motion information that is projected through a video projector. By employing this method, it is possible to add the material impression of non-rigid materials to the printed materials., The Institute of Image Information and Television Engineers, Japanese