Researcher Database

Ichigaku Takigawa
Creative Research Institution Institute for Chemical Reaction Design and Discovery
Specially Appointed Associate Professor

Researcher Profile and Settings

Affiliation

  • Creative Research Institution Institute for Chemical Reaction Design and Discovery

Job Title

  • Specially Appointed Associate Professor

Research funding number

  • 10374597

Research Interests

  • Materials informatics   Cheminformatics   Bioinformatics   Discrete structures   Data-driven science   Machine learning   Data mining   Enumeration algorithms   Modeling

Research Areas

  • Informatics / Statistical science
  • Informatics / Intelligent informatics
  • Life sciences / Systems genomics
  • Informatics / Biological, health, and medical informatics

Academic & Professional Experience

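  • 2019/04 - Present RIKEN Center for Advanced Intelligence Project Researcher
  • 2019/04 - Present Hokkaido University Institute for Chemical Reaction Design and Discovery (WPI-ICReDD) Specially Appointed Associate Professor
  • 2018/10 - 2019/03 Hokkaido University Institute for Chemical Reaction Design and Discovery (WPI-ICReDD) Associate Professor
  • 2018/07 - 2019/03 RIKEN Center for Advanced Intelligence Project Visiting Researcher
  • 2016/08 - 2019/03 Hokkaido University Research Institute for Electronic Science
  • 2015/12 - 2019/03 Japan Science and Technology Agency (JST) PRESTO Researcher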
  • 2014/10 - 2019/03 Hokkaido University Graduate School of Information Science and Technology
  • 2012/01 - 2014/09 Hokkaido University Creative Research Institution
  • 2007/04 - 2011/12 Kyoto University Graduate School of Pharmaceutical Sciences
  • 2007/04 - 2011/12 Kyoto University Institute for Chemical Research
  • 2010/05 - 2010/08 Boston University Visiting Scholar
  • 2005/08 - 2007/03 Kyoto University Institute for Chemical Research
  • 2005/04 - 2005/07 Kyoto University Institute for Chemical Research
  • 2004/04 - 2005/03 Hokkaido University Graduate School of Information Science and Technology

Education

  • 2001/04 - 2004/03  Hokkaido University
  • 1999/04 - 2001/03  Hokkaido University
  • 1995/04 - 1999/03  Hokkaido University  School of Engineering  Department of Information Engineering

Association Memberships

  • JAPANESE SOCIETY FOR BIOINFORMATICS   THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE   ACM   IEEE   

Research Activities

Published Papers

  • Keisuke Suzuki, Takashi Toyao, Zen Maeno, Satoru Takakusagi, Ken-ichi Shimizu, Ichigaku Takigawa
    ChemCatChem Wiley 2019/09 [Refereed][Not invited]
  • Kamachi Takashi, Tatsumi Toshinobu, Toyao Takashi, Hinuma Yoyo, Maeno Zen, Takakusagi Satoru, Furukawa Shinya, Takigawa Ichigaku, Shimizu Ken-ichi
    JOURNAL OF PHYSICAL CHEMISTRY C 123 (34) 20988 - 20997 1932-7447 2019/08/29 [Refereed][Not invited]
  • Hinuma Yoyo, Toyao Takashi, Kamachi Takashi, Maeno Zen, Takakusagi Satoru, Furukawa Shinya, Takigawa Ichigaku, Shimizu Ken-ichi
    JOURNAL OF PHYSICAL CHEMISTRY C 122 (51) 29435 - 29444 1932-7447 2018/12/27 [Refereed][Not invited]
  • Takashi Toyao, Keisuke Suzuki, Shoma Kikuchi, Satoru Takakusagi, Ken-Ichi Shimizu, Ichigaku Takigawa
    Journal of Physical Chemistry C 122 (15) 8315 - 8326 1932-7455 2018/04/19 [Refereed][Not invited]
     
    The process employed to discover new materials for specific applications typically utilizes screening of large compound libraries. In this approach, the performance of a compound is correlated to the properties of elements referred to as descriptors. In the effort described below, we developed a simple and efficient machine learning (ML) model for predicting adsorption energies of CH4 related species, namely, CH3, CH2, CH, C, and H on the Cu-based alloys. The developed ML model predicted the DFT-calculated adsorption energies with 12 descriptors, which are readily available values for the selected elements. The predictive accuracy of four regression methods (ordinary linear regression by least-squares (OLR), random forest regression (RFR), gradient boosting regression (GBR), and extra tree regression (ETR)) with different numbers of descriptors and different test-set/training-set ratios was quantitatively evaluated using statistical cross validations. Among four types of regression methods, we have found that ETR gave the best performance in predicting the adsorption energies with the average root mean squared errors (RMSEs) below 0.3 eV. Strikingly, despite its simplicity and low computational cost, this model can predict the adsorption energies on a range of Cu-based alloy models (46 in total number) as calculated by using DFT. In addition, we show the ML prediction for the differences in the adsorption energies of CH3 and CH2 on the same surface. This would be of great importance especially when designing the selective catalytic reaction processes to suppress the undesired over-reactions. The accuracy and simplicity of the developed system suggest that adsorption energies can be readily predicted without time-consuming DFT calculations, and eventually, this would allow us to predict the catalytic performances of the solid catalysts.
  • Machine learning predictions of factors affecting the activity of heterogeneous metal catalysts
    Takigawa Ichigaku, Shimizu Ken-ichi, Tsuda Koji, Takakusagi Satoru
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY 255 0065-7727 2018/03/18 [Refereed][Not invited]
  • Takahashi KI, duVerle DA, Yotsukura S, Takigawa I, Mamitsuka H
    Methods in molecular biology (Clifton, N.J.) 1807 95 - 111 1064-3745 2018 [Refereed][Not invited]
  • Ryo Shirakawa, Yusei Yokoyama, Fumiya Okazaki, Ichigaku Takigawa
    CoRR abs/1807.02963 2018 [Refereed][Not invited]
  • Ayana Sasaki, Takahiro Nagatake, Riku Egami, Guoqiang Gu, Ichigaku Takigawa, Wataru Ikeda, Tomoya Nakatani, Jun Kunisawa, Yasuyuki Fujita
    Cell Reports 23 (4) 974 - 982 2211-1247 2018 [Refereed][Not invited]
     
    Recent studies have revealed that newly emerging transformed cells are often eliminated from epithelial tissues via cell competition with the surrounding normal epithelial cells. This cancer preventive phenomenon is termed epithelial defense against cancer (EDAC). However, it remains largely unknown whether and how EDAC is diminished during carcinogenesis. In this study, using a cell competition mouse model, we show that high-fat diet (HFD) feeding substantially attenuates the frequency of apical elimination of RasV12-transformed cells from intestinal and pancreatic epithelia. This process involves both lipid metabolism and chronic inflammation. Furthermore, aspirin treatment significantly facilitates eradication of transformed cells from the epithelial tissues in HFD-fed mice. Thus, our work demonstrates that obesity can profoundly influence competitive interaction between normal and transformed cells, providing insights into cell competition and cancer preventive medicine. Sasaki et al. demonstrate using a cell competition mouse model that high-fat diet feeding substantially attenuates the frequency of apical elimination of RasV12-transformed cells from intestinal and pancreatic epithelia. These results indicate that obesity can profoundly influence competitive interaction between normal and transformed cells at the initial stage of carcinogenesis.
  • Yuya Sugie, [...], Hiroshi Teramoto, [...], Shin-ichi Minato, Masanao Yamaoka, [...]
    Theory and Practice of Natural Computing - 7th International Conference, TPNC 2018, Dublin, Ireland, December 12-14, 2018, Proceedings Springer 111 - 123 2018 [Refereed][Not invited]
  • Takashi Takemoto, Normann Mertig, Masato Hayashi, Saki Susa-Tanaka, Hiroshi Teramoto, Atsuyoshi Nakamura, Ichigaku Takigawa, Shin-ichi Minato, Tamiki Komatsuzaki, Masanao Yamaoka
    2018 International Conference on ReConFigurable Computing and FPGAs, ReConFig 2018, Cancun, Mexico, December 3-5, 2018 IEEE 1 - 8 2325-6532 2018 [Refereed][Not invited]
  • Yuka Hama, Masataka Katsu, Ichigaku Takigawa, Ichiro Yabe, Masaaki Matsushima, Ikuko Takahashi, Takayuki Katayama, Jun Utsumi, Hidenao Sasaki
    MOLECULAR BRAIN 10 (1) 54  1756-6606 2017/11 [Refereed][Not invited]
     
    Genomic variation includes single-nucleotide variants, small insertions or deletions (indels), and copy number variants (CNVs). CNVs affect gene expression by altering the genome structure and transposable elements within a region. CNVs are greater than 1 kb in size; hence, CNVs can produce more variation than can individual single-nucleotide variations that are detected by next-generation sequencing. Multiple system atrophy (MSA) is an alpha-synucleinopathy adult-onset disorder. Pathologically, it is characterized by insoluble aggregation of filamentous alpha-synuclein in brain oligodendrocytes. Generally, MSA is sporadic, although there are rare cases of familial MSA. In addition, the frequencies of the clinical phenotypes differ considerably among countries. Reports indicate that genetic factors play roles in the mechanisms involved in the pathology and onset of MSA. To evaluate the genetic background of this disorder, we attempted to determine whether there are differences in CNVs between patients with MSA and normal control subjects. We found that the number of CNVs on chromosomes 5, 22, and 4 was increased in MSA; 3 CNVs in non-coding regions were considered risk factors for MSA. Our results show that CNVs in non-coding regions influence the expression of genes through transcription-related mechanisms and potentially increase subsequent structural alterations of chromosomes. Therefore, these CNVs likely play roles in the molecular mechanisms underlying MSA.
  • Tien Lam Pham, Hiori Kino, Kiyoyuki Terakura, Takashi Miyake, Koji Tsuda, Ichigaku Takigawa, Hieu Chi Dam
    SCIENCE AND TECHNOLOGY OF ADVANCED MATERIALS 18 (1) 756 - 765 1468-6996 2017/10 [Refereed][Not invited]
     
    We propose a novel representation of materials named an 'orbital-field matrix (OFM)', which is based on the distribution of valence shell electrons. We demonstrate that this new representation can be highly useful in mining material data. Experimental investigation shows that the formation energies of crystalline materials, atomization energies of molecular materials, and local magnetic moments of the constituent atoms in bimetal alloys of lanthanide metal and transition-metal can be predicted with high accuracy using the OFM. Knowledge regarding the role of the coordination numbers of the transition-metal and lanthanide elements in determining the local magnetic moments of the transition-metal sites can be acquired directly from decision tree regression analyses using the OFM.
  • Sohiya Yotsukura, Masayuki Karasuyama, Ichigaku Takigawa, Hiroshi Mamitsuka
    BRIEFINGS IN BIOINFORMATICS 18 (4) 619 - 633 1467-5463 2017/07 [Refereed][Not invited]
     
    Triple-negative (TN) breast cancer (BC) patients have limited treatment options and poor prognosis even after extant treatments and standard chemotherapeutic regimens. Linking TN patients to clinically known phenotypes with appropriate treatments is vital. Location-specific sequence variants are expected to be useful for this purpose by identifying subgroups within a disease population. Single gene mutational signatures have been widely reported, with related phenotypes in literature. We thoroughly survey currently available mutations (and mutated genes), linked to BC phenotypes, to demonstrate their limited performance as sole predictors/biomarkers to assign phenotypes to patients. We then explore mutational combinations, as a pilot study, using The Cancer Genome Atlas Research Network mutational data of BC and three machine learning methods: association rules (limitless arity multiple procedure), decision tree and hierarchical disjoint clustering. The study results in a patient classification scheme through combinatorial mutations in Phosphatidylinositol-4,5-Bisphosphate 3-Kinase and tumor protein 53, being consistent with all three methods, implying its validity from a diverse viewpoint. However, it would warrant further research to select multi-gene signatures to identify phenotypes specifically and be clinically used routinely.
  • Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori Sugimoto
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E100A (3) 865 - 876 1745-1337 2017/03 [Refereed][Not invited]
     
    In this paper, we introduce a self-constructive Normalized Gaussian Network (NGnet) for online learning tasks. In online tasks, data samples are received sequentially, and domain knowledge is often limited. Then, we need to employ learning methods to the NGnet that possess robust performance and dynamically select an accurate model size. We revise a previously proposed localized forgetting approach for the NGnet and adapt some unit manipulation mechanisms to it for dynamic model selection. The mechanisms are improved for more robustness in negative interference prone environments, and a new merge manipulation is considered to deal with model redundancies. The effectiveness of the proposed method is compared with the previous localized forgetting approach and an established learning method for the NGnet. Several experiments are conducted for a function approximation and chaotic time series forecasting task. The proposed approach possesses robust and favorable performance in different learning situations over all testbeds.
  • Ichigaku Takigawa, Hiroshi Mamitsuka
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 39 (3) 617 - 624 0162-8828 2017/02 [Refereed][Not invited]
     
    Supervised learning over graphs is an intrinsically difficult problem: simultaneous learning of relevant features from the complete subgraph feature set, in which enumerating all subgraph features occurring in given graphs is practically intractable due to combinatorial explosion. We show that 1) existing graph supervised learning studies, such as Adaboost, LPBoost, and LARS/LASSO, can be viewed as variations of a branch-and-bound algorithm with simple bounds, which we call Morishita-Kudo bounds; 2) We present a direct sparse optimization algorithm for generalized problems with arbitrary twice-differentiable loss functions, to which Morishita-Kudo bounds cannot be directly applied; 3) We experimentally showed that i) our direct optimization method improves the convergence rate and stability, and ii) L1-penalized logistic regression (L1-LogReg) by our method identifies a smaller subgraph set, keeping the competitive performance, iii) the learned subgraphs by L1-LogReg are more size-balanced than competing methods, which are biased to small-sized subgraphs.
  • Fumiko Shinkai-Ouchi, Suguru Koyama, Yasuko Ono, Shoji Hata, Koichi Ojima, Mayumi Shindo, David duVerle, Mika Ueno, Fujiko Kitamura, Naoko Doi, Ichigaku Takigawa, Hiroshi Mamitsuka, Hiroyuki Sorimachi
    MOLECULAR & CELLULAR PROTEOMICS 15 (4) 1262 - 1280 1535-9476 2016/04 [Refereed][Not invited]
     
    Calpains are intracellular Ca2+-regulated cysteine proteases that are essential for various cellular functions. Mammalian conventional calpains (calpain-1 and calpain-2) modulate the structure and function of their substrates by limited proteolysis. Thus, it is critically important to determine the site(s) in proteins at which calpains cleave. However, the calpains' substrate specificity remains unclear, because the amino acid (aa) sequences around their cleavage sites are very diverse. To clarify calpains' substrate specificities, 84 20-mer oligopeptides, corresponding to P10-P10′ of reported cleavage site sequences, were proteolyzed by calpains, and the catalytic efficiencies (kcat/Km) were globally determined by LC/MS. This analysis revealed 483 cleavage site sequences, including 360 novel ones. The kcat/Km values for 119 sites ranged from 12.5 to 1,710 M^-1 s^-1. Although most sites were cleaved by both calpain-1 and -2 with a similar kcat/Km, sequence comparisons revealed distinct aa preferences at P9-P7/P2/P5. The aa compositions of the novel sites were not statistically different from those of previously reported sites as a whole, suggesting calpains have a strict implicit rule for sequence specificity, and that the limited proteolysis of intact substrates is because of substrates' higher-order structures. Cleavage position frequencies indicated that longer sequences N-terminal to the cleavage site (P-sites) were preferred for proteolysis over C-terminal (P′-sites). Quantitative structure-activity relationship (QSAR) analyses using partial least-squares regression and >1,300 aa descriptors achieved kcat/Km prediction with r = 0.834, and binary-QSAR modeling attained an 87.5% positive prediction value for 132 reported calpain cleavage sites independent of our model construction. These results outperformed previous calpain cleavage predictors, and revealed the importance of the P2, P3, and P4 sites, and P1-P2 cooperativity. Furthermore, using our binary-QSAR model, novel cleavage sites in myoglobin were identified, verifying our predictor. This study increases our understanding of calpain substrate specificities, and opens calpains to next-generation, i.e. activity-related quantitative and cooperativity-dependent analyses.
  • Atsuyoshi Nakamura, Ichigaku Takigawa, Hisashi Tosaka, Mineichi Kudo, Hiroshi Mamitsuka
    DISCRETE APPLIED MATHEMATICS 200 123 - 152 0166-218X 2016/02 [Refereed][Not invited]
     
    We consider a frequent approximate pattern mining problem, in which interspersed repetitive regions are extracted from a given string. That is, we enumerate substrings that frequently match substrings of a given string locally and optimally. For this problem, we propose a new algorithm, in which candidate patterns are generated without duplication using the suffix tree of a given string. We further define a k-gap-constrained setting, in which the number of gaps in the alignment between a pattern and an occurrence is limited to at most k. Under this setting, we present memory-efficient algorithms, particularly a candidate-based version, which runs fast enough even over human chromosome sequences with more than 10 million nucleotides. We note that our problem and algorithms for strings can be directly extended to ordered labeled trees. In our experiments we used both randomly synthesized strings, in which corrupted similar substrings are embedded, and real data of human chromosome. The synthetic data experiments show that our proposed approach extracted embedded patterns correctly and time-efficiently. In real data experiments, we examined the centers of 100 clusters computed after grouping the patterns obtained by our k-gap-constrained versions (k = 0, 1 and 2) and the results revealed that the regions of their occurrences coincided with around half of the regions automatically annotated as Alu sequences by a manually curated repeat sequence database.
  • Ichigaku Takigawa, Ken-Ichi Shimizu, Koji Tsuda, Satoru Takakusagi
    RSC Advances 6 (58) 52587 - 52595 2046-2069 2016 [Refereed][Not invited]
     
    The d-band center for metals has been widely used in order to understand activity trends in metal-surface-catalyzed reactions in terms of the linear Brønsted-Evans-Polanyi relation and Hammer-Nørskov d-band model. In this paper, the d-band centers for eleven metals (Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag, Ir, Pt, Au) and their pairwise bimetals for two different structures (1% metal doped- or overlayer-covered metal surfaces) are statistically predicted using machine learning methods from readily available values as descriptors for the target metals (such as the density and the enthalpy of fusion of each metal). The predictive accuracy of four regression methods with different numbers of descriptors and different test-set/training-set ratios are quantitatively evaluated using statistical cross validations. It is shown that the d-band centers are reasonably well predicted by the gradient boosting regression (GBR) method with only six descriptors, even when we predict 75% of the data from only 25% given for training (average root mean square error (RMSE) < 0.5 eV). This demonstrates a potential use of machine learning methods for predicting the activity trends of metal surfaces with a negligible CPU time compared to first-principles methods.
  • Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori Sugimoto
    NEURAL INFORMATION PROCESSING, ICONIP 2016, PT IV 9950 538 - 546 0302-9743 2016 [Refereed][Not invited]
     
    In this paper, we propose a weight-time-dependent (WTD) update approach for an online EM algorithm applied to the Normalized Gaussian network (NGnet). WTD aims to improve a recently proposed weight-dependent (WD) update approach by Celaya and Agostini. First, we discuss the derivation of WD from an older time-dependent (TD) update approach. Then, we consider additional aspects to improve WD, and by including them we derive the new WTD approach from TD. The difference between WD and WTD is discussed, and some experiments are conducted to demonstrate the effectiveness of the proposed approach. WTD succeeds in improving the learning performance for a function approximation task with balanced and dynamic data distributions.
  • Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori Sugimoto
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT I 9886 444 - 452 0302-9743 2016 [Refereed][Not invited]
     
    In this paper, a Normalized Gaussian Network (NGnet) is introduced for online sequential learning that uses unit manipulation mechanisms to build the network model self-constructively. Several unit manipulation mechanisms have been proposed for online learning of an NGnet. However, unit redundancy still exists in the network model. We propose a merge mechanism for such redundant units, and change its overlap calculation in order to improve the identification accuracy of redundant units. The effectiveness of the proposed approach is demonstrated in a function approximation task with balanced and imbalanced data distributions. It succeeded in reducing the model complexity by around 11% on average while keeping or even improving learning performance.
  • Sadamori Koujaku, Ichigaku Takigawa, Mineichi Kudo, Hideyuki Imai
    SOCIAL NETWORKS 44 143 - 152 0378-8733 2016/01 [Refereed][Not invited]
     
    Discovery of cohesive subgraphs is an important issue in social network analysis. As representative cohesive subgraphs, pseudo cliques have been developed by relaxing the perfection of cliques. By enumerating pseudo clique subgraphs, we can find some structures of interest such as a star-like structure. However, slightly more complicated structures such as a core/periphery structure are still hard to find with them. Therefore, we propose a novel pseudo clique called p-dense core and show the connection with the other pseudo cliques. Moreover, we show that a set of p-dense core subgraphs gives an optimal solution in a graph partitioning problem. Several experiments on real-life networks demonstrated the effectiveness for cohesive subgraph discovery.
  • Akira Tanaka, Hirofumi Takebayashi, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo
    IEICE Transactions E98.A (11) 2324 - 2324 0916-8508 2015/11 [Refereed][Not invited]
  • Hajime Yamauchi, Takanori Matsumaru, Tomoko Morita, Susumu Ishikawa, Katsumi Maenaka, Ichigaku Takigawa, Kentaro Semba, Shunsuke Kon, Yasuyuki Fujita
    SCIENTIFIC REPORTS 5 15336  2045-2322 2015/10 [Refereed][Not invited]
     
    Recent studies have revealed that cell competition can occur between normal and transformed epithelial cells; normal epithelial cells recognize the presence of the neighboring transformed cells and actively eliminate them from epithelial tissues. Here, we have established a brand-new high-throughput screening platform that targets cell competition. By using this platform, we have identified Rebeccamycin as a hit compound that specifically promotes elimination of RasV12-transformed cells from the epithelium, though after longer treatment it shows substantial cytotoxic effect against normal epithelial cells. Among several Rebeccamycin-derivative compounds, we have found that VC1-8 has least cytotoxicity against normal cells but shows the comparable effect on the elimination of transformed cells. This cell competition-promoting activity of VC1-8 is observed both in vitro and ex vivo. These data demonstrate that the cell competition-based screening is a promising tool for the establishment of a novel type of cancer preventive medicine.
  • Sadamori Koujaku, Mineichi Kudo, Ichigaku Takigawa, Hideyuki Imai
    WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB 793 - 798 2015 [Refereed][Not invited]
     
    Detection of anomalous changes in social networks has been studied in various applications such as change detection of social interests and virus infections. Among several kinds of network changes, we concentrate on the structural changes of relatively small stationary communities. Such a change is important because it implies that some crucial changes have happened in a special group, such as the dismissal of a board of directors. One difficulty is that we have to do this in a noisy environment. This paper, therefore, proposes an algorithm that finds stationary communities in a noisy environment. Experiments on two real networks showed the advantages of our proposed algorithm.
  • Hidehisa Takahashi, Ichigaku Takigawa, Masashi Watanabe, Delnur Anwar, Mio Shibata, Chieri Tomomori-Sato, Shigeo Sato, Amol Ranjan, Chris W. Seidel, Tadasuke Tsukiyama, Wataru Mizushima, Masayasu Hayashi, Yasuyuki Ohkawa, Joan W. Conaway, Ronald C. Conaway, Shigetsugu Hatakeyama
    NATURE COMMUNICATIONS 6 5941  2041-1723 2015/01 [Refereed][Not invited]
     
    Regulation of transcription elongation by RNA polymerase II (Pol II) is a key regulatory step in gene transcription. Recently, the little elongation complex (LEC), which contains the transcription elongation factor ELL/EAF, was found to be required for the transcription of Pol II-dependent small nuclear RNA (snRNA) genes. Here we show that the human Mediator subunit MED26 plays a role in the recruitment of LEC to a subset of snRNA genes through direct interaction of EAF and the N-terminal domain (NTD) of MED26. Loss of MED26 in cells decreases the occupancy of LEC at a subset of snRNA genes and results in a reduction in their transcription. Our results suggest that the MED26-NTD functions as a molecular switch in the exchange of TBP-associated factor 7 (TAF7) for LEC to facilitate the transition from initiation to elongation during transcription of a subset of snRNA genes.
  • Hao Ding, Ichigaku Takigawa, Hiroshi Mamitsuka, Shanfeng Zhu
    BRIEFINGS IN BIOINFORMATICS 15 (5) 734 - 747 1467-5463 2014/09 [Refereed][Not invited]
     
    Computationally predicting drug-target interactions is useful to select possible drug (or target) candidates for further biochemical verification. We focus on machine learning-based approaches, particularly similarity-based methods that use drug and target similarities, which show relationships among drugs and those among targets, respectively. These two similarities represent two emerging concepts, the chemical space and the genomic space. Typically, the methods combine these two types of similarities to generate models for predicting new drug-target interactions. This process is also closely related to a lot of work in pharmacogenomics or chemical biology that attempt to understand the relationships between the chemical and genomic spaces. This background makes the similarity-based approaches attractive and promising. This article reviews the similarity-based machine learning methods for predicting drug-target interactions, which are state-of-the-art and have aroused great interest in bioinformatics. We describe each of these methods briefly, and empirically compare these methods under a uniform experimental setting to explore their advantages and limitations.
  • Yamashita Y, Kadokura Y, Sotta N, Fujiwara T, Takigawa I, Satake A, Onouchi H, Naito S
    The Journal of biological chemistry 289 (18) 12693 - 12704 0021-9258 2014/05/02 [Refereed][Not invited]
  • Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 8621 273 - 281 0302-9743 2014 [Refereed][Not invited]
     
    Kernel-based learning is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. For the last few decades, combinations of different learning machines, so-called ensemble learning, which includes learning with multiple kernels, have attracted much attention in this field. Although its efficacy has been revealed numerically in many works, its theoretical grounds have not been investigated sufficiently. In this paper, we discuss regression problems with a class of kernels and show that the generalization error by an ensemble kernel regressor with the class of kernels is smaller than the averaged generalization error by kernel regressors with each kernel in the class.
  • Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo
    Proceedings of the Sixth Asian Conference on Machine Learning, ACML 2014, Nha Trang City, Vietnam, November 26-28, 2014. JMLR.org 39 (2014) 1 - 15 2014 [Refereed][Not invited]
  • Kei-Ichiro Takahashi, Ichigaku Takigawa, Hiroshi Mamitsuka
    PLoS ONE 8 (12) e82890  1932-6203 2013/12/30 [Refereed][Not invited]
     
    Detecting biclusters from expression data is useful, since biclusters are coexpressed genes under only part of all given experimental conditions. We present a software called SiBIC, which from a given expression dataset, first exhaustively enumerates biclusters, which are then merged into rather independent biclusters, which finally are used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.
  • Atsuyoshi Nakamura, Tomoya Saito, Ichigaku Takigawa, Mineichi Kudo, Hiroshi Mamitsuka
    Discrete Applied Mathematics 161 (10-11) 1556 - 1575 0166-218X 2013/07 [Refereed][Not invited]
     
    A string with many repetitions can be represented compactly by replacing h-fold contiguous repetitions of a string r with (r)^h. We present a compact representation, which we call a repetition representation (of a string) or RRS, by which a set of disjoint or nested tandem arrays can be compacted. In this paper, we study the problem of finding a minimum RRS or MRRS, where the size of an RRS is defined by the sum of the length of component letters and the description length of the component repetitions (·)^h, which is defined by w_R(h) using a repetition weight function w_R. We develop two dynamic programming-based algorithms to solve this problem: CMR, which works for any type of w_R, and CMR-C, which is faster but can be applied to a constant w_R only. CMR-C is an O(n^2 log n)-time O(n log n)-space algorithm, which is more efficient in both time and space than CMR by a ((log n)/n)-factor, where n is the length of the given string. The problem of finding an MRRS for a string can be extended to that of finding a minimum repetition representation (of a tree) or MRRT for a given labeled ordered tree. For this problem, we present two algorithms, CMRT and CMRT-C, by using CMR and CMR-C, respectively, as a subroutine. As well as the theoretical analysis, we confirmed the efficiency of the proposed algorithms by experiments, which consist of the following three parts: First we demonstrated that CMR-C and CMRT-C are fast enough for large-scale data by using synthetic strings and trees, respectively. The size of an MRRS for a given string can be a measure of how compactly the string can be represented, meaning how well the string is structurally organized. This is also true of trees. To check such ability of MRRS-size, second we measured the size of an MRRS for chromosomes of nine different species. We found that all the chromosomes of the same species have a similar compression rate when realized by an MRRS. Run length encoding (RLE) was also shown to have species-specific compression rate, but species were separated more clearly by MRRS than by RLE. Third we examined the size of an MRRT for web pages of world-leading companies by using the tag trees, showing a consistency between the compression rate by an MRRT and visual web page structures. © 2013 Elsevier B.V. All rights reserved.
  • Ichigaku Takigawa, Hiroshi Mamitsuka
    DRUG DISCOVERY TODAY 18 (1-2) 50 - 57 1359-6446 2013/01 [Refereed][Not invited]
     
    Combinatorial chemistry has generated chemical libraries and databases with a huge number of chemical compounds, which include prospective drugs. Chemical structures of compounds can be molecular graphs, to which a variety of graph-based techniques in computer science, specifically graph mining, can be applied. The most basic way for analyzing molecular graphs is using structural fragments, so-called subgraphs in graph theory. The mainstream technique in graph mining is frequent subgraph mining, by which we can retrieve essential subgraphs in given molecular graphs. In this article we explain the idea and procedure of mining frequent subgraphs from given molecular graphs, raising some real applications, and we describe the recent advances of graph mining.
  • Timothy Hancock, Ichigaku Takigawa, Hiroshi Mamitsuka
    Methods in Molecular Biology 939 69 - 85 1064-3745 2013 [Refereed][Not invited]
     
    Methods capable of identifying genetic pathways with coordinated expression signatures are critical to advance our understanding of the functions of biological networks. Currently, the most comprehensive and validated biological networks are metabolic networks. Complete metabolic networks are easily sourced from multiple online databases. These databases reveal metabolic networks to be large, highly complex structures. This complexity is sufficient to hide the specific details on which pathways are interacting to produce an observed network response. In this chapter we will outline a complete framework for identifying the metabolic pathways that relate to an observed phenomenon. To illuminate the functional metabolic pathways, we overlay microarray experiments on top of a complete metabolic network. We then extract the functional components within a metabolic network through a combination of novel pathway ranking, clustering, and classification algorithms. This chapter is designed as a simple tutorial which enables this framework to be applied to any metabolic network and microarray data. © 2013 Springer Science+Business Media New York.
  • Koujaku S, Kudo M, Takigawa I, Imai H
    Lecture Notes in Engineering and Computer Science 1 LNECS 324 - 329 2013 [Refereed][Not invited]
  • Ichigaku Takigawa, Koji Tsuda, Hiroshi Mamitsuka
    Methods in Molecular Biology 993 67 - 80 1064-3745 2013 [Refereed][Not invited]
     
    Recent analysis on polypharmacology leads to the idea that only small fragments of drugs and targets are a key to understanding their interactions forming polypharmacology. This idea motivates us to build an in silico approach of finding significant substructure patterns from drug-target (molecular graph-amino acid sequence) pairs. This article introduces an efficient in silico method for enumerating, from given drug-target pairs, all frequent subgraph-subsequence pairs, which can then be further examined by hypothesis testing for statistical significance. Unique features of the method are its scalability, computational efficiency, and technical soundness in terms of computer science and statistics. The presented method was applied to 11,219 drug-target pairs in DrugBank to obtain significant substructure pairs, which can divide most of the original 11,219 pairs into eight highly exclusive clusters, implying that the obtained substructure pairs are indispensable components for interpreting polypharmacology. © Springer Science+Business Media, LLC 2013.
  • Timothy Hancock, Nicolas Wicker, Ichigaku Takigawa, Hiroshi Mamitsuka
    PLOS ONE 7 (2) e31345  1932-6203 2012/02 [Refereed][Not invited]
     
    In this paper we investigate how metabolic network structure affects any coordination between transcript and metabolite profiles. To achieve this goal we conduct two complementary analyses focused on the metabolic response to stress. First, we investigate the general size of any relationship between metabolic network gene expression and metabolite profiles. We find that strongly correlated transcript-metabolite profiles are sustained over surprisingly long network distances away from any target metabolite. Secondly, we employ a novel pathway mining method to investigate the structure of this transcript-metabolite relationship. The objective of this method is to identify a minimum set of metabolites which are the target of significantly correlated gene expression pathways. The results reveal that in general, a global regulation signature targeting a small number of metabolites is responsible for a large scale metabolic response. However, our method also reveals pathway specific effects that can degrade this global regulation signature and complicates the observed coordination between transcript-metabolite profiles.
  • Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 7626 345 - 353 0302-9743 2012 [Refereed][Not invited]
     
    Learning based on kernel machines is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. An appropriate model selection is required in order to obtain desirable learning results. In our previous work, we discussed a class of kernels forming a nested class of reproducing kernel Hilbert spaces with an invariant metric and proved that the kernel corresponding to the smallest reproducing kernel Hilbert space, including an unknown true function, gives the best model. In this paper, we relax the invariant metric condition and show that a similar result is obtained when a subspace with an invariant metric exists.
  • Mitsunori Kayano, Ichigaku Takigawa, Motoki Shiga, Koji Tsuda, Hiroshi Mamitsuka
    NUCLEIC ACIDS RESEARCH 39 (11) e74  0305-1048 2011/06 [Refereed][Not invited]
     
    A switching mechanism in gene expression, where two genes are positively correlated in one condition and negatively correlated in the other condition, is a key to elucidating complex biological systems. There already exist methods for detecting switching mechanisms from microarrays. However, current approaches have problems in three real cases: outliers, expression values with a very small range and a small number of examples. ROS-DET overcomes these three problems while keeping the computational complexity of current approaches. We demonstrated that ROS-DET outperformed existing methods when all three of these situations are considered. Furthermore, for each of the top 10 pairs ranked by ROS-DET, we attempted to identify a pathway, i.e. consecutive biological phenomena, related to the corresponding two genes by checking the biological literature. In 8 out of the 10 pairs, we found two parallel pathways, one of the two genes being in each of the two pathways and two pathways coming to (or starting with) the same gene. This indicates that two parallel pathways would be cooperatively used under one experimental condition, corresponding to the positive correlation, and the two pathways might be alternatively used under the other condition, corresponding to the negative correlation. ROS-DET is available from http://www.bic.kyoto-u.ac.jp/pathway/kayano/ros-det.htm.
  • Ichigaku Takigawa, Koji Tsuda, Hiroshi Mamitsuka
    PLOS ONE 6 (2) e16999  1932-6203 2011/02 [Refereed][Not invited]
     
    A current key feature in drug-target network is that drugs often bind to multiple targets, known as polypharmacology or drug promiscuity. Recent literature has indicated that relatively small fragments in both drugs and targets are crucial in forming polypharmacology. We hypothesize that principles behind polypharmacology are embedded in paired fragments in molecular graphs and amino acid sequences of drug-target interactions. We developed a fast, scalable algorithm for mining significantly co-occurring subgraph-subsequence pairs from drug-target interactions. A noteworthy feature of our approach is to capture significant paired patterns of subgraph-subsequence, while patterns of either drugs or targets only have been considered in the literature so far. Significant substructure pairs allow the grouping of drug-target interactions into clusters, covering approximately 75% of interactions containing approved drugs. These clusters were highly exclusive to each other, being statistically significant and logically implying that each cluster corresponds to a distinguished type of polypharmacology. These exclusive clusters cannot be easily obtained by using either drug or target information only but are naturally found by highlighting significant substructure pairs in drug-target interactions. These results confirm the effectiveness of our method for interpreting polypharmacology in drug-target network.
  • Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka
    PATTERN RECOGNITION 44 (2) 236 - 251 0031-3203 2011/02 [Refereed][Not invited]
     
    We address the issue of clustering examples by integrating multiple data sources, particularly numerical vectors and nodes in a network. We propose a new, efficient spectral approach, which integrates the two costs for clustering numerical vectors and clustering nodes in a network into a matrix trace, reducing the issue to a trace optimization problem which can be solved by an eigenvalue decomposition. We empirically demonstrate the performance of the proposed approach through a variety of experiments, including both synthetic and real biological datasets. (C) 2010 Elsevier Ltd. All rights reserved.
  • Ichigaku Takigawa, Hiroshi Mamitsuka
    MACHINE LEARNING 82 (2) 95 - 121 0885-6125 2011/02 [Refereed][Not invited]
     
    The output of frequent pattern mining is a huge number of frequent patterns, which are very redundant, causing a serious problem in understandability. We focus on mining frequent subgraphs for which well-considered approaches to reduce the redundancy are limited because of the complex nature of graphs. Two known, standard solutions are closed and maximal frequent subgraphs, but closed frequent subgraphs are still redundant and maximal frequent subgraphs are too specific. A more promising solution is delta-tolerance closed frequent subgraphs, which decrease monotonically in delta, being equal to maximal frequent subgraphs and closed frequent subgraphs for delta = 0 and 1, respectively. However, the current algorithm for mining delta-tolerance closed frequent subgraphs is a naive, two-step approach in which frequent subgraphs are all enumerated and then sifted according to delta-tolerance closedness. We propose an efficient algorithm based on the idea of "reverse-search" by which the completeness of enumeration is guaranteed and for which new pruning conditions are incorporated. We empirically demonstrate that our approach significantly reduced the amount of real computation time of two compared algorithms for mining delta-tolerance closed frequent subgraphs, with the reduction being more pronounced for practical settings.
  • Timothy Hancock, Ichigaku Takigawa, Hiroshi Mamitsuka
    BIOINFORMATICS 26 (17) 2128 - 2135 1367-4803 2010/09 [Refereed][Not invited]
     
    Motivation: An observed metabolic response is the result of the coordinated activation and interaction between multiple genetic pathways. However, the complex structure of metabolism has meant that which pathways are required to produce an observed metabolic response is not fully understood. In this article, we propose an approach that can identify the genetic pathways which dictate the response of a metabolic network to specific experimental conditions. Results: Our approach is a combination of probabilistic models for pathway ranking, clustering and classification. First, we use a nonparametric pathway extraction method to identify the most highly correlated paths through the metabolic network. We then extract the defining structure within these top-ranked pathways using both Markov clustering and classification algorithms. Furthermore, we define detailed node and edge annotations, which enable us to track each pathway, not only with respect to its genetic dependencies, but also allow for an analysis of the interacting reactions, compounds and KEGG sub-networks. We show that our approach identifies biologically meaningful pathways within two microarray expression datasets using entire KEGG metabolic networks.
  • On the performance of methods for finding a switching mechanism in gene expression.
    Kayano M, Takigawa I, Shiga M, Tsuda K, Mamitsuka H
    Genome informatics. International Conference on Genome Informatics 1 24 69 - 83 0919-9454 2010/07 [Refereed][Not invited]
  • duVerle D, Takigawa I, Ono Y, Sorimachi H, Mamitsuka H
    Genome informatics. International Conference on Genome Informatics 22 202 - 213 0919-9454 2010/01 [Refereed][Not invited]
  • Atsuyoshi Nakamura, Tomoya Saito, Ichigaku Takigawa, Hiroshi Mamitsuka, Mineichi Kudo
    STRING PROCESSING AND INFORMATION RETRIEVAL 6393 185 - + 0302-9743 2010 [Refereed][Not invited]
     
    A string with many repetitions can be written compactly by replacing h-fold contiguous repetitions of substring r with (r)^h. We refer to such a compact representation as a repetition representation string or RRS, by which a set of disjoint or nested tandem arrays can be compacted. In this paper, we study the problem of finding a minimum RRS or MRRS, where the size of an RRS is defined to be the sum of its component letter sizes and the sizes needed to describe the repetitions (·)^h, which are defined as w_R(h) using a repetition weight function w_R. We develop two dynamic programming algorithms to solve the problem. One is CMR, which works for any repetition weight function, and the other is CMR-C, which is faster but can be applied only when the repetition weight function is constant. CMR-C is an O(w(n + z))-time algorithm using O(n + z) space for a given string with length n, where w and z are the number of distinct primitive tandem repeats and the number of their occurrences, respectively. Since w = O(n) and z = O(n log n) in the worst case, CMR-C is an O(n^2 log n)-time O(n log n)-space algorithm, which is faster than CMR by a ((log n)/n)-factor.
  • Mitsunori Kayano, Ichigaku Takigawa, Motoki Shiga, Koji Tsuda, Hiroshi Mamitsuka
    BIOINFORMATICS 25 (21) 2735 - 2743 1367-4803 2009/11 [Refereed][Not invited]
     
    Motivation: We address the issue of finding a three-way gene interaction, i.e. two interacting genes in expression under the genotypes of another gene, given a dataset in which expressions and genotypes are measured at once for each individual. This issue can be a general, switching mechanism in expression of two genes, being controlled by categories of another gene, and finding this type of interaction can be a key to elucidating complex biological systems. The most suitable method for this issue is likelihood ratio test using logistic regressions, which we call interaction test, but a serious problem of this test is computational intractability at a genome-wide level. Results: We developed a fast method for this issue which improves the speed of interaction test by around 10 times for any size of datasets, keeping highly interacting genes with an accuracy of ~85%. We applied our method to ~3 x 10^8 three-way combinations generated from a dataset on human brain samples and detected three-way gene interactions with small P-values. To check the reliability of our results, we first conducted permutations by which we can show that the obtained P-values are significantly smaller than those obtained from permuted null examples. We then used GEO (Gene Expression Omnibus) to generate gene expression datasets with binary classes to confirm the detected three-way interactions by using these datasets and interaction tests. The result showed us some datasets with significantly small P-values, strongly supporting the reliability of the detected three-way interactions.
  • Shanfeng Zhu, Ichigaku Takigawa, Jia Zeng, Hiroshi Mamitsuka
    INFORMATION PROCESSING & MANAGEMENT 45 (5) 555 - 570 0306-4573 2009/09 [Refereed][Not invited]
     
    We propose a new finite mixture model for clustering multiple-field documents, such as scientific literature with distinct fields: title, abstract, keywords, main text and references. This probabilistic model, which we call field independent clustering model (FICM), incorporates the distinct word distributions of each field to integrate the discriminative abilities of each field as well as to select the most suitable component probabilistic model for each field. We evaluated the performance of FICM by applying it to the problem of clustering three-field (title, abstract and MeSH) biomedical documents from TREC 2004 and 2005 Genomics tracks, and two-field (title and abstract) news reports from Reuters-21578. Experimental results showed that FICM outperformed the classical multinomial model and the multivariate Bernoulli model at a statistically significant level for all three collections. These results indicate that FICM outperformed widely-used probabilistic models for document clustering by considering the characteristics of each field. We further showed that the component model, which is consistent with the nature of the corresponding field, achieved a better performance and considering the diversity of model setting also gave a further performance improvement. An extended abstract of parts of the work presented in this paper has appeared in Zhu et al. [Zhu, S., Takigawa, I., Zhang, S., & Mamitsuka, H. (2007). A probabilistic model for clustering text documents with multiple fields. In Proceedings of the 29th European conference on information retrieval, ECIR 2007. Lecture notes in computer science (Vol. 4425, pp. 331-342)]. (C) 2009 Elsevier Ltd. All rights reserved.
  • Ichigaku Takigawa, Mineichi Kudo, Atsuyoshi Nakamura
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE 22 (1) 101 - 108 0952-1976 2009/02 [Refereed][Not invited]
     
    We propose a general framework for nonparametric classification of multi-dimensional numerical patterns. Given training points for each class, it builds a set cover with convex sets each of which contains some training points of the class but no points of the other classes. Each convex set has thus an associated class label, and classification of a query point is made to the class of the convex set such that the projection of the query point onto its boundary is minimal. In this sense, the convex sets of a class are regarded as "prototypes" for that class. We then apply this framework to two special types of convex sets, minimum enclosing balls and convex hulls, giving algorithms for constructing a set cover with them and for computing the projection length onto their boundaries. For convex hulls, we also give a method for implicitly evaluating whether a point is contained in a convex hull, which can avoid computational difficulty for explicit construction of convex hulls in high-dimensional space. (C) 2008 Elsevier Ltd. All rights reserved.
  • Kosuke Hashimoto, Ichigaku Takigawa, Motoki Shiga, Minoru Kanehisa, Hiroshi Mamitsuka
    BIOINFORMATICS 24 (16) I167 - I173 1367-4803 2008/08 [Refereed][Not invited]
     
    Motivation: Carbohydrate sugar chains or glycans, the third major class of macromolecules, hold branch shaped tree structures. Glycan motifs are known to be of two types: (1) conserved patterns called cores containing the root and (2) ubiquitous motifs which appear in external parts including leaves and are distributed over different glycan classes. Finding these glycan tree motifs is an important issue, but there have been no computational methods to capture these motifs efficiently. Results: We have developed an efficient method for mining motifs or significant subtrees from glycans. The key contributions of this method are: (1) to have proposed a new concept, 'alpha-closed frequent subtrees', and an efficient method for mining all these subtrees from given trees and (2) to have proposed to apply statistical hypothesis testing to rerank the frequent subtrees in significance. We experimentally verified the effectiveness of the proposed method using real glycans: (1) We examined the top 10 subtrees obtained by our method at some parameter setting and confirmed that all subtrees are significant motifs in glycobiology. (2) We applied the results of our method to a classification problem and found that our method outperformed other competing methods, SVM with three different tree kernels, with all differences being statistically significant.
  • Ichigaku Takigawa, Hiroshi Mamitsuka
    BIOINFORMATICS 24 (2) 250 - 257 1367-4803 2008/01 [Refereed][Not invited]
     
    Motivation: Pathway knowledge in public databases enables us to examine how individual metabolites are connected via chemical reactions and what genes are implicated in those processes. For two given (sets of) compounds, the number of possible paths between them in a metabolic network can be intractably large. It would be informative to rank these paths in order to differentiate between them. Results: Focusing on adjacent pairwise coexpression, we developed an algorithm which, for a specified k, efficiently outputs the top k paths based on a probabilistic scoring mechanism, using a given metabolic network and microarray datasets. Our idea of using adjacent pairwise coexpression is supported by recent studies that local coregulation is predominant in metabolism. We first evaluated this idea by examining to what extent highly correlated gene pairs are adjacent and how often they are consecutive in a metabolic network. We then applied our algorithm to two examples of path ranking: the paths from glucose to pyruvate in the entire metabolic network of yeast and the paths from phenylalanine to sinapyl alcohol in monolignols pathways of arabidopsis under several different microarray conditions, to confirm and discuss the performance analysis of our method.
  • Mineichi Kudo, Atsuyoshi Nakamura, Ichigaku Takigawa
    19th International Conference on Pattern Recognition (ICPR 2008), December 8-11, 2008, Tampa, Florida, USA IEEE Computer Society 1 - 4 1051-4651 2008 [Refereed][Not invited]
  • Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka
    BIOINFORMATICS 23 (13) I468 - I478 1367-4803 2007/07 [Refereed][Not invited]
     
    Motivation: A promising and reliable approach to annotate gene function is clustering genes not only by using gene expression data but also literature information, especially gene networks. Results: We present a systematic method for gene clustering by combining these two totally different types of data, particularly focusing on network modularity, a global feature of gene networks. Our method is based on learning a probabilistic model, which we call a hidden modular random field in which the relation between hidden variables directly represents a given gene network. Our learning algorithm, which minimizes an energy function considering the network modularity, is practically time-efficient, regardless of using the global network property. We evaluated our method by using a metabolic network and microarray expression data, varying the microarray datasets, parameters of our model and gold standard clusters. Experimental results showed that our method outperformed four other competing methods, including k-means and existing graph partitioning methods, being statistically significant in all cases. Further detailed analysis showed that our method could group a set of genes into a cluster which corresponds to the folate metabolic pathway while other methods could not. From these results, we can say that our method is highly effective for gene clustering and annotating gene function.
  • Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING 647 - 656 2007 [Refereed][Not invited]
     
    We address the issue of clustering numerical vectors with a network. The problem setting is basically equivalent to constrained clustering by Wagstaff and Cardie [20] and semi-supervised clustering by Basu et al. [2], but our focus is more on the optimal combination of two heterogeneous data sources. An application of this setting is web pages which can be numerically vectorized by their contents, e.g. term frequencies, and which are hyperlinked to each other, showing a network. Another typical application is genes whose behavior can be numerically measured and a gene network can be given from another data source. We first define a new graph clustering measure which we call normalized network modularity, by balancing the cluster size of the original modularity. We then propose a new clustering method which integrates the cost of clustering numerical vectors with the cost of maximizing the normalized network modularity into a spectral relaxation problem. Our learning algorithm is based on spectral clustering which makes our issue an eigenvalue problem and uses k-means for final cluster assignments. A significant advantage of our method is that we can optimize the weight parameter for balancing the two costs from the given data by choosing the minimum total cost. We evaluated the performance of our proposed method using a variety of datasets including synthetic data as well as real-world data from molecular biology. Experimental results showed that our method is effective enough to have good results for clustering by numerical vectors and a network.
  • Shanfeng Zhu, Ichigaku Takigawa, Shuqin Zhang, Hiroshi Mamitsuka
    ADVANCES IN INFORMATION RETRIEVAL 4425 331 - + 0302-9743 2007 [Refereed][Not invited]
     
    We address the problem of clustering documents with multiple fields, such as scientific literature with the distinct fields: title, abstract, keywords, main text and references. By taking into consideration the distinct word distributions of each field, we propose a new probabilistic model, Field Independent Clustering Model (FICM), for clustering documents with multiple fields. The benefits of FICM come not only from integrating the discrimination abilities of each field but also from the power of selecting the most suitable component probabilistic model for each field. We examined the performance of FICM on the problem of clustering biomedical documents with three fields (title, abstract and MeSH). From the genomics track data of TREC 2004 and TREC 2005, we randomly generated 60 datasets where the number of classes in each dataset ranged from 3 to 12. By applying the appropriate configuration of generative models for each field, FICM outperformed a classical multinomial model in 59 out of the total 60 datasets, of which 47 were statistically significant at the 95% level, and FICM also outperformed a multivariate Bernoulli model in 52 out of the total 60 datasets, of which 36 were statistically significant at the 95% level.
  • Raymond Wan, Ichigaku Takigawa, Hiroshi Mamitsuka
    DATA MINING AND BIOINFORMATICS 4316 40 - + 0302-9743 2006 [Refereed][Not invited]
     
    Biological data presents unique problems for data analysis due to its high dimensions. Microarray data is one example of such data which has received much attention in recent years. Machine learning algorithms such as support vector machines (SVM) are ideal for microarray data due to their high classification accuracies. However, sometimes the information being sought is a list of genes which best separates the classes, and not a classification rate. Decision trees are one alternative which do not perform as well as SVMs, but their output is easily understood by non-specialists. A major obstacle with applying current decision tree implementations for high-dimensional data sets is their tendency to assign the same scores for multiple attributes. In this paper, we propose two distribution-dependent criteria for decision trees to improve their usefulness for microarray classification.
  • Raymond Wan, Ichigaku Takigawa, Hiroshi Mamitsuka, Vo Ngoc Anh
    Proceedings of the Fifteenth Text REtrieval Conference, TREC 2006, Gaithersburg, Maryland, USA, November 14-17, 2006 National Institute of Standards and Technology (NIST) 2006 [Refereed][Not invited]
  • Takigawa, I, M Kudo, A Nakamura
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, PROCEEDINGS 3587 90 - 99 0302-9743 2005 [Refereed][Not invited]
     
    We propose a new nonparametric classification framework for numerical patterns, which can also be exploitable for exploratory data analysis. The key idea is approximating each class region by a family of convex geometric sets which can cover samples of the target class without containing any samples of other classes. According to this framework, we consider a combinatorial classifier based on a family of spheres, each of which is the minimum covering sphere for a subset of positive samples and does not contain any negative samples. We also present a polynomial-time exact algorithm and an incremental randomized algorithm to compute it. In addition, we discuss the soft-classification version and evaluate these algorithms by some numerical experiments.
  • Takigawa, I, M Kudo, J Toyama
    IEEE TRANSACTIONS ON SIGNAL PROCESSING 52 (3) 582 - 591 1053-587X 2004/03 [Refereed][Not invited]
     
    Results of the analysis of the performance of minimum l1-norm solutions in underdetermined blind source separation, that is, separation of n sources from m (< n) linearly mixed observations, are presented in this paper. The minimum l1-norm solutions are known to be justified as maximum a posteriori probability (MAP) solutions under a Laplacian prior. Previous works have not given much attention to the performance of minimum l1-norm solutions, despite the need to know about its properties in order to investigate its practical effectiveness. We first derive a probability density of minimum l1-norm solutions and some properties. We then show that the minimum l1-norm solutions work best in a case in which the number of simultaneous nonzero source time samples is less than the number of sensors at each time point or in a case in which the source signals have a highly peaked distribution. We also show that when neither of these conditions is satisfied, the performance of minimum l1-norm solutions is almost the same as that of linear solutions obtained by the Moore-Penrose inverse. Our results show when the minimum l1-norm solutions are reliable.
  • Tanaka A, Takigawa I, Imai H, Kudo M, Miyakoshi M
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3213 1058 - 1064 2004 [Refereed][Not invited]
  • Takigawa, I, M Kudo, A Nakamura, J Toyama
    INDEPENDENT COMPONENT ANALYSIS AND BLIND SIGNAL SEPARATION 3195 193 - 200 0302-9743 2004 [Refereed][Not invited]
     
    This paper studies minimum l_1-norm signal recovery in underdetermined source separation, which is the problem of blindly separating n sources from m linear mixtures for n > m. Based on our previous result on submatrix representation and decision regions, we describe the properties of the minimum l_1-norm sequence from the viewpoint of source separation, and discuss how to construct it geometrically from the observed sequence and the mixing matrix, as well as its instability under perturbation of the mixing matrix.
  • A Tanaka, Takigawa, I, H Imai, M Kudo, M Miyakoshi
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS 3213 1058 - 1064 0302-9743 2004 [Refereed][Not invited]
     
    Kernel machines are widely known as powerful tools in various fields of information science. In general, they are designed based on a generalization criterion related to the complexity of the model and on intuitive but ad hoc philosophies such as the maximal margin principle used in SVMs. On the other hand, the projection learning scheme was proposed in the field of neural networks. In projection learning, the generalization ability is evaluated by the distance between the unknown target function and the estimated one. In this paper, we construct projection-learning-based kernel machines and propose a method of constructing a kernel function that has the representability necessary for the task. The method reduces to selecting an appropriate reproducing kernel Hilbert space from a series of monotonically increasing subspaces. We also verify the efficacy of the proposed method with numerical examples.
  • Takigawa I, Toyama J, Shimbo M
    Proceedings of the IEEE International Conference on Systems, Man and Cybernetics 1 1999 [Refereed][Not invited]
  • David Duverle, Ichigaku Takigawa
    [Refereed][Not invited]
     
    While the importance of modulatory proteolysis in research has steadily increased, knowledge on this process has remained largely disorganized, with the nature and role of entities composing modulatory proteolysis still uncertain. We built CaMPDB, a resource on modulatory proteolysis, with a focus on calpain, a well-studied intracellular protease which regulates substrate functions by proteolytic processing. CaMPDB contains sequences of calpains, substrates and inhibitors as well as substrate cleavage sites, collected from the literature. Some cleavage efficiencies were evaluated by biochemical experiments and a cleavage site prediction tool is provided to assist biologists in understanding calpain-mediated cellular processes. CaMPDB is freely accessible at

Conference Activities & Talks

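  • Skills for Living in a Data Society: The Hype and Hope of Artificial Intelligence  [Invited]
    Ichigaku Takigawa
    Toyama Prefecture Endowed Lecture (Toyama University of International Studies)  2019/12
  • Machine Learning as a Tool for Natural Science Research  [Invited]
    Ichigaku Takigawa
    Japan Advanced Institute of Science and Technology (JAIST), Information Science Seminar  2019/12
  • Machine Learning and Model-based Optimization for Heterogeneous Catalyst Design and Discovery  [Invited]
    Ichigaku Takigawa
    The 2nd ICReDD International Symposium - Toward Interdisciplinary Research Guided by Theory and Calculation,  2019/11
  • Can Machine Learning Contribute to True Understanding and Discovery?  [Invited]
    Ichigaku Takigawa
    The 35th Kanto CAE Forum, Understanding and Discovery through Data Utilization in the Age of AI and IoT  2019/10
  • Machine Learning and Optimal Experimental Design for Heterogeneous Catalysis Research  [Invited]
    Ichigaku Takigawa
    ICReDD-CREST "Carrier" Research Area Information Exchange Symposium  2019/09
  • Fundamental Problems in Artificial Intelligence: Past and Future  [Not invited]
    Ichigaku Takigawa
    Japanese Society for Artificial Intelligence, 110th Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI),  2019/09
  • Machine Learning and Optimal Experimental Design for Heterogeneous Catalysis Research  [Invited]
    Ichigaku Takigawa
    The 80th Japan Society of Applied Physics (JSAP) Autumn Meeting, Symposium: New Materials Science Created by Informatics and Its Practical Application  2019/09
  • Graph Representations of Molecules and Machine Learning  [Invited]
    Ichigaku Takigawa
    Society of Synthetic Organic Chemistry, Japan, 3rd Study Meeting on "AI and Synthetic Organic Chemistry"  2019/06
  • An Introduction to Machine Learning and Deep Learning for Users  [Invited]
    Ichigaku Takigawa
    Rinkai Hackathon 2019 with DDBJing  2019/06
  • Science and Machine Learning  [Invited]
    Ichigaku Takigawa
    NTT Communication Science Laboratories Seminar  2019/05
  • Machine Learning and Optimal Experimental Design for Chemical Research  [Invited]
    Ichigaku Takigawa
    Institute for Solid State Physics, University of Tokyo, Supercomputer Joint-Use and CCMS Joint Workshop "New Developments in Computational Materials Science",  2019/04
  • Ensemble Learning Based on Decision Trees and Regression Trees  [Invited]
    Ichigaku Takigawa
    Institute of Statistical Mathematics, Leading DAT Course, L-B2: Modern Methods of Machine Learning and Data Science  2018/12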
  • Machine Learning for Chemical Sciences  [Not invited]
    Ichigaku Takigawa
    2018 International Workshop on New Frontiers in Convergence Science and Technology, Hokkaido University (HU) - Seoul National University (SNU) Joint Symposium  2018/11
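  • Data-Driven Science and Machine Learning  [Invited]
    Ichigaku Takigawa
    Faculty of Engineering, Gifu University, 2nd Data Science Workshop  2018/09
  • Graph Representations of Molecules and Machine Learning  [Invited]
    Ichigaku Takigawa
    The 79th Japan Society of Applied Physics (JSAP) Special Symposium: An Invitation to Informatics: How Will Machine Learning and Informatics Change Applied Physics?  2018/09  Nagoya Congress Center
  • Recent Developments in Ensemble Learning Based on Decision Trees and Regression Trees  [Invited]
    Ichigaku Takigawa
    IEICE Technical Committee on Smart Info-Media Systems (SIS)  2018/06
  • Can Machine Learning Contribute to True Discovery?  [Invited]
    Ichigaku Takigawa
    MI2I-JAIST Joint Symposium (Materials Research by Information Integration Initiative and Japan Advanced Institute of Science and Technology): Toward Reconciling Prediction and Understanding in Data Science: What Does It Mean to Understand?  2018/05
  • Machine learning predictions of factors affecting the activity of heterogeneous metal catalysts  [Invited]
    Ichigaku Takigawa
    The 255th ACS (American Chemical Society) National Meeting, "CATL: Machine Learning for Catalysis Research"  2018/03
  • Graph Representations of Molecules and Machine Learning  [Invited]
    Ichigaku Takigawa
    Interdisciplinary Workshop "New Developments in Chemistry through Fusion with Data Science"  2018/03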
  • Machine Learning and Surrogate Optimization on Heterogeneous Catalysts  [Invited]
    Ichigaku Takigawa
    2019 PRESTO International Symposium on Materials Informatics  2018/02
  • Frontiers of data-driven property prediction: molecular machine learning  [Invited]
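    Ichigaku Takigawa
    Innovation Camp 2018 for Computational Materials Science (ICCMS2018)  2018/01
  • Designing and Learning Feature Representations in Machine Learning for Graph Data  [Invited]
    Ichigaku Takigawa
    Japan Society for Industrial and Applied Mathematics (JSIAM) Annual Meeting 2017  2017/09
  • Composite Variables and Ensembles: Essentials of Regression Forests and Additive Models  [Invited]
    Ichigaku Takigawa
    IEICE Technical Committee on Signal Processing (SIP)  2017/06
  • Can Machine Learning Rationalize the "Experience and Intuition" of Chemical Research?  [Invited]
    Ichigaku Takigawa
    Electrochemical Society of Japan, 33rd Lilac Seminar and 23rd Young Researchers' Exchange Meeting  2017/06
  • Machine Learning as a Tool: An Intuitive Overview and Its Practice  [Invited]
    Ichigaku Takigawa
    Workshop on Geophysical Fluid Data Analysis and Numerical Computation  2017/03
  • Between Science and Machine Learning: Design, Transformation, Selection, Interaction, and Linearity of Variables  [Invited]
    Ichigaku Takigawa
    The 19th Information-Based Induction Sciences Workshop (IBIS2016)  2016/11
  • Regulation of Transcription Elongation by the Mediator Complex  [Invited]
    Ichigaku Takigawa
    The 2nd Bioinformatics Agora  2016/07
  • Multi-Target Interaction Analysis as Data Mining  [Invited]
    Ichigaku Takigawa
    CBI Society Annual Meeting 2015, FS-08, In Silico Polypharmacology Drug Discovery  2015/10
  • Multi-Target Interaction Analysis as Data Mining  [Invited]
    Ichigaku Takigawa
    The 365th CBI Society Lecture Meeting, Phenotypic Screening: An Old Yet New Drug Discovery Approach, Part 2  2015/07
  • Statistical Machine Learning from Many Graphs  [Invited]
    Ichigaku Takigawa
    Japanese Society for Artificial Intelligence, 94th Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI)  2014/07
  • Finding structural patterns shared among interacting molecules  [Invited]
    Ichigaku Takigawa
    The 3rd Beilstein Symposium on Glyco-Bioinformatics  2013/06
  • Efficient Metabolic Pathway Ranking Based on Expression Data of Enzyme Genes  [Invited]
    Ichigaku Takigawa
    Japanese Joint Statistical Meeting 2008  2008/09
  • Machine Learning and Pattern Recognition with Mathematica  [Invited]
    Ichigaku Takigawa
    Japan Mathematica Users Group, 2nd Workshop  2006/10
  • Sound Source Separation by Independent Component Analysis and Auditory Scene Analysis  [Invited]
    Ichigaku Takigawa
    The 39th Society of Instrument and Control Engineers Annual Conference (SICE2000)  2000/07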

MISC

  • 瀧川一学  人工知能学会人工知能基本問題研究会資料  110th-  38  2019/09/15  [Not refereed][Not invited]
  • 守屋勇樹, 田畑剛, 岩崎未央, 河野信, 五斗進, 石濱泰, 瀧川一学, 吉沢明康  質量分析総合討論会講演要旨集  67th-  166  2019/04/26  [Not refereed][Not invited]
  • 日沼洋陽, 鳥屋尾隆, 蒲池高志, 前野禅, 高草木達, 古川森也, 瀧川一学, 清水研一  日本金属学会講演概要(CD-ROM)  164th-  ROMBUNNO.213  2019/03/06  [Not refereed][Not invited]
  • 菅原優, 瀧川一学  人工知能学会人工知能基本問題研究会資料  109th-  56‐61  2019/03/05  [Not refereed][Not invited]
  • 中野裕太, 瀧川一学  情報処理学会研究報告(Web)  2019-  (MPS-122)  Vol.2019‐MPS‐122,No.16,1‐6 (WEB ONLY)  2019/02/21  [Not refereed][Not invited]
  • 小林正人, 原渕祐, 堤拓朗, 小野ゆり子, 瀧川一学, 武次徹也  日本コンピュータ化学会年会講演予稿集  2018-  73  2018/11/03  [Not refereed][Not invited]
  • 鳥屋尾隆, 高草木達, 瀧川一学, 清水研一  触媒討論会討論会A予稿集  122nd-  46  2018/09/19  [Not refereed][Not invited]
  • 瀧川一学  応用物理学会秋季学術講演会講演予稿集(CD-ROM)  79th-  ROMBUNNO.18p‐CE‐9  2018/09/05  [Not refereed][Not invited]
  • 高橋翔哉, 湊真一, 瀧川一学  情報処理学会研究報告(Web)  2018-  (AL-169)  Vol.2018‐AL‐169,No.6,1‐7 (WEB ONLY)  2018/08/27  [Not refereed][Not invited]
  • 坂上 陽規, 栗田 和宏, 瀧川 一学, 有村 博紀  人工知能基本問題研究会  105-  63  -68  2018/01/28  [Not refereed][Not invited]
  • 岡崎 文哉, 瀧川 一学  人工知能基本問題研究会  105-  18  -23  2018/01/28  [Not refereed][Not invited]
  • 白川 稜, 岡崎 文哉, 瀧川 一学  人工知能基本問題研究会  105-  12  -17  2018/01/28  [Not refereed][Not invited]
  • 坂上陽規, 栗田和宏, 瀧川一学, 有村博紀  人工知能学会人工知能基本問題研究会資料  105th-  63‐68  2018/01/22  [Not refereed][Not invited]
  • 岡崎文哉, 瀧川一学  人工知能学会人工知能基本問題研究会資料  105th-  18‐23  2018/01/22  [Not refereed][Not invited]
  • 白川稜, 岡崎文哉, 瀧川一学  人工知能学会人工知能基本問題研究会資料  105th-  12‐17  2018/01/22  [Not refereed][Not invited]
  • 原田将之介, 秋田大空, 椿真史, 馬場雪乃, 瀧川一学, 山西芳裕, 鹿島久嗣  人工知能学会全国大会論文集(CD-ROM)  32nd-  (0)  ROMBUNNO.2A1.01  -2A101  2018  [Not refereed][Not invited]
     
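    Graphs are a general and powerful data representation, useful for expressing complex structures such as chemical compounds and social networks. Machine learning is widely applied to graph-structured data, but because most existing machine learning methods assume that data are represented as fixed-length vectors, much research has addressed how to handle graphs appropriately. Recent graph neural networks enable automatic and flexible feature extraction from graphs and have greatly improved predictive accuracy. In this paper, we learn feature representations for the nodes of a graph of graphs, a more general graph structure composed of an external graph and internal graphs that have so far been studied separately, using a dual convolution method trained end-to-end to integrate the internal and external levels. Link prediction experiments on real data demonstrate the usefulness of the proposed method.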
  • 坂上陽規, 瀧川一学, 有村博紀  人工知能学会全国大会論文集(CD-ROM)  32nd-  (0)  ROMBUNNO.3Pin1.10  -3Pin110  2018  [Not refereed][Not invited]
     
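    Graph fragmented decision trees are decision trees whose tests are extension operations on graph patterns, and they can be regarded as graph classification rules with first-order graph patterns, like graph decision lists. In our previous work (坂上 et al., 105th SIG-FPAI meeting), we proposed GFDT, a learning algorithm that greedily constructs graph fragmented decision trees top-down without using a graph pattern enumeration method such as gSpan. In this paper, we experimentally evaluate the performance of the GFDT algorithm when it is used as a graph feature discoverer, like gSpan, in combination with an ensemble learner (random forest, RF). In experiments on real data, we compare it with a method combining gSpan and RF and examine its usefulness.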
  • 瀧川一学  日本応用数理学会年会講演予稿集(CD-ROM)  2017-  351‐352  2017/09/04  [Not refereed][Not invited]
  • Takigawa Ichigaku, Shimizu Ken-ichi, Tsuda Koji, Takakusagi Satoru  Abstract of annual meeting of the Surface Science of Japan  37th-  (0)  24  2017/08/17  [Not refereed][Not invited]
     
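    The d-band center of a metal is a good indicator of activity trends in various catalytic reactions. The d-band center is obtained by first-principles calculations, but computing it exhaustively for various metals and bimetallic systems incurs high computational and time costs. In this study, we therefore examined fast machine-learning prediction of the d-band centers of various metals and alloys using readily available physical quantities, such as the density and ionization energy of metals, as descriptors, and demonstrated that they can be predicted with reasonable accuracy.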
  • YOTSUKURA Sohiya, KARASUYAMA Masayuki, TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  Briefings in Bioinformatics  18-  (4)  619‐633  2017/07  [Not refereed][Not invited]
  • 瀧川一学  電子情報通信学会技術研究報告  117-  (96(CAS2017 1-23))  43  2017/06/12  [Not refereed][Not invited]
  • 穐本浩昇, 田中讓, 瀧川一学  情報処理学会全国大会講演論文集  79th-  (3)  3.443‐3.444  2017/03/16  [Not refereed][Not invited]
  • 大内史子, 小山傑, 進藤真由美, 馬見塚拓, 瀧川一学, 尾嶋孝一, 秦勝志, 小野弥子, 反町洋之  日本農芸化学会大会講演要旨集(Web)  2017-  ROMBUNNO.3J34a03 (WEB ONLY)  2017/03/05  [Not refereed][Not invited]
  • 横山 侑政, 瀧川 一学  JSAI大会論文集  2017-  (0)  1K13  -1K13  2017  [Not refereed][Not invited]
     
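    In graph classification, previous studies have been restricted to learning hypothesis classes expressible by linear models over subgraph indicators. In this study, we therefore propose gradient boosting that uses decision trees, which are nonlinear models, as weak learners. In this method, the decision trees are learned on the basis of all subgraph indicators. We evaluate its performance through experiments on several benchmark datasets.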
  • 岡崎 文哉, 奥山 葉月, 瀧川 一学, 湊 真一  JSAI大会論文集  2017-  (0)  4A11  -4A11  2017  [Not refereed][Not invited]
     
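    Frequent subgraph mining is the problem of enumerating the subgraphs that occur frequently in a given set of graphs. The number of output subgraphs is often enormous, which makes them difficult to store and use after enumeration. In this study, we propose a compressed indexing method using sequence binary decision diagrams (SeqBDDs). Each subgraph is represented as a sequence, and the set of subgraphs is output as a SeqBDD. We propose three sequence representations of graphs and examine the compression ratio in experiments on real data.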
  • 高橋秀尚, 柴田美音, 瀧川一学, 渡部昌, 築山忠維, 山本淳一, 山口雄輝, 藤井聡, 飯田緑, RANJAN Amol, SATO Shigeo, TOMOMORI‐SATO Chieri, CONAWAY Joan, CONAWAY Ronald, 畠山鎮次  日本生化学会大会(Web)  90th-  ROMBUNNO.4P2T15‐06(3P‐0635) (WEB ONLY)  2017  [Not refereed][Not invited]
  • 横山侑政, 瀧川一学  人工知能学会全国大会論文集(CD-ROM)  31st-  ROMBUNNO.1K1‐3  2017  [Not refereed][Not invited]
  • 越野沙耶佳, 岡崎文哉, 瀧川一学  人工知能学会全国大会論文集(CD-ROM)  31st-  ROMBUNNO.4J1‐4  2017  [Not refereed][Not invited]
  • 岡崎文哉, 奥山葉月, 瀧川一学, 湊真一  人工知能学会全国大会論文集(CD-ROM)  31st-  ROMBUNNO.4A1‐1  2017  [Not refereed][Not invited]
  • 鈴木慶介, 瀧川一学, 清水研一, 高草木達  人工知能学会全国大会論文集(CD-ROM)  31st-  ROMBUNNO.4J1‐3  2017  [Not refereed][Not invited]
  • 鈴木 慶介, 瀧川 一学, 清水 研一, 高草木 達  人工知能学会全国大会論文集  2017-  (0)  4J13  -4J13  2017  [Not refereed][Not invited]
     
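    Oxidative coupling of methane is attracting attention as one way to make effective use of natural gas, and there are existing studies that use machine learning to predict the reaction amount. In this paper, we report the results of modeling that, in addition to the existing work, takes into account the format of the training data (composition information) and the similarity between features.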
  • 越野 沙耶佳, 岡崎 文哉, 瀧川 一学  人工知能学会全国大会論文集  2017-  (0)  4J14  -4J14  2017  [Not refereed][Not invited]
     
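    Quantitative structure-activity relationship (QSAR) modeling is the problem of predicting the activity of a compound from its chemical structure. In drug discovery, features are typically derived using chemical knowledge. In information science, on the other hand, machine learning methods for graphs have been proposed and applied to molecular graph data, but their accuracy has not been sufficiently compared with that of methods from drug discovery. In this study, we perform machine learning with features from each of the two fields and compare their accuracy experimentally.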
  • 鈴木慶介, 今井英幸, ZHANG Ruoni, 瀧川一学, 湊真一  情報科学技術フォーラム講演論文集  15th-  175‐176  2016/08/23  [Not refereed][Not invited]
  • 瀧川一学  システム/制御/情報  60-  (3)  107  -112  2016/03/15  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku  Systems, control and information  60-  (3)  107  -112  2016/03/15  [Not refereed][Not invited]
  • 横山 侑政, 瀧川 一学  人工知能基本問題研究会  99-  75  -80  2016/01/21  [Not refereed][Not invited]
  • 横山侑政, 瀧川一学  人工知能学会人工知能基本問題研究会資料  99th-  75  -80  2016/01/20  [Not refereed][Not invited]
  • BACKHUS Jana, TAKIGAWA Ichigaku, IMAI Hideyuki, KUDO Mineichi, SUGIMOTO Masanori  人工知能学会全国大会論文集(CD-ROM)  30th-  (0)  ROMBUNNO.3E4‐3  -3E43  2016  [Not refereed][Not invited]
  • Ichigaku Takigawa, Ken-ichi Shimizu, Koji Tsuda, Satoru Takakusagi  RSC ADVANCES  6-  (58)  52587  -52595  2016  [Not refereed][Not invited]
     
    The d-band center for metals has been widely used in order to understand activity trends in metal-surface-catalyzed reactions in terms of the linear Brønsted-Evans-Polanyi relation and Hammer-Nørskov d-band model. In this paper, the d-band centers for eleven metals (Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag, Ir, Pt, Au) and their pairwise bimetals for two different structures (1% metal doped- or overlayer-covered metal surfaces) are statistically predicted using machine learning methods from readily available values as descriptors for the target metals (such as the density and the enthalpy of fusion of each metal). The predictive accuracy of four regression methods with different numbers of descriptors and different test-set/training-set ratios is quantitatively evaluated using statistical cross validation. It is shown that the d-band centers are reasonably well predicted by the gradient boosting regression (GBR) method with only six descriptors, even when we predict 75% of the data from only 25% given for training (average root mean square error (RMSE) < 0.5 eV). This demonstrates a potential use of machine learning methods for predicting the activity trends of metal surfaces with a negligible CPU time compared to first-principles methods.
  • Sohiya Yotsukura, Masayuki Karasuyama, Ichigaku Takigawa, Hiroshi Mamitsuka  Big Data Analytics in Genomics  397  -428  2016/01/01  [Not refereed][Not invited]
     
    Breast cancer (BC) patients can be clinically classified into three types, called ER+, PR+, and HER2+, named after biomarkers and linked to treatments. A serious problem is that patients called “triple negative” (TN), who do not fall into any of these three categories, have no clear treatment options. Thus, clinically linking TN patients to the three main phenotypes is very important. BC patients are usually profiled by gene expression, but the resulting patient class sets (such as PAM50) are inconsistent with clinical phenotypes. On the other hand, location-specific sequence variants are expected to be more predictive for detecting BC patient subgroups, since a variety of somatic single mutations are well demonstrated to be linked to the resultant tumors. However, those mutations have not necessarily been well evaluated as patterns for predicting BC phenotypes. We thus detect patterns that can assign known phenotypes to BC TN patients, focusing on paired or more complicated nucleotide/gene mutational patterns, by using three machine learning methods: limitless arity multiple procedure (LAMP), decision trees, and hierarchical disjoint clustering. Association rules obtained through LAMP reveal a patient classification scheme through combinatorial mutations in PIK3CA and TP53, consistent with the obtained decision tree and three major clusters (covering 182/208 samples), supporting the validity of the results from diverse approaches. The final clusters, containing TN patients, present sub-population features in the TN patient pool that assign clinical phenotypes to TN patients. This paper is an extended and detailed version of a pilot study conducted in Yotsukura et al. (Brief Bioinform, to appear).
  • 大内史子, 小山傑, 小野弥子, 秦勝志, 尾嶋孝一, 進藤真由美, DE VERLE David, 土井奈穂子, 瀧川一学, 馬見塚拓, 反町洋之  日本病態プロテアーゼ学会学術集会プログラム抄録集  21st-  36  2016  [Not refereed][Not invited]
  • 岡崎文哉, 瀧川一学  人工知能学会全国大会論文集(CD-ROM)  30th-  (0)  ROMBUNNO.3I4‐3  -3I43  2016  [Not refereed][Not invited]
  • 岡崎 文哉, 瀧川 一学  電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  115-  (323)  25  -32  2015/11/26  [Not refereed][Not invited]
  • 岡崎文哉, 瀧川一学  電子情報通信学会技術研究報告  115-  (323(IBISML2015 52-93))  25‐32  2015/11/19  [Not refereed][Not invited]
  • 瀧川一学  CBI学会大会  2015-  66  2015/10/26  [Not refereed][Not invited]
  • 瀧川一学  SAR News  (29)  9‐17  2015/10/08  [Not refereed][Not invited]
  • 瀧川一学  CBI学会研究講演会  365th-  6  2015  [Not refereed][Not invited]
  • HENRY Michael, KAWASAKI Akiyuki, TAKIGAWA Ichigaku, MEGURO Kimiro  土木学会年次学術講演会講演概要集(CD-ROM)  69th-  ROMBUNNO.CS2-015  2014/08/01  [Not refereed][Not invited]
  • 瀧川一学  人工知能学会人工知能基本問題研究会資料  94th-  15  2014/07/24  [Not refereed][Not invited]
  • 佐々木秀直, 浜結香, 佐久嶋研, 加納崇裕, 廣谷真, 矢部一郎, 瀧川一学  神経変性疾患に関する調査研究 平成25年度 総括・分担研究報告書  128  -129  2014  [Not refereed][Not invited]
  • 高橋秀尚, 瀧川一学, 渡部昌, ANWAR Delnur, 柴田美音, 佐藤チエリ, 佐藤滋生, RANJAN Amol, SEIDEL Chris, 築山忠維, 林正康, 大川恭行, CONAWAY Joan, CONAWAY Ronald, 畠山鎮次  日本生化学会大会(Web)  87th-  WEB ONLY 2T15P-16(3P-502)  2014  [Not refereed][Not invited]
  • 佐々木秀直, 浜結香, 松島理明, 矢部一郎, 瀧川一学, 内海潤  運動失調症の病態解明と治療法開発に関する研究 平成25年度 総括・分担研究報告書  149  -154  2014  [Not refereed][Not invited]
  • 高橋秀尚, 瀧川一学, 渡部昌, ANWAR Delnur, 柴田美音, 佐藤チエリ, 佐藤滋生, RANJAN Amol, SEIDEL Chris W, 築山忠維, 林正康, 大川恭行, CONAWAY Joan, CONAWAY Ronald C, 畠山鎮次  日本分子生物学会年会プログラム・要旨集(Web)  37th-  1W15-3(1P-0237) (WEB ONLY)  2014  [Not refereed][Not invited]
  • 原田 裕基, 瀧川 一学, 今井 英幸  日本計算機統計学会シンポジウム論文集  27-  (27)  235  -238  2013/11/15  [Not refereed][Not invited]
  • 原田裕基, 瀧川一学, 今井英幸  日本計算機統計学会シンポジウム論文集  27th-  235  -238  2013/11/15  [Not refereed][Not invited]
  • Timothy Hancock, Ichigaku Takigawa, Hiroshi Mamitsuka  Methods in Molecular Biology  939-  69  -85  2013  [Not refereed][Not invited]
     
    Methods capable of identifying genetic pathways with coordinated expression signatures are critical to advance our understanding of the functions of biological networks. Currently, the most comprehensive and validated biological networks are metabolic networks. Complete metabolic networks are easily sourced from multiple online databases. These databases reveal metabolic networks to be large, highly complex structures. This complexity is sufficient to hide the specific details on which pathways are interacting to produce an observed network response. In this chapter we will outline a complete framework for identifying the metabolic pathways that relate to an observed phenomenon. To illuminate the functional metabolic pathways, we overlay microarray experiments on top of a complete metabolic network. We then extract the functional components within a metabolic network through a combination of novel pathway ranking, clustering, and classification algorithms. This chapter is designed as a simple tutorial which enables this framework to be applied to any metabolic network and microarray data. © 2013 Springer Science+Business Media New York.
  • TAKAHASHI Hidehisa, 瀧川一学, ANWAR Delnur, 柴田美音, TOMOMORI‐SATO Chieri, SATO Shigeo, RANJAN Amol, SEIDEL Chris, 築山忠維, 渡部昌, 林正康, 大川恭行, CONAWAY Joan, CONAWAY Ronald, 畠山鎮次  日本分子生物学会年会プログラム・要旨集(Web)  36th-  1P-0182 (WEB ONLY)  2013  [Not refereed][Not invited]
  • 中村 篤祥, 瀧川 一学, 戸坂 央  人工知能基本問題研究会  85-  23  -28  2012/02/02  [Not refereed][Not invited]
  • NAKAMURA Atsuyoshi, TAKIGAWA Ichigaku, TOSAKA Hisashi, KUDO Mineichi, MAMITSUKA Hiroshi  人工知能学会人工知能基本問題研究会資料  85th-  23  -28  2012/01/26  [Not refereed][Not invited]
  • TAKAHASHI Keiichiro, TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  情報計算化学生物学会大会予稿集  2011 (CD-ROM)-  ROMBUNNO.JSBI-41  2011/11/08  [Not refereed][Not invited]
  • 瀧川一学, 馬見塚拓  化学と教育  59-  (9)  450  -453  2011/09/20  [Not refereed][Not invited]
  • 瀧川 一学, 馬見塚 拓  化学と教育  59-  (9)  450  -453  2011/09  [Not refereed][Not invited]
     
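    Compounds such as antibiotics, nutrients, and fragrances are indispensable to modern life. One important piece of information about a compound is its chemical structural formula. In recent years, chemical structural formulas have been accumulated electronically, making it possible to extract information and discover knowledge from the accumulated data by computer. In particular, extracting substructure fragments that occur frequently in the structural formulas of multiple drugs with a specific effect, such as anticancer drugs, is useful because such fragments contain information key to the effect; moreover, if they are previously unknown, they may contribute greatly to new drug design. This article regards chemical structural formulas as graphs and explains the problem of enumerating all frequent subgraphs, together with the technical background and methods for solving it.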
  • 茅野光範, 瀧川一学, 志賀元紀, 津田宏治, 馬見塚拓  統計関連学会連合大会講演報告集  2010-  210  2010/09  [Not refereed][Not invited]
  • NAKAMURA Atsuyoshi, SAITO Tomoya, TAKIGAWA Ichigaku, MAMITSUKA Hiroshi, KUDO Mineichi  Lect Notes Comput Sci  6393-  185  -190  2010  [Not refereed][Not invited]
  • KAYANO Mitsunori, TAKIGAWA Ichigaku, SHIGA Motoki, TSUDA Koji, MAMITSUKA Hiroshi  Genome Inform Ser  24-  69  -83  2010  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  Proc Annu Conf Jpn Soc Bioinform  2010-  P019.1-P019.2  2010  [Not refereed][Not invited]
  • KAYANO Mitsunori, TAKIGAWA Ichigaku, SHIGA Motoki, TSUDA Koji, MAMITSUKA Hiroshi  Proc Annu Conf Jpn Soc Bioinform  2010-  P069.1-P069.2  2010  [Not refereed][Not invited]
  • 小山傑, 秦勝志, 小野弥子, 上野美香, 瀧川一学, 馬見塚拓, 阿部啓子, 反町洋之  日本農芸化学会大会講演要旨集  2009-  291  2009/03/05  [Not refereed][Not invited]
  • DU VERLE David, TAKIGAWA Ichigaku, ONO Yasuko, SORIMACHI Hiroyuki, MAMITSUKA Hiroshi  Genome Inform Ser  22-  202  -213  2009  [Not refereed][Not invited]
  • 瀧川一学, 馬見塚拓  統計関連学会連合大会講演報告集  2008-  156  2008/09  [Not refereed][Not invited]
  • SHIGA Motoki, TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  Biophysics  48-  (3)  190  -194  2008/05/25  [Not refereed][Not invited]
  • 小山傑, 秦勝志, 小野弥子, 尾嶋孝一, 林智佳子, 北村ふじ子, 土井菜穂子, 瀧川一学, 松島由典, 阿部啓子, 馬見塚拓, 反町洋之  日本蛋白質科学会年会プログラム・要旨集  8th-  87  2008/05/23  [Not refereed][Not invited]
  • MATSUSHIMA Yoshifumi, TAKIGAWA Ichigaku, ONO Yasuko, SORIMACHI Hiroyuki, MAMITSUKA Hiroshi  Proc Annu Conf Jpn Soc Bioinform  2008-  P071.1-P071.2  2008  [Not refereed][Not invited]
  • KAYANO Mitsunori, TAKIGAWA Ichigaku, SHIGA Motoki, TSUDA Koji, MAMITSUKA Hiroshi  Proc Annu Conf Jpn Soc Bioinform  2008-  P049.1-P049.2  2008  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku, HASHIMOTO Kosuke, SHIGA Motoki, KANEHISA Minoru, MAMITSUKA Hiroshi  Proc Annu Conf Jpn Soc Bioinform  2008-  P066.1-P066.2  2008  [Not refereed][Not invited]
  • ZHU Shanfeng, TAKIGAWA Ichigaku, ZHANG Shuqin, MAMITSUKA Hiroshi  Lect Notes Comput Sci  4425-  331  -342  2007  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  情報処理学会研究報告  2006-  (13(BIO-4))  1  -7  2006/02/09  [Not refereed][Not invited]
     
    A living cell contains thousands of enzymes, many of which operate at the same time. To control them, metabolism is organized and carefully regulated at many levels. For rapid reactions such as glycolysis, we can assume that genes for adjacent reactions are co-expressed. Thus, we developed a method for generating a sequence of genes that can promote a known path so that genes in each subsequence are maximally co-expressed. Based on expression similarities between genes that encode enzymes, we can analyze the transcriptional activity of possible reaction paths.
  • WAN Raymond, TAKIGAWA Ichigaku, MAMITSUKA Hiroshi  Lect Notes Comput Sci  4316-  40  -49  2006  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku  スーパーコンピューターラボラトリー 平成17年度 研究成果報告書  105  -107  2006  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku, KUDO Mineichi, NAKAMURA Atsuyoshi  Lect Notes Comput Sci  3587-  90  -99  2005  [Not refereed][Not invited]
  • TAKIGAWA Ichigaku, KUDO Mineichi  Technical report of IEICE. PRMU  104-  (524)  37  -42  2004/12/17  [Not refereed][Not invited]
     
    We propose a nonparametric multi-class classifier based on a family of spheres, each of which is the minimum covering sphere for a subset of positive samples and does not contain any negative samples. We first reformulate the subclass method originally defined for axis-parallel rectangles. According to our new framework, we introduce an exact algorithm and an efficient incremental randomized algorithm to construct a subclass family. In addition, we propose the soft-classification version of subclass method and evaluate these algorithms by some numerical experiments.
  • TAKIGAWA Ichigaku, TOYAMA Jun, KUDO Mineichi  Technical report of IEICE. PRMU  103-  (296)  113  -118  2003/09/09  [Not refereed][Not invited]
     
    For problems such as underdetermined separation of speech signal mixtures, that is, separation of more sources than mixtures, we need to find a unique and practically useful solution after the linear system is identified. For this problem, previous studies have used the minimum l_1-norm solution at each data point, usually solving each subproblem by linear programming. For large-scale data such as speech signals, we propose a method that can construct such a minimum l_1-norm sequence effectively.
  • 瀧川一学, 外山淳, 工藤峰一  情報科学技術フォーラム  FIT 2003-  343  -344  2003/08/25  [Not refereed][Not invited]

Research Grants & Projects

Social Contribution

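  • The 97th Science Café Sapporo: "AI Sees What Can Be Seen, Humans See What They Want to See: How to Make Machines Learn 'Correctly'"
    Date (from-to) : 2017/10/01
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Kinokuniya Bookstore Sapporo Main Store
  • Heisei Enyu Night School: "Skills for Living in a Data Society: The Hope and Hype of Artificial Intelligence"
    Date (from-to) : 2017/08/01
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Hokkaido University Enyu Gakusha
  • Visiting lecture: "Skills for Surviving a Society Flooded with Data: The Diverse and Delightful World of Information Science"
    Date (from-to) : 2014/11/11
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Hokkaido Sapporo Kita High School
  • Visiting lecture: "The Data Society and the Old Yet New AI: The Diverse and Delightful World of Information Science, Continued"
    Date (from-to) : 2013/11/07
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Hokkaido Sapporo Kita High School
  • Visiting lecture: "Skills for Surviving a Society Flooded with Data: The Diverse and Delightful World of Information Science"
    Date (from-to) : 2012/11/15
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Hokkaido Sapporo Kita High School
  • Visiting lecture: "The GPS! Toward the Autonomous Driving of the Future"
    Date (from-to) : 2012/10/04
    Role : Lecturer
    Sponsor, Organizer, Publisher  : Hokkaido Hiroo High School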

