Researcher Database

Toshinori Endo
Faculty of Information Science and Technology Bioengineering and Bioinformatics Bioinformatics

Researcher Profile and Settings


  • Faculty of Information Science and Technology Bioengineering and Bioinformatics Bioinformatics

Job Title

  • Professor


  • Ph.D. in Genetics (The Graduate University for Advanced Studies)
  • Master of Science (The University of Tokyo)


Research funding number

  • 00323692

J-Global ID

Research Interests

  • Gene evolution and organismal evolution   positive natural selection   prediction of function by means of machine learning   gene structure and function   Information Biology

Research Areas

  • Life sciences / Evolutionary biology
  • Life sciences / Biodiversity and systematics
  • Life sciences / Functional biochemistry
  • Life sciences / Genomics
  • Nanotechnology/Materials / Molecular biochemistry
  • Informatics / Biological, health, and medical informatics

Academic & Professional Experience

  • 2016/12 - Present Hokkaido University
  • 2004/04 - Present Hokkaido University Graduate School of Information Science and Technology Professor
  • 2020/04 - 2021/03 Hokkaido University
  • 2015/04 - 2017/03 Hokkaido University
  • 2014/04 - 2015/03 Hokkaido University Graduate School of Information Science and Technology
  • 2014/04 - 2015/03 Hokkaido University Department of Informatics and Electronics, Faculty of Engineering Department Chair
  • 2008/04 - 2009/03 Hokkaido University Graduate School of Information Science and Technology
  • 2000/04 - 2004/03 Tokyo Medical and Dental University Lecturer
  • 1999/05 - 2000/03 RIKEN Genome Sciences Center Researcher
  • 1997/04 - 1999/04 Japan Society for the Promotion of Science
  • 1996/04 - 1997/03 New Energy and Industrial Technology Development Organization Researcher


  • 1993/04 - 1996/03  The Graduate University for Advanced Studies  School of Life Science  Department of Genetics
  • 1990/04 - 1993/03  University of Tokyo  School of Science  Department of Biophysics and Biochemistry
  • 1986/04 - 1990/03  University of Tokyo  Faculty of Science  Department of Biophysics and Biochemistry

Association Memberships

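  • Japan Society of Bioinformatics   Society of Biosophia Studies   Genetics Society of Japan   Society of Evolutionary Studies, Japan   Information Processing Society of Japan   Society for Molecular Biology and Evolution   Information Processing Society of Japan, Special Interest Group on Bioinformatics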

Research Activities

Published Papers

  • Takahiro Nakamura, Toshinori Endo, Naoki Osada
    IPSJ Transactions on Bioinformatics 15 9 - 16 2022
  • Masaki Kawamoto, Toshiyuki Yamaji, Kyoko Saito, Yoshitaka Shirasago, Kazuhiro Satomura, Toshinori Endo, Masayoshi Fukasawa, Kentaro Hanada, Naoki Osada
    Frontiers in Genetics 11 546106 - 546106 2020/10/09 [Refereed]
    The human hepatoma-derived HuH-7 cell line and its derivatives (Huh7.5 and Huh7.5.1) have been widely used as a convenient experimental substitute for primary hepatocytes. In particular, these cell lines represent host cells suitable for propagating the hepatitis C virus (HCV) in vitro. The Huh7.5.1-8 cell line, a subline of Huh7.5.1, can propagate HCV more efficiently than its parental cells. To provide genomic information for cells' quality control, we performed whole-genome sequencing of HuH-7 and Huh7.5.1-8 and identified their characteristic genomic deletions, some of which are applicable to an in-house test for cell authentication. Among the genes related to HCV infection and replication, 53 genes were found to carry missense or loss-of-function mutations likely specific to the HuH-7 and/or Huh7.5.1-8. Eight genes, including DDX58 (RIG-I), BAX, EP300, and SPP1 (osteopontin), contained mutations observed only in Huh7.5.1-8 or mutations with higher frequency in Huh7.5.1-8. These mutations might be relevant to phenotypic differences between the two cell lines and may also serve as genetic markers to distinguish Huh7.5.1-8 cells from the ancestral HuH-7 cells.
  • Hiro Takahashi, Shido Miyaki, Hitoshi Onouchi, Taichiro Motomura, Nobuo Idesako, Anna Takahashi, Masataka Murase, Shuichi Fukuyoshi, Toshinori Endo, Kenji Satou, Satoshi Naito, Motoyuki Itoh
    Scientific reports 10 (1) 16289 - 16289 2020/10/01 [Refereed]
    Upstream open reading frames (uORFs) are present in the 5'-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.
  • Hiro Takahashi, Noriya Hayashi, Yuta Hiragori, Shun Sasaki, Taichiro Motomura, Yui Yamashita, Satoshi Naito, Anna Takahashi, Kazuyuki Fuse, Kenji Satou, Toshinori Endo, Shoko Kojima, Hitoshi Onouchi
    BMC genomics 21 (1) 260 - 260 2020/03/30 [Refereed][Not invited]
    BACKGROUND: Upstream open reading frames (uORFs) in the 5'-untranslated regions (5'-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). For genome-wide searches for uORFs with conserved peptide sequences (CPuORFs), comparative genomic studies have been conducted, in which uORF sequences were compared between selected species. To increase chances of identifying CPuORFs, we previously developed an approach in which uORF sequences were compared using BLAST between Arabidopsis and any other plant species with available transcript sequence databases. If this approach is applied to multiple plant species belonging to phylogenetically distant clades, it is expected to further comprehensively identify CPuORFs conserved in various plant lineages, including those conserved among relatively small taxonomic groups. RESULTS: To efficiently compare uORF sequences among many species and efficiently identify CPuORFs conserved in various taxonomic lineages, we developed a novel pipeline, ESUCA. We applied ESUCA to the genomes of five angiosperm species, which belong to phylogenetically distant clades, and selected CPuORFs conserved among at least three different orders. Through these analyses, we identified 89 novel CPuORF families. As expected, ESUCA analysis of each of the five angiosperm genomes identified many CPuORFs that were not identified from ESUCA analyses of the other four species. However, unexpectedly, these CPuORFs include those conserved across wide taxonomic ranges, indicating that the approach used here is useful not only for comprehensive identification of narrowly conserved CPuORFs but also for that of widely conserved CPuORFs. Examination of the effects of 11 selected CPuORFs on mORF translation revealed that CPuORFs conserved only in relatively narrow taxonomic ranges can have sequence-dependent regulatory effects, suggesting that most of the identified CPuORFs are conserved because of functional constraints of their encoded peptides. CONCLUSIONS: This study demonstrates that ESUCA is capable of efficiently identifying CPuORFs likely to be conserved because of the functional importance of their encoded peptides. Furthermore, our data show that the approach in which uORF sequences from multiple species are compared with those of many other species, using ESUCA, is highly effective in comprehensively identifying CPuORFs conserved in various taxonomic ranges.
  • Kyoko Shibata, Toshinori Endo, Yoshikazu Kuribayashi
    Drug research 69 (10) 565 - 571 2019/10 [Refereed]
    OBJECTIVE: The aim of this study was to determine promising treatment options for human inflammatory dilated cardiomyopathy using a computational drug-repositioning approach (repurposing established drug compounds for new therapeutic indications). BACKGROUND: If the myocardial tissue is found to be infiltrated with inflammatory cells, primarily lymphocytes, and if the virus is confirmed using genetic examination (PCR) or immunostaining, the infection is suspected. However, there is no specific treatment (i.e., an antiviral drug) even if the virus is identified; therefore, we used Connectivity Map to identify compounds showing inverse drug-disease signatures, indicating activity against inflammatory dilated cardiomyopathy. RESULTS: Potential drug-repositioning candidates for the treatment of inflammatory dilated cardiomyopathy were explored through a systematic comparison of the gene expression profiles induced by drugs using Gene Expression Omnibus and Connectivity Map databases. CONCLUSION: Using a computational drug-repositioning approach based on the integration of publicly available gene expression signatures of drugs and diseases, sirolimus was suggested as a novel therapeutic option for inflammatory dilated cardiomyopathy.
  • Kazuhiro Satomura, Naoki Osada, Toshinori Endo
    Ecological Genetics and Genomics 13 100045  2019/09 [Refereed][Not invited]
  • Masaki Ikeda, Kazuhiro Satomura, Tsuyoshi Sekizuka, Kentaro Hanada, Toshinori Endo, Naoki Osada
    American Journal of Primatology e22882  2018/06 [Refereed][Not invited]
  • Yasuaki Takada, Ryutaro Miyagi, Aya Takahashi, Toshinori Endo, Naoki Osada
    G3-GENES GENOMES GENETICS 7 (7) 2227 - 2234 2160-1836 2017/07 [Refereed][Not invited]
    Joint quantification of genetic and epigenetic effects on gene expression is important for understanding the establishment of complex gene regulation systems in living organisms. In particular, genomic imprinting and maternal effects play important roles in the developmental process of mammals and flowering plants. However, the influence of these effects on gene expression is difficult to quantify because they act simultaneously with cis-regulatory mutations. Here we propose a simple method to decompose cis-regulatory (i.e., allelic genotype), genomic imprinting [i.e., parent-of-origin (PO)], and maternal [i.e., maternal genotype (MG)] effects on allele-specific gene expression using RNA-seq data obtained from reciprocal crosses. We evaluated the efficiency of the method using a simulated dataset and applied the method to whole-body Drosophila and mouse trophoblast stem cell (TSC) and liver RNA-seq data. Consistent with previous studies, we found little evidence of PO and MG effects in adult Drosophila samples. In contrast, we identified dozens and hundreds of mouse genes with significant PO and MG effects, respectively. Interestingly, a similar number of genes with significant PO effect were detected in mouse TSCs and livers, whereas more genes with significant MG effect were observed in livers. Further application of this method will clarify how these three effects influence gene expression levels in different tissues and developmental stages, and provide novel insight into the evolution of gene expression regulation.
  • IKEDA Masaki, ENDO Toshinori, OSADA Naoki
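    Primate Research Supplement, Primate Society of Japan 32 59 - 59 2016
    The genomes of Old World monkeys are known to contain DNA sequences of simian endogenous retroviruses (SERVs), which are homologous to simian retrovirus (SRV). Most SERVs are present in the genome in an incomplete state, carrying insertions and deletions, but some have been reported to remain intact. Incomplete SERV sequences may also become complete through recombination and regain infectivity. Furthermore, because Old World monkey cells are used in cell-based preparations, the possibility of SERV infection is an important medical issue, yet how SERVs became distributed and evolved in Old World monkey genomes remains unclear. To address these questions, we comprehensively searched for SERV sequences in the genome sequences of the African green monkey (Chlorocebus sabaeus), rhesus macaque (Macaca mulatta), olive baboon (Papio anubis), and human (Homo sapiens), and performed molecular phylogenetic analyses using those sequences. The analyses comprised phylogenetic tree construction and calculation of synonymous and nonsynonymous substitution rates. The multiple alignment showed that some SERVs in Old World monkey genomes are present in a state that encodes all of the viral proteins. The resulting phylogenetic tree showed that SRVs and SERVs fall into two clusters. In addition, when the nonsynonymous and synonymous substitution rates of the retrieved sequences were examined to test for functional constraint, sequences without insertions or deletions were found to be under functional constraint. These results suggest that the endogenous simian retroviruses present in an intact state in Old World monkey genomes may be SRVs from past infections that have persisted without losing their function.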

  • Stolfi, Alberto, Sasakura, Yasunori, Chalopin, Domitille, Satou, Yutaka, Christiaen, Lionel, Dantec, Christelle, Endo, Toshinori, Naville, Magali, Nishida, Hiroki, Swalla, Billie J, Volff, Jean-Nicolas, Voskoboynik, Ayelet, Dauga, Delphine, Lemaire, Patrick
    GENESIS WILEY-BLACKWELL 53 (1, SI) 1 - 14 1526-954X 2015/01 [Refereed][Not invited]
  • Satou, Yutaka, Hirayama, Kazuko, Mita, Kaoru, Fujie, Manabu, Chiba, Shota, Yoshida, Reiko, Endo, Toshinori, Sasakura, Yasunori, Inaba, Kazuo, Satoh, Nori
    MOLECULAR BIOLOGY AND EVOLUTION OXFORD UNIV PRESS 32 (1) 81 - 90 0737-4038 2015/01 [Refereed][Not invited]
    Because self-incompatibility loci are maintained heterozygous and recombination within self-incompatibility loci would be disadvantageous, self-incompatibility loci are thought to contribute to structural and functional differentiation of chromosomes. Although the hermaphrodite chordate, Ciona intestinalis, has two self-incompatibility genes, this incompatibility system is incomplete and self-fertilization occurs under laboratory conditions. Here, we established an inbred strain of C. intestinalis by repeated self-fertilization. Decoding genome sequences of sibling animals of this strain identified a 2.4-Mb heterozygous region on chromosome 7. A self-incompatibility gene, Themis-B, was encoded within this region. This observation implied that this self-incompatibility locus and the linkage disequilibrium of its flanking region contribute to the formation of the 2.4-Mb heterozygous region, probably through recombination suppression. We showed that different individuals in natural populations had different numbers and different combinations of Themis-B variants, and that the rate of self-fertilization varied among these animals. Our result explains why self-fertilization occurs under laboratory conditions. It also supports the concept that the Themis-B locus is preferentially retained heterozygous in the inbred line and contributes to the formation of the 2.4-Mb heterozygous region. High structural variations might suppress recombination, and this long heterozygous region might represent a preliminary stage of structural differentiation of chromosomes.
  • Keisuke Ueno, Katsuhiko Mineta, Kimihito Ito, Toshinori Endo
    BMC STRUCTURAL BIOLOGY 12 5  1472-6807 2012/04 [Refereed][Not invited]
    Background: Structural genomics approaches, particularly those solving the 3D structures of many proteins with unknown functions, have increased the desire for structure-based function predictions. However, prediction of enzyme function is difficult because one member of a superfamily may catalyze a different reaction than other members, whereas members of different superfamilies can catalyze the same reaction. In addition, conformational changes, mutations or the absence of a particular catalytic residue can prevent inference of the mechanism by which catalytic residues stabilize and promote the elementary reaction. A major hurdle for alignment-based methods for prediction of function is the absence (despite its importance) of a measure of similarity of the physicochemical properties of catalytic sites. To solve this problem, the physicochemical features radially distributed around catalytic sites should be considered in addition to structural and sequence similarities. Results: We showed that radial distribution functions (RDFs), which are associated with the local structural and physicochemical properties of catalytic active sites, are capable of clustering oxidoreductases and transferases by function. The catalytic sites of these enzymes were also characterized using the RDFs. The RDFs provided a measure of the similarity among the catalytic sites, detecting conformational changes caused by mutation of catalytic residues. Furthermore, the RDFs reinforced the classification of enzyme functions based on conventional sequence and structural alignments. Conclusions: Our results demonstrate that the application of RDFs provides advantages in the functional classification of enzymes by providing information about catalytic sites.
  • Mia Nakachi, Ayako Nakajima, Mamoru Nomura, Kouki Yonezawa, Keisuke Ueno, Toshinori Endo, Kazuo Inaba
    MOLECULAR REPRODUCTION AND DEVELOPMENT 78 (7) 529 - 549 1040-452X 2011/07 [Refereed][Not invited]
    In this study, we performed extensive proteomic analysis of sperm from the ascidian Ciona intestinalis. Sperm were fractionated into heads and flagella, followed by further separation into Triton X-100-soluble and -insoluble fractions. Proteins from each fraction and whole sperm were separated by isoelectric focusing using two different pH ranges, followed by SDS-PAGE at two different polyacrylamide concentrations. In total, 1,294 protein spots representing 304 non-redundant proteins were identified by mass spectrometry (MALDI-TOF). On comparison of the proteins in each fraction, we were able to identify the proteins specific to different sperm compartments. Further comparison with the testis proteome allowed the pairing of proteins with sperm-specific functions. Together with information on gene expression in developing embryos and adult tissues, these results provide insight into novel cellular and functional aspects of sperm proteins, such as distinct localization of actin isoforms, novel Ca2+-binding proteins in axonemes, localization of testis-specific serine/threonine kinase, and the presence of G-protein coupled signaling and ubiquitin pathway in sperm flagella.
  • Katsuhiko Mineta, Yasuko Yamamoto, Yuji Yamazaki, Hiroo Tanaka, Yukiyo Tada, Kuniaki Saito, Atsushi Tamura, Michihiro Igarashi, Toshinori Endo, Kosei Takeuchi, Sachiko Tsukita
    FEBS LETTERS 585 (4) 606 - 612 0014-5793 2011/02 [Refereed][Not invited]
    Claudins (Cldn) are essential membrane proteins of tight junctions (TJs), which form the paracellular permselective barrier. They are produced by a multi-gene family of 24 reported members in mouse and human. Based on a comprehensive search combined with phylogenetic analyses, we identified three novel claudins (claudin-25, -26, and -27). Quantitative RT-PCR revealed that the three novel claudins were expressed in a tissue- and/or developmental stage-dependent manner. Claudins-25 and -26, but not claudin-27, were immunofluorescently localized to TJs when exogenously expressed in cultured MDCK and Eph epithelial cell lines. These findings expand the claudin family to include at least 27 members. Structured summary: Claudin-25 and ZO-1 colocalize by fluorescence microscopy; ZO-1 and Claudin-26 colocalize by fluorescence microscopy.
  • Toshinori Endo, Keisuke Ueno, Kouki Yonezawa, Katsuhiko Mineta, Kohji Hotta, Yutaka Satou, Lixy Yamada, Michio Ogasawara, Hiroki Takahashi, Ayako Nakajima, Mia Nakachi, Mamoru Nomura, Junko Yaguchi, Yasunori Sasakura, Chisato Yamasaki, Miho Sera, Akiyasu C. Yoshizawa, Tadashi Imanishi, Hisaaki Taniguchi, Kazuo Inaba
    NUCLEIC ACIDS RESEARCH 39 (suppl) D807 - D814 0305-1048 2011/01 [Not refereed][Not invited]
    The Ciona intestinalis protein database (CIPRO) is an integrated protein database for the tunicate species C. intestinalis. The database is unique in two respects: first, because of its phylogenetic position, Ciona is a suitable model for understanding vertebrate evolution; and second, the database includes original large-scale transcriptomic and proteomic data. Ciona intestinalis has also been a favorite of developmental biologists. Therefore, large amounts of data exist on its development and morphology, along with a recent genome sequence and gene expression data. The CIPRO database is aimed at collecting those published data as well as providing unique information from unpublished experimental data, such as 3D expression profiling, 2D-PAGE and mass spectrometry-based large-scale analyses at various developmental stages, curated annotation data and various bioinformatic data, to facilitate research in diverse areas, including developmental, comparative and evolutionary biology. For medical and evolutionary research, homologs in humans and major model organisms are intentionally included. The current database is based on a recently developed KH model containing 36 034 unique sequences, but for higher usability it covers all 89 683 known and predicted proteins from all gene models for this species. Of these sequences, more than 10 000 proteins have been manually annotated. Furthermore, to establish a community-supported protein database, these annotations are open to evaluation by users through the CIPRO website. CIPRO 2.5 is freely accessible at
  • Endo, Toshinori, Ueno, Keisuke, Yonezawa, Kouki, Mineta, Katsuhiko, Hotta, Kohji, Satou, Yutaka, Yamada, Lixy, Ogasawara, Michio, Takahashi, Hiroki, Nakajima, Ayako, Nakachi, Mia, Nomura, Mamoru, Yaguchi, Junko, Konno, Alu, Sasakura, Yasunori, Yoshizawa, Akiyasu C, Taniguchi, Hisaaki, Yamasaki, Chisato, Sera, Miho, Imanishi, Tadashi, Inaba, Kazuo
    GENOME BIOLOGY BIOMED CENTRAL LTD 11 (Suppl. 1) 0 - 0 1474-760X 2010/01 [Refereed][Not invited]
  • Chisato Yamasaki, Katsuhiko Murakami, Yasuyuki Fujii, Yoshiharu Sato, Erimi Harada, Jun-Ichi Takeda, Takayuki Taniya, Ryuichi Sakate, Shingo Kikugawa, Makoto Shimada, Motohiko Tanino, Kanako O. Koyanagi, Roberto A. Barrero, Craig Gough, Hong-Woo Chun, Takuya Habara, Hideki Hanaoka, Yosuke Hayakawa, Phillip B. Hilton, Yayoi Kaneko, Masako Kanno, Yoshihiro Kawahara, Toshiyuki Kawamura, Akihiro Matsuya, Naoki Nagata, Kensaku Nishikata, Akiko Ogura Noda, Shin Nurimoto, Naomi Saichi, Hiroaki Sakai, Ryoko Sanbonmatsu, Rie Shiba, Mami Suzuki, Kazuhiko Takabayashi, Aiko Takahashi, Takuro Tamura, Masayuki Tanaka, Susumu Tanaka, Fusano Todokoro, Kaori Yamaguchi, Naoyuki Yamamoto, Toshihisa Okido, Jun Mashima, Aki Hashizume, Lihua Jin, Kyung-Bum Lee, Yi-Chueh Lin, Asami Nozaki, Katsunaga Sakai, Masahito Tada, Satoru Miyazaki, Takashi Makino, Hajime Ohyanagi, Naoki Osato, Nobuhiko Tanaka, Yoshiyuki Suzuki, Kazuho Ikeo, Naruya Saitou, Hideaki Sugawara, Claire O'Donovan, Tamara Kulikova, Eleanor Whitfield, Brian Halligan, Mary Shimoyama, Simon Twigger, Kei Yura, Kouichi Kimura, Tomohiro Yasuda, Tetsuo Nishikawa, Yutaka Akiyama, Chie Motono, Yuri Mukai, Hideki Nagasaki, Makiko Suwa, Paul Horton, Reiko Kikuno, Osamu Ohara, Doron Lancet, Eric Eveno, Esther Graudens, Sandrine Imbeaud, Marie Anne Debily, Yoshihide Hayashizaki, Clara Amid, Michael Han, Andreas Osanger, Toshinori Endo, Michael A. Thomas, Mika Hirakawa, Wojciech Makalowski, Mitsuteru Nakao, Nam-Soon Kim, Hyang-Sook Yoo, Sandro J. De Souza, Maria de Fatima Bonaldo, Yoshihito Niimura, Vladimir Kuryshev, Ingo Schupp, Stefan Wiemann, Matthew Bellgard, Masafumi Shionyu, Libin Jia, Danielle Thierry-Mieg, Jean Thierry-Mieg, Lukas Wagner, Qinghua Zhang, Mitiko Go, Shinsei Minoshima, Masafumi Ohtsubo, Kousuke Hanada, Peter Tonellato, Takao Isogai, Ji Zhang, Boris Lenhard, Sangsoo Kim, Zhu Chen, Ursula Hinz, Anne Estreicher, Kenta Nakai, Izabela Makalowska, Winston Hide, Nicola Tiffin, Laurens Wilming, Ranajit Chakraborty, Marcelo Bento Soares, Maria Luisa Chiusano, Yutaka Suzuki, Charles Auffray, Yumi Yamaguchi-Kabata, Takeshi Itoh, Teruyoshi Hishiki, Satoshi Fukuchi, Ken Nishikawa, Sumio Sugano, Nobuo Nomura, Yoshio Tateno, Tadashi Imanishi, Takashi Gojobori
    NUCLEIC ACIDS RESEARCH 36 (D) D793 - D799 0305-1048 2008/01 [Not refereed][Not invited]
    Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views: Transcript view and Locus view and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.
  • Akihiro Matsuya, Ryuichi Sakate, Yoshihiro Kawahara, Kanako O. Koyanagi, Yoshiharu Sato, Yasuyuki Fujii, Chisato Yamasaki, Takuya Habara, Hajime Nakaoka, Fusano Todokoro, Kaori Yamaguchi, Toshinori Endo, Satoshi Oota, Wojciech Makalowski, Kazuho Ikeo, Yoshiyuki Suzuki, Kousuke Hanada, Katsuyuki Hashimoto, Momoki Hirai, Hisakazu Iwama, Naruya Saitou, Aiko T. Hiraki, Lihua Jin, Yayoi Kaneko, Masako Kanno, Katsuhiko Murakami, Akiko Ogura Noda, Naomi Saichi, Ryoko Sanbonmatsu, Mami Suzuki, Jun-Ichi Takeda, Masayuki Tanaka, Takashi Gojobori, Tadashi Imanishi, Takeshi Itoh
    NUCLEIC ACIDS RESEARCH 36 D787 - D792 0305-1048 2008/01 [Refereed][Not invited]
    Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is a prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes, we constructed a fully curated database of evolutionary features of human genes, called Evola. In the process of the ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.), either computationally detected or manually curated orthologs. Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In 'd(N)/d(S) view', natural selection on genes can be analyzed between human and other species. In 'Locus maps', all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect the Evola to serve as a comprehensive and reliable database to be utilized in comparative analyses for obtaining new knowledge about human genes. Evola is available at
  • Yutaka Satou, Katsuhiko Mineta, Michio Ogasawara, Yasunori Sasakura, Eiichi Shoguchi, Keisuke Ueno, Lixy Yamada, Jun Matsumoto, Jessica Wasserscheid, Ken Dewar, Graham B. Wiley, Simone L. Macmil, Bruce A. Roe, Robert W. Zeller, Kenneth E. M. Hastings, Patrick Lemaire, Erika Lindquist, Toshinori Endo, Kohji Hotta, Kazuo Inaba
    GENOME BIOLOGY 9 (10) R152  1474-760X 2008 [Refereed][Not invited]
    Background: The draft genome sequence of the ascidian Ciona intestinalis, along with associated gene models, has been a valuable research resource. However, recently accumulated expressed sequence tag (EST)/cDNA data have revealed numerous inconsistencies with the gene models due in part to intrinsic limitations in gene prediction programs and in part to the fragmented nature of the assembly. Results: We have prepared a less-fragmented assembly on the basis of scaffold-joining guided by paired-end EST and bacterial artificial chromosome (BAC) sequences, and BAC chromosomal in situ hybridization data. The new assembly (115.2 Mb) is similar in length to the initial assembly (116.7 Mb) but contains 1,272 (approximately 50%) fewer scaffolds. The largest scaffold in the new assembly incorporates 95 initial-assembly scaffolds. In conjunction with the new assembly, we have prepared a greatly improved global gene model set strictly correlated with the extensive currently available EST data. The total gene number (15,254) is similar to that of the initial set (15,582), but the new set includes 3,330 models at genomic sites where none were present in the initial set, and 1,779 models that represent fusions of multiple previously incomplete models. In approximately half, 5'-ends were precisely mapped using 5'-full-length ESTs, an important refinement even in otherwise unchanged models. Conclusion: Using these new resources, we identify a population of non-canonical (non-GT-AG) introns and also find that approximately 20% of Ciona genes reside in operons and that operons contain a high proportion of single-exon genes. Thus, the present dataset provides an opportunity to analyze the Ciona genome much more precisely than ever.
  • T Imanishi, T Itoh, Y Suzuki, C O'Donovan, S Fukuchi, KO Koyanagi, RA Barrero, T Tamura, Y Yamaguchi-Kabata, M Tanino, K Yura, S Miyazaki, K Ikeo, K Homma, A Kasprzyk, T Nishikawa, M Hirakawa, J Thierry-Mieg, D Thierry-Mieg, J Ashurst, LB Jia, M Nakao, MA Thomas, N Mulder, Y Karavidopoulou, LH Jin, S Kim, T Yasuda, B Lenhard, E Eveno, Y Suzuki, C Yamasaki, J Takeda, C Gough, P Hilton, Y Fujii, H Sakai, S Tanaka, C Amid, M Bellgard, MD Bonaldo, H Bono, SK Bromberg, AJ Brookes, E Bruford, P Carninci, C Chelala, C Couillault, SJ de Souza, MA Debily, MD Devignes, I Dubchak, T Endo, A Estreicher, E Eyras, K Fukami-Kobayashi, GR Gopinath, E Graudens, Y Hahn, M Han, ZG Han, K Hanada, H Hanaoka, E Harada, K Hashimoto, U Hinz, M Hirai, T Hishiki, I Hopkinson, S Imbeaud, H Inoko, A Kanapin, Y Kaneko, T Kasukawa, J Kelso, P Kersey, R Kikuno, K Kimura, B Korn, V Kuryshev, I Makalowska, T Makino, S Mano, R Mariage-Samson, J Mashima, H Matsuda, HW Mewes, S Minoshima, K Nagai, H Nagasaki, N Nagata, R Nigam, O Ogasawara, O Ohara, M Ohtsubo, N Okada, T Okido, S Oota, M Ota, T Ota, T Otsuki, D Piatier-Tonneau, A Poustka, SX Ren, N Saitou, K Sakai, S Sakamoto, R Sakate, I Schupp, F Servant, S Sherry, R Shiba, N Shimizu, M Shimoyama, AJ Simpson, B Soares, C Steward, M Suwa, M Suzuki, A Takahashi, G Tamiya, H Tanaka, T Taylor, JD Terwilliger, P Unneberg, V Veeramachaneni, S Watanabe, L Wilming, N Yasuda, HS Yoo, M Stodolsky, W Makalowski, M Go, K Nakai, T Takagi, M Kanehisa, Y Sakaki, J Quackenbush, Y Okazaki, Y Hayashizaki, W Hide, R Chakraborty, K Nishikawa, H Sugawara, Y Tateno, Z Chen, M Oishi, P Tonellato, R Apweiler, K Okubo, L Wagner, S Wiemann, RL Strausberg, T Isogai, C Auffray, N Nomura, T Gojobori, S Sugano
    PLOS BIOLOGY 2 (6) 856 - 875 1545-7885 2004/06 [Refereed][Not invited]
    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for nonprotein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
  • T Endo, S Ogishima, H Tanaka
    JOURNAL OF MOLECULAR EVOLUTION 57 S174 - S181 0022-2844 2003 [Refereed][Not invited]
    Functional evolution is often driven by positive natural selection. Although it is thought to be rare in evolution at the molecular level, its effects may be observed as accelerated evolutionary rates. Therefore one of the effective ways to identify functional evolution is to identify accelerated evolution. Many methods have been developed to test the statistical significance of the accelerated evolutionary rate by comparison with the appropriate reference rate. The rates of synonymous substitution are one of the most useful and popular references, especially for large-scale analyses. On the other hand, these rates are applicable only to a limited evolutionary time period because they saturate quickly, i.e., multiple substitutions happen frequently because of the lower functional constraint. The relative rate test is an alternative method. This technique has an advantage in terms of the saturation effect but is not sufficiently powerful when the evolutionary rate differs considerably among phylogenetic lineages. To provide a universal reference tree, we propose a method to construct a standardized tree which serves as the reference for accelerated evolutionary rate. The method is based upon multiple molecular phylogenies of single genes with the aim of providing higher reliability. The tree has averaged and normalized branch lengths with standard deviations for statistical neutrality limits. The standard deviation also suggests the reliability level of the branch order. The resulting tree serves as a reference tree for the reliability level of the branch order and the test of evolutionary rate acceleration even when some of the species lineages show an accelerated evolutionary rate for most of their genes due to bottlenecking and other effects.
  • T Sasaki, T Matsumoto, K Yamamoto, K Sakata, T Baba, Y Katayose, JZ Wu, Y Niimura, ZK Cheng, Y Nagamura, BA Antonio, H Kanamori, S Hosokawa, M Masukawa, K Arikawa, Y Chiden, M Hayashi, M Okamoto, T Ando, H Aoki, K Arita, M Hamada, C Harada, S Hijishita, M Honda, Y Ichikawa, A Idonuma, M Iijima, M Ikeno, S Ito, T Ito, Y Ito, Y Ito, A Iwabuchi, K Kamiya, W Karasawa, S Katagiri, A Kikuta, N Kobayashi, Kono, I, K Machita, T Maehara, H Mizuno, T Mizubayashi, Y Mukai, H Nagasaki, M Nakashima, Y Nakama, Y Nakamichi, M Nakamura, N Namiki, M Negishi, Ohta, I, N Ono, S Saji, K Sakai, M Shibata, T Shimokawa, A Shomura, JY Song, Y Takazaki, K Terasawa, K Tsuji, K Waki, H Yamagata, H Yamane, S Yoshiki, R Yoshihara, K Yukawa, HS Zhong, H Iwama, T Endo, H Ito, JH Hahn, HI Kim, MY Eun, M Yano, JM Jiang, T Gojobori
    NATURE 420 (6913) 312 - 316 0028-0836 2002/11 [Refereed][Not invited]
    The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops (1-4). Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained from the sequence. The analysis of 43.3 megabases (Mb) of non-overlapping sequence reveals 6,756 protein coding genes, of which 3,161 show homology to proteins of Arabidopsis thaliana, another model plant. About 30% (2,073) of the genes have been functionally categorized. Rice chromosome 1 is (G + C)-rich, especially in its coding regions, and is characterized by several gene families that are dispersed or arranged in tandem repeats. Comparison with a draft sequence (5) indicates the importance of a high-quality finished sequence.
  • T Endo, A Fedorov, SJ de Souza, W Gilbert
    MOLECULAR BIOLOGY AND EVOLUTION 19 (4) 521 - 525 0737-4038 2002/04 [Refereed][Not invited]
    Are intron positions correlated with regions of high amino acid conservation? For a set of ancient conserved proteins, with intronless prokaryotic but intron-containing eukaryotic homologs, multiple sequence alignments identified residues invariant throughout evolution. Intron positions between codons show no preferences. However, introns lying after the first base of a codon prefer conserved regions, markedly in glycines. Because glycines are in excess in conserved regions, this behavior could reflect phase-one introns entering glycine residues randomly in the ancestral sequences. Examination of intron positions within codons of evolutionarily invariable amino acids showed that roughly 50% of these introns are bordered by guanines at both 5'- and 3'-ends, 25% have a G only before the intron, and 5% have a G only after the intron, whereas about 20% are bordered by nonguanine bases.
  • Harukazu Suzuki, Yoshifumi Fukunishi, Ikuko Kagawa, Rintaro Saito, Hiroshi Oda, Toshinori Endo, Shinji Kondo, Hidemasa Bono, Yasushi Okazaki, Yoshihide Hayashizaki
    Genome Research 11 (10) 1758 - 1765 1088-9051 2001 [Refereed][Not invited]
    We have developed a novel assay system for systematic analysis of protein-protein interactions (PPIs) that is characterized by PCR-mediated rapid sample preparation and a high-throughput assay system based on the mammalian two-hybrid method. Using gene-specific primers, we successfully constructed the assay samples by two rounds of PCR with up to 3.6 kb from the first-round PCR fragments. In the assay system, we designed all the steps to be performed by adding only samples, reagents, and cells into 384-well assay plates using two types of semiautomatic multiple dispensers. The system enabled us to examine more than 20,000 assay wells per day. We detected 145 interactions in our pilot study using 3500 samples derived from mouse full-length enriched cDNAs. Analysis of the interaction data showed both several significant interaction clusters and predicted functions of a few uncharacterized proteins. In combination with our comprehensive mouse full-length cDNA clone bank covering a large part of the whole genes, our high-throughput assay system will discover many interactions to facilitate understanding of the function of uncharacterized proteins and the molecular mechanism of crucial biological processes, and also enable completion of a rough draft of the entire PPI panel in certain cell types or tissues of mouse within a short time.
  • Y Sugahara, P Carninci, M Itoh, K Shibata, H Konno, T Endo, M Muramatsu, Y Hayashizaki
    GENE 263 (1-2) 93 - 102 0378-1119 2001/01 [Refereed][Not invited]
    To enhance the usefulness of the laboratory mouse and to facilitate the rapid assay of gene functions we have been collecting the entire set of mouse full-length cDNA by one-pass sequencing. To collect full-length cDNA clones efficiently, it is critical to construct high-quality cDNA libraries. In recent years, we have been developing a way to construct full-length cDNA libraries by using biotinylation of the cap structure (the 'CAP-trapper' method) coupled with treatment to increase reverse transcriptase efficiency at high temperature by the addition of trehalose. In this paper we report our evaluation of the quality of CAP trapper and a number of other full-length cDNA libraries, including the results of 5' end analysis of clones in CAP trapper and the other libraries. We used a procedure that compared the 5'-ends of cDNA clones with those of genes in the public databases. Our analysis showed that 63% of cDNA clones in CAP trapper libraries had sequences that were either the same length as those of equivalent genes in the public database or 5'-extended, and that 90% of these clones maintained their coding sequences. These results indicate that the CAP trapper library is a promising tool for collecting full-length cDNA in large-scale projects. Comparison of the quality of CAP trapper with that of other full-length-cDNA libraries confirmed the value of these libraries. (C) 2001 Elsevier Science B.V. All rights reserved.
  • Konno Hideaki, Fukunishi Yoshifumi, Endo Toshinori, Hayashizaki Yoshihide
    Genome Informatics Japanese Society for Bioinformatics 10 288 - 289 0919-9454 1999
  • Endo Toshinori, Yamanaka Itaru, Konno Hideki, Fukunishi Yoshifumi, Kawai Jun, Suzuki Harukazu, Ozawa Yasuhiro, Shibata Kazuhiro, Yoshino Masayasu, Itoh Masayoshi, Carninci Piero, Okazaki Yasushi, Hayashizaki Yoshihide
    Genome Informatics Japanese Society for Bioinformatics 10 338 - 339 0919-9454 1999 [Refereed][Not invited]
  • Takeshi Ishimizu, Toshinori Endo, Yumi Yamaguchi-Kabata, Kazuo T. Nakamura, Fumio Sakiyama, Shigemi Norioka
    FEBS Letters 440 (3) 337 - 342 0014-5793 1998/12/04 [Refereed][Not invited]
    A stylar S-RNase is associated with gametophytic self-incompatibility in the Rosaceae, Solanaceae, and Scrophulariaceae. This S-RNase is responsible for S-allele-specific recognition in the self-incompatible reaction, but how it functions in specific discrimination is not clear. Window analysis of the numbers of synonymous (d(S)) and non-synonymous (d(N)) substitutions in rosaceous S-RNases detected four regions with an excess of d(N) over d(S) in which positive selection may operate (PS regions). The topology of the secondary structure of the S-RNases predicted by the PHD method is very similar to that of fungal RNase Rh whose tertiary structure is known. When the sequences of S-RNases are aligned with the sequence of RNase Rh based on the predicted secondary structures, the four PS regions correspond to two surface sites on the tertiary structure of RNase Rh. These findings suggest that in S-RNases the PS regions also form two sites and are candidates for the recognition sites for S-allele-specific discrimination. Copyright (C) 1998 Federation of European Biochemical Societies.
  • T Endo, T Imanishi, T Gojobori, H Inoko
    GENE 210 (2) 351 - + 0378-1119 1998/04 [Refereed][Not invited]
  • Toshinori Endo, Tadashi Imanishi, Takashi Gojobori, Hidetoshi Inoko
    GENE 205 (1/2) 19 - 27 1997/12 [Refereed][Not invited]
  • Y Tateno, K Ikeo, T Imanishi, H Watanabe, T Endo, Y Yamaguchi, Y Suzuki, K Takahashi, K Tsunoyama, M Kawai, Y Kawanishi, K Naitou, T Gojobori
    JOURNAL OF MOLECULAR EVOLUTION 44 S38 - S43 0022-2844 1997 [Refereed][Not invited]
    We developed a method for multiple alignment of protein sequences. The main feature of this method is that it takes the evolutionary relationships of the proteins in question into account repeatedly for execution, until the relationships and alignment results are in agreement. We then applied this method to the data of the international DNA sequence databases, which are the most comprehensive and updated DNA databases in the world, in order to estimate the "evolutionary motif" by extensive use of a supercomputer. Though a few problems needed to be solved, we could estimate the length of the motifs in the range of 20 to 200 amino acids, with about 60 the most frequent length. We then discussed their biological and structural significance. We believe that we are now in a position to analyze DNA and protein not only in vivo and in vitro but also in silico.
  • A Fumihito, T Miyake, M Takada, R Shingu, T Endo, T Gojobori, N Kondo, S Ohno
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA 93 (13) 6792 - 6795 0027-8424 1996/06 [Refereed][Not invited]
    With the aim of elucidating in greater detail the genealogical origin of the present domestic fowls of the world, we have determined mtDNA sequences of the D-loop regions for a total of 21 birds, of which 12 samples belong to red junglefowl (Gallus gallus) comprising three subspecies (six Gallus gallus gallus, three Gallus gallus spadiceus, and three Gallus gallus bankiva) and nine represent diverse domestic breeds (Gallus gallus domesticus). We also sequenced four green junglefowl (Gallus varius), two Lafayette's Junglefowl (Gallus lafayettei), and one grey junglefowl (Gallus sonneratii). We then constructed a phylogenetic tree for these birds by the use of nucleotide sequences, choosing the Japanese quail (Coturnix coturnix japonica) as an outgroup. We found that a continental population of G. g. gallus was the real matriarchic origin of all the domestic poultries examined in this study. It is also of particular interest that there were no discernible differences among G. gallus subspecies; G. g. bankiva was a notable exception. This was because G. g. spadiceus and a continental population of G. g. gallus formed a single cluster in the phylogenetic tree. G. g. bankiva, on the other hand, was a distinct entity, thus deserving its subspecies status. It implies that a continental population of G. g. gallus sufficed as the monophyletic ancestor of all domestic breeds. We also discussed a possible significance of the initial dispersal pattern of the present domestic fowls, using the phylogenetic tree.
  • T Endo, K Ikeo, T Gojobori
    MOLECULAR BIOLOGY AND EVOLUTION 13 (5) 685 - 690 0737-4038 1996/05 [Refereed][Not invited]
    We conducted a systematic search for the candidate genes on which positive selection may operate, on the premise that for such genes the number of nonsynonymous substitutions is expected to be larger than that of synonymous substitutions when the nucleotide sequences of the genes under investigation are compared with each other. By obtaining 3,595 groups of homologous sequences from the DDBJ, EMBL, and GenBank DNA sequence databases, we found that 17 gene groups can be the candidates for the genes on which positive selection may operate. Thus, such genes are found to occupy only about 0.5% of the vast number of gene groups so far available. Interestingly enough, 9 out of the 17 gene groups were the surface antigens of parasites or viruses.
  • Large-scale search for genes on which positive selection may operate.
    Toshinori Endo
    Dissertation. Graduate School for Advanced Studies 1995 [Refereed][Not invited]
  • T Gojobori, T Endo, K Ikeo
    ONCOGENE 9 (11) 3305 - 3311 0950-9232 1994/11 [Refereed][Not invited]
    Transcription factor AP-1 is comprised of multiple protein complexes that include members of a family of genes related to the proto-oncogene c-fos. In this report, we have extended the analysis of one member of this family, fos-related antigen-2 (fra-2), by isolating and characterising genomic and cDNA clones encoding the mouse fra-2 homolog. The overall gene structure (number and positions of introns) was similar to that of both the chicken fra-2 gene and other members of the fos family, and the relative positions of putative enhancers in the 5' regulatory region were well conserved between the mouse and chicken fra-2 genes. High levels of fra-2 mRNA were detected in ovary, stomach, small and large intestine, brain, lung and heart. The mouse Fra-2 protein showed 94% and 87.5% conservation with human and chicken Fra-2, respectively, and mouse Fra-2, like the chicken homolog, induced transformation of chicken embryo fibroblasts. The characterisation of the mouse fra-2 gene provides a basis for analysis of Fra-2 function in the whole animal.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA 90 (20) 9369 - 9373 0027-8424 1993/10 [Refereed][Not invited]
    Although a replication-competent retrovirus that carries junD has no transforming activity in chicken embryo fibroblasts, we have isolated mutant viruses that have spontaneously acquired transforming activity. The molecularly cloned junD genes of three such mutant viruses (T1, T2, and T3) were shown to be responsible for the cellular transformation. DNA sequence analysis indicated that a specific polynucleotide in the junD sequence was tandemly multiplied three times or five times in T1 and T2, respectively. The repeated polynucleotide encodes 16 amino acid residues that are located in a highly conserved region among Jun family proteins. The junD mutation in T3 involved an inversion, a translocation, and nucleotide substitutions that caused drastic amino acid exchanges in another well-conserved region among Jun family proteins. The transcriptional activity of these mutants was analyzed by means of transient expression experiments in F9 cells using a reporter gene containing a single AP-1 binding site. Compared with the wild-type JunD, none of them showed enhanced transactivating activity in the forms of homodimers or of heterodimers with c-Fos or Fra-1. However, they did exhibit much higher transactivating activity than the wild type when they formed heterodimers with Fra-2, indicating that the mutated regions function as transactivation domains in a partner-specific manner. Since we have previously reported that there is a basal level of Fra-2 expression in chicken embryo fibroblasts, the results may indicate that protein complexes between JunD mutants and Fra-2 play a crucial role in the cellular transforming activity.
    NUCLEIC ACIDS RESEARCH 19 (20) 5537 - 5542 0305-1048 1991/10 [Refereed][Not invited]
    Fra-2, one of the Fos-related antigens, is promptly expressed after the growth stimulation of fibroblasts, but its induction peak is later than that of c-Fos. In this report, we examined biochemical properties of Fra-2 and compared them with those of two other Fos family proteins, c-Fos and Fra-1. Like c-Fos and Fra-1, Fra-2 formed stable heterodimers with c-Jun, JunB or JunD in vitro and all these complexes had specific DNA-binding activity to AP-1-binding sites (AP-1 sites) or related sequences. When transiently introduced into a mouse embryonic carcinoma cell line, F9, with reporter genes containing the AP-1 site from the collagenase gene, fra-2 plus c-jun suppressed the transactivation by c-jun alone. This property of Fra-2 is in clear contrast to that of c-Fos, which stimulates the transcriptional activity of c-Jun by forming a stable heterodimer. Analysis of chimeric proteins between c-Fos and Fra-2 indicated that this difference is mainly attributable to their C terminal-half regions. Interestingly, this suppressive effect of Fra-2 was not observed in the combination with JunD: fra-2 plus junD, like c-fos plus junD, had higher transcriptional activity than junD alone. Fra-1 showed essentially the same transcriptional regulatory properties as Fra-2. These differential properties greatly expand the potential range of regulatory functions of the Fos family proteins.

Books etc

Conference Activities & Talks

  • Evolutionary analysis of Mollicutes based on standardized phylogenetic tree  [Not invited]
    Toshinori Endo


Research Grants & Projects

  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)
    Date (from‐to) : 2014/04 -2018/03 
    Author : Endo Toshinori
    Gene spectra for 314 animal species across ten phyla were investigated and organized based on the levels of taxonomic classification: phylum, class, order, family, and genus. The number of genes in common across all investigated phyla was 1,908, most of which were fundamental core genes necessary to maintain life activities. A total of 1,299 lineage-specific genes were also determined, the majority of which were mitochondrial genes, suggesting animal-specific mechanisms requiring those genes. Interestingly, there were genes absent only in vertebrates. These findings would provide key information for understanding the process of evolution.
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
    Date (from‐to) : 2005 -2009 
    Author : GOJOBORI Takashi, IKEO Kazuho, SUZUKI Yoshiyuki, ENDO Toshinori, MINETA Katsuhiko, OGURA Atsushi
    We developed three-dimensional brain databases for human, mouse, and tunicate, which constitute the platform for constructing disease-information models of neuronal diseases. We also mirrored the three-dimensional mouse brain database of Paul Allen Brain Institute of the United States, and compared gene expression patterns in human and mouse brains. In particular, we clarified the protein-protein interaction network of the proteins associated with Alzheimer's disease.
  • Development of GPGPU based multiple sequence alignment method
    Date (from‐to) : 2008
  • Prediction of the function of hypothetical genes based on the information of orphan enzymes
    Date (from‐to) : 2005

Educational Activities

Teaching Experience

  • Information Biology   Hokkaido University
  • Informatics   Hokkaido University
  • Information Biology
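    Academic year : 2021
    Course level : Master's course
    Offering school : Graduate School of Information Science and Technology (情報科学研究科)
    Keywords : bioinformatics, genome and proteome, molecular evolution, gene expression, biological databases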
  • Information Biology
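    Academic year : 2021
    Course level : Master's course
    Offering school : Graduate School of Information Science and Technology (情報科学院)
    Keywords : bioinformatics, genome and proteome, molecular evolution, gene expression, biological databases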
  • Information Biology
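    Academic year : 2021
    Course level : Doctoral course
    Offering school : Graduate School of Information Science and Technology (情報科学研究科)
    Keywords : bioinformatics, genome and proteome, molecular evolution, gene expression, biological databases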
  • Information Biology
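    Academic year : 2021
    Course level : Doctoral course
    Offering school : Graduate School of Information Science and Technology (情報科学院)
    Keywords : bioinformatics, genome and proteome, molecular evolution, gene expression, biological databases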
  • Biology I
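    Academic year : 2021
    Course level : Bachelor's course
    Offering school : General Education (全学教育)
    Keywords : biological macromolecules, cell structure and function, energy metabolism, cell growth and division, heredity and regulation of gene expression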
  • Cell biology
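    Academic year : 2021
    Course level : Bachelor's course
    Offering school : School of Engineering (工学部)
    Keywords : cell membrane, membrane proteins, membrane transport, ion channels, electrical properties of membranes, intracellular compartments, protein transport, signal transduction, cellular messengers, apoptosis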

Committee Membership

  • 2019/04 - Today   Japanese Society for Bioinformatics   Chair of Hokkaido Region
  • 2020/09 -2022/10   Genetics Society of Japan   Congress Chair of the 94th Annual Meeting
  • 2016/09 -2019/03   Japanese Society for Bioinformatics   Council
  • 2016/10 -2017/12   Informatics In Biology, Medicine and Pharmacology (IIBMP) / Annual Meetings of Japan Society of Bioinformatics   Chair
  • 2011   Society for Molecular Biology and Evolution   Annual meeting committee member
