SEARCH

Search Details

Arimura Hiroki

Faculty of Information Science and Technology Computer Science and Information Technology Knowledge Software ScienceProfessor
Research Center of Mathematics for Social CreativityProfessor
Institute for Academic InnovationProfessor

Hiroki Arimura is a Professor in the Graduate School of Information Science and Technology of Hokkaido University. After receiving his PhD from Kyushu University in Computer Science in 1994, his primary interests has been in data mining, information retrieval, machine learning, algorithms and data structures, and their applications. He has served on the program committees of a number of international conferences including PAKDD, DS, ALT, PKDD, KDD and others. He was the director of the Global COE (Centers of Excellence) Program of "Center for Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation" at Hokkaido University by MEXT (Ministry of Education, Culture, Sports, Science and Technology of Japan) for 2007-2011.

Researcher basic information

■ Degree
  • Doctor of Science (D.Sc), Kyushu University, Jun. 1994
  • Master, Kyushu University, Mar. 1990
  • Bachelor of Science (B.Sc), Kyushu University, Mar. 1988
■ URL
researchmap URLホームページURL■ Various IDs
ORCID IDJ-Global ID■ Research Keywords and Fields
Research Keyword
  • computational learning theory
  • Computer Science
  • Artificial Intelligence
  • data structures and algorithms
  • data mining
  • information retrieval
  • machine learning
  • enumeration algorithms
  • graph
  • decision trees
  • text index
  • string algorithm
  • semi-structured data
  • ウェブマイニング
  • pattern mining
  • explanable machine learning
  • database
  • spatio-temporal data
  • compression
  • stream processing
  • knowledge discovery
  • big data
Research Field
  • Informatics, Theory of informatics, Design and analysis of algorithms
  • Informatics, Database, data mining/Information Retrieval
  • Informatics, Intelligent informatics, machine learning
■ Educational Organization

Career

■ Career
Career
  • Apr. 2021 - Present
    The Global Institution for Collaborative Research and Education (GI-CoRE), Hokkaido University, Associate faculty, Associate faculty, https://www.global.hokudai.ac.jp/research-education/global-research-and-education, Japan
  • Apr. 2019 - Present
    Hokkaido University, Graduate School of Information Science and Technology, Professor
  • Apr. 2020 - Mar. 2026
    Japan Science and Technology Agency (JST), Basic Research program, PRESTO Area "The fundamental technologies for Trustworthy AI" (Trustworthy AI), Research Supervisor (area director)
  • Oct. 2023 - Mar. 2025
    National Institute of Advanced Industrial Science and Technology (AIST), Artificial Intelligence Research Center (AIRC), Invited researcher, Japan
  • Mar. 2025
    Hokkaido University, Research Center of Mathematics for Social Creativity Research Institute for Electronic Science, Adjunct Professor, Japan
  • Apr. 2018 - Mar. 2022
    Hokkaido University Collaborative Project Center, Knowledge Media Laboratory, Leader (director), Japan
  • Jan. 2008 - Mar. 2022
    National Institute of Informatics, 連携研究部門, Adjunct professor
  • Apr. 2016 - Mar. 2021
    Global Station for Big Data and Cybersecurity (GSB), Global Institution for Collaborative Research and Education (GI-CoRE),Hokkaido University, PI (adjunct), Japan
  • Apr. 2004 - Mar. 2019
    Hokkaido University, Graduate School of Information Science and Technology, Professor
  • Apr. 2016 - Mar. 2018
    Global Station for Big Data and Cybersecurity (GSB), Global Institution for Collaborative Research and Education (GI-CoRE), Director, Web site: https://gi-core.oia.hokudai.ac.jp/main/, Japan
  • Apr. 2015 - Mar. 2017
    Hokkaido University Collaborative Project Center, Knowledge Media Laboratory, leader (director)
  • Apr. 2013 - Mar. 2015
    Hokkaido University, Knowledge Media Laboratory, director
  • Jul. 2007 - Mar. 2012
    Graduate School of Information Science and Technology, Hokkaido University, MEXT/JSPS Global COE program, “ Next Generation Information Technology Bases to Support Knowledge Creation”, Director, Japan
  • Mar. 2005 - Jun. 2005
    Université Claude Bernard Lyon 1, Department of Computer Science, Visiting Researcher supported by MEXT (Ministry of Education, Culture, Sports, Science and Technology), Host professor: Pr. Mohand-Saïd Hacid, France
  • Apr. 1996 - Mar. 2004
    Kyushu University, Graduate School of Information Science and Electrical Engineering, Associate Professor
  • Apr. 2002 - Mar. 2003
    Kyushu University, Computing Center/Reseach Institute of Information Technology, Adjunct associate professor
  • Apr. 2001 - Mar. 2002
    Kyushu University, University Library, R&D division, Adjunt associate professor
  • Oct. 1999 - Mar. 2002
    JST PRESTO, "Information and Intelligence, Adjunct researcher
  • May 1996 - Oct. 1996
    University of Helsinki, Department of Computer Science, Visiting researcher, supported by a JSPS (Japanese Society of Promotion of Science) exchange program, Host professor: Esko Ukkonen, Finland
  • Apr. 1995 - Mar. 1996
    Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Associate Professor
  • Apr. 1994 - Mar. 1995
    Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Lecturer
  • Apr. 1990 - Mar. 1994
    Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Research associate (Associate professor), Japan
Educational Background
  • Apr. 1988 - Mar. 1990, Kyushu University, Interdisciplinary Graduate School of Engineering Sciences, Division of Information Systems, Master Course
  • Apr. 1984 - Mar. 1988, Kyushu University, Faculty of Science, Department of Physics
Committee Memberships
  • Oct. 2023 - Present
    Science Council of Japan (SCJ), Member (Section III: Physical Sciences and Engineering, Informatics), Government
  • Apr. 2020 - Mar. 2026
    Japan Science and Technology Agency (JST), Research Supervisor (area director), Basic Research program, PRESTO, area "The fundamental technologies for Trustworthy AI" (Trustworthy AI), Government
  • Oct. 2020 - Sep. 2023
    Science Council of Japan, Associate member (Section III: Physical Sciences and Engineering, Informatics), Government
  • Apr. 2018 - Mar. 2020
    Japan Society for the Promotion of Science, Advisory Board Member (Frontiers of Science Symposia)
  • Jun. 2017 - May 2019
    Japanese Society for Artificial Intelligence (JSAI), Delegate, Society
  • Jun. 2017 - Mar. 2019
    Ministry of Education, Culture, Sports, Science and Technology (MEXT), Associate Member, Council for Science and Technology, Government
  • Apr. 2011 - Mar. 2018
    Japan Society for the Promotion of Science (JSPS), Advisory Board Member (Junior Member), Frontiers of Science Symposia, Society
  • Jun. 2015 - May 2016
    Japanese Society for Artificial Intelligence, Program Committee Chair, the 30th Annual Conference of JSAI (JSAI 2016), Society
  • Jun. 2014 - May 2016
    Japanese Society for Artificial Intelligence (JSAI), Board Member, Society
  • Dec. 2014 - Apr. 2015
    Graduate School of Information Sciences, Tohoku University, External evaluation committee member:, Others
  • May 2008 - Apr. 2014
    IEICE, Expert Member, IEICE Technical Committee on Computation, Society
  • 2008 - 2014
    Japan Science and Technology Agency (JST), Advisor, Basic Research program, PRESTO, area “ Synthesis of Knowledge for Information Oriented Society” (research supervisor: Hideyuki Nakashima), Government
  • 2007 - Sep. 2013
    International Conference on Discovery Science, Steering committee member, International Conference on Discovery Science, Society
  • May 2006 - Apr. 2008
    IEICE, Vice Chair, IEICE Technical Committee on Computation, Society
  • Apr. 2006 - Mar. 2008
    The Japanese Society for Artificial Intelligence (JSAI), Chair, JSAI Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI), Society
  • 2008 - 2008
    Japan Society for the Promotion of Science (JSPS), Planning Group Member (PGM), UJFoS2008, Society
  • 2006 - 2007
    Center for Research and Development Strategy, Japan Science and Technology Agency, Leader, Panel B, Science and Technology Foresight Strategy Workshop "Prediction and Discovery", Society
  • 2006 - 2006
    1st Int'l Workshop on Data Mining and Statistical Science (DMSS-2006), Sapporo, Chair, Society
  • 2004 - 2006
    The Japanese Society for Artificial Intelligence (JSAI), Councilor (representative),, Society
  • 2004 - 2006
    情報処理学会, データベース研究会研究運営委員, Society
  • 2004 - 2006
    The Japanese Society for Artificial Intelligence (JSAI), Committee Member, JSAI Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI), Society
  • 2003 - 2005
    Japan Society for the Promotion of Science (JSPS), Planning Group Member (PGM), Japanese-German Frontiers of Science Symposium (JGFoS 2004-2005), Society
  • 1999 - 2000
    the 11th International Conference on Algorithmic Learning Theory (ALT'00, Sydney), Program Committee Co-chairs, Society
Position History
  • 知識メディア・ラボラトリー長, 2013年4月1日 - 2015年3月31日

Research activity information

■ Awards
  • Jun. 2024, The Symposium on Combinatorial Pattern Matching (CPM, established in 1994, the 35th in its 2024 edition), Test of Time Award (CPM Test of Time Award in year 2024)
    Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications (CPM 2021)
    Toru Kasai;Gunho Lee;Hiroki Arimura;Setsuo Arikawa;Kunsoo Park, Organization: The Symposium on Combinatorial Pattern Matching (CPM) was established in 1990. This year's conference is the 35th edition since the first CPM was held in Paris in 1990. Criteria: "The award is granted to outstanding papers in algorithms research that were published in the CPM proceedings at least 20 years ago and which are still influential and stimulating for the field today." (from the Award page of CPM 2024: )
    Paper: Toru Kasai, Gunho Lee, Hiroki Arimura, Setsuo Arikawa, Kunsoo Park, ``Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications,'' Proc.~CPM 2001, Springer, LNCS, pp.181-192, Jerusalem, Israel, July 1–4, 2001., International society, Japan
  • Jun. 2022, IPSJ, 2022 IPSJ Computer Science Research Award for Young Scientists
    Tsubasa Oizumi, Hiroki Arimura: Efficient Algorithms for Cartesian Tree Subsequence Matching, IPSJ Technical Report, Vol.2022-AL-186,No.3,1-8, Jan. 2022
    Tsubasa Oizumi
  • Jun. 2022, The Japanese Society for Artificial Intelligence, Best Paper Award
    Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
    Kentaro Kanamori;Takuya Takagi;Ken Kobayashi;Hiroki Arimura, 35135885;35136410;11763074
  • Jan. 2019, JSAI, Best Presentation Award
    Fairness-aware Edit of a Learned Decision Tree Using Integer Linear Programming
    Kentaro Kanamori;Hiroki Arimura
  • Jun. 2016, Information Processing Society Japan (IPSJ), Best Paper Award
    Practical Algorithms for Mining Flock Patterns from Trajectories
    Xiaoliang Geng;Takeaki Uno;Hiroki Arimura, Official journal
  • Mar. 2016, IPSJ, 2016 Yamashita Award
    Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching, 2015-BIO-42,2015-MPS-104, 2015/6
    Yoichi Sasaki
  • 2015, IPSJ, Student Encouragement Award of IPSJ National Convention
    Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching
    Yoichi Sasaki
  • Jun. 2013, The Japanese Society for Artificial Intelligence, JSAI Incentive Award
    Efficient Enumeration of Acyclic Sub-hypergraphs in Hypergraphs
    Kunihiro Wasa;Hiroki Arimura;Takeaki Uno;Kouichi Hirata
  • Jun. 2010, IEICE, Inf. & Syst. Society Best Paper Award (4 year award)
    「ゼロサプレス型二分決定グラフを用いたトランザクションデータベースの 効率的解析手法」
    Shin-ichi Minato;Hiroki Arimura
  • Feb. 2005, Japan Database Society Japan, Kambayashi memorial award
    Hiroki Arimura
  • Nov. 2004, 2nd Workshop on Frequent Itemset Mining Implementations (FIMI'04), in conjunction with IEEE ICDM'04, BEST IMPLEMENTATION AWARD
    Takeaki Uno;Masashi Kiyomi;Hiroki Arimura
  • Nov. 2004, Japan Society of Artificial Intelligence, 2004 SIG Best Paper Award
    「大規模系列データから代表的な頻出エピソードを発見するための効率よいアルゴリズム」
    Hiroki Arimura;Takeaki Uno
  • Jul. 2004, IEICE SIG-DE, DEWS2004 Best paper award
    「半構造データマイニングのための高速な無順序木パターン発見手法」
    Shinji Fusanobu;Tatsuya Asai;Hiroki Arimura;Takeaki Uno;Shin-ichi Nakano
  • Jun. 2003, 電子情報通信学会DE研究会第14回データ工学ワークショップ, DEWS2003最優秀論文賞
    「領域効率の良い頻出データアイテム発見アルゴリズム」
    川副真治;有村博紀
  • May 2002, 電子情報通信学会DE研究会第14回データ工学ワークショップ, DEWS2002優秀論文賞
    浅井達哉;安部賢治;川副真治;坂本比呂志;有村博紀;有川節夫
  • May 2001, 人工知能学会, 2000年度論文賞
    「テキストデータからの高速データマイニング」, 安部潤一郎, 藤野亮一, 下薗真一, 有村博紀, 有川節夫(2000年7月掲載)
    安部潤一郎;藤野亮一;下薗真一;有村博紀;有川節夫
  • Apr. 2000, PAKDD2000, Paper with Merit Award
    "Discovering unordered and ordered phrase association patterns for text mining"
    Ryoichi Fujino;Hiroki Arimura;Setsuo Arikawa
  • Dec. 1999, 人工知能学会, 1999年度全国大会優秀論文賞
    「 大規模テキストデータからの探索的文書ブラウジング」
    板井力;笠井透;有村博紀;有川節夫
  • Mar. 1998, 情報処理学会, 第55回全国大会大会優秀賞
    「最適パタン発見に基づくテキストデータマイニング」
    有村博紀;渡木厚;藤野亮一;有川節夫
  • Jul. 1992, 人工知能学会, 全国大会優秀論文賞
    「極小汎化に基づくPROLOGプログラムの正事実からの多項式時間推論」
    有村博紀;石坂裕毅;篠原武
  • Jun. 1992, 人工知能学会, 1992年度研究奨励賞
    「Polynomial Time Inference of Unions of Tree Pattern Languages」
    有村博紀;石坂裕毅;篠原武
■ Papers
  • Computing Minimal Absent Words and Extended Bispecial Factors with CDAWG Space
    Shunsuke Inenaga; Takuya Mieno; Hiroki Arimura; Mitsuru Funakoshi; Yuta Fujishige
    Combinatorial Algorithms - 35th International Workshop, IWOCA 2024, Tainan, Taiwan, June 7-10, 2023, Proceedings. Lecture Notes in Computer Science, Springer, 14764, 327, 340, Jul. 2024, [Peer-reviewed], [International Magazine]
    English, Scientific journal, 35135885;35135762;35136410
  • Finding Diverse Strings and Longest Common Subsequences in a Graph.
    Yuto Shida; Giulia Punzi; Yasuaki Kobayashi; Takeaki Uno; Hiroki Arimura
    The 35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024, 296, 27:1, 27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informati, 27 Jun. 2024, [Peer-reviewed], [Corresponding author], [International Magazine]
    English, International conference proceedings, This paper introduces the Diverse Longest Common Subsequences (LCSs) problem under Hamming distance. Given a set of strings, the goal is to find a subset of K LCSs with diversity at least Δ, measured by either the sum or minimum of pairwise Hamming distances. We analyze the computational complexity of this problem, providing polynomial-time algorithms for bounded K and approximation algorithms for unbounded K. Moreover, we presented a few hardness results. The above results are obtained in a more general setting where strings are represented by a DAG, and prove that our positive results hold in this setting., 35135885;35136410;11763074
  • Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph.
    Hiroki Arimura; Shunsuke Inenaga; Yasuaki Kobayashi; Yuto Nakashima; Mizuki Sue
    Proceedings of the 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023), 14240, 28, 34, Lecture Notes in Computer Science, Sep. 2023, [Peer-reviewed], [Lead author, Corresponding author], [International Magazine]
    English, International conference proceedings, This paper investigates the computational complexity of converting a Compact Directed Acyclic Word Graph (CDAWG) into various text indexing structures suitable for highly repetitive texts. By extending previous work, we present optimal construction algorithms in linear time and space for the run-length BWT, irreducible PLCP array, quasi-irreducible LPF array, lex-parse, and LZ77-parse from a CDAWG or its self-index version. Our algorithms consist of novel techniques for characterizing these arrays in terms of a CDAWG and efficiently enumerating specific subsets of suffixes using forward and backward search on the CDAWG., 35135885;11763074
  • Minimum Consistent Subset for Trees Revisited.
    Hiroki Arimura; Tatsuya Gima; Yasuaki Kobayashi 0001; Hiroomi Nochide; Yota Otachi
    CoRR, abs/2305.07259, 2023
    Scientific journal
  • Computing the Collection of Good Models for Rule Lists
    Kota Mata; Kentaro Kanamori; Hiroki Arimura
    Proc. the 18th International Conference on Machine Learning and Data Mining (MLDM 2022), abs/2204.11285, IbaI Publishing, Jul. 2022, [Peer-reviewed], [International Magazine]
    English, International conference proceedings
  • Cartesian Tree Subsequence Matching
    Tsubasa Oizumi; Takeshi Kai; Takuya Mieno; Shunsuke Inenaga; Hiroki Arimura
    Proc. the 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022), Leibniz International Proceedings in Informatics (LIPIcs), LIPIcs223, 14:1, 14:18, DROPS, Jun. 2022, [Peer-reviewed]
    English, International conference proceedings
  • Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
    Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
    Transactions of the Japanese Society for Artificial Intelligence, 36, 6, C, L44_1, Japanese Society for Artificial Intelligence, 01 Nov. 2021, [Peer-reviewed]
    English, Scientific journal, 35136410;11763074;35135885
  • Efficient enumeration of dominating sets for sparse graphs
    Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
    Discrete Applied Mathematics, 303, 283, 295, Elsevier BV, Nov. 2021, [Peer-reviewed]
    English, Scientific journal
  • Fairness-Aware Decision Tree Editing Based on Mixed- Integer Linear Optimization
    Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
    Transactions of the Japanese Society for Artificial Intelligence, 36, 4, B, L13_1, Japanese Society for Artificial Intelligence, 01 Jul. 2021, [Peer-reviewed]
    English, Scientific journal, 35136410;35135762;35135885
  • A constant amortized time enumeration algorithm for independent sets in graphs with bounded clique number
    Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
    Theoretical Computer Science, 874, 32, 41, Elsevier BV, Jun. 2021, [Peer-reviewed]
    English, Scientific journal
  • Ordered Counterfactual Explanation by Mixed-Integer Linear Optimization.
    Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Yuichi Ike; Kento Uemura; Hiroki Arimura
    Proceedings of the AAAI Conference on Artificial Intelligence, 35, 13, 11564, 11574, Feb. 2021, [Peer-reviewed]
    English, International conference proceedings
  • DACE: Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization.
    Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
    Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), 2855, 2862, ijcai.org, Jan. 2021, [Peer-reviewed]
    English, International conference proceedings
  • Fully-Online Suffix Tree and Directed Acyclic Word Graph Construction for Multiple Texts.
    Takuya Takagi; Shunsuke Inenaga; Hiroki Arimura; Dany Breslauer; Diptarama Hendrian
    Algorithmica, 82, 5, 1346, 1377, 2020, [Peer-reviewed]
    English, Scientific journal
  • Route graph: joint map-matching by maximizing posterior probability,
    Hiroya Inakoshi; Junichi Shigezumi; Tatsuya Asai; Takuya Kida; Hiroki Arimura
    IPSJ Transactions on Mathematical Modeling and Its Applications, 12, 1, 1, 9, IPSJ, Sep. 2019, [Peer-reviewed]
    English, Scientific journal
  • Discovery of Regularized Areas with Maximal Confidence from Location Data
    Hiroya Inakoshi; Tatsuya Asai; Takuya Kida; Hiroki Arimura
    Transactions of the Japanese Society for Artificial Intelligence, 34, 3, D, I56_1, Japanese Society for Artificial Intelligence, 01 May 2019, [Peer-reviewed]
    English, Scientific journal
  • The Complexity of Induced Tree Reconfiguration Problems
    Kunihiro Wasa; Katsuhisa Yamanaka; Hiroki Arimura
    IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, E102-D, 3, 464, 469, The Institute of Electronics, Information and Communication Engineers, Mar. 2019, [Peer-reviewed]
    English, Scientific journal,

    Given two feasible solutions A and B, a reconfiguration problem asks whether there exists a reconfiguration sequence (A0=A, A1,...,A=B) such that (i) A0,...,A are feasible solutions and (ii) we can obtain Ai from Ai-1 under the prescribed rule (the reconfiguration rule) for each i ∈ {1,...,ℓ}. In this paper, we address the reconfiguration problem for induced trees, where an induced tree is a connected and acyclic induced subgraph of an input graph. We consider the following two rules as the prescribed rules: Token Jumping: removing u from an induced tree and adding v to the tree, and Token Sliding: removing u from an induced tree and adding v adjacent to u to the tree, where u and v are vertices of an input graph. As the main results, we show that (I) the reconfiguration problemis PSPACE-complete even if the input graph is of bounded maximum degree, (II) the reconfiguration problem is W[1]-hard when parameterized by both the size of induced trees and the length of the reconfiguration sequence, and (III) there exists an FPT algorithm when the problem is parameterized by both the size of induced trees and the maximum degree of an input graph under Token Jumping and Token Sliding.

  • Constant Amortized Time Enumeration of Independent Sets for Graphs with Forbidden Subgraphs on Fixed Number of Vertices.
    Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
    CoRR, abs/1906.09680, 2019, [Peer-reviewed]
    Scientific journal
  • Efficient Approximate 3-Dimensional Point Set Matching Using Root-Mean-Square Deviation Score.
    Yoichi Sasaki; Tetsuo Shibuya; Kimihito Ito; Hiroki Arimura
    IEICE Transactions, 102-A, 9, 1159, 1170, 2019, [Peer-reviewed]
    English, Scientific journal
  • An Efficient Algorithm for Enumerating Chordal Bipartite Induced Subgraphs in Sparse Graphs
    Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
    Lecture Notes in Computer Science, 339, 351, Springer International Publishing, 2019, [Peer-reviewed]
    English, International conference proceedings
  • Enumeration of Distinct Support Vectors for Interactive Decision Making.
    Kentaro Kanamori; Satoshi Hara; Masakazu Ishihata; Hiroki Arimura
    CoRR, abs/1906.01876, arXiv, 2019, [Peer-reviewed], [International Magazine]
    English, International conference proceedings
  • Discovering Co-Cluster Structure from Relationships between Biased Objects
    Iku OHAMA; Takuya KIDA; Hiroki ARIMURA
    IEICE Transactions on Information and Systems, E101.D, 12, 3108, 3122, Institute of Electronics, Information and Communications Engineers (IEICE), 01 Dec. 2018, [Peer-reviewed]
    English, Scientific journal
  • DenseZDD: a compact and fast index for families of sets
    Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato; Kunihiko Sadakane
    Algorithms, 11, 8, 128, 128, Aug. 2018, [Peer-reviewed]
    English, Scientific journal
  • Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four
    KURITA Kazuhiro; WASA Kunihiro; UNO Takeaki; ARIMURA Hiroki
    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 101, 9, 1383, 1391, The Institute of Electronics, Information and Communication Engineers, 2018
    English,

    In this study, we address a problem pertaining to the induced matching enumeration. An edge set M is an induced matching of a graph G=(V,E). The enumeration of matchings has been widely studied in literature; however, there few studies on induced matching. A straightforward algorithm takes O2) time for each solution that is coming from the time to generate a subproblem, where Δ is the maximum degree in an input graph. To generate a subproblem, an algorithm picks up an edge e and generates two graphs, the one is obtained by removing e from G, the other is obtained by removing e, adjacent edge to e, and edges adjacent to adjacent edge of e. Since this operation needs O2) time, a straightforward algorithm enumerates all induced matchings in O2) time per solution. We investigated local structures that enable us to generate subproblems within a short time and proved that the time complexity will be O(1) if the input graph is C4-free. A graph is C4-free if and only if none of its subgraphs have a cycle of length four.

  • Efficient Enumeration of Subgraphs and Induced Subgraphs with Bounded Girth.
    Kazuhiro Kurita; Kunihiro Wasa; Alessio Conte; Takeaki Uno; Hiroki Arimura
    Combinatorial Algorithms - 29th International Workshop, IWOCA 2018, Singapore, July 16-19, 2018, Proceedings, 201, 213, Springer, 2018, [Peer-reviewed]
    International conference proceedings
  • Efficient Enumeration of Dominating Sets for Sparse Graphs.
    Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
    The 29th International Symposium on Algorithms and Computation (ISAAC 2018), 8, 13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018, [Peer-reviewed]
    English, International conference proceedings
  • On the Model Shrinkage Effect of Gamma Process Edge Partition Models
    Iku Ohama; Issei Sato; Takuya Kida; Hiroki Arimura
    Proc. the 31st Annual Conference on Neural Information Processing Systems (NIPS2017), Dec. 2017, [Peer-reviewed]
    English, International conference proceedings
  • Packed compact tries: A fast and efficient data structure for online string processing
    Takuya Takagi; Shunsuke Inenaga; Kunihiko Sadakane; Hiroki Arimura
    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E100A, 9, 1785, 1793, Institute of Electronics, Information and Communication, Engineers, IEICE, 01 Sep. 2017, [Peer-reviewed]
    English, Scientific journal
  • Statistical emerging pattern mining with multiple testing correction
    Junpei Komiyama; Masakazu Ishihata; Hiroki Arimura; Takashi Nishibayashi; Shin-Ichi Minato
    Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 129685, 897, 906, Association for Computing Machinery, 13 Aug. 2017, [Peer-reviewed]
    English, International conference proceedings
  • Discovering Relevance-Dependent Bicluster Structure from Relational Data
    Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
    Proc. the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), 2578, 2584, Aug. 2017, [Peer-reviewed]
    English, International conference proceedings
  • Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four.
    Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
    CoRR, abs/1707.02740, 2017, [Peer-reviewed]
    Scientific journal
  • Linear-size CDAWG: New repetition-aware indexing and grammar compression
    Takuya Takagi; Keisuke Goto; Yuta Fujishige; Shunsuke Inenaga; Hiroki Arimura
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10508, 304, 316, Springer Verlag, 2017, [Peer-reviewed], [International Magazine]
    English, International conference proceedings
  • Estimating the Lineage Dynamics of Human Influenza B Viruses
    Mayumbo Nyirenda; Ryosuke Omori; Heidi L. Tessmer; Hiroki Arimura; Kimihito Ito
    PLOS ONE, 11, 11, e0166107, Nov. 2016, [Peer-reviewed]
    English, Scientific journal
  • Sequence binary decision diagram: Minimization, relationship to acyclic automata, and complexities of Boolean set operations
    Shuhei Denzumi; Ryo Yoshinaka; Hiroki Arimura; Shin-ichi Minato
    DISCRETE APPLIED MATHEMATICS, 212, 61, 80, Oct. 2016, [Peer-reviewed]
    English, Scientific journal
  • Fully-online construction of suffix trees for multiple texts
    Takuya Takagi; Shunsuke Inenaga; Hiroki Arimura
    Proc. the 27th Annual Symposium on Combinatorial Pattern Matching (CPM'16), Leibniz International Proceedings in Informatics (LIPIcs), 54, 22:1, 22:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Jun. 2016, [Peer-reviewed]
    English, International conference proceedings
  • The Relevance Dependent Infinite Relational Model for Discovering Co-Cluster Structure from Relationships with Structured Noise
    Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E99D, 4, 1139, 1152, Apr. 2016, [Peer-reviewed]
    English, Scientific journal
  • Relaxing the Data Access Bottleneck of Geographic Big-data Analytics Applications using Distributed Quad trees
    Mayumbo Nyirenda; Hiroki Arimura; Kimihito Ito
    PROCEEDINGS OF 2016 5TH INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 189, 194, 2016, [Peer-reviewed]
    English, International conference proceedings
  • The Complexity of Induced Tree Reconfiguration Problems
    Kunihiro Wasa; Katsuhisa Yamanaka; Hiroki Arimura
    LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS, LATA 2016, 9618, 330, 342, 2016, [Peer-reviewed]
    English, International conference proceedings
  • Packed Compact Tries: A Fast and Efficient Data Structure for Online String Processing
    Takuya Takagi; Shunsuke Inenaga; Kunihiko Sadakane; Hiroki Arimura
    Combinatorial Algorithms, 9843, 213, 225, 2016, [Peer-reviewed]
    English, International conference proceedings
  • 二部グラフ中に含まれる弦二部誘導グラフの列挙
    和佐 州洋; 有村 博紀; 宇野 毅明; 平田 耕一
    第154回情報処理学会アルゴリズム研究会, 154, 1, 8, Sep. 2015
    Japanese, Symposium
  • Multi-Layered Framework for Modeling Relationships Between Biased Objects
    Iku Ohama; Takuya Kida; Hiroki Arimura
    Proc. 2015 SIAM International Conference on Data Mining (SDM'15), 9, Apr. 2015, [Peer-reviewed]
    English, International conference proceedings
  • Practical Algorithms for Mining Flock Patterns from Trajectories
    Xiaoliang Geng; Takeaki Uno; Hiroki Arimura
    Transactions of IPSJ, 56, 4, 1292, 1304, Apr. 2015, [Peer-reviewed]
    Japanese, Scientific journal
  • Efficient Approximate 3-Dimensional Point Set Matching Using Root-Mean-Square Deviation Score
    Yoichi Sasaki; Tetsuo Shibuya; Kimihito Ito; Hiroki Arimura
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2015, 9371, 191, 203, 2015, [Peer-reviewed]
    English, International conference proceedings
  • Manipulatable Compressed String Indexing Technology for Pattern Matching
    Shuhei Denzumi; Hiroki Arimura; Kunihiko Sadakane
    The journal of the Institute of Electronics, Information and Communication Engineers, 97, 12, 1080, 1085, Dec. 2014, [Peer-reviewed]
    Japanese, Research society
  • Enumeration of complete set of flock patterns in trajectories
    Xiaoliang Geng; Takuya Takagi; Hiroki Arimura; Takeaki Uno
    Proceedings of the 5th ACM SIGSPATIAL International Workshop on GeoStreaming, IWGS 2014, 53, 61, Association for Computing Machinery, Inc, 04 Nov. 2014, [Peer-reviewed]
    English, International conference proceedings
  • Oblivious evaluation of non-deterministic finite automata with application to privacy-preserving virus genome detection
    Hirohito Sasakawa; Hiroki Harada; David A. Du Verle; Hiroki Arimura; Koji Tsuda; Jun Sakuma
    Proceedings of the ACM Conference on Computer and Communications Security, 21, 30, Association for Computing Machinery, 03 Nov. 2014, [Peer-reviewed]
    English, International conference proceedings
  • Fast Regular Expression Matching Based on Dual Glushkov NFA
    Ryutaro Kurai; Norihito Yasuda; Hiroki Arimura; Shinobu Nagayama; Shin-ichi Minato
    Proc. Prague Stringology Conference 2014 (PSC'14), 3, 16, Sep. 2014, [Peer-reviewed]
    English, International conference proceedings
  • Enumeration of Induced Subtrees in a K-Degenerate Graph
    Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
    第148回情報処理学会アルゴリズム研究会, 148, 17, 1, 6, Jun. 2014
    Japanese, Symposium
  • Constant Time Enumeration of Subtrees with Exactly k Nodes in a Tree
    Kunihiro Wasa; Yusaku Kaneta; Takeaki Uno; Hiroki Arimura
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E97D, 3, 421, 430, Mar. 2014, [Peer-reviewed]
    English, Scientific journal
  • DenseZDD: A Compact and Fast Index for Families of Sets
    Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato; Kunihiko Sadakane
    EXPERIMENTAL ALGORITHMS, SEA 2014, 8504, 187, 198, 2014, [Peer-reviewed]
    English, International conference proceedings
  • Efficient Enumeration of Induced Subtrees in a K-Degenerate Graph
    Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
    ALGORITHMS AND COMPUTATION, ISAAC 2014, 8889, 94, 102, 2014, [Peer-reviewed]
    English, International conference proceedings
  • Compact Complete Inverted Files for Texts and Directed Acyclic Graphs Based on Sequence Binary Decision Diagrams
    Shuhei Denzumi; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato
    In Proc. of Prague Stringology Conference 2013 (PSC2013), 157, 167, Sep. 2013, [Peer-reviewed]
    English, International conference proceedings
  • Efficient Algorithms for Finding All Length-Maximal Flock Patterns from a Set of Trajectories
    Hiroki Arimura; Xiaoliang Geng; Takeaki Uno
    IPSJ SIG Notes, 2013-AL-144, 3, 1, 8, Information Processing Society of Japan (IPSJ), May 2013
    Japanese, Symposium, In this paper, we study the problem of finding a class of spatio-temporal patterns called (m, k)-flock patterns (Gudmundsson and van Kreveld, Proc. ACM GIS'06; Benkert, Gudmundsson, Hubner, Wolle, Computational Geometry, 41:11, 2008), which represent a groups of moving objects close each other within width at most r under L∞-norm in a given time segment of length at least k, in a collection of 2-dimensional trajectories. For max-width r > 0, min-length k, and a collection S of n trajectories of legnth T, the proposed algorithm finds all length-maximal (m, k) flock patterns in an input collection of trajectory data in O(pnT2) delay and O(p2) space, p = |X| is the size of ID set being enumerated. We also present a practical improvement using geometric indexes.
  • Succinct Indices Based on Zero-Suppressed Binary Decision Diagrams
    Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Kunihiko Sadakane; Shin-ichi Minato
    IEICE technical report. Theoretical foundations of Computing, 112, 498, 23, 30, The Institute of Electronics, Information and Communication Engineers, Mar. 2013
    English, Symposium, In many real-life problems, we are often faced with manipulating families of sets. Manipulation of large-scale set families is one of the important fundamental techniques for web information retrieval, integration, and mining. For this purpose, a special type of binary decision diagram (BDD), called Zero-suppressed BDDs (ZDDs)js used. However there is a problem of huge required space for storing ZDDs and slow membership operations. This paper introduces DenseZDD, a compressed index for static ZDDs. Our technique not only indexes set families compactly but also executes fast membership operations. We also propose a hybrid method of DenseZDD and ordinary ZDDs to make the index dynamic.
  • Development of a Battery Powered Compact 2000m Class ROV System and Its Method of Operation
    Jun YAMAMOTO; Toshihiro IWAMORI; Naoki HOSHI; Takuzo ABE; Keiichiro SAKAOKA; Yoshihiko KAMEI; Shogo TAKAGI; Osamu NUMAMOTO; Yukihiro SAKA; Kazuhisa SUEOKA; Hiroki ARIMURA; Hidemi WATANABE
    Journal of Fisheries Technology, 5, 2, 171, 174, 水産総合研究センター, Feb. 2013, [Peer-reviewed]
    Japanese, Scientific journal, We developed a battery powered compact 2000m class ROV (Remotely Operated Vehicle) system with a High-Definition video camera. It does not require specialized equipment to operate. It can be operated using only general purpose equipment. This system mainly consists of a shipboard controller, a vehicle and a launcher. A thin, light optical fiber cable (diameter 9mm, length 2,500m), the primary cable, transfers control data and video images between the shipboard controller and the launcher. The secondary cable, a composite cable (diameter 14.2mm, length 50m), transfers control data and video images and supplies power to the vehicle from the six packs of lithium-ion batteries, which are mounted in the launcher. The launcher is suspended by a rope from the support ship, and the depth of the launcher is adjusted by changing the length of the rope using a general purpose rewinder.

    Although initially, we had some trouble due to the launcher rope and the primary cable getting tangled, a newly-designed instrument that restricts the movement of the carabiner, and the use of a low expansion rope, facilitated smoother operation and an easier recovery of the ROV.
  • Faster Algorithms for Tree Similarity Based on Compressed Enumeration of Bounded-Sized Ordered Subtrees
    Kunihiro Wasa; Kouichi Hirata; Takeaki Uno; Hiroki Arimura
    SIMILARITY SEARCH AND APPLICATIONS (SISAP), 8199, 73, 84, 2013, [Peer-reviewed]
    English, International conference proceedings
  • An extension of the infinite relational model incorporating interaction between objects
    Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7819, 2, 147, 159, Springer, 2013, [Peer-reviewed]
    English, International conference proceedings
  • Polynomial Delay and Space Discovery of Connected and Acyclic Sub-hypergraphs in a Hypergraph
    Kunihiro Wasa; Takeaki Uno; Kouichi Hirata; Hiroki Arimura
    DISCOVERY SCIENCE, 8140, 308, 323, 2013, [Peer-reviewed]
    English, International conference proceedings
  • Trajectory pattern mining in practice algorithms for mining flock patterns from trajectories
    Xiaoliang Geng; Takeaki Uno; Hiroki Arimura
    IC3K 2013; KDIR 2013 - 5th International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 - 5th International Conference on Knowledge Management and Information Sharing, Proc., 143, 151, SciTePress, 2013
    English, International conference proceedings
  • 超グラフ中に含まれる非巡回部分超グラフの効率よい列挙,
    和佐州洋; 有村博紀; 宇野毅明; 平田耕一
    第88回人工知能基本問題研究会, 88, 85, 90, Jan. 2013
    Japanese, Symposium
  • Counterexamples to the long-standing conjecture on the complexity of BDD binary operations
    Ryo Yoshinaka; Jun Kawahara; Shuhei Denzumi; Hiroki Arimura; Shin-ichi Minato
    INFORMATION PROCESSING LETTERS, 112, 16, 636, 640, Aug. 2012, [Peer-reviewed]
    English, Scientific journal
  • Faster bit-parallel algorithms for unordered pseudo-tree matching and tree homeomorphism
    Yusaku Kaneta; Hiroki Arimura; Rajeev Raman
    Journal of Discrete Algorithms, 14, 119, 135, Jul. 2012, [Peer-reviewed]
    English, International conference proceedings
  • A Dynamically Reconfigurable FPGA-Based Pattern Matching Hardware for Subclasses of Regular Expressions
    Yusaku Kaneta; Shingo Yoshizawa; Shin-ichi Minato; Hiroki Arimura; Yoshikazu Miyanaga
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E95D, 7, 1847, 1857, Jul. 2012, [Peer-reviewed]
    English, Scientific journal
  • Faster Pattern Matching Algorithm for Very Long Extended Patterns with Applications to Large-scale String Matching
    笹川 裕人; 金田 悠作; 有村 博紀
    日本データベース学会論文誌, 11, 1, 55, 60, 日本データベース学会, Jun. 2012
    Japanese
  • Rich Operations for Manipulating Sequence Binary Decision Diagrams
    DENZUMI Shuhei; ARIMURA Hiroki; MINATO Shin-ichi
    IEICE technical report. Theoretical foundations of Computing, 112, 93, 9, 16, 電子情報通信学会, Jun. 2012
    Japanese, Manipulating large sequence data is important problem in string processing field. In this paper, we deal with sequence binary decision diagrams proposed by Loekito et al. in 2009. This data structure compactly and uniquely represents large set of many strings, and executes set operaitions efficiently. However, we have only basic set operations on processing systems already proposed. In this paper, we show fundamental structure of sequence binary decision diagrams, define various and advanced operations for string sets, and propose efficient algorithms to execute the operations.
  • 木に含まれる限定サイズ部分木の列挙
    和佐 州洋; 宇野 毅明; 有村 博紀
    第140回情報処理学会アルゴリズム研究会, 2012-AL-140, 4, 1, 8, May 2012
    Japanese, Symposium
  • Constant time enumeration of bounded-size subtrees in trees and its application
    Kunihiro Wasa; Yusaku Kaneta; Takeaki Uno; Hiroki Arimura
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7434, 347, 359, Springer, 2012, [Peer-reviewed]
    English, International conference proceedings
  • Pattern mining from trajectory GPS data
    Xiaoliang Geng; Hiroki Arimura; Takeaki Uno
    Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012, 60, 65, 2012, [Peer-reviewed]
    English, International conference proceedings
  • Trajectory pattern matching based on bit-parallelism for large GPS data
    Hirohito Sasakawa; Hiroki Arimura
    Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012, 66, 71, 2012, [Peer-reviewed]
    English, International conference proceedings
  • High-speed String and Regular Expression Matching on FPGA
    Yusaku Kaneta; Shingo Yoshizawa; Shin-ichi Minato; Hiroki Arimura
    Proc. of Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2011 (APSIPA ASC 2011, Oct. 2011, [Peer-reviewed]
    English, International conference proceedings
  • Faster Bit-Parallel Algorithms for Unordered Pseudo-tree Matching and Tree Homeomorphism
    Yusaku Kaneta; Hiroki Arimura
    COMBINATORIAL ALGORITHMS, 6460, 68, 81, 2011, [Peer-reviewed]
    English, International conference proceedings
  • Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations
    Shuhei Denzumi; Ryo Yoshinaka; Hiroki Arimura; Shin-ichi Minato
    PROCEEDINGS OF THE PRAGUE STRINGOLOGY CONFERENCE 2011, 147, 161, 2011, [Peer-reviewed]
    English, International conference proceedings
  • Implementation of Sequence BDDs in Erlang
    Shuhei Denzumi; Hiroki Arimura; Shin-ichi Minato
    ERLANG 11: PROCEEDINGS OF THE 2011 ACM SIGPLAN ERLANG WORKSHOP, 90, 91, 2011, [Peer-reviewed]
    English, International conference proceedings
  • Sparse and Truncated Suffix Trees on Variable-Length Codes
    Takashi Uemura; Hiroki Arimura
    COMBINATORIAL PATTERN MATCHING, 22ND ANNUAL SYMPOSIUM, CPM 2011, 6661, 246, 260, 2011, [Peer-reviewed]
    English, International conference proceedings
  • Unsupervised Spam Detection by Document Probability Estimation with Maximal Overlap Method
    Uemura Takashi; Ikeda Daisuke; Kida Takuya; Arimura Hiroki
    Information and Media Technologies, 6, 1, 231, 240, Information and Media Technologies Editorial Board, 2011
    English, In this paper, we study content-based spam detection for spams that are generated by copying a seed document with some random perturbations. We propose an unsupervised detection algorithm based on an entropy-like measure called document complexity, which reflects how many similar documents exist in the input collection of documents. As the document complexity, however, is an ideal measure like Kolmogorov complexity, we substitute an estimated occurrence probability of each document for its complexity. We also present an efficient algorithm that estimates the probabilities of all documents in the collection in linear time to its total length. Experimental results showed that our algorithm especially works well for word salad spams, which are believed to be difficult to detect automatically.
  • Preface
    Hiroki Arimura; Raoul Medina; Jean-Marc Petit; Franz Baader; Jean-François Boulicaut; Bruno Crémilleux; Luc De Raedt; Khaled Elbassioni; Alain Gely; Bart Goethals; Sergei Kuznetsov; Dominique Laurent; Hong-Cheu Liu; Joao Marques-Silva; Rokia Missaoui; Siegfried Nijssen; Lhouari Nourine; Marie-Christine Rousset; Lakhdar Sais; Thomas Schiex; Takeaki Uno; Jef Wijsen
    Proceedings - IEEE International Conference on Data Mining, ICDM, liii, liv, 2011, [Peer-reviewed]
    English, International conference proceedings
  • Mining Frequent k-Partite Episodes from Event Sequences
    Takashi Katoh; Hiroki Arimura; Kouichi Hirata
    NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE, 6284, 331, 344, 2010, [Peer-reviewed]
    English, International conference proceedings
  • Fast Bit-Parallel Matching for Network and Regular Expressions
    Yusaku Kaneta; Shin-ichi Minato; Hiroki Arimura
    STRING PROCESSING AND INFORMATION RETRIEVAL, 6393, 372, 384, 2010, [Peer-reviewed]
    English, International conference proceedings
  • Dynamic reconfigurable bit-parallel architecture for large-scale regular expression matching
    Yusaku Kaneta; Shingo Yoshizawa; Shin-Ichi Minato; Hiroki Arimura; Yoshikazu Miyanaga
    Proceedings - 2010 International Conference on Field-Programmable Technology, FPT'10, 21, 28, IEEE, 2010, [Peer-reviewed]
    English, International conference proceedings
  • An Efficient Depth-first Search Algorithm for Extracting Frequent Diamond Episodes from Event Sequences
    Katoh Takashi; Arimura Hiroki; Hirata Kouichi
    Information and Media Technologies, 5, 2, 459, 470, Information and Media Technologies Editorial Board, 2010
    English, In this paper, we study the problem of mining frequent diamond episodes efficientlyfrom an input event sequence with sliding a window. Here, a diamond episode is of the form aEb, which means that every event of E follows an event a and is followed by an event b. Then, we design a polynomial-delay and polynomial-space algorithm PolyFreqDmd that finds all of the frequent diamond episodes without duplicates from an event sequence in O(|Σ|2l) time per an episode and in O(|Σ|+l) space, where Σ and l are an alphabet and the length of the event sequence, respectively. Finally, we give experimental results on artificial and real-world event sequences with varying several mining parameters to evaluate the efficiency of the algorithm.
  • An Adaptive Algorithm for Splitting Large Sets of Strings and Its Application to Efficient External Sorting
    Tatsuya Asai; Seishi Okamoto; Hiroki Arimura
    NEW FRONTIERS IN APPLIED DATA MINING, 5433, 13, +, 2009, [Peer-reviewed]
    English, International conference proceedings
  • Flexible Framework for Time-Series Pattern Matching over Multi-dimension Data Stream
    Takuya Kida; Tomoya Saito; Hiroki Arimura
    NEW FRONTIERS IN APPLIED DATA MINING, 5433, 1, 12, 2009, [Peer-reviewed]
    English, International conference proceedings
  • Efficient serial episode mining with minimal occurrences
    Hideyuki Ohtani; Takuya Kida; Takeaki Uno; Hiroki Arimura
    Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, ICUIMC'09, 457, 464, ACM, 2009, [Peer-reviewed]
    English, International conference proceedings
  • Mining Frequent Bipartite Episode from Event Sequences
    Takashi Katoh; Hiroki Arimura; Kouichi Hirata
    DISCOVERY SCIENCE, PROCEEDINGS, 5808, 136, +, 2009, [Peer-reviewed]
    English, International conference proceedings
  • A Linear-Time Off-Line Construction of Property Suffix Trees
    UEMURA Takashi; KIDA Takuya; ARIMURA Hiroki
    The IEICE Trans Inf & Syst., 91, 3, 595, 607, 一般社団法人電子情報通信学会, Mar. 2008
    Japanese, プロパティ付きテキストとは,長さnのテキストに,補助情報としてテキスト上の互いにオーバラップを許した区間の集合(プロパティという)が付加された構造化文書の一種であり,アノテーション付きの系列データの形式的なモデルとなっている. このプロパティ付きテキストへの全文テキスト索引として, Amirら(CPM2006) は,プロパティ接尾辞木を提案した. これは,プロパティの各区間に含まれるすべての部分文字列を格納する索引構造であり,遺伝子情報や,ビデオストリーム,メタデータ付き時系列データなどへの応用がある. また,高度な検索問題である重み付きパターン照合にも用いられる. Amirらは,定数サイズのアルファベット上で,プロパティ接尾辞木をO(n log log n)時間でオフライン構築するアルゴリズムを与えたが,その線形時間構築アルゴリズムは,現在まで未解決の問題であった. 本論文では,定数アルファベット上で,プロパティ接尾辞木を線形時間で構築するオフラインアルゴリズムを与え,この問題を肯定的に解決する. 提案アルゴリズムは,接尾辞リンクの巡回を用いた簡潔な手法であり,理論的に効率良いだけでなく,実際のデータに対しても高速に動作する. 更に,人工データ上の計算機実験を行い,実際の性能を評価する.
  • Ambiguous frequent itemset mining and polynomial delay enumeration
    Takeaki Uno; Hiroki Arimura
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 357, +, 2008, [Peer-reviewed]
    English, International conference proceedings
  • LCM over ZBDDs: Fast generation of very large-scale frequent itemsets using a compact graph-based representation
    Shin-ichi Minato; Takeaki Uno; Hiroki Arimura
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 234, +, 2008, [Peer-reviewed]
    English, International conference proceedings
  • Efficient algorithms for mining frequent and closed patterns from semi-structured data
    Hiroki Arimura
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 2, +, 2008, [Peer-reviewed]
    English, International conference proceedings
  • Unsupervised Spam Detection by Document Complexity Estimation
    Takashi Uemura; Daisuke Ikeda; Hiroki Arimura
    DISCOVERY SCIENCE, PROCEEDINGS, 5255, 319, +, 2008, [Peer-reviewed]
    English, International conference proceedings
  • 擬似頻出アイテムの集合の多項式遅延列挙アルゴリズム
    宇野 毅明; 有村 博紀
    第4回 人工知能学会 データマイニングと統計数理研究会(SIG-DMSM), 2007, Jul. 2007, [Invited]
    Japanese, Symposium
  • Efficient algorithms for finding frequent substructures from semi-structured data streams
    Tatsuya Asai; Kenji Abe; Shinji Kawasoe; Hiroki Arimura; Setsuo Arikawa
    NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE, 3609, 29, +, 2007, [Peer-reviewed]
    English, International conference proceedings
  • An efficient polynomial delay algorithm for pseudo frequent itemset mining
    Takeaki Uno; Hiroki Arimura
    DISCOVERY SCIENCE, PROCEEDINGS, 4755, 219, +, 2007, [Peer-reviewed]
    English, International conference proceedings
  • Time and space efficient discovery of maximal geometric graphs
    Hiroki Arimura; Takeaki Uno; Shinichi Shimozon
    DISCOVERY SCIENCE, PROCEEDINGS, 4755, 42, +, 2007, [Peer-reviewed], [Lead author]
    English, International conference proceedings
  • Frequent Closed Item Set Mining Based on Zero-suppressed BDDs
    Minato Shin-ichi; Arimura Hiroki
    Information and Media Technologies, 2, 1, 309, 316, Information and Media Technologies Editorial Board, 2007
    English, Frequent item set mining is one of the fundamental techniques for knowledge discovery and data mining. In the last decade, a number of efficient algorithms for frequent item set mining have been presented, but most of them focused on just enumerating the item set patterns which satisfy the given conditions, and it was a different matter how to store and index the result of patterns for efficient data analysis. Recently, we proposed a fast algorithm of extracting all frequent item set patterns from transaction databases and simultaneously indexing the result of huge patterns using Zero-suppressed BDDs (ZBDDs). That method, ZBDD-growth, is not only enumerating/listing the patterns efficiently, but also indexing the output data compactly on the memory to be analyzed with various algebraic operations. In this paper, we present a variation of ZBDD-growth algorithm to generate frequent closed item sets. This is a quite simple modification of ZBDD-growth, and additional computation cost is relatively small compared with the original algorithm for generating all patterns. Our method can conveniently be utilized in the environment of ZBDD-based pattern indexing.
  • An efficient algorithm for complex pattern matching over continuous data streams based on bit-parallel method
    Tomoya Saito; Takuya Kida; Hiroki Arimura
    2007 IEEE INTERNATIONAL WORKSHOP ON DATABASES FOR NEXT GENERATION RESEARCHERS, 13, +, 2007, [Peer-reviewed]
    English, International conference proceedings
  • 大規模テキストマイニングのための特徴部分列抽出と極大パター ン発見を組み合わせた効率的アルゴリズム,
    有村 博紀; 宇野 毅明
    第63回人工知能学会人工知能基礎問題研究会, 45, 52, Sep. 2006
    Japanese, Symposium
  • 2次元離散テキストからの効率よい極大パターンマイニング
    下薗 真一; 宇野 毅明; 有村 博紀
    第63回人工知能学会人工知能基本問題研究会, Sep. 2006, [Invited]
    Japanese, Symposium
  • LA_002 Efficient Algorithm of Constructing Suffix Tree with Word Count Limitation
    Uemura Takashi; Kida Takuya; Arimura Hiroki
    情報科学技術レターズ, 5, 5, 8, Forum on Information Technology, Aug. 2006
    Japanese, Symposium
  • Algorithmic Learning Theory (ALT 2000) - Preface
    H Arimura; S Jain
    THEORETICAL COMPUTER SCIENCE, 348, 1, 1, 2, Dec. 2005, [Invited]
    English, Scientific journal
  • Efficient method of combinatorial item set analysis based on zero-suppressed BDDs
    S Minato; H Arimura
    International Workshop on Challenges in Web Information Retrieval and Integration, Proceedings, 4, 11, 2005, [Peer-reviewed]
    English, International conference proceedings
  • Key semantics extraction by dependency tree mining
    Satoshi Morinaga; Hiroki Arimura; Takahiro Ikeda; Yosuke Sakao; Susumu Akamine
    Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 666, 671, ACM, 2005, [Peer-reviewed]
    English, International conference proceedings
  • A polynomial space and polynomial delay algorithm for enumeration of maximal motifs in a sequence
    H Arimura; T Uno
    ALGORITHMS AND COMPUTATION, 3827, 724, 737, 2005, [Peer-reviewed]
    English, Scientific journal
  • Special Issue on Algorithmic Learning Thoery
    Hiroki Arimura
    2005
  • Efficient substructure discovery from large semi-structured data
    T Asai; K Abe; S Kawasoe; H Sakamoto; H Arimura; S Arikawa
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E87D, 12, 2754, 2763, Dec. 2004, [Peer-reviewed]
    English, Scientific journal
  • Pattern Matching with Taxonomic Information
    Kida Takuya; Arimura Hiroki
    Asia Information Retrieval Symposium (AIRS2004), 265, 268, Asia Information Retrieval Symposium, Oct. 2004
    English, In this paper, we study the pattern matching problem with taxonomic information (PMTX, for short), where taxonomy is a partially ordered set of letters describing an IS-A hierarchy. We present an efficient algorithm for PMTX that runs in O(mn/w) time with O(m+mh/w) preprocess and O(mσ/w) extra space, where h, m, t, σ, and w are the size of the taxonomic information H, the length of the pattern P ∈ Σ∗, the length of the text T ∈ Σ∗, the cardinality of the sorted alphabet Σ, and the computer word length, respectively. If the patternlength m is less than w, it runs in O(n) time, O(m + h) preprocess, and O(σ) space, and works very fast in practice. We also discuss various improvements of the algorithms for real world applications.
  • 頻出・飽和・極大頻出集合の効率的な列挙アルゴリズムとその実装
    宇野 毅明; 有村 博紀
    日本ソフトウェア科学会, 第4回データマイニングワークショップ, 47, 54, Sep. 2004
    Japanese, Symposium
  • LCM: 頻出飽和アイテム集合を列挙する高速なアルゴリズム
    宇野 毅明; 内田 雄三; 浅井 達哉; 有村 博紀
    第54回人工知能学会人工知能基礎論研究会, 23, 29, Mar. 2004
    Japanese, Symposium
  • 半構造データマイニングのための高速な無順序木パターン発見手法
    房延慎二; 浅井達哉; 有村博紀; 宇野毅明; 中野眞一
    第15回データ工学ワークショップ(DEWS2004), Mar. 2004
    Japanese, Symposium
  • LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets
    Takeaki Uno; Masashi Kiyomi; Hiroki Arimura
    FIMI '04, Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, Brighton, UK, November 1, 2004, CEUR-WS.org, 2004, [Peer-reviewed]
  • An efficient algorithm for enumerating closed patterns in transaction databases
    T Uno; T Asai; Y Uchida; H Arimura
    DISCOVERY SCIENCE, PROCEEDINGS, 3245, 16, 31, 2004, [Peer-reviewed]
    English, Scientific journal
  • Learning elementary formal systems with queries
    H Sakamoto; K Hirata; H Arimura
    THEORETICAL COMPUTER SCIENCE, 298, 1, 21, 50, Apr. 2003, [Peer-reviewed]
    English, Scientific journal
  • LCM: An Efficient Algorithm for Enumerating Frequent Closed Item Sets.
    Takeaki Uno; Tatsuya Asai; Yuzo Uchida; Hiroki Arimura
    FIMI '03, Frequent Itemset Mining Implementations, Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations, 19 December 2003, Melbourne, Florida, USA, CEUR-WS.org, 2003, [Peer-reviewed]
  • Efficient substructure discovery from large semi-structured data
    T Asai; K Abe; S Kawaoe; H Arimura; H Sakamoto; S Arikawa
    PROCEEDINGS OF THE SECOND SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 158, 174, 2002, [Peer-reviewed]
    English, International conference proceedings
  • Optimized Substructure Discovery for Semi-structured Data.
    Kenji Abe; Shinji Kawasoe; Tatsuya Asai; Hiroki Arimura; Setsuo Arikawa
    Principles of Data Mining and Knowledge Discovery, 6th European Conference, PKDD 2002, Helsinki, Finland, August 19-23, 2002, Proceedings, 1, 14, Springer, 2002, [Peer-reviewed]
  • Online algorithms for mining semi-structured data stream
    T Asai; H Arimura; K Abe; S Kawasoe; S Arikawa
    2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 27, 34, 2002, [Peer-reviewed]
    English, International conference proceedings
  • Knowledge Discovery from Semistructured Texts.
    Hiroshi Sakamoto; Hiroki Arimura; Setsuo Arikawa
    Progress in Discovery Science, Final Report of the Japanese Discovery Science Project, 586, 599, Springer, 2002, [Peer-reviewed]
  • Efficient Data Mining from Large Text Databases.
    Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
    Progress in Discovery Science, Final Report of the Japanese Discovery Science Project, 123, 139, Springer, 2002, [Peer-reviewed]
  • Efficient text mining with optimized pattern discovery
    H Arimura
    COMBINATORIAL PATTERN MATCHING, 2373, 17, 19, 2002, [Peer-reviewed]
    English, Scientific journal
  • Mining Frequent Substructures from Web
    Kenji Abe; Shinji Kawasoe; Tatsuya Asai; Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
    Active Mining - New Directions of Data Mining, 79, 83, 94, IOS Press, 2002, [Peer-reviewed]
  • Efficiently Mining Frequent Substructures from Semi-structured Data
    Tatsuya Asai; Kenji Abe; Shinji Kawasoe; Hiroshi Sakamoto; Hiroki Arimura; Setsuo Arikawa
    Proc. International Workshop on Informations & Electrical Engineering (IWIE2002), 59, 64, 2002, [Peer-reviewed]
  • Practical Algorithms for Constructing Large Suffix Arrays on Distributed Memory Parallel Computers
    ASAKA HIROKI; KAWASOE SHINJI; ABE JUNICHIRO; ARIMURA HIROKI; ARIKAWA SETSUO
    情報処理学会論文誌数理モデル化と応用(TOM), 42, 14, 14, 24, Information Processing Society of Japan (IPSJ), 15 Dec. 2001
    Japanese, In this paper, we study efficient parallel construction of full-text indexing structures for large text data. The suffix array is a compact full-text indexing structure that is useful in information retrieval and bio-informatics. We propose an efficient parallel algorithm for constructing suffix arrays on distributed memory parallel computers. This algorithm is a parallel implementation of the well-known external memory algorithm, called Baeza-Yates-Gonnet-Sinder (BGS) algorithm. By theoretical analysis, we show that our algorithm runs more efficiently than Riberio-Kitajima-Ziviani (RKZ) algorithm, another parallel implementation of the BGS algorithm, in terms of parallel time and communication complexities.
  • Extracting Text Data from HTML Documents
    MURAKAMI YOSHITSUGU; SAKAMOTO HIROSHI; ARIMURA HIROKI; ARIKAWA SETSUO
    情報処理学会論文誌数理モデル化と応用(TOM), 42, 14, 39, 49, Information Processing Society of Japan (IPSJ), 15 Dec. 2001
    Japanese, This paper introduces the new model of the HTML Wrapper for the information extration from HTML documents and presents the learning algorithm for the HTML Wrappers in the framework of learning by exmaples. The expressiveness of this model is shown by experimental results.
  • Modelling Semi-structured Documents with Hedges for Deduction and Induction.
    Akihiro Yamamoto; Kimihito Ito; Akira Ishino; Hiroki Arimura
    Inductive Logic Programming, 11th International Conference, ILP 2001, Strasbourg, France, September 9-11, 2001, Proceedings, 240, 247, Springer, 2001, [Peer-reviewed]
    International conference proceedings
  • Extracting Partial Structures from HTML Documents.
    Hiroshi Sakamoto; Yoshitsugu Murakami; Hiroki Arimura; Setsuo Arikawa
    Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference, May 21-23, 2001, Key West, Florida, USA, 264, 268, AAAI Press, 2001, [Peer-reviewed]
  • Mining Semi-structured Data by Path Expressions.
    Katsuaki Taniguchi; Hiroshi Sakamoto; Hiroki Arimura; Shinichi Shimozono; Setsuo Arikawa
    Discovery Science, 4th International Conference, DS 2001, Washington, DC, USA, November 25-28, 2001, Proceedings, 2226, 378, 388, Springer, 2001, [Peer-reviewed]
  • Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications.
    Toru Kasai; Gunho Lee; Hiroki Arimura; Setsuo Arikawa; Kunsoo Park
    Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001 Jerusalem, Israel, July 1-4, 2001 Proceedings, 181, 192, Springer, 2001, [Peer-reviewed]
  • Efficient Discovery of Proximity Patterns with Suffix Arrays.
    Hiroki Arimura; Hiroki Asaka; Hiroshi Sakamoto; Setsuo Arikawa
    Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001 Jerusalem, Israel, July 1-4, 2001 Proceedings, 152, 156, Springer, 2001, [Peer-reviewed]
  • Efficient Learning of Semi-structured Data from Queries.
    Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
    Algorithmic Learning Theory, 12th International Conference, ALT 2001, Washington, DC, USA, November 25-28, 2001, Proceedings, 315, 331, Springer, 2001, [Peer-reviewed]
  • Inductive inference of unbounded unions of pattern languages from positive data
    T Shinohara; H Arimura
    THEORETICAL COMPUTER SCIENCE, 241, 1-2, 191, 209, Jun. 2000, [Peer-reviewed]
    English, Scientific journal
  • On approximation algorithms for local multiple alignment.
    Tatsuya Akutsu; Hiroki Arimura; Shinichi Shimozono
    RECOMB 2000, 1, 7, 2000, [Peer-reviewed]
  • Discovering unordered and ordered phrase association patterns for text mining
    R Fujino; H Arimura; S Arikawa
    KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 1805, 281, 293, 2000, [Peer-reviewed]
    English, Scientific journal
  • Text data mining: Discovery of important keywords in the cyberspace
    H Arimura; J Abe; R Fujino; H Sakamoto; S Shimozono; S Arikawa
    2000 KYOTO INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES: RESEARCH AND PRACTICE, PROCEEDINGS, 220, 226, 2000, [Peer-reviewed]
    English, International conference proceedings
  • Identification of tree translation rules from examples
    H Sakamoto; H Arimura; S Arikawa
    GRAMMATICAL INFERENCE: ALGORITHMS AND APPLICATIONS, 1891, 241, 255, 2000, [Peer-reviewed]
    English, Scientific journal
  • Efficient discovery of optimal word-association patterns in large text databases
    S Shimozono; H Arimura; S Arikawa
    NEW GENERATION COMPUTING, 18, 1, 49, 60, 2000, [Peer-reviewed]
    English, Scientific journal
  • Inductive logic programming: From logic of discovery to machine learning (Special Issue on Surveys on Discovery Science)
    Hiroki Arimura; Akihiro Yamamoto
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E83D, 1, 10, 18, Jan. 2000, [Peer-reviewed], [Invited]
    English, Scientific journal, 35138305;13674145
  • Knowledge discovery from health data using weighted aggregation classifier
    T Takae; M Chikamune; H Arimura; A Shinohara; H Inoue; S Takeya; K Uezono; T Kawasaki
    DISCOVERY SCIENCE, PROCEEDINGS, 1721, 359, 361, 1999, [Peer-reviewed]
    English, Scientific journal
  • Empirical Evaluation of a Regular Pattern Inference Algorithm by Minimal Multiple Generalization
    KASAI Toru; ARIMURA Hiroki; SHINOHARA Takeshi
    Research reports on information science and electrical engineering of Kyushu University, 3, 2, 185, 190, Kyushu University, Sep. 1998
    A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by substituting possibly empty constant strings for the variables. A learning algorithm, called k-minimal multiple generalization (k-rnmg), finds a minimally general collection of at most k regular patterns that explains all the positive examples. Recently, several attempts have been made at applying this algorithm to protein motif discovery and other knowledge acquisitions. In such applications, its performance is considerably influenced by a class of data and values of learning parameters. This paper empirically evaluates performance of the algorithm k-rnmg on synthesized data to apply it to protein motif discovery.
  • An efficient tool for discovering simple combinatorial patterns from large text databases
    Hiroki Arimura; Atsushi Wataki; Ryoichi Fujino; Shinichi Shimozono; Setsuo Arikawa
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1532, 393, 395, 1998, [Peer-reviewed]
    English, International conference proceedings
  • A fast algorithm for discovering optimal string patterns in large text databases
    H Arimura; A Wataki; R Fujino; S Araikawa
    ALGORITHMIC LEARNING THEORY, 1501, 247, 261, 1998, [Peer-reviewed]
    English, Scientific journal
  • Finding tree patterns consistent with positive and negative examples using queries
    H Ishizaka; H Arimura; T Shinohara
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 23, 1-2, 101, 115, 1998, [Peer-reviewed]
    English, Scientific journal
  • Maximizing agreement with a classification by bounded or unbounded number of associated words
    H Arimura; S Shimozono
    ALGORITHMS AND COMPUTATIONS, 1533, 39, 48, 1998, [Peer-reviewed]
    English, Scientific journal
  • Learning unions of tree patterns using queries
    Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
    Theoretical Computer Science, 185, 1, 47, 62, Oct. 1997, [International Magazine]
    Scientific journal
  • Inductive inference of unbounded unions of pattern languages from positive data
    Takeshi Shinohara; Hiroki Arimura
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1160, 257, 271, Springer Verlag, 1996
    English, International conference proceedings
  • Learning unions of tree patterns using queries
    Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
    Lecture Notes in Computer Science, 66, 79, Springer Berlin Heidelberg, 1995
    In book
  • Unions of a Bounded Number of Tree Pattern Languages Are Hard To Learn
    Hiroki Arimura
    RIFIS Technical Report, RIFIS-TR-CS-81, Research Institute of Fundamental Information Science (RIFIS), Kyushu University, 13 Feb. 1994, [International Magazine]
    English, Research institution
  • Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data
    Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 775, 649, 660, Springer Verlag, 1994
    English, International conference proceedings
  • Finding tree patterns consistent with positive and negative examples using queries
    Hiroki Ishizaka; Hiroki Arimura; Takeshi Shinohara
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 872, 317, 332, Springer Verlag, 1994
    English, International conference proceedings
  • A Generalization of the Least General Generalization
    Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki; Hiroki Ishizaka
    Machine Intelligence 13 -- Machine Intelligence and Inductive Learning, edited by K. Furukawa, D. Michie, and S. Muggleton, Clarendon Press, 1994, [Peer-reviewed], [Invited], [Lead author], [International Magazine]
    English, In book
  • Protein Motif Discovery from Positive Examples by Minimal Multiple Generalization over Regular Patterns
    Arimura Hiroki; Fujino Ryoichi; Shinohara Takeshi; Arikawa Setsuo
    GI, 5, 39, 48, Japanese Society for Bioinformatics, 1994
    English, Recently, several attempts have been made at applying machine learning method to protein motif discovery, but most of these methods require negative examples in addition to positive examples. This paper proposes an efficient method for learning protein motif from positive examples. A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by substituting nonempty constant strings for variables. Regular patterns and their languages are called extended if empty substitutions are allowed. Our learning algorithm, called k-minimal multiple generalization (k-mmg), finds a minimally general collection of at most k regular patterns that explains all the positive examples. We have implemented this algorithm for subclasses for regular patterns and extended regular patterns where the number of variables are bounded by a small constant, and run experiments on protein data taken from GenBank and PIR databases. We incorporate three heuristics into these algorithms for controlling nondeterministic choices. The experiments show that the k-mmg algorithm can very quickly find a hypothesis on the computers in practice, and that the results of our system are comparable with the results of learning method from positive and negative data.
  • Inductive inference of Prolog programs with linear data dependency from positive data
    Hiroki Arimura; Takeshi Shinohara
    Information Modelling and Knowledge Bases V, 365, 375, IOS Press, Aug. 1993, [Peer-reviewed], [Lead author]
    English, International conference proceedings
  • Inductive Learning of Logic Programs From Positive Facts Using Minimal Multiple Generalization
    ISHIZAKA Hiroki; ARIMURA Hiroki; SHINOHARA Takeshi
    Journal of Japanese Society for Artificial Intelligence, Special Issue on ALT'92, 8, 4, 419, 426, 01 Jul. 1993, [Peer-reviewed], [Invited], [Domestic magazines]
    Japanese, Scientific journal, We consider the polynomial time inferability of primitive Prologs from positive facts. The class of primitive Prologs is a proper subclass of that of linear Prologs which is known to be inferable from only positive facts. In this paper, we discuss the polynomial time inferability of the subclass using minimal multiple generalizations. The minimal multiple generalization is a natural extension of the least generalization given by Plotkin in 1970. The minimal multiple generalization generalizes given first order words by several words, while the least generalization does by a single word. The property of the minimal multiple generalization makes it possible to perform fine generalization and to construct the heads of several clauses in a target program at the same time. We give an outline of a consistent polynomial time inference algorithm which identifies the class of primitive Prologs in the limit. The algorithm infers the heads of clauses in a target program as a minimal multiple generalization of a set of given positive facts. Furthermore, we give a similar result on the inferability of a subclass of context-free transformations which includes several well-known Prolog programs.
  • Depth-Bounded Inference for Nonterminating PROLOGs
    Hiroki Arimura
    Bulletin of Informatics and Cybernetics, 25, 3-4, 125, 136, Research Association of Statistical Sciences, Fukuoka, Japan, Jun. 1993, [Peer-reviewed], [Lead author], [International Magazine]
    English, Scientific journal
  • A polynomial time algorithm for finding finite unions of tree pattern languages
    Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
    Nonmonotonic and Inductive Logic (NIL1991), Second International Workshop, Reinhardsbrunn Vastle, Germany, December 1991, Proceedings, LNAI, 659, 118, 131, Springer-Verlag, Feb. 1993, [Peer-reviewed]
    English, International conference proceedings, Post-proceedings of the 2nd International Workshop on Nonmonotonic and Inductive Inference (NIL'91), Reinhardsbrunn Castle, Germany, December 2-6, 1991; edited by G. Brewka, K.P. Jantke, and P.H. Schmitt.
  • Efficient inductive inference of primitive Prologs from positive data
    Hiroki Ishizaka; Hiroki Arimura; Takeshi Shinohara
    Algorithmic Learning Theory, Third Workshop, ALT'92, Tokyo, Japan, October 1992, Proceedings, Lecture Notes in Computer Science, 743, 135, 146, Springer Berlin Heidelberg, 1993, [Peer-reviewed]
    English, International conference proceedings
  • Polynomial Time Inference of Unions of Two Tree Pattern Languages
    Hiroki ARIMURA; Takeshi SHINOHARA; Setsuko OTSUKI
    IEICE TRANSACTIONS on Information and Systems, E75-D, 4, 426, 434, Jul. 1992, [Peer-reviewed], [Lead author]
    English, Scientific journal
  • Polynomial time inference of a subclass of context-free transformations
    Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
    Proceedings of the fifth annual workshop on Computational learning theory - COLT '92, 136, 143, ACM Press, 1992, [Peer-reviewed]
    English, International conference proceedings
  • Polynomial Time Inference of Unions of Tree Pattern Languages
    Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
    Proc. the 2nd Workshop on Algorithmic Learning Theory (ALT'91), 105, 114, Oct. 1991, [Peer-reviewed], [Lead author]
    English, International conference proceedings
  • Completeness of Depth-bounded Resolution for Weakly Reducing Programs
    Hiroki ARIMURA
    Software Science and Engineering, Selected Papers from the Kyoto Symposia, World Scientific Series of Computer Science, edited by Ikuo Nakata and Masami Hagiya, 31, 227, 245, World Scientific, Sep. 1991, [International Magazine]
    In book
■ Other Activities and Achievements
  • My younger days and recent stories (Special feature, Researchers turning 60)
    Hiroki Arimura, LA Symposium, 85, Jul. 2025, [Invited], [Corresponding author]
    Japanese, Others
  • 「おめでとうソサイエティ論文賞」ゼロサプレス型二分決定グラフを用いたトランザクションデータベースの効率的解析手法
    湊真一; 有村博紀, 電子情報通信学会 情報・システムソサイエティ誌, Nov. 2010, [Invited]
    Japanese
  • Interdisciplinary Research Project on Knowledge Software Science
    ARIMURA Hiroki, The Journal of the Institute of Electronics, Information and Communication Engineers, 92, 10, 819, 821, 01 Oct. 2009, [Invited]
    The Institute of Electronics, Information and Communication Engineers, Japanese, Introduction other
  • 「知識創出学」とは? ー北海道大学グローバルCOEプログラムと北海道内情報通信系研究グループの活動 北の国から明日のICTに架ける橋ー(特別小特集 知の創出を支える次世代IT基盤技術)
    有村 博紀, 電子情報通信学会誌, 小特集 知の創出を支える次世代IT基盤技術 ー 北海道大学グローバルCOEプログラムと北海道内情報通信系研究グループの活動 北の国から明日のICTに架ける橋 ー, 92, 10, 816, 818, Oct. 2009, [Invited], [Lead author], [Domestic magazines]
    電子情報通信学会, Japanese, Introduction scientific journal
  • Global COE at Hokkaido University: Establishing next-generation IT infrastructure for supporting knowledge creation
    有村 博紀, Japanese scientific monthly, 60, 12, 1004, 1008, Dec. 2007
    日本学術振興会, Japanese
  • "Prediction and Discovery" Knowledge Acquisition from Large-Scale Information (in Japanese)
    Workshop Report, Science and Technology Foresight Strategy Workshop, Center for Research and Development Strategy (CRDS), CRDS-FY2007-WR-01, Aug. 2007, [Domestic magazines]
    Center for Research and Development Strategy, Japan Science and Technology Agency (JST CRDS), Japanese, Technical report
  • Data Intensive Computing : No.2 Frequent Itemset Mining Algorithms(Intelligent Computing and Related Issues (2))
    UNO Takeaki; ARIMURA Hiroki; Takeaki Uno; Hiroki Arimura; National Institute of Informatics:The graduate University for Advanced Studies; Graduate School of Information Science and Technology Hokkaido University, Journal of Japanese Society for Artificial Intelligence, 22, 3, 425, 436, 01 May 2007
    人工知能学会, Japanese
  • Recent Development of Mining Algorithms for Data Streams
    ARIMURA Hiroki, The transactions of the Institute of Electronics, Information and Communication Engineers. D-I, 88, 3, 563, 575, Mar. 2005
    最近, 大規模データストリームに対するデータマイニングが注目を集めている.本論文では, データストリームマイニングにおけるデータマイニング技法について, 最近の研究動向を紹介する.特に, 概要データを用いた膨大なデータストリームの処理技法に焦点をあて, ストリーム統計の近似計算から, オンラインストリームマイニング, データストリームに対するパターン照合まで, ストリームデータに対する様々なデータマイニング技法を解説する., 一般社団法人電子情報通信学会, Japanese
  • 1. Algorithms for Mining Data Streams(New Frontier of Data Mining Methods)
    ARIMURA Hiroki; KIDA Takuya, IPSJ Magazine, 46, 1, 4, 11, Jan. 2005
    大量の電子化データの流れであるデータストリームから,有用な情報を少ない資源で効率よく取り出すためのストリームマイニング技術を概観する.まず,データストリームの特徴と,データマイニングの目的について整理し,限定された計算資源を用いて無限に続くデータストリームからマイニングを行うためのシステムに要求される性質について議論する.次に,さまざまな要約データ構造とオンライン化技術についてまとめ,最近の研究のうち特色のあるものを紹介する., 一般社団法人情報処理学会, Japanese
  • Algorithms for Mining Semistructured Data
    ASAI Tatsuya; ARIMURA Hiroki, The transactions of the Institute of Electronics, Information and Communication Engineers. D-I, 87, 2, 79, 96, Feb. 2004, [Peer-reviewed], [Invited]
    最近,ウェブページやXMLデータなどの半構造データに対するデークマイニングが注目を集めている.本論文では,半構造データマイニングにおける主要な技術の一つである,グラフ構造データからの部分構造パターン発見技法について,主に木パターンマイニングとグラフマイニングの観点から,近年の研究動向を紹介する.はじめに,半構造データマイニングの萌芽期の研究として,Subdue及び,GBI,Nestrovらのスキーマ発見,DehaspeらのWarmrを紹介する.次に,木パターンとグラフパターンのマイニングに関する現在の研究動向を概観する.木パターンの発見アルゴリズムとして,浅井らのFREQTとUNOT,ZakiのTreeMiner.安部らのOPTTなどを説明し,一般のグラフマイニングアルゴリズムとして,猪口らのAGMとYanらのgSpanなどを説明する.最後に,新しいデークマイニングの方向性を示すものとして,半構造データストリームからのマイニングアルゴリズムStreamTについて述べる., 一般社団法人電子情報通信学会, Japanese, Introduction scientific journal, 10555378;10555377
  • 計算学習理論における学習
    特集「機械学習,それが人に及ばざる理由」(今井むつみ 編),人工知能学会誌, Vol.18, No.5, 531, 536, 2003, [Invited]
    Japanese, Introduction scientific journal
  • .
    ARIMURA Hiroki, CICSJ Bulletin, 21, 2, 28, 28, 2003
    ., Division of Chemical Information and Computer Sciences The Chemical Society of Japan, Japanese
  • Optimized Pattern Discovery for Text Mining(Data/Text Mining)
    Arimura Hiroki; Sakamoto Hiroshi, 応用数理, 12, 4, 366, 378, 25 Dec. 2002
    The Japan Society for Industrial and Applied Mathematics, Japanese
  • Web Data Mining(Special Issue on Data Mining)
    IKEDA Daisuke; SAKAMOTO Hiroshi; ARIMURA Hiroki, Systems, control and information, 46, 4, 177, 183, 15 Apr. 2002
    Institute of Systems, Control and Information Engineers, Japanese
  • 帰納論理プログラミングと証明補完
    山本章博; 有村博紀; 平田耕一, bit別冊 「発見科学とデータマイニング」, 第4章, 共立出版, 01 Jun. 2001, [Domestic magazines]
    共立出版, Japanese, Introduction other
  • Web Mining(Special Issue : "Text Mining")
    Sakamoto Hiroshi; Arimura Hiroki, Journal of Japanese Society for Artificial Intelligence, 16, 2, 233, 238, 01 Mar. 2001
    The Japanese Society for Artificial Intelligence, Japanese
  • Base Technology for Text Mining(Special Issue : "Text Mining")
    Nasukawa Tetsuya; Kawano Hiroyuki; Arimura Hiroki, Journal of Japanese Society for Artificial Intelligence, 16, 2, 201, 211, 01 Mar. 2001
    The Japanese Society for Artificial Intelligence, Japanese
  • Algorithmic Learning Theory, 11th International Conference, ALT 2000 Sydney, Australia, December 11-13, 2000 Proceedings
    Hiroki Arimura; Sanjay Jain; Arun Sharma, Lecture Notes in Artificial Intelligence (LNAI), 1968, 2000
    Springer Berlin Heidelberg, English, Meeting report
  • 一階論理式の学習と帰納論理プログラミング (<特集>計算学習理論の進展と応用可能性)
    有村博紀; 平田耕一, 人工知能, 14, 5, 790, 799, Sep. 1999, [Invited], [Corresponding author]
    Japanese, Introduction scientific journal
  • 九州大学大学院システム情報科学研究科情報理学専攻
    有川 節夫; 有村 博紀, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 12, 2, 332, 333, 01 Mar. 1997
    Japanese
  • ALT'96報告
    杉本 典子; 平田 耕一; 有村 博紀, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 12, 2, 334, 337, 01 Mar. 1997, [Invited]
    Japanese, Introduction scientific journal
  • Assessment Report of Doctoral Theses ("Inductive Inference from Positive Data Based on Generalization")
    Hiroki Arimura, Engineering sciences reports, Kyushu University, 16, 3, 359, 368, 01 Dec. 1994, [Invited]
    九州大学大学院総合理工学研究科, Japanese, Report research institution
  • Inductive inference from positive data based on generalization
    Hiroki Arimura, Ph.D Thesis, Kyushu University, 27 Jun. 1994, [Lead author]
    English, Others
  • 第2回マシンインテリジェンスに関する国際ワークショップ(International Workshop on Machine Intelligence 1993)の報告
    沼尾 正行; 有村 博紀; 原口 誠; 平田 耕一; 斉藤 和巳; 諏訪 正樹; 月本 洋; 元田 浩, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 9, 2, 318, 320, 01 Mar. 1994
    Japanese
  • 九州工業大学情報工学部知能情報工学科大槻研究室
    有村 博紀, 人工知能学会誌, 6, 5, 739, 739, 01 Sep. 1991
    社団法人人工知能学会, Japanese
■ Books and other publications
  • Encyclopedia of Natural Language Processing (In Japanese)
    Hiroki Arimura, "Information extraction from semi-structured texts"
    The Kyoritsu Co.,Ltd, Dec. 2009, [Contributor]
  • Computational Challenges of Massive Data Sets and Randomness in Computation, Special Issue on the First and Second Japanese-German Frontiers of Science Symposia
    J. Rothe; H. Arimura, editors
    Journal of Universal Computer Science, Vol. 12, issue 6, 579-761. doi: 10.3217/jucs-012-06, Jul. 2006
  • JSAI Encyclopedia of Aritificial Intelligence (In Japanese)
    Hiroki Arimura, Text mining
    The Kyoritsu Co.,Ltd, Dec. 2005, [Contributor]
  • Proceedings of the 11th International Conference on Algorithmic Learning Theory
    Hiroki Arimura
    Springer-Verlag, Oct. 2000, [Editor]
■ Lectures, oral presentations, etc.
  • Exact Algorithms for Learning Minimal Consistent Boolean Formulas from Examples
    Kawata Yusuke; Hiroki Arimura
    The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026 (to appear), Jun. 2026, The Japanese Society for Artificial Intelligence, Japanese, Poster presentation
    08 Jun. 2026 - 12 Jun. 2026, Gumma, Japan, Abstract: From the viewpoint of explainability in machine learning, we consider the proper PAC learnability of a subclass of $n$-variable Boolean functions, the family of propositional formulas of bounded size $t$. We focus on the computational complexity of the Minimum Consistent Hypothesis problem a combinatorial optimization problem that is closely related to the learnability of this class, in the framework of parameterized complexity. First, we demonstrate that when parameterized by the formula size $t$, the problem admits an XP algorithm with $O^*(n^t)$ running time via exhaustive search, where $O^*$ denotes the big-Oh notation ignoring polynomial factors, but is fixed-parameter intractable as it is shown to be W[2]-hard. Furthermore, we show that when parameterized by the total number of positive and negative examples $m$, the problem can be solved in $O^*(3^m)$ time, establishing its membership in FPT. This latter FPT algorithm introduces a novel learning paradigm based on a propositional formula size game by Razborov (Combinatorical, Vol.10, No.1, 1990), which has been traditionally used to show lower bounds. This approach may be of particular interest as an efficient solution for scenarios with a limited number of training examples.

    Links:
    * A poster on Zenodo repository
    * A preprint on Zenodo repository, [Domestic Conference]
  • 極大反復列の列挙:既存手法の 分類と新規アルゴリズムの提案 (共同研究 with 有村博紀 , 稲永俊介)
    鶴園 悠太
    STRセミナー2025, 25 Sep. 2025, Japanese, Oral presentation
    24 Sep. 2025 - 25 Sep. 2025
  • 代数的決定木計数の定式化と, アルゴリズム, 計算量について
    機械学習とデータ科学研究会, 12 Sep. 2025, 広島市立大学, Japanese, Oral presentation
    11 Sep. 2025 - 12 Sep. 2025, 広島市立大学 情報科学部棟別館 6 階 交流ラウンジ, Japan, 35135885, [Domestic Conference]
  • Simple and Efficient Dynamic and Persistent Data Structures for Dynamic Strings Based on Balanced Search Trees
    Taiki Kaneda; Yasuaki Kobayashi; Hiroki Arimura
    The 25th Korea–Japan Joint Workshop on Algorithms and Computation (WAAC 2025), 19 Aug. 2025, the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan; the Special Interest Group on Algorithms (SIGAL) of the Korean Institute of Information Scientists and Engineers (KIISE), English, Oral presentation
    19 Aug. 2025 - 20 Aug. 2025, Sapporo, Japan, 35135762, [International presentation]
  • ルールリストの族に対する線形整数計画法に基づくモデル更新手法
    佐々木耀一; 岡嶋 穣; 有村博紀
    情報論的学習理論と機械学習研究会(IBISML), 20 Dec. 2024, 電子情報通信学会, Japanese, Oral presentation
    20 Dec. 2024 - 21 Dec. 2024, 北海道大学大学院環境科学院,札幌市,北海道, Japan, [Domestic Conference]
  • Enumerating Maximal Repeats in a Text Based on Bidirectional Indices
    Yuta Tsuruzono; Hiroki Arimura; Shunsuke Inenaga
    The 24th Korea–Japan Joint Workshop on Algorithms and Computation (WAAC 2024), 02 Aug. 2024, Special Interest Group on Algorithms (SIGAL) of the Korean Institute of Information Scientists and Engineers (KIISE) and the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan (IPSJ), English, Oral presentation
    02 Aug. 2024 - 03 Aug. 2024, Sungshin Women's University, Seoul, Korea, Korea, Republic of, 35135885;35135762, [International presentation]
  • Finding Diverse Strings and Longest Common Subsequences in a Graph
    有村 博紀
    Forest Workshop 2024, Sapporo (talk only), 31 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表:本村真人(東工大))機械学習グループ, 信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」(代表:高前田伸也(東大)), Japanese, Nominated symposium
    29 Mar. 2024 - 31 Mar. 2024, TKPガーデンシティPREMIUM札幌大通カンファレンスルーム,札幌市, 35136410
  • 多様な最適決定木の集合を発見する近似アルゴリズム
    吉岡 和希
    フォレストワークショップ2024, 30 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表:本村真人(東工大))機械学習グループ 信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」(代表:高前田伸也(東大)), Japanese, Poster presentation
    29 Mar. 2024 - 31 Mar. 2024, TKPガーデンシティPREMIUM札幌大通カンファレンスルーム,札幌市, 35136410, [Domestic Conference]
  • テキスト中の極大反復文字列の効率良い列挙アルゴリズム
    鶴園 悠太
    フォォレストワークショップ2024, 30 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表:本村真人(東工大))機械学習グループ, 信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」(代表:高前田伸也(東大)), Japanese, Poster presentation
    29 Mar. 2024 - 30 Mar. 2024, 札幌市, Japan, 35136410, [Domestic Conference]
  • 文字列集合に対する多様な最長共通部分列の発見
    志田祐仁; 有村博紀; 小林靖明
    電子情報通信学会技術研究報告(Web), 信学技報, vol. 123, no. 325, COMP2023-23, pp. 45-52, 2023年12月, 22 Dec. 2023, Japanese, Oral presentation
    22 Dec. 2023 - 22 Dec. 2023, 宮崎大学 まちなかキャンパス, [Domestic Conference], [Internationally co-authored]
  • 多様な解の発見問題のAIと大規模データ解析への展開
    有村 博紀
    AFSA 2023年度第2回領域集会, 23 Oct. 2023, 文部科学省 科学研究費補助金 学術変革領域(A)「社会変革の源泉となる革新的アルゴリズム基盤の創出と体系化」,2020〜2024年度, Japanese, Nominated symposium
    22 Oct. 2023 - 24 Oct. 2023, 熱海ニューフジヤホテル,静岡県熱海市, Japan, [Domestic Conference]
  • Dynamic Programming for Optimal Decision Trees
    Hiroki Arimura
    AFSA A02-group Seminar, 11 Jul. 2023, A02班, 文部科学省 科学研究費補助金 学術変革領域(A)「社会変革の源泉となる革新的アルゴリズム基盤の創出と体系化」, Japanese, Public discourse
    11 Jul. 2023 - 11 Jul. 2023, 九州大学 情報基盤研究開発センター,伊都キャンパス,福岡市, Japan, 35135885, [Invited], [Domestic Conference]
  • Minimum Consistent Subset for Trees Revisited
    Hiroki Arimura; Tatsuya Gima; Yasuaki Kobayashi; Hiroomi Nochide; Yota Otachi
    The 23rd Japan–Korea Joint Workshop on Algorithms and Computation (WAAC 2023),, the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan and the Special Interest Group on Theoretical Computer Science (SIGTCS) of the Korean Institute of Information Scientists and Engineers (KIISE)., English, Oral presentation
    24 Jun. 2023 - 25 Jun. 2023, June 24–25, 2023 Nagoya University, Nagoya, Japan, Japan, [International presentation]
  • Efficient Computation of the Run-length Encoded Burrows-Wheeler Transform Based on the Compact Directed Acyclic Word Graph
    Mizuki Sue; Yasuaki Kobayashi; Hiroki Arimura; Yuto Nakashima; Shunsuke Inenaga
    電子情報通信学会技術研究報告(Web), 信学技報, vol.122, no.294, COMP2022-25, pp.21-28(COMP), 06 Dec. 2022, Japanese, Oral presentation
    06 Dec. 2022 - 06 Dec. 2022, 愛媛大メディアホール, [Domestic Conference]
  • コンパクト非巡回語グラフに基づく連長圧縮Burrows-Wheeler変換の効率良い構築(スライド)
    須江瑞樹; 小林靖明; 有村博紀; 中島祐人; 稲永俊介
    電子情報通信学会技術研究報告;信学技報;OM;, 06 Dec. 2022, 電子情報通信学会, Japanese, Oral presentation
    愛媛大メディアホール, Japan, 35135762;35135885, [Domestic Conference]
  • Efficient Construction of Multi-Dimensional Contingency Tables for Small and Accurate Decision Trees
    Shota SHIRAI; Hiroki ARIMURA
    The 25th Information-Based Induction Sciences Workshop, 21 Nov. 2022, IEICE, Japanese, Poster presentation
    20 Nov. 2022 - 23 Nov. 2022, Tsubaka International Conference Hall, Tsukuba, Japan, Japan, [Domestic Conference]
  • 人工知能とビッグデータ応用における離散アルゴリズム
    有村 博紀
    2023年度 第2回領域集会, 学術変革(A)「社会変革アルゴリズム基盤」(AFSA), 23 Oct. 2022, 科研費 学術変革(A)「社会変革アルゴリズム基盤」(AFSA), Japanese, Poster presentation
    22 Oct. 2022 - 24 Oct. 2022, 熱海ニューフジヤホテル, 静岡県熱海市, 35135885, [Domestic Conference]
  • Discovering Closed Patterns over Relational Databases with Generalization Hierarchy
    Hiroki Arimura; Naoki Taguchi; Ye Wang
    JSAI 2022 Annual Meeting, JSAI, Japanese, Poster presentation
    14 Jun. 2022 - 17 Jun. 2022, 国立京都国際会館, オンライン, Japan, [Domestic Conference]
  • デカルト木部分列照合問題の高速なアルゴリズム
    大泉 翼; 有村博紀
    アルゴリズム(AL)研究会, 2022-AL-186, Vol.3, pp.1-8, 20 Jan. 2022, 情報処理学会, Japanese, Oral presentation
    20 Jan. 2022 - 20 Jan. 2022, オンライン, Japan, [Domestic Conference]
  • 半順序集合の弱埋め込み問題に対するパラメータ化アルゴリズム
    宮崎怜子; 有村博紀; 小林靖明
    電子情報通信学会技術研究報告(Web), 06 Dec. 2022, 電子情報通信学会, Japanese, Oral presentation
    2022 - 2022, 愛媛大, Japan, 35135762;35135885, [Domestic Conference]
  • Explainable Machine Learning for Trustworthy AI
    Hiroki Arimura
    HU-DUT Workshop for Big data and AI, Hokkaido University, 17 Dec. 2021, English, Invited oral presentation
    17 Dec. 2021 - 17 Dec. 2021, [Invited]
  • ルールリストに対するRashomon集合の厳密計算と予測多重性解析
    又 康太; 金森 憲太朗; 有村 博紀
    第24回情報論的学習理論ワークショップ (IBIS 2021), 一般セッション, 2-3 モデル解釈・検証, 106, 10 Nov. 2021, Japanese, Oral presentation
    10 Nov. 2021 - 13 Nov. 2021, オンライン, Japan
  • デカルト木照合の部分系列への拡張
    加井丈志; 光吉健汰; 古谷 勇; 有村博紀
    コンピュテーション(COMP)研究会, COMP2021-6, May 2021, 電子情報通信学会, Japanese, Oral presentation
    07 May 2021 - 08 May 2021, オンライン, Japan, [Domestic Conference]
  • Fast Enumeration Algorithms for Feature Selection Based on Consistency Measure
    Wang Ye; Hiroki Arimura
    SIG-FPAI meeting, 116, JSAI, 21 Mar. 2021, 人工知能学会, Japanese, Oral presentation
    オンライン, Japan, 35135762, [Domestic Conference]
  • Variable Importance Cloudの要約方法と決定木に対する実験的評価
    又 康太; 金森 憲太朗; 有村 博紀
    第23回情報論的学習理論ワークショップ (IBIS 2020), 電子情報通信学会, Japanese, Oral presentation
    23 Nov. 2020 - 26 Nov. 2020, オンライン, [Domestic Conference]
  • 決定木要約の効率良い構築法 -- 説明可能な人工知能の実現に向けて --
    有村 博紀; 金森 憲太朗; 王 叶
    2020年度人工知能学会全国大会 (JSAI2020), 3E1-GS-2-05, 人工知能学会, Japanese, Oral presentation
    04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]
  • 混合整数線形計画法に基づく実現可能性を考慮した反事実的説明法
    金森 憲太朗; 高木 拓也; 小林 健; 有村 博紀
    2020年度人工知能学会全国大会 (JSAI2020), インタラクティブセッション, 3Rin4-44, 人工知能学会,, Japanese, Oral presentation
    04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]
  • 決定木アンサンブルにおける出現頻度比に基づく変数重要度
    又 康太; 金森 憲太朗; 有村 博紀
    2020年度人工知能学会全国大会 (JSAI2020) , インタラクティブセッション, 4Rin1-26, 人工知能学会, Japanese, Oral presentation
    04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]
  • 説明可能な機械学習のための拡張シャープレイ値の厳密指数時間アルゴリズム
    大泉翼; 有村 博紀
    第12回データ工学と情報マネジメントに関するフォーラム (DEIM2020), インタラクティブセッション, P2-23, 電子情報通信学会DE研究専門委員会,日本データベース学会,情報処理学会DBS研究会, Japanese, Oral presentation
    02 Mar. 2020 - 04 Mar. 2020, オンライン, Japan, [Domestic Conference]
  • Efficient Constrained Pattern Mining Using Dynamic Item Ordering for Explainable Classification
    IWASHITA Hiroaki; TAKAGI Takuya; SUZUKI Hirofumi; GOTO Keisuke; OHORI Kotaro; ARIMURA Hiroki
    JSAI Technical Report, SIG-FPAI, 111 8-8, Jan. 2020, Japanese, Oral presentation
  • Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
    金森憲太朗; 高木拓也; 小林健
    第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 1-069, 電子情報通信学会, Japanese, Oral presentation
    20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]
  • 説明可能な機械学習に向けて:整数計画法と列挙に基づく最適決定木の厳密学習アルゴリズムの実験的比較
    王 叶; 有村 博紀
    第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 2-089, 電子情報通信学会, Japanese, Oral presentation
    20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]
  • 決定木アンサンブル予測器の効率的ハードウェア実装のための簡約化に関する研究
    松田 祐汰; 瀧川 一学; 有村 博紀
    第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 2-088, 電子情報通信学会, Japanese, Oral presentation
    20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]
  • Constant Amortized Time Enumeration of Independent Sets for Graphs with Bounded Clique Number
    栗田 和宏; 和佐 州洋; 宇野 毅明; 有村 博紀
    電子情報通信学会コンピューテーション研究会 119(249) COMP2019 18-28, Nov. 2019, Oral presentation
  • Fairness-aware Edit of Thresholds in a Learned Decision Tree Using a Mixed Integer Programming Formulation
    金森 憲太朗; 有村 博紀
    2019年度人工知能学会全国大会 (JSAI2019), 人工知能学会, インタラクティブセッション, 3Rin2-11, 06 Jun. 2019, 人工知能学会, English, Oral presentation
    朱鷺メッセ, 新潟市, Japan, [Domestic Conference]
  • イベント系列からの有意なエピソードの効率良いマイニング
    谷 陽太; 平田 耕一; 有村 博紀
    第11回データ工学と情報マネジメントに関するフォーラム (DEIM2019), H1-4, 電子情報通信学会DE研究専門委員会,日本データベース学会,情報処理学会DBS研究会, Japanese, Oral presentation
    03 Mar. 2019 - 04 Mar. 2019, ホテルオークラJRハウステンボス, 佐世保市, Japan, [Domestic Conference]
  • An Efficient Algorithm for Enumerating Chordal Bipartite Induced Subgraphs in Graphs
    栗田 和宏; 和佐 州洋; 宇野 毅明; 有村 博紀
    第108回人工知能基本問題研究会(SIG-FPAI), SIG-FPAI-B802-02, 6-11, 人工知能学会, Japanese, Oral presentation
    29 Jan. 2019 - 30 Jan. 2019, 大阪府立大学 I-siteなんば, 大阪市, 受賞:研究会優秀賞, 金森 憲太朗, 有村 博紀, 2018年度,人工知能学会,2019年6月27日
  • 大きな正規表現に対する系列二分決定グラフを用いた効率よい照合手法
    瀧澤涼介; 喜田拓也; 有村博紀; 瀧川一学
    電子情報通信学会技術研究報告, 2019
    2019 - 2019
  • Subgraph Enumeration: Efficient Algorithms and Empirical Studies
    Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
    The 2nd International Workshop on Enumeration Problems & Application (WEPA2018), 06 Nov. 2018, English, Oral presentation
    06 Nov. 2018 - 06 Nov. 2018, University of Pisa, Italy, [International presentation], [Internationally co-authored]
  • Efficient Enumeration Algorithm for Dominating Sets in Bounded Degenerate Graphs (Foundations and Applications of Algorithms and Computation)
    Kurita Kazuhiro; Wasa Kunihiro; Arimura Hiroki; Uno Takeaki
    数理解析研究所講究録, Aug. 2018, 京都大学数理解析研究所, English
    Aug. 2018 - Aug. 2018, Dominating sets are fundamental graph structures. However, enumeration of dominating sets has not received much attention. This study aims to propose an efficient enumeration algorithms for bounded degenerate graphs. The algorithm enumerates all the dominating sets for k-degenerate graphs in O(k) time per solution using O(n+m) space. Since planar graphs have a constant degeneracy, this algorithm can enumerate all such sets for planar graphs in constant time per solution.
  • モデル選択のためのサポートベクトル列挙
    金森 憲太朗; 原 聡; 石畠 正和; 有村 博紀
    第29回IBISML研究会, 情報論的学習理論と機械学習研究会, 電子情報通信学会 情報・システムソサイエティ, Japanese, Oral presentation
    13 Jun. 2018 - 15 Jun. 2018, 沖縄科学技術大学院大学(OIST), Japan, [Domestic Conference]
  • グラフ断片決定木を用いたグラフ特徴抽出手法
    坂上 陽規; 瀧川 一学; 有村 博紀
    人工知能学会2018年全国大会(JSAI2018),インタラクティブセッション, 3Pin1-10, 人工知能学会, Japanese, Oral presentation
    05 Jun. 2018 - 08 Jun. 2018, 城山観光ホテル、鹿児島県鹿児島市, Japan, [Domestic Conference]
  • イベント系列からの有意性を考慮した菱形エピソードマイニング
    谷 陽太; 古谷 勇; 平田 耕一; 有村 博紀
    人工知能学会2018全国大会(JSAI2018), インタラクティブセッション, 3Pin1-10, 人工知能学会, Japanese, Oral presentation
    05 Jun. 2018 - 08 Jun. 2018, 城山観光ホテル、鹿児島県鹿児島市, Japan, [Domestic Conference]
  • グラフに含まれる大きな内周を持つ部分グラフの効率良い列挙
    栗田 和宏; Alessio Conte; 和佐 州洋; 宇野 毅明; 有村 博紀
    情報処理学会第80回全国大会講演論文集, Vol.80, No.1, 1.347-1.348, 13 Mar. 2018, 情報処理学会, Japanese
    13 Mar. 2018 - 13 Mar. 2018, 内周はグラフに含まれる最小の閉路長を表し,グラフの内周を計算する問題はよく研究されている.ItaiとRodehは一般グラフの内周を計算する非自明な最初のアルゴリズムを開発した.このアルゴリズムは最悪時O(nm)時間で動作し,平均時間O(n^2)時間で動作する.重みなし平面グラフに対しては,DjidjevがO(n^{5/4} log n)時間アルゴリズムを与え,Changらがこのアルゴリズムを改良し線形時間アルゴリズムを与えた.我々の知る限りでは大きな内周を持つ連結な部分グラフを列挙するアルゴリズムは存在しない.本論文では,これらの問題を解く列挙アルゴリズムを与える.このアルゴリズムの単純な拡張により,k以上の内周を持つ重み付きグラフの部分グラフや非連結なグラフの列挙も可能である., [Domestic Conference]
  • 順序決定木に対する正則化パラメータ推定の高速化
    金森 憲太朗; 石畠 正和; 湊 真一; 有村 博紀
    第105回人工知能基本問題研究会(SIG-FPAI),SIG-FPAI-B508, 人工知能学会, Japanese, Oral presentation
    28 Jan. 2018 - 29 Jan. 2018, 石垣島大濱信泉記念館, 沖縄県石垣市, Japan, [Domestic Conference]
  • 決定化されたグラフパターントライの学習アルゴリズム
    坂上 陽規; 栗田 和宏; 瀧川 一学; 有村 博紀
    第105回人工知能基本問題研究会(SIG-FPAI),SIG-FPAI-B508, 人工知能学会, Japanese, Oral presentation
    28 Jan. 2018 - 29 Jan. 2018, 石垣島大濱信泉記念館, 沖縄県石垣市, Japan, [Domestic Conference]
  • Efficient Enumeration of Dominating Sets in k-degenerate Graphs (情報セキュリティ)
    栗田 和宏; 和佐 州洋; 宇野 毅明; 有村 博紀
    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 21 Dec. 2017, 電子情報通信学会, English
    21 Dec. 2017 - 21 Dec. 2017, [Domestic Conference]
  • 〔Major achievements〕Optimization and Enumeration of Decision Trees from Massive Data Sets (talk)
    Hiroki Arimura; Kazuhito Osabe; Takeaki Uno
    The 23st Conference of the International Federation of Operational Research Societies (IFORS 2017), 17 Jul. 2017, International Federation of Operational Research Societies, IFORS, English, Invited oral presentation
    17 Jul. 2017 - 21 Jul. 2017, Québec City Convention Centre, Québec, Canada, Canada, In this talk, we study optimization and counting of complex rules in data mining/machine learning based on enumeration. In particular, we consider exact algorithms using small space (of input polynomial) for the task of mining a class of small and accurate decision trees in a labeled data set., 11763074, [Invited], [International presentation]
  • Fast Discovery of Optimal Ordered Decision Tree Based on Item Set Enumeration
    長部 和仁; 宇野 毅明; 有村 博紀
    人工知能基本問題研究会, 12 Dec. 2016, 人工知能学会, Japanese
    12 Dec. 2016 - 12 Dec. 2016
  • グラフに含まれる誘導マッチングの列挙
    栗田和宏; 和佐州洋; 喜田拓也; 有村博紀
    情報処理学会研究報告(Web), 2016
    2016 - 2016
  • トラジェクトリデータに対する効率良い近似パターン照合アルゴリズム (コンピュテーション)
    笹川 裕人; 有村 博紀
    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
    12 Jun. 2015 - 12 Jun. 2015
  • Extension of Ukkonen's Online Suffix Tree Algorithm to Multi-Stream Texts
    高木 拓也; 有村 博紀
    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
    12 Jun. 2015 - 12 Jun. 2015
  • Reconfiguration Problem for Maximal Induced Trees
    和佐 州洋; 山中 克久; 有村 博紀
    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
    12 Jun. 2015 - 12 Jun. 2015, [Domestic Conference]
  • Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching
    佐々木耀一; 渋谷哲朗; 伊藤公人; 有村博紀
    電子情報通信学会技術研究報告, 2015
    2015 - 2015
  • Secure Pattern Matching Using Homomorphic Encryption for Extended String Patterns
    原田 弘毅; 笹川 裕人; 有村 博紀; 佐久間 淳
    コンピュータセキュリティシンポジウム2013論文集, 14 Oct. 2013, Japanese
    14 Oct. 2013 - 14 Oct. 2013
  • Event Clustering Method based on Linear HMM for Geotagged Personal Contents
    OHAMA Iku; KIDA Takuya; ARIMURA Hiroki; ABE Toshihisa
    IEICE technical report, 21 Mar. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
    21 Mar. 2011 - 21 Mar. 2011, We discuss the geographical clustering problem on human activity logs. In this paper, we formulate the problem as a segmentation problem for 2-dimensional sequence data, and propose two clustering methods, named LS-linHMM and X-linHMM. LS-linHMM is based on linear constrained HMMs which estimate the optimal number of clusters by model selection with an information criterion. X-linHMM is a hierarchical clustering method which is based on the idea of x-means and use 2-state linear constrained HMMs. It runs faster than LS-linHMM in many cases. In this time, we apply these clustering methods into the clustering problem for geo-tagged personal picture contents. Our experimental results using real data showed that the proposed methods are more effective in clustering than the simple x-means.
  • D-4-2 An Efficient Stream Pattern Matching Algorithm for Tree Regular Expressions
    Fujikane Yasuyuki; Kaneta Yusaku; Arimura Hiroki
    Proceedings of the IEICE General Conference, 28 Feb. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
    28 Feb. 2011 - 28 Feb. 2011
  • TK-7-1 Center for Next-Generation Information Technology based on Knowledge Discovery and Knowledge Federation
    有村 博紀
    Proceedings of the IEICE General Conference, 28 Feb. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
    28 Feb. 2011 - 28 Feb. 2011
  • An EM algorithm for inferring geographic transmission probability tables from a large phylogenetic tree (特集 「ベイジアン・ネットワーク」および一般)
    柳橋 史成; 伊藤 公人; 有村 博紀
    人工知能基本問題研究会, 17 Nov. 2010, 人工知能学会, English
    17 Nov. 2010 - 17 Nov. 2010
  • Extraction of frequent bipartite episodes from bacterial culture data
    河東 孝; 有村 博紀; 平田 耕一
    人工知能基本問題研究会, 24 Sep. 2010, 人工知能学会, Japanese
    24 Sep. 2010 - 24 Sep. 2010
  • Efficient Pattern Matching for Acyclic Regular Expressions
    金田 悠作; 湊 真一; 有村 博紀
    情報処理学会研究報告, Jun. 2010, 情報処理学会, Japanese
    Jun. 2010 - Jun. 2010
  • Efficient Pattern Matching for Acyclic Regular Expressions
    KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki
    IEICE technical report. Theoretical foundations of Computing, 12 May 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    12 May 2010 - 12 May 2010, A regular expression is acyclic if it is over the basis in Σ, dot "・", and union "|". In this paper, for the subclass of acyclic regular expressions, we give an efficient algorithm that solves the regular expression matching problem for an acyclic regular expression of length m and depth d and an input text of length n in O(nmd/w) time using O(md) preprocessing and O(md/w) space in words on unit-cost RAM model with word length w. We introduce new bit-parallel techniques, called scatter and gather operations to simulate Thompson NFA for a given regular expression, and naturally extend SHIFT-AND approach by Wu and Manber (CACM 35(10), 1992) to the regular expression matching problem. For an acyclic regular expression with unbounded depth d=O(1), our approach is faster than Bille's approach for any regular expression keeping O(m/w) space.
  • D-1-7 AN EFFICIENT REGULAR EXPRESSION MATCHING ALGORITHM BASED PARALLEL BIT-DISTRIBUTION
    Kaneta Yusaku; Minato Shin-ichi; Arimura Hiroki
    Proceedings of the IEICE General Conference, 02 Mar. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    02 Mar. 2010 - 02 Mar. 2010
  • An efficient algorithm for extracting frequent partite episodes with minimum occurrences
    河東 孝; 有村 博紀; 平田 耕一
    人工知能基本問題研究会, 27 Jan. 2010, 人工知能学会, Japanese
    27 Jan. 2010 - 27 Jan. 2010
  • An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
    KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    Technical report of IEICE. VLD, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.
  • Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
    KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    Technical report of IEICE. VLD, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array (FPGA) can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.
  • An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
    KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    IEICE technical report. Computer systems, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.
  • Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
    KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    IEICE technical report. Computer systems, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array (FPGA) can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.
  • An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
    KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    IEICE technical report, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.
  • Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
    KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    IEICE technical report, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array(FPGA)can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.
  • Efficient Mining Algorithms for Partite Episodes
    河東 孝; 有村 博紀; 平田 耕一
    人工知能学会全国大会論文集, 2010, 人工知能学会, Japanese
    2010 - 2010
  • D-4-18 A Fast String Matching Algorithm and Its FPGA Design for High-Speed Stream Processing
    KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
    Proceedings of the IEICE General Conference, 04 Mar. 2009, The Institute of Electronics, Information and Communication Engineers, Japanese
    04 Mar. 2009 - 04 Mar. 2009
  • A Polynomial-Delay Polynomial-Space Algorithm for Extracting Frequent Diamond Episodes from Event Sequences
    河東 孝; 有村 博紀; 平田 耕一
    論文集, 2009, 人工知能学会, Japanese
    2009 - 2009
  • XQUBE : A Visual Language for XQuery Construction from Interaction and Examples
    TSUTSUI Junpei; ARIMURA Hiroki
    IPSJ SIG Notes, 14 Sep. 2008, Information Processing Society of Japan (IPSJ), Japanese
    14 Sep. 2008 - 14 Sep. 2008, This paper studies visual languages for interactive construction and editting of XQuery. We present a tree pattern-like representation TQ by extending Braga et al.'s XQBE (TODS, 30:2, 2005), and then give its visual language counterpart XQUBE. Furthermore, we show that these two representations are equivalent to a restricted fragment of XQuery called TQ-XQuery that has for, child-axis paths, concatenation constructions, but does not have let, if/where constructions, and arbitrary compsition.
  • Efficent Serial Episode Mining with Minimal Occurrences
    Ohtani Hideyuki; Kida Takuya; Uno Takeaki; Arimura Hiroki
    IPSJ SIG Notes, 12 Jun. 2008, Information Processing Society of Japan (IPSJ), Japanese
    12 Jun. 2008 - 12 Jun. 2008, Recently, knowledge discovery in large data increases its importance in various fields. Especially, data mining from time-series data gains much attension. This paper studies the problem of finding frequent episodes appearing in a sequence of events. We propose an efficient depth-first-search algorithm for mining frequent serial episodes in a given event sequence using the notion of right-minimal occurrences. Then, we present some techniques for speeding up the algorithm. Finally, we ran experiments on real datasets to evaluate the usefulness of the proposed methods.
  • Learning Walks from Graphs
    Junpei TSUTSUI; Hiroki ARIMURA
    Technical Committee on Theoretical Foundations of Computing (COMP), 18 Apr. 2008, The Institute of Electronics, Information and Communication Engineers (IEICE), Japan, Japanese, Oral presentation
    18 Apr. 2008 - 18 Apr. 2008, Osaka Prefecture University, Nakamozu, Osaka, Japan, Abstract In this paper, we study the problem of learning an unknown label sequence, called a walk, that is embedded in a collection of vertex-labeled graphs. We present a polynomial time learning algorithm that learns all walks π∗ from one positive example and using O(m + n) membership queries, where m = |π∗| is the size of the walk and n is the size of the positive example, respectively. Based on prediction preserving reduction, we also show that the prediction problem for the class of walks from positive and negative examples of graphs under both of vertex-non-overlapping and edge-non-overlapping embeddings are as hard as the prediction problem of DNF (disjunctive normal form formulas) from positive and negative examples.
    Key words: graph inference, query learning model, membership queries, polynomial time learning, positive examples, [Domestic Conference]
  • 4ZK-7 Keyword Extraction for Browsing Support
    Uemura Takashi; KIDA Takuya; ARIMURA Hiroki
    全国大会講演論文集, 13 Mar. 2008, Information Processing Society of Japan (IPSJ), Japanese
    13 Mar. 2008 - 13 Mar. 2008
  • Fast Algorithms for Finding Correlated Mutations from Nucleic / Amino Acid Sequences
    ITO Kimihito; TANIGUCHI Tsuyoshi; MURAKAMI Teiji; ARIMURA Hiroki; HARAGUCHI Makoto
    IPSJ SIG technical reports, 03 Mar. 2008, Information Processing Society of Japan (IPSJ), Japanese
    03 Mar. 2008 - 03 Mar. 2008, We present new algorithms to find the sets of positions involved in correlated mutations from amino/nucleic acid sequences. In this paper, we show that 1) use of the conditional entropy H(i|j) is overall effective in detecting the sets of positions involved in correlated mutations, 2) a natural extension of H(i|j) to the conditional entropy H(i_1…i_m|j) allows us to detect the sets of more than two positions involved in correlated mutations, 3) pruning rules derived from the monotonicity and closure of positions under logical equivalence boost the running performance in terms of computational speed remarkably.
  • D-020 Property Suffix Tree : An Efficient Data Structure for Stream Data with Meta Data
    Uemura Takashi; Kida Takuya; Arimura Hiroki
    情報科学技術フォーラム一般講演論文集, 22 Aug. 2007, Forum on Information Technology, Japanese
    22 Aug. 2007 - 22 Aug. 2007
  • D-019 Massive Continuous Stream Pattern Matching Based on Bit-Parallel Method
    Saito Tomoya; Kida Takuya; Arimura Hiroki
    情報科学技術フォーラム一般講演論文集, 22 Aug. 2007, Forum on Information Technology, Japanese
    22 Aug. 2007 - 22 Aug. 2007
  • Efficient maximal pattern discovery from massive geometric graphs
    有村 博紀; 宇野 毅明; 下菌 真一
    人工知能基本問題研究会, 13 Jul. 2007, 人工知能学会, Japanese
    13 Jul. 2007 - 13 Jul. 2007
  • Tane: a flexible information extraction browser with learning
    筒井 淳平; 金田 悠作; 有村 博紀
    人工知能基本問題研究会, 13 Jul. 2007, 人工知能学会, Japanese
    13 Jul. 2007 - 13 Jul. 2007
  • On an Efficient off-Line Construction of Property Suffix Trees
    UEMURA Takashi; KIDA Takuya; ARIMURA Hiroki
    IEICE technical report. Theoretical foundations of Computing, 19 Apr. 2007, The Institute of Electronics, Information and Communication Engineers, Japanese
    19 Apr. 2007 - 19 Apr. 2007, In some intelligent application of text retrieval, it is required to do a search just through particular parts of target text data with consideration for some properties that the text have. Namely, the texts are followed by additional information, and the each target part satisfies a certain condition on it. In this paper, we address the problem to construct an efficient index structure for this kind of search. In 2006, Amir et al. proposed a modified suffix tree for this problem, called the property suffix tree, which stores only substrings that belong to the given property. This algorithm constructs a suffix tree at first, and then pluning edges indicating substrings which are not included in the property. In this paper, we present an algorithm for finding all borders between nodes to be remained and nodes to be eliminated in the suffix tree.
  • Hardness Results on Local Multiple Alignment of Biological Sequences
    AKUTSU Tatsuya; ARIMURA Hiroki; SHIMOZONO Shinichi
    IPSJ SIG technical reports, 05 Mar. 2007, Information Processing Society of Japan (IPSJ), English
    05 Mar. 2007 - 05 Mar. 2007, This article studies the local multiple alignment problem, which is, given protein or DNA sequences, to locate a region (i.e., a substring) of fixed length from each sequence so that the score determined from the set of regions is optimized. We consider the following scoring schemes: the relative entropy score (i.e., average information content), the sum-of-pairs score and a relative entropy-like score introduced by Li et al. We prove that multiple local alignment is NP-hard under each of these scoring schemes. In particular, we prove that multiple local alignment is APX-hard under relative entropy scoring. It implies that unless P = NP there is no polynomial time algorithm whose worst case approximation error can be arbitrarily specified (precisely, a polynomial time approximation scheme). Scveral related theoretical results are also provided.
  • Efficient Discovery of Maximal Two-Dimensional Patterns with Don't-Cares(extended abstract) (テーマ:特集「ウェブデータの知的処理」および一般)
    有村 博紀; 宇野 毅明; 下薗 真一
    人工知能基本問題研究会, 08 Sep. 2006, 人工知能学会, English
    08 Sep. 2006 - 08 Sep. 2006
  • D_045 An Adaptive Algorithm for Dividing Large Sets of Strings to Efficiently Sort
    Asai Tatsuya; Okamoto Seishi; Arimura Hiroki
    情報科学技術フォーラム一般講演論文集, 21 Aug. 2006, Forum on Information Technology, Japanese
    21 Aug. 2006 - 21 Aug. 2006
  • 深さ優先探索に基づく変数制限つき極大モチーフの高速マイニング (テーマ:「データマイニングと統計数理」および一般)
    有村 博紀; 宇野 毅明
    知識ベ-スシステム研究会, 09 Mar. 2006, 人工知能学会, Japanese
    09 Mar. 2006 - 09 Mar. 2006
  • Efficient Method of Transaction Database Analysis Using Zero-Suppressed BDDs
    MINATO Shin-ichi; ARIMURA Hiroki
    The IEICE transactions on information and systems (Japanese edetion), 01 Feb. 2006, The Institute of Electronics, Information and Communication Engineers, Japanese
    01 Feb. 2006 - 01 Feb. 2006, 大規模なトランザクションデータを,計算機上でコンパクトに表現して効率的に処理することは,データマイニングにおける重要な基盤技術の一つである.本論文では,VLSI CADの分野で大規模論理関数データの表現法として広く用いられている二分決定グラフ(BDD: Binary Decision Diagrams)をデータマイニングの分野に応用する方法を提案する.BDDの中でも「ゼロサプレス型BDD」(ZBDD: Zero-suppressed BDD)と呼ばれるデータ構造は,大規模な組合せ集合データを非明示的に列挙し,効率良く演算処理することができる.このZBDDを用いて,トランザクションデータベースに関する種々の計算処理を行う手法について述べ,実験結果を示す.
  • A Polynomial Space Polynomial Delay Algorithm for Enumerating Maximal Motifs in a Sequence
    ARIMURA Hiroki; UNO Takeaki
    IEICE technical report. Theoretical foundations of Computing, 08 Sep. 2005, The Institute of Electronics, Information and Communication Engineers, English
    08 Sep. 2005 - 08 Sep. 2005, In this paper, we consider the problem of finding all maximal motifs in an input string for the class of repeated motifs with wild cards such as ABoBoBC. Since the proboem has various applications in bioinformatics, sequence mining, and data compression, output-sensitive enumeration of such maximal motifs has been extensively studied. However, no algorithms achieved polynomial space and polynomial delay so far. For the problem, we present a polynomial space polynomial delay algorithm for the maximal motif enumeration problem for the repeated motifs with wild cards. We also show exponential lower bound and succinctness results on the number of maximal motifs, which indicate the limit of a straightforward approach.
  • 大規模系列データから代表的な頻出エピソードを発見するための効率よいアルゴリズム (特集「人工知能における論理の新たな展開」)
    有村 博紀; 宇野 毅明
    人工知能基本問題研究会, 04 Nov. 2004, 人工知能学会, Japanese
    04 Nov. 2004 - 04 Nov. 2004
  • An Effcient Algorithm for Mining Frequent Closed Sequential Episodes
    ARIMURA Hiroki; UNO Takeaki
    IPSJ SIG Notes, 14 Oct. 2004, Information Processing Society of Japan (IPSJ), Japanese
    14 Oct. 2004 - 14 Oct. 2004, In this paper, we consider the enumeration problem for the class of frequent closed sequence patterns, called frequent position-closed sequences, from a given collection of sequences, where closed sequence patterns are a generalization of closed itemsets for sequence databases. Out notion of closed sequences are are based on the position occurrences, and thus it is weaker than the traditional notion of closed patterns based on the document occurrences. We present an efficient algorithm for enumerating all frequent maximal sequence patterns without duplicates, which runs in polynomial time per pattern in the total size of the input sequence database based on the framework of reverse search. As a corollary, the enumeration problem for frequent maximal sequence patterns is shown to be solvable in output polynomial time.
  • An Effcient Algorithm for Mining Frequent Closed Sequential Episodes
    ARIMURA Hiroki; UNO Takeaki
    IEICE technical report. Theoretical foundations of Computing, 08 Oct. 2004, The Institute of Electronics, Information and Communication Engineers, Japanese
    08 Oct. 2004 - 08 Oct. 2004, In this paper, we consider the enumeration problem for the class of frequent closed sequence patterns, called frequent position-closed sequences, from a given collection of sequences, where closed sequence patterns are a generalization of closed itemsets for sequence databases. Out notion of closed sequences are are based on the position occurrences, and thus it is weaker than the traditional notion of closed patterns based on the document occurrences. We present an efficient algorithm for enumerating all frequent maximal sequence patterns without duplicates, which runs in polynomial time per pattern in the total size of the input sequence database based on the framework of reverse search. As a corollary, the enumeration problem for frequent maximal sequence patterns is shown to be solvable in output polynomial time.
  • E-007 Mining Key Semantics Using Dependency Graphs
    Morinaga Satoshi; Arimura Hiroki; Ikeda Takahiro; Sakao Yosuke; Akamine Susumu
    情報科学技術フォーラム一般講演論文集, 20 Aug. 2004, Forum on Information Technology, Japanese
    20 Aug. 2004 - 20 Aug. 2004
  • 分類階層を考慮したパタン照合アルゴリズム (特集 オントロジー)
    喜田 拓也; 有村 博紀
    人工知能基本問題研究会, 25 Jul. 2004, 人工知能学会, Japanese
    25 Jul. 2004 - 25 Jul. 2004
  • An Efficient Algorithm for Discovering Frequent Patterns from Large Unordered Trees (Evolutionary Advancement in Fundamental Theories of Computer Science)
    Asai Tatsuya; Fusanobu Shinji; Arimura Hiroki; Uno Takeaki; Nakano Shin-ichi
    RIMS Kokyuroku, May 2004, Kyoto University, Japanese
    May 2004 - May 2004
  • 高速な無順序木パターン発見アルゴリズム (人工知能基礎論研究会(第54回)特集「医療及び化学情報マイニング」および一般)
    浅井 達哉; 房延 慎二; 有村 博紀
    人工知能基礎論研究会, 03 Mar. 2004, 人工知能学会, Japanese
    03 Mar. 2004 - 03 Mar. 2004
  • Practical Techniques for Speeding Up Enumeration Algorithms for Frequent Itemset Mining Problems
    UNO Takeaki; ASAI Tatsuya; ARIMURA Hiroki; UCHIDA Yuzo
    IPSJ SIG Notes, Mar. 2004, Information Processing Society of Japan (IPSJ), Japanese
    Mar. 2004 - Mar. 2004, In this paper, we address practical techniques for efficiently mining all frequent item sets, frequent closed item sets, and maximal frequent item sets, respectively, from transaction databases. We modify the algorithm for enumerating maximal bipartite cliques proposed in SIGAL89 [9, 10], and propose an algorithm for frequent closed item set enumeration. The computation time of the algorithm is theoretically proved to be linear in the number of output, which is not proved for existing algorithms. We further modify the algorithm for practical computation, and for enumerating all frequent item sets, and maximal frequent item sets. By computational experiments, which is done by a competition of frequent set mining algorithms, we showed our approach is quite efficient, especially for realworld problems witch are said to be difficult.
  • Efficiently Mining Frequent Substructures from Large Unordered Trees
    ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
    IEICE technical report. Theoretical foundations of Computing, 22 Jan. 2004, The Institute of Electronics, Information and Communication Engineers, Japanese
    22 Jan. 2004 - 22 Jan. 2004, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called UNOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O (kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.
  • An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
    ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shinichi
    IEICE technical report. Data engineering, 01 Oct. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
    01 Oct. 2003 - 01 Oct. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called U NOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O(kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.
  • An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
    ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shinichi
    IEICE technical report. Dependable computing, 01 Oct. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
    01 Oct. 2003 - 01 Oct. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called U NOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O(kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.
  • An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
    ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
    電子情報通信学会技術研究報告. DE, データ工学, 01 Oct. 2003, Japanese
    01 Oct. 2003 - 01 Oct. 2003
  • Machine Learning in Computational Learning Theory(Machine Learning vs. Human Learning : Why Can't Machines Outsmart Humans ?")
    Arimura Hiroki
    Journal of Japanese Society for Artificial Intelligence, 01 Sep. 2003, The Japanese Society for Artificial Intelligence, Japanese
    01 Sep. 2003 - 01 Sep. 2003
  • Efficiently Mining Frequent Substructures from Large Unordered Trees
    ASI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
    IEICE technical report. Artificial intelligence and knowledge-based processing, 24 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, English
    24 Jul. 2003 - 24 Jul. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collectionof semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys ofthe algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences basedon a powerful design technique known as the reverse search. We present an efficient algorithm called UNOT thatcomputes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specifiedthreshold. We prove that the algorithm enumerates each frequent pattern T in O(kb2n] per pattern, where A; is thesize of T, b is the branching factor of the data tree, and n is the total number of occurrences of T in the data trees.
  • Enumerating Closed Item Sets via Maximal Bipartite Cliques
    UNO Takeaki; ARIMURA Hiroki; ASAI Tatsuya
    IEICE technical report. Artificial intelligence and knowledge-based processing, 24 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
    24 Jul. 2003 - 24 Jul. 2003, For an item set and its subset family, a frequent set is a subset of the item set included in at least a specified number of elements of the subset family. Frequent sets have some applications in data mining. Recently, maximal frequent set and closed item sets are remarked where a closed item set is a group of similar frequent sets, since we can obtain necessary frequent sets by enumerating closed item sets. In this paper, we propose an efficient method for enumerating closed item sets by using enumeration of maximal bipartite cliques, and enumeration algo- rithm for maximal frequent sets via closed item sets. We evaluate their performances by computational experiments using benchmark problems, and show that our algorithms are faster than existing algorithms. Key words
  • Efficient XPath Processing for Semi-structured Data Streams
    MORIKAWA Hiroaki; ASAI Tatsuya; ARIMURA Hiroki
    IPSJ SIG Notes, 16 Jul. 2003, Information Processing Society of Japan (IPSJ), Japanese
    16 Jul. 2003 - 16 Jul. 2003, In this paper, we consider the problem of efficient processing of XML data streams online. We give another version of XML scanner called ASAX (active SAX), and then develop an efficient lightweight XPath pattern matching engine XMatch for fast XML data stream processing on Internet.
  • Efficient XPath Processing for Semi-structured Data Streams
    MORIKAWA Hiroaki; ASAI Tatsuya; ARIMURA Hiroki
    IEICE technical report. Data engineering, 10 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
    10 Jul. 2003 - 10 Jul. 2003, In this paper, we consider the problem of efficient processing of XML data streams online. We give another version of XML scanner called ASAX (active SAX), and then develop an efficient lightweight XPath pattern matching engine XMatch for fast XML data stream processing on Internet.
  • Online Algorithms for Discovering Frequent Substructures from Semi-structured Data Stream (New Aspects of Theoretical Computer Science)
    Asai Tatsuya; Abe Kenji; Kawasoe Shinji; Arimura Hiroki; Arikawa Setsuo
    RIMS Kokyuroku, May 2003, Kyoto University, Japanese
    May 2003 - May 2003
  • テキストマイニング:ウェブデータからの知識発見を目指して (特集 情報論的学習理論--機械学習のさまざまな形)
    有村 博紀; 浅井 達哉
    Computer today, Mar. 2003, サイエンス社, Japanese
    Mar. 2003 - Mar. 2003
  • Efficient Algorithms for Mining XML Data Streams
    川副 真治; 浅井 達哉; 有村 博紀
    人工知能学会全国大会論文集, 2003, 人工知能学会, Japanese
    2003 - 2003
  • Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
    ASAI Tatsuya; ARIMURA Hiroki; ABE Kenji; KAWASOE Shinji; ARIKAWA Setsuo
    IEICE technical report. Data engineering, 10 Oct. 2002, The Institute of Electronics, Information and Communication Engineers, Japanese
    10 Oct. 2002 - 10 Oct. 2002, In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of an unseen possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We give modifications of the algorithm to other online mining models. We present theoretical and empirical analyses to evaluate the performance of the algorithm.
  • Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
    ASAI Tatsuya; ARIMURA Hiroki; ABE Kenji; KAWASOE Shinji; ARIKAWA Setsuo
    IEICE technical report. Dependable computing, 10 Oct. 2002, The Institute of Electronics, Information and Communication Engineers, Japanese
    10 Oct. 2002 - 10 Oct. 2002, In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of an unseen possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We give modifications of the algorithm to other online mining models. We present theoretical and empirical analyses to evaluate the performance of the algorithm.
  • Efficient Substructure Discovery from Large Semi-structured Data
    ASAI Tatsuya; ABE Kenji; KAWASOE Shinji; ARIMURA Hiroki; ARIKAWA Setsuo
    IEICE technical report. Data engineering, 04 Oct. 2001, The Institute of Electronics, Information and Communication Engineers, Japanese
    04 Oct. 2001 - 04 Oct. 2001, In this paper, we consider a data mining problem for semi-structured data. We present an efficient algorithm for discovering frequent substructures from a given large collection of semi-structured data by modeling semi-structured data as labeled ordered trees. This algorithm is a generalization of the itemset enumeration technique, called set-enumeration tree, by Bayardo(SIGMOD'98)to ordered tree enumeration. The experiments on HTML documents show that the algorithm is efficient and scalabel on realworld data.
  • Discovery of Important Keywords From the Web Using Text Mining
    Arimura Hiroki; Abe Junichirou; Sakamoto Hiroshi; Arikawa Setsuo
    Annual report of Computing and Communications Center Kyushu University, Oct. 2001, Kyushu University, Japanese
    Oct. 2001 - Oct. 2001
  • Text Mining with Optimal Pattern Discovery
    ARIMURA Hiroki
    Proceedings of the Society Conference of IEICE, 29 Aug. 2001, The Institute of Electronics, Information and Communication Engineers, Japanese
    29 Aug. 2001 - 29 Aug. 2001
  • A Catalog for Prediction-Preserving Reducibility with Membership Queries on Formal Languages (New Developments of Theory of Computation and Algorithms)
    Hirata Kouichi; Sakamoto Hiroshi; Arimura Hiroki
    RIMS Kokyuroku, May 2001, Kyoto University, English
    May 2001 - May 2001
  • Practical Algorithms for Constructing Large Suffix Arrays
    ASAKA HIROKI; ABE JUNICHIRO; ARIMURA HIROKI; ARIKAWA SETSUO
    IPSJ SIG Notes, 15 Mar. 2001, Information Processing Society of Japan (IPSJ), Japanese
    15 Mar. 2001 - 15 Mar. 2001, In this paper, we study efficient parallel construction of full-text indexing structures for large text data. The suffix array is a compact full-text indexing structure that is useful in information retrieval and bio-informatics. We propose an efficient parallel algorithm for constructing suffix arrays on distributed memory parallel computers. This algorithm is a parallel implementation of the well-known external memory algorithm, called the Baeza-Yates-Gonnet-Sinder (BGS) algorithm. By theoretical analysis, we show that our algorithm runs more efficiently than the Riberio-Kitajima-Ziviani (RKZ) algorithm, another the parallel implementation of the BGS algorithm, in terms of the parallel time and communication complexities.
  • Extracting Text Data from HTML Documents
    MURAKAMI YOSHITSUGU; SAKAMOTO HIROSHI; ARIMURA HIROKI; ARIKAWA SETSUO
    IPSJ SIG Notes, Mar. 2001, 一般社団法人情報処理学会, Japanese
    Mar. 2001 - Mar. 2001
  • Data Mining : Towards Knowledge Discovery from Text and Web Data
    Arimura Hiroki
    IEICE technical report. Information theory, 03 Oct. 2000, The Institute of Electronics, Information and Communication Engineers, Japanese, Invited oral presentation
    03 Oct. 2000 - 03 Oct. 2000, The rapid progress of computer and network technologies makes it easy to store large amount of unstructured or semi-structured texts such as webpages, HTML/XML archives, and a collection of emails or text files on a computer. Thus, there is a potential demand for efficient semi-automatic tool that supports a human discovery from large text databases. In this paper, we give a short survey of the current data mining methods and their limitations in the context of text and Web mining. Then, we present efficient and robust data mining algorithms for unstructured text data based on optimized pattern discovery. We also discuss the applications of the algorithms to Web mining., [Invited]
  • Learning Term Rewriting Systems from Entailment
    Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
    Inductive Logic Programming, 10th International Conference, ILP 2000, Work-in-progress reports, London, UK, July 2000, Proceedings, 24 Jul. 2000, ILP 1995, English, Oral presentation
    24 Jul. 2000 - 27 Jul. 2000, London, UK, United Kingdom, editors: James Cussens and Alan Frisch, [International presentation]
  • 木の変換規則の例からの学習 (小特集 「発見科学」及び一般演題)
    坂本 比呂志; 有村 博紀; 有川 節夫
    人工知能基礎論研究会, 15 Jul. 2000, 人工知能学会, Japanese
    15 Jul. 2000 - 15 Jul. 2000
  • Finding Unexpected Regression Association Rules from Numeric Database
    TAKAE Toru; ABE Junichiro; SAKAMOTO Hiroshi; ARIMURA Hiroki; INOUE Hitoshi; TAKEYA Shunichi; UEZONO Keiko; KAWASAKI Terukazu
    Proceedings of the Annual Conference of JSAI, 03 Jul. 2000, 人工知能学会, Japanese
    03 Jul. 2000 - 03 Jul. 2000
  • Text Data Mining : Applications to Browsing Large Document Collections and Web Data(Discovery Science)
    Abe Junichiro; Fujino Ryoichi; Shimozono Shinichi; Arimura Hiroki; Arikawa Setsuo
    Journal of Japanese Society for Artificial Intelligence, 01 Jul. 2000, The Japanese Society for Artificial Intelligence, Japanese
    01 Jul. 2000 - 01 Jul. 2000, This paper describes applications of text data mining to interactive document browsing and web mining. We develop a fast text data mining method based on optimal pattern discovery, and make experiments on large collections of documents and Web pages to evaluate the proposed method.
  • Maximizing Agreement with a Classification by Bounded Number of Associated Words
    SHIMOZONO Shinichi; ARIMURA Hiroki; ARIKAWA Setsuo
    IEICE technical report. Theoretical foundations of Computing, 16 Nov. 1999, The Institute of Electronics, Information and Communication Engineers, English
    16 Nov. 1999 - 16 Nov. 1999, We study the efficient discovery of word-association patterns, defined by a sequence of strings and a proximity gap between them, from a collection of texts with binary labels. We present an algorithm that finds all d strings and k proximity word-association patterns that maximizes agreement with the labels. It runs in expected time complexity O(k∧nlog∧n)and O(k∧n)space with the total length n of texts, if texts are uniformly random strings. It shows a contrast to a known fact that finding a best word-association pattern with arbitrarily many strings is intractable and hard to approximate within a factor arbitrary close to one, i.e., has no polynomial-time approximation scheme unless P=NP.
  • データマイニングを用いた探索的文書ブラウジングシステム
    安部 潤一郎; 笠井 透; 有村 博紀
    人工知能基礎論研究会, 07 Jul. 1999, 人工知能学会, Japanese
    07 Jul. 1999 - 07 Jul. 1999
  • Text Data Mining : Application to Browsing Large Document Collections
    ITAI Tsutomu; KASAI Toru; ARIMURA Hiroki; ARIKAWA Setsuo
    Proceedings of the Annual Conference of JSAI, 15 Jun. 1999, 人工知能学会, Japanese
    15 Jun. 1999 - 15 Jun. 1999
  • Finding Proximity Word-Association Patterns from Large Text Databases
    FUJINO Ryoichi; ARIMURA Hiroki; ARIKAWA Setsuo
    Proceedings of the Annual Conference of JSAI, 15 Jun. 1999, 人工知能学会, Japanese
    15 Jun. 1999 - 15 Jun. 1999
  • 部分語計数問題の接尾辞配列を用いた高速アルゴリズム (計算モデルとアルゴリズム)
    笠井 透; 有村 博紀; 有川 節夫
    数理解析研究所講究録, Apr. 1999, 京都大学, Japanese
    Apr. 1999 - Apr. 1999
  • 1T-10 Virtual suffix tree : fast computation of subword frequency using suffix array for text data mining
    笠井 透; 有村 博紀; 有川 節夫
    全国大会講演論文集, 09 Mar. 1999, Information Processing Society of Japan (IPSJ), Japanese
    09 Mar. 1999 - 09 Mar. 1999
  • Knowledge discovery from health data using weighted aggregation classifiers
    Toru Takae; Minoru Chikamune; Hiroki Arimura; Ayumi Shinohara; Hitoshi Inoue; Shun-Ichi Takeya; Keiko Uezono; Terukazu Kawasaki
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1999, Springer Verlag, English
    1999 - 1999
  • On Approximation Algorithms for Local Multiple Alignment (合同研究会"AIシンポジウム'99"(第10回))
    阿久津 達也; 有村 博紀; 下薗 真一
    合同研究会AIシンポジウム, 1999, 人工知能学会, English
    1999 - 1999
  • Maximizing Agreement with a Classification by Bounded Number of Associated Words
    Shimozono Shinichi; Arimura Hiroki; Arikawa Setsuo
    IEICE technical report. Theoretical foundations of Computing, 20 Nov. 1998, The Institute of Electronics, Information and Communication Engineers, English
    20 Nov. 1998 - 20 Nov. 1998, We study the efficient discovery of word-association patterns, defined by a sequence of strings and a proximity gap between them, from a collection of texts with binary labels. We present an algorithm that finds all d strings and k proximity word-association patterns that maximizes agreement with the labels. It runs in expected time complexity O(k∧<d-1>nlog∧<d+1>n)and O(k∧<d-1>n)space with the total length n of texts, if texts are uniformly random strings. It shows a contrast to a known fact that finding a best word-association pattern with arbitrarily many strings is intractable and hard to approximate within a factor arbitrary close to one, i.e., has no polynomial-time approximation scheme unless P=NP.
  • Empirical Evaluation of an Inference Algorithm for Unions of Patterns by Minimal Multiple Generalization
    笠井 透; 有村 博紀; 篠原 武
    全国大会講演論文集, 05 Oct. 1998, Information Processing Society of Japan (IPSJ), Japanese
    05 Oct. 1998 - 05 Oct. 1998
  • Empirical Evaluation of a Regular Pattern Inference Algorithm by Minimal Multiple Generalization
    KASAI Toru; ARIMURA Hiroki; SHINOHARA Takeshi
    Research reports on information science and electrical engineering of Kyushu University, Sep. 1998, Kyushu University
    Sep. 1998 - Sep. 1998
  • Text Data Mining Based on Optimal Pattern Discovery : Towards a Scalable Data Mining System for Large Text Databases
    Kasai Toru; Arimura Hiroki; Fujino Ryoichi; Arikawa Setsuo
    IPSJ SIG Notes, 08 Jul. 1998, Information Processing Society of Japan (IPSJ), Japanese
    08 Jul. 1998 - 08 Jul. 1998, We consider the problem to find a pattern over strings that maximizes the precision of classifying the given positive- and negative-labeled strings. We introduce a notion of patterns for specifying associations between strings, and present an efficient algorithm that finds a best pattern that maximizes the precision and runs in O(k^<d-1>n log^<d+1> n )expected time and with O(k^<d-1>n)space for uniformly random texts. We also discuss an implementation technique for the algorithm on the suffix array indexing structure.
  • 1変数パタン言語の多項式時間オンライン学習 (アルゴリズムと計算の理論)
    稲子 希望; 有村 博紀
    数理解析研究所講究録, Apr. 1998, 京都大学, Japanese
    Apr. 1998 - Apr. 1998
  • 部分語相関ルール発見のための高速アルゴリズム (アルゴリズムと計算の理論)
    渡木 厚; 有村 博紀; 藤野 亮一; 有川 節夫
    数理解析研究所講究録, Apr. 1998, 京都大学, Japanese
    Apr. 1998 - Apr. 1998
  • ONline learning algorithms for one-variable pattern languages
    Nozomu Inago; Hiroki Arimura
    情報処理学会九州支部研究会, 1B-2, 13 Mar. 1998, 第12回情報処理学会九州支部, Japanese, Oral presentation
    13 Mar. 1998 - 13 Mar. 1998, 鹿児島大学工学部情報科学科, Japan, 情報処理学会九州支部長 松尾文碩, [Domestic Conference]
  • An Efficient Algorithm for Text Data Mining
    Atsushi Wataki; Hiroki Arimura; Ryoichi Fujino; Setsuo Arikawa
    第12回情報処理学会九州支部研究会, 13 Mar. 1998, 情報処理学会九州支部, Japanese, Oral presentation
    鹿児島大学工学部情報科学科, Japan, [Domestic Conference]
  • Efficient Inductive Logic Programming with Majority Vote
    Jun Kuramoto; Hiroki Arimura
    第12回情報処理学会九州支部研究会, 13 Mar. 1998, 情報処理学会九州支部, Japanese, Oral presentation
    鹿児島大学工学部情報科学科, Japan, [Domestic Conference]
  • Maximum Agreement Problem for Word Association Patterns
    Arimura Hiroki; Wataki Atsushi; Shimozono Shinichi
    IEICE technical report. Theoretical foundations of Computing, 31 Oct. 1997, The Institute of Electronics, Information and Communication Engineers, Japanese
    31 Oct. 1997 - 31 Oct. 1997, We consider the problem to find a pattern over strings that maximizes the precision of classifying the given positive- and negative-labeled strings. We introduce a notion of patterns for specifying associations between two strings, and present an efficient algorithm that finds a pattern with the best precision and runs in O(n^2) time and with O(kn) space. Furthermore, we show that if patterns are extended to associations over arbitrary many strings then the problem is not approximable within arbitrary small error ratio in polynomial time unless P=NP.
  • Text data mining with optimal string patterns
    Arimura Hiroki; Wataki Atsushi; Fujino Ryoichi; Arikawa Setsuo
    全国大会講演論文集, 24 Sep. 1997, Information Processing Society of Japan (IPSJ), Japanese
    24 Sep. 1997 - 24 Sep. 1997, 本研究では, 大量の文書の集積から, 分類精度を最適化するパタンを見つける問題を考察する。二語相関パタンとよばれる単純なパタンを仮説としたとき, 分類精度を最大化する最適パタンをO(n^2)時間および領域O(kn)領域で計算するアルゴリズムを与える。
  • On-Web-Teaching 2300 Students Information Processing and Computer Literacy at Kyushu University
    MINE Tsunenori; SATOH Hiroyuki; SHOUDAI Takayoshi; HIROKAWA Sachio; ARIMURA Hiroki; MORI Masao; SHINOHARA Ayumi; TAKEDA Masayuki
    IPSJ SIG Notes, 23 May 1997, Information Processing Society of Japan (IPSJ), Japanese
    23 May 1997 - 23 May 1997, We are developing a Web system for the introductory course of programming and computer literacy. 2300 students take the course with this Web system at Kyushu University. Our main object of this system is to reappear our lecture in front of students in their private studies. Teaching how programs run is a hard problem to deal with in information processing educations. In order to deal with such a difficult subject easily, we propose a program animation compiler for assisting students to understand a motion of programs. This compiler systematically translates tiny Pascal programs into Java Applets. In this paper, we state our idea and future plans of the system.
  • The Computational Complexity of Hereditary Elementary Formal Systems
    Ikeda Daisuke; Arimura Hiroki
    RIMS Kokyuroku, May 1997, Kyoto University, English
    May 1997 - May 1997
  • ALT'96報告
    杉本 典子; 平田 耕一; 有村 博紀
    人工知能学会誌, 01 Mar. 1997, 社団法人人工知能学会, Japanese
    01 Mar. 1997 - 01 Mar. 1997
  • 九州大学大学院システム情報科学研究科情報理学専攻
    有川 節夫; 有村 博紀
    人工知能学会誌, 01 Mar. 1997, 社団法人人工知能学会, Japanese
    01 Mar. 1997 - 01 Mar. 1997
  • On the strong compactness of unions of regular pattern languages with respect to inclusion
    Hiroki Arimura; Takeshi Shinohara
    RIMS Lecture Notes, May 1996, 京都大学, Japanese, Oral presentation
    May 1996 - May 1996
  • A polynomial-time relative generalization algorithm for learning logic programs with background knowledge
    Arimura Hiroki; Shinohara Takeshi
    全国大会講演論文集, 20 Sep. 1995, Information Processing Society of Japan (IPSJ), Japanese, Oral presentation
    20 Sep. 1995 - 20 Sep. 1995, Japan, 機械学習や帰納推論,仮説推論等における重要な推論手法の一つが,概念の汎化である.概念の汎化とは,いくつかの具体例が与えられたときに,それらを説明するより一般的な概念を見つけることであり,構成的学習において有効な手法である.とくに帰納論理プログラミング等の論理プログラムの学習の分野では,汎化にもとづく学習システムについて活発な研究がおこなわれている.本研究では,論理プログラムを背景知識の存在のもとで汎化する問題について議論する.とくに,高々k個の制約確定節からなる集合のうちで,例として与えられたすべての確定節を導き,背景知識Bのもとで含意に関して極小なものを見つける問題を考察し,この問題を多項式時間で解くアルゴリズムを与える.これは[2]の結果を複数の節からなる論理プログラムに拡張するものである.さらに,論理プログラムの学習への応用について述べる., [Domestic Conference]
  • Efficient string pattern matching for compressed Japanese text
    Miyazaki Tetsuji; Hirata Hiroaki; Koga Hisanori; Fukamachi Shuichi; Arimura Hiroki; Shinohara Takeshi
    全国大会講演論文集, 20 Sep. 1995, Information Processing Society of Japan (IPSJ), Japanese
    20 Sep. 1995 - 20 Sep. 1995, 情報検索の問題の1つとして,パターン照合問題がある.パターン照合間題とは,パターンおよびテキストという2つの文字列が与えられたとき,テキスト中におけるパターンのすべての出現を見つけだす問題である.このパターン照合問題を解くアルゴリズムに,Aho-Corasick法がある.Aho-Corasick法の特長は,パターンよりパターン照合機械という一種の有限オートマトンを作り,そのパターン照合機械をテキスト上で1回走査させるだけで複数のパターンを同時に検出できることである.Aho-Corasick法を用いて実際に検索を行う際に問題になるのが,ディスクからテキストを読み込むのに要する時間である.この間題に対して,テキストをあらかじめ圧縮しておき,その圧縮されたテキストを復号せずに検索する方法が提案されている.日本語テキストには,ひらがな,カタカナ,漢字,記号といった複数の文字種が存在し,テキスト中では同じ文字種が連続して出現するという性質がある.この性質を利用して,文字種ごとに圧縮符号を割り当てることにする.このとき,異なる文字種間で同じ符号が割り当てられる可能性があるので,文字種を保持する必要がある.文字種の変わり目に文字種が変わることを示すシフトコードを用いる.この方法により,文字種を考慮に入れずに全体で圧縮符号を割り当てた場合より圧縮率はよくなる.本稿では,Aho-Corasick法のパターン照合機械を拡張し,シフトコード付き圧縮符号によって圧縮された日本語テキストのためのパターン照合機械を設計した.これを用いて日本語テキストを対象として実験を行った結果,圧縮率は69.7%,走査時間は78.3%に短縮できた.
  • Protein Motif Discovery from Amino Acid Sequences by Sets of Regular Patterns
    Yamaguchi Michiyo; Shinohara Takeshi; Fujino Ryouichi; Arimura Hiroki; Arikawa Setsuo
    IPSJ SIG Notes, 28 Jul. 1995, Information Processing Society of Japan (IPSJ), Japanese, Oral presentation
    28 Jul. 1995 - 28 Jul. 1995, A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by Substituting possibly empty constant strings for variables. To the problem of discovering protein motifs from given amino acid sequences, we apply the learning algorithm, called k-minimal multiple generalization (k-mmg), that finds a minimally general collections of at most k regular patterns explaining all the positive examples. We incorporate into the learning algorithm a search heuristics, called minimal covering approach, for detecting exceptions in examples. The experiments on protein data show that our algorithm argumented with the new search heuristics more quickly finds accurate hypotheses from given amino acid sequences than the previous heuristics.
  • Protein Motif Discovery by Multiple String Patterns from Positive Examples
    SHINOHARA Takeshi; FUJINO Ryoichi; ARIMURA Hiroki; ARIKAWA Setsuo
    人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 24 Jul. 1995, Japanese
    24 Jul. 1995 - 24 Jul. 1995
  • Learning Unions of Tree Patterns Using Queries
    ARIMURA Hiroki; ISHIZAKA Hiroki; SHINOHARA Takeshi
    人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 24 Jul. 1995, Japanese
    24 Jul. 1995 - 24 Jul. 1995
  • Logical Generalization for Learning with Background Knowledge
    Hiroki Arimura; Takeshi Shinohara
    ICLP'95 Post-Conference Workshop on Inductive Logic Programming, SUT Technical Report, IA-TR-95-03, 17 Jun. 1995, Stephen Muggleton, Fumio Mizoguchi, and Koichi Furukawa, English, Nominated symposium
    17 Jun. 1995 - 17 Jun. 1995, Shonan Village Center, Hayama-Cho, Kanagawa 240-01, Japan, Japan, Proceedings of the ICLP'95 Post-Conference Workshop on Inductive Logic Programming, SUT Technical Report, Edited by Stephen Muggleton, Fumio Mizoguchi, and Koichi Furukawa, 17 June 1995; co-located with: Twelfth International Conference on Logic Programming (ICLP 1995), June 13-16, 1995. Tokyo, Japan., [Invited], [International presentation]
  • Lossy Text Compression for String Pattern Matching
    Fukamachi Shuichi; Shimozono Shinichi; Arimura Hiroki; Shinohara Takeshi
    IEICE technical report. Natural language understanding and models of communication, 12 May 1995, The Institute of Electronics, Information and Communication Engineers, Japanese
    12 May 1995 - 12 May 1995, A mapping, called k-indexing, from an alphabet to a smaller alphabet of size k is considered as a lossy text compression for speeding up string pattern matching. An approximation algorithm to find a k-indexing that minimizes false detection is presented. Using optimized 2^n-indexing, texts can be compressed in fixed length code of n (bit/char) and the expectation of false detection for patterns of length l is roughly <1/2>^<nl>. Optimized indexings for symbols, 2-grams, and 3-grams are compared with each other on the actual English text.
  • 第2回マシンインテリジェンスに関する国際ワークショップ(International Workshop on Machine Intelligence 1993)の報告
    沼尾 正行; 有村 博紀; 原口 誠; 平田 耕一; 斉藤 和巳; 諏訪 正樹; 月本 洋; 元田 浩
    人工知能学会誌, 01 Mar. 1994, 社団法人人工知能学会, Japanese
    01 Mar. 1994 - 01 Mar. 1994
  • How to acquire Meta-cognition from Concrete Databases
    Hiroki Arimura; Akira Takeuchi; Setsuko Otsuki
    World Conference on Artificial Intelligence in Education (AI-ED 1993), poster, pp.539, Aug. 1993, AACE, English, Poster presentation
    23 Aug. 1993 - 27 Aug. 1993, Edinburgh, Scotland
  • Inductive Inference of Prolog Programs with Internal Variables from Positive Data
    ARIMURA Hiroki; SHINOHARA Takeshi
    人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 20 Jul. 1993, Japanese
    20 Jul. 1993 - 20 Jul. 1993
  • Inductive Learning of Logic Programs From Positive Facts Using Minimal Multiple Generalization
    Ishizaka Hiroki; Arimura Hiroki; Shinohara Takeshi
    Journal of Japanese Society for Artificial Intelligence, 01 Jul. 1993, The Japanese Society for Artificial Intelligence, Japanese
    01 Jul. 1993 - 01 Jul. 1993, We consider the polynomial time inferability of primitive Prologs from positive facts. The class of primitive Prologs is a proper subclass of that of linear Prologs which is known to be inferable from only positive facts. In this paper, we discuss the polynomial time inferability of the subclass using minimal multiple generalizations. The minimal multiple generalization is a natural extension of the least generalization given by Plotkin in 1970. The minimal multiple generalization generalizes given first order words by several words, while the least generalization does by a single word. The property of the minimal multiple generalization makes it possible to perform fine generalization and to construct the heads of several clauses in a target program at the same time. We give an outline of a consistent polynomial time inference algorithm which identifies the class of primitive Prologs in the limit. The algorithm infers the heads of clauses in a target program as a minimal multiple generalization of a set of given positive facts. Furthermore, we give a similar result on the inferability of a subclass of context-free transformations which includes several well-known Prolog programs.
  • データベースからメタ認知を獲得する問題
    有村博紀; 岸本拓也; 竹内 章; 大槻説乎
    人工知能学会第4回知的教育システム研究会, 19 Mar. 1993, Japanese, Oral presentation
    19 Mar. 1993 - 20 Mar. 1993
  • Improvements of the Analogical Reasoning System ARTS for Definite Clauses
    Ryuji Wakizono; Hiroki Arimura; Hitoshi Inoue; Makoto Haraguchi; Setsuo Arikawa
    the 6th JSSST Annual Meeting, 19 Oct. 1989, JSSST, Japanese, Oral presentation
    19 Oct. 1989 - 21 Oct. 1989, 九州大学工学部(福岡市箱崎), Japan, [Domestic Conference]
  • Completeness of Depth-Bounded Resolution in Logic Programming
    Hiroki Arimura
    the 6th JSSST Annual Meeting, 19 Oct. 1989, JSSST, Japanese, Oral presentation
    19 Oct. 1989 - 21 Oct. 1989, Kyushu University, Fukuoka, Japan, [Domestic Conference]
■ Syllabus
  • 情報知識ネットワーク特論, 2024年, 修士課程, 情報科学院
  • Introduction to Artificial Intelligence, Big Data, and Cybersecurity for Graduate Students, 2024年, 修士課程, 情報科学院
  • 情報知識ネットワーク特論, 2024年, 博士後期課程, 情報科学研究科
  • 情報知識ネットワーク特論, 2024年, 博士後期課程, 情報科学院
  • Introduction to Artificial Intelligence, Big Data, and Cybersecurity for Graduate Students, 2024年, 博士後期課程, 情報科学院
  • プログラム理論と言語, 2024年, 学士課程, 工学部
  • 情報理工学演習Ⅲ, 2024年, 学士課程, 工学部
  • アルゴリズムとデータ構造, 2024年, 学士課程, 工学部
  • 一般教育演習(フレッシュマンセミナー), 2024年, 学士課程, 全学教育
■ Affiliated academic society
■ Research Themes
  • 社会を志向した革新的アルゴリズムの実装
    Grants-in-Aid for Scientific Research Grant-in-Aid for Transformative Research Areas (A)
    Nov. 2020 - Mar. 2025
    安田 宜仁; 有村 博紀; 鍋島 英知; 井上 武; 西野 正彬
    2020年度は採択後からの4か月の期間を、今後の研究活動の準備期間と捉え、 遠隔での交流のための環境整備や、実装を意識した場合に注力するべき項目の洗い出しを中心に行った。特に後者については、二分決定図を活用した既存ライブラリである graphillion の課題についての検討を進めた。
    個別の研究としては、厳密被覆の全解列挙、非同期ストリーム索引、離散最適化を用いた説明可能性機械学習において進展があった。
    厳密被覆(集合とそのいくつかの部分集合が与えられた場合に、和集合が重複なく入力集合を構成するような組合せを見つける問題)の全解列挙に関して、入力の部分集合の数が大きい場合でも効率的に動作するような方法を考案した。ゼロサプレス型二分決定図(ZDD)に類似する「DanceDD」というデータ構造を考案し、これを実現した。
    高速非同期ストリーム索引について、高木と、稲永、有村、Breslauer、Hendrianらは、一本の成長する系列の索引である接尾辞木のオンライン線形時間構築法を、非同期に成長する複数本の系列の索引である「全オンライン接尾辞木」に自然に拡張することに世界で初めて成功し、その線形時間アルゴリズムを与えた(学術論文1).
    離散最適化を用いた説明可能性機械学習について、金森と有村等は、説明可能機械学習問題において、因果構造を反映した反実仮想説明の整数計画法を用いた生成手法を開発し、人工知能分野のトップ会議AAAI2021に採択され、2022年1月に発表を行った(国際会議発表1)。この成果は、日本経済新聞や各種ウェブメディア等で報道されるなど社会的関心を集めた (報道他1).
    Japan Society for the Promotion of Science, Grant-in-Aid for Transformative Research Areas (A), NTT Communication Science Laboratories, Coinvestigator, 20H05963
  • Development of Next-generation Semi-Structured Data Mining Technology Towards The Real-World Knowledge Creation Infrastructure
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    Apr. 2020 - Mar. 2025
    有村 博紀; 宇野 毅明; 平田 耕一; 山本 章博; 喜田 拓也; ジョーダン チャールズハロルド
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 20H00595
  • Steering Toward Spatio-Temporal Computing Architecture Driven by Learning/Math-Scientific Models
    Oct. 2018 - Mar. 2025
    有村博紀; 中村篤祥; 瀧川一学; 和佐州洋; 栗田和宏
    Japan Science and Technology Agency (JST), JST CREST " Technology for Computing Revolution for Society 5.0", Coinvestigator
  • Expansion of efficient search and discovery technology for processing massive data stream in the real world
    Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Research (Exploratory)
    Jun. 2018 - Mar. 2021
    Arimura Hiroki
    In this research, we study efficient methods for search, count, and discovery of complex patterns and rules from multiple, massive, changing, uncertain, unstructured stream data as a basis of large-scale time-space data stream processing in the real world. In this research project, Hiroki Arimura conducted research and take part in R & D of fast pattern search and discovery, Thomas Zeugmann with Arimura studied the theory of multi-stream data processing and knowledge discovery, and Charles Jordan with Arimura devised efficient massively parallel and stochastic search methods for combinatorial structures as patterns. We focus on multiplicity, adaptivity, context-sensitiveness, and low-memory footprint of the algorithms, and built implementations and theretical and empirical analyses of the devised algorithms as libraries for real world data processing.
    Japan Society for the Promotion of Science, Grant-in-Aid for Challenging Research (Exploratory), Hokkaido University, Principal investigator, 18K19771
  • Next-generation semi-structured data mining technologies for real-world knowledge infrastructures
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    Apr. 2016 - Mar. 2020
    Arimura Hiroki
    The goal of this research is to establish base technologies for forming large-scale knowledge bases from massive data in real and cyber spaces. For this purpose, we study next-generation data mining technologies for efficiently extracting useful knowledge as patterns and rules from large semi-structured data, that is, huge and heterogeneous collections of weakly structured data. In more detail, we studied (1) efficient semi-structured data mining engines in the optimal pattern discovery framework, (2) semi-strcuture mining based on spatio-temporal information, (3) combination of search, compression, and indexing/storage technologies for massive semi-structured data, development of knowledge base creation systems based on semi-structured data mining.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 16H01743
  • Research on Fundamental Algorithms of Discrete Structure Manipulation Systems
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (S)
    May 2015 - Mar. 2020
    MINATO Shin-ichi
    In this project, we aimed to construct the core algorithms for discrete structure manipulation systems, and to provide efficient software tools for many researchers in various application areas. Our achievements include that (1) we first succeeded in enumerating all connected sub-block patterns (in total 109.8 billion patterns) of 47 prefectures in Japan, the data is open for all Japanese citizens from the governmental statistics center, and (2) we produced many academic papers, accepted at the top-conferences such as AAAI, WWW, KDD, INFOCOM, AISTATS, SDM, etc.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (S), Coinvestigator, 15H05711
  • Statistically Sound Pattern Mining
    Grants-in-Aid for Scientific Research
    01 Apr. 2017 - 31 Mar. 2019
    Komiyama Junpei; Ishihata Masakazu; Arimura Hiroki; Nishibayashi Takashi; Minato Shin-ichi; Maehara Takanori
    Pattern mining algorithms enumerate all the combinatorial patterns with their frequency larger than a given threshold. Existing algorithms output many patterns that characterizes a dataset, they do not address how significant the found patterns are in terms of statistical significance. To address this issue, we propose a method that guarantees the rate of false discovery in found patterns while keeping its computational efficiency. The proposed method is presented in a top-tier data mining / artificial intelligence conference (KDD2017).
    Japan Society for the Promotion of Science, Grant-in-Aid for Young Scientists (B), The University of Tokyo, 17K12736
  • Reconfigurable Architectures for Accelerating Data Mining
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    Apr. 2015 - Mar. 2019
    Motomura Masato
    We have conducted researches of architectures for combinatory optimization problems. Conventional idea has been to conduct minor-embedding or a target graph onto hardware graph that is more sparse than the former. Our new architectural proposal is to time-multiplexing a hardware structure for expanding hardware graph, that is powerful enough to sustain the original denser target graph. This idea has been verified to achieve higher accuracy for large graphs in our simulation.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Hokkaido University, Coinvestigator, 15H02673
  • Efficient search and discovery technology for processing massive data stream in the real world
    Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Exploratory Research
    Apr. 2015 - Mar. 2018
    Arimura Hiroki; JORDAN Charles
    In this research, we study the algorithms and data structures for search, discovery, and analysis of complex combinatorial patterns as the base technology for massive stream data processing in the real world. Especially, we focus on the following key aspects of such algorithms: low-memory footprint, adaptivity, context-sensitivity, multiplicity. The followings are the research topics that we study: A1. Fast search algorithms based on bit-parallel techniques; A2. Ultra-fast counting technology based on stochastic testing; A3. pattern discovery using substructure enumeration; A4. Theory of knowledge discovery from massive high-speed streams; B1. Prototype implementation and preliminary evaluation.
    Japan Society for the Promotion of Science, Grant-in-Aid for Challenging Exploratory Research, Hokkaido University, Principal investigator, 15K12022
  • Development of Next-Generation Semi-structured Data Mining for Large-Scale Knowledge Base Formation
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    Apr. 2012 - Mar. 2016
    Arimura Hiroki; UNO TAKEAKI; MINATO S.; HIRATA KOUICHI; ITO K.; SHIMOZONO S.; KIDA T.
    The final goal of this research is to establish a strategy for forming large-scale knowledge bases from massive data and information on a wide range of human activities on social, scientific, and industrial aspects in the cyber space. For this purpose, we study next-generation data mining technologies for efficiently extracting useful knowledge as patterns and rules from semi-structured data, that is, huge and heterogeneous collections of weakly structured data in the cyber space. (1) Efficient semi-structured data mining engines based on optimal pattern discovery framework. (2) Semi-strcuture mining based on spatio-temporal information. (3) Combining semi-structured data mining with stochastic information processing schema. (4) Knowledge federation technologies for large-scale knowledge bases creation. (5) Knowledge indexing technologies for large-scale knowledge bases creation. (6) Development of knowledge base creation systems based on semi-structured data mining.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 24240021
  • バイオミメティクス・データベースのオープンイノベーションプラットフォームへの展開
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)
    Apr. 2013 - Mar. 2015
    有村 博紀
    本研究が連携する計画研究A01班で構築するバイオミメティクス・データベースは,生物学や材料科学等の異分野のデータベースを,画像データ検索技術により横断的に検索し,クラスタリングして提示することで,専門の異なる研究者に新たな気づきを提供し,異分野連携と産学連携の強力な推進基盤となる.また,ISO においてバイオミメティクスの国際標準化の検討が開始されており,先に述べた画像検索技術を踏まえ,国際標準化に参画することで,我が国の国際競争力強化に貢献できる.
    以上の背景のもと,本研究では,オープン・イノベーション・プラットフォームの実現に向けて,国際的産業応用可能なバイオミメティクス・データベースの機能についての検討を目的として調査研究を行った.産学連携については,国内外におけるバイオミメティクスに関する異分野連携・産学連携の動向について,欧州のBiomimicry Europa等,昨年度の調査対象組織のその後の動向について詳細に調べた.
    バイオミメティクスの国際標準化を検討する技術委員会ISO/TC266における標準化活動を行った.具体的には,国際委員会(第4回ISO総会,2014年10月,ベルギー),国内審議委員会としてISO国内審議委員会(2014年4月,第9回から第11回),国内WG4会議等に参加し,調査および標準化活動を行った.また,標準化に関連する大規模検索とデータ解析に関する技術開発も行った.
    以上の活動により,我が国がバイオミメティクスの産業応用で世界に先んじ,我が国の産業界に資することを目指した.さらに,上記を通じて同分野の人材育成も行った.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area), Hokkaido University, Principal investigator, 25120501
  • A study on positive-definite and efficient kernels for structured data
    Grants-in-Aid for Scientific Research
    01 Apr. 2011 - 31 Mar. 2014
    SHIN Yoshihiro; OKAMOTO Hiroshi; ARIMURA Hiroki; SAKAMOTO Hiroshi; KUBOYAMA Tetsuji; CUTURI Marco
    Kernel method is an important field of machine learning research and allows us to leverage information assets like big data to make useful predictions in various applications. In addition to vector data,there exist huge amount of data that have structures. For example, DNA is an array of nucleotides; Protein, parse trees and XML documents are naturally structured as trees; Various kinds of networks are represented using the graph structure. In this regard, this project aims to establish a theory of kernels for structured data and practical techniques to apply kernels to structured data of the real applications. Specifically, we have developed a mathematical theory to investigate positive definiteness of kernels and various types of kernels that deal with a wide variety of structured data. Furthermore, we have developed a utility to compute kernels and have publicized it to researchers in the field of machine learning over the Internet.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), University of Hyogo, 23300061
  • Research on database analysis algorithms using very large-scale monolithic memory space
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    2008 - 2011
    MINATO Shin-ichi; ARIMURA Hiroki; ZEUGMANN Thomas; KIDA Takuya; OKUBO Yoshiaki
    Formerly, database applications required to store all the data into a hard disk, however, after 2000's, middle or large-scale practical databases become possible to be stored in the main memory of PCs. In this project, we proposed and developed a new method of high-speed database analysis with a very large-scale monolithic main memory. Our method is based on Binary Decision Diagram (BDD), which is a new type of compressed graph structure for efficiently manipulating large-scale combinatorial data.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Hokkaido University, Coinvestigator, 20300051
  • Next-Generation Semi-structured Data Mining for Large-Scale Knowledge Base Formation
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    2008 - 2011
    ARIMURA Hiroki; KIDA Takuya; MINATO Shin-ichi; ITO Kimihito; UNO Takeaki; HIRATA Kouichi
    In this research, we study efficient semi-structured data mining technologies and related knowledge federation and indexing technologies, which enables knowledge extraction from massive semi-structured data in the real world. Finally, we made a number of knowledge extraction experiments.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 20240014
  • 情報ネットワークにおける大規模知識処理のための超高速アルゴリズムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
    2009 - 2010
    トーマス ツォイクマン; 湊 真一; 喜田 拓也; 下薗 真一; 高木 剛; 宇野 毅明; 有村 博紀
    本研究課題では,大規模知識処理のための超高速アルゴリズムの研究に取り組んでいる.本研究では,ウェブからの知識獲得を人間による大量のウェブ情報アクセスを支援する効率的な半自動的ツールとしてとらえ,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指している.そのために,現在までに開発してきたウェブ情報処理技術を元に,中核技術として,機械学習と知識索引技術に基づく,適応的な知的情報獲得システムの基盤技術を開発することを目標としている.
    特に本年度は,下記の課題に取り組み,各課題において多くの国際会議論文の採択を受けた.
    (1)自律・適応的な知識発見のための機械学習の研究.(ツォイクマン・湊)
    (2)超高速知識獲得アルゴリズムの研究.(ツォイクマン・宇野・有村)
    (3)ZDDに基づく大規模知識索引の研究.(湊・宇野・有村)
    (4)高速情報検索のためのデータ圧縮技術の研究.(喜田・有村・ツォイクマン)
    (5)開放分散環境における高度知識処理のためのプライバシー保護機構の研究.(高木・ツォイクマン)
    これら一連の知識発見に関する我々の仕事が国際的にも評価され,第25回計算機と情報科学に関する国際会議において,研究代表者であるツォイクマンを筆頭に,分担者である湊および有村らがそれぞれ招待講演者として招かれた.また,分担者の宇野毅明准教授が,文部科学大臣表彰若手科学者賞(業績名:巨大データ解析に対する超高速アルゴリズム構築法の研究)を受賞した.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Hokkaido University, Principal investigator, 21013001
  • Efficient Pattern Discovery from Massive Semi-Structured Data for Knowledge Infrastructure Formation on the Web
    Grants-in-Aid for Scientific Research Grant-in-Aid for Specially Promoted Research
    2005 - 2007
    ARIMURA Hiroki; KIDA Takuya; MINATO Shin-ichi; ITO Kimihito
    By rapid progress of network and storage technologies for the last decade, a huge amount of weakly structured electronic data of various types, called semi-structured data, become available over the Internet. In this research project, we study efficient semi-structured data mining technologies that supports human's discovery of useful knowledge from massive collections of semi-structured data. In particular, we develop high-speed semi-structured data mining engines as a core of large-scale knowledge Infrastructure formation from the Internet and establish their architecture and base technologies. In particular, we have studied the following research topics:
    1. High-speed semi-structured data mining technologies. We developed efficient depth-first mining algorithms for sequences and trees based on the rightmost expansion technique, proposed by Hiroki Arimura and other researchers. We also devised various optimization techniques in frequent and maximal pattern mining frameworks. We made theoretical analysis of their computational complexity, and show that they are actually polynomial delay and polynomial space complexity enumeration algorithms, which means that they can be used as light-weight and high-throughput mining engines. From theory point of view, they are the first maximal pattern mining algorithms for non-trivial subclasses of semi-structured data having theoretically guaranteed performance.
    2. Knowledge indexing technologies. The pre- and post-processing of the data and patterns are important issues for the next-generation data mining. We developed an efficient knowledge indexing data structure, called VSOP, based on Zero-suppressed BDD (ZBDD) proposed by Shin-ichi Minato. VSOP is a compressed indexing structure for discrete structure, equipped with a rich set of algebraic manipulation operators for compressed transactions and itemsets. On the top of VSOP, we devised efficient data mining algorithms that can extract interesting structures hidden in a huge input dataset in an highly interactive way.
    3. Knowledge federation technologies. We combine semi-structured data processing techniques with information retrieval and information extraction from the web and stream data. As results, we developed fast pattern matching algorithms for multi-dimensional numerical streams based on bit-parallel techniques and semi-automatic information extraction algorithms from the Web based on demonstration-by-example. We also implemented a set of knowledge discovery tools based on our algorithms and technologies devised in this project. We also made experiments to evaluate the prototype system on real world data on the Web and in Bioinformatics fields.
    Japan Society for the Promotion of Science, Grant-in-Aid for Specially Promoted Research, Hokkaido University, Principal investigator, 17002008
  • 超高速データストリームのためのオンライン型半構造情報変換システムの開発
    Grants-in-Aid for Scientific Research Grant-in-Aid for Exploratory Research
    2004 - 2005
    坂本 比呂志; 有村 博紀; 坂本 比呂志
    本研究では,半構造データに対する高速なXPath処理法を提案した.これまでに,データを効率的に圧縮する手法として知られている算術符号化を半構造データの検索に応用した,逆算術符号化が提案されている.これは,木構造データ上のパスの依存関係を,データを圧縮したまま復号化することなく検査できる手法であり,この関係性を利用することで,パスによる問い合わせを高速に処理できる.しかしながら,この問い合わせで利用可能なパスの形式は限定されているため,一般のXPathの問い合わせは処理が困難である.そこで本研究では,このような逆算術符号化にノード間の先祖子孫関係を判定可能な範囲ラベルを導入することにより,より複雑な問い合わせ処理を高速に実現するための手法を提案する.評価実験の結果,300MB程度のXMLデータに対してテキストを直接処理する既存の手法と比較し,数十から百倍の高速化を達成した.また,本研究では,畳み込みカーネルのアイディアに基づいた,ラベル付き順序木に対するこれまでにない新しいカーネル関数を提案した.まず,畳み込みカーネルの枠組みにおいてラベル付き順序木に対して任意の部分グラフを部分構造として用いた場合の,効率の良いカーネル計算のアルゴリズムを提案し,曖昧なラベルや構造を取り込むような拡張を行った.さらに,より一般的な木構造として,順序のないラベル付き根付き木に対するカーネルを考えた場合には,カーネルの計算が#P-完全問題であることを示した.
    Japan Society for the Promotion of Science, Grant-in-Aid for Exploratory Research, Principal investigator, 16650021
  • 最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
    2004 - 2005
    トーマス ツォイクマン; 有村 博紀; 坂本 比呂志; 篠原 歩; 下薗 真一; 湊 真一; 喜田 拓也; 竹田 正幸
    本研究は,大量のウェブページやXML等の大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの実現方式を明らかにすることを目標としている.その鍵になる技術として,最適パターン発見を木やグラフ構造に拡張し,計算量理論と計算学習理論の最新の成果を援用しながら,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムの開発に取り組んだ.
    平成17年度は,初年度から昨年度までの研究成果と統合し,最適半構造マイニングのプロトタイプシステム構築を目指した.研究項目としては,有用な情報源の発見,特徴的なパターンの発見,情報の抽出の3つの情報獲得問題に加えて,昨年度から新たに研究を開始した知識索引問題について取り組んだ.今年度得られた具体的な結果のうち主要なものは以下のとおりである.
    (1)大規模なトランザクションデータによく見られる疎な組み合わせ集合データを効率よく扱うことのできるデータ構造であるZBDD(Zero-suppress BDD)をベースに,その構造の元で重み付き積和集合を計算可能なZBDDパッケージツールVSOP(Valued Sum-Of-Products)の開発を推し進め,頻出するパターン集合を表現するZBDDを単純直交分解する機能を追加した.これにより,そのデータに内包された意味的構造を自動抽出することが可能になった.(湊)
    (2)パターン発見アルゴリズムによる分類・予測の長期的ふるまいに関する理論保証を与えることに成功した.(ツォイクマン)
    (3)系列データからの極大モチーフパターンを効率よく枚挙するアルゴリズムを得た.(有村:H13-H16代表)
    (4)Arc構造付きテキストに対する高速なパターン照合アルゴリズムを得た.(喜田)
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Hokkaido University, Principal investigator, 16016266
  • Study of High-speed Data Mining Algorithms from Massive Data Streams
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    2003 - 2005
    IKEDA Daisuke; TAKEDA Masayuki; SHINOHARA Ayumi; KIDA Takuya; KASAHARA Yoshiaki; ISHINO Akira
    In this research, we investigated high-speed online knowledge discovery system for extracting useful information from massive semi-structured data streams. Particularly in this year, as theoretical researches, we extended further the theory of efficient pattern matching and pattern discovery methods for online streams. As application studies, we made a series of experiments on collection and analysis of network data from real high-speed networks in a huge organization. We have also published the results obtained in the research period of the last three years. In particular, we proceed the studies on the following issues:
    (1)Survey on semi-structured data : We have summarized and published a survey on stream data mining in an academic journal, which has been studied through this project for the last three years.
    (2)Study on streaming pattern matching technology for semi-structured data : We developed an efficient method for performing tree pattern matching with horizontal wildcards by bit parallel technology, which potentially gives drastic speed-up for Xpath and XQuery pattern matching languages for huge XML data.
    (3)Study on sequential and streaming pattern discovery technology for semi-structured data : We developed efficient algorithms for finding interesting patterns from massive data streams for various classes of complex patterns/motifs. In this year, we also published pattern discovery algorithms developed in the last year. Also, one of them got awarded for 2004 JSAI SIG AWARD.
    (4)Empirical study on knowledge discovery from real massive network data : As applications, we performed a series of surveys on data collection and online analysis of high-speed large-scale network for middle sized organization at Kyushu University. These experiments will give insights for future research on the development of efficient pattern matching/discovery algorithms for high-speed streaming data.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Principal investigator, 15300036
  • 最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
    2001 - 2003
    有村 博紀; 篠原 歩; 竹田 正幸; 坂本 比呂志; 下薗 真一
    本研究では,大量のウェブページやXML等の大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの実現方式を明らかにすることを目標としている.鍵になる技術として,最適パターン発見を木やグラフ構造に拡張し,計算量理論と計算学習理論の最新の成果を援用しながら,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムの開発に取り組んだ.平成15年度は,前年度の成果に基づき,(a)有用な情報源の発見,(b)特徴的なパターンの発見,(c)情報の抽出の3つの情報獲得問題について,次の研究を行った.
    (1)前年度開発した最適分類パターンを用いた高精度テキスト自動分類手法を一種の正規表現を扱えるよう一般化した.さらに,近似文字列照合を可能なパターン発見エンジンを開発し,加速学習法を用いて決定木学習システムBONSAIに組み込んだ(竹田・篠原).
    (2)XPathパターンに対する最適パターン発見アルゴリズムを設計し,半構造データマイニングエンジンを開発した.とくに今回は,より現実の半構造データに近い無順序木モデルに対して,高速なパターン発見エンジンを開発した.パターンの圧縮表現に対する,高速な列挙アルゴリズムを開発した.独自に開発したオンライン化技術を用いて,オンラインパターン発見手法を開発した(有村).
    (3)情報抽出に関して,現状の技術動向の調査を行い,水平方向制約(Sequence制約)付き半構造データに対するラッパー帰納アルゴリズムの設計を行った(坂本).さらに,半構造データに適した高性能圧縮アルゴリズムを開発し,性能に関する理論的解析を行った.並行して,開発したアルゴリズムの計算量を解析し(下薗,篠原),個々のアルゴリズムの最適化をおこない,性能を向上させた(全員).最後に,ウェブデータとXMLデータに関する評価実験をおこない,この方式の有効性を評価した(有村・篠原).
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Kyushu University, Principal investigator, 15017268
  • Study of efficient text data mining based on optimized pattern discovery
    PRESTO21, "Information and Intelligence"
    Oct. 1999 - Mar. 2002
    Hiroki Arimura
    JST, Principal investigator, Competitive research funding
  • 最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
    科学研究費助成事業
    2002 - 2002
    有村 博紀; 下薗 真一; 篠原 歩; 竹田 正幸
    ネットワーク上に分散したウェブページやXML等の半構造データの急速な増大に対して,これらのコンテンツに直接アクセスするための効率良い手法の開発が緊急の課題となっている.本研究では,大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指す.
    そのための鍵となる技術として,最適パターン発見を木やグラフ構造に拡張して,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムを開発する.さらに,ウェブマイニングを(a)有用な情報源の発見,および(b)特徴的なパターンの発見,(c)情報抽出の3つの過程ととらえ,これらを有機的に結合して,半構造データを対象とした知識獲得システムの効率良い実現方式を明らかにすることを目標とする.また,計算量理論と計算学習理論の最新成果を援用し,計算量に徹底的に配慮した高速なアルゴリズムの開発を目指すのも特色である.
    平成14年度は,次の研究成果を得た.
    (a)「有用な情報源の発見」に関しては,従来より表現力の高いパターンである可変長変数パターン(VLDCパターン)に対する新しいテキスト索引構造を開発し,これを用いて,効率よい最適化マイニングアルゴリズムを開発した.
    (b)「特徴的なパターンの発見」に関しては,XML-MessagingとSOAPへの応用を目指して,昨年開発した半構造データマイニング手法FREQTを元に,高速な半構造データストリームマイニングSTREAMTを開発した.これは,非常に少なく資源を使いながらデータストリームを監視し、有用なパターンを逐次報告するアルゴリズムである.また,FREQTの最適化マイニングへの拡張と理論的性能解析を行い,この方式の最適性を示した.
    (c)「情報抽出」に関しては,XMLデータ処理の基本技術である符号語列上のパターン照合機械技術を開発し,XMLパターン検索への応用を考察した.
    日本学術振興会, 特定領域研究, 九州大学, 14019070
  • 最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
    科学研究費助成事業
    2001 - 2001
    有村 博紀; 篠原 歩; 竹田 正幸; 坂本 比呂志; 下薗 真一
    ネットワーク上に分散したウェブページやXML等の半構造データの急速な増大に対して,これらのコンテンツに直接アクセスするための効率良い手法の開発が緊急の課題となっている.本研究では,大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指す.
    そのために,鍵となる技術として,最適パターン発見を木やグラフ構造に拡張して,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムを開発する.さらに,ウェブマイニングを(a)有用な情報源の発見,および(b)特徴的なパターンの発見,(c)情報抽出の3つの過程からなると考え,これらを有機的に結合して,半構造データを対象とした知識獲得システムの効率良い実現方式を明らかにすることを目標とする,また,計算量理論と計算学習理論の最新の成果を援用して,計算量に徹底的に配慮した高速なアルゴリズムの開発を目指すことも特色である.
    平成13年度は,次の研究成果を得た.
    (a)「有用な情報源の発見」に関しては,部分系列パターンとエピソードパターンと呼ぶ組合せパターンに対する効率よい最適化マイニングアルゴリズムを開発し,これを文字列分類のための決定木学習アルゴリズムBONSAIに組み込んだ.
    (b)「特徴的なパターンの発見」に関しては,半構造データを最も基本的なラベル付き順序木(labeled ordered trees)のクラスとしてモデル化し,データ中の頻出共通部分構造に対する高速な発見アルゴリズムを開発した.木に関するパターン発見問題は,一般に高い計算量をもつことが多い.そこで,最右枝拡張法という効率よい発見手法を与え,これを複数の最適化手法と組み合わせて,半構造データに対する高速なマイニングアルゴリズムを与えた.
    (c)「情報抽出」に関しては,ウェブからの情報抽出問題を考察し,HTMLデータから木構造の情報を利用して必要な情報を効率よく切り出すTree-Wrapperアルゴリズムを開発した.
    日本学術振興会, 特定領域研究(C), 九州大学, 13224073
  • Development of Efficient Data Mining Systems for Large Semi-Structured Text Data
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    1999 - 2001
    ARIMURA Hiroki; SHINOHARA Ayumi; TAKEDA Masayuki; SHOUDAI Takayoshi; HIRATA Kouichi; ISHINO Akira
    The goal of this research project is to devise an efficient semi-automatic tool that supports human discovery from large unstructured and semi-structured text data. To achieve this goal, we studied in the following three directions.
    1. The central process of text mining is pattern discovery. We employed the framework of optimized pattern discovery, and developed effcient and robust text mining algorithms that find simple combinatorial patterns from large unstructured texts. To implement these algorithms, we developed a text index structure based on the suffix arrays suitable for text mining. Based on these technologies, we implemented a prototype system and run computer experiments on Web data.
    2. Another important technology for text is efficient pattern matching. As a theoretical framework, we proposed a unified framework, called Collage system, for realizing various dictionary-based compression methods. We developed both Knuth-Morris-Pratt type and Byer-Moore type pattern matching algorithms employing this framework. We also applied this framework to Byte-Pair-Encoding compression method and Sequitur, the former of which yields the fastest compressed pattern matching algorithm.
    3. Final process of text mining is information extraction. From theoretical point of view, we first formalize the information extraction problem from semi-structured data, and then gave theoretical analysis of the power and the limitation of such tasks. Then, we developed efficient information extraction algorithms for various types of extraction rules including tree wrappers and hedge patterns and evaluate them through experiments on real-life semi-structured data on the internet.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Kyushu University, Principal investigator, 11558040
  • 高度知識ベースを対象とした知識獲得システムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
    1999 - 2000
    有村 博紀
    本研究の目的は,大規模知識データベースから,複雑な知識構造を効率よく獲得するシステムの実現方法を明らかにすることである.大規模知識データベースは,(i)大量の(ii)非均質なデータを含み,同時に(iii)その多くは正例だけを含むので,従来の知識獲得手法は適用できない.そこで,本研究では,このような大規模知識データベースに適用可能な新しい学習手法について研究をおこなった.
    本年度は,以下の項目について研究を実施した.
    (1)前年度に開発した一般的な知識獲得手法を,ネットワーク上の半構造データからの知識獲得に応用した.具体的には,ネットワーク上でのデータ交換および記述言語として急速に利用が進んでいるXMLデータを対象とし,与えられたXMLデータから,対話によって,データ変換規則を効率良く発見するアルゴリズムを開発する.(ICGI 2000;人工知能学会誌Vol.16,No.2)
    (2)さらに(1)項の半構造データからのパターン発見アルゴリズムを,前年度に開発したテキストデータからのパターン発見手法と融合し,構造と内容の両方を利用した知識獲得手法を開発した.さらに,この手法の時間計算量と質問計算量を理論的に解析した.(PAKDD 2000;RECOMB 2000;ビット別冊)
    (3)開発した知識獲得手法の有効性を計算機実験を通じて評価する.開発中の知識獲得システムを,本研究で得たパターン発見手続きの導入によって拡張した.さらに,ネットワーク上に分散したウェブページからの知識獲得実験をおこなった.(Kyoto ICDL2000;人工知能学会誌Vol.15,No.4)
    Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 11780277
  • Knowledge Discovery by Inferences
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas (A)
    1998 - 2000
    SATO Taisuke; HARAGUCHI Makoto; IMAI Mutsumi; ARIMURA Hiroki; SATO Masako; SHINOHARA Takeshi
    From the fiscal year 1999 to 2001, we conducted research in knowledge discovery and developed inference various methods that can cope with uncertainty and complexities in the real data as follows.
    T. Sato developed a symbolic-statistical modeling language PRISM. Arimura et al. developed two types of fast text pattern matching algorithm. Tsukimoto proposed logical regression analysis. K.Satoh continued the analysis of minimal case base required for representing concepts. Sakama proposed non-monotonic inverse resolution.
    Imai et al. developed an algorithm for computing maximally specific hypotheses for ILP. Haraguchi proposed data abstraction for decision trees. Yamamoto reconstructed the theoretical base of ILP based on the logic of Suggestion.
    M. Sato et al. studied the theory of refutable/inductive inference. Shinohara proposed reducing data dimension by his H-map. Ohsawa showed that his KeyGraph approach is effective by examples.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas (A), Tokyo Institute of Technology, Coinvestigator, 10143104
  • 高度知識ベースを対象とした知識獲得システムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
    1997 - 1998
    有村 博紀
    本研究では,構造化データからの対話的な知識獲得について,基礎的な研究をおこなった[1,3].まず,オブジェクト指向データベース等の複合オブジェクト(complex object)の単純なモデルとして,述語論理の一階項(first-order term)を考えた.さらに,能動的学習(active learning)の枠組みのもとで,さまざまな構造化パタン族に対して,効率的なパタン発見手法の開発と,発見問題の本質的複雑さの探求をおこない,次の結果を得た[1,3,2].
    1. 2個の一階項からなる簡単なパタンでさえ,効率よい学習が困難なことを示した.その一方で,所属性質問を用いて実験をおこなうことで,この種のパタンが効率よく多項式時間で同定できることを示した.
    2. 上記の結果を2個以上の一階項からなるパタンに拡張して,所属性質問を用いた効率よい学習アルゴリズムを与えた.このアルゴリズムは構造化データの学習における基本的機構を確立するもので,所属性質問を用いるさまざまな学習モデルに適用可能である.
    3. ACH(k)(acyclic constrained Horn programs of bounded arity k)と呼ばれる一階ホーン論理式の族(再帰論理プログラムの一種)が,含意に基づく学習(能動学習の一種)の枠組みで多項式時間学習可能であることを示した[1].さらに,質問計算量に関する下限定理と学習困難性を示して,一般的な一階ホーン論理式の学習には,質問の利用が不可欠なことを明らかにした.
    以上の結果は,構造化データからの知識獲得の本質的難しさを明らかにするものである.同時に,対話による知識ベース構築のための基本的方法論を与えるものである.
    Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 09780343
  • 超大規模データからの高速データマイニング・システムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
    1997 - 1997
    有村 博紀; 正代 隆義; 有川 節夫
    データマイニング(Data Mining)は,データベースからの知識発見とも呼ばれ,現在,ビジネス分野や科学技術分野等,さまざまな対象領域で,その適用が盛んにおこなわれている.しかし,現在のデータマイニングの対象は関係データベースが中心であり,現在急速に利用が進みつつあるテキストデータベースやオブジェクト指向データベースに関しては,明示的な構造をもたない,あるいは非均質な構造しかもたない,膨大なデータの集積であるなどの理由から,従来の手法をそのまま適用することができないため,ほとんど研究がおこなわれていない.
    そこで本研究では,ここにあげたような非構造的データや構造データからのデータマイニングについて研究した.平成9年度は,具体的にはつぎの問題に中心に研究した.
    1.高速パターン発見アルゴリズムの研究:近年発展著しいテキストデータベースを対象に,高速なパターン発見アルゴリズムを開発した.とくに,単純なパタンに仮説を制限する代わりに,誤差や欠落を含む不完全データにたいしても働くような,頑健かつ高速な手法を開発することができた.また,確率的サンプリングを用いた高速化や効率的な探索の枝刈り法についても成果を得た.
    2.属性効率のよいパタン発見アルゴリズム:テキストデータベースにおいては,属性に対応する部分列や語彙の数は膨大になる.本項では,1変数パタンと呼ばれる単純な規則を対象に,発見に必要な具体例が,発見対象に関連しない属性数にはほとんど依存しないような,「属性効率がよい」パタン発見アルゴリズムを開発し,この族が多項式オンライン学習可能であることを示した.
    3.構造化データからの対話を用いた知識獲得:本項では,構造化データからの対話的な知識獲得について基礎的な研究をおこなった.関係データベースではさまざまな演繹質問や統合性制約が一階ホーン論理式として表される.主結果として,一階ホーン論理式の部分族ACH(k)に対する多項式時間学習アルゴリズムを与え,さらに,これが質問計算量に関してほぼ最適であることを示した.
    他にも,テキストデータベース用の質問言語の計算量と表現力を調べ,完全に特徴づけた.また,本年度に1項のテキストデータマイニング・システムの小規模プロトタイプを試験的に実装し,問題点の洗い出しをした.次年度は,これに基づいて効果的な実装法を研究し,分子生物学のデータを対象として大規模な知識獲得実験をおこないたい.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Kyushu University, Principal investigator, 09230215
  • Speedup of Text Database by Data Compression
    Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    1995 - 1997
    SHINOHARA Takeshi; FUKAMACHI Shuichi; SHIMOZONO Shinichi; ISHIZAKA Hiroki
    The objective of this research is to establish a speedup method for sequential pattern matching by data compression and demonstrate its availability in text database.
    We design a pattern matching machine for compressed data by Huffman codes without decoding. In the experiment on this algorithm, although the effect of this method depends on the characteristics of data, the text size and the response time of searching are reduced to 60% and 70%, respectively, for English text.
    We also design a similar technique for new compression scheme, called Byte-Pair-Encoding (BPE,for short). This technique compresses English text to around 50% and reduces search time to 60%. BPE is basically a fixed length code, and therefore compresses text by BPE is efficiently distributed to processors in parallel environment.
    Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Kyushu Institute of Technology, Coinvestigator, 07558159
  • 大規模オブジェクト指向データベースを対象とした知識獲得システムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
    1996 - 1996
    有村 博紀
    本研究の目的は,CADデータベースやマルチメディアデータベースのように,大量の非均質なデータを含み,同時に,主に正例だけを含むようなデータベースに対しても適用可能な,新しい学習手法を開発することである。具体的な学習手法としては,申請者らが開発した「極小多重汎化」手法を採用し,大規模科学データベースを対象として,高速な極小多重汎化手続きを開発することを目指して研究をおこなった.
    具体的には,極小多重汎化をデータベースを対象に拡張し,高速な学習手続きを開発した.
    1.まず,データの理論的モデルの基盤として,関係データベースをとりあげ,知識表現の立場からこれを概念階層の導入によって拡張した.こうして拡張されたデータベースを対象として,高速な極小多重汎化手続きを開発することに成功した.この手法の能力と適用限界を理論的に調べた.
    2.さらにこの汎化手続きを計算機上に実装し,その有用性をデータベースからの知識獲得に関する計算機実験を通じて確認した.この際,確率的選択を含む数種の経験的発見法を導入により,実際的な効率の向上をはかった.
    3.近年のマルチメディア等の高度データベースの急激な発展に対処するため,上で開発した拡張データベースにテキストデータかたの知識獲得手続きを組み込むための基礎的研究をおこなった.文字列パターンとよばれる表現をとりあげ,パターンへの概念階層の組み込みと高速汎化アルゴリズムの開発をおこなった.
    4.また,複数概念の個数にあらかじめ上限を与えない場合の学習可能性について理論的考察をおこない,正例からの学習の可能性と上限を明らかにした.
    5.背景知識を許した関係記述言語である基本形式体系の計算量についても考察した.
    Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 08780371
  • 大規模オブジエクト指向データベースを対象とした知識獲得システムの研究
    Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
    1995 - 1995
    有村 博紀
    本研究では,大規模オブジェクト指向データベースを対象とした知識獲得システムの実現方法をあきらかにすることを目的として,研究をおこなった.このために申請者が提案した学習手法である極小多重汎化を,データベースを対象に拡張し,高速な学習手続きを開発した.具体的には,つぎの研究をおこなった.
    (1)オブジェクト指向データベースにおける概念継承と背景知識を統一的にあつかうために,背景知識として論理プロブラムが与えられた場合に,与えられデータベースの極小汎化を求める問題を考察し,高速な汎化手続きを開発した.さらに,この汎化手続きによって,制限された述語論理プログラムのクラスが,正例だけから多項式時間極限可能であることを示した.
    (2)計算量理論の立場から,最も単純な体系である原子式の極小多重汎化問題を考察し,未知仮説の正確な同定に必要な時間計算量と質問情報量を理論的に明らかにした.これにより,正確な同定を高速におこなうには,知識獲得システム自身が決定実験をおこない,必要な例を動的に得る仕組みが必要なことがわかった.
    (3)開発中の知識獲得システムを,本研究で得た汎化手続きの導入によって拡張し,遺伝子配列と蛋白質に関するオブジェクト指向データベースを対象として知識獲得実験をおこなった.この計算機実験によって,開発した汎化手続きの実際のデータベースにおける有効性を検証した.
    Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu Institute of Technology, Principal investigator, 07780339
  • 概念の理解と知識の定着をはかる知的CAIに関する研究
    科学研究費助成事業
    1993 - 1993
    竹内 章; 有村 博紀
    発見的な学習を支援する知的教育システムに関する研究を行った。本研究では、発見的学習とAIとの関係から、2つの立場より研究を行った。一つは、知的教授システムを実験台にして、知識処理の特徴を体験を通して学習させる方法に関する研究、今一つは発見学習の支援方法に関する研究である。
    前者は、知的教授システムの特徴的機能が知識処理によって初めて可能となることから、知識処理を学ぶよい教材となるのを利用して、実験によって知識処理の深い理解を得させるための環境に関する研究である。ここでは学習環境の構成方法と、計算機によってシミュレートされた疑似学習者の役割について提案を行った。
    後者は、システムが学習者の振る舞いを認識して、学習者の意図に対して適応的な発見支援をめざす。このとき問題になるのは、学習者の意図を推定する方法と支援の方法である。我々は、学習者がすでに獲得済みの背景知識を用いながら、実験環境での試行錯誤を通して新しい概念を獲得するとの考えにたって、学習者とシステムが実験環境を共有し、協調作業を行う学習支援システムのアーキテクチャと、そこでの支援方法を提案した。ここでは発見のプロセスを、実験装置とデータの整理・加工のための道具の状態で表現される状態空間中での、それぞれの部品で定義されているオペレータの適用による移動としてモデル化した。この状態空間は、実験によるデータ収集を行う操作空間と、実験データをもとに仮説を生成する仮説空間に分けられる。また、システムが与えられた環境から規則を発見するため、そして学習者の指導のために用いる発見の戦略を、個々の環境に依存したもの、一つの領域に共通なもの、全体に共通なものの3レベルに分け記述した。
    日本学術振興会, 重点領域研究, 九州工業大学, 05213219
  • 概念の理解と知識の定着をはかる知的CAIに関する研究
    科学研究費助成事業
    1992 - 1992
    竹内 章; 有村 博紀
    マイクロワールドに代表される発見的学習環境と、知的教授システムに代表される問題解決中心の学習環境を統合した、Bimodus CAIと呼ぶ新しい教育環境を提案し、特に、発見学習の支援方法を中心に研究した。
    Bimodus CAIにおける発見学習環境は、現実世界のモデルを計算機上に実現したダイレクト・マニピュレーション環境である。学習者はモデルを操作し、反応を観察しながら、モデルに埋め込まれた概念を獲得する。この過程で学習者は実験環境から情報を集め、整理し、自分の持っている背景知識を用いて仮説を立て、実験結果を解釈し、加工しながら、試行錯誤を繰り返す。
    学習者が新しい概念を発見するには、それを導くための前提知識を持っていることが必要であるが、この条件のもので、学習者に対する支援は、次の二つの目的を満たすシステムの振舞いであると考える。
    1)学習者の行う思考の領域を、背景知識全体から関連のある知識領域へ、さらに目標概念の発見に必要な前提知識領域へと、次第に縮小する。
    2)前提知識の領域において、正しい知識を導出できるような実験環境を提示する。
    学習者支援には学習者モデルが必要で、これは、システムに学習者と同じように実験環境から新しい概念を発見する機能をもたせ、学習者のふるまいをモニタして構築する。この機構は、内部実験装置、機械学習プログラム、背景知識から構成される。機械学習プログラムは、背景知識と内部実験装置から得られる情報を基に目標概念を獲得する。内部実験装置は学習者が行うのと同じ操作が可能で、学習者が収集するのと同じ情報が機械学習プログラムに与えられる。内部実験装置と学習者が操作する実験装置は表裏一体の関係にあり、学習者の操作をモニターしたり、システムが実験装置を操作してその結果を学習者に示したりする。
    日本学術振興会, 重点領域研究, 九州工業大学, 04229221
■ Academic and Social Contribution Activities/Other
Industrial Property Rights
  • STR-seminar 2024 Sapporo, an inter-university research meeting for young researchers
    02 Sep. 2024 - 04 Sep. 2024
    Planning etc
    Academic society etc
    STR Seminar, Steering committee
    Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido, Japan, 開催概要
    名称:「STRセミナー2024北海道 〜 若手研究者のための大学間合同研究集会」
    日時: 2024年9月2日(月) 〜9月4日(水):セミナー,9月4日(水)午後: 研究討論会
    場所: 北海道大学 大学院情報科学研究院,A13教室(第1〜3日目すべて)
    (北海道 札幌市北区北14条西9丁目,北海道大学札幌キャンパス,情報棟,1階,玄関入ってすぐ右)
    アクセス: https://www.ist.hokudai.ac.jp/access/
    集会HP: https://sites.google.com/view/str2024sapporo/
    35135885;11763074
Social Contribution Activities
  • 学術講演会「北海道から多文化共生を考える」
    17 Nov. 2024 - 17 Nov. 2024
    Presenter, Organizing member
    日本学術会議 北海道地区会議
    学術講演会「北海道から多文化共生を考える」
    北海道地区会議ニュース,学術講演会開催報告,No.55, 2025-3
    High school students, College students, Graduate students, Teachers, Guardians, Researchers, General, Scientific organization, Company, Civic organization, Governmental agency, Media
  • Public Lecture Series on Global COE Program of Center for Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation (GCOE-NGIT)
    2011 - 2011
    Lecturer, Planner
    Graduate School of Information Science and Technology
    Global COE Program Public Lecture Series on Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation (GCOE-NGIT): From networks via Nano-technology to Genome Science
Media Coverage
  • Press Fujitsu and Hokkaido University Develop "Explainable AI" Technology Providing Users with Concrete Steps to Achieve Desired Outcomes
    04 Feb. 2021
    Hokkaido University / Fujitsu Laboratories Ltd.
    Fujitsu Laboratories Ltd., Press Releases: https://www.fujitsu.com/global/about/resources/news/press-releases/2021/0204-01.html, [Others]
  • Fujitsu and Hokkaido University Develop "Explainable AI" Technology Providing Users with Concrete Steps to Achieve Desired Outcomes
    04 Feb. 2021
    Other than myself
    Nikkei
    日経新聞: https://www.nikkei.com/article/DGXLRSP604512_U1A200C2000000/, [Paper]