Arimura Hiroki(Faculty of Information Science and Technology　Computer Science and Information Technology　Knowledge Software Science)

SEARCH

Search Details

Arimura Hiroki

Faculty of Information Science and Technology　Computer Science and Information Technology　Knowledge Software Science Professor
Research Center of Mathematics for Social Creativity Professor
Institute for Academic Innovation Professor

Hiroki Arimura is a Professor in the Graduate School of Information Science and Technology of Hokkaido University. After receiving his PhD from Kyushu University in Computer Science in 1994, his primary interests has been in data mining, information retrieval, machine learning, algorithms and data structures, and their applications. He has served on the program committees of a number of international conferences including PAKDD, DS, ALT, PKDD, KDD and others. He was the director of the Global COE (Centers of Excellence) Program of "Center for Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation" at Hokkaido University by MEXT (Ministry of Education, Culture, Sports, Science and Technology of Japan) for 2007-2011.

Researcher basic information

■ Degree

Doctor of Science (D.Sc), Kyushu University, Jun. 1994

Master, Kyushu University, Mar. 1990

Bachelor of Science (B.Sc), Kyushu University, Mar. 1988

■ URL
researchmap URL

https://researchmap.jp/read0171664?lang=en

ホームページURL

https://orcid.org/0000-0002-2701-0271

https://jglobal.jst.go.jp/detail?JGLOBAL_ID=200901084578372110

http://www-ikn.ist.hokudai.ac.jp/~arim/

http://www-ikn.ist.hokudai.ac.jp/~arim/publication.html

■ Various IDs
ORCID ID

https://orcid.org/0000-0002-2701-0271

J-Global ID

200901084578372110

■ Research Keywords and Fields
Research Keyword

computational learning theory

Computer Science

Artificial Intelligence

data structures and algorithms

data mining

information retrieval

machine learning

enumeration algorithms

graph

decision trees

text index

string algorithm

semi-structured data

ウェブマイニング

pattern mining

explanable machine learning

database

spatio-temporal data

compression

stream processing

knowledge discovery

big data

Research Field

Informatics, Theory of informatics, Design and analysis of algorithms

Informatics, Database, data mining/Information Retrieval

Informatics, Intelligent informatics, machine learning

■ Educational Organization

Bachelor's degree program, School of Engineering

Master's degree program, Graduate School of Information Science and Technology

Doctoral (PhD) degree program, Graduate School of Information Science and Technology

Career

■ Career
Career

Apr. 2021 - Present
The Global Institution for Collaborative Research and Education (GI-CoRE), Hokkaido University, Associate faculty, Associate faculty, https://www.global.hokudai.ac.jp/research-education/global-research-and-education, Japan

Apr. 2019 - Present
Hokkaido University, Graduate School of Information Science and Technology, Professor

Apr. 2020 - Mar. 2026
Japan Science and Technology Agency (JST), Basic Research program, PRESTO Area "The fundamental technologies for Trustworthy AI" (Trustworthy AI), Research Supervisor (area director)

Oct. 2023 - Mar. 2025
National Institute of Advanced Industrial Science and Technology (AIST), Artificial Intelligence Research Center (AIRC), Invited researcher, Japan

Mar. 2025
Hokkaido University, Research Center of Mathematics for Social Creativity Research Institute for Electronic Science, Adjunct Professor, Japan

Apr. 2018 - Mar. 2022
Hokkaido University Collaborative Project Center, Knowledge Media Laboratory, Leader (director), Japan

Jan. 2008 - Mar. 2022
National Institute of Informatics, 連携研究部門, Adjunct professor

Apr. 2016 - Mar. 2021
Global Station for Big Data and Cybersecurity (GSB), Global Institution for Collaborative Research and Education (GI-CoRE),Hokkaido University, PI (adjunct), Japan

Apr. 2004 - Mar. 2019
Hokkaido University, Graduate School of Information Science and Technology, Professor

Apr. 2016 - Mar. 2018
Global Station for Big Data and Cybersecurity (GSB), Global Institution for Collaborative Research and Education (GI-CoRE), Director, Web site: https://gi-core.oia.hokudai.ac.jp/main/, Japan

Apr. 2015 - Mar. 2017
Hokkaido University Collaborative Project Center, Knowledge Media Laboratory, leader (director)

Apr. 2013 - Mar. 2015
Hokkaido University, Knowledge Media Laboratory, director

Jul. 2007 - Mar. 2012
Graduate School of Information Science and Technology, Hokkaido University, MEXT/JSPS Global COE program, “ Next Generation Information Technology Bases to Support Knowledge Creation”, Director, Japan

Mar. 2005 - Jun. 2005
Université Claude Bernard Lyon 1, Department of Computer Science, Visiting Researcher supported by MEXT (Ministry of Education, Culture, Sports, Science and Technology), Host professor: Pr. Mohand-Saïd Hacid, France

Apr. 1996 - Mar. 2004
Kyushu University, Graduate School of Information Science and Electrical Engineering, Associate Professor

Apr. 2002 - Mar. 2003
Kyushu University, Computing Center/Reseach Institute of Information Technology, Adjunct associate professor

Apr. 2001 - Mar. 2002
Kyushu University, University Library, R&D division, Adjunt associate professor

Oct. 1999 - Mar. 2002
JST PRESTO, "Information and Intelligence, Adjunct researcher

May 1996 - Oct. 1996
University of Helsinki, Department of Computer Science, Visiting researcher, supported by a JSPS (Japanese Society of Promotion of Science) exchange program, Host professor: Esko Ukkonen, Finland

Apr. 1995 - Mar. 1996
Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Associate Professor

Apr. 1994 - Mar. 1995
Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Lecturer

Apr. 1990 - Mar. 1994
Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Research associate (Associate professor), Japan

Educational Background

Apr. 1988 - Mar. 1990, Kyushu University, Interdisciplinary Graduate School of Engineering Sciences, Division of Information Systems, Master Course

Apr. 1984 - Mar. 1988, Kyushu University, Faculty of Science, Department of Physics

Committee Memberships

Oct. 2023 - Present
Science Council of Japan (SCJ), Member (Section III: Physical Sciences and Engineering, Informatics), Government

Apr. 2020 - Mar. 2026
Japan Science and Technology Agency (JST), Research Supervisor (area director), Basic Research program, PRESTO, area "The fundamental technologies for Trustworthy AI" (Trustworthy AI), Government

Oct. 2020 - Sep. 2023
Science Council of Japan, Associate member (Section III: Physical Sciences and Engineering, Informatics), Government

Apr. 2018 - Mar. 2020
Japan Society for the Promotion of Science, Advisory Board Member (Frontiers of Science Symposia)

Jun. 2017 - May 2019
Japanese Society for Artificial Intelligence (JSAI), Delegate, Society

Jun. 2017 - Mar. 2019
Ministry of Education, Culture, Sports, Science and Technology (MEXT), Associate Member, Council for Science and Technology, Government

Apr. 2011 - Mar. 2018
Japan Society for the Promotion of Science (JSPS), Advisory Board Member (Junior Member), Frontiers of Science Symposia, Society

Jun. 2015 - May 2016
Japanese Society for Artificial Intelligence, Program Committee Chair, the 30th Annual Conference of JSAI (JSAI 2016), Society

Jun. 2014 - May 2016
Japanese Society for Artificial Intelligence (JSAI), Board Member, Society

Dec. 2014 - Apr. 2015
Graduate School of Information Sciences, Tohoku University, External evaluation committee member:, Others

May 2008 - Apr. 2014
IEICE, Expert Member, IEICE Technical Committee on Computation, Society

2008 - 2014
Japan Science and Technology Agency (JST), Advisor, Basic Research program, PRESTO, area “ Synthesis of Knowledge for Information Oriented Society” (research supervisor: Hideyuki Nakashima), Government

2007 - Sep. 2013
International Conference on Discovery Science, Steering committee member, International Conference on Discovery Science, Society

May 2006 - Apr. 2008
IEICE, Vice Chair, IEICE Technical Committee on Computation, Society

Apr. 2006 - Mar. 2008
The Japanese Society for Artificial Intelligence (JSAI), Chair, JSAI Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI), Society

2008 - 2008
Japan Society for the Promotion of Science (JSPS), Planning Group Member (PGM), UJFoS2008, Society

2006 - 2007
Center for Research and Development Strategy, Japan Science and Technology Agency, Leader, Panel B, Science and Technology Foresight Strategy Workshop "Prediction and Discovery", Society

2006 - 2006
1st Int'l Workshop on Data Mining and Statistical Science (DMSS-2006), Sapporo, Chair, Society

2004 - 2006
The Japanese Society for Artificial Intelligence (JSAI), Councilor (representative),, Society

2004 - 2006
情報処理学会, データベース研究会研究運営委員, Society

2004 - 2006
The Japanese Society for Artificial Intelligence (JSAI), Committee Member, JSAI Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI), Society

2003 - 2005
Japan Society for the Promotion of Science (JSPS), Planning Group Member (PGM), Japanese-German Frontiers of Science Symposium (JGFoS 2004-2005), Society

1999 - 2000
the 11th International Conference on Algorithmic Learning Theory (ALT'00, Sydney), Program Committee Co-chairs, Society

Position History

知識メディア・ラボラトリー長, 2013年4月1日 - 2015年3月31日

Research activity information

■ Awards

Jun. 2024, The Symposium on Combinatorial Pattern Matching (CPM, established in 1994, the 35th in its 2024 edition), Test of Time Award (CPM Test of Time Award in year 2024)
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications (CPM 2021)
Toru Kasai;Gunho Lee;Hiroki Arimura;Setsuo Arikawa;Kunsoo Park, Organization: The Symposium on Combinatorial Pattern Matching (CPM) was established in 1990. This year's conference is the 35th edition since the first CPM was held in Paris in 1990. Criteria: "The award is granted to outstanding papers in algorithms research that were published in the CPM proceedings at least 20 years ago and which are still influential and stimulating for the field today." (from the Award page of CPM 2024: )
Paper: Toru Kasai, Gunho Lee, Hiroki Arimura, Setsuo Arikawa, Kunsoo Park, ``Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications,'' Proc.~CPM 2001, Springer, LNCS, pp.181-192, Jerusalem, Israel, July 1–4, 2001., International society, Japan

Jun. 2022, IPSJ, 2022 IPSJ Computer Science Research Award for Young Scientists
Tsubasa Oizumi, Hiroki Arimura: Efficient Algorithms for Cartesian Tree Subsequence Matching, IPSJ Technical Report, Vol.2022-AL-186,No.3,1-8, Jan. 2022
Tsubasa Oizumi

Jun. 2022, The Japanese Society for Artificial Intelligence, Best Paper Award
Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
Kentaro Kanamori;Takuya Takagi;Ken Kobayashi;Hiroki Arimura, 35135885;35136410;11763074

Jan. 2019, JSAI, Best Presentation Award
Fairness-aware Edit of a Learned Decision Tree Using Integer Linear Programming
Kentaro Kanamori;Hiroki Arimura

Jun. 2016, Information Processing Society Japan (IPSJ), Best Paper Award
Practical Algorithms for Mining Flock Patterns from Trajectories
Xiaoliang Geng;Takeaki Uno;Hiroki Arimura, Official journal

Mar. 2016, IPSJ, 2016 Yamashita Award
Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching, 2015-BIO-42，2015-MPS-104, 2015/6
Yoichi Sasaki

2015, IPSJ, Student Encouragement Award of IPSJ National Convention
Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching
Yoichi Sasaki

Jun. 2013, The Japanese Society for Artificial Intelligence, JSAI Incentive Award
Efficient Enumeration of Acyclic Sub-hypergraphs in Hypergraphs
Kunihiro Wasa;Hiroki Arimura;Takeaki Uno;Kouichi Hirata

Jun. 2010, IEICE, Inf. & Syst. Society Best Paper Award (4 year award)
「ゼロサプレス型二分決定グラフを用いたトランザクションデータベースの効率的解析手法」
Shin-ichi Minato;Hiroki Arimura

Feb. 2005, Japan Database Society Japan, Kambayashi memorial award
Hiroki Arimura

Nov. 2004, 2nd Workshop on Frequent Itemset Mining Implementations (FIMI'04), in conjunction with IEEE ICDM'04, BEST IMPLEMENTATION AWARD
Takeaki Uno;Masashi Kiyomi;Hiroki Arimura

Nov. 2004, Japan Society of Artificial Intelligence, 2004 SIG Best Paper Award
「大規模系列データから代表的な頻出エピソードを発見するための効率よいアルゴリズム」
Hiroki Arimura;Takeaki Uno

Jul. 2004, IEICE SIG-DE, DEWS2004 Best paper award
「半構造データマイニングのための高速な無順序木パターン発見手法」
Shinji Fusanobu;Tatsuya Asai;Hiroki Arimura;Takeaki Uno;Shin-ichi Nakano

Jun. 2003, 電子情報通信学会DE研究会第１４回データ工学ワークショップ, DEWS2003最優秀論文賞
「領域効率の良い頻出データアイテム発見アルゴリズム」
川副真治;有村博紀

May 2002, 電子情報通信学会DE研究会第１４回データ工学ワークショップ, DEWS2002優秀論文賞
浅井達哉;安部賢治;川副真治;坂本比呂志;有村博紀;有川節夫

May 2001, 人工知能学会, 2000年度論文賞
「テキストデータからの高速データマイニング」, 安部潤一郎, 藤野亮一, 下薗真一, 有村博紀, 有川節夫（2000年7月掲載）
安部潤一郎;藤野亮一;下薗真一;有村博紀;有川節夫

Apr. 2000, PAKDD2000, Paper with Merit Award
"Discovering unordered and ordered phrase association patterns for text mining"
Ryoichi Fujino;Hiroki Arimura;Setsuo Arikawa

Dec. 1999, 人工知能学会, 1999年度全国大会優秀論文賞
「大規模テキストデータからの探索的文書ブラウジング」
板井力;笠井透;有村博紀;有川節夫

Mar. 1998, 情報処理学会, 第５５回全国大会大会優秀賞
「最適パタン発見に基づくテキストデータマイニング」
有村博紀;渡木厚;藤野亮一;有川節夫

Jul. 1992, 人工知能学会, 全国大会優秀論文賞
「極小汎化に基づくPROLOGプログラムの正事実からの多項式時間推論」
有村博紀;石坂裕毅;篠原武

Jun. 1992, 人工知能学会, 1992年度研究奨励賞
「Polynomial Time Inference of Unions of Tree Pattern Languages」
有村博紀;石坂裕毅;篠原武

■ Papers

Computing Minimal Absent Words and Extended Bispecial Factors with CDAWG Space
Shunsuke Inenaga; Takuya Mieno; Hiroki Arimura; Mitsuru Funakoshi; Yuta Fujishige
Combinatorial Algorithms - 35th International Workshop, IWOCA 2024, Tainan, Taiwan, June 7-10, 2023, Proceedings. Lecture Notes in Computer Science, Springer, 14764, 327, 340, Jul. 2024, [Peer-reviewed], [International Magazine]
English, Scientific journal, 35135885;35135762;35136410

Finding Diverse Strings and Longest Common Subsequences in a Graph.
Yuto Shida; Giulia Punzi; Yasuaki Kobayashi; Takeaki Uno; Hiroki Arimura
The 35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024, 296, 27:1, 27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informati, 27 Jun. 2024, [Peer-reviewed], [Corresponding author], [International Magazine]
English, International conference proceedings, This paper introduces the Diverse Longest Common Subsequences (LCSs) problem under Hamming distance. Given a set of strings, the goal is to find a subset of K LCSs with diversity at least Δ, measured by either the sum or minimum of pairwise Hamming distances. We analyze the computational complexity of this problem, providing polynomial-time algorithms for bounded K and approximation algorithms for unbounded K. Moreover, we presented a few hardness results. The above results are obtained in a more general setting where strings are represented by a DAG, and prove that our positive results hold in this setting., 35135885;35136410;11763074

presentation slide

Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph.
Hiroki Arimura; Shunsuke Inenaga; Yasuaki Kobayashi; Yuto Nakashima; Mizuki Sue
Proceedings of the 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023), 14240, 28, 34, Lecture Notes in Computer Science, Sep. 2023, [Peer-reviewed], [Lead author, Corresponding author], [International Magazine]
English, International conference proceedings, This paper investigates the computational complexity of converting a Compact Directed Acyclic Word Graph (CDAWG) into various text indexing structures suitable for highly repetitive texts. By extending previous work, we present optimal construction algorithms in linear time and space for the run-length BWT, irreducible PLCP array, quasi-irreducible LPF array, lex-parse, and LZ77-parse from a CDAWG or its self-index version. Our algorithms consist of novel techniques for characterizing these arrays in terms of a CDAWG and efficiently enumerating specific subsets of suffixes using forward and backward search on the CDAWG., 35135885;11763074

Published

Minimum Consistent Subset for Trees Revisited.
Hiroki Arimura; Tatsuya Gima; Yasuaki Kobayashi 0001; Hiroomi Nochide; Yota Otachi
CoRR, abs/2305.07259, 2023
Scientific journal

Computing the Collection of Good Models for Rule Lists
Kota Mata; Kentaro Kanamori; Hiroki Arimura
Proc. the 18th International Conference on Machine Learning and Data Mining (MLDM 2022), abs/2204.11285, IbaI Publishing, Jul. 2022, [Peer-reviewed], [International Magazine]
English, International conference proceedings

Published

Cartesian Tree Subsequence Matching
Tsubasa Oizumi; Takeshi Kai; Takuya Mieno; Shunsuke Inenaga; Hiroki Arimura
Proc. the 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022), Leibniz International Proceedings in Informatics (LIPIcs), LIPIcs223, 14:1, 14:18, DROPS, Jun. 2022, [Peer-reviewed]
English, International conference proceedings

Published

Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
Transactions of the Japanese Society for Artificial Intelligence, 36, 6, C, L44_1, Japanese Society for Artificial Intelligence, 01 Nov. 2021, [Peer-reviewed]
English, Scientific journal, 35136410;11763074;35135885

Efficient enumeration of dominating sets for sparse graphs
Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
Discrete Applied Mathematics, 303, 283, 295, Elsevier BV, Nov. 2021, [Peer-reviewed]
English, Scientific journal

Fairness-Aware Decision Tree Editing Based on Mixed- Integer Linear Optimization
Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
Transactions of the Japanese Society for Artificial Intelligence, 36, 4, B, L13_1, Japanese Society for Artificial Intelligence, 01 Jul. 2021, [Peer-reviewed]
English, Scientific journal, 35136410;35135762;35135885

A constant amortized time enumeration algorithm for independent sets in graphs with bounded clique number
Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
Theoretical Computer Science, 874, 32, 41, Elsevier BV, Jun. 2021, [Peer-reviewed]
English, Scientific journal

Ordered Counterfactual Explanation by Mixed-Integer Linear Optimization.
Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Yuichi Ike; Kento Uemura; Hiroki Arimura
Proceedings of the AAAI Conference on Artificial Intelligence, 35, 13, 11564, 11574, Feb. 2021, [Peer-reviewed]
English, International conference proceedings

DACE: Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization.
Kentaro Kanamori; Takuya Takagi; Ken Kobayashi; Hiroki Arimura
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), 2855, 2862, ijcai.org, Jan. 2021, [Peer-reviewed]
English, International conference proceedings

Fully-Online Suffix Tree and Directed Acyclic Word Graph Construction for Multiple Texts.
Takuya Takagi; Shunsuke Inenaga; Hiroki Arimura; Dany Breslauer; Diptarama Hendrian
Algorithmica, 82, 5, 1346, 1377, 2020, [Peer-reviewed]
English, Scientific journal

Route graph: joint map-matching by maximizing posterior probability,
Hiroya Inakoshi; Junichi Shigezumi; Tatsuya Asai; Takuya Kida; Hiroki Arimura
IPSJ Transactions on Mathematical Modeling and Its Applications, 12, 1, 1, 9, IPSJ, Sep. 2019, [Peer-reviewed]
English, Scientific journal

Discovery of Regularized Areas with Maximal Confidence from Location Data
Hiroya Inakoshi; Tatsuya Asai; Takuya Kida; Hiroki Arimura
Transactions of the Japanese Society for Artificial Intelligence, 34, 3, D, I56_1, Japanese Society for Artificial Intelligence, 01 May 2019, [Peer-reviewed]
English, Scientific journal

The Complexity of Induced Tree Reconfiguration Problems
Kunihiro Wasa; Katsuhisa Yamanaka; Hiroki Arimura
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, E102-D, 3, 464, 469, The Institute of Electronics, Information and Communication Engineers, Mar. 2019, [Peer-reviewed]
English, Scientific journal,
Given two feasible solutions A and B, a reconfiguration problem asks whether there exists a reconfiguration sequence (A₀=A, A₁,...,A_ℓ=B) such that (i) A₀,...,A_ℓ are feasible solutions and (ii) we can obtain A_i from A_i-1 under the prescribed rule (the reconfiguration rule) for each i ∈ {1,...,ℓ}. In this paper, we address the reconfiguration problem for induced trees, where an induced tree is a connected and acyclic induced subgraph of an input graph. We consider the following two rules as the prescribed rules: Token Jumping: removing u from an induced tree and adding v to the tree, and Token Sliding: removing u from an induced tree and adding v adjacent to u to the tree, where u and v are vertices of an input graph. As the main results, we show that (I) the reconfiguration problemis PSPACE-complete even if the input graph is of bounded maximum degree, (II) the reconfiguration problem is W[1]-hard when parameterized by both the size of induced trees and the length of the reconfiguration sequence, and (III) there exists an FPT algorithm when the problem is parameterized by both the size of induced trees and the maximum degree of an input graph under Token Jumping and Token Sliding.

Constant Amortized Time Enumeration of Independent Sets for Graphs with Forbidden Subgraphs on Fixed Number of Vertices.
Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
CoRR, abs/1906.09680, 2019, [Peer-reviewed]
Scientific journal

Efficient Approximate 3-Dimensional Point Set Matching Using Root-Mean-Square Deviation Score.
Yoichi Sasaki; Tetsuo Shibuya; Kimihito Ito; Hiroki Arimura
IEICE Transactions, 102-A, 9, 1159, 1170, 2019, [Peer-reviewed]
English, Scientific journal

An Efficient Algorithm for Enumerating Chordal Bipartite Induced Subgraphs in Sparse Graphs
Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
Lecture Notes in Computer Science, 339, 351, Springer International Publishing, 2019, [Peer-reviewed]
English, International conference proceedings

Enumeration of Distinct Support Vectors for Interactive Decision Making.
Kentaro Kanamori; Satoshi Hara; Masakazu Ishihata; Hiroki Arimura
CoRR, abs/1906.01876, arXiv, 2019, [Peer-reviewed], [International Magazine]
English, International conference proceedings

Discovering Co-Cluster Structure from Relationships between Biased Objects
Iku OHAMA; Takuya KIDA; Hiroki ARIMURA
IEICE Transactions on Information and Systems, E101.D, 12, 3108, 3122, Institute of Electronics, Information and Communications Engineers (IEICE), 01 Dec. 2018, [Peer-reviewed]
English, Scientific journal

DenseZDD: a compact and fast index for families of sets
Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato; Kunihiko Sadakane
Algorithms, 11, 8, 128, 128, Aug. 2018, [Peer-reviewed]
English, Scientific journal

Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four
KURITA Kazuhiro; WASA Kunihiro; UNO Takeaki; ARIMURA Hiroki
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 101, 9, 1383, 1391, The Institute of Electronics, Information and Communication Engineers, 2018
English,
In this study, we address a problem pertaining to the induced matching enumeration. An edge set M is an induced matching of a graph G=(V,E). The enumeration of matchings has been widely studied in literature; however, there few studies on induced matching. A straightforward algorithm takes O(Δ²) time for each solution that is coming from the time to generate a subproblem, where Δ is the maximum degree in an input graph. To generate a subproblem, an algorithm picks up an edge e and generates two graphs, the one is obtained by removing e from G, the other is obtained by removing e, adjacent edge to e, and edges adjacent to adjacent edge of e. Since this operation needs O(Δ²) time, a straightforward algorithm enumerates all induced matchings in O(Δ²) time per solution. We investigated local structures that enable us to generate subproblems within a short time and proved that the time complexity will be O(1) if the input graph is C₄-free. A graph is C₄-free if and only if none of its subgraphs have a cycle of length four.

Efficient Enumeration of Subgraphs and Induced Subgraphs with Bounded Girth.
Kazuhiro Kurita; Kunihiro Wasa; Alessio Conte; Takeaki Uno; Hiroki Arimura
Combinatorial Algorithms - 29th International Workshop, IWOCA 2018, Singapore, July 16-19, 2018, Proceedings, 201, 213, Springer, 2018, [Peer-reviewed]
International conference proceedings

Efficient Enumeration of Dominating Sets for Sparse Graphs.
Kazuhiro Kurita; Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
The 29th International Symposium on Algorithms and Computation (ISAAC 2018), 8, 13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018, [Peer-reviewed]
English, International conference proceedings

On the Model Shrinkage Effect of Gamma Process Edge Partition Models
Iku Ohama; Issei Sato; Takuya Kida; Hiroki Arimura
Proc. the 31st Annual Conference on Neural Information Processing Systems (NIPS2017), Dec. 2017, [Peer-reviewed]
English, International conference proceedings

Packed compact tries: A fast and efficient data structure for online string processing
Takuya Takagi; Shunsuke Inenaga; Kunihiko Sadakane; Hiroki Arimura
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E100A, 9, 1785, 1793, Institute of Electronics, Information and Communication, Engineers, IEICE, 01 Sep. 2017, [Peer-reviewed]
English, Scientific journal

Statistical emerging pattern mining with multiple testing correction
Junpei Komiyama; Masakazu Ishihata; Hiroki Arimura; Takashi Nishibayashi; Shin-Ichi Minato
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 129685, 897, 906, Association for Computing Machinery, 13 Aug. 2017, [Peer-reviewed]
English, International conference proceedings

Discovering Relevance-Dependent Bicluster Structure from Relational Data
Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
Proc. the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), 2578, 2584, Aug. 2017, [Peer-reviewed]
English, International conference proceedings

Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four.
Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
CoRR, abs/1707.02740, 2017, [Peer-reviewed]
Scientific journal

Linear-size CDAWG: New repetition-aware indexing and grammar compression
Takuya Takagi; Keisuke Goto; Yuta Fujishige; Shunsuke Inenaga; Hiroki Arimura
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10508, 304, 316, Springer Verlag, 2017, [Peer-reviewed], [International Magazine]
English, International conference proceedings

Published

Estimating the Lineage Dynamics of Human Influenza B Viruses
Mayumbo Nyirenda; Ryosuke Omori; Heidi L. Tessmer; Hiroki Arimura; Kimihito Ito
PLOS ONE, 11, 11, e0166107, Nov. 2016, [Peer-reviewed]
English, Scientific journal

Sequence binary decision diagram: Minimization, relationship to acyclic automata, and complexities of Boolean set operations
Shuhei Denzumi; Ryo Yoshinaka; Hiroki Arimura; Shin-ichi Minato
DISCRETE APPLIED MATHEMATICS, 212, 61, 80, Oct. 2016, [Peer-reviewed]
English, Scientific journal

Fully-online construction of suffix trees for multiple texts
Takuya Takagi; Shunsuke Inenaga; Hiroki Arimura
Proc. the 27th Annual Symposium on Combinatorial Pattern Matching (CPM'16), Leibniz International Proceedings in Informatics (LIPIcs), 54, 22:1, 22:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Jun. 2016, [Peer-reviewed]
English, International conference proceedings

Published

The Relevance Dependent Infinite Relational Model for Discovering Co-Cluster Structure from Relationships with Structured Noise
Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E99D, 4, 1139, 1152, Apr. 2016, [Peer-reviewed]
English, Scientific journal

Relaxing the Data Access Bottleneck of Geographic Big-data Analytics Applications using Distributed Quad trees
Mayumbo Nyirenda; Hiroki Arimura; Kimihito Ito
PROCEEDINGS OF 2016 5TH INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 189, 194, 2016, [Peer-reviewed]
English, International conference proceedings

The Complexity of Induced Tree Reconfiguration Problems
Kunihiro Wasa; Katsuhisa Yamanaka; Hiroki Arimura
LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS, LATA 2016, 9618, 330, 342, 2016, [Peer-reviewed]
English, International conference proceedings

Packed Compact Tries: A Fast and Efficient Data Structure for Online String Processing
Takuya Takagi; Shunsuke Inenaga; Kunihiko Sadakane; Hiroki Arimura
Combinatorial Algorithms, 9843, 213, 225, 2016, [Peer-reviewed]
English, International conference proceedings

二部グラフ中に含まれる弦二部誘導グラフの列挙
和佐州洋; 有村博紀; 宇野毅明; 平田耕一
第154回情報処理学会アルゴリズム研究会, 154, 1, 8, Sep. 2015
Japanese, Symposium

Multi-Layered Framework for Modeling Relationships Between Biased Objects
Iku Ohama; Takuya Kida; Hiroki Arimura
Proc. 2015 SIAM International Conference on Data Mining (SDM'15), 9, Apr. 2015, [Peer-reviewed]
English, International conference proceedings

Practical Algorithms for Mining Flock Patterns from Trajectories
Xiaoliang Geng; Takeaki Uno; Hiroki Arimura
Transactions of IPSJ, 56, 4, 1292, 1304, Apr. 2015, [Peer-reviewed]
Japanese, Scientific journal

Efficient Approximate 3-Dimensional Point Set Matching Using Root-Mean-Square Deviation Score
Yoichi Sasaki; Tetsuo Shibuya; Kimihito Ito; Hiroki Arimura
SIMILARITY SEARCH AND APPLICATIONS, SISAP 2015, 9371, 191, 203, 2015, [Peer-reviewed]
English, International conference proceedings

Manipulatable Compressed String Indexing Technology for Pattern Matching
Shuhei Denzumi; Hiroki Arimura; Kunihiko Sadakane
The journal of the Institute of Electronics, Information and Communication Engineers, 97, 12, 1080, 1085, Dec. 2014, [Peer-reviewed]
Japanese, Research society

Enumeration of complete set of flock patterns in trajectories
Xiaoliang Geng; Takuya Takagi; Hiroki Arimura; Takeaki Uno
Proceedings of the 5th ACM SIGSPATIAL International Workshop on GeoStreaming, IWGS 2014, 53, 61, Association for Computing Machinery, Inc, 04 Nov. 2014, [Peer-reviewed]
English, International conference proceedings

Oblivious evaluation of non-deterministic finite automata with application to privacy-preserving virus genome detection
Hirohito Sasakawa; Hiroki Harada; David A. Du Verle; Hiroki Arimura; Koji Tsuda; Jun Sakuma
Proceedings of the ACM Conference on Computer and Communications Security, 21, 30, Association for Computing Machinery, 03 Nov. 2014, [Peer-reviewed]
English, International conference proceedings

Fast Regular Expression Matching Based on Dual Glushkov NFA
Ryutaro Kurai; Norihito Yasuda; Hiroki Arimura; Shinobu Nagayama; Shin-ichi Minato
Proc. Prague Stringology Conference 2014 (PSC'14), 3, 16, Sep. 2014, [Peer-reviewed]
English, International conference proceedings

Enumeration of Induced Subtrees in a K-Degenerate Graph
Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
第148回情報処理学会アルゴリズム研究会, 148, 17, 1, 6, Jun. 2014
Japanese, Symposium

Constant Time Enumeration of Subtrees with Exactly k Nodes in a Tree
Kunihiro Wasa; Yusaku Kaneta; Takeaki Uno; Hiroki Arimura
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E97D, 3, 421, 430, Mar. 2014, [Peer-reviewed]
English, Scientific journal

DenseZDD: A Compact and Fast Index for Families of Sets
Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato; Kunihiko Sadakane
EXPERIMENTAL ALGORITHMS, SEA 2014, 8504, 187, 198, 2014, [Peer-reviewed]
English, International conference proceedings

Efficient Enumeration of Induced Subtrees in a K-Degenerate Graph
Kunihiro Wasa; Hiroki Arimura; Takeaki Uno
ALGORITHMS AND COMPUTATION, ISAAC 2014, 8889, 94, 102, 2014, [Peer-reviewed]
English, International conference proceedings

Compact Complete Inverted Files for Texts and Directed Acyclic Graphs Based on Sequence Binary Decision Diagrams
Shuhei Denzumi; Koji Tsuda; Hiroki Arimura; Shin-ichi Minato
In Proc. of Prague Stringology Conference 2013 (PSC2013), 157, 167, Sep. 2013, [Peer-reviewed]
English, International conference proceedings

Efficient Algorithms for Finding All Length-Maximal Flock Patterns from a Set of Trajectories
Hiroki Arimura; Xiaoliang Geng; Takeaki Uno
IPSJ SIG Notes, 2013-AL-144, 3, 1, 8, Information Processing Society of Japan (IPSJ), May 2013
Japanese, Symposium, In this paper, we study the problem of finding a class of spatio-temporal patterns called (m, k)-flock patterns (Gudmundsson and van Kreveld, Proc. ACM GIS'06; Benkert, Gudmundsson, Hubner, Wolle, Computational Geometry, 41:11, 2008), which represent a groups of moving objects close each other within width at most r under L∞-norm in a given time segment of length at least k, in a collection of 2-dimensional trajectories. For max-width r > 0, min-length k, and a collection S of n trajectories of legnth T, the proposed algorithm finds all length-maximal (m, k) flock patterns in an input collection of trajectory data in O(pnT2) delay and O(p2) space, p = |X| is the size of ID set being enumerated. We also present a practical improvement using geometric indexes.

Succinct Indices Based on Zero-Suppressed Binary Decision Diagrams
Shuhei Denzumi; Jun Kawahara; Koji Tsuda; Hiroki Arimura; Kunihiko Sadakane; Shin-ichi Minato
IEICE technical report. Theoretical foundations of Computing, 112, 498, 23, 30, The Institute of Electronics, Information and Communication Engineers, Mar. 2013
English, Symposium, In many real-life problems, we are often faced with manipulating families of sets. Manipulation of large-scale set families is one of the important fundamental techniques for web information retrieval, integration, and mining. For this purpose, a special type of binary decision diagram (BDD), called Zero-suppressed BDDs (ZDDs)js used. However there is a problem of huge required space for storing ZDDs and slow membership operations. This paper introduces DenseZDD, a compressed index for static ZDDs. Our technique not only indexes set families compactly but also executes fast membership operations. We also propose a hybrid method of DenseZDD and ordinary ZDDs to make the index dynamic.

Development of a Battery Powered Compact 2000m Class ROV System and Its Method of Operation
Jun YAMAMOTO; Toshihiro IWAMORI; Naoki HOSHI; Takuzo ABE; Keiichiro SAKAOKA; Yoshihiko KAMEI; Shogo TAKAGI; Osamu NUMAMOTO; Yukihiro SAKA; Kazuhisa SUEOKA; Hiroki ARIMURA; Hidemi WATANABE
Journal of Fisheries Technology, 5, 2, 171, 174, 水産総合研究センター, Feb. 2013, [Peer-reviewed]
Japanese, Scientific journal, We developed a battery powered compact 2000m class ROV (Remotely Operated Vehicle) system with a High-Definition video camera. It does not require specialized equipment to operate. It can be operated using only general purpose equipment. This system mainly consists of a shipboard controller, a vehicle and a launcher. A thin, light optical fiber cable (diameter 9mm, length 2,500m), the primary cable, transfers control data and video images between the shipboard controller and the launcher. The secondary cable, a composite cable (diameter 14.2mm, length 50m), transfers control data and video images and supplies power to the vehicle from the six packs of lithium-ion batteries, which are mounted in the launcher. The launcher is suspended by a rope from the support ship, and the depth of the launcher is adjusted by changing the length of the rope using a general purpose rewinder.

Although initially, we had some trouble due to the launcher rope and the primary cable getting tangled, a newly-designed instrument that restricts the movement of the carabiner, and the use of a low expansion rope, facilitated smoother operation and an easier recovery of the ROV.

Faster Algorithms for Tree Similarity Based on Compressed Enumeration of Bounded-Sized Ordered Subtrees
Kunihiro Wasa; Kouichi Hirata; Takeaki Uno; Hiroki Arimura
SIMILARITY SEARCH AND APPLICATIONS (SISAP), 8199, 73, 84, 2013, [Peer-reviewed]
English, International conference proceedings

An extension of the infinite relational model incorporating interaction between objects
Iku Ohama; Hiromi Iida; Takuya Kida; Hiroki Arimura
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7819, 2, 147, 159, Springer, 2013, [Peer-reviewed]
English, International conference proceedings

Polynomial Delay and Space Discovery of Connected and Acyclic Sub-hypergraphs in a Hypergraph
Kunihiro Wasa; Takeaki Uno; Kouichi Hirata; Hiroki Arimura
DISCOVERY SCIENCE, 8140, 308, 323, 2013, [Peer-reviewed]
English, International conference proceedings

Trajectory pattern mining in practice algorithms for mining flock patterns from trajectories
Xiaoliang Geng; Takeaki Uno; Hiroki Arimura
IC3K 2013; KDIR 2013 - 5th International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 - 5th International Conference on Knowledge Management and Information Sharing, Proc., 143, 151, SciTePress, 2013
English, International conference proceedings

超グラフ中に含まれる非巡回部分超グラフの効率よい列挙,
和佐州洋; 有村博紀; 宇野毅明; 平田耕一
第88回人工知能基本問題研究会, 88, 85, 90, Jan. 2013
Japanese, Symposium

Counterexamples to the long-standing conjecture on the complexity of BDD binary operations
Ryo Yoshinaka; Jun Kawahara; Shuhei Denzumi; Hiroki Arimura; Shin-ichi Minato
INFORMATION PROCESSING LETTERS, 112, 16, 636, 640, Aug. 2012, [Peer-reviewed]
English, Scientific journal

Faster bit-parallel algorithms for unordered pseudo-tree matching and tree homeomorphism
Yusaku Kaneta; Hiroki Arimura; Rajeev Raman
Journal of Discrete Algorithms, 14, 119, 135, Jul. 2012, [Peer-reviewed]
English, International conference proceedings

A Dynamically Reconfigurable FPGA-Based Pattern Matching Hardware for Subclasses of Regular Expressions
Yusaku Kaneta; Shingo Yoshizawa; Shin-ichi Minato; Hiroki Arimura; Yoshikazu Miyanaga
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E95D, 7, 1847, 1857, Jul. 2012, [Peer-reviewed]
English, Scientific journal

Faster Pattern Matching Algorithm for Very Long Extended Patterns with Applications to Large-scale String Matching
笹川裕人; 金田悠作; 有村博紀
日本データベース学会論文誌, 11, 1, 55, 60, 日本データベース学会, Jun. 2012
Japanese

Rich Operations for Manipulating Sequence Binary Decision Diagrams
DENZUMI Shuhei; ARIMURA Hiroki; MINATO Shin-ichi
IEICE technical report. Theoretical foundations of Computing, 112, 93, 9, 16, 電子情報通信学会, Jun. 2012
Japanese, Manipulating large sequence data is important problem in string processing field. In this paper, we deal with sequence binary decision diagrams proposed by Loekito et al. in 2009. This data structure compactly and uniquely represents large set of many strings, and executes set operaitions efficiently. However, we have only basic set operations on processing systems already proposed. In this paper, we show fundamental structure of sequence binary decision diagrams, define various and advanced operations for string sets, and propose efficient algorithms to execute the operations.

木に含まれる限定サイズ部分木の列挙
和佐州洋; 宇野毅明; 有村博紀
第140回情報処理学会アルゴリズム研究会, 2012-AL-140, 4, 1, 8, May 2012
Japanese, Symposium

Constant time enumeration of bounded-size subtrees in trees and its application
Kunihiro Wasa; Yusaku Kaneta; Takeaki Uno; Hiroki Arimura
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7434, 347, 359, Springer, 2012, [Peer-reviewed]
English, International conference proceedings

Pattern mining from trajectory GPS data
Xiaoliang Geng; Hiroki Arimura; Takeaki Uno
Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012, 60, 65, 2012, [Peer-reviewed]
English, International conference proceedings

Trajectory pattern matching based on bit-parallelism for large GPS data
Hirohito Sasakawa; Hiroki Arimura
Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012, 66, 71, 2012, [Peer-reviewed]
English, International conference proceedings

High-speed String and Regular Expression Matching on FPGA
Yusaku Kaneta; Shingo Yoshizawa; Shin-ichi Minato; Hiroki Arimura
Proc. of Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2011 (APSIPA ASC 2011, Oct. 2011, [Peer-reviewed]
English, International conference proceedings

Faster Bit-Parallel Algorithms for Unordered Pseudo-tree Matching and Tree Homeomorphism
Yusaku Kaneta; Hiroki Arimura
COMBINATORIAL ALGORITHMS, 6460, 68, 81, 2011, [Peer-reviewed]
English, International conference proceedings

Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations
Shuhei Denzumi; Ryo Yoshinaka; Hiroki Arimura; Shin-ichi Minato
PROCEEDINGS OF THE PRAGUE STRINGOLOGY CONFERENCE 2011, 147, 161, 2011, [Peer-reviewed]
English, International conference proceedings

Implementation of Sequence BDDs in Erlang
Shuhei Denzumi; Hiroki Arimura; Shin-ichi Minato
ERLANG 11: PROCEEDINGS OF THE 2011 ACM SIGPLAN ERLANG WORKSHOP, 90, 91, 2011, [Peer-reviewed]
English, International conference proceedings

Sparse and Truncated Suffix Trees on Variable-Length Codes
Takashi Uemura; Hiroki Arimura
COMBINATORIAL PATTERN MATCHING, 22ND ANNUAL SYMPOSIUM, CPM 2011, 6661, 246, 260, 2011, [Peer-reviewed]
English, International conference proceedings

Unsupervised Spam Detection by Document Probability Estimation with Maximal Overlap Method
Uemura Takashi; Ikeda Daisuke; Kida Takuya; Arimura Hiroki
Information and Media Technologies, 6, 1, 231, 240, Information and Media Technologies Editorial Board, 2011
English, In this paper, we study content-based spam detection for spams that are generated by copying a seed document with some random perturbations. We propose an unsupervised detection algorithm based on an entropy-like measure called document complexity, which reflects how many similar documents exist in the input collection of documents. As the document complexity, however, is an ideal measure like Kolmogorov complexity, we substitute an estimated occurrence probability of each document for its complexity. We also present an efficient algorithm that estimates the probabilities of all documents in the collection in linear time to its total length. Experimental results showed that our algorithm especially works well for word salad spams, which are believed to be difficult to detect automatically.

Preface
Hiroki Arimura; Raoul Medina; Jean-Marc Petit; Franz Baader; Jean-François Boulicaut; Bruno Crémilleux; Luc De Raedt; Khaled Elbassioni; Alain Gely; Bart Goethals; Sergei Kuznetsov; Dominique Laurent; Hong-Cheu Liu; Joao Marques-Silva; Rokia Missaoui; Siegfried Nijssen; Lhouari Nourine; Marie-Christine Rousset; Lakhdar Sais; Thomas Schiex; Takeaki Uno; Jef Wijsen
Proceedings - IEEE International Conference on Data Mining, ICDM, liii, liv, 2011, [Peer-reviewed]
English, International conference proceedings

Mining Frequent k-Partite Episodes from Event Sequences
Takashi Katoh; Hiroki Arimura; Kouichi Hirata
NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE, 6284, 331, 344, 2010, [Peer-reviewed]
English, International conference proceedings

Fast Bit-Parallel Matching for Network and Regular Expressions
Yusaku Kaneta; Shin-ichi Minato; Hiroki Arimura
STRING PROCESSING AND INFORMATION RETRIEVAL, 6393, 372, 384, 2010, [Peer-reviewed]
English, International conference proceedings

Dynamic reconfigurable bit-parallel architecture for large-scale regular expression matching
Yusaku Kaneta; Shingo Yoshizawa; Shin-Ichi Minato; Hiroki Arimura; Yoshikazu Miyanaga
Proceedings - 2010 International Conference on Field-Programmable Technology, FPT'10, 21, 28, IEEE, 2010, [Peer-reviewed]
English, International conference proceedings

An Efficient Depth-first Search Algorithm for Extracting Frequent Diamond Episodes from Event Sequences
Katoh Takashi; Arimura Hiroki; Hirata Kouichi
Information and Media Technologies, 5, 2, 459, 470, Information and Media Technologies Editorial Board, 2010
English, In this paper, we study the problem of mining frequent diamond episodes efficientlyfrom an input event sequence with sliding a window. Here, a diamond episode is of the form a → E → b, which means that every event of E follows an event a and is followed by an event b. Then, we design a polynomial-delay and polynomial-space algorithm PolyFreqDmd that finds all of the frequent diamond episodes without duplicates from an event sequence in O(|Σ|²l) time per an episode and in O(|Σ|+l) space, where Σ and l are an alphabet and the length of the event sequence, respectively. Finally, we give experimental results on artificial and real-world event sequences with varying several mining parameters to evaluate the efficiency of the algorithm.

An Adaptive Algorithm for Splitting Large Sets of Strings and Its Application to Efficient External Sorting
Tatsuya Asai; Seishi Okamoto; Hiroki Arimura
NEW FRONTIERS IN APPLIED DATA MINING, 5433, 13, +, 2009, [Peer-reviewed]
English, International conference proceedings

Flexible Framework for Time-Series Pattern Matching over Multi-dimension Data Stream
Takuya Kida; Tomoya Saito; Hiroki Arimura
NEW FRONTIERS IN APPLIED DATA MINING, 5433, 1, 12, 2009, [Peer-reviewed]
English, International conference proceedings

Efficient serial episode mining with minimal occurrences
Hideyuki Ohtani; Takuya Kida; Takeaki Uno; Hiroki Arimura
Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, ICUIMC'09, 457, 464, ACM, 2009, [Peer-reviewed]
English, International conference proceedings

Mining Frequent Bipartite Episode from Event Sequences
Takashi Katoh; Hiroki Arimura; Kouichi Hirata
DISCOVERY SCIENCE, PROCEEDINGS, 5808, 136, +, 2009, [Peer-reviewed]
English, International conference proceedings

A Linear-Time Off-Line Construction of Property Suffix Trees
UEMURA Takashi; KIDA Takuya; ARIMURA Hiroki
The IEICE Trans Inf & Syst., 91, 3, 595, 607, 一般社団法人電子情報通信学会, Mar. 2008
Japanese, プロパティ付きテキストとは，長さnのテキストに，補助情報としてテキスト上の互いにオーバラップを許した区間の集合(プロパティという)が付加された構造化文書の一種であり，アノテーション付きの系列データの形式的なモデルとなっている. このプロパティ付きテキストへの全文テキスト索引として， Amirら(CPM2006) は，プロパティ接尾辞木を提案した. これは，プロパティの各区間に含まれるすべての部分文字列を格納する索引構造であり，遺伝子情報や，ビデオストリーム，メタデータ付き時系列データなどへの応用がある. また，高度な検索問題である重み付きパターン照合にも用いられる. Amirらは，定数サイズのアルファベット上で，プロパティ接尾辞木をO(n log log n)時間でオフライン構築するアルゴリズムを与えたが，その線形時間構築アルゴリズムは，現在まで未解決の問題であった. 本論文では，定数アルファベット上で，プロパティ接尾辞木を線形時間で構築するオフラインアルゴリズムを与え，この問題を肯定的に解決する. 提案アルゴリズムは，接尾辞リンクの巡回を用いた簡潔な手法であり，理論的に効率良いだけでなく，実際のデータに対しても高速に動作する. 更に，人工データ上の計算機実験を行い，実際の性能を評価する.

Ambiguous frequent itemset mining and polynomial delay enumeration
Takeaki Uno; Hiroki Arimura
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 357, +, 2008, [Peer-reviewed]
English, International conference proceedings

LCM over ZBDDs: Fast generation of very large-scale frequent itemsets using a compact graph-based representation
Shin-ichi Minato; Takeaki Uno; Hiroki Arimura
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 234, +, 2008, [Peer-reviewed]
English, International conference proceedings

Efficient algorithms for mining frequent and closed patterns from semi-structured data
Hiroki Arimura
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 5012, 2, +, 2008, [Peer-reviewed]
English, International conference proceedings

Unsupervised Spam Detection by Document Complexity Estimation
Takashi Uemura; Daisuke Ikeda; Hiroki Arimura
DISCOVERY SCIENCE, PROCEEDINGS, 5255, 319, +, 2008, [Peer-reviewed]
English, International conference proceedings

擬似頻出アイテムの集合の多項式遅延列挙アルゴリズム
宇野毅明; 有村博紀
第4回人工知能学会データマイニングと統計数理研究会（SIG-DMSM）, 2007, Jul. 2007, [Invited]
Japanese, Symposium

Efficient algorithms for finding frequent substructures from semi-structured data streams
Tatsuya Asai; Kenji Abe; Shinji Kawasoe; Hiroki Arimura; Setsuo Arikawa
NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE, 3609, 29, +, 2007, [Peer-reviewed]
English, International conference proceedings

An efficient polynomial delay algorithm for pseudo frequent itemset mining
Takeaki Uno; Hiroki Arimura
DISCOVERY SCIENCE, PROCEEDINGS, 4755, 219, +, 2007, [Peer-reviewed]
English, International conference proceedings

Time and space efficient discovery of maximal geometric graphs
Hiroki Arimura; Takeaki Uno; Shinichi Shimozon
DISCOVERY SCIENCE, PROCEEDINGS, 4755, 42, +, 2007, [Peer-reviewed], [Lead author]
English, International conference proceedings

Frequent Closed Item Set Mining Based on Zero-suppressed BDDs
Minato Shin-ichi; Arimura Hiroki
Information and Media Technologies, 2, 1, 309, 316, Information and Media Technologies Editorial Board, 2007
English, Frequent item set mining is one of the fundamental techniques for knowledge discovery and data mining. In the last decade, a number of efficient algorithms for frequent item set mining have been presented, but most of them focused on just enumerating the item set patterns which satisfy the given conditions, and it was a different matter how to store and index the result of patterns for efficient data analysis. Recently, we proposed a fast algorithm of extracting all frequent item set patterns from transaction databases and simultaneously indexing the result of huge patterns using Zero-suppressed BDDs (ZBDDs). That method, ZBDD-growth, is not only enumerating/listing the patterns efficiently, but also indexing the output data compactly on the memory to be analyzed with various algebraic operations. In this paper, we present a variation of ZBDD-growth algorithm to generate frequent closed item sets. This is a quite simple modification of ZBDD-growth, and additional computation cost is relatively small compared with the original algorithm for generating all patterns. Our method can conveniently be utilized in the environment of ZBDD-based pattern indexing.

An efficient algorithm for complex pattern matching over continuous data streams based on bit-parallel method
Tomoya Saito; Takuya Kida; Hiroki Arimura
2007 IEEE INTERNATIONAL WORKSHOP ON DATABASES FOR NEXT GENERATION RESEARCHERS, 13, +, 2007, [Peer-reviewed]
English, International conference proceedings

大規模テキストマイニングのための特徴部分列抽出と極大パターン発見を組み合わせた効率的アルゴリズム,
有村博紀; 宇野毅明
第63回人工知能学会人工知能基礎問題研究会, 45, 52, Sep. 2006
Japanese, Symposium

２次元離散テキストからの効率よい極大パターンマイニング
下薗真一; 宇野毅明; 有村博紀
第63回人工知能学会人工知能基本問題研究会, Sep. 2006, [Invited]
Japanese, Symposium

LA_002 Efficient Algorithm of Constructing Suffix Tree with Word Count Limitation
Uemura Takashi; Kida Takuya; Arimura Hiroki
情報科学技術レターズ, 5, 5, 8, Forum on Information Technology, Aug. 2006
Japanese, Symposium

Computational challenges of massive data sets and randomness in computation - J.UCS special issue on the First and Second Japanese-German Frontiers of Science Symposia
Joerg Rothe; Hiroki Arimura
JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 12, 6, 579, 580, 2006, [Peer-reviewed]
English

Algorithmic Learning Theory (ALT 2000) - Preface
H Arimura; S Jain
THEORETICAL COMPUTER SCIENCE, 348, 1, 1, 2, Dec. 2005, [Invited]
English, Scientific journal

Efficient method of combinatorial item set analysis based on zero-suppressed BDDs
S Minato; H Arimura
International Workshop on Challenges in Web Information Retrieval and Integration, Proceedings, 4, 11, 2005, [Peer-reviewed]
English, International conference proceedings

Key semantics extraction by dependency tree mining
Satoshi Morinaga; Hiroki Arimura; Takahiro Ikeda; Yosuke Sakao; Susumu Akamine
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 666, 671, ACM, 2005, [Peer-reviewed]
English, International conference proceedings

A polynomial space and polynomial delay algorithm for enumeration of maximal motifs in a sequence
H Arimura; T Uno
ALGORITHMS AND COMPUTATION, 3827, 724, 737, 2005, [Peer-reviewed]
English, Scientific journal

Special Issue on Algorithmic Learning Thoery
Hiroki Arimura
2005

Efficient substructure discovery from large semi-structured data
T Asai; K Abe; S Kawasoe; H Sakamoto; H Arimura; S Arikawa
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E87D, 12, 2754, 2763, Dec. 2004, [Peer-reviewed]
English, Scientific journal

Pattern Matching with Taxonomic Information
Kida Takuya; Arimura Hiroki
Asia Information Retrieval Symposium (AIRS2004), 265, 268, Asia Information Retrieval Symposium, Oct. 2004
English, In this paper, we study the pattern matching problem with taxonomic information (PMTX, for short), where taxonomy is a partially ordered set of letters describing an IS-A hierarchy. We present an efficient algorithm for PMTX that runs in O(mn/w) time with O(m+mh/w) preprocess and O(mσ/w) extra space, where h, m, t, σ, and w are the size of the taxonomic information H, the length of the pattern P ∈ Σ∗, the length of the text T ∈ Σ∗, the cardinality of the sorted alphabet Σ, and the computer word length, respectively. If the patternlength m is less than w, it runs in O(n) time, O(m + h) preprocess, and O(σ) space, and works very fast in practice. We also discuss various improvements of the algorithms for real world applications.

頻出・飽和・極大頻出集合の効率的な列挙アルゴリズムとその実装
宇野毅明; 有村博紀
日本ソフトウェア科学会, 第4回データマイニングワークショップ, 47, 54, Sep. 2004
Japanese, Symposium

LCM: 頻出飽和アイテム集合を列挙する高速なアルゴリズム
宇野毅明; 内田雄三; 浅井達哉; 有村博紀
第54回人工知能学会人工知能基礎論研究会, 23, 29, Mar. 2004
Japanese, Symposium

半構造データマイニングのための高速な無順序木パターン発見手法
房延慎二; 浅井達哉; 有村博紀; 宇野毅明; 中野眞一
第１５回データ工学ワークショップ（DEWS2004）, Mar. 2004
Japanese, Symposium

LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets
Takeaki Uno; Masashi Kiyomi; Hiroki Arimura
FIMI '04, Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, Brighton, UK, November 1, 2004, CEUR-WS.org, 2004, [Peer-reviewed]

An efficient algorithm for enumerating closed patterns in transaction databases
T Uno; T Asai; Y Uchida; H Arimura
DISCOVERY SCIENCE, PROCEEDINGS, 3245, 16, 31, 2004, [Peer-reviewed]
English, Scientific journal

Learning elementary formal systems with queries
H Sakamoto; K Hirata; H Arimura
THEORETICAL COMPUTER SCIENCE, 298, 1, 21, 50, Apr. 2003, [Peer-reviewed]
English, Scientific journal

LCM: An Efficient Algorithm for Enumerating Frequent Closed Item Sets.
Takeaki Uno; Tatsuya Asai; Yuzo Uchida; Hiroki Arimura
FIMI '03, Frequent Itemset Mining Implementations, Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations, 19 December 2003, Melbourne, Florida, USA, CEUR-WS.org, 2003, [Peer-reviewed]

Efficient substructure discovery from large semi-structured data
T Asai; K Abe; S Kawaoe; H Arimura; H Sakamoto; S Arikawa
PROCEEDINGS OF THE SECOND SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 158, 174, 2002, [Peer-reviewed]
English, International conference proceedings

Optimized Substructure Discovery for Semi-structured Data.
Kenji Abe; Shinji Kawasoe; Tatsuya Asai; Hiroki Arimura; Setsuo Arikawa
Principles of Data Mining and Knowledge Discovery, 6th European Conference, PKDD 2002, Helsinki, Finland, August 19-23, 2002, Proceedings, 1, 14, Springer, 2002, [Peer-reviewed]

Online algorithms for mining semi-structured data stream
T Asai; H Arimura; K Abe; S Kawasoe; S Arikawa
2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 27, 34, 2002, [Peer-reviewed]
English, International conference proceedings

Knowledge Discovery from Semistructured Texts.
Hiroshi Sakamoto; Hiroki Arimura; Setsuo Arikawa
Progress in Discovery Science, Final Report of the Japanese Discovery Science Project, 586, 599, Springer, 2002, [Peer-reviewed]

Efficient Data Mining from Large Text Databases.
Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
Progress in Discovery Science, Final Report of the Japanese Discovery Science Project, 123, 139, Springer, 2002, [Peer-reviewed]

Efficient text mining with optimized pattern discovery
H Arimura
COMBINATORIAL PATTERN MATCHING, 2373, 17, 19, 2002, [Peer-reviewed]
English, Scientific journal

Mining Frequent Substructures from Web
Kenji Abe; Shinji Kawasoe; Tatsuya Asai; Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
Active Mining - New Directions of Data Mining, 79, 83, 94, IOS Press, 2002, [Peer-reviewed]

Efficiently Mining Frequent Substructures from Semi-structured Data
Tatsuya Asai; Kenji Abe; Shinji Kawasoe; Hiroshi Sakamoto; Hiroki Arimura; Setsuo Arikawa
Proc. International Workshop on Informations & Electrical Engineering (IWIE2002), 59, 64, 2002, [Peer-reviewed]

Practical Algorithms for Constructing Large Suffix Arrays on Distributed Memory Parallel Computers
ASAKA HIROKI; KAWASOE SHINJI; ABE JUNICHIRO; ARIMURA HIROKI; ARIKAWA SETSUO
情報処理学会論文誌数理モデル化と応用（TOM）, 42, 14, 14, 24, Information Processing Society of Japan (IPSJ), 15 Dec. 2001
Japanese, In this paper, we study efficient parallel construction of full-text indexing structures for large text data. The suffix array is a compact full-text indexing structure that is useful in information retrieval and bio-informatics. We propose an efficient parallel algorithm for constructing suffix arrays on distributed memory parallel computers. This algorithm is a parallel implementation of the well-known external memory algorithm, called Baeza-Yates-Gonnet-Sinder (BGS) algorithm. By theoretical analysis, we show that our algorithm runs more efficiently than Riberio-Kitajima-Ziviani (RKZ) algorithm, another parallel implementation of the BGS algorithm, in terms of parallel time and communication complexities.

Extracting Text Data from HTML Documents
MURAKAMI YOSHITSUGU; SAKAMOTO HIROSHI; ARIMURA HIROKI; ARIKAWA SETSUO
情報処理学会論文誌数理モデル化と応用（TOM）, 42, 14, 39, 49, Information Processing Society of Japan (IPSJ), 15 Dec. 2001
Japanese, This paper introduces the new model of the HTML Wrapper for the information extration from HTML documents and presents the learning algorithm for the HTML Wrappers in the framework of learning by exmaples. The expressiveness of this model is shown by experimental results.

Modelling Semi-structured Documents with Hedges for Deduction and Induction.
Akihiro Yamamoto; Kimihito Ito; Akira Ishino; Hiroki Arimura
Inductive Logic Programming, 11th International Conference, ILP 2001, Strasbourg, France, September 9-11, 2001, Proceedings, 240, 247, Springer, 2001, [Peer-reviewed]
International conference proceedings

Extracting Partial Structures from HTML Documents.
Hiroshi Sakamoto; Yoshitsugu Murakami; Hiroki Arimura; Setsuo Arikawa
Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference, May 21-23, 2001, Key West, Florida, USA, 264, 268, AAAI Press, 2001, [Peer-reviewed]

Mining Semi-structured Data by Path Expressions.
Katsuaki Taniguchi; Hiroshi Sakamoto; Hiroki Arimura; Shinichi Shimozono; Setsuo Arikawa
Discovery Science, 4th International Conference, DS 2001, Washington, DC, USA, November 25-28, 2001, Proceedings, 2226, 378, 388, Springer, 2001, [Peer-reviewed]

Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications.
Toru Kasai; Gunho Lee; Hiroki Arimura; Setsuo Arikawa; Kunsoo Park
Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001 Jerusalem, Israel, July 1-4, 2001 Proceedings, 181, 192, Springer, 2001, [Peer-reviewed]

Efficient Discovery of Proximity Patterns with Suffix Arrays.
Hiroki Arimura; Hiroki Asaka; Hiroshi Sakamoto; Setsuo Arikawa
Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001 Jerusalem, Israel, July 1-4, 2001 Proceedings, 152, 156, Springer, 2001, [Peer-reviewed]

Efficient Learning of Semi-structured Data from Queries.
Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
Algorithmic Learning Theory, 12th International Conference, ALT 2001, Washington, DC, USA, November 25-28, 2001, Proceedings, 315, 331, Springer, 2001, [Peer-reviewed]

Inductive inference of unbounded unions of pattern languages from positive data
T Shinohara; H Arimura
THEORETICAL COMPUTER SCIENCE, 241, 1-2, 191, 209, Jun. 2000, [Peer-reviewed]
English, Scientific journal

On approximation algorithms for local multiple alignment.
Tatsuya Akutsu; Hiroki Arimura; Shinichi Shimozono
RECOMB 2000, 1, 7, 2000, [Peer-reviewed]

Discovering unordered and ordered phrase association patterns for text mining
R Fujino; H Arimura; S Arikawa
KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 1805, 281, 293, 2000, [Peer-reviewed]
English, Scientific journal

Text data mining: Discovery of important keywords in the cyberspace
H Arimura; J Abe; R Fujino; H Sakamoto; S Shimozono; S Arikawa
2000 KYOTO INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES: RESEARCH AND PRACTICE, PROCEEDINGS, 220, 226, 2000, [Peer-reviewed]
English, International conference proceedings

Identification of tree translation rules from examples
H Sakamoto; H Arimura; S Arikawa
GRAMMATICAL INFERENCE: ALGORITHMS AND APPLICATIONS, 1891, 241, 255, 2000, [Peer-reviewed]
English, Scientific journal

Efficient discovery of optimal word-association patterns in large text databases
S Shimozono; H Arimura; S Arikawa
NEW GENERATION COMPUTING, 18, 1, 49, 60, 2000, [Peer-reviewed]
English, Scientific journal

Inductive logic programming: From logic of discovery to machine learning (Special Issue on Surveys on Discovery Science)
Hiroki Arimura; Akihiro Yamamoto
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E83D, 1, 10, 18, Jan. 2000, [Peer-reviewed], [Invited]
English, Scientific journal, 35138305;13674145

Preprint

Knowledge discovery from health data using weighted aggregation classifier
T Takae; M Chikamune; H Arimura; A Shinohara; H Inoue; S Takeya; K Uezono; T Kawasaki
DISCOVERY SCIENCE, PROCEEDINGS, 1721, 359, 361, 1999, [Peer-reviewed]
English, Scientific journal

Empirical Evaluation of a Regular Pattern Inference Algorithm by Minimal Multiple Generalization
KASAI Toru; ARIMURA Hiroki; SHINOHARA Takeshi
Research reports on information science and electrical engineering of Kyushu University, 3, 2, 185, 190, Kyushu University, Sep. 1998
A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by substituting possibly empty constant strings for the variables. A learning algorithm, called k-minimal multiple generalization (k-rnmg), finds a minimally general collection of at most k regular patterns that explains all the positive examples. Recently, several attempts have been made at applying this algorithm to protein motif discovery and other knowledge acquisitions. In such applications, its performance is considerably influenced by a class of data and values of learning parameters. This paper empirically evaluates performance of the algorithm k-rnmg on synthesized data to apply it to protein motif discovery.

An efficient tool for discovering simple combinatorial patterns from large text databases
Hiroki Arimura; Atsushi Wataki; Ryoichi Fujino; Shinichi Shimozono; Setsuo Arikawa
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1532, 393, 395, 1998, [Peer-reviewed]
English, International conference proceedings

A fast algorithm for discovering optimal string patterns in large text databases
H Arimura; A Wataki; R Fujino; S Araikawa
ALGORITHMIC LEARNING THEORY, 1501, 247, 261, 1998, [Peer-reviewed]
English, Scientific journal

Finding tree patterns consistent with positive and negative examples using queries
H Ishizaka; H Arimura; T Shinohara
ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 23, 1-2, 101, 115, 1998, [Peer-reviewed]
English, Scientific journal

Maximizing agreement with a classification by bounded or unbounded number of associated words
H Arimura; S Shimozono
ALGORITHMS AND COMPUTATIONS, 1533, 39, 48, 1998, [Peer-reviewed]
English, Scientific journal

Learning unions of tree patterns using queries
Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
Theoretical Computer Science, 185, 1, 47, 62, Oct. 1997, [International Magazine]
Scientific journal

Inductive inference of unbounded unions of pattern languages from positive data
Takeshi Shinohara; Hiroki Arimura
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1160, 257, 271, Springer Verlag, 1996
English, International conference proceedings

Learning unions of tree patterns using queries
Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
Lecture Notes in Computer Science, 66, 79, Springer Berlin Heidelberg, 1995
In book

Unions of a Bounded Number of Tree Pattern Languages Are Hard To Learn
Hiroki Arimura
RIFIS Technical Report, RIFIS-TR-CS-81, Research Institute of Fundamental Information Science (RIFIS), Kyushu University, 13 Feb. 1994, [International Magazine]
English, Research institution

Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data
Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 775, 649, 660, Springer Verlag, 1994
English, International conference proceedings

Finding tree patterns consistent with positive and negative examples using queries
Hiroki Ishizaka; Hiroki Arimura; Takeshi Shinohara
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 872, 317, 332, Springer Verlag, 1994
English, International conference proceedings

A Generalization of the Least General Generalization
Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki; Hiroki Ishizaka
Machine Intelligence 13 -- Machine Intelligence and Inductive Learning, edited by K. Furukawa, D. Michie, and S. Muggleton, Clarendon Press, 1994, [Peer-reviewed], [Invited], [Lead author], [International Magazine]
English, In book

Preprint

Protein Motif Discovery from Positive Examples by Minimal Multiple Generalization over Regular Patterns
Arimura Hiroki; Fujino Ryoichi; Shinohara Takeshi; Arikawa Setsuo
GI, 5, 39, 48, Japanese Society for Bioinformatics, 1994
English, Recently, several attempts have been made at applying machine learning method to protein motif discovery, but most of these methods require negative examples in addition to positive examples. This paper proposes an efficient method for learning protein motif from positive examples. A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by substituting nonempty constant strings for variables. Regular patterns and their languages are called extended if empty substitutions are allowed. Our learning algorithm, called k-minimal multiple generalization (k-mmg), finds a minimally general collection of at most k regular patterns that explains all the positive examples. We have implemented this algorithm for subclasses for regular patterns and extended regular patterns where the number of variables are bounded by a small constant, and run experiments on protein data taken from GenBank and PIR databases. We incorporate three heuristics into these algorithms for controlling nondeterministic choices. The experiments show that the k-mmg algorithm can very quickly find a hypothesis on the computers in practice, and that the results of our system are comparable with the results of learning method from positive and negative data.

Inductive inference of Prolog programs with linear data dependency from positive data
Hiroki Arimura; Takeshi Shinohara
Information Modelling and Knowledge Bases V, 365, 375, IOS Press, Aug. 1993, [Peer-reviewed], [Lead author]
English, International conference proceedings

Download

Inductive Learning of Logic Programs From Positive Facts Using Minimal Multiple Generalization
ISHIZAKA Hiroki; ARIMURA Hiroki; SHINOHARA Takeshi
Journal of Japanese Society for Artificial Intelligence, Special Issue on ALT'92, 8, 4, 419, 426, 01 Jul. 1993, [Peer-reviewed], [Invited], [Domestic magazines]
Japanese, Scientific journal, We consider the polynomial time inferability of primitive Prologs from positive facts. The class of primitive Prologs is a proper subclass of that of linear Prologs which is known to be inferable from only positive facts. In this paper, we discuss the polynomial time inferability of the subclass using minimal multiple generalizations. The minimal multiple generalization is a natural extension of the least generalization given by Plotkin in 1970. The minimal multiple generalization generalizes given first order words by several words, while the least generalization does by a single word. The property of the minimal multiple generalization makes it possible to perform fine generalization and to construct the heads of several clauses in a target program at the same time. We give an outline of a consistent polynomial time inference algorithm which identifies the class of primitive Prologs in the limit. The algorithm infers the heads of clauses in a target program as a minimal multiple generalization of a set of given positive facts. Furthermore, we give a similar result on the inferability of a subclass of context-free transformations which includes several well-known Prolog programs.

Depth-Bounded Inference for Nonterminating PROLOGs
Hiroki Arimura
Bulletin of Informatics and Cybernetics, 25, 3-4, 125, 136, Research Association of Statistical Sciences, Fukuoka, Japan, Jun. 1993, [Peer-reviewed], [Lead author], [International Magazine]
English, Scientific journal

Preprint

A polynomial time algorithm for finding finite unions of tree pattern languages
Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
Nonmonotonic and Inductive Logic (NIL1991), Second International Workshop, Reinhardsbrunn Vastle, Germany, December 1991, Proceedings, LNAI, 659, 118, 131, Springer-Verlag, Feb. 1993, [Peer-reviewed]
English, International conference proceedings, Post-proceedings of the 2nd International Workshop on Nonmonotonic and Inductive Inference (NIL'91), Reinhardsbrunn Castle, Germany, December 2-6, 1991; edited by G. Brewka, K.P. Jantke, and P.H. Schmitt.

Efficient inductive inference of primitive Prologs from positive data
Hiroki Ishizaka; Hiroki Arimura; Takeshi Shinohara
Algorithmic Learning Theory, Third Workshop, ALT'92, Tokyo, Japan, October 1992, Proceedings, Lecture Notes in Computer Science, 743, 135, 146, Springer Berlin Heidelberg, 1993, [Peer-reviewed]
English, International conference proceedings

Polynomial Time Inference of Unions of Two Tree Pattern Languages
Hiroki ARIMURA; Takeshi SHINOHARA; Setsuko OTSUKI
IEICE TRANSACTIONS on Information and Systems, E75-D, 4, 426, 434, Jul. 1992, [Peer-reviewed], [Lead author]
English, Scientific journal

Polynomial time inference of a subclass of context-free transformations
Hiroki Arimura; Hiroki Ishizaka; Takeshi Shinohara
Proceedings of the fifth annual workshop on Computational learning theory - COLT '92, 136, 143, ACM Press, 1992, [Peer-reviewed]
English, International conference proceedings

Polynomial Time Inference of Unions of Tree Pattern Languages
Hiroki Arimura; Takeshi Shinohara; Setsuko Otsuki
Proc. the 2nd Workshop on Algorithmic Learning Theory (ALT'91), 105, 114, Oct. 1991, [Peer-reviewed], [Lead author]
English, International conference proceedings

Completeness of Depth-bounded Resolution for Weakly Reducing Programs
Hiroki ARIMURA
Software Science and Engineering, Selected Papers from the Kyoto Symposia, World Scientific Series of Computer Science, edited by Ikuo Nakata and Masami Hagiya, 31, 227, 245, World Scientific, Sep. 1991, [International Magazine]
In book

Preprint

■ Other Activities and Achievements

My younger days and recent stories (Special feature, Researchers turning 60)
Hiroki Arimura, LA Symposium, 85, Jul. 2025, [Invited], [Corresponding author]
Japanese, Others

Preprint

「おめでとうソサイエティ論文賞」ゼロサプレス型二分決定グラフを用いたトランザクションデータベースの効率的解析手法
湊真一; 有村博紀, 電子情報通信学会情報・システムソサイエティ誌, Nov. 2010, [Invited]
Japanese

Interdisciplinary Research Project on Knowledge Software Science
ARIMURA Hiroki, The Journal of the Institute of Electronics, Information and Communication Engineers, 92, 10, 819, 821, 01 Oct. 2009, [Invited]
The Institute of Electronics, Information and Communication Engineers, Japanese, Introduction other

「知識創出学」とは? ー北海道大学グローバルCOEプログラムと北海道内情報通信系研究グループの活動北の国から明日のICTに架ける橋ー（特別小特集知の創出を支える次世代IT基盤技術）
有村博紀, 電子情報通信学会誌, 小特集知の創出を支える次世代IT基盤技術ー北海道大学グローバルCOEプログラムと北海道内情報通信系研究グループの活動北の国から明日のICTに架ける橋ー, 92, 10, 816, 818, Oct. 2009, [Invited], [Lead author], [Domestic magazines]
電子情報通信学会, Japanese, Introduction scientific journal

Global COE at Hokkaido University: Establishing next-generation IT infrastructure for supporting knowledge creation
有村博紀, Japanese scientific monthly, 60, 12, 1004, 1008, Dec. 2007
日本学術振興会, Japanese

"Prediction and Discovery" Knowledge Acquisition from Large-Scale Information (in Japanese)
Workshop Report, Science and Technology Foresight Strategy Workshop, Center for Research and Development Strategy (CRDS), CRDS-FY2007-WR-01, Aug. 2007, [Domestic magazines]
Center for Research and Development Strategy, Japan Science and Technology Agency (JST CRDS), Japanese, Technical report

Data Intensive Computing : No.2 Frequent Itemset Mining Algorithms(Intelligent Computing and Related Issues (2))
UNO Takeaki; ARIMURA Hiroki; Takeaki Uno; Hiroki Arimura; National Institute of Informatics:The graduate University for Advanced Studies; Graduate School of Information Science and Technology Hokkaido University, Journal of Japanese Society for Artificial Intelligence, 22, 3, 425, 436, 01 May 2007
人工知能学会, Japanese

Recent Development of Mining Algorithms for Data Streams
ARIMURA Hiroki, The transactions of the Institute of Electronics, Information and Communication Engineers. D-I, 88, 3, 563, 575, Mar. 2005
最近, 大規模データストリームに対するデータマイニングが注目を集めている.本論文では, データストリームマイニングにおけるデータマイニング技法について, 最近の研究動向を紹介する.特に, 概要データを用いた膨大なデータストリームの処理技法に焦点をあて, ストリーム統計の近似計算から, オンラインストリームマイニング, データストリームに対するパターン照合まで, ストリームデータに対する様々なデータマイニング技法を解説する., 一般社団法人電子情報通信学会, Japanese

飽和集合列挙アルゴリズムを用いた大規模データベースからのルール発見手法 (特集計算推論--モデリング・数理・アルゴリズム)
宇野毅明; 有村博紀, 統計数理, 53, 2, 317, 329, 2005
統計数理研究所, Japanese

1. Algorithms for Mining Data Streams(New Frontier of Data Mining Methods)
ARIMURA Hiroki; KIDA Takuya, IPSJ Magazine, 46, 1, 4, 11, Jan. 2005
大量の電子化データの流れであるデータストリームから，有用な情報を少ない資源で効率よく取り出すためのストリームマイニング技術を概観する．まず，データストリームの特徴と，データマイニングの目的について整理し，限定された計算資源を用いて無限に続くデータストリームからマイニングを行うためのシステムに要求される性質について議論する．次に，さまざまな要約データ構造とオンライン化技術についてまとめ，最近の研究のうち特色のあるものを紹介する．, 一般社団法人情報処理学会, Japanese

Algorithms for Mining Semistructured Data
ASAI Tatsuya; ARIMURA Hiroki, The transactions of the Institute of Electronics, Information and Communication Engineers. D-I, 87, 2, 79, 96, Feb. 2004, [Peer-reviewed], [Invited]
最近,ウェブページやXMLデータなどの半構造データに対するデークマイニングが注目を集めている.本論文では,半構造データマイニングにおける主要な技術の一つである,グラフ構造データからの部分構造パターン発見技法について,主に木パターンマイニングとグラフマイニングの観点から,近年の研究動向を紹介する.はじめに,半構造データマイニングの萌芽期の研究として,Subdue及び,GBI,Nestrovらのスキーマ発見,DehaspeらのWarmrを紹介する.次に,木パターンとグラフパターンのマイニングに関する現在の研究動向を概観する.木パターンの発見アルゴリズムとして,浅井らのFREQTとUNOT,ZakiのTreeMiner.安部らのOPTTなどを説明し,一般のグラフマイニングアルゴリズムとして,猪口らのAGMとYanらのgSpanなどを説明する.最後に,新しいデークマイニングの方向性を示すものとして,半構造データストリームからのマイニングアルゴリズムStreamTについて述べる., 一般社団法人電子情報通信学会, Japanese, Introduction scientific journal, 10555378;10555377

計算学習理論における学習
特集「機械学習，それが人に及ばざる理由」（今井むつみ編），人工知能学会誌, Vol.18, No.5, 531, 536, 2003, [Invited]
Japanese, Introduction scientific journal

.
ARIMURA Hiroki, CICSJ Bulletin, 21, 2, 28, 28, 2003
., Division of Chemical Information and Computer Sciences The Chemical Society of Japan, Japanese

Optimized Pattern Discovery for Text Mining(Data/Text Mining)
Arimura Hiroki; Sakamoto Hiroshi, 応用数理, 12, 4, 366, 378, 25 Dec. 2002
The Japan Society for Industrial and Applied Mathematics, Japanese

Web Data Mining(Special Issue on Data Mining)
IKEDA Daisuke; SAKAMOTO Hiroshi; ARIMURA Hiroki, Systems, control and information, 46, 4, 177, 183, 15 Apr. 2002
Institute of Systems, Control and Information Engineers, Japanese

帰納論理プログラミングと証明補完
山本章博; 有村博紀; 平田耕一, bit別冊「発見科学とデータマイニング」, 第4章, 共立出版, 01 Jun. 2001, [Domestic magazines]
共立出版, Japanese, Introduction other

Web Mining(Special Issue : "Text Mining")
Sakamoto Hiroshi; Arimura Hiroki, Journal of Japanese Society for Artificial Intelligence, 16, 2, 233, 238, 01 Mar. 2001
The Japanese Society for Artificial Intelligence, Japanese

Base Technology for Text Mining(Special Issue : "Text Mining")
Nasukawa Tetsuya; Kawano Hiroyuki; Arimura Hiroki, Journal of Japanese Society for Artificial Intelligence, 16, 2, 201, 211, 01 Mar. 2001
The Japanese Society for Artificial Intelligence, Japanese

Algorithmic Learning Theory, 11th International Conference, ALT 2000 Sydney, Australia, December 11-13, 2000 Proceedings
Hiroki Arimura; Sanjay Jain; Arun Sharma, Lecture Notes in Artificial Intelligence (LNAI), 1968, 2000
Springer Berlin Heidelberg, English, Meeting report

一階論理式の学習と帰納論理プログラミング (<特集>計算学習理論の進展と応用可能性)
有村博紀; 平田耕一, 人工知能, 14, 5, 790, 799, Sep. 1999, [Invited], [Corresponding author]
Japanese, Introduction scientific journal

九州大学大学院システム情報科学研究科情報理学専攻
有川節夫; 有村博紀, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 12, 2, 332, 333, 01 Mar. 1997
Japanese

ALT'96報告
杉本典子; 平田耕一; 有村博紀, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 12, 2, 334, 337, 01 Mar. 1997, [Invited]
Japanese, Introduction scientific journal

Assessment Report of Doctoral Theses ("Inductive Inference from Positive Data Based on Generalization")
Hiroki Arimura, Engineering sciences reports, Kyushu University, 16, 3, 359, 368, 01 Dec. 1994, [Invited]
九州大学大学院総合理工学研究科, Japanese, Report research institution

Inductive inference from positive data based on generalization
Hiroki Arimura, Ph.D Thesis, Kyushu University, 27 Jun. 1994, [Lead author]
English, Others

第2回マシンインテリジェンスに関する国際ワークショップ(International Workshop on Machine Intelligence 1993)の報告
沼尾正行; 有村博紀; 原口誠; 平田耕一; 斉藤和巳; 諏訪正樹; 月本洋; 元田浩, 人工知能学会誌 = Journal of Japanese Society for Artificial Intelligence, 9, 2, 318, 320, 01 Mar. 1994
Japanese

九州工業大学情報工学部知能情報工学科大槻研究室
有村博紀, 人工知能学会誌, 6, 5, 739, 739, 01 Sep. 1991
社団法人人工知能学会, Japanese

■ Books and other publications

Encyclopedia of Natural Language Processing (In Japanese)
Hiroki Arimura, "Information extraction from semi-structured texts"
The Kyoritsu Co.,Ltd, Dec. 2009, [Contributor]

Computational Challenges of Massive Data Sets and Randomness in Computation, Special Issue on the First and Second Japanese-German Frontiers of Science Symposia
J. Rothe; H. Arimura, editors
Journal of Universal Computer Science, Vol. 12, issue 6, 579-761. doi: 10.3217/jucs-012-06, Jul. 2006

JSAI Encyclopedia of Aritificial Intelligence (In Japanese)
Hiroki Arimura, Text mining
The Kyoritsu Co.,Ltd, Dec. 2005, [Contributor]

Proceedings of the 11th International Conference on Algorithmic Learning Theory
Hiroki Arimura
Springer-Verlag, Oct. 2000, [Editor]

Algorithmic Learning Theory, 11th International Conference, ALT 2000, Sydney, Australia, December 11-13, 2000, Proceedings
Springer, 2000, [Peer-reviewed]

■ Lectures, oral presentations, etc.

Exact Algorithms for Learning Minimal Consistent Boolean Formulas from Examples
Kawata Yusuke; Hiroki Arimura
The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026 (to appear), Jun. 2026, The Japanese Society for Artificial Intelligence, Japanese, Poster presentation
08 Jun. 2026 - 12 Jun. 2026, Gumma, Japan, Abstract: From the viewpoint of explainability in machine learning, we consider the proper PAC learnability of a subclass of $n$-variable Boolean functions, the family of propositional formulas of bounded size $t$. We focus on the computational complexity of the Minimum Consistent Hypothesis problem a combinatorial optimization problem that is closely related to the learnability of this class, in the framework of parameterized complexity. First, we demonstrate that when parameterized by the formula size $t$, the problem admits an XP algorithm with $O^*(n^t)$ running time via exhaustive search, where $O^*$ denotes the big-Oh notation ignoring polynomial factors, but is fixed-parameter intractable as it is shown to be W[2]-hard. Furthermore, we show that when parameterized by the total number of positive and negative examples $m$, the problem can be solved in $O^*(3^m)$ time, establishing its membership in FPT. This latter FPT algorithm introduces a novel learning paradigm based on a propositional formula size game by Razborov (Combinatorical, Vol.10, No.1, 1990), which has been traditionally used to show lower bounds. This approach may be of particular interest as an efficient solution for scenarios with a limited number of training examples.

Links:
* A poster on Zenodo repository
* A preprint on Zenodo repository, [Domestic Conference]

Published

極大反復列の列挙：既存手法の分類と新規アルゴリズムの提案 (共同研究 with 有村博紀 , 稲永俊介)
鶴園悠太
STRセミナー2025, 25 Sep. 2025, Japanese, Oral presentation
24 Sep. 2025 - 25 Sep. 2025

Published

代数的決定木計数の定式化と, アルゴリズム, 計算量について
機械学習とデータ科学研究会, 12 Sep. 2025, 広島市立大学, Japanese, Oral presentation
11 Sep. 2025 - 12 Sep. 2025, 広島市立大学情報科学部棟別館 6 階交流ラウンジ, Japan, 35135885, [Domestic Conference]

Simple and Efficient Dynamic and Persistent Data Structures for Dynamic Strings Based on Balanced Search Trees
Taiki Kaneda; Yasuaki Kobayashi; Hiroki Arimura
The 25th Korea–Japan Joint Workshop on Algorithms and Computation (WAAC 2025), 19 Aug. 2025, the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan; the Special Interest Group on Algorithms (SIGAL) of the Korean Institute of Information Scientists and Engineers (KIISE), English, Oral presentation
19 Aug. 2025 - 20 Aug. 2025, Sapporo, Japan, 35135762, [International presentation]

Preprint

ルールリストの族に対する線形整数計画法に基づくモデル更新手法
佐々木耀一; 岡嶋穣; 有村博紀
情報論的学習理論と機械学習研究会（IBISML）, 20 Dec. 2024, 電子情報通信学会, Japanese, Oral presentation
20 Dec. 2024 - 21 Dec. 2024, 北海道大学大学院環境科学院，札幌市，北海道, Japan, [Domestic Conference]

Enumerating Maximal Repeats in a Text Based on Bidirectional Indices
Yuta Tsuruzono; Hiroki Arimura; Shunsuke Inenaga
The 24th Korea–Japan Joint Workshop on Algorithms and Computation (WAAC 2024), 02 Aug. 2024, Special Interest Group on Algorithms (SIGAL) of the Korean Institute of Information Scientists and Engineers (KIISE) and the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan (IPSJ), English, Oral presentation
02 Aug. 2024 - 03 Aug. 2024, Sungshin Women's University, Seoul, Korea, Korea, Republic of, 35135885;35135762, [International presentation]

Download

Finding Diverse Strings and Longest Common Subsequences in a Graph
有村博紀
Forest Workshop 2024, Sapporo (talk only), 31 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表：本村真人(東工大))機械学習グループ, 信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」（代表：高前田伸也(東大)）, Japanese, Nominated symposium
29 Mar. 2024 - 31 Mar. 2024, TKPガーデンシティPREMIUM札幌大通カンファレンスルーム，札幌市, 35136410

Download

多様な最適決定木の集合を発見する近似アルゴリズム
吉岡和希
フォレストワークショップ2024, 30 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表：本村真人(東工大))機械学習グループ信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」（代表：高前田伸也(東大)）, Japanese, Poster presentation
29 Mar. 2024 - 31 Mar. 2024, TKPガーデンシティPREMIUM札幌大通カンファレンスルーム，札幌市, 35136410, [Domestic Conference]

Download

テキスト中の極大反復文字列の効率良い列挙アルゴリズム
鶴園悠太
フォォレストワークショップ2024, 30 Mar. 2024, コンピューティング基盤CREST「学習/数理モデルに基づく時空間展開型アーキテクチャの創出と応用」(代表：本村真人(東工大))機械学習グループ, 信頼されるAIシステムCREST「D3-AI: 多様性と環境変化に寄り添う分散機械学習基盤の創出」（代表：高前田伸也(東大)）, Japanese, Poster presentation
29 Mar. 2024 - 30 Mar. 2024, 札幌市, Japan, 35136410, [Domestic Conference]

Published

文字列集合に対する多様な最長共通部分列の発見
志田祐仁; 有村博紀; 小林靖明
電子情報通信学会技術研究報告(Web), 信学技報, vol. 123, no. 325, COMP2023-23, pp. 45-52, 2023年12月, 22 Dec. 2023, Japanese, Oral presentation
22 Dec. 2023 - 22 Dec. 2023, 宮崎大学まちなかキャンパス, [Domestic Conference], ［Internationally co-authored］

Preprint

多様な解の発見問題のAIと大規模データ解析への展開
有村博紀
AFSA 2023年度第2回領域集会, 23 Oct. 2023, 文部科学省科学研究費補助金学術変革領域(A)「社会変革の源泉となる革新的アルゴリズム基盤の創出と体系化」，2020〜2024年度, Japanese, Nominated symposium
22 Oct. 2023 - 24 Oct. 2023, 熱海ニューフジヤホテル，静岡県熱海市, Japan, [Domestic Conference]

Published

Dynamic Programming for Optimal Decision Trees
Hiroki Arimura
AFSA A02-group Seminar, 11 Jul. 2023, A02班, 文部科学省科学研究費補助金学術変革領域(A)「社会変革の源泉となる革新的アルゴリズム基盤の創出と体系化」, Japanese, Public discourse
11 Jul. 2023 - 11 Jul. 2023, 九州大学情報基盤研究開発センター，伊都キャンパス，福岡市, Japan, 35135885, [Invited], [Domestic Conference]

Published

Minimum Consistent Subset for Trees Revisited
Hiroki Arimura; Tatsuya Gima; Yasuaki Kobayashi; Hiroomi Nochide; Yota Otachi
The 23rd Japan–Korea Joint Workshop on Algorithms and Computation (WAAC 2023),, the Special Interest Group on Algorithms (SIGAL) of the Information Processing Society of Japan and the Special Interest Group on Theoretical Computer Science (SIGTCS) of the Korean Institute of Information Scientists and Engineers (KIISE)., English, Oral presentation
24 Jun. 2023 - 25 Jun. 2023, June 24–25, 2023 Nagoya University, Nagoya, Japan, Japan, [International presentation]

Efficient Computation of the Run-length Encoded Burrows-Wheeler Transform Based on the Compact Directed Acyclic Word Graph
Mizuki Sue; Yasuaki Kobayashi; Hiroki Arimura; Yuto Nakashima; Shunsuke Inenaga
電子情報通信学会技術研究報告(Web), 信学技報, vol.122, no.294, COMP2022-25, pp.21-28(COMP), 06 Dec. 2022, Japanese, Oral presentation
06 Dec. 2022 - 06 Dec. 2022, 愛媛大メディアホール, [Domestic Conference]

Preprint

コンパクト非巡回語グラフに基づく連長圧縮Burrows-Wheeler変換の効率良い構築（スライド）
須江瑞樹; 小林靖明; 有村博紀; 中島祐人; 稲永俊介
電子情報通信学会技術研究報告;信学技報;OM;, 06 Dec. 2022, 電子情報通信学会, Japanese, Oral presentation
愛媛大メディアホール, Japan, 35135762;35135885, [Domestic Conference]

Published

Efficient Construction of Multi-Dimensional Contingency Tables for Small and Accurate Decision Trees
Shota SHIRAI; Hiroki ARIMURA
The 25th Information-Based Induction Sciences Workshop, 21 Nov. 2022, IEICE, Japanese, Poster presentation
20 Nov. 2022 - 23 Nov. 2022, Tsubaka International Conference Hall, Tsukuba, Japan, Japan, [Domestic Conference]

Published

人工知能とビッグデータ応用における離散アルゴリズム
有村博紀
2023年度第2回領域集会, 学術変革(A)「社会変革アルゴリズム基盤」(AFSA), 23 Oct. 2022, 科研費学術変革(A)「社会変革アルゴリズム基盤」(AFSA), Japanese, Poster presentation
22 Oct. 2022 - 24 Oct. 2022, 熱海ニューフジヤホテル, 静岡県熱海市, 35135885, [Domestic Conference]

Download

Discovering Closed Patterns over Relational Databases with Generalization Hierarchy
Hiroki Arimura; Naoki Taguchi; Ye Wang
JSAI 2022 Annual Meeting, JSAI, Japanese, Poster presentation
14 Jun. 2022 - 17 Jun. 2022, 国立京都国際会館, オンライン, Japan, [Domestic Conference]

Published

デカルト木部分列照合問題の高速なアルゴリズム
大泉翼; 有村博紀
アルゴリズム（AL）研究会, 2022-AL-186, Vol.3, pp.1-8, 20 Jan. 2022, 情報処理学会, Japanese, Oral presentation
20 Jan. 2022 - 20 Jan. 2022, オンライン, Japan, [Domestic Conference]

Download

半順序集合の弱埋め込み問題に対するパラメータ化アルゴリズム
宮崎怜子; 有村博紀; 小林靖明
電子情報通信学会技術研究報告(Web), 06 Dec. 2022, 電子情報通信学会, Japanese, Oral presentation
2022 - 2022, 愛媛大, Japan, 35135762;35135885, [Domestic Conference]

Published

Explainable Machine Learning for Trustworthy AI
Hiroki Arimura
HU-DUT Workshop for Big data and AI, Hokkaido University, 17 Dec. 2021, English, Invited oral presentation
17 Dec. 2021 - 17 Dec. 2021, [Invited]

ルールリストに対するRashomon集合の厳密計算と予測多重性解析
又康太; 金森憲太朗; 有村博紀
第24回情報論的学習理論ワークショップ (IBIS 2021), 一般セッション, 2-3 モデル解釈・検証, 106, 10 Nov. 2021, Japanese, Oral presentation
10 Nov. 2021 - 13 Nov. 2021, オンライン, Japan

Published

デカルト木照合の部分系列への拡張
加井丈志; 光吉健汰; 古谷勇; 有村博紀
コンピュテーション（COMP）研究会, COMP2021-6, May 2021, 電子情報通信学会, Japanese, Oral presentation
07 May 2021 - 08 May 2021, オンライン, Japan, [Domestic Conference]

Published

Fast Enumeration Algorithms for Feature Selection Based on Consistency Measure
Wang Ye; Hiroki Arimura
SIG-FPAI meeting, 116, JSAI, 21 Mar. 2021, 人工知能学会, Japanese, Oral presentation
オンライン, Japan, 35135762, [Domestic Conference]

Published

Variable Importance Cloudの要約方法と決定木に対する実験的評価
又康太; 金森憲太朗; 有村博紀
第23回情報論的学習理論ワークショップ (IBIS 2020), 電子情報通信学会, Japanese, Oral presentation
23 Nov. 2020 - 26 Nov. 2020, オンライン, [Domestic Conference]

Published

決定木要約の効率良い構築法 -- 説明可能な人工知能の実現に向けて --
有村博紀; 金森憲太朗; 王叶
2020年度人工知能学会全国大会 (JSAI2020), 3E1-GS-2-05, 人工知能学会, Japanese, Oral presentation
04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]

Download

混合整数線形計画法に基づく実現可能性を考慮した反事実的説明法
金森憲太朗; 高木拓也; 小林健; 有村博紀
2020年度人工知能学会全国大会 (JSAI2020), インタラクティブセッション, 3Rin4-44, 人工知能学会,, Japanese, Oral presentation
04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]

決定木アンサンブルにおける出現頻度比に基づく変数重要度
又康太; 金森憲太朗; 有村博紀
2020年度人工知能学会全国大会 (JSAI2020) , インタラクティブセッション, 4Rin1-26, 人工知能学会, Japanese, Oral presentation
04 Jun. 2020 - 07 Jun. 2020, オンライン, Japan, [Domestic Conference]

説明可能な機械学習のための拡張シャープレイ値の厳密指数時間アルゴリズム
大泉翼; 有村博紀
第12回データ工学と情報マネジメントに関するフォーラム (DEIM2020), インタラクティブセッション, P2-23, 電子情報通信学会DE研究専門委員会，日本データベース学会，情報処理学会DBS研究会, Japanese, Oral presentation
02 Mar. 2020 - 04 Mar. 2020, オンライン, Japan, [Domestic Conference]

Efficient Constrained Pattern Mining Using Dynamic Item Ordering for Explainable Classification
IWASHITA Hiroaki; TAKAGI Takuya; SUZUKI Hirofumi; GOTO Keisuke; OHORI Kotaro; ARIMURA Hiroki
JSAI Technical Report, SIG-FPAI, 111 8-8, Jan. 2020, Japanese, Oral presentation

Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization
金森憲太朗; 高木拓也; 小林健
第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 1-069, 電子情報通信学会, Japanese, Oral presentation
20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]

説明可能な機械学習に向けて：整数計画法と列挙に基づく最適決定木の厳密学習アルゴリズムの実験的比較
王叶; 有村博紀
第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 2-089, 電子情報通信学会, Japanese, Oral presentation
20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]

決定木アンサンブル予測器の効率的ハードウェア実装のための簡約化に関する研究
松田祐汰; 瀧川一学; 有村博紀
第22回情報論的学習理論ワークショップ (IBIS 2019), ポスター, 2-088, 電子情報通信学会, Japanese, Oral presentation
20 Nov. 2019 - 23 Nov. 2019, ウインクあいち, 名古屋市, Japan, [Domestic Conference]

Constant Amortized Time Enumeration of Independent Sets for Graphs with Bounded Clique Number
栗田和宏; 和佐州洋; 宇野毅明; 有村博紀
電子情報通信学会コンピューテーション研究会 119(249) COMP2019 18-28, Nov. 2019, Oral presentation

Fairness-aware Edit of Thresholds in a Learned Decision Tree Using a Mixed Integer Programming Formulation
金森憲太朗; 有村博紀
2019年度人工知能学会全国大会 (JSAI2019), 人工知能学会, インタラクティブセッション, 3Rin2-11, 06 Jun. 2019, 人工知能学会, English, Oral presentation
朱鷺メッセ, 新潟市, Japan, [Domestic Conference]

イベント系列からの有意なエピソードの効率良いマイニング
谷陽太; 平田耕一; 有村博紀
第11回データ工学と情報マネジメントに関するフォーラム (DEIM2019), H1-4, 電子情報通信学会DE研究専門委員会，日本データベース学会，情報処理学会DBS研究会, Japanese, Oral presentation
03 Mar. 2019 - 04 Mar. 2019, ホテルオークラJRハウステンボス, 佐世保市, Japan, [Domestic Conference]

An Efficient Algorithm for Enumerating Chordal Bipartite Induced Subgraphs in Graphs
栗田和宏; 和佐州洋; 宇野毅明; 有村博紀
第108回人工知能基本問題研究会(SIG-FPAI), SIG-FPAI-B802-02, 6-11, 人工知能学会, Japanese, Oral presentation
29 Jan. 2019 - 30 Jan. 2019, 大阪府立大学 I-siteなんば, 大阪市, 受賞：研究会優秀賞, 金森憲太朗, 有村博紀, 2018年度，人工知能学会，2019年6月27日

大きな正規表現に対する系列二分決定グラフを用いた効率よい照合手法
瀧澤涼介; 喜田拓也; 有村博紀; 瀧川一学
電子情報通信学会技術研究報告, 2019
2019 - 2019

Subgraph Enumeration: Efficient Algorithms and Empirical Studies
Kazuhiro Kurita; Kunihiro Wasa; Takeaki Uno; Hiroki Arimura
The 2nd International Workshop on Enumeration Problems & Application (WEPA2018), 06 Nov. 2018, English, Oral presentation
06 Nov. 2018 - 06 Nov. 2018, University of Pisa, Italy, [International presentation], ［Internationally co-authored］

Efficient Enumeration Algorithm for Dominating Sets in Bounded Degenerate Graphs (Foundations and Applications of Algorithms and Computation)
Kurita Kazuhiro; Wasa Kunihiro; Arimura Hiroki; Uno Takeaki
数理解析研究所講究録, Aug. 2018, 京都大学数理解析研究所, English
Aug. 2018 - Aug. 2018, Dominating sets are fundamental graph structures. However, enumeration of dominating sets has not received much attention. This study aims to propose an efficient enumeration algorithms for bounded degenerate graphs. The algorithm enumerates all the dominating sets for k-degenerate graphs in O(k) time per solution using O(n+m) space. Since planar graphs have a constant degeneracy, this algorithm can enumerate all such sets for planar graphs in constant time per solution.

モデル選択のためのサポートベクトル列挙
金森憲太朗; 原聡; 石畠正和; 有村博紀
第29回IBISML研究会, 情報論的学習理論と機械学習研究会, 電子情報通信学会情報・システムソサイエティ, Japanese, Oral presentation
13 Jun. 2018 - 15 Jun. 2018, 沖縄科学技術大学院大学(OIST), Japan, [Domestic Conference]

グラフ断片決定木を用いたグラフ特徴抽出手法
坂上陽規; 瀧川一学; 有村博紀
人工知能学会2018年全国大会(JSAI2018),インタラクティブセッション, 3Pin1-10, 人工知能学会, Japanese, Oral presentation
05 Jun. 2018 - 08 Jun. 2018, 城山観光ホテル、鹿児島県鹿児島市, Japan, [Domestic Conference]

イベント系列からの有意性を考慮した菱形エピソードマイニング
谷陽太; 古谷勇; 平田耕一; 有村博紀
人工知能学会2018全国大会(JSAI2018), インタラクティブセッション, 3Pin1-10, 人工知能学会, Japanese, Oral presentation
05 Jun. 2018 - 08 Jun. 2018, 城山観光ホテル、鹿児島県鹿児島市, Japan, [Domestic Conference]

グラフに含まれる大きな内周を持つ部分グラフの効率良い列挙
栗田和宏; Alessio Conte; 和佐州洋; 宇野毅明; 有村博紀
情報処理学会第80回全国大会講演論文集, Vol.80, No.1, 1.347-1.348, 13 Mar. 2018, 情報処理学会, Japanese
13 Mar. 2018 - 13 Mar. 2018, 内周はグラフに含まれる最小の閉路長を表し,グラフの内周を計算する問題はよく研究されている.ItaiとRodehは一般グラフの内周を計算する非自明な最初のアルゴリズムを開発した.このアルゴリズムは最悪時O(nm)時間で動作し,平均時間O(n^2)時間で動作する.重みなし平面グラフに対しては,DjidjevがO(n^{5/4} log n)時間アルゴリズムを与え,Changらがこのアルゴリズムを改良し線形時間アルゴリズムを与えた.我々の知る限りでは大きな内周を持つ連結な部分グラフを列挙するアルゴリズムは存在しない.本論文では,これらの問題を解く列挙アルゴリズムを与える.このアルゴリズムの単純な拡張により,k以上の内周を持つ重み付きグラフの部分グラフや非連結なグラフの列挙も可能である., [Domestic Conference]

順序決定木に対する正則化パラメータ推定の高速化
金森憲太朗; 石畠正和; 湊真一; 有村博紀
第105回人工知能基本問題研究会(SIG-FPAI)，SIG-FPAI-B508, 人工知能学会, Japanese, Oral presentation
28 Jan. 2018 - 29 Jan. 2018, 石垣島大濱信泉記念館, 沖縄県石垣市, Japan, [Domestic Conference]

Published

決定化されたグラフパターントライの学習アルゴリズム
坂上陽規; 栗田和宏; 瀧川一学; 有村博紀
第105回人工知能基本問題研究会(SIG-FPAI)，SIG-FPAI-B508, 人工知能学会, Japanese, Oral presentation
28 Jan. 2018 - 29 Jan. 2018, 石垣島大濱信泉記念館, 沖縄県石垣市, Japan, [Domestic Conference]

Efficient Enumeration of Dominating Sets in k-degenerate Graphs (情報セキュリティ)
栗田和宏; 和佐州洋; 宇野毅明; 有村博紀
電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 21 Dec. 2017, 電子情報通信学会, English
21 Dec. 2017 - 21 Dec. 2017, [Domestic Conference]

〔Major achievements〕Optimization and Enumeration of Decision Trees from Massive Data Sets (talk)
Hiroki Arimura; Kazuhito Osabe; Takeaki Uno
The 23st Conference of the International Federation of Operational Research Societies (IFORS 2017), 17 Jul. 2017, International Federation of Operational Research Societies, IFORS, English, Invited oral presentation
17 Jul. 2017 - 21 Jul. 2017, Québec City Convention Centre, Québec, Canada, Canada, In this talk, we study optimization and counting of complex rules in data mining/machine learning based on enumeration. In particular, we consider exact algorithms using small space (of input polynomial) for the task of mining a class of small and accurate decision trees in a labeled data set., 11763074, [Invited], [International presentation]

Published

Fast Discovery of Optimal Ordered Decision Tree Based on Item Set Enumeration
長部和仁; 宇野毅明; 有村博紀
人工知能基本問題研究会, 12 Dec. 2016, 人工知能学会, Japanese
12 Dec. 2016 - 12 Dec. 2016

グラフに含まれる誘導マッチングの列挙
栗田和宏; 和佐州洋; 喜田拓也; 有村博紀
情報処理学会研究報告(Web), 2016
2016 - 2016

トラジェクトリデータに対する効率良い近似パターン照合アルゴリズム (コンピュテーション)
笹川裕人; 有村博紀
電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
12 Jun. 2015 - 12 Jun. 2015

Extension of Ukkonen's Online Suffix Tree Algorithm to Multi-Stream Texts
高木拓也; 有村博紀
電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
12 Jun. 2015 - 12 Jun. 2015

Reconfiguration Problem for Maximal Induced Trees
和佐州洋; 山中克久; 有村博紀
電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 12 Jun. 2015, 電子情報通信学会, Japanese
12 Jun. 2015 - 12 Jun. 2015, [Domestic Conference]

Efficient Approximate 3-Dimensional Point Set Matching and Its Application to Molecular Pattern Matching
佐々木耀一; 渋谷哲朗; 伊藤公人; 有村博紀
電子情報通信学会技術研究報告, 2015
2015 - 2015

Secure Pattern Matching Using Homomorphic Encryption for Extended String Patterns
原田弘毅; 笹川裕人; 有村博紀; 佐久間淳
コンピュータセキュリティシンポジウム2013論文集, 14 Oct. 2013, Japanese
14 Oct. 2013 - 14 Oct. 2013

Event Clustering Method based on Linear HMM for Geotagged Personal Contents
OHAMA Iku; KIDA Takuya; ARIMURA Hiroki; ABE Toshihisa
IEICE technical report, 21 Mar. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
21 Mar. 2011 - 21 Mar. 2011, We discuss the geographical clustering problem on human activity logs. In this paper, we formulate the problem as a segmentation problem for 2-dimensional sequence data, and propose two clustering methods, named LS-linHMM and X-linHMM. LS-linHMM is based on linear constrained HMMs which estimate the optimal number of clusters by model selection with an information criterion. X-linHMM is a hierarchical clustering method which is based on the idea of x-means and use 2-state linear constrained HMMs. It runs faster than LS-linHMM in many cases. In this time, we apply these clustering methods into the clustering problem for geo-tagged personal picture contents. Our experimental results using real data showed that the proposed methods are more effective in clustering than the simple x-means.

D-4-2 An Efficient Stream Pattern Matching Algorithm for Tree Regular Expressions
Fujikane Yasuyuki; Kaneta Yusaku; Arimura Hiroki
Proceedings of the IEICE General Conference, 28 Feb. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
28 Feb. 2011 - 28 Feb. 2011

TK-7-1 Center for Next-Generation Information Technology based on Knowledge Discovery and Knowledge Federation
有村博紀
Proceedings of the IEICE General Conference, 28 Feb. 2011, The Institute of Electronics, Information and Communication Engineers, Japanese
28 Feb. 2011 - 28 Feb. 2011

An EM algorithm for inferring geographic transmission probability tables from a large phylogenetic tree (特集「ベイジアン・ネットワーク」および一般)
柳橋史成; 伊藤公人; 有村博紀
人工知能基本問題研究会, 17 Nov. 2010, 人工知能学会, English
17 Nov. 2010 - 17 Nov. 2010

Extraction of frequent bipartite episodes from bacterial culture data
河東孝; 有村博紀; 平田耕一
人工知能基本問題研究会, 24 Sep. 2010, 人工知能学会, Japanese
24 Sep. 2010 - 24 Sep. 2010

Efficient Pattern Matching for Acyclic Regular Expressions
金田悠作; 湊真一; 有村博紀
情報処理学会研究報告, Jun. 2010, 情報処理学会, Japanese
Jun. 2010 - Jun. 2010

Efficient Pattern Matching for Acyclic Regular Expressions
KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki
IEICE technical report. Theoretical foundations of Computing, 12 May 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
12 May 2010 - 12 May 2010, A regular expression is acyclic if it is over the basis in Σ, dot "・", and union "|". In this paper, for the subclass of acyclic regular expressions, we give an efficient algorithm that solves the regular expression matching problem for an acyclic regular expression of length m and depth d and an input text of length n in O(nmd/w) time using O(md) preprocessing and O(md/w) space in words on unit-cost RAM model with word length w. We introduce new bit-parallel techniques, called scatter and gather operations to simulate Thompson NFA for a given regular expression, and naturally extend SHIFT-AND approach by Wu and Manber (CACM 35(10), 1992) to the regular expression matching problem. For an acyclic regular expression with unbounded depth d=O(1), our approach is faster than Bille's approach for any regular expression keeping O(m/w) space.

D-1-7 AN EFFICIENT REGULAR EXPRESSION MATCHING ALGORITHM BASED PARALLEL BIT-DISTRIBUTION
Kaneta Yusaku; Minato Shin-ichi; Arimura Hiroki
Proceedings of the IEICE General Conference, 02 Mar. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
02 Mar. 2010 - 02 Mar. 2010

An efficient algorithm for extracting frequent partite episodes with minimum occurrences
河東孝; 有村博紀; 平田耕一
人工知能基本問題研究会, 27 Jan. 2010, 人工知能学会, Japanese
27 Jan. 2010 - 27 Jan. 2010

An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
Technical report of IEICE. VLD, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.

Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
Technical report of IEICE. VLD, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array (FPGA) can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.

An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
IEICE technical report. Computer systems, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.

Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
IEICE technical report. Computer systems, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array (FPGA) can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.

An efficient hardware-oriented algorithm for regular expression matching based on parallel bit-distribution
KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
IEICE technical report, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, In this paper, we study the regular expression matching problem for fast data stream processing. We present an efficient algorithm for based on new bit-parallel methods, called parallel scatter and gather exploiting bit-parallelism in a computer word. Finally, we show an architecture for a hardware implementation of our algorithm. The architecture can change its regular expression patterns on-the-fly without expensive reconfiguration.

Development of Interdisciplinary Research Environment by Collaboration of e-Learning and Remote FPGA
KIM Jaseong; YOSHIZAWA Shingo; KANETA Yusaku; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
IEICE technical report, 19 Jan. 2010, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Jan. 2010 - 19 Jan. 2010, Field programmable gate array(FPGA)can reconfigure logic circuits after production, which is embedded into electric instruments. Recently, researchers and engineers who do not engage in the field of LSI have developed exclusive computer by FPGA as their researches. However, they need to face to overcome barriers in acquiring expert hardware knowledge and preparing design environments when they enter FPGA studies. In this report, we have developed a remote FPGA system combined with e-learning for smooth promotion of interdisciplinary research. This system provides FPGA design education, research and development, field tests by remote control, where users prepare only their own computers. This report describes the framework and reports the research activities using the system.

Efficient Mining Algorithms for Partite Episodes
河東孝; 有村博紀; 平田耕一
人工知能学会全国大会論文集, 2010, 人工知能学会, Japanese
2010 - 2010

D-4-18 A Fast String Matching Algorithm and Its FPGA Design for High-Speed Stream Processing
KANETA Yusaku; YOSHIZAWA Shingo; MINATO Shin-ichi; ARIMURA Hiroki; MIYANAGA Yoshikazu
Proceedings of the IEICE General Conference, 04 Mar. 2009, The Institute of Electronics, Information and Communication Engineers, Japanese
04 Mar. 2009 - 04 Mar. 2009

A Polynomial-Delay Polynomial-Space Algorithm for Extracting Frequent Diamond Episodes from Event Sequences
河東孝; 有村博紀; 平田耕一
論文集, 2009, 人工知能学会, Japanese
2009 - 2009

XQUBE : A Visual Language for XQuery Construction from Interaction and Examples
TSUTSUI Junpei; ARIMURA Hiroki
IPSJ SIG Notes, 14 Sep. 2008, Information Processing Society of Japan (IPSJ), Japanese
14 Sep. 2008 - 14 Sep. 2008, This paper studies visual languages for interactive construction and editting of XQuery. We present a tree pattern-like representation TQ by extending Braga et al.'s XQBE (TODS, 30:2, 2005), and then give its visual language counterpart XQUBE. Furthermore, we show that these two representations are equivalent to a restricted fragment of XQuery called TQ-XQuery that has for, child-axis paths, concatenation constructions, but does not have let, if/where constructions, and arbitrary compsition.

Efficent Serial Episode Mining with Minimal Occurrences
Ohtani Hideyuki; Kida Takuya; Uno Takeaki; Arimura Hiroki
IPSJ SIG Notes, 12 Jun. 2008, Information Processing Society of Japan (IPSJ), Japanese
12 Jun. 2008 - 12 Jun. 2008, Recently, knowledge discovery in large data increases its importance in various fields. Especially, data mining from time-series data gains much attension. This paper studies the problem of finding frequent episodes appearing in a sequence of events. We propose an efficient depth-first-search algorithm for mining frequent serial episodes in a given event sequence using the notion of right-minimal occurrences. Then, we present some techniques for speeding up the algorithm. Finally, we ran experiments on real datasets to evaluate the usefulness of the proposed methods.

Learning Walks from Graphs
Junpei TSUTSUI; Hiroki ARIMURA
Technical Committee on Theoretical Foundations of Computing (COMP), 18 Apr. 2008, The Institute of Electronics, Information and Communication Engineers (IEICE), Japan, Japanese, Oral presentation
18 Apr. 2008 - 18 Apr. 2008, Osaka Prefecture University, Nakamozu, Osaka, Japan, Abstract In this paper, we study the problem of learning an unknown label sequence, called a walk, that is embedded in a collection of vertex-labeled graphs. We present a polynomial time learning algorithm that learns all walks π∗ from one positive example and using O(m + n) membership queries, where m = |π∗| is the size of the walk and n is the size of the positive example, respectively. Based on prediction preserving reduction, we also show that the prediction problem for the class of walks from positive and negative examples of graphs under both of vertex-non-overlapping and edge-non-overlapping embeddings are as hard as the prediction problem of DNF (disjunctive normal form formulas) from positive and negative examples.
Key words: graph inference, query learning model, membership queries, polynomial time learning, positive examples, [Domestic Conference]

Preprint

4ZK-7 Keyword Extraction for Browsing Support
Uemura Takashi; KIDA Takuya; ARIMURA Hiroki
全国大会講演論文集, 13 Mar. 2008, Information Processing Society of Japan (IPSJ), Japanese
13 Mar. 2008 - 13 Mar. 2008

Fast Algorithms for Finding Correlated Mutations from Nucleic / Amino Acid Sequences
ITO Kimihito; TANIGUCHI Tsuyoshi; MURAKAMI Teiji; ARIMURA Hiroki; HARAGUCHI Makoto
IPSJ SIG technical reports, 03 Mar. 2008, Information Processing Society of Japan (IPSJ), Japanese
03 Mar. 2008 - 03 Mar. 2008, We present new algorithms to find the sets of positions involved in correlated mutations from amino/nucleic acid sequences. In this paper, we show that 1) use of the conditional entropy H(i|j) is overall effective in detecting the sets of positions involved in correlated mutations, 2) a natural extension of H(i|j) to the conditional entropy H(i_1…i_m|j) allows us to detect the sets of more than two positions involved in correlated mutations, 3) pruning rules derived from the monotonicity and closure of positions under logical equivalence boost the running performance in terms of computational speed remarkably.

D-020 Property Suffix Tree : An Efficient Data Structure for Stream Data with Meta Data
Uemura Takashi; Kida Takuya; Arimura Hiroki
情報科学技術フォーラム一般講演論文集, 22 Aug. 2007, Forum on Information Technology, Japanese
22 Aug. 2007 - 22 Aug. 2007

D-019 Massive Continuous Stream Pattern Matching Based on Bit-Parallel Method
Saito Tomoya; Kida Takuya; Arimura Hiroki
情報科学技術フォーラム一般講演論文集, 22 Aug. 2007, Forum on Information Technology, Japanese
22 Aug. 2007 - 22 Aug. 2007

Efficient maximal pattern discovery from massive geometric graphs
有村博紀; 宇野毅明; 下菌真一
人工知能基本問題研究会, 13 Jul. 2007, 人工知能学会, Japanese
13 Jul. 2007 - 13 Jul. 2007

Tane: a flexible information extraction browser with learning
筒井淳平; 金田悠作; 有村博紀
人工知能基本問題研究会, 13 Jul. 2007, 人工知能学会, Japanese
13 Jul. 2007 - 13 Jul. 2007

On an Efficient off-Line Construction of Property Suffix Trees
UEMURA Takashi; KIDA Takuya; ARIMURA Hiroki
IEICE technical report. Theoretical foundations of Computing, 19 Apr. 2007, The Institute of Electronics, Information and Communication Engineers, Japanese
19 Apr. 2007 - 19 Apr. 2007, In some intelligent application of text retrieval, it is required to do a search just through particular parts of target text data with consideration for some properties that the text have. Namely, the texts are followed by additional information, and the each target part satisfies a certain condition on it. In this paper, we address the problem to construct an efficient index structure for this kind of search. In 2006, Amir et al. proposed a modified suffix tree for this problem, called the property suffix tree, which stores only substrings that belong to the given property. This algorithm constructs a suffix tree at first, and then pluning edges indicating substrings which are not included in the property. In this paper, we present an algorithm for finding all borders between nodes to be remained and nodes to be eliminated in the suffix tree.

Hardness Results on Local Multiple Alignment of Biological Sequences
AKUTSU Tatsuya; ARIMURA Hiroki; SHIMOZONO Shinichi
IPSJ SIG technical reports, 05 Mar. 2007, Information Processing Society of Japan (IPSJ), English
05 Mar. 2007 - 05 Mar. 2007, This article studies the local multiple alignment problem, which is, given protein or DNA sequences, to locate a region (i.e., a substring) of fixed length from each sequence so that the score determined from the set of regions is optimized. We consider the following scoring schemes: the relative entropy score (i.e., average information content), the sum-of-pairs score and a relative entropy-like score introduced by Li et al. We prove that multiple local alignment is NP-hard under each of these scoring schemes. In particular, we prove that multiple local alignment is APX-hard under relative entropy scoring. It implies that unless P = NP there is no polynomial time algorithm whose worst case approximation error can be arbitrarily specified (precisely, a polynomial time approximation scheme). Scveral related theoretical results are also provided.

Efficient Discovery of Maximal Two-Dimensional Patterns with Don't-Cares(extended abstract) (テーマ:特集「ウェブデータの知的処理」および一般)
有村博紀; 宇野毅明; 下薗真一
人工知能基本問題研究会, 08 Sep. 2006, 人工知能学会, English
08 Sep. 2006 - 08 Sep. 2006

D_045 An Adaptive Algorithm for Dividing Large Sets of Strings to Efficiently Sort
Asai Tatsuya; Okamoto Seishi; Arimura Hiroki
情報科学技術フォーラム一般講演論文集, 21 Aug. 2006, Forum on Information Technology, Japanese
21 Aug. 2006 - 21 Aug. 2006

深さ優先探索に基づく変数制限つき極大モチーフの高速マイニング (テーマ:「データマイニングと統計数理」および一般)
有村博紀; 宇野毅明
知識ベ-スシステム研究会, 09 Mar. 2006, 人工知能学会, Japanese
09 Mar. 2006 - 09 Mar. 2006

Efficient Method of Transaction Database Analysis Using Zero-Suppressed BDDs
MINATO Shin-ichi; ARIMURA Hiroki
The IEICE transactions on information and systems (Japanese edetion), 01 Feb. 2006, The Institute of Electronics, Information and Communication Engineers, Japanese
01 Feb. 2006 - 01 Feb. 2006, 大規模なトランザクションデータを,計算機上でコンパクトに表現して効率的に処理することは,データマイニングにおける重要な基盤技術の一つである.本論文では,VLSI CADの分野で大規模論理関数データの表現法として広く用いられている二分決定グラフ(BDD: Binary Decision Diagrams)をデータマイニングの分野に応用する方法を提案する.BDDの中でも「ゼロサプレス型BDD」(ZBDD: Zero-suppressed BDD)と呼ばれるデータ構造は,大規模な組合せ集合データを非明示的に列挙し,効率良く演算処理することができる.このZBDDを用いて,トランザクションデータベースに関する種々の計算処理を行う手法について述べ,実験結果を示す.

A Polynomial Space Polynomial Delay Algorithm for Enumerating Maximal Motifs in a Sequence
ARIMURA Hiroki; UNO Takeaki
IEICE technical report. Theoretical foundations of Computing, 08 Sep. 2005, The Institute of Electronics, Information and Communication Engineers, English
08 Sep. 2005 - 08 Sep. 2005, In this paper, we consider the problem of finding all maximal motifs in an input string for the class of repeated motifs with wild cards such as ABoBoBC. Since the proboem has various applications in bioinformatics, sequence mining, and data compression, output-sensitive enumeration of such maximal motifs has been extensively studied. However, no algorithms achieved polynomial space and polynomial delay so far. For the problem, we present a polynomial space polynomial delay algorithm for the maximal motif enumeration problem for the repeated motifs with wild cards. We also show exponential lower bound and succinctness results on the number of maximal motifs, which indicate the limit of a straightforward approach.

大規模系列データから代表的な頻出エピソードを発見するための効率よいアルゴリズム (特集「人工知能における論理の新たな展開」)
有村博紀; 宇野毅明
人工知能基本問題研究会, 04 Nov. 2004, 人工知能学会, Japanese
04 Nov. 2004 - 04 Nov. 2004

An Effcient Algorithm for Mining Frequent Closed Sequential Episodes
ARIMURA Hiroki; UNO Takeaki
IPSJ SIG Notes, 14 Oct. 2004, Information Processing Society of Japan (IPSJ), Japanese
14 Oct. 2004 - 14 Oct. 2004, In this paper, we consider the enumeration problem for the class of frequent closed sequence patterns, called frequent position-closed sequences, from a given collection of sequences, where closed sequence patterns are a generalization of closed itemsets for sequence databases. Out notion of closed sequences are are based on the position occurrences, and thus it is weaker than the traditional notion of closed patterns based on the document occurrences. We present an efficient algorithm for enumerating all frequent maximal sequence patterns without duplicates, which runs in polynomial time per pattern in the total size of the input sequence database based on the framework of reverse search. As a corollary, the enumeration problem for frequent maximal sequence patterns is shown to be solvable in output polynomial time.

An Effcient Algorithm for Mining Frequent Closed Sequential Episodes
ARIMURA Hiroki; UNO Takeaki
IEICE technical report. Theoretical foundations of Computing, 08 Oct. 2004, The Institute of Electronics, Information and Communication Engineers, Japanese
08 Oct. 2004 - 08 Oct. 2004, In this paper, we consider the enumeration problem for the class of frequent closed sequence patterns, called frequent position-closed sequences, from a given collection of sequences, where closed sequence patterns are a generalization of closed itemsets for sequence databases. Out notion of closed sequences are are based on the position occurrences, and thus it is weaker than the traditional notion of closed patterns based on the document occurrences. We present an efficient algorithm for enumerating all frequent maximal sequence patterns without duplicates, which runs in polynomial time per pattern in the total size of the input sequence database based on the framework of reverse search. As a corollary, the enumeration problem for frequent maximal sequence patterns is shown to be solvable in output polynomial time.

E-007 Mining Key Semantics Using Dependency Graphs
Morinaga Satoshi; Arimura Hiroki; Ikeda Takahiro; Sakao Yosuke; Akamine Susumu
情報科学技術フォーラム一般講演論文集, 20 Aug. 2004, Forum on Information Technology, Japanese
20 Aug. 2004 - 20 Aug. 2004

分類階層を考慮したパタン照合アルゴリズム (特集オントロジー)
喜田拓也; 有村博紀
人工知能基本問題研究会, 25 Jul. 2004, 人工知能学会, Japanese
25 Jul. 2004 - 25 Jul. 2004

An Efficient Algorithm for Discovering Frequent Patterns from Large Unordered Trees (Evolutionary Advancement in Fundamental Theories of Computer Science)
Asai Tatsuya; Fusanobu Shinji; Arimura Hiroki; Uno Takeaki; Nakano Shin-ichi
RIMS Kokyuroku, May 2004, Kyoto University, Japanese
May 2004 - May 2004

高速な無順序木パターン発見アルゴリズム (人工知能基礎論研究会(第54回)特集「医療及び化学情報マイニング」および一般)
浅井達哉; 房延慎二; 有村博紀
人工知能基礎論研究会, 03 Mar. 2004, 人工知能学会, Japanese
03 Mar. 2004 - 03 Mar. 2004

Practical Techniques for Speeding Up Enumeration Algorithms for Frequent Itemset Mining Problems
UNO Takeaki; ASAI Tatsuya; ARIMURA Hiroki; UCHIDA Yuzo
IPSJ SIG Notes, Mar. 2004, Information Processing Society of Japan (IPSJ), Japanese
Mar. 2004 - Mar. 2004, In this paper, we address practical techniques for efficiently mining all frequent item sets, frequent closed item sets, and maximal frequent item sets, respectively, from transaction databases. We modify the algorithm for enumerating maximal bipartite cliques proposed in SIGAL89 [9, 10], and propose an algorithm for frequent closed item set enumeration. The computation time of the algorithm is theoretically proved to be linear in the number of output, which is not proved for existing algorithms. We further modify the algorithm for practical computation, and for enumerating all frequent item sets, and maximal frequent item sets. By computational experiments, which is done by a competition of frequent set mining algorithms, we showed our approach is quite efficient, especially for realworld problems witch are said to be difficult.

Efficiently Mining Frequent Substructures from Large Unordered Trees
ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
IEICE technical report. Theoretical foundations of Computing, 22 Jan. 2004, The Institute of Electronics, Information and Communication Engineers, Japanese
22 Jan. 2004 - 22 Jan. 2004, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called UNOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O (kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.

An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shinichi
IEICE technical report. Data engineering, 01 Oct. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
01 Oct. 2003 - 01 Oct. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called U NOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O(kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.

An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shinichi
IEICE technical report. Dependable computing, 01 Oct. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
01 Oct. 2003 - 01 Oct. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collection of semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys of the algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences based on a powerful design technique known as the reverse search. We present an efficient algorithm called U NOT that computes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specified threshold. We prove that the algorithm enumerates each frequent pattern T in O(kb^2m) per pattern, where k is the size of T, b is the branching factor of the data tree, and m is the total number of occurrences of T in the data trees.

An Efficient Algorithm for Mining Frequent Unordered Trees from Semi-structured Data
ASAI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
電子情報通信学会技術研究報告. DE, データ工学, 01 Oct. 2003, Japanese
01 Oct. 2003 - 01 Oct. 2003

Machine Learning in Computational Learning Theory(Machine Learning vs. Human Learning : Why Can't Machines Outsmart Humans ?")
Arimura Hiroki
Journal of Japanese Society for Artificial Intelligence, 01 Sep. 2003, The Japanese Society for Artificial Intelligence, Japanese
01 Sep. 2003 - 01 Sep. 2003

Efficiently Mining Frequent Substructures from Large Unordered Trees
ASI Tatsuya; ARIMURA Hiroki; UNO Takeaki; NAKANO Shin-ichi
IEICE technical report. Artificial intelligence and knowledge-based processing, 24 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, English
24 Jul. 2003 - 24 Jul. 2003, In this paper, we study a data mining problem of discovering frequent substructures in a large collectionof semi-structured data, where both of the patterns and the data are modeled by labeled unordered trees. The keys ofthe algorithm are efficient enumerating all unordered trees and incrementally computation of the occurrences basedon a powerful design technique known as the reverse search. We present an efficient algorithm called UNOT thatcomputes all labeled unordered trees appearing in a collection of data trees with frequency above a user-specifiedthreshold. We prove that the algorithm enumerates each frequent pattern T in O(kb2n] per pattern, where A; is thesize of T, b is the branching factor of the data tree, and n is the total number of occurrences of T in the data trees.

Enumerating Closed Item Sets via Maximal Bipartite Cliques
UNO Takeaki; ARIMURA Hiroki; ASAI Tatsuya
IEICE technical report. Artificial intelligence and knowledge-based processing, 24 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
24 Jul. 2003 - 24 Jul. 2003, For an item set and its subset family, a frequent set is a subset of the item set included in at least a specified number of elements of the subset family. Frequent sets have some applications in data mining. Recently, maximal frequent set and closed item sets are remarked where a closed item set is a group of similar frequent sets, since we can obtain necessary frequent sets by enumerating closed item sets. In this paper, we propose an efficient method for enumerating closed item sets by using enumeration of maximal bipartite cliques, and enumeration algo- rithm for maximal frequent sets via closed item sets. We evaluate their performances by computational experiments using benchmark problems, and show that our algorithms are faster than existing algorithms. Key words

Efficient XPath Processing for Semi-structured Data Streams
MORIKAWA Hiroaki; ASAI Tatsuya; ARIMURA Hiroki
IPSJ SIG Notes, 16 Jul. 2003, Information Processing Society of Japan (IPSJ), Japanese
16 Jul. 2003 - 16 Jul. 2003, In this paper, we consider the problem of efficient processing of XML data streams online. We give another version of XML scanner called ASAX (active SAX), and then develop an efficient lightweight XPath pattern matching engine XMatch for fast XML data stream processing on Internet.

Efficient XPath Processing for Semi-structured Data Streams
MORIKAWA Hiroaki; ASAI Tatsuya; ARIMURA Hiroki
IEICE technical report. Data engineering, 10 Jul. 2003, The Institute of Electronics, Information and Communication Engineers, Japanese
10 Jul. 2003 - 10 Jul. 2003, In this paper, we consider the problem of efficient processing of XML data streams online. We give another version of XML scanner called ASAX (active SAX), and then develop an efficient lightweight XPath pattern matching engine XMatch for fast XML data stream processing on Internet.

Online Algorithms for Discovering Frequent Substructures from Semi-structured Data Stream (New Aspects of Theoretical Computer Science)
Asai Tatsuya; Abe Kenji; Kawasoe Shinji; Arimura Hiroki; Arikawa Setsuo
RIMS Kokyuroku, May 2003, Kyoto University, Japanese
May 2003 - May 2003

テキストマイニング:ウェブデータからの知識発見を目指して (特集情報論的学習理論--機械学習のさまざまな形)
有村博紀; 浅井達哉
Computer today, Mar. 2003, サイエンス社, Japanese
Mar. 2003 - Mar. 2003

Efficient Algorithms for Mining XML Data Streams
川副真治; 浅井達哉; 有村博紀
人工知能学会全国大会論文集, 2003, 人工知能学会, Japanese
2003 - 2003

Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
ASAI Tatsuya; ARIMURA Hiroki; ABE Kenji; KAWASOE Shinji; ARIKAWA Setsuo
IEICE technical report. Data engineering, 10 Oct. 2002, The Institute of Electronics, Information and Communication Engineers, Japanese
10 Oct. 2002 - 10 Oct. 2002, In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of an unseen possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We give modifications of the algorithm to other online mining models. We present theoretical and empirical analyses to evaluate the performance of the algorithm.

Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
ASAI Tatsuya; ARIMURA Hiroki; ABE Kenji; KAWASOE Shinji; ARIKAWA Setsuo
IEICE technical report. Dependable computing, 10 Oct. 2002, The Institute of Electronics, Information and Communication Engineers, Japanese
10 Oct. 2002 - 10 Oct. 2002, In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of an unseen possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We give modifications of the algorithm to other online mining models. We present theoretical and empirical analyses to evaluate the performance of the algorithm.

タグとキーワードの関係を利用したテキストマイニング (人工知能基礎論研究会(第46回) 知識ベースシステム研究会(第54回) 合同研究会テーマ:「アクティブマイニング」および一般)
坂本比呂志; 谷口力昭; 有村博紀
知識ベ-スシステム研究会, 12 Nov. 2001, 人工知能学会, Japanese
12 Nov. 2001 - 12 Nov. 2001

タグとキーワードの関係を利用したテキストマイニング (人工知能基礎論研究会(第46回) 知識ベースシステム研究会(第54回) 合同研究会テーマ:「アクティブマイニング」および一般)
坂本比呂志; 谷口力昭; 有村博紀
人工知能基礎論研究会, 12 Nov. 2001, 人工知能学会, Japanese
12 Nov. 2001 - 12 Nov. 2001

Efficient Substructure Discovery from Large Semi-structured Data
ASAI Tatsuya; ABE Kenji; KAWASOE Shinji; ARIMURA Hiroki; ARIKAWA Setsuo
IEICE technical report. Data engineering, 04 Oct. 2001, The Institute of Electronics, Information and Communication Engineers, Japanese
04 Oct. 2001 - 04 Oct. 2001, In this paper, we consider a data mining problem for semi-structured data. We present an efficient algorithm for discovering frequent substructures from a given large collection of semi-structured data by modeling semi-structured data as labeled ordered trees. This algorithm is a generalization of the itemset enumeration technique, called set-enumeration tree, by Bayardo(SIGMOD'98)to ordered tree enumeration. The experiments on HTML documents show that the algorithm is efficient and scalabel on realworld data.

Discovery of Important Keywords From the Web Using Text Mining
Arimura Hiroki; Abe Junichirou; Sakamoto Hiroshi; Arikawa Setsuo
Annual report of Computing and Communications Center Kyushu University, Oct. 2001, Kyushu University, Japanese
Oct. 2001 - Oct. 2001

Text Mining with Optimal Pattern Discovery
ARIMURA Hiroki
Proceedings of the Society Conference of IEICE, 29 Aug. 2001, The Institute of Electronics, Information and Communication Engineers, Japanese
29 Aug. 2001 - 29 Aug. 2001

A Catalog for Prediction-Preserving Reducibility with Membership Queries on Formal Languages (New Developments of Theory of Computation and Algorithms)
Hirata Kouichi; Sakamoto Hiroshi; Arimura Hiroki
RIMS Kokyuroku, May 2001, Kyoto University, English
May 2001 - May 2001

Practical Algorithms for Constructing Large Suffix Arrays
ASAKA HIROKI; ABE JUNICHIRO; ARIMURA HIROKI; ARIKAWA SETSUO
IPSJ SIG Notes, 15 Mar. 2001, Information Processing Society of Japan (IPSJ), Japanese
15 Mar. 2001 - 15 Mar. 2001, In this paper, we study efficient parallel construction of full-text indexing structures for large text data. The suffix array is a compact full-text indexing structure that is useful in information retrieval and bio-informatics. We propose an efficient parallel algorithm for constructing suffix arrays on distributed memory parallel computers. This algorithm is a parallel implementation of the well-known external memory algorithm, called the Baeza-Yates-Gonnet-Sinder (BGS) algorithm. By theoretical analysis, we show that our algorithm runs more efficiently than the Riberio-Kitajima-Ziviani (RKZ) algorithm, another the parallel implementation of the BGS algorithm, in terms of the parallel time and communication complexities.

Extracting Text Data from HTML Documents
MURAKAMI YOSHITSUGU; SAKAMOTO HIROSHI; ARIMURA HIROKI; ARIKAWA SETSUO
IPSJ SIG Notes, Mar. 2001, 一般社団法人情報処理学会, Japanese
Mar. 2001 - Mar. 2001

Data Mining : Towards Knowledge Discovery from Text and Web Data
Arimura Hiroki
IEICE technical report. Information theory, 03 Oct. 2000, The Institute of Electronics, Information and Communication Engineers, Japanese, Invited oral presentation
03 Oct. 2000 - 03 Oct. 2000, The rapid progress of computer and network technologies makes it easy to store large amount of unstructured or semi-structured texts such as webpages, HTML/XML archives, and a collection of emails or text files on a computer. Thus, there is a potential demand for efficient semi-automatic tool that supports a human discovery from large text databases. In this paper, we give a short survey of the current data mining methods and their limitations in the context of text and Web mining. Then, we present efficient and robust data mining algorithms for unstructured text data based on optimized pattern discovery. We also discuss the applications of the algorithms to Web mining., [Invited]

Learning Term Rewriting Systems from Entailment
Hiroki Arimura; Hiroshi Sakamoto; Setsuo Arikawa
Inductive Logic Programming, 10th International Conference, ILP 2000, Work-in-progress reports, London, UK, July 2000, Proceedings, 24 Jul. 2000, ILP 1995, English, Oral presentation
24 Jul. 2000 - 27 Jul. 2000, London, UK, United Kingdom, editors: James Cussens and Alan Frisch, [International presentation]

木の変換規則の例からの学習 (小特集「発見科学」及び一般演題)
坂本比呂志; 有村博紀; 有川節夫
人工知能基礎論研究会, 15 Jul. 2000, 人工知能学会, Japanese
15 Jul. 2000 - 15 Jul. 2000

Finding Unexpected Regression Association Rules from Numeric Database
TAKAE Toru; ABE Junichiro; SAKAMOTO Hiroshi; ARIMURA Hiroki; INOUE Hitoshi; TAKEYA Shunichi; UEZONO Keiko; KAWASAKI Terukazu
Proceedings of the Annual Conference of JSAI, 03 Jul. 2000, 人工知能学会, Japanese
03 Jul. 2000 - 03 Jul. 2000

Text Data Mining : Applications to Browsing Large Document Collections and Web Data(Discovery Science)
Abe Junichiro; Fujino Ryoichi; Shimozono Shinichi; Arimura Hiroki; Arikawa Setsuo
Journal of Japanese Society for Artificial Intelligence, 01 Jul. 2000, The Japanese Society for Artificial Intelligence, Japanese
01 Jul. 2000 - 01 Jul. 2000, This paper describes applications of text data mining to interactive document browsing and web mining. We develop a fast text data mining method based on optimal pattern discovery, and make experiments on large collections of documents and Web pages to evaluate the proposed method.

Maximizing Agreement with a Classification by Bounded Number of Associated Words
SHIMOZONO Shinichi; ARIMURA Hiroki; ARIKAWA Setsuo
IEICE technical report. Theoretical foundations of Computing, 16 Nov. 1999, The Institute of Electronics, Information and Communication Engineers, English
16 Nov. 1999 - 16 Nov. 1999, We study the efficient discovery of word-association patterns, defined by a sequence of strings and a proximity gap between them, from a collection of texts with binary labels. We present an algorithm that finds all d strings and k proximity word-association patterns that maximizes agreement with the labels. It runs in expected time complexity O(k∧nlog∧n)and O(k∧n)space with the total length n of texts, if texts are uniformly random strings. It shows a contrast to a known fact that finding a best word-association pattern with arbitrarily many strings is intractable and hard to approximate within a factor arbitrary close to one, i.e., has no polynomial-time approximation scheme unless P=NP.

データマイニングを用いた探索的文書ブラウジングシステム
安部潤一郎; 笠井透; 有村博紀
人工知能基礎論研究会, 07 Jul. 1999, 人工知能学会, Japanese
07 Jul. 1999 - 07 Jul. 1999

Text Data Mining : Application to Browsing Large Document Collections
ITAI Tsutomu; KASAI Toru; ARIMURA Hiroki; ARIKAWA Setsuo
Proceedings of the Annual Conference of JSAI, 15 Jun. 1999, 人工知能学会, Japanese
15 Jun. 1999 - 15 Jun. 1999

Finding Proximity Word-Association Patterns from Large Text Databases
FUJINO Ryoichi; ARIMURA Hiroki; ARIKAWA Setsuo
Proceedings of the Annual Conference of JSAI, 15 Jun. 1999, 人工知能学会, Japanese
15 Jun. 1999 - 15 Jun. 1999

部分語計数問題の接尾辞配列を用いた高速アルゴリズム (計算モデルとアルゴリズム)
笠井透; 有村博紀; 有川節夫
数理解析研究所講究録, Apr. 1999, 京都大学, Japanese
Apr. 1999 - Apr. 1999

1T-10 Virtual suffix tree : fast computation of subword frequency using suffix array for text data mining
笠井透; 有村博紀; 有川節夫
全国大会講演論文集, 09 Mar. 1999, Information Processing Society of Japan (IPSJ), Japanese
09 Mar. 1999 - 09 Mar. 1999

Knowledge discovery from health data using weighted aggregation classifiers
Toru Takae; Minoru Chikamune; Hiroki Arimura; Ayumi Shinohara; Hitoshi Inoue; Shun-Ichi Takeya; Keiko Uezono; Terukazu Kawasaki
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1999, Springer Verlag, English
1999 - 1999

On Approximation Algorithms for Local Multiple Alignment (合同研究会"AIシンポジウム'99"(第10回))
阿久津達也; 有村博紀; 下薗真一
合同研究会AIシンポジウム, 1999, 人工知能学会, English
1999 - 1999

Maximizing Agreement with a Classification by Bounded Number of Associated Words
Shimozono Shinichi; Arimura Hiroki; Arikawa Setsuo
IEICE technical report. Theoretical foundations of Computing, 20 Nov. 1998, The Institute of Electronics, Information and Communication Engineers, English
20 Nov. 1998 - 20 Nov. 1998, We study the efficient discovery of word-association patterns, defined by a sequence of strings and a proximity gap between them, from a collection of texts with binary labels. We present an algorithm that finds all d strings and k proximity word-association patterns that maximizes agreement with the labels. It runs in expected time complexity O(k∧<d-1>nlog∧<d+1>n)and O(k∧<d-1>n)space with the total length n of texts, if texts are uniformly random strings. It shows a contrast to a known fact that finding a best word-association pattern with arbitrarily many strings is intractable and hard to approximate within a factor arbitrary close to one, i.e., has no polynomial-time approximation scheme unless P=NP.

Empirical Evaluation of an Inference Algorithm for Unions of Patterns by Minimal Multiple Generalization
笠井透; 有村博紀; 篠原武
全国大会講演論文集, 05 Oct. 1998, Information Processing Society of Japan (IPSJ), Japanese
05 Oct. 1998 - 05 Oct. 1998

Empirical Evaluation of a Regular Pattern Inference Algorithm by Minimal Multiple Generalization
KASAI Toru; ARIMURA Hiroki; SHINOHARA Takeshi
Research reports on information science and electrical engineering of Kyushu University, Sep. 1998, Kyushu University
Sep. 1998 - Sep. 1998

Text Data Mining Based on Optimal Pattern Discovery : Towards a Scalable Data Mining System for Large Text Databases
Kasai Toru; Arimura Hiroki; Fujino Ryoichi; Arikawa Setsuo
IPSJ SIG Notes, 08 Jul. 1998, Information Processing Society of Japan (IPSJ), Japanese
08 Jul. 1998 - 08 Jul. 1998, We consider the problem to find a pattern over strings that maximizes the precision of classifying the given positive- and negative-labeled strings. We introduce a notion of patterns for specifying associations between strings, and present an efficient algorithm that finds a best pattern that maximizes the precision and runs in O(k^<d-1>n log^<d+1> n )expected time and with O(k^<d-1>n)space for uniformly random texts. We also discuss an implementation technique for the algorithm on the suffix array indexing structure.

1変数パタン言語の多項式時間オンライン学習 (アルゴリズムと計算の理論)
稲子希望; 有村博紀
数理解析研究所講究録, Apr. 1998, 京都大学, Japanese
Apr. 1998 - Apr. 1998

部分語相関ルール発見のための高速アルゴリズム (アルゴリズムと計算の理論)
渡木厚; 有村博紀; 藤野亮一; 有川節夫
数理解析研究所講究録, Apr. 1998, 京都大学, Japanese
Apr. 1998 - Apr. 1998

ONline learning algorithms for one-variable pattern languages
Nozomu Inago; Hiroki Arimura
情報処理学会九州支部研究会, 1B-2, 13 Mar. 1998, 第12回情報処理学会九州支部, Japanese, Oral presentation
13 Mar. 1998 - 13 Mar. 1998, 鹿児島大学工学部情報科学科, Japan, 情報処理学会九州支部長松尾文碩, [Domestic Conference]

Preprint

An Efficient Algorithm for Text Data Mining
Atsushi Wataki; Hiroki Arimura; Ryoichi Fujino; Setsuo Arikawa
第12回情報処理学会九州支部研究会, 13 Mar. 1998, 情報処理学会九州支部, Japanese, Oral presentation
鹿児島大学工学部情報科学科, Japan, [Domestic Conference]

Preprint

Efficient Inductive Logic Programming with Majority Vote
Jun Kuramoto; Hiroki Arimura
第12回情報処理学会九州支部研究会, 13 Mar. 1998, 情報処理学会九州支部, Japanese, Oral presentation
鹿児島大学工学部情報科学科, Japan, [Domestic Conference]

Preprint

Maximum Agreement Problem for Word Association Patterns
Arimura Hiroki; Wataki Atsushi; Shimozono Shinichi
IEICE technical report. Theoretical foundations of Computing, 31 Oct. 1997, The Institute of Electronics, Information and Communication Engineers, Japanese
31 Oct. 1997 - 31 Oct. 1997, We consider the problem to find a pattern over strings that maximizes the precision of classifying the given positive- and negative-labeled strings. We introduce a notion of patterns for specifying associations between two strings, and present an efficient algorithm that finds a pattern with the best precision and runs in O(n^2) time and with O(kn) space. Furthermore, we show that if patterns are extended to associations over arbitrary many strings then the problem is not approximable within arbitrary small error ratio in polynomial time unless P=NP.

Text data mining with optimal string patterns
Arimura Hiroki; Wataki Atsushi; Fujino Ryoichi; Arikawa Setsuo
全国大会講演論文集, 24 Sep. 1997, Information Processing Society of Japan (IPSJ), Japanese
24 Sep. 1997 - 24 Sep. 1997, 本研究では, 大量の文書の集積から, 分類精度を最適化するパタンを見つける問題を考察する。二語相関パタンとよばれる単純なパタンを仮説としたとき, 分類精度を最大化する最適パタンをO(n^2)時間および領域O(kn)領域で計算するアルゴリズムを与える。

On-Web-Teaching 2300 Students Information Processing and Computer Literacy at Kyushu University
MINE Tsunenori; SATOH Hiroyuki; SHOUDAI Takayoshi; HIROKAWA Sachio; ARIMURA Hiroki; MORI Masao; SHINOHARA Ayumi; TAKEDA Masayuki
IPSJ SIG Notes, 23 May 1997, Information Processing Society of Japan (IPSJ), Japanese
23 May 1997 - 23 May 1997, We are developing a Web system for the introductory course of programming and computer literacy. 2300 students take the course with this Web system at Kyushu University. Our main object of this system is to reappear our lecture in front of students in their private studies. Teaching how programs run is a hard problem to deal with in information processing educations. In order to deal with such a difficult subject easily, we propose a program animation compiler for assisting students to understand a motion of programs. This compiler systematically translates tiny Pascal programs into Java Applets. In this paper, we state our idea and future plans of the system.

The Computational Complexity of Hereditary Elementary Formal Systems
Ikeda Daisuke; Arimura Hiroki
RIMS Kokyuroku, May 1997, Kyoto University, English
May 1997 - May 1997

ALT'96報告
杉本典子; 平田耕一; 有村博紀
人工知能学会誌, 01 Mar. 1997, 社団法人人工知能学会, Japanese
01 Mar. 1997 - 01 Mar. 1997

九州大学大学院システム情報科学研究科情報理学専攻
有川節夫; 有村博紀
人工知能学会誌, 01 Mar. 1997, 社団法人人工知能学会, Japanese
01 Mar. 1997 - 01 Mar. 1997

On the strong compactness of unions of regular pattern languages with respect to inclusion
Hiroki Arimura; Takeshi Shinohara
RIMS Lecture Notes, May 1996, 京都大学, Japanese, Oral presentation
May 1996 - May 1996

Download

A polynomial-time relative generalization algorithm for learning logic programs with background knowledge
Arimura Hiroki; Shinohara Takeshi
全国大会講演論文集, 20 Sep. 1995, Information Processing Society of Japan (IPSJ), Japanese, Oral presentation
20 Sep. 1995 - 20 Sep. 1995, Japan, 機械学習や帰納推論,仮説推論等における重要な推論手法の一つが,概念の汎化である.概念の汎化とは,いくつかの具体例が与えられたときに,それらを説明するより一般的な概念を見つけることであり,構成的学習において有効な手法である.とくに帰納論理プログラミング等の論理プログラムの学習の分野では,汎化にもとづく学習システムについて活発な研究がおこなわれている.本研究では,論理プログラムを背景知識の存在のもとで汎化する問題について議論する.とくに,高々k個の制約確定節からなる集合のうちで,例として与えられたすべての確定節を導き,背景知識Bのもとで含意に関して極小なものを見つける問題を考察し,この問題を多項式時間で解くアルゴリズムを与える.これは[2]の結果を複数の節からなる論理プログラムに拡張するものである.さらに,論理プログラムの学習への応用について述べる., [Domestic Conference]

Preprint

Efficient string pattern matching for compressed Japanese text
Miyazaki Tetsuji; Hirata Hiroaki; Koga Hisanori; Fukamachi Shuichi; Arimura Hiroki; Shinohara Takeshi
全国大会講演論文集, 20 Sep. 1995, Information Processing Society of Japan (IPSJ), Japanese
20 Sep. 1995 - 20 Sep. 1995, 情報検索の問題の1つとして,パターン照合問題がある.パターン照合間題とは,パターンおよびテキストという2つの文字列が与えられたとき,テキスト中におけるパターンのすべての出現を見つけだす問題である.このパターン照合問題を解くアルゴリズムに,Aho-Corasick法がある.Aho-Corasick法の特長は,パターンよりパターン照合機械という一種の有限オートマトンを作り,そのパターン照合機械をテキスト上で1回走査させるだけで複数のパターンを同時に検出できることである.Aho-Corasick法を用いて実際に検索を行う際に問題になるのが,ディスクからテキストを読み込むのに要する時間である.この間題に対して,テキストをあらかじめ圧縮しておき,その圧縮されたテキストを復号せずに検索する方法が提案されている.日本語テキストには,ひらがな,カタカナ,漢字,記号といった複数の文字種が存在し,テキスト中では同じ文字種が連続して出現するという性質がある.この性質を利用して,文字種ごとに圧縮符号を割り当てることにする.このとき,異なる文字種間で同じ符号が割り当てられる可能性があるので,文字種を保持する必要がある.文字種の変わり目に文字種が変わることを示すシフトコードを用いる.この方法により,文字種を考慮に入れずに全体で圧縮符号を割り当てた場合より圧縮率はよくなる.本稿では,Aho-Corasick法のパターン照合機械を拡張し,シフトコード付き圧縮符号によって圧縮された日本語テキストのためのパターン照合機械を設計した.これを用いて日本語テキストを対象として実験を行った結果,圧縮率は69.7%,走査時間は78.3%に短縮できた.

Protein Motif Discovery from Amino Acid Sequences by Sets of Regular Patterns
Yamaguchi Michiyo; Shinohara Takeshi; Fujino Ryouichi; Arimura Hiroki; Arikawa Setsuo
IPSJ SIG Notes, 28 Jul. 1995, Information Processing Society of Japan (IPSJ), Japanese, Oral presentation
28 Jul. 1995 - 28 Jul. 1995, A regular pattern is a string consisting of constant symbols and mutually distinct variables, and represents the set of the constant strings obtained by Substituting possibly empty constant strings for variables. To the problem of discovering protein motifs from given amino acid sequences, we apply the learning algorithm, called k-minimal multiple generalization (k-mmg), that finds a minimally general collections of at most k regular patterns explaining all the positive examples. We incorporate into the learning algorithm a search heuristics, called minimal covering approach, for detecting exceptions in examples. The experiments on protein data show that our algorithm argumented with the new search heuristics more quickly finds accurate hypotheses from given amino acid sequences than the previous heuristics.

Preprint

Protein Motif Discovery by Multiple String Patterns from Positive Examples
SHINOHARA Takeshi; FUJINO Ryoichi; ARIMURA Hiroki; ARIKAWA Setsuo
人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 24 Jul. 1995, Japanese
24 Jul. 1995 - 24 Jul. 1995

Learning Unions of Tree Patterns Using Queries
ARIMURA Hiroki; ISHIZAKA Hiroki; SHINOHARA Takeshi
人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 24 Jul. 1995, Japanese
24 Jul. 1995 - 24 Jul. 1995

Logical Generalization for Learning with Background Knowledge
Hiroki Arimura; Takeshi Shinohara
ICLP'95 Post-Conference Workshop on Inductive Logic Programming, SUT Technical Report, IA-TR-95-03, 17 Jun. 1995, Stephen Muggleton, Fumio Mizoguchi, and Koichi Furukawa, English, Nominated symposium
17 Jun. 1995 - 17 Jun. 1995, Shonan Village Center, Hayama-Cho, Kanagawa 240-01, Japan, Japan, Proceedings of the ICLP'95 Post-Conference Workshop on Inductive Logic Programming, SUT Technical Report, Edited by Stephen Muggleton, Fumio Mizoguchi, and Koichi Furukawa, 17 June 1995; co-located with: Twelfth International Conference on Logic Programming (ICLP 1995), June 13-16, 1995. Tokyo, Japan., [Invited], [International presentation]

Published

Lossy Text Compression for String Pattern Matching
Fukamachi Shuichi; Shimozono Shinichi; Arimura Hiroki; Shinohara Takeshi
IEICE technical report. Natural language understanding and models of communication, 12 May 1995, The Institute of Electronics, Information and Communication Engineers, Japanese
12 May 1995 - 12 May 1995, A mapping, called k-indexing, from an alphabet to a smaller alphabet of size k is considered as a lossy text compression for speeding up string pattern matching. An approximation algorithm to find a k-indexing that minimizes false detection is presented. Using optimized 2^n-indexing, texts can be compressed in fixed length code of n (bit/char) and the expectation of false detection for patterns of length l is roughly <1/2>^<nl>. Optimized indexings for symbols, 2-grams, and 3-grams are compared with each other on the actual English text.

第2回マシンインテリジェンスに関する国際ワークショップ(International Workshop on Machine Intelligence 1993)の報告
沼尾正行; 有村博紀; 原口誠; 平田耕一; 斉藤和巳; 諏訪正樹; 月本洋; 元田浩
人工知能学会誌, 01 Mar. 1994, 社団法人人工知能学会, Japanese
01 Mar. 1994 - 01 Mar. 1994

How to acquire Meta-cognition from Concrete Databases
Hiroki Arimura; Akira Takeuchi; Setsuko Otsuki
World Conference on Artificial Intelligence in Education (AI-ED 1993), poster, pp.539, Aug. 1993, AACE, English, Poster presentation
23 Aug. 1993 - 27 Aug. 1993, Edinburgh, Scotland

Preprint

Inductive Inference of Prolog Programs with Internal Variables from Positive Data
ARIMURA Hiroki; SHINOHARA Takeshi
人工知能学会全国大会論文集 = Proceedings of the Annual Conference of JSAI, 20 Jul. 1993, Japanese
20 Jul. 1993 - 20 Jul. 1993

Inductive Learning of Logic Programs From Positive Facts Using Minimal Multiple Generalization
Ishizaka Hiroki; Arimura Hiroki; Shinohara Takeshi
Journal of Japanese Society for Artificial Intelligence, 01 Jul. 1993, The Japanese Society for Artificial Intelligence, Japanese
01 Jul. 1993 - 01 Jul. 1993, We consider the polynomial time inferability of primitive Prologs from positive facts. The class of primitive Prologs is a proper subclass of that of linear Prologs which is known to be inferable from only positive facts. In this paper, we discuss the polynomial time inferability of the subclass using minimal multiple generalizations. The minimal multiple generalization is a natural extension of the least generalization given by Plotkin in 1970. The minimal multiple generalization generalizes given first order words by several words, while the least generalization does by a single word. The property of the minimal multiple generalization makes it possible to perform fine generalization and to construct the heads of several clauses in a target program at the same time. We give an outline of a consistent polynomial time inference algorithm which identifies the class of primitive Prologs in the limit. The algorithm infers the heads of clauses in a target program as a minimal multiple generalization of a set of given positive facts. Furthermore, we give a similar result on the inferability of a subclass of context-free transformations which includes several well-known Prolog programs.

データベースからメタ認知を獲得する問題
有村博紀; 岸本拓也; 竹内章; 大槻説乎
人工知能学会第4回知的教育システム研究会, 19 Mar. 1993, Japanese, Oral presentation
19 Mar. 1993 - 20 Mar. 1993

Improvements of the Analogical Reasoning System ARTS for Definite Clauses
Ryuji Wakizono; Hiroki Arimura; Hitoshi Inoue; Makoto Haraguchi; Setsuo Arikawa
the 6th JSSST Annual Meeting, 19 Oct. 1989, JSSST, Japanese, Oral presentation
19 Oct. 1989 - 21 Oct. 1989, 九州大学工学部（福岡市箱崎）, Japan, [Domestic Conference]

Preprint

Completeness of Depth-Bounded Resolution in Logic Programming
Hiroki Arimura
the 6th JSSST Annual Meeting, 19 Oct. 1989, JSSST, Japanese, Oral presentation
19 Oct. 1989 - 21 Oct. 1989, Kyushu University, Fukuoka, Japan, [Domestic Conference]

Download

■ Syllabus

情報知識ネットワーク特論, 2024年, 修士課程, 情報科学院

Introduction to Artificial Intelligence, Big Data, and Cybersecurity for Graduate Students, 2024年, 修士課程, 情報科学院

情報知識ネットワーク特論, 2024年, 博士後期課程, 情報科学研究科

情報知識ネットワーク特論, 2024年, 博士後期課程, 情報科学院

Introduction to Artificial Intelligence, Big Data, and Cybersecurity for Graduate Students, 2024年, 博士後期課程, 情報科学院

プログラム理論と言語, 2024年, 学士課程, 工学部

情報理工学演習Ⅲ, 2024年, 学士課程, 工学部

アルゴリズムとデータ構造, 2024年, 学士課程, 工学部

一般教育演習(ﾌﾚｯｼｭﾏﾝｾﾐﾅｰ), 2024年, 学士課程, 全学教育

■ Affiliated academic society

Aug. 2005 - Present
The Databases Society of Japan (DBSJ)

Oct. 2000 - Present
The Institute of Electronics, Information and Communication Engineers (IEICE)

Jul. 2000 - Present
Association for Computing Machinery (ACM)

Jun. 1992 - Present
The Japanese Society for Artificial Intelligence (JSAI)

Sep. 1990 - Present
Information Processing Society of Japan (IPSJ)

Present
IEEE Computer Society

■ Research Themes

社会を志向した革新的アルゴリズムの実装
Grants-in-Aid for Scientific Research Grant-in-Aid for Transformative Research Areas (A)
Nov. 2020 - Mar. 2025
安田宜仁; 有村博紀; 鍋島英知; 井上武; 西野正彬
2020年度は採択後からの4か月の期間を、今後の研究活動の準備期間と捉え、遠隔での交流のための環境整備や、実装を意識した場合に注力するべき項目の洗い出しを中心に行った。特に後者については、二分決定図を活用した既存ライブラリである graphillion の課題についての検討を進めた。
個別の研究としては、厳密被覆の全解列挙、非同期ストリーム索引、離散最適化を用いた説明可能性機械学習において進展があった。
厳密被覆(集合とそのいくつかの部分集合が与えられた場合に、和集合が重複なく入力集合を構成するような組合せを見つける問題)の全解列挙に関して、入力の部分集合の数が大きい場合でも効率的に動作するような方法を考案した。ゼロサプレス型二分決定図(ZDD)に類似する「DanceDD」というデータ構造を考案し、これを実現した。
高速非同期ストリーム索引について、高木と、稲永、有村、Breslauer、Hendrianらは、一本の成長する系列の索引である接尾辞木のオンライン線形時間構築法を、非同期に成長する複数本の系列の索引である「全オンライン接尾辞木」に自然に拡張することに世界で初めて成功し、その線形時間アルゴリズムを与えた（学術論文1）．
離散最適化を用いた説明可能性機械学習について、金森と有村等は、説明可能機械学習問題において、因果構造を反映した反実仮想説明の整数計画法を用いた生成手法を開発し、人工知能分野のトップ会議AAAI2021に採択され、2022年1月に発表を行った（国際会議発表1）。この成果は、日本経済新聞や各種ウェブメディア等で報道されるなど社会的関心を集めた（報道他1）．
Japan Society for the Promotion of Science, Grant-in-Aid for Transformative Research Areas (A), NTT Communication Science Laboratories, Coinvestigator, 20H05963

Development of Next-generation Semi-Structured Data Mining Technology Towards The Real-World Knowledge Creation Infrastructure
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
Apr. 2020 - Mar. 2025
有村博紀; 宇野毅明; 平田耕一; 山本章博; 喜田拓也; ジョーダンチャールズハロルド
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 20H00595

Steering Toward Spatio-Temporal Computing Architecture Driven by Learning/Math-Scientific Models
Oct. 2018 - Mar. 2025
有村博紀; 中村篤祥; 瀧川一学; 和佐州洋; 栗田和宏
Japan Science and Technology Agency (JST), JST CREST " Technology for Computing Revolution for Society 5.0", Coinvestigator

Expansion of efficient search and discovery technology for processing massive data stream in the real world
Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Research (Exploratory)
Jun. 2018 - Mar. 2021
Arimura Hiroki
In this research, we study efficient methods for search, count, and discovery of complex patterns and rules from multiple, massive, changing, uncertain, unstructured stream data as a basis of large-scale time-space data stream processing in the real world. In this research project, Hiroki Arimura conducted research and take part in R & D of fast pattern search and discovery, Thomas Zeugmann with Arimura studied the theory of multi-stream data processing and knowledge discovery, and Charles Jordan with Arimura devised efficient massively parallel and stochastic search methods for combinatorial structures as patterns. We focus on multiplicity, adaptivity, context-sensitiveness, and low-memory footprint of the algorithms, and built implementations and theretical and empirical analyses of the devised algorithms as libraries for real world data processing.
Japan Society for the Promotion of Science, Grant-in-Aid for Challenging Research (Exploratory), Hokkaido University, Principal investigator, 18K19771

Next-generation semi-structured data mining technologies for real-world knowledge infrastructures
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
Apr. 2016 - Mar. 2020
Arimura Hiroki
The goal of this research is to establish base technologies for forming large-scale knowledge bases from massive data in real and cyber spaces. For this purpose, we study next-generation data mining technologies for efficiently extracting useful knowledge as patterns and rules from large semi-structured data, that is, huge and heterogeneous collections of weakly structured data. In more detail, we studied (1) efficient semi-structured data mining engines in the optimal pattern discovery framework, (2) semi-strcuture mining based on spatio-temporal information, (3) combination of search, compression, and indexing/storage technologies for massive semi-structured data, development of knowledge base creation systems based on semi-structured data mining.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 16H01743

Research on Fundamental Algorithms of Discrete Structure Manipulation Systems
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (S)
May 2015 - Mar. 2020
MINATO Shin-ichi
In this project, we aimed to construct the core algorithms for discrete structure manipulation systems, and to provide efficient software tools for many researchers in various application areas. Our achievements include that (1) we first succeeded in enumerating all connected sub-block patterns (in total 109.8 billion patterns) of 47 prefectures in Japan, the data is open for all Japanese citizens from the governmental statistics center, and (2) we produced many academic papers, accepted at the top-conferences such as AAAI, WWW, KDD, INFOCOM, AISTATS, SDM, etc.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (S), Coinvestigator, 15H05711

Statistically Sound Pattern Mining
Grants-in-Aid for Scientific Research
01 Apr. 2017 - 31 Mar. 2019
Komiyama Junpei; Ishihata Masakazu; Arimura Hiroki; Nishibayashi Takashi; Minato Shin-ichi; Maehara Takanori
Pattern mining algorithms enumerate all the combinatorial patterns with their frequency larger than a given threshold. Existing algorithms output many patterns that characterizes a dataset, they do not address how significant the found patterns are in terms of statistical significance. To address this issue, we propose a method that guarantees the rate of false discovery in found patterns while keeping its computational efficiency. The proposed method is presented in a top-tier data mining / artificial intelligence conference (KDD2017).
Japan Society for the Promotion of Science, Grant-in-Aid for Young Scientists (B), The University of Tokyo, 17K12736

Reconfigurable Architectures for Accelerating Data Mining
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
Apr. 2015 - Mar. 2019
Motomura Masato
We have conducted researches of architectures for combinatory optimization problems. Conventional idea has been to conduct minor-embedding or a target graph onto hardware graph that is more sparse than the former. Our new architectural proposal is to time-multiplexing a hardware structure for expanding hardware graph, that is powerful enough to sustain the original denser target graph. This idea has been verified to achieve higher accuracy for large graphs in our simulation.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Hokkaido University, Coinvestigator, 15H02673

Efficient search and discovery technology for processing massive data stream in the real world
Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Exploratory Research
Apr. 2015 - Mar. 2018
Arimura Hiroki; JORDAN Charles
In this research, we study the algorithms and data structures for search, discovery, and analysis of complex combinatorial patterns as the base technology for massive stream data processing in the real world. Especially, we focus on the following key aspects of such algorithms: low-memory footprint, adaptivity, context-sensitivity, multiplicity. The followings are the research topics that we study: A1. Fast search algorithms based on bit-parallel techniques; A2. Ultra-fast counting technology based on stochastic testing; A3. pattern discovery using substructure enumeration; A4. Theory of knowledge discovery from massive high-speed streams; B1. Prototype implementation and preliminary evaluation.
Japan Society for the Promotion of Science, Grant-in-Aid for Challenging Exploratory Research, Hokkaido University, Principal investigator, 15K12022

Development of Next-Generation Semi-structured Data Mining for Large-Scale Knowledge Base Formation
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
Apr. 2012 - Mar. 2016
Arimura Hiroki; UNO TAKEAKI; MINATO S.; HIRATA KOUICHI; ITO K.; SHIMOZONO S.; KIDA T.
The final goal of this research is to establish a strategy for forming large-scale knowledge bases from massive data and information on a wide range of human activities on social, scientific, and industrial aspects in the cyber space. For this purpose, we study next-generation data mining technologies for efficiently extracting useful knowledge as patterns and rules from semi-structured data, that is, huge and heterogeneous collections of weakly structured data in the cyber space. (1) Efficient semi-structured data mining engines based on optimal pattern discovery framework. (2) Semi-strcuture mining based on spatio-temporal information. (3) Combining semi-structured data mining with stochastic information processing schema. (4) Knowledge federation technologies for large-scale knowledge bases creation. (5) Knowledge indexing technologies for large-scale knowledge bases creation. (6) Development of knowledge base creation systems based on semi-structured data mining.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 24240021

バイオミメティクス・データベースのオープンイノベーションプラットフォームへの展開
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)
Apr. 2013 - Mar. 2015
有村博紀
本研究が連携する計画研究A01班で構築するバイオミメティクス・データベースは，生物学や材料科学等の異分野のデータベースを，画像データ検索技術により横断的に検索し，クラスタリングして提示することで，専門の異なる研究者に新たな気づきを提供し，異分野連携と産学連携の強力な推進基盤となる．また，ISO においてバイオミメティクスの国際標準化の検討が開始されており，先に述べた画像検索技術を踏まえ，国際標準化に参画することで，我が国の国際競争力強化に貢献できる．
以上の背景のもと，本研究では，オープン・イノベーション・プラットフォームの実現に向けて，国際的産業応用可能なバイオミメティクス・データベースの機能についての検討を目的として調査研究を行った．産学連携については，国内外におけるバイオミメティクスに関する異分野連携・産学連携の動向について，欧州のBiomimicry Europa等，昨年度の調査対象組織のその後の動向について詳細に調べた．
バイオミメティクスの国際標準化を検討する技術委員会ISO/TC266における標準化活動を行った．具体的には，国際委員会（第4回ISO総会，2014年10月，ベルギー），国内審議委員会としてISO国内審議委員会（2014年4月，第9回から第11回），国内WG4会議等に参加し，調査および標準化活動を行った．また，標準化に関連する大規模検索とデータ解析に関する技術開発も行った．
以上の活動により，我が国がバイオミメティクスの産業応用で世界に先んじ，我が国の産業界に資することを目指した．さらに，上記を通じて同分野の人材育成も行った．
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area), Hokkaido University, Principal investigator, 25120501

A study on positive-definite and efficient kernels for structured data
Grants-in-Aid for Scientific Research
01 Apr. 2011 - 31 Mar. 2014
SHIN Yoshihiro; OKAMOTO Hiroshi; ARIMURA Hiroki; SAKAMOTO Hiroshi; KUBOYAMA Tetsuji; CUTURI Marco
Kernel method is an important field of machine learning research and allows us to leverage information assets like big data to make useful predictions in various applications. In addition to vector data,there exist huge amount of data that have structures. For example, DNA is an array of nucleotides; Protein, parse trees and XML documents are naturally structured as trees; Various kinds of networks are represented using the graph structure. In this regard, this project aims to establish a theory of kernels for structured data and practical techniques to apply kernels to structured data of the real applications. Specifically, we have developed a mathematical theory to investigate positive definiteness of kernels and various types of kernels that deal with a wide variety of structured data. Furthermore, we have developed a utility to compute kernels and have publicized it to researchers in the field of machine learning over the Internet.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), University of Hyogo, 23300061

Research on database analysis algorithms using very large-scale monolithic memory space
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
2008 - 2011
MINATO Shin-ichi; ARIMURA Hiroki; ZEUGMANN Thomas; KIDA Takuya; OKUBO Yoshiaki
Formerly, database applications required to store all the data into a hard disk, however, after 2000's, middle or large-scale practical databases become possible to be stored in the main memory of PCs. In this project, we proposed and developed a new method of high-speed database analysis with a very large-scale monolithic main memory. Our method is based on Binary Decision Diagram (BDD), which is a new type of compressed graph structure for efficiently manipulating large-scale combinatorial data.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Hokkaido University, Coinvestigator, 20300051

Next-Generation Semi-structured Data Mining for Large-Scale Knowledge Base Formation
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
2008 - 2011
ARIMURA Hiroki; KIDA Takuya; MINATO Shin-ichi; ITO Kimihito; UNO Takeaki; HIRATA Kouichi
In this research, we study efficient semi-structured data mining technologies and related knowledge federation and indexing technologies, which enables knowledge extraction from massive semi-structured data in the real world. Finally, we made a number of knowledge extraction experiments.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Hokkaido University, Principal investigator, 20240014

情報ネットワークにおける大規模知識処理のための超高速アルゴリズムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
2009 - 2010
トーマスツォイクマン; 湊真一; 喜田拓也; 下薗真一; 高木剛; 宇野毅明; 有村博紀
本研究課題では,大規模知識処理のための超高速アルゴリズムの研究に取り組んでいる.本研究では,ウェブからの知識獲得を人間による大量のウェブ情報アクセスを支援する効率的な半自動的ツールとしてとらえ,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指している.そのために,現在までに開発してきたウェブ情報処理技術を元に,中核技術として,機械学習と知識索引技術に基づく,適応的な知的情報獲得システムの基盤技術を開発することを目標としている.
特に本年度は,下記の課題に取り組み,各課題において多くの国際会議論文の採択を受けた.
(1)自律・適応的な知識発見のための機械学習の研究.(ツォイクマン・湊)
(2)超高速知識獲得アルゴリズムの研究.(ツォイクマン・宇野・有村)
(3)ZDDに基づく大規模知識索引の研究.(湊・宇野・有村)
(4)高速情報検索のためのデータ圧縮技術の研究.(喜田・有村・ツォイクマン)
(5)開放分散環境における高度知識処理のためのプライバシー保護機構の研究.(高木・ツォイクマン)
これら一連の知識発見に関する我々の仕事が国際的にも評価され,第25回計算機と情報科学に関する国際会議において,研究代表者であるツォイクマンを筆頭に,分担者である湊および有村らがそれぞれ招待講演者として招かれた.また,分担者の宇野毅明准教授が,文部科学大臣表彰若手科学者賞(業績名:巨大データ解析に対する超高速アルゴリズム構築法の研究)を受賞した.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Hokkaido University, Principal investigator, 21013001

Efficient Pattern Discovery from Massive Semi-Structured Data for Knowledge Infrastructure Formation on the Web
Grants-in-Aid for Scientific Research Grant-in-Aid for Specially Promoted Research
2005 - 2007
ARIMURA Hiroki; KIDA Takuya; MINATO Shin-ichi; ITO Kimihito
By rapid progress of network and storage technologies for the last decade, a huge amount of weakly structured electronic data of various types, called semi-structured data, become available over the Internet. In this research project, we study efficient semi-structured data mining technologies that supports human's discovery of useful knowledge from massive collections of semi-structured data. In particular, we develop high-speed semi-structured data mining engines as a core of large-scale knowledge Infrastructure formation from the Internet and establish their architecture and base technologies. In particular, we have studied the following research topics:
1. High-speed semi-structured data mining technologies. We developed efficient depth-first mining algorithms for sequences and trees based on the rightmost expansion technique, proposed by Hiroki Arimura and other researchers. We also devised various optimization techniques in frequent and maximal pattern mining frameworks. We made theoretical analysis of their computational complexity, and show that they are actually polynomial delay and polynomial space complexity enumeration algorithms, which means that they can be used as light-weight and high-throughput mining engines. From theory point of view, they are the first maximal pattern mining algorithms for non-trivial subclasses of semi-structured data having theoretically guaranteed performance.
2. Knowledge indexing technologies. The pre- and post-processing of the data and patterns are important issues for the next-generation data mining. We developed an efficient knowledge indexing data structure, called VSOP, based on Zero-suppressed BDD (ZBDD) proposed by Shin-ichi Minato. VSOP is a compressed indexing structure for discrete structure, equipped with a rich set of algebraic manipulation operators for compressed transactions and itemsets. On the top of VSOP, we devised efficient data mining algorithms that can extract interesting structures hidden in a huge input dataset in an highly interactive way.
3. Knowledge federation technologies. We combine semi-structured data processing techniques with information retrieval and information extraction from the web and stream data. As results, we developed fast pattern matching algorithms for multi-dimensional numerical streams based on bit-parallel techniques and semi-automatic information extraction algorithms from the Web based on demonstration-by-example. We also implemented a set of knowledge discovery tools based on our algorithms and technologies devised in this project. We also made experiments to evaluate the prototype system on real world data on the Web and in Bioinformatics fields.
Japan Society for the Promotion of Science, Grant-in-Aid for Specially Promoted Research, Hokkaido University, Principal investigator, 17002008

超高速データストリームのためのオンライン型半構造情報変換システムの開発
Grants-in-Aid for Scientific Research Grant-in-Aid for Exploratory Research
2004 - 2005
坂本比呂志; 有村博紀; 坂本比呂志
本研究では,半構造データに対する高速なXPath処理法を提案した.これまでに,データを効率的に圧縮する手法として知られている算術符号化を半構造データの検索に応用した,逆算術符号化が提案されている.これは,木構造データ上のパスの依存関係を,データを圧縮したまま復号化することなく検査できる手法であり,この関係性を利用することで,パスによる問い合わせを高速に処理できる.しかしながら,この問い合わせで利用可能なパスの形式は限定されているため,一般のXPathの問い合わせは処理が困難である.そこで本研究では,このような逆算術符号化にノード間の先祖子孫関係を判定可能な範囲ラベルを導入することにより,より複雑な問い合わせ処理を高速に実現するための手法を提案する.評価実験の結果,300MB程度のXMLデータに対してテキストを直接処理する既存の手法と比較し,数十から百倍の高速化を達成した.また,本研究では,畳み込みカーネルのアイディアに基づいた,ラベル付き順序木に対するこれまでにない新しいカーネル関数を提案した.まず,畳み込みカーネルの枠組みにおいてラベル付き順序木に対して任意の部分グラフを部分構造として用いた場合の,効率の良いカーネル計算のアルゴリズムを提案し,曖昧なラベルや構造を取り込むような拡張を行った.さらに,より一般的な木構造として,順序のないラベル付き根付き木に対するカーネルを考えた場合には,カーネルの計算が#P-完全問題であることを示した.
Japan Society for the Promotion of Science, Grant-in-Aid for Exploratory Research, Principal investigator, 16650021

最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
2004 - 2005
トーマスツォイクマン; 有村博紀; 坂本比呂志; 篠原歩; 下薗真一; 湊真一; 喜田拓也; 竹田正幸
本研究は,大量のウェブページやXML等の大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの実現方式を明らかにすることを目標としている.その鍵になる技術として,最適パターン発見を木やグラフ構造に拡張し,計算量理論と計算学習理論の最新の成果を援用しながら,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムの開発に取り組んだ.
平成17年度は,初年度から昨年度までの研究成果と統合し,最適半構造マイニングのプロトタイプシステム構築を目指した.研究項目としては,有用な情報源の発見,特徴的なパターンの発見,情報の抽出の3つの情報獲得問題に加えて,昨年度から新たに研究を開始した知識索引問題について取り組んだ.今年度得られた具体的な結果のうち主要なものは以下のとおりである.
(1)大規模なトランザクションデータによく見られる疎な組み合わせ集合データを効率よく扱うことのできるデータ構造であるZBDD(Zero-suppress BDD)をベースに,その構造の元で重み付き積和集合を計算可能なZBDDパッケージツールVSOP(Valued Sum-Of-Products)の開発を推し進め,頻出するパターン集合を表現するZBDDを単純直交分解する機能を追加した.これにより,そのデータに内包された意味的構造を自動抽出することが可能になった.(湊)
(2)パターン発見アルゴリズムによる分類・予測の長期的ふるまいに関する理論保証を与えることに成功した.(ツォイクマン)
(3)系列データからの極大モチーフパターンを効率よく枚挙するアルゴリズムを得た.(有村:H13-H16代表)
(4)Arc構造付きテキストに対する高速なパターン照合アルゴリズムを得た.(喜田)
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Hokkaido University, Principal investigator, 16016266

Study of High-speed Data Mining Algorithms from Massive Data Streams
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
2003 - 2005
IKEDA Daisuke; TAKEDA Masayuki; SHINOHARA Ayumi; KIDA Takuya; KASAHARA Yoshiaki; ISHINO Akira
In this research, we investigated high-speed online knowledge discovery system for extracting useful information from massive semi-structured data streams. Particularly in this year, as theoretical researches, we extended further the theory of efficient pattern matching and pattern discovery methods for online streams. As application studies, we made a series of experiments on collection and analysis of network data from real high-speed networks in a huge organization. We have also published the results obtained in the research period of the last three years. In particular, we proceed the studies on the following issues:
(1)Survey on semi-structured data : We have summarized and published a survey on stream data mining in an academic journal, which has been studied through this project for the last three years.
(2)Study on streaming pattern matching technology for semi-structured data : We developed an efficient method for performing tree pattern matching with horizontal wildcards by bit parallel technology, which potentially gives drastic speed-up for Xpath and XQuery pattern matching languages for huge XML data.
(3)Study on sequential and streaming pattern discovery technology for semi-structured data : We developed efficient algorithms for finding interesting patterns from massive data streams for various classes of complex patterns/motifs. In this year, we also published pattern discovery algorithms developed in the last year. Also, one of them got awarded for 2004 JSAI SIG AWARD.
(4)Empirical study on knowledge discovery from real massive network data : As applications, we performed a series of surveys on data collection and online analysis of high-speed large-scale network for middle sized organization at Kyushu University. These experiments will give insights for future research on the development of efficient pattern matching/discovery algorithms for high-speed streaming data.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Principal investigator, 15300036

最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
2001 - 2003
有村博紀; 篠原歩; 竹田正幸; 坂本比呂志; 下薗真一
本研究では,大量のウェブページやXML等の大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの実現方式を明らかにすることを目標としている.鍵になる技術として,最適パターン発見を木やグラフ構造に拡張し,計算量理論と計算学習理論の最新の成果を援用しながら,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムの開発に取り組んだ.平成15年度は,前年度の成果に基づき,(a)有用な情報源の発見,(b)特徴的なパターンの発見,(c)情報の抽出の3つの情報獲得問題について,次の研究を行った.
(1)前年度開発した最適分類パターンを用いた高精度テキスト自動分類手法を一種の正規表現を扱えるよう一般化した.さらに,近似文字列照合を可能なパターン発見エンジンを開発し,加速学習法を用いて決定木学習システムBONSAIに組み込んだ(竹田・篠原).
(2)XPathパターンに対する最適パターン発見アルゴリズムを設計し,半構造データマイニングエンジンを開発した.とくに今回は,より現実の半構造データに近い無順序木モデルに対して,高速なパターン発見エンジンを開発した.パターンの圧縮表現に対する,高速な列挙アルゴリズムを開発した.独自に開発したオンライン化技術を用いて,オンラインパターン発見手法を開発した(有村).
(3)情報抽出に関して,現状の技術動向の調査を行い,水平方向制約(Sequence制約)付き半構造データに対するラッパー帰納アルゴリズムの設計を行った(坂本).さらに,半構造データに適した高性能圧縮アルゴリズムを開発し,性能に関する理論的解析を行った.並行して,開発したアルゴリズムの計算量を解析し(下薗,篠原),個々のアルゴリズムの最適化をおこない,性能を向上させた(全員).最後に,ウェブデータとXMLデータに関する評価実験をおこない,この方式の有効性を評価した(有村・篠原).
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Kyushu University, Principal investigator, 15017268

Study of efficient text data mining based on optimized pattern discovery
PRESTO21, "Information and Intelligence"
Oct. 1999 - Mar. 2002
Hiroki Arimura
JST, Principal investigator, Competitive research funding

最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
科学研究費助成事業
2002 - 2002
有村博紀; 下薗真一; 篠原歩; 竹田正幸
ネットワーク上に分散したウェブページやXML等の半構造データの急速な増大に対して,これらのコンテンツに直接アクセスするための効率良い手法の開発が緊急の課題となっている.本研究では,大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指す.
そのための鍵となる技術として,最適パターン発見を木やグラフ構造に拡張して,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムを開発する.さらに,ウェブマイニングを(a)有用な情報源の発見,および(b)特徴的なパターンの発見,(c)情報抽出の3つの過程ととらえ,これらを有機的に結合して,半構造データを対象とした知識獲得システムの効率良い実現方式を明らかにすることを目標とする.また,計算量理論と計算学習理論の最新成果を援用し,計算量に徹底的に配慮した高速なアルゴリズムの開発を目指すのも特色である.
平成14年度は,次の研究成果を得た.
(a)「有用な情報源の発見」に関しては,従来より表現力の高いパターンである可変長変数パターン(VLDCパターン)に対する新しいテキスト索引構造を開発し,これを用いて,効率よい最適化マイニングアルゴリズムを開発した.
(b)「特徴的なパターンの発見」に関しては,XML-MessagingとSOAPへの応用を目指して,昨年開発した半構造データマイニング手法FREQTを元に,高速な半構造データストリームマイニングSTREAMTを開発した.これは,非常に少なく資源を使いながらデータストリームを監視し、有用なパターンを逐次報告するアルゴリズムである.また,FREQTの最適化マイニングへの拡張と理論的性能解析を行い,この方式の最適性を示した.
(c)「情報抽出」に関しては,XMLデータ処理の基本技術である符号語列上のパターン照合機械技術を開発し,XMLパターン検索への応用を考察した.
日本学術振興会, 特定領域研究, 九州大学, 14019070

最適パターン発見に基づく大規模半構造データからの知的情報獲得システムの開発
科学研究費助成事業
2001 - 2001
有村博紀; 篠原歩; 竹田正幸; 坂本比呂志; 下薗真一
ネットワーク上に分散したウェブページやXML等の半構造データの急速な増大に対して,これらのコンテンツに直接アクセスするための効率良い手法の開発が緊急の課題となっている.本研究では,大規模半構造データからのデータマイニング(ウェブマイニング)に基づき,大量のデータ解析を対話的に支援する効率的なツールとして,従来の情報検索システムを超えた新しい情報アクセスシステムの開発を目指す.
そのために,鍵となる技術として,最適パターン発見を木やグラフ構造に拡張して,半構造データに対する頑健かつ高速な最適化パターン発見アルゴリズムを開発する.さらに,ウェブマイニングを(a)有用な情報源の発見,および(b)特徴的なパターンの発見,(c)情報抽出の3つの過程からなると考え,これらを有機的に結合して,半構造データを対象とした知識獲得システムの効率良い実現方式を明らかにすることを目標とする,また,計算量理論と計算学習理論の最新の成果を援用して,計算量に徹底的に配慮した高速なアルゴリズムの開発を目指すことも特色である.
平成13年度は,次の研究成果を得た.
(a)「有用な情報源の発見」に関しては,部分系列パターンとエピソードパターンと呼ぶ組合せパターンに対する効率よい最適化マイニングアルゴリズムを開発し,これを文字列分類のための決定木学習アルゴリズムBONSAIに組み込んだ.
(b)「特徴的なパターンの発見」に関しては,半構造データを最も基本的なラベル付き順序木(labeled ordered trees)のクラスとしてモデル化し,データ中の頻出共通部分構造に対する高速な発見アルゴリズムを開発した.木に関するパターン発見問題は,一般に高い計算量をもつことが多い.そこで,最右枝拡張法という効率よい発見手法を与え,これを複数の最適化手法と組み合わせて,半構造データに対する高速なマイニングアルゴリズムを与えた.
(c)「情報抽出」に関しては,ウェブからの情報抽出問題を考察し,HTMLデータから木構造の情報を利用して必要な情報を効率よく切り出すTree-Wrapperアルゴリズムを開発した.
日本学術振興会, 特定領域研究(C), 九州大学, 13224073

Development of Efficient Data Mining Systems for Large Semi-Structured Text Data
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
1999 - 2001
ARIMURA Hiroki; SHINOHARA Ayumi; TAKEDA Masayuki; SHOUDAI Takayoshi; HIRATA Kouichi; ISHINO Akira
The goal of this research project is to devise an efficient semi-automatic tool that supports human discovery from large unstructured and semi-structured text data. To achieve this goal, we studied in the following three directions.
1. The central process of text mining is pattern discovery. We employed the framework of optimized pattern discovery, and developed effcient and robust text mining algorithms that find simple combinatorial patterns from large unstructured texts. To implement these algorithms, we developed a text index structure based on the suffix arrays suitable for text mining. Based on these technologies, we implemented a prototype system and run computer experiments on Web data.
2. Another important technology for text is efficient pattern matching. As a theoretical framework, we proposed a unified framework, called Collage system, for realizing various dictionary-based compression methods. We developed both Knuth-Morris-Pratt type and Byer-Moore type pattern matching algorithms employing this framework. We also applied this framework to Byte-Pair-Encoding compression method and Sequitur, the former of which yields the fastest compressed pattern matching algorithm.
3. Final process of text mining is information extraction. From theoretical point of view, we first formalize the information extraction problem from semi-structured data, and then gave theoretical analysis of the power and the limitation of such tasks. Then, we developed efficient information extraction algorithms for various types of extraction rules including tree wrappers and hedge patterns and evaluate them through experiments on real-life semi-structured data on the internet.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B), Kyushu University, Principal investigator, 11558040

高度知識ベースを対象とした知識獲得システムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
1999 - 2000
有村博紀
本研究の目的は,大規模知識データベースから,複雑な知識構造を効率よく獲得するシステムの実現方法を明らかにすることである.大規模知識データベースは,(i)大量の(ii)非均質なデータを含み,同時に(iii)その多くは正例だけを含むので,従来の知識獲得手法は適用できない.そこで,本研究では,このような大規模知識データベースに適用可能な新しい学習手法について研究をおこなった.
本年度は,以下の項目について研究を実施した.
(1)前年度に開発した一般的な知識獲得手法を,ネットワーク上の半構造データからの知識獲得に応用した.具体的には,ネットワーク上でのデータ交換および記述言語として急速に利用が進んでいるXMLデータを対象とし,与えられたXMLデータから,対話によって,データ変換規則を効率良く発見するアルゴリズムを開発する.(ICGI 2000;人工知能学会誌Vol.16,No.2)
(2)さらに(1)項の半構造データからのパターン発見アルゴリズムを,前年度に開発したテキストデータからのパターン発見手法と融合し,構造と内容の両方を利用した知識獲得手法を開発した.さらに,この手法の時間計算量と質問計算量を理論的に解析した.(PAKDD 2000;RECOMB 2000;ビット別冊)
(3)開発した知識獲得手法の有効性を計算機実験を通じて評価する.開発中の知識獲得システムを,本研究で得たパターン発見手続きの導入によって拡張した.さらに,ネットワーク上に分散したウェブページからの知識獲得実験をおこなった.(Kyoto ICDL2000;人工知能学会誌Vol.15,No.4)
Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 11780277

Knowledge Discovery by Inferences
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas (A)
1998 - 2000
SATO Taisuke; HARAGUCHI Makoto; IMAI Mutsumi; ARIMURA Hiroki; SATO Masako; SHINOHARA Takeshi
From the fiscal year 1999 to 2001, we conducted research in knowledge discovery and developed inference various methods that can cope with uncertainty and complexities in the real data as follows.
T. Sato developed a symbolic-statistical modeling language PRISM. Arimura et al. developed two types of fast text pattern matching algorithm. Tsukimoto proposed logical regression analysis. K.Satoh continued the analysis of minimal case base required for representing concepts. Sakama proposed non-monotonic inverse resolution.
Imai et al. developed an algorithm for computing maximally specific hypotheses for ILP. Haraguchi proposed data abstraction for decision trees. Yamamoto reconstructed the theoretical base of ILP based on the logic of Suggestion.
M. Sato et al. studied the theory of refutable/inductive inference. Shinohara proposed reducing data dimension by his H-map. Ohsawa showed that his KeyGraph approach is effective by examples.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas (A), Tokyo Institute of Technology, Coinvestigator, 10143104

高度知識ベースを対象とした知識獲得システムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
1997 - 1998
有村博紀
本研究では,構造化データからの対話的な知識獲得について,基礎的な研究をおこなった[1,3].まず,オブジェクト指向データベース等の複合オブジェクト(complex object)の単純なモデルとして,述語論理の一階項(first-order term)を考えた.さらに,能動的学習(active learning)の枠組みのもとで,さまざまな構造化パタン族に対して,効率的なパタン発見手法の開発と,発見問題の本質的複雑さの探求をおこない,次の結果を得た[1,3,2].
1. 2個の一階項からなる簡単なパタンでさえ,効率よい学習が困難なことを示した.その一方で,所属性質問を用いて実験をおこなうことで,この種のパタンが効率よく多項式時間で同定できることを示した.
2. 上記の結果を2個以上の一階項からなるパタンに拡張して,所属性質問を用いた効率よい学習アルゴリズムを与えた.このアルゴリズムは構造化データの学習における基本的機構を確立するもので,所属性質問を用いるさまざまな学習モデルに適用可能である.
3. ACH(k)(acyclic constrained Horn programs of bounded arity k)と呼ばれる一階ホーン論理式の族(再帰論理プログラムの一種)が,含意に基づく学習(能動学習の一種)の枠組みで多項式時間学習可能であることを示した[1].さらに,質問計算量に関する下限定理と学習困難性を示して,一般的な一階ホーン論理式の学習には,質問の利用が不可欠なことを明らかにした.
以上の結果は,構造化データからの知識獲得の本質的難しさを明らかにするものである.同時に,対話による知識ベース構築のための基本的方法論を与えるものである.
Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 09780343

超大規模データからの高速データマイニング・システムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Priority Areas
1997 - 1997
有村博紀; 正代隆義; 有川節夫
データマイニング(Data Mining)は,データベースからの知識発見とも呼ばれ,現在,ビジネス分野や科学技術分野等,さまざまな対象領域で,その適用が盛んにおこなわれている.しかし,現在のデータマイニングの対象は関係データベースが中心であり,現在急速に利用が進みつつあるテキストデータベースやオブジェクト指向データベースに関しては,明示的な構造をもたない,あるいは非均質な構造しかもたない,膨大なデータの集積であるなどの理由から,従来の手法をそのまま適用することができないため,ほとんど研究がおこなわれていない.
そこで本研究では,ここにあげたような非構造的データや構造データからのデータマイニングについて研究した.平成9年度は,具体的にはつぎの問題に中心に研究した.
1.高速パターン発見アルゴリズムの研究:近年発展著しいテキストデータベースを対象に,高速なパターン発見アルゴリズムを開発した.とくに,単純なパタンに仮説を制限する代わりに,誤差や欠落を含む不完全データにたいしても働くような,頑健かつ高速な手法を開発することができた.また,確率的サンプリングを用いた高速化や効率的な探索の枝刈り法についても成果を得た.
2.属性効率のよいパタン発見アルゴリズム:テキストデータベースにおいては,属性に対応する部分列や語彙の数は膨大になる.本項では,1変数パタンと呼ばれる単純な規則を対象に,発見に必要な具体例が,発見対象に関連しない属性数にはほとんど依存しないような,「属性効率がよい」パタン発見アルゴリズムを開発し,この族が多項式オンライン学習可能であることを示した.
3.構造化データからの対話を用いた知識獲得:本項では,構造化データからの対話的な知識獲得について基礎的な研究をおこなった.関係データベースではさまざまな演繹質問や統合性制約が一階ホーン論理式として表される.主結果として,一階ホーン論理式の部分族ACH(k)に対する多項式時間学習アルゴリズムを与え,さらに,これが質問計算量に関してほぼ最適であることを示した.
他にも,テキストデータベース用の質問言語の計算量と表現力を調べ,完全に特徴づけた.また,本年度に1項のテキストデータマイニング・システムの小規模プロトタイプを試験的に実装し,問題点の洗い出しをした.次年度は,これに基づいて効果的な実装法を研究し,分子生物学のデータを対象として大規模な知識獲得実験をおこないたい.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research on Priority Areas, Kyushu University, Principal investigator, 09230215

Speedup of Text Database by Data Compression
Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
1995 - 1997
SHINOHARA Takeshi; FUKAMACHI Shuichi; SHIMOZONO Shinichi; ISHIZAKA Hiroki
The objective of this research is to establish a speedup method for sequential pattern matching by data compression and demonstrate its availability in text database.
We design a pattern matching machine for compressed data by Huffman codes without decoding. In the experiment on this algorithm, although the effect of this method depends on the characteristics of data, the text size and the response time of searching are reduced to 60% and 70%, respectively, for English text.
We also design a similar technique for new compression scheme, called Byte-Pair-Encoding (BPE,for short). This technique compresses English text to around 50% and reduces search time to 60%. BPE is basically a fixed length code, and therefore compresses text by BPE is efficiently distributed to processors in parallel environment.
Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (A), Kyushu Institute of Technology, Coinvestigator, 07558159

大規模オブジェクト指向データベースを対象とした知識獲得システムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
1996 - 1996
有村博紀
本研究の目的は,CADデータベースやマルチメディアデータベースのように,大量の非均質なデータを含み,同時に,主に正例だけを含むようなデータベースに対しても適用可能な,新しい学習手法を開発することである。具体的な学習手法としては,申請者らが開発した「極小多重汎化」手法を採用し,大規模科学データベースを対象として,高速な極小多重汎化手続きを開発することを目指して研究をおこなった.
具体的には,極小多重汎化をデータベースを対象に拡張し,高速な学習手続きを開発した.
1.まず,データの理論的モデルの基盤として,関係データベースをとりあげ,知識表現の立場からこれを概念階層の導入によって拡張した.こうして拡張されたデータベースを対象として,高速な極小多重汎化手続きを開発することに成功した.この手法の能力と適用限界を理論的に調べた.
2.さらにこの汎化手続きを計算機上に実装し,その有用性をデータベースからの知識獲得に関する計算機実験を通じて確認した.この際,確率的選択を含む数種の経験的発見法を導入により,実際的な効率の向上をはかった.
3.近年のマルチメディア等の高度データベースの急激な発展に対処するため,上で開発した拡張データベースにテキストデータかたの知識獲得手続きを組み込むための基礎的研究をおこなった.文字列パターンとよばれる表現をとりあげ,パターンへの概念階層の組み込みと高速汎化アルゴリズムの開発をおこなった.
4.また,複数概念の個数にあらかじめ上限を与えない場合の学習可能性について理論的考察をおこない,正例からの学習の可能性と上限を明らかにした.
5.背景知識を許した関係記述言語である基本形式体系の計算量についても考察した.
Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu University, Principal investigator, 08780371

大規模オブジエクト指向データベースを対象とした知識獲得システムの研究
Grants-in-Aid for Scientific Research Grant-in-Aid for Encouragement of Young Scientists (A)
1995 - 1995
有村博紀
本研究では,大規模オブジェクト指向データベースを対象とした知識獲得システムの実現方法をあきらかにすることを目的として,研究をおこなった.このために申請者が提案した学習手法である極小多重汎化を,データベースを対象に拡張し,高速な学習手続きを開発した.具体的には,つぎの研究をおこなった.
(1)オブジェクト指向データベースにおける概念継承と背景知識を統一的にあつかうために,背景知識として論理プロブラムが与えられた場合に,与えられデータベースの極小汎化を求める問題を考察し,高速な汎化手続きを開発した.さらに,この汎化手続きによって,制限された述語論理プログラムのクラスが,正例だけから多項式時間極限可能であることを示した.
(2)計算量理論の立場から,最も単純な体系である原子式の極小多重汎化問題を考察し,未知仮説の正確な同定に必要な時間計算量と質問情報量を理論的に明らかにした.これにより,正確な同定を高速におこなうには,知識獲得システム自身が決定実験をおこない,必要な例を動的に得る仕組みが必要なことがわかった.
(3)開発中の知識獲得システムを,本研究で得た汎化手続きの導入によって拡張し,遺伝子配列と蛋白質に関するオブジェクト指向データベースを対象として知識獲得実験をおこなった.この計算機実験によって,開発した汎化手続きの実際のデータベースにおける有効性を検証した.
Japan Society for the Promotion of Science, Grant-in-Aid for Encouragement of Young Scientists (A), Kyushu Institute of Technology, Principal investigator, 07780339

概念の理解と知識の定着をはかる知的CAIに関する研究
科学研究費助成事業
1993 - 1993
竹内章; 有村博紀
発見的な学習を支援する知的教育システムに関する研究を行った。本研究では、発見的学習とAIとの関係から、2つの立場より研究を行った。一つは、知的教授システムを実験台にして、知識処理の特徴を体験を通して学習させる方法に関する研究、今一つは発見学習の支援方法に関する研究である。
前者は、知的教授システムの特徴的機能が知識処理によって初めて可能となることから、知識処理を学ぶよい教材となるのを利用して、実験によって知識処理の深い理解を得させるための環境に関する研究である。ここでは学習環境の構成方法と、計算機によってシミュレートされた疑似学習者の役割について提案を行った。
後者は、システムが学習者の振る舞いを認識して、学習者の意図に対して適応的な発見支援をめざす。このとき問題になるのは、学習者の意図を推定する方法と支援の方法である。我々は、学習者がすでに獲得済みの背景知識を用いながら、実験環境での試行錯誤を通して新しい概念を獲得するとの考えにたって、学習者とシステムが実験環境を共有し、協調作業を行う学習支援システムのアーキテクチャと、そこでの支援方法を提案した。ここでは発見のプロセスを、実験装置とデータの整理・加工のための道具の状態で表現される状態空間中での、それぞれの部品で定義されているオペレータの適用による移動としてモデル化した。この状態空間は、実験によるデータ収集を行う操作空間と、実験データをもとに仮説を生成する仮説空間に分けられる。また、システムが与えられた環境から規則を発見するため、そして学習者の指導のために用いる発見の戦略を、個々の環境に依存したもの、一つの領域に共通なもの、全体に共通なものの3レベルに分け記述した。
日本学術振興会, 重点領域研究, 九州工業大学, 05213219

概念の理解と知識の定着をはかる知的CAIに関する研究
科学研究費助成事業
1992 - 1992
竹内章; 有村博紀
マイクロワールドに代表される発見的学習環境と、知的教授システムに代表される問題解決中心の学習環境を統合した、Bimodus CAIと呼ぶ新しい教育環境を提案し、特に、発見学習の支援方法を中心に研究した。
Bimodus CAIにおける発見学習環境は、現実世界のモデルを計算機上に実現したダイレクト・マニピュレーション環境である。学習者はモデルを操作し、反応を観察しながら、モデルに埋め込まれた概念を獲得する。この過程で学習者は実験環境から情報を集め、整理し、自分の持っている背景知識を用いて仮説を立て、実験結果を解釈し、加工しながら、試行錯誤を繰り返す。
学習者が新しい概念を発見するには、それを導くための前提知識を持っていることが必要であるが、この条件のもので、学習者に対する支援は、次の二つの目的を満たすシステムの振舞いであると考える。
1)学習者の行う思考の領域を、背景知識全体から関連のある知識領域へ、さらに目標概念の発見に必要な前提知識領域へと、次第に縮小する。
2)前提知識の領域において、正しい知識を導出できるような実験環境を提示する。
学習者支援には学習者モデルが必要で、これは、システムに学習者と同じように実験環境から新しい概念を発見する機能をもたせ、学習者のふるまいをモニタして構築する。この機構は、内部実験装置、機械学習プログラム、背景知識から構成される。機械学習プログラムは、背景知識と内部実験装置から得られる情報を基に目標概念を獲得する。内部実験装置は学習者が行うのと同じ操作が可能で、学習者が収集するのと同じ情報が機械学習プログラムに与えられる。内部実験装置と学習者が操作する実験装置は表裏一体の関係にあり、学習者の操作をモニターしたり、システムが実験装置を操作してその結果を学習者に示したりする。
日本学術振興会, 重点領域研究, 九州工業大学, 04229221

■ Academic and Social Contribution Activities/Other
Industrial Property Rights

STR-seminar 2024 Sapporo, an inter-university research meeting for young researchers
02 Sep. 2024 - 04 Sep. 2024
Planning etc
Academic society etc
STR Seminar, Steering committee
Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido, Japan, 開催概要
名称：「STRセミナー2024北海道〜若手研究者のための大学間合同研究集会」
日時： 2024年9月2日(月) 〜9月4日(水)：セミナー，9月4日(水)午後: 研究討論会
場所：北海道大学大学院情報科学研究院，A13教室（第1〜3日目すべて）
（北海道札幌市北区北14条西9丁目，北海道大学札幌キャンパス，情報棟，1階，玄関入ってすぐ右）
アクセス： https://www.ist.hokudai.ac.jp/access/
集会HP: https://sites.google.com/view/str2024sapporo/
35135885;11763074

meeting program

Social Contribution Activities

学術講演会「北海道から多文化共生を考える」
17 Nov. 2024 - 17 Nov. 2024
Presenter, Organizing member
日本学術会議北海道地区会議
学術講演会「北海道から多文化共生を考える」
北海道地区会議ニュース，学術講演会開催報告，No.55, 2025-3
High school students, College students, Graduate students, Teachers, Guardians, Researchers, General, Scientific organization, Company, Civic organization, Governmental agency, Media

Public Lecture Series on Global COE Program of Center for Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation (GCOE-NGIT)
2011 - 2011
Lecturer, Planner
Graduate School of Information Science and Technology
Global COE Program Public Lecture Series on Next-Generation Information Technology Based on Knowledge Discovery and Knowledge Federation (GCOE-NGIT): From networks via Nano-technology to Genome Science

Media Coverage

Press Fujitsu and Hokkaido University Develop "Explainable AI" Technology Providing Users with Concrete Steps to Achieve Desired Outcomes
04 Feb. 2021
Hokkaido University / Fujitsu Laboratories Ltd.
Fujitsu Laboratories Ltd., Press Releases: https://www.fujitsu.com/global/about/resources/news/press-releases/2021/0204-01.html, [Others]

Fujitsu and Hokkaido University Develop "Explainable AI" Technology Providing Users with Concrete Steps to Achieve Desired Outcomes
04 Feb. 2021
Other than myself
Nikkei
日経新聞: https://www.nikkei.com/article/DGXLRSP604512_U1A200C2000000/, [Paper]

Faculty of Information Science and Technology　Computer Science and Information Technology　Knowledge Software Science	Professor
Research Center of Mathematics for Social Creativity	Professor
Institute for Academic Innovation	Professor