工藤 峰一(クドウ ミネイチ) |

情報科学研究院 情報理工学部門 数理科学分野 |

教授 |

Last Updated :2021/09/07

- パターン認識 特徴選択 汎化性能 センサー 識別 所属性質問 大規模問題 画像処理 クラス数 計算論的学習理論 漢字認識 データマイニング スーパークラス 部分クラス 可視化 区分的線形識別関数 個人認証 ニューラルネットワーク 機械学習 記憶 ラフ集合 記号処理 学習 グラフ理論 問題解決 正則化 様相論理 クリプキ・モデル ファジィ論理 ろ過法 識別子 Learning Feature Selection Classifier Man-Machine Interface friendly for people Identication Technologies Machine Learning Pattern Recognition

- 情報通信 / 知能情報学
- 情報通信 / 知覚情報処理
- 情報通信 / 統計科学
- 情報通信 / ヒューマンインタフェース、インタラクション
- 情報通信 / データベース
- 情報通信 / 知能ロボティクス
- 情報通信 / 知覚情報処理
- 情報通信 / 感性情報学
- 情報通信 / ソフトコンピューティング
- 情報通信 / 知能情報学

- Kejing Lu, Mineichi Kudo569 - 584 2021年
- Kejing Lu, Mineichi KudoIEICE Trans. Inf. Syst. 104-D 1 138 - 145 2021年
- Kejing Lu, Mineichi Kudo1045 - 1056 2020年
- Mitsuki Maekawa, Atsuyoshi Nakamura, Mineichi Kudo241 - 256 2020年
- Kejing Lu, Hongya Wang, Wei Wang, Mineichi KudoProc. VLDB Endow. 13 9 1443 - 1455 2020年
- Mariko Tai, Mineichi KudoProgress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 24th Iberoamerican Congress, CIARP 2019, Havana, Cuba, October 28-31, 2019, Proceedings 525 - 534 2019年 [査読有り][通常論文]
- Lu Sun 0001, Mineichi KudoPattern Anal. Appl. 22 3 1029 - 1049 2019年 [査読有り][通常論文]

Multi-label classification faces several critical challenges, including modeling label correlations, mitigating label imbalance, removing irrelevant and redundant features, and reducing the complexity for large-scale problems. To address these issues, in this paper, we propose a novel method—polytree-augmented classifier chains with label-dependent features—that models label correlations through flexible polytree structures based on low-dimensional label-dependent feature spaces learned by a two-stage feature selection approach. First, a feature weighting approach is applied to efficiently remove irrelevant features for each label and mitigate the effect of label imbalance. Second, a polytree structure is built in the label space using estimated conditional mutual information. Third, an appropriate label-dependent feature subset is found by taking account of label correlations in the polytree. Extensive empirical studies on six synthetic datasets and 12 real-world datasets demonstrate the superior performance of the proposed method. In addition, by incorporating the proposed two-stage feature selection approach, the multi-label classifiers with label-dependent features achieve on average 9.4% performance improvement in Exact-Match compared with the original classifiers. - Ryo Watanabe, Junpei Komiyama, Atsuyoshi Nakamura, Mineichi KudoIEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E101A 3 662 - 667 2018年03月 [査読有り][通常論文]

We propose a policy UCB-SC for budgeted multi-armed bandits. The policy is a variant of recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally prohibitive, UCB-SC runs very fast while keeping KL-UCB-SC's asymptotical optimality when reward and cost distributions are Bernoulli with means around 0.5, which are verified both theoretically and empirically. - Lu Sun 0001, Mineichi KudoPattern Recognit. 74 503 - 517 2018年 [査読有り][通常論文]

Multi-label classification associates an unseen instance with multiple relevant labels. In recent years, a variety of methods have been proposed to handle the multi-label problems. Classifier chains is one of the most popular multi-label methods because of its efficiency and simplicity. In this paper, we consider to optimize classifier chains from the viewpoint of conditional likelihood maximization. In the proposed unified framework, classifier chains can be optimized in either or both of two aspects: label correlation modeling and multi-label feature selection. In this paper we show that previous classifier chains algorithms are specified in the unified framework. In addition, previous information theoretic multi-label feature selection algorithms are specified with different assumptions on the feature and label spaces. Based on these analyses, we propose a novel multi-label method, k-dependence classifier chains with label-specific features, and demonstrate the effectiveness of the method. (C) 2017 Elsevier Ltd. All rights reserved. - Ryo Watanabe, Junpei Komiyama, Atsuyoshi Nakamura, Mineichi KudoIEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E100A 11 2470 - 2486 2017年11月 [査読有り][通常論文]

We study the budgeted multi-armed bandit problem with stochastic action costs. In this problem, a player not only receives a reward but also pays a cost for an action of his/her choice. The goal of the player is to maximize the cumulative reward he/she receives before the total cost exceeds the budget. In the classical multi-armed bandit problem, a policy called KL-UCB is known to perform well. We propose KL-UCB-SC, an extension of this policy for the budgeted bandit problem. We prove that KL-UCB-SC is asymptotically optimal for the case of Bernoulli costs and rewards. To the best of our knowledge, this is the first result that shows asymptotic optimality in the study of the budgeted bandit problem. In fact, our regret upper bound is at least four times better than that of BTS, the best known upper bound for the budgeted bandit problem. Moreover, an empirical simulation we conducted shows that the performance of a tuned variant of KL-UCB-SC is comparable to that of state-of-the-art policies such as PD-BwK and BTS. - Lu Sun 0001, Mineichi Kudo, Keigo KimuraIEICE Trans. Inf. Syst. 100-D 10 2597 - 2604 2017年 [査読有り][通常論文]

Multi-label classification is an appealing and challenging supervised learning problem, where multiple labels, rather than a single label, are associated with an unseen test instance. To remove possible noises in labels and features of high-dimensionality, multi-label dimension reduction has attracted more and more attentions in recent years. The existing methods usually suffer from several problems, such as ignoring label outliers and label correlations. In addition, most of them emphasize on conducting dimension reduction in an unsupervised or supervised way, therefore, unable to utilize the label information or a large amount of unlabeled data to improve the performance. In order to cope with these problems, we propose a novel method termed Robust sEmi-supervised multi-lAbel DimEnsion Reduction, shortly READER. From the viewpoint of empirical risk minimization, READER selects most discriminative features for all the labels in a semi-supervised way. Specifically, the l(2,1)-norm induced loss function and regularization term make READER robust to the outliers in the data points. READER finds a feature subspace so as to keep originally neighbor instances close and embeds labels into a low-dimensional latent space nonlinearly. To optimize the objective function, an efficient algorithm is developed with convergence property. Extensive empirical studies on real-world datasets demonstrate the superior performance of the proposed method. - Koji Tabata, Atsuyoshi Nakamura, Mineichi KudoIEICE Trans. Inf. Syst. 100-D 5 994 - 1002 2017年 [査読有り][通常論文]

We propose a heuristic approximation algorithm for the 1-median problem. The 1-median problem is the problem of finding a vertex with the highest closeness centrality. Starting from a randomly selected vertex, our algorithm repeats to find a vertex with higher closeness centrality by approximately calculating closeness centrality of each vertex using simpler spanning subgraphs, which are called k-neighbor dense shortest path graphs with shortcuts. According to our experimental results using real networks with more than 10,000 vertices, our algorithm is more than 100 times faster than the exhaustive search and more than 20 times faster than the state-of-the-art approximation algorithm using annotated information to the vertices while the solutions output by our algorithm have higher approximation ratio. - Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori SugimotoIEICE Trans. Fundam. Electron. Commun. Comput. Sci. 100-A 3 865 - 876 2017年 [査読有り][通常論文]

In this paper, we introduce a self-constructive Normalized Gaussian Network (NGnet) for online learning tasks. In online tasks, data samples are received sequentially, and domain knowledge is often limited. Then, we need to employ learning methods to the NGnet that possess robust performance and dynamically select an accurate model size. We revise a previously proposed localized forgetting approach for the NGnet and adapt some unit manipulation mechanisms to it for dynamic model selection. The mechanisms are improved for more robustness in negative interference prone environments, and a new merge manipulation is considered to deal with model redundancies. The effectiveness of the proposed method is compared with the previous localized forgetting approach and an established learning method for the NGnet. Several experiments are conducted for a function approximation and chaotic time series forecasting task. The proposed approach possesses robust and favorable performance in different learning situations over all testbeds. - Keigo Kimura, Lu Sun 0001, Mineichi KudoCoRR abs/1704.02592 2017年 [査読有り][通常論文]
- Hiroyuki Hanada, Mineichi Kudo, Atsuyoshi NakamuraCoRR abs/1701.06134 2017年 [査読有り][通常論文]
- Mineichi Kudo, Keigo Kimura, Michal Haindl, Hiroshi Tenmoto2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 3603 - 3608 2016年 [査読有り][通常論文]

Visualization helps us to understand single-label and multi-label classification problems. In this paper, we show several standard techniques for simultaneous visualization of samples, features and multi-classes on the basis of linear regression and matrix factorization. The experiment with two real-life multi-label datasets showed that such techniques are effective to know how labels are correlated to each other and how features are related to labels in a given multi-label classification problem. - Batzaya Norov-Erdene, Mineichi Kudo, Lu Sun, Keigo Kimura2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 2319 - 2324 2016年 [査読有り][通常論文]

Lately, multi-label classification (MLC) problems have drawn a lot of attention in a wide range of fields including medical, web, and entertainment. The scale and the diversity of MLC problems is much larger than single-label classification problems. Especially we have to face all possible combinations of labels. To solve MLC problems more efficiently, we focus on three kinds of locality hidden in a given MLC problem. In this paper, first we show how large degree of locality exists in nine datasets, then examine how closely they are related to labels, and last propose a method of reducing the problem size using one kind of locality. - Syota Suzuuchi, Mineichi Kudo2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 2079 - 2084 2016年 [査読有り][通常論文]

Recording of the activity of people working in an office or in a living room is important for several goals: to design an evacuation route, to measure the degree of ADL (Activity of Daily Living) of single-living elderly persons, and to analyze the working contents of people, and so on. Camera systems are available for these goals, but they are weak for the light condition change (not available at dark in general) and invasive for the users' privacy. Therefore, we use an infrared ceiling sensor network for localizing moving people and use some extra pieces of evidence for identifying them. Such ID evidence is taken from a finger-vein authentication system at the entrance/exit and from the staying duration at their personal desks. The kind of activity of individuals is determined by the locations and the staying duration related to those activities, e.g., spending two minutes or more at kitchen means "cooking." A challenging task is to find how to recover missing IDs and related their activities. This sometimes happens when two or more persons cross or meet at a place or someone sits at a desk silently, because a motion sensor reacts only when something with human temperature "moves" in the focus of view. This system succeeded in recording of the activity of three persons for three hours at 87.4% and six persons for seven hours at 15.0%. - Lu Sun, Mineichi Kudo, Keigo Kimura2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 1612 - 1617 2016年 [査読有り][通常論文]

Multi-label classification has attracted many attentions in various fields, such as text categorization and semantic image annotation. Aiming to classify an instance into multiple labels, various multi-label classification methods have been proposed. However, the existing methods typically build models in the identical feature (sub) space for all labels, possibly inconsistent with real-world problems. In this paper, we develop a novel method based on the assumption that meta-labels with specific features exist in the scenario of multi-label classification. The proposed method consists of meta-label learning and specific feature selection. Experiments on twelve benchmark multi-label datasets show the efficiency of the proposed method compared with several state-of-the-art methods. - Keigo Kimura, Mineichi Kudo, Lu Sun, Sadamori Koujaku2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 438 - 443 2016年 [査読有り][通常論文]

Multi-label classification (MLC), allowing instances to have multiple labels, has been received a surge of interests in recent years due to its wide range of applications such as image annotation and document tagging. One of simplest ways to solve MLC problems is label-power set method (LP) that regards all possible label subsets as classes. LP validates traditional multi-classification classifiers such as multi-class SVM but it suffers from the increased number of classes. Therefore, several improvements have been made for LP to be scaled for large problems with many labels. Random k labELsets (RAkEL) proposed by Tsoumakas et al. solves this problem by randomly sampling a small number of labels and taking ensemble of them. However, RAkEL needs all instances for constructing each model and thus suffers from high computational complexity. In this paper, we propose a new fast algorithm for RAkEL. First, we assign each training instance to a small number of models. Then LP is applied for each model with only the assigned instances. Experiments on twelve benchmark datasets demonstrated that the proposed algorithm works faster than the conventional methods while keeping accuracy. In the best case, it was 100 times faster than baseline method (LP) and 30 times faster than the original RAkEL. - Keigo Kimura, Mineichi Kudo, Lu SunSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2016 10029 15 - 25 2016年 [査読有り][通常論文]

In this paper, unlike previous many linear embedding methods, we propose a non-linear embedding method for multi-label classification. The algorithm embeds both instances and labels into the same space, reflecting label-instance relationship, label-label relationship and instance-instance relationship as faithfully as possible, simultaneously. Such an embedding into two-dimensional space is useful for simultaneous visualization of instances and labels. In addition linear and nonlinear mapping methods of a testing instance are also proposed for multi-label classification. The experiments on thirteen benchmark datasets showed that the proposed algorithm can deal with better small-scale problems, especially in the number of instances, compared with the state-of-the-art algorithms. - Shunsuke Suzuki, Mineichi Kudo, Atsuyoshi NakamuraIEEE International Conference on Identity, Security and Behavior Analysis, ISBA 2016, Sendai, Japan, February 29 - March 2, 2016 1 - 6 IEEE 2016年 [査読有り][通常論文]
- Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori SugimotoNEURAL INFORMATION PROCESSING, ICONIP 2016, PT IV 9950 538 - 546 2016年 [査読有り][通常論文]

In this paper, we propose a weight-time-dependent (WTD) update approach for an online EM algorithm applied to the Normalized Gaussian network (NGnet). WTD aims to improve a recently proposed weight-dependent (WD) update approach by Celaya and Agostini. First, we discuss the derivation of WD from an older time-dependent (TD) update approach. Then, we consider additional aspects to improve WD, and by including them we derive the new WTD approach from TD. The difference between WD and WTD is discussed, and some experiments are conducted to demonstrate the effectiveness of the proposed approach. WTD succeeds in improving the learning performance for a function approximation task with balanced and dynamic data distributions. - Jana Backhus, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masanori SugimotoARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT I 9886 444 - 452 2016年 [査読有り][通常論文]

In this paper, a Normalized Gaussian Network (NGnet) is introduced for online sequential learning that uses unit manipulation mechanisms to build the network model self-constructively. Several unit manipulation mechanisms have been proposed for online learning of an NGnet. However, unit redundancy still exists in the network model. We propose a merge mechanism for such redundant units, and change its overlap calculation in order to improve the identification accuracy of redundant units. The effectiveness of the proposed approach is demonstrated in a function approximation task with balanced and imbalanced data distributions. It succeeded in reducing the model complexity around 11% on average while keeping or even improving learning performance. - Lu Sun 0001, Mineichi Kudo, Keigo KimuraECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE 285 261 - 268 2016年 [査読有り][通常論文]

Multi-label classification aims to assign multiple labels to a single test instance. Recently, more and more multi-label classification applications arise as large-scale problems, where the numbers of instances, features and labels are either or all large. To tackle such problems, in this paper we develop a clustering-based local multi-label classification method, attempting to reduce the problem size in instances, features and labels. Our method consists of low-dimensional data clustering and local model learning. Specifically, the original dataset is firstly decomposed into several regular-scale parts by applying clustering analysis on the feature subspace, which is induced by a supervised multi-label dimension reduction technique; then, an efficient local multi-label model, meta-label classifier chains, is trained on each data cluster. Given a test instance, only the local model belonging to the nearest cluster to it is activated to make the prediction. Extensive experiments performed on eighteen benchmark datasets demonstrated the efficiency of the proposed method compared with the state-of-the-art algorithms. - Hideaki Konno, Mineichi Kudo, Hideyuki Imai, Masanori SugimotoSpeech Commun. 83 10 - 20 2016年 [査読有り][通常論文]

We can perceive pitch in whispered speech, although fundamental frequency (F-0) does not exist physically or phonetically due to the lack of vocal-fold vibration. This study was carried out to determine how people generate such an unvoiced pitch. We conducted experiments in which speakers uttered five whispered Japanese vowels in accordance with the pitch of a guide pure tone. From the results, we derived a multiple regression function to convert the outputs of a mel-scaled filter bank of whispered speech into the perceived pitch value. Next, using this estimated pitch value as F-0, we constructed a system for conversion of whispered speech to normal speech. Since the pitch varies with time according to the spectral shape, it was expected that the pitch accent would be kept by this conversion. Indeed, auditory experiments demonstrated that the correctly perceived rate of Japanese word accent was increased from 55.5% to 72.0% compared with that when a constant F-0 was used. (C) 2016 Elsevier B.V. All rights reserved. - Sadamori Koujaku, Ichigaku Takigawa, Mineichi Kudo, Hideyuki ImaiSoc. Networks 44 143 - 152 2016年 [査読有り][通常論文]

Discovery of cohesive subgraphs is an important issue in social network analysis. As representative cohesive subgraphs, pseudo cliques have been developed by relaxing the perfection of cliques. By enumerating pseudo clique subgraphs, we can find some structures of interest such as a star-like structure. However, a little more complicated structures such as a core/periphery structure is still hard to be found by them. Therefore, we propose a novel pseudo clique called p-dense core and show the connection with the other pseudo cliques. Moreover, we show that a set of p-dense core subgraphs gives an optimal solution in a graph partitioning problem. Several experiments on real-life networks demonstrated the effectiveness for cohesive subgraph discovery. (C) 2015 Elsevier B.V. All rights reserved. - Guoliang Lu, Yiqi Zhou, Xueyong Li, Mineichi KudoMultim. Tools Appl. 75 6 3479 - 3494 2016年 [査読有り][通常論文]

To accurately recognize human actions in less computational time is one important aspect for practical usage. This paper presents an efficient framework for recognizing actions by a RGB-D camera. The novel action patterns in the framework are extracted via computing position offset of 3D skeletal body joints locally in the temporal extent of video. Action recognition is then performed by assembling these offset vectors using a bag-of-words framework and also by considering the spatial independence of body joints. We conducted extensive experiments on two benchmarking datasets: UCF dataset and MSRC-12 dataset, to demonstrate the effectiveness of the proposed framework. Experimental results suggest that the proposed framework 1) is very fast to extract action patterns and very simple in implementation; and 2) can achieve a comparable or a better performance in recognition accuracy compared with the state-of-the-art approaches. - Keigo Kimura, Mineichi Kudo, Yuzuru TanakaMach. Learn. 103 2 285 - 306 2016年 [査読有り][通常論文]

Recently orthogonal nonnegative matrix factorization (ONMF), imposing an orthogonal constraint into NMF, has been attracting a great deal of attention. ONMF is more appropriate than standard NMF for a clustering task because the constrained matrix can be considered as an indicator matrix. Several iterative ONMF algorithms have been proposed, but they suffer from slow convergence because of their matrix-wise updating. In this paper, therefore, a column-wise update algorithm is proposed for speeding up ONMF. To make the idea possible, we transform the matrix-based orthogonal constraint into a set of column-wise orthogonal constraints. The algorithm is stated first with the Frobenius norm and then with Bregman divergence, both for measuring the degree of approximation. Experiments on one artificial and six real-life datasets showed that the proposed algorithms converge faster than the other conventional ONMF algorithms, more than four times in the best cases, due to their smaller numbers of iterations. - Atsuyoshi Nakamura, Ichigaku Takigawa, Hisashi Tosaka, Mineichi Kudo, Hiroshi MamitsukaDiscret. Appl. Math. 200 123 - 152 2016年 [査読有り][通常論文]

We consider a frequent approximate pattern mining problem, in which interspersed repetitive regions are extracted from a given string. That is, we enumerate substrings that frequently match substrings of a given string locally and optimally. For this problem, we propose a new algorithm, in which candidate patterns are generated without duplication using the suffix tree of a given string. We further define a k-gap-constrained setting, in which the number of gaps in the alignment between a pattern and an occurrence is limited to at most k. Under this setting, we present memory-efficient algorithms, particularly a candidate-based version, which runs fast enough even over human chromosome sequences with, more than 10 million nucleotides. We note that our problem and algorithms for strings can be directly extended to ordered labeled trees. In our experiments we used both randomly synthesized strings, in which corrupted similar substrings are embedded, and real data of human chromosome. The synthetic data experiments show that our proposed approach extracted embedded patterns correctly and time-efficiently. In real data experiments, we examined the centers of 100 clusters computed after grouping the patterns obtained by our k-gap-constrained versions (k = 0, 1 and 2) and the results revealed that the regions of their occurrences coincided with around a half of the regions automatically annotated as Alu sequences by a manually curated repeat sequence database. (C) 2015 Elsevier B.V. All rights reserved. - Ryo Watanabe, Atsuyoshi Nakamura, Mineichi KudoOPERATIONS RESEARCH LETTERS 43 6 558 - 563 2015年11月 [査読有り][通常論文]

We improved an upper bound on the expected regret of a UCB-type policy LLR for a bandit problem that repeats the following rounds: a player selects a maximal matching on a complete bipartite graph K-M,K-N and receives a reward for each component edge of the selected matching. Rewards are assumed to be generated independently of its previous rewards according to an unknown fixed distribution. Our upper bound is smaller than the best known result (Chen et al., 2013) by a factor of Theta (M-2/3). (C) 2015 Elsevier B.V. All rights reserved. - Akira Tanaka, Hirofumi Takebayashi, Ichigaku Takigawa, Hideyuki Imai, Mineichi KudoIEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E98A 11 2315 - 2324 2015年11月 [査読有り][通常論文]

For the last few decades, learning with multiple kernels, represented by the ensemble kernel regressor and the multiple kernel regressor, has attracted much attention in the field of kernel-based machine learning. Although their efficacy was investigated numerically in many works, their theoretical ground is not investigated sufficiently, since we do not have a theoretical framework to evaluate them. In this paper, we introduce a unified framework for evaluating kernel regressors with multiple kernels. On the basis of the framework, we analyze the generalization errors of the ensemble kernel regressor and the multiple kernel regressor, and give a sufficient condition for the ensemble kernel regressor to outperform the multiple kernel regressor in terms of the generalization error in noise-free case. We also show that each kernel regressor can be better than the other without the sufficient condition by giving examples, which supports the importance of the sufficient condition. - Shuai Tao, Mineichi Kudo, Bing-Nan Pei, Hidetoshi Nonaka, Jun ToyamaIEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS 45 5 550 - 561 2015年10月 [査読有り][通常論文]

Low-cost sensor networks for multitarget tracking are increasingly becoming important equipment in many applications. A major problem is that these sensors usually provide only a binary response in each epoch, if a target is present or absent. Efficient approaches for realizing the location and tracking of multiple targets are needed. In this paper, we develop a soft tracking system using an infrared ceiling sensor network and propose a novel algorithm for tracking multiple people. In this system, 43 infrared sensors were attached to the ceiling of an office room (15.0 m x 8.5 m). Some pieces of weak evidence such as locations of personal desks, and the moving directions of people, were used for soft tracking. Through experiments, the ability of tracking in different situations was evaluated. The tracking accuracy during a 3 h period was investigated. The results showed that this system was able to track up to eight people simultaneously for hours in an office room. The tracking accuracy was above 90% most of the time, although some identity ambiguities occurred. - Sadamori Koujaku, Mineichi Kudo, Ichigaku Takigawa, Hideyuki ImaiWWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB 793 - 798 2015年 [査読有り][通常論文]

Detection of anomalous changes in social networks has been studied in various applications such as change detection of social interests and virus infections. Among several kinds of network changes, we concentrate on the structural changes of relatively small stationary communities. Such a change is important because it implies that some crucial changes have happened in a special group, such as dismiss of a board of directors. One difficulty is that we have to do this in a noisy environment. This paper, therefore, proposes an algorithm that finds stationary communities in a noisy environment. Experiments on two real networks showed the advantages of our proposed algorithm. - Ayako Mikami, Mineichi Kudo, Atsuyoshi NakamuraMULTIPLE CLASSIFIER SYSTEMS (MCS 2015) 9132 27 - 37 2015年 [査読有り][通常論文]

Ensemble learning is a strong tool to strengthen weak classifiers. A large amount of diversity among those weak classifiers is a key to accelerate the effectiveness. Therefore, many diversity measures on a given training sample set have been proposed so far. However, they are almost all based on the oracle output that is one if the class predicted by the classifier is correct, zero otherwise. We point out such an oracle output scheme is not appropriate for the problems of more than two classes, and extend one of the most popular diversity measures, disagreement measure, to multi-class cases. On the other hand, the concept of margin has been recognized as an analytic tool to measure the generalization performance of a given classifier. Therefore, we analyze when some criteria for maximizing margins of an ensemble classifier over training samples are maximized under the assumption that the average accuracy of the base classifiers is constant. We also reveal the relationship between those criteria and the extended disagreement measure. As a result, it turns out that diversity is necessary not only over samples but also over predicted classes, if we want to extract the highest potential of ensemble. - Lu Sun, Mineichi KudoProceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015 3834 - 3840 AAAI Press 2015年 [査読有り][通常論文]
- Keigo Kimura, Mineichi Kudo2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) 805 - 810 2015年 [査読有り][通常論文]

Nonnegative Tensor Factorization (NTF) has become a popular tool for extracting informative patterns from tensor data. However, NTF has high computational cost both in space and in time, mostly in iterative calculation of the gradient. In this paper, we consider variable selection to reduce the cost, assuming sparsity of the factor matrices. In fact, it is known that the factor matrices are often very sparse in many applications such as network analysis, text analysis and image analysis. We update only a small subset of important variables in each iterative step. We show the effectiveness of the algorithm analytically and experimentally in comparison with conventional NTF algorithms. The algorithm was five times faster than the naive algorithm in the best case and required one to five hundred times less memory while keeping the approximation accuracy as the same. - Koji Tabata, Atsuyoshi Nakamura, Mineichi KudoDISCOVERY SCIENCE, DS 2015 9356 275 - 283 2015年 [査読有り][通常論文]

We developed an algorithm to exactly solve an influence maximization problem (MaxInf) for a two-terminal series parallel graph (TTSPG) in the independent cascade model. The class of TTSPGs can be considered as a class wider than that of trees, only for which an efficient exact solver of this problem has been developed so far. Our algorithm calculates candidate node sets in the divide-and-conquer manner keeping the number of them as small as possible by efficiently eliminating unnecessary ones in merge of subproblems' solutions. Furthermore, we propose a way of converting an arbitrary network to a TTSPG with edges important for propagation to apply our method to real networks. According to our empirical results, our method is significantly faster than the greedy approximation algorithm for MaxInf of a TTSPG. We also demonstrate improvement of solutions by converting to TTSPGs instead of trees using real networks made from DBLP datasets. - Michal Haindl, Stanislav Mikes, Mineichi KudoCOMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2015, PT I 9256 261 - 273 2015年 [査読有り][通常論文]

An unsupervised, illumination invariant, multi-spectral, multi-resolution, multiple-segmenter for textured images with unknown number of classes is presented. The segmenter is based on a weighted combination of several unsupervised segmentation results, each in different resolution, using the modified sum rule. Multi-spectral textured image mosaics are locally represented by eight causal directional multi-spectral random field models recursively evaluated for each pixel. The single-resolution segmentation part of the algorithm is based on the underlying Gaussian mixture model and starts with an over segmented initial estimation which is adaptively modified until the optimal number of homogeneous texture segments is reached. The performance of the presented method is extensively tested on the Prague segmentation benchmark both on the surface reflectance field textures as well as on the static colour textures using the commonest segmentation criteria and compares favourably with several leading alternative image segmentation methods. - Anton Milan, Stefan Roth, Konrad Schindler, Mineichi KudoCOMPUTER VISION - ACCV 2014 WORKSHOPS, PT III 9010 519 - 530 2015年 [査読有り][通常論文]

Automated people tracking is important for a wide range of applications. However, typical surveillance cameras are controversial in their use, mainly due to the harsh intrusion of the tracked individuals' privacy. In this paper, we explore a privacy-preserving alternative for multi-target tracking. A network of infrared sensors attached to the ceiling acts as a low-resolution, monochromatic camera in an indoor environment. Using only this low-level information about the presence of a target, we are able to reconstruct entire trajectories of several people. Inspired by the recent success of offline approaches to multi-target tracking, we apply an energy minimization technique to the novel setting of infrared motion sensors. To cope with the very weak data term from the infrared sensor network we track in a continuous state space with soft, implicit data association. Our experimental evaluation on both synthetic and real-world data shows that our principled method clearly outperforms previous techniques. - Xavier Llad'o, Atsushi Imiya, David Mason, Constantino, Carlos Reyes-Aldasoro, Kazuaki Aoki, Mineichi Kudo, Yu-Jin Zhang, Vasileios ArgyriouPattern Recognition Letters 54 109 2015年 [査読有り][通常論文]
**マルチカーネル回帰とアンサンブルカーネル回帰の汎化誤差解析**田中 章, 瀧川 一学, 今井 英幸, 工藤 峰一第29回信号処理シンポジウム講演論文集 120 - 123 2014年11月 [査読無し][通常論文]- Hiroyuki Hanada, Mineichi Kudo, Atsuyoshi NakamuraTHEORETICAL COMPUTER SCIENCE 530 23 - 41 2014年04月 [査読有り][通常論文]

In this paper we consider the problem of similar substring searching in the q-gram distance. The q-gram distance d(q)(x, y) is a similarity measure between two strings x and y defined by the number of different q-grams between them. The distance can be used instead of the edit distance due to its lower computation cost, O(|x| + |Y|) vs. O(|x||Y|). and its good approximation for the edit distance. However, if this distance is applied to the problem of finding all similar strings, in a long text t, to a given pattern p, the total computation cost is sometimes not acceptable. Ukkonen already proposed two fast algorithms: one with an array and the other with a tree. When "similar" means k or less in dq, their time complexities are O(|t|k + |P|) and O(|t| log k + |p|). respectively. In this paper, we propose two algorithms of average-case complexity O(|t| + |p|). although their worst-case complexities are still O(|t|k + |P|) and O(|t| log k + |p|). respectively. The linearity of the average-case complexity is analyzed under the assumption of random sampling of t and the condition that q is larger than a threshold. The algorithms exploit the fact that similar substrings in t are often found at very close positions if the beginning positions of the substrings are close. In the second proposed algorithm, we adopted a doubly-linked list supported by an array and a search tree to search for a list element in O(log k) time. Experimental results support their theoretical average-case complexities. (C) 2014 Elsevier B.V. All rights reserved. - Koji Ouchi, Atsuyoshi Nakamura, Mineichi KudoPATTERN RECOGNITION 47 3 1459 - 1468 2014年03月 [査読有り][通常論文]

We develop efficient construction methods of a rectangle greedy cover (RGC), and evaluate its usefulness in applications. An RGC is a greedy cover of the set of given positive instances by exclusive axis-parallel hyperrectangles, namely, axis-parallel hyperrectangles that exclude all the given negative instances. An RGC is expected to be a compact classification rule with high readability because the number of its component rectangles is expected to be small and it can be seen as a disjunctive normal form, which is one of the most readable representations for us. We propose two approaches of RGC construction: enumeration approach and direct approach. In enumeration approach, the maximal exclusive positive subsets (MEPSs) are enumerated first and then an ordinary greedy set covering is done using the enumerated MEPSs. We make clear the relation between enumeration of the maximal frequent itemsets and enumeration of the MEPSs, and convert an efficient enumeration algorithm LCMmax [1] of maximal frequent itemsets to an enumeration algorithm LCMmax.R-naive of MEPSs. We also develop a more efficient version of LCMmax.R-naive, or LCMmax.R, by incorporating effective dynamic reordering of instances using excluded frequency and bit-parallel exclusiveness check. In direct approach, each component MEPS of an RGC is searched not from enumerated MEPSs but directly from the dataset that consists of the remaining uncovered positive instances and the whole negative instances. We developed an algorithm called MRF that efficiently finds an maximum-sized MEPS for given positive and negative instances. MRF is made from LCMmax.R by modifying it so as to find a maximum-sized MEPS only. An RGC is constructed by MRF repetition, that is, by repeatedly executing MRF using the remaining uncovered positive instances. According to our experimental evaluation using UCI-repository datasets, LCMmax.R was about 5-11 times faster than LCMmax.Rnaive, which indicates effectiveness of the introduced two improvements. MRF repetition, however, was significantly faster than LCMmax.R, and it was fast enough for practical usage. The experimental results using UCI-repository datasets also showed that accuracy of a nearest rectangle classifier using an RGC is close to that using the hyperrectangles output by the randomized subclass method (RSM) [2] though the number of component rectangles of an RGC is significantly smaller than the number of the hyperrectangles output by RSM. The performance of RGC was also shown to be comparable to that of the six popular classifiers including logistic regression and support vector machine. The disjunctive normal form representation of the classification rules obtained by RGC was demonstrated to be simpler and more readable for us than that obtained by RSM and C4.5. (C) 2013 Elsevier Ltd. All rights reserved. - Guoliang Lu, Mineichi KudoNEUROCOMPUTING 123 328 - 336 2014年01月 [査読有り][通常論文]

A new framework is presented for single-person oriented action recognition. This framework does not require detection/location of bounding boxes of human body nor motion estimation in each frame. The novel descriptor/pattern for action representation is learned with local temporal self-similarities (LTSSs) derived directly from difference images. The bag-of-words framework is then employed for action classification taking advantages of these descriptors. We investigated the effectiveness of the framework on two public human action datasets: the Weizmann dataset and the KTH dataset In the Weizmann dataset, the proposed framework achieves a performance of 95.6% in the recognition rate and that of 91.1% in the KTH dataset, both of which are competitive with those of state-of-the-art approaches, but it has a high potential to achieve a faster execution performance. (C) 2013 Elsevier B.V. All rights reserved. - Hideaki Konno, Rinako Sato, Hideyuki Imai, Mineichi KudoAsia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014, Chiang Mai, Thailand, December 9-12, 2014 1 - 4 IEEE 2014年 [査読有り][通常論文]
- Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi KudoProceedings of the Sixth Asian Conference on Machine Learning, ACML 2014, Nha Trang City, Vietnam, November 26-28, 2014. JMLR.org 2014年 [査読有り][通常論文]
- Keigo Kimura, Yuzuru Tanaka, Mineichi KudoProceedings of the Sixth Asian Conference on Machine Learning, ACML 2014, Nha Trang City, Vietnam, November 26-28, 2014. JMLR.org 2014年 [査読有り][通常論文]
- Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi KudoSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 8621 273 - 281 2014年 [査読有り][通常論文]

Kernel-based learning is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. For the last few decades, a combination of different learning machines so-called ensemble learning, which includes learning with multiple kernels, have attracted much attention in this field. Although its efficacy was revealed numerically in many works, its theoretical grounds are not investigated sufficiently. In this paper, we discuss regression problems with a class of kernels and show that the generalization error by an ensemble kernel regressor with the class of kernels is smaller than the averaged generalization error by kernel regressors with each kernel in the class. - Hiroshi Tsukioka, Mineichi Kudo2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) 1591 - 1596 2014年 [査読有り][通常論文]

In some dynamic environments, the degree of importance of features for classification varies over time. For example, if we want to identify the kinds of birds in a forest, different groups of birds might sing in different time periods. Then we have to change the features to identify the kinds of a bird, e.g., frequency, depending on time of observation. This study deals with such a sequence of feature subsets changing their importance over time. We assume that such a change happens gradually, that is, the case of population drift. To track drifting distributions, we use volume prototypes with a forgetting factor and on the basis of volume prototypes at each time period we extract feature subsets useful for that time. - Kenshiro Nishikawa, Mineichi KudoACTIVITY MONITORING BY MULTIPLE DISTRIBUTED SENSING 8703 64 - 72 2014年 [査読有り][通常論文]

We propose a simple method for measuring group sleepiness, aiming at making clear how the environmental factor affects such sleepiness. A unit to measure the angle of head slant and the acceleration of head moving was attached to one side of ears with a bandana. A video interface was also developed to display how many students in a classroom are awake, felling sleepiness, or sleeping with the corresponding colors. The experiments showed the existence of some different patterns in the way of falling asleep. We revealed that level prediction of group sleepiness is easier than that of individual sleepiness. Indeed, the standard error was reduced from 0.695 (for individual) into 0.423 (for group), showing almost perfect prediction of sleepiness level of a group of five persons. - Tomomi Endo, Kazuhiro Omura, Mineichi KudoIJPRAI 28 7 2014年 [査読有り][通常論文]
**"A Sufficient Condition For Reproducing Kernel Hilbert Spaces Being Separable**田中 章, 今井 英幸, 工藤 峰一第28回信号処理シンポジウム講演論文集 350 - 354 2013年11月 [査読無し][通常論文]- Atsuyoshi Nakamura, Tomoya Saito, Ichigaku Takigawa, Mineichi Kudo, Hiroshi MamitsukaDiscrete Applied Mathematics 161 10-11 1556 - 1575 2013年07月 [査読有り][通常論文]

A string with many repetitions can be represented compactly by replacing h-fold contiguous repetitions of a string r with (r)h. We present a compact representation, which we call a repetition representation (of a string) or RRS, by which a set of disjoint or nested tandem arrays can be compacted. In this paper, we study the problem of finding a minimum RRS or MRRS, where the size of an RRS is defined by the sum of the length of component letters and the description length of the component repetitions (ṡ)h which is defined by ER (h) using a repetition weight function wR. We develop two dynamic programming-based algorithms to solve this problem: CMR, which works for any type of wR, and CMR-C, which is faster but can be applied to a constant wR only. CMR-C is an O(n2logn)-time O(nlogn)-space algorithm, which is more efficient in both time and space than CMR by a ((logn)/n)-factor, where n is the length of the given string. The problem of finding an MRRS for a string can be extended to that of finding a minimum repetition representation (of a tree) or MRRT for a given labeled ordered tree. For this problem, we present two algorithms, CMRT and CMRT-C, by using CMR and CMR-C, respectively, as a subroutine. As well as the theoretical analysis, we confirmed the efficiency of the proposed algorithms by experiments, which consist of the following three parts: First we demonstrated that CMR-C and CMRT-C are fast enough for large-scale data by using synthetic strings and trees, respectively. The size of an MRRS for a given string can be a measure of how compactly the string can be represented, meaning how well the string is structurally organized. This is also true of trees. To check such ability of MRRS-size, second we measured the size of an MRRS for chromosomes of nine different species. We found that all the chromosomes of the same species have a similar compression rate when realized by an MRRS. Run length encoding (RLE) was also shown to have species-specific compression rate, but species were separated more clearly by MRRS than by RLE. Third we examined the size of an MRRT for web pages of world-leading companies by using the tag trees, showing a consistency between the compression rate by an MRRT and visual web page structures. © © 2013 Elsevier B.V. All rights reserved. - Tomomi Endo, Mineichi KudoProgress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 18th Iberoamerican Congress, CIARP 2013, Havana, Cuba, November 20-23, 2013, Proceedings, Part I 8258 1 149 - 156 Springer 2013年 [査読有り][通常論文]

A weighted naïve Bayes classifier using Renyi entropy is proposed. Such a weighted naïve Bayes classifier has been studied so far, aiming at improving the prediction performance or at reducing the number of features. Among those studies, weighting with Shannon entropy has succeeded in improving the performance. However, the reasons of the success was not well revealed. In this paper, the original classifier is extended using Renyi entropy with parameter α. The classifier includes the regular naïve Bayes classifier in one end (α = 0.0) and naïve Bayes classifier weighted by the marginal Bayes errors in the other end (α = ∞). The optimal setting of α has been discussed analytically and experimentally. © Springer-Verlag 2013. - Guoliang Lu, Mineichi Kudo, Jun ToyamaPattern Recognition Letters 34 15 1936 - 1944 2013年 [査読有り][通常論文]

Temporal segmentation of successive actions in a long-term video sequence has been a long-standing problem in computer vision. In this paper, we exploit a novel learning-based framework. Given a video sequence, only a few characteristic frames are selected by the proposed selection algorithm, and then the likelihood to trained models is calculated in a pair-wise way, and finally segmentation is obtained as the optimal model sequence to realize the maximum likelihood. The average accuracy on IXMAS dataset reached to 80.5% at frame level, using only 16.5% of all frames in computation time of 1.57 s per video which has 1160 frames on the average. Crown Copyright © 2012 Published by Elsevier B.V. All rights reserved. - Guoliang Lu, Mineichi KudoIEICE Transactions on Information and Systems E96-D 5 1238 - 1242 2013年 [査読有り][通常論文]

Temporal Self-Similarity Matrix (SSM) based action recognition is one of the important approaches of single-person oriented action analysis in computer vision. In this study, we propose a new kind of SSM and a fast computation method. The computation method does not require time-consuming pre-processing to find bounding boxes of the human body, instead it processes difference images to obtain action patterns which can be done very quickly. The proposed SSM is experimentally confirmed to have high power/capacity to achieve a better classification performance than four typical kinds of SSMs. Copyright © 2013 The Institute of Electronics, Information and Communication Engineers. - Koujaku Sadamori, Kudo Mineichi, Takigawa Ichigaku, Imai HideyukiWORLD CONGRESS ON ENGINEERING - WCE 2013, VOL I 324 - + 2013年 [査読有り][通常論文]
- Hideaki Konno, Hideo Kanemitsu, Nobuyuki Takahashi, Mineichi Kudo2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings 245 - 249 2013年 [査読有り][通常論文]

The characteristics of whispered speech are not well known. The most remarkable difference from ordinal speech is the pitch (the height of speech), since whispered speech has no fundamental frequency. In this study, we have tried to reveal the mechanism of producing pitch in whispered speech through an experiment in which a male and a female subjects uttered Japanese whispered vowels in a way so as to tune their pitch to the guidance tone with different five to nine frequencies. We applied multivariate analysis such as the principal component analysis to the data in order to make clear which part of frequency contributes much to the change of pitch. We have succeeded in endorsing the previous observations, i.e. shift of formants is dominant, with more detailed numerical evidence. In addition, we obtained some implications to approach the pitch mechanism of whispered speech. The main result obtained is that two or three formants of less than 5 kHz are shifted upward and the energy is increased in high frequency region over 5 kHz. © 2013 IEEE. - Yingmei Pia, Mineichi Kudo2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013) 882 - 886 2013年 [査読有り][通常論文]

Human age estimation based on facial images has many potential applications in practice. However, the current age estimation techniques are not matured. Most studies focus only on neutral faces, that is, expressionless faces. Several expressions such as happy expression, may help to improve the prediction accuracy. Recently, some works reported that expressions could badly impact on the accuracy. In this paper, we investigated the degree of facial expression impact on age prediction subjectively and objectively. It was revealed that expressions do not contribute for age prediction so much. - Shuai Tao, Mineichi Kudo, Hidetoshi NonakaSENSORS 12 12 16920 - 16936 2012年12月 [査読有り][通常論文]

An infrared ceiling sensor network system is reported in this study to realize behavior analysis and fall detection of a single person in the home environment. The sensors output multiple binary sequences from which we know the existence/non-existence of persons under the sensors. The short duration averages of the binary responses are shown to be able to be regarded as pixel values of a top-view camera, but more advantageous in the sense of preserving privacy. Using the "pixel values" as features, support vector machine classifiers succeeded in recognizing eight activities (walking, reading, etc.) performed by five subjects at an average recognition rate of 80.65%. In addition, we proposed a martingale framework for detecting falls in this system. The experimental results showed that we attained the best performance of 95.14% (F-1 value), the FAR of 7.5% and the FRR of 2.0%. This accuracy is not sufficient in general but surprisingly high with such low-level information. In summary, it is shown that this system has the potential to be used in the home environment to provide personalized services and to detect abnormalities of elders who live alone. - Guoliang Lu, Mineichi Kudo, Jun ToyamaIEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E95D 10 2514 - 2521 2012年10月 [査読有り][通常論文]

Vision based human action recognition has been an active research field in recent years. Exemplar matching is an important and popular methodology in this field, however, most previous works perform exemplar matching on the whole input video clip for recognition. Such a strategy is computationally expensive and limits its practical usage. In this paper, we present a martingale framework for selection of characteristic frames from an input video clip without requiring any prior knowledge. Action recognition is operated on these selected characteristic frames. Experiments on 10 studied actions from WEIZMANN dataset demonstrate a significant improvement in computational efficiency (54% reduction) while achieving the same recognition precision. - 田畑 公次, 中村 篤祥, 工藤 峰一人工知能基本問題研究会 86 0 77 - 82 人工知能学会 2012年08月09日 [査読無し][通常論文]
- 渡邊 僚, 中村 篤祥, 工藤 峰一人工知能基本問題研究会 86 0 29 - 34 人工知能学会 2012年08月09日 [査読無し][通常論文]
- NAKAMURA Atsuyoshi, TAKIGAWA Ichigaku, TOSAKA Hisashi, KUDO Mineichi, MAMITSUKA Hiroshi人工知能学会人工知能基本問題研究会資料 85th 23-28 2012年01月26日 [査読無し][通常論文]
- SSPR/SPR 2012 7626 Springer 2012年 [査読有り][通常論文]
- Shuai Tao, Mineichi Kudo, Hidetoshi Nonaka, Jun ToyamaCONSTRUCTURING AMBIENT INTELLIGENCE 277 119 - 127 2012年 [査読有り][通常論文]

Person authentication and activities analysis are indispensable for providing various personalized services in a smart home/office environment. In this study, we introduce a person localization algorithm using an infrared ceiling sensor network, and realize person authentication anywhere and anytime. The key problem is how to distinguish different persons meeting at the same position. We solve this problem by different moving directions depending on individuals. Furthermore, with the locations and the known identities, multiple persons can be tracked and their interactive behaviors can be analyzed by our system. - Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi KudoSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 7626 345 - 353 2012年 [査読有り][通常論文]

Learning based on kernel machines is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. An appropriate model selection is required in order to obtain desirable learning results. In our previous work, we discussed a class of kernels forming a nested class of reproducing kernel Hilbert spaces with an invariant metric and proved that the kernel corresponding to the smallest reproducing kernel Hilbert space, including an unknown true function, gives the best model. In this paper, we relax the invariant metric condition and show that a similar result is obtained when a subspace with an invariant metric exists. - Kazuhiro Omura, Mineichi Kudo, Tomomi Endo, Tetsuya Murai12th International Conference on Intelligent Systems Design and Applications, ISDA 2012, Kochi, India, November 27-29, 2012 865 - 870 IEEE 2012年 [査読有り][通常論文]

Recently we face classification problems with many categorical features, as seen in genetic data and text data. In this paper, we discuss some ways to give weights on features in the framework of naive Bayes classifier, that is, under independent assumption of features. Because no order exists in a categorical feature, we consider a histogram over possible values (bins) in the feature. Taking into the difference of number of samples falling in each bin, we propose two kinds of weights: 1) one is derived from the probability that the majority class takes the majority even in samples, and 2) another reflects the expected conditional entropy. With the latter entropy weight, it will be shown that more discriminative features gain higher weights and non-discriminative feature diminishes as the number of samples goes infinity. We reveal the properties of these two kinds of weights through artificial data and some real-life data. - Hironobu Yasuda, Mineichi Kudo2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA) 859 - 864 2012年 [査読有り][通常論文]

Automatic speech recognition of conversational speech has not reached the level of practical use yet. Possible reasons are 1) multiple speakers speak in turn or sometimes at the same time, and 2) they speak very casually. One of key features to characterize conversational speech is the speaking speed. By knowing the speaking speed it becomes possible to adapt a recognition system to speech at that speed. Therefore, in this paper, we aim to estimate the speech rate in real time in order to choose a recognition model suitable for the speech rate. This paper introduces a probabilistic method for estimating the speech rate as well as the changing points of speech rates. We use a martingale framework in two stages to this goal. For examining the effectiveness of our method, two experiments are conducted. First, phoneme-level speech rate change detection is tried in both of reading and conversational speech. Second, the detection of the changing points of word-level speech rates is tried. - Guoliang Lu, Mineichi Kudo, Jun Toyama2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012) 3268 - 3271 2012年 [査読有り][通常論文]

For achieving efficient action recognition, some recent works propose to select a smaller number of frames in a video sequence instead of the entire sequence of frames. In this study, we propose to represent a frame by a combination of local and global descriptors instead of the silhouette used in our previous approach aiming at frame selection. Action recognition is then executed on the basis of the selected frames. The experiment on KTH database shows that the selected frames by the proposed framework are, in the minimum number to achieve the best recognition rate, better than those by two compared selection ways. - Shuai Tao, Mineichi Kudo, Hidetoshi Nonaka, Jun Toyama2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012) 1759 - 1762 2012年 [査読有り][通常論文]

A ceiling sensor system is reported in this study to recognize different activities of multiple persons in the home environment. The sensors output binary sequences by which we know the existence/nonexistence of persons under the sensors. A short-period average of the binary response is shown to be regarded as a pixel value of a top view camera, but the camera-like view is more advantage in the sense of preserving privacy. Using the "pixel values" as features, support vector machine (SVM) classifier succeeded to recognize eight activities of five subjects at average recognition rate of 80.10%. This accuracy is not sufficient in general but surprisingly high with such low-level information. - Koji Tabata, Atsuyoshi Nakamura, Mineichi KudoLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7569 169 - 183 2012年 [査読有り][通常論文]

We present a fast approximation algorithm for the 1-median problem. Our algorithm can be applied to metric undirected graphs with node weight. Given a node v, our algorithm repeatedly finds a better node by making use of a shortest path tree of the previous node. We empirically show that our algorithm runs much faster and has better approximation ratio than a sophisticated existing method called DTZ. We demonstrate the effectiveness of our algorithm through experiments. © 2012 Springer-Verlag Berlin Heidelberg. - Shen Pan, Mineichi KudoTrans. MLDM 5 1 45 - 62 2012年 [査読有り][通常論文]
- Atsuyoshi Nakamura, Hisashi Tosaka, Mineichi KudoProceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012 54 - 59 2012年 [査読有り][通常論文]

We propose a novel frequent approximate pattern mining in which occurrences themselves are valuable regions to extract. Given a string , our mining task is to enumerate its substrings that locally optimally match many substrings of . We show an algorithm for this problem whose time and space complexities are (3) for a string with length . According to our experiments using synthetic data, our algorithm can correctly extract embedded frequent patterns as both patterns and their occurrences even if the embedded patterns are corrupted by random edit operations. © 2012 IEEE. - Hiroyuki Hanada, Mineichi Kudo, Atsuyoshi Nakamura2012 6TH INTERNATIONAL CONFERENCE ON NEW TRENDS IN INFORMATION SCIENCE, SERVICE SCIENCE AND DATA MINING (ISSDM2012) 544 - 549 2012年 [査読有り][通常論文]

The q-gram distance d(q)(x, y) between two strings x and y is a string similarity measure correlated with a famous string distance: the edit distance. In addition, it can be computed much faster, in linear (O(vertical bar x vertical bar + vertical bar y vertical bar) time, than the edit distance in quadratic (O(vertical bar x parallel to y vertical bar) time, where I. I denotes the stmg length. However, it does not mean that we can find all substrmgs of a text t similar to a pattern p in linear time. In this paper we will propose a searching algorithm achieving quasi-linear (O(vertical bar t vertical bar log vertical bar p vertical bar + vertical bar p vertical bar) time by the q-gram distance. - Tetsuji Takahashi, Mineichi Kudo, Atsuyoshi NakamuraPATTERN RECOGNITION LETTERS 32 16 2224 - 2230 2011年12月 [査読有り][通常論文]

We propose an algorithm to approximate each class region by a small number of approximated convex hulls and to use these for classification. The classifier is one of non-kernel maximum margin classifiers. It keeps the maximum margin in the original feature space, unlike support vector machines with a kernel. The construction of an exact convex hull requires an exponential time in dimension, so we find an approximate convex hull (a polyhedron) instead, which is constructed in linear time in dimension. We also propose a model selection procedure to control the number of faces of convex hulls for avoiding over-fitting, in which a fast procedure is adopted to calculate an upper-bound of the leave-one-out error. In comparison with support vector machines, the proposed approach is shown to be comparable in performance but more natural in the extension to multi-class problems. (C) 2011 Elsevier B.V. All rights reserved. - 田畑 公次, 中村 篤祥, 工藤 峰一電子情報通信学会技術研究報告. COMP, コンピュテーション 111 256 7 - 14 一般社団法人電子情報通信学会 2011年10月14日 [査読無し][通常論文]

Closeness Centralityはグラフにおけるノードの中心性を表す尺度の一つである.あるノードのCloseness Centralityは,そのノードから他のすべてのノードまでの距離の和の逆数として計算される.本報告では,無向グラフに対しCloseness Centralityが大きいノードを高速に発見するアルゴリズムを提案する.提案法は,与えられたグラフのノードvに対し,vを根とする最短経路木Tを構築し,TにおいてCloseness Centerを新たなvとして同じ手続きを繰り返すアルゴリズムである.頂点の次数分布がべき乗則に従う場合には,高い確率でCloseness Centralityがほぼ最大のノードを見つけることができることが実験的に示された. - 金光 秀雄, 今野 英明, 工藤 峰一, 宮腰 政明, 新保 勝電子情報通信学会論文誌. A, 基礎・境界 94 8 604 - 620 一般社団法人電子情報通信学会 2011年08月01日 [査読無し][通常論文]

閉区間に有限個の狭義の極小点を有する一変数多峰関数の大域的最適化問題において,関数の極小値が(下へ)単峰列で各極小点の単峰領域幅が等しい関数を定義・考察し,その関数の大域的最小点を求める手法を提案する.また,提案手法が大域解を見出す理論的な保証を与え,簡単な数値実験でその有効性を示す.次に,極小値が単峰列で単峰領域幅が等しい多峰関数を成分とする変数分離型の目的関数を矩形探索領域上で最小化する問題に対して,提案手法を繰返し用いる大域的最適化手法を与える.2〜1000変数のテスト関数に対する数値実験の結果から,本手法が問題の特殊構造を活用して従来法と同等若しくは非常に少ない関数評価回数で最小点を見出すことを示す.この主題に関わる論文は全体で2部構成になっており,本論文はその中の第1部で,ここでは限定された特殊構造をもつ多峰目的関数の数理構造を明らかにし,その数理構造を利用して大域解を見出す理論的な保証のあるアルゴリズムを提案することに主眼をおいている.第2部では,本論文の手法の改良と,より緩和された問題に対する実用的な手法の提案に主眼をおく. - Shen Pan, Mineichi KudoCOMPUTERS AND ELECTRONICS IN AGRICULTURE 75 2 250 - 260 2011年02月 [査読無し][通常論文]

A set of pores is one of the key features in the process of wood identification. A novel method to segment pores in a wood microscopic image is proposed in this paper. The method has three steps. In the first step, structuring elements with several sizes are employed in a mathematical morphology algorithm to enhance the pores and remove other tissues in a wood microscopic image. Then the best structuring element size is determined according to the results obtained from the previous step. Finally, a binary image that includes only pores is obtained by an adaptive thresholding. Experimental results show the efficiency of the method. (C) 2010 Elsevier B.V. All rights reserved. - Shen Pan, Mineichi KudoCOMPUTERS AND ELECTRONICS IN AGRICULTURE 75 2 250 - 260 2011年02月 [査読無し][通常論文]

A set of pores is one of the key features in the process of wood identification. A novel method to segment pores in a wood microscopic image is proposed in this paper. The method has three steps. In the first step, structuring elements with several sizes are employed in a mathematical morphology algorithm to enhance the pores and remove other tissues in a wood microscopic image. Then the best structuring element size is determined according to the results obtained from the previous step. Finally, a binary image that includes only pores is obtained by an adaptive thresholding. Experimental results show the efficiency of the method. (C) 2010 Elsevier B.V. All rights reserved. - Hisataka Nakane, Jun Toyama, Mineichi KudoProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 490 - 495 2011年 [査読有り][通常論文]

Fatigue is considered a contributing factor to accidents and illnesses at work. Hence, detecting fatigue by routinely monitoring an individual's person health is useful in preventing accidents and illnesses. There are many conventional methods for detecting fatigue However, these methods require users to monitor the extent of their fatigue with gauges and instruments that can restrict users' actions. On the other hand, fatigue detection using posturography can mitigate this problem. This study focuses on posturography in seated position, because sitting is quite a common action in daily life, and posturography in seated position can be measured without restricting the users' movement. Posturography in seated position is measured when subjects sit on a chair attached to a pressure sensor sheet. We formed an assumption that the more tired a person is, the more rapidly the COP (center of pressure) converges toward a point. Based on this assumption, we investigated the difference between postural sway in seated position as subjects went from experiencing a condition of fatigue to non-fatigue. Experimental result showed that the COP for left-right axis for subjects experiencing fatigue converged earlier than for those experiencing non-fatigue. © 2011 IEEE. - Guoliang Lu, Mineichi Kudo, Jun ToyamaProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 433 - 438 2011年 [査読有り][通常論文]

Robust human pose estimation from the given visual observations has attracted many attentions in the past two decades. However, this problem is still challenging due to the suituation that observations are often corrupted with partial occlusions or noise pollutions or both in real-world applications. In this paper, we propose to estimate human pose by using robust silhouette matching in original rectangle-coordinate space. In addition, human action model is employed to determinate reasonable matching results. Experimental results on robustness sequence of Weizman dataset reveal that our proposed approach can estimate human pose robustly and reasonably when pose observations are corrupted with partial occlusions or noise pollutions. © 2011 IEEE. - Hiroyuki Hanada, Atsuyoshi Nakamura, Mineichi KudoProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 231 - 236 2011年 [査読有り][通常論文]

The edit distance is a basic string similarity measure for many applications such as string searching, text mining, signal processing, bioinformatics and so on. However, its high computational cost often prevents it from being used for a large set of strings like similar string searching. A promising solution for the problem is to approximate the edit distance with low computational cost. However, although there are many methods for approximating the edit distance, most of them are analyzed only theoretically. In fact, most of the methods can evaluate the edit distance only in terms of order notations, and do not conduct any experiment. This is a large obstacle for applying them to real applications. In this study we will first list up existing edit distance approxi-mation methods. Then we compare them by: (i) approximation performances shown by the original authors, (ii) approximation performances re-analyzed by us (concrete values instead of the order notations) and (iii) experimental performances by our implementations. © 2011 IEEE. - Guoliang Lu, Mineichi Kudo, Jun ToyamaCOMPUTER ANALYSIS OF IMAGES AND PATTERNS: 14TH INTERNATIONAL CONFERENCE, CAIP 2011, PT 2 6855 413 - 420 2011年 [査読有り][通常論文]

Foreground detection in dynamic background is one of challenging problems in many vision-based applications. In this paper, we propose a hierarchical foreground detection algorithm in the HSL color space. With the proposed algorithm, the experimental precision in five testing sequences reached to 56.46%, which was the best among compared four methods. - Shuai Tao, Mineichi Kudo, Hidetoshi Nonaka, Jun ToyamaCOMPUTER ANALYSIS OF IMAGES AND PATTERNS: 14TH INTERNATIONAL CONFERENCE, CAIP 2011, PT 2 6855 122 - 129 2011年 [査読有り][通常論文]

Person localization and identification are indispensable to provide various personalized services in an intelligent environment. We propose a novel method for person localization and developed a system for identifying up to ten persons in an office room to realize soft authentication. Our system consists of forty-three infrared ceiling sensors with low cost and easy installation. In experiments, the average distance error of person localization was 31.6cm that is an acceptable error for sensors with 1.5m distance to each other. We also confirmed that walking path and speed gives sufficient information for authenticating the user. Through the experiments, we obtained the correct recognition rates of 98%, 95% and 86% for any pair, any three people and all ten people to identify individuals. - NAKAMURA Atsuyoshi, KUDO MineichiLect Notes Comput Sci 6635 234-245 - 245 Springer 2011年 [査読有り][通常論文]
- Shen Pan, Mineichi KudoLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6870 147 - 160 2011年 [査読有り][通常論文]

The size and configuration of pores are key features for wood identification. In this paper, these features are extracted and then used for construction of a decision tree to recognize three different kinds of pore distributions in wood microscopic images. The contribution of this paper lies in three aspects. Firstly, two different sets of features about pores were proposed and extracted Secondly, two decision trees were built with those two sets by C4.5 algorithm Finally, the acceptable recognition results of up to 75.6% were obtained and the possibility to improve was discussed. © 2011 Springer-Verlag. - TANAKA Akira, IMAI Hideyuki, KUDO Mineichi, MIYAKOSHI MasaakiProc IEEE Int Conf Acoust Speech Signal Process 2011 Vol.3 2072-2075 - 2075 IEEE 2011年 [査読有り][通常論文]
- Koji Ouchi, Atsuyoshi Nakamura, Mineichi KudoProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 533 - 538 2011年 [査読無し][通常論文]

We discuss efficient construction and usefulness of greedy covers of positive instances by axis-parallel rectangles that exclude negative instances. A rectangle greedy cover is expected to be a simple classification rule with high readability because the number of its component rectangles is expected to be small and it can be seen as a disjunctive normal form, which is one of the most readable representations. We develop efficient construction methods of rectangle greedy covers by making use of algorithms for efficient maximal frequent itemsets. We empirically demonstrate that, for high dimensional datasets, the method of finding a component rectangle one by one without enumerating candidate covering component sets is faster than the method with independent greedy covering process after the enumeration. In our classification experiments using 10 datasets in UCI repository, rectangle greedy covers (RGC) were shown to have classification performance comparable to the randomized subclass method (RSM) developed in [1], which is a conventional classification method using rectangles, though RGC used significantly smaller number of rectangles. The performance of RGC was also shown to be comparable to that of popular classifiers such as logistic regression and SVM. The DNF-representation of the classification rules obtained by RGC was demonstrated to be simpler than that obtained by RSM and C4.5 through our experiment. © 2011 IEEE. - Kazuhiro Omura, Kazuaki Aoki, Mineichi KudoProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 527 - 532 2011年 [査読有り][通常論文]

Decision rules in if-then form are highly readable and suitable for the situations in which users need to understand the rules intuitively. When we suppose the situation in which someone reads rules, a set of decision rules is desired to satisfy the following three conditions: 1) They can explain most of possible situations as a rule set, 2) The size of a rule set is small and thus memorable, 3) Description of each rule is simple and easily understood. In general, however, it is difficult to achieve both 2) and 3) under the condition 1). In addition to typical reduction of attributes, we consider reduction of attribute domains, the number of possible attribute values in each attribute, aiming at obtaining simpler but more readable rules. It brings a large variety of granularity in data representation. Using previously proposed some criteria on the basis of 1) through 3), we rated rule sets obtained at specified levels of granularity in some real-life datasets. The rating was almost consistent to that by a human inspector in readability. © 2011 IEEE. - Shuai Tao, Mineichi Kudo, Hidetoshi Nonaka, Jun ToyamaProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 647 - 652 2011年 [査読有り][通常論文]

In this study, we introduce two methods for indoor localization of person by using an infrared ceiling sensor network, then the performances of them are compared. Through experiments we see that the nonlinear method for person localization obtains better performance. Based on the nonlinear person localization method, we recorded the Activities of Daily Living (ADLs) of a person successfully. By observing the ADL of the person, we can investigate the transition pattern between activities and the living habits of him/her conveniently. © 2011 IEEE. - Hidetoshi Nonaka, Shuai Tao, Jun Toyama, Mineichi KudoPECCS 2011: PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON PERVASIVE AND EMBEDDED COMPUTING AND COMMUNICATION SYSTEMS 218 - 223 2011年 [査読有り][通常論文]

In the previous stage of our research, we have developed a soft authentication system using a ceiling sensor network. Our aim has been to exclude psychological and physical load caused by using strict biometrics, video camera, and so on. We introduced a notion of distributed personality for authentication and tracking of several persons. Through experimental results, we confirmed that the system could keep track of up to 5 persons. However, it has been found the performance is not enough for practical use, and then we have reconstructed and improved the system. In this position paper, we present the design policy, overview of the network system, and the obtained performance. - Lecture Notes in Computer Science 6855 413 - 420 2011年 [査読無し][通常論文]
- 電子情報通信学会誌 J94-A 8 604 - 620 2011年 [査読無し][通常論文]
**極小値が単峰列となる多峰関数の大域的最適化法(1) -単峰領域幅が等しい目的関数の大域的最適化-** - GrC 2011 IEEE 2011年 [査読有り][通常論文]
- Koji Ouchi, Atsuyoshi Nakamura, Mineichi KudoProceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 533 - 538 2011年 [査読有り][通常論文]

We discuss efficient construction and usefulness of greedy covers of positive instances by axis-parallel rectangles that exclude negative instances. A rectangle greedy cover is expected to be a simple classification rule with high readability because the number of its component rectangles is expected to be small and it can be seen as a disjunctive normal form, which is one of the most readable representations. We develop efficient construction methods of rectangle greedy covers by making use of algorithms for efficient maximal frequent itemsets. We empirically demonstrate that, for high dimensional datasets, the method of finding a component rectangle one by one without enumerating candidate covering component sets is faster than the method with independent greedy covering process after the enumeration. In our classification experiments using 10 datasets in UCI repository, rectangle greedy covers (RGC) were shown to have classification performance comparable to the randomized subclass method (RSM) developed in [1], which is a conventional classification method using rectangles, though RGC used significantly smaller number of rectangles. The performance of RGC was also shown to be comparable to that of popular classifiers such as logistic regression and SVM. The DNF-representation of the classification rules obtained by RGC was demonstrated to be simpler than that obtained by RSM and C4.5 through our experiment. © 2011 IEEE. - 柳堀 慎吾, 工藤 峰一電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 110 330 73 - 78 一般社団法人電子情報通信学会 2010年12月02日 [査読無し][通常論文]

本研究では,文書分類などの大規模データに対して,実用的な時間で行える識別子独立型の特徴選択を検討する.二クラス,二値特徴に限定して,有効な少数の特徴の組み合わせを信頼区間を考慮して求めることで比較的効率の良い方法を提案する.特徴数およびサンプル数がともに十万を越える規模の文書分類問題に対して行った比較実験では,提案手法により最適な特徴集合に近い特徴集合が得られることが示された. - 大内 康治, 中村 篤祥, 工藤 峰一電子情報通信学会技術研究報告. IBISML, 情報論的学習理論と機械学習 110 265 99 - 104 一般社団法人電子情報通信学会 2010年10月28日 [査読無し][通常論文]

固定された自然数dに対し,軸に平行な有限個のd次元超矩形の和集合で表現される概念クラスが多項式時間PAC学習可能であることはBlumerらにより示されている.証明に用いられたアルゴリズムは,概念に含まれる事例(正例)全体を被覆する超矩形の集合を,貪欲手続きで求めるというものである.本稿では,Blumerらのアルゴリズムの効率的な実装について検討する.また,このアルゴリズムを実際のnクラス識別問題(n≧2)に通用し,同じく超矩形の和集合を用いる確率的サブクラス法との比較を行う.UCI機械学習レポジトリのデータセットを用いた実験では,ほぼ同等の識別性能となりながらも,貪欲な手法によって生成される超矩形の数が確率的サブクラス法のものよりも少なくなり,多くのデータで7割以上の削減がみられた, - Kenji Tabata, Maiko Sato, Mineichi KudoPATTERN RECOGNITION 43 9 3162 - 3176 2010年09月 [査読有り][通常論文]

In these years, we often deal with an enormous amount of data in a large variety of pattern recognition tasks. Such data require a huge amount of memory space and computation time for processing. One of the approaches to cope with these problems is using prototypes. We propose volume prototypes as an extension of traditional point prototypes. A volume prototype is defined as a geometric configuration that represents some data points inside. A volume prototype is akin to a data point in the usage rather than a component of a mixture model. We show a one-pass algorithm to have such prototypes for data stream, along with an application for classification. An oblivion mechanism is also incorporated to adapt concept drift. (C) 2010 Elsevier Ltd. All rights reserved. - Jun Toyama, Mineichi Kudo, Hideyuki ImaiPATTERN RECOGNITION 43 4 1361 - 1372 2010年04月 [査読有り][通常論文]

A novel approach for k-nearest neighbor (k-NN) searching with Euclidean metric is described. It is well known that many sophisticated algorithms cannot beat the brute-force algorithm when the dimensionality is high. In this study, a probably correct approach, in which the correct set of k-nearest neighbors is obtained in high probability, is proposed for greatly reducing the searching time. We exploit the marginal distribution of the k th nearest neighbors in low dimensions, which is estimated from the stored data (an empirical percentile approach). We analyze the basic nature of the marginal distribution and show the advantage of the implemented algorithm, which is a probabilistic variant of the partial distance searching. Its query time is sublinear in data size n, that is, O(mn delta) with S=o(1) in n and delta <= 1, for any fixed dimension m. (C) 2009 Elsevier Ltd. All rights reserved. - 金光 秀雄, 今野 英明, 工藤 峰一, 宮腰 政明電子情報通信学会技術研究報告. NLP, 非線形問題 109 366 7 - 12 一般社団法人電子情報通信学会 2010年01月14日 [査読無し][通常論文]

閉区間上で目的関数の極小値が(下へ)単峰列で各極小点の単峰領域幅が等しい一変数多峰関数の大域的最適化問題に対する,前報の理論的な収束性が保証されたアルゴリズムを修正・拡張したアルゴリズムを提案する.具体的には,囲い込みステップでの局所探索を減らす修正をしたアルゴリズムと,各極小点の単峰領域幅が等しいという目的関数の条件を緩和した問題を解く拡張アルゴリズムを提案する.さらに,前報の問題における多変数目的関数の変数分離可能性を緩和し,その問題に対する解法アルゴリズムを提案する.多変数のテスト関数に対するいくつかの数値実験の結果から,本手法が非常に効率的かつ高信頼性で最小点を見い出せることを示す. - Tetsuji Takahashi, Mineichi KudoProceedings - International Conference on Pattern Recognition 4052 - 4055 2010年 [査読有り][通常論文]

The usage of convex hulls for classification is discussed with a practical algorithm, in which a sample is classified according to the distances to convex hulls. Sometimes convex hulls of classes are too close to keep a large margin. In this paper, we discuss a way to keep a margin larger than a specified value. To do this, we introduce a concept of "expanded convex hull" and confirm its effectiveness. © 2010 IEEE . - Akira Tanaka, Hideyuki Imai, Mineichi Kudo, Masaaki MiyakoshiProceedings - International Conference on Pattern Recognition 1421 - 1424 2010年 [査読有り][通常論文]

A relationship between generalization error and training samples in kernel regressors is discussed in this paper. The generalization error can be decomposed into two components. One is a distance between an unknown true function and an adopted model space. The other is a distance between an estimated function and the orthogonal projection of the unknown true function onto the model space. In our previous work, we gave a framework to evaluate the first component. In this paper, we theoretically analyze the second one and show that a larger set of training samples usually causes a larger generalization error. © 2010 IEEE. - Kazuaki Aoki, Mineichi KudoProceedings of the 2010 International Conference on High Performance Computing and Simulation, HPCS 2010 390 - 398 2010年 [査読有り][通常論文]

Tree-type expression of multi-class problems is known to be useful for drawing several insights from a given problem and for improving the performance of classifiers. The authors have already proposed a bottom-up procedure to construct such a tree, called a "class decision tree", but a top-down procedure is still worth studying. In this paper, we propose a simple top-down procedure, and compare those two procedures. In addition, we discuss the effectiveness of classifier selection and feature selection applied to every node in class decision trees, and the preferable order of them as well. © 2010 IEEE. - Kazuki Tsuji, Mineichi Kudo, Akira TanakaSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 6218 90 - 99 2010年 [査読無し][通常論文]

It is interesting to compare different criteria of kernel machines. In this paper, the following is made: 1) to cope with the scaling problem of projection learning, we propose a dynamic localized projection learning using k nearest neighbors, 2) the localized method is compared with SVM from some viewpoints, and 3) approximate nearest neighbors are demonstrated their usefulness in such a localization. As a result, it is shown that SVM is superior to projection learning in many classification problems in its optimal setting but the setting is not easy. - Lecture Notes in Computer Science 6393 185 - 190 2010年 [査読無し][通常論文]
- Taishi Uchiya, Atsuyoshi Nakamura, Mineichi KudoALGORITHMIC LEARNING THEORY, ALT 2010 6331 375 - 389 2010年 [査読有り][通常論文]

Adversarial bandit problems studied by Auer et al. [4] are multi-armed bandit problems in which no stochastic assumption is made on the nature of the process generating the rewards for actions. In this paper, we extend their theories to the case where k(>= 1) distinct actions are selected at each time step. As algorithms to solve our problem, we analyze an extension of Exp3 [4] and an application of a bandit online linear optimization algorithm [1] in addition to two existing algorithms (Exp3, ComBand [5]) in terms of time and space efficiency and the regret for the best fixed action set. The extension of Exp3, called Exp3. M, performs best with respect to all the measures: it runs in O(K (log k + 1) time and O(K) space, and suffers at most O(root kTK log (K/k)) regret, where K is the number of possible actions and T is the number of iterations. The upper bound of the regret we proved for Exp3. M is an extension of that proved by Auer et al. for Exp3. - Kazuki Tsuji, Mineichi Kudo, Akira TanakaSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 6218 90 - 99 2010年 [査読有り][通常論文]

It is interesting to compare different criteria of kernel machines. In this paper, the following is made: 1) to cope with the scaling problem of projection learning, we propose a dynamic localized projection learning using k nearest neighbors, 2) the localized method is compared with SVM from some viewpoints, and 3) approximate nearest neighbors are demonstrated their usefulness in such a localization. As a result, it is shown that SVM is superior to projection learning in many classification problems in its optimal setting but the setting is not easy. - Atsuyoshi Nakamura, Tomoya Saito, Ichigaku Takigawa, Hiroshi Mamitsuka, Mineichi KudoString Processing and Information Retrieval - 17th International Symposium, SPIRE 2010, Los Cabos, Mexico, October 11-13, 2010. Proceedings 185 - 190 Springer 2010年 [査読有り][通常論文]
- 工藤 峰一, 今井 英幸, 田中 章, 杉山 将電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 109 344 29 - 34 一般社団法人電子情報通信学会 2009年12月10日 [査読無し][通常論文]

パターン認識の研究は既に40年を越す歴史を持ち、これまでに数多くの基礎的方法論が提案され、それ以上に多くの応用的技法が報告されている。その中にあって現在は、これまでの研究の方向性を見直す時期に来ていると言える。本稿では、これまで多く取り上げられてきた研究や基礎的技法を改めて見直す。視点は、各種の技法がその限界を越え、あるいは適用範囲を逸脱した使われかたをしていないか、また、潜在的に可能性のある研究を見失っていないか、などである。これらの事頂を、「都市伝説」という形で、幾らか誇張した形で整理する。 - 金光 秀雄, 今野 英明, 工藤 峰一, 宮腰 政明電子情報通信学会技術研究報告. NLP, 非線形問題 109 269 79 - 84 一般社団法人電子情報通信学会 2009年11月04日 [査読無し][通常論文]

孤立極小点を有する一変数多峰関数の大域的最適(最小)化問題において,関数の極小値が(下へ)単峰列で各極小点の単峰領域幅が等しい関数を定義・考察する.さらに,その関数の大域的最小点を求める手法を提案し,簡単な数値実験で提案手法の有効性を示す.次に,変数分離可能で単峰領域幅が等しい多変数多峰の目的関数を矩形探索領域上で最小化する問題に対して,提案手法を繰り返し用いる大域的最適化手法を与え,数値実験の結果から本手法が非常に効率的かつ高信頼性で最小点を見い出せることを示す. - 打矢 泰志, 中村 篤祥, 工藤 峰一電子情報通信学会技術研究報告. COMP, コンピュテーション 109 195 13 - 20 一般社団法人電子情報通信学会 2009年09月07日 [査読無し][通常論文]

Auerらにより研究されたadversarial bandit問題は,プレーヤーが選択したアクションに対する報酬生成過程において確率的な仮定をおかないmulti-armed bandit問題である.本稿ではadversarial bandit問題を,各時刻においてk(≧1)回のアクションを選択できるように拡張し,アクションの重複選択を許す場合と許さない場合の2つの設定で分析を行う.両方の設定において,Auerらが提案したアルゴリズムExp3を一般化し,最適固定アクション集合に対する損失上界の一般化を得る. - T. Hosokawa, M. Kudo, H. Nonaka, J. ToyamaPATTERN ANALYSIS AND APPLICATIONS 12 3 237 - 249 2009年09月 [査読有り][通常論文]

Person identification is needed to provide various personalized services at home or at the office. We propose a system for tracking persons identified at the entrance to a room in order to realize "soft authentication." Our system can be constructed at low cost and works anytime and anywhere in a room. Through experiments, we confirmed that the system could track up to 5 persons with a high probability of correct identification, though precise identification is difficult. - M. Yamada, K. Kamiya, M. Kudo, H. Nonaka, J. ToyamaPATTERN ANALYSIS AND APPLICATIONS 12 3 251 - 260 2009年09月 [査読有り][通常論文]

An authentication system using a chair with sensors attached is described. Pressure distribution (hipprint) measured by network-connected sensors on the chair is used for identifying the person sitting on the chair. Hipprint information is not sufficient for maintaining a high level of security but is sufficient for providing personalized services such as automatic log-in at home or in a small office. In experiments, we obtained correct identification rates of 99.6% for five people and 93.2% for ten people. A false rejection rate of 9.2% and a false acceptance rate of 1.9% were achieved using another group of 20 people. The results also showed that changes in hipprints can be used to estimate what the person sitting on the chair is doing, for example, using a mouse or leaning back. - Ichigaku Takigawa, Mineichi Kudo, Atsuyoshi NakamuraENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE 22 1 101 - 108 2009年02月 [査読有り][通常論文]

We propose a general framework for nonparametric classification of multi-dimensional numerical patterns. Given training points for each class, it builds a set cover with convex sets each of which contains some training points of the class but no points of the other classes. Each convex set has thus an associated class label, and classification of a query point is made to the class of the convex set such that the projection of the query point onto its boundary is minimal. In this sense, the convex sets of a class are regarded as "prototypes" for that class. We then apply this framework to two special types of convex sets, minimum enclosing balls and convex hulls, giving algorithms for constructing a set cover with them and for computing the projection length onto their boundaries. For convex hulls, we also give a method for implicitly evaluating whether a point is contained in a convex hull, which can avoid computational difficulty for explicit construction of convex hulls in high-dimensional space. (C) 2008 Elsevier Ltd. All rights reserved. - Maiko Sato, Mineichi Kudo, Jun ToyamaPATTERN RECOGNITION IN INFORMATION SYSTEMS, PROCEEDINGS 39 - 48 2009年 [査読有り][通常論文]

The authors have proposed volume prototypes as a compact expression of a huge data or a data stream, along with a one-pass algorithm to find them. A reasonable number of volume prototypes can be used, instead of an enormous number of data, for many applications including classification, clustering and density estimation. In this paper, two algorithms using volume prototypes, called VKM and VEM, are introduced for clustering and density estimation. Compared with the other algorithms for such a huge data, we showed that our algorithms were advantageous in speed of processing, while keeping the same degree of performance, and that both applications were available from the same set of volume prototypes. - Recent Advances in Intelligent Information Systems 345 - 358 2009年 [査読無し][通常論文]
**Hierarchical and Overlapping Clustering of Retrieved Web Pages** - Tetsuji Takahashi, Mineichi Kudo, Atsuyoshi NakamuraPROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, PROCEEDINGS 5856 441 - 448 2009年 [査読有り][通常論文]

We consider an algorithm to approximate each class region by a small number of convex hulls and to apply them to classification. The convex hull of a finite set of points is computationally hard to be constructed in high dimensionality. Therefore, instead of the exact convex hull, we find an approximate convex hull (a polyhedron) in a time complexity that is linear in dimension. On the other hand, the set of such convex hulls is often too much complicated for classification. Thus we control the complexity by adjusting the number of faces of convex hulls. For reducing the computational time, we use an upper bound of the leave-one-out estimated error to evaluate the classifiers. - Lecture Notes in Computer Science 5519 22 - 31 2009年 [査読無し][通常論文]
- Mineichi Kudo, Jun Toyama, Hideyuki ImaiLecture Notes in Computer Science 5721 333 - 339 Springer 2009年 [査読有り][通常論文]
- Shen Pan, Mineichi Kudo, Jun Toyama2009 1st International Conference on Information Science and Engineering, ICISE 2009 1219 - 1222 2009年 [査読有り][通常論文]

Edge extraction in tobacco leaf images is the key step of tobacco leaf inspection by computer. In order to detect edge and keep detail texture information such as vein, the original tobacco leaf images obtained by a CCD camera are processed by a membership function at first. Then a fuzzy mathematical morphology algorithm is used to detect the edge. Efficiency of edge detection based on the fuzzy mathematic morphology is proved by comparing with the results of the other algorithms under different conditions. ©2009 IEEE. - Satoshi Shirai, Mineichi Kudo, Atsuyoshi NakamuraMULTIPLE CLASSIFIER SYSTEMS, PROCEEDINGS 5519 22 - 31 2009年 [査読有り][通常論文]

We compared boosting with bagging in different strengths of learning algorithms for improving the performance of the set of classifiers to be fused. Our experimental results showed that boosting worked much with weak algorithms and bagging, especially feature-based bagging, worked much with strong algorithms. On the basis of these observations we developed a mixed fusion method in which randomly chosen features are used with a standard boosting method. As a result, it was confirmed that the proposed fusion method worked well regardless of learning algorithms. - Mineichi Kudo, Tetsuya MuraiIEEE TRANSACTIONS ON FUZZY SYSTEMS 16 2 285 - 298 2008年04月 [査読有り][通常論文]

An information table or a training/designing sample set is all that can be obtained to infer the underlying generation mechanism (distribution) of tuples or samples. However, how an information table is available in representation, in treatment, and in interpretation, can still be discussed. In this paper, these matters are discussed on the basis of "granularity." First, an explanation is given to identify the reasons why different goals/treatments of information tables exist in some different research fields. In this stage, it will be emphasized that "granularity concept" plays an important role. Next, a framework of information tables is reformulated in terms of attribute sets and tuple sets. Here, a "Galois connection" helps to understand their relationship. Then, the use of "closed subsets" is proposed instead of given tuples, for efficiency and for interpretability. With a special type of closed subsets, the traditional logical DNF expression framework can be naturally extended to those with multivalues and continuous values. Last, several concepts on rough sets are reformulated using "variable granularity" connected to closed subsets. This paper determines how and in what points granularity can give flexibility in dealing with several problems. Through several concepts defined in this paper, some intuitions toward development of data exploration and data mining are given. - 5-th IASTED International Conference Signal Processing, Pattern Recognition, and Applications 1 - 6 2008年 [査読無し][通常論文]
**Wiener Implementation of Kernel Machines** - Mineichi Kudo, Atsuyoshi Nakamura, Ichigaku TakigawaProceedings of 19th International Conference on Pattern Recognition to appear - 4 IEEE Computer Society 2008年 [査読有り][通常論文]
- Akira Tanaka, Hideyuki Imai, Mineichi Kudo, Masaaki MiyakoshiSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 5342 530 - 539 2008年 [査読有り][通常論文]

Learning based oil kernel machines is widely known as a, powerful tool for various fields of information science such as pattern recog nition and regression estimation. One of central topics of kernel machines is model selection, especially selection of a kernel or its parameters. In this paper, we consider a. class of kernels that forms a monotonic classes of reproducing kernel Hilbert spaces with an invariant metric and show that the kernel corresponding to the smallest reproducing kernel Hilbert space including an unknown true function gives the optimal model for the unknown true function. - Hiroshi Tenmoto, Mineichi KudoSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 5342 572 - + 2008年 [査読有り][通常論文]

Proposed is a histogram approach for feature selection and classification. The axes are divided into equally-spaced intervals, but the division numbers are different among axes. The main difference from similar approaches is that feature selection mechanism is embedded in the method. The optimal division is determined by an MDL criterion, so that the classifier is guaranteed to converge to the Bayes optimal classifier. We also introduce the concept of soft feature selections' that is carried out by this method as an extension of traditional "feature selection." - Maiko Sato, Mineichi Kudo, Jun ToyamaLecture Notes in Computer Science 5342 884 - 894 Springer 2008年 [査読有り][通常論文]
- Satoshi Shirai, Mineichi Kudo, Atsuyoshi NakamuraLecture Notes in Computer Science 811 - 820 Springer 2008年 [査読有り][通常論文]
- Kazuhiro Kamiya, Mineichi Kudo, Hidetoshi Nonaka, Jun ToyamaProceedings of 19th International Conference on Pattern Recognition to appear - 4 IEEE Computer Society 2008年 [査読有り][通常論文]
- Yohji Shidara, Mineichi Kudo, Atsuyoshi NakamuraProceedings of the 19th International Conference on Pattern Recognition to appear - 4 IEEE Computer Society 2008年 [査読有り][通常論文]
- Kazuaki Aoki, Mineichi KudoSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION 5342 562 - 571 2008年 [査読有り][通常論文]

Feature selection is an important technique in pattern recognition. By removing features that have little or no discriminative information, it is possible to improve the predictive performance of classifiers and to reduce the measuring cost, of features. In general, feature selection algorithms choose a common feature subset useful for all classes. However in general the most, contributory feature subsets vary depending oil classes relatively to the other classes. Ill this study, We propose a classifier as a decision tree in which each leaf corresponds to one class and an internal node classifies a sample to one of two class subsets. We also discuss classifier selection in each node. - Yohji Shidara, Mineichi Kudo, Atsuyoshi NakamuraTransactions on Machine Learning 1 1 17 - 30 2008年 [査読有り][通常論文]
- Engineering Applications of Artificial Intelligence ?-? 2008年 [査読無し][通常論文]
- Atsuyoshi Nakamura, Mineichi KudoICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS 482 - 491 2008年 [査読有り][通常論文]

We study the problem of enumerating concepts in a Sperner family concept class using subconcept queries [7], which is a general problem including maximal frequent itemset mining as its instance. Though even the theoretically best known algorithm needs quasi-polynomial time to solve this problem in the worst case, there exist practically fast algorithms for this problem. This is because many instances of this problem in real world have low complexity in some measures. In this paper, we characterize the complexity of Sperner family concept class by the VC dimension of its intersection closure and its characteristic dimension, and analyze the worst case time complexity on the enumeration problem of its concepts in terms of the VC dimension. We also showed that the VC dimension of real data used in data mining is actually small by calculating the VC dimension of some real datasets using a new algorithm closely related to the introduced two measures, which does not only solve the problem but also let us know the VC dimension of the intersection closure of the target concept class. - Akira Tanaka, Hideyuki Imai, Mineichi Kudo, Masaaki MiyakoshiPATTERN RECOGNITION 40 11 2930 - 2938 2007年11月 [査読有り][通常論文]

Kernel machines are widely considered to be powerful tools in various fields of information science. By using a kernel, an unknown target is represented by a function that belongs to a reproducing kernel Hilbert space (RKHS) corresponding to the kernel. The application area is widened by enlarging the RKHS such that it includes a wide class of functions. In this study, we demonstrate a method to perform this by using parameter integration of a parameterized kernel. Some numerical experiments show that the unresolved problem of finding a good parameter can be neglected. (c) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. - Yohji Shidara, Atsuyoshi Nakamura, Mineichi KudoLecture Note in Artificial Intelligence. Machine Learning and Data Mining in Pattern Recognition 4571 409 - 498 Springer 2007年 [査読有り][通常論文]
- Mineichi Kudo, Satoshi Shirai, Hiroshi TenmotoMultiple Classifier System, Lecture Notes in Computer Science 4472 241 - 250 Springer 2007年 [査読有り][通常論文]
- Yuji Muto, Mineichi Kudo, Yohji ShidaraLecture Notes in Computer Science 4481 211 - 218 Springer 2007年 [査読有り][通常論文]
- Hisashi Tosaka, Atsuyoshi Nakamura, Mineichi KudoDISCOVERY SCIENCE, PROCEEDINGS 4755 286 - + 2007年 [査読有り][通常論文]

We study a novel problem of mining subtrees with frequent occurrence of similar subtrees, and propose an algorithm for this problem. In our problem setting, frequency of a subtree is counted not only for equivalent subtrees but also for similar subtrees. According to our experiment using tag trees of web pages, this problem can be solved fast enough for practical use. An encouraging result was obtained in a preliminary experiment for data record extraction from web pages using our mining method. - Naoto Abe, Mineichi Kudo, Jun Toyama, Masaru ShimboPATTERN ANALYSIS AND APPLICATIONS 9 2-3 127 - 137 2006年10月 [査読有り][通常論文]

Feature selection aims to choose a feature subset that has the most discriminative information from the original feature set. In practical cases, it is preferable to select a feature subset that is universally effective for any kind of classifier because there is no underlying information about a given dataset. Such a trial is called classifier-independent feature selection. We took notice of Novovicova et al.'s study as a classifier-independent feature selection method. However, the number of features have to be selected beforehand in their method. It is more desirable to determine a feature subset size automatically so as to remove only garbage features. In this study, we propose a divergence criterion on the basis of Novovicova et al.'s method. - N Abe, M KudoPATTERN RECOGNITION 39 5 737 - 746 2006年05月 [査読有り][通常論文]

Feature selection is used for finding a feature subset that has the most discriminative information from the original feature set. In practice, since we do not know the classifier to be used after feature selection, it is desirable to find a feature subset that is universally effective for any classifier. Such a trial is called classifier-independent feature selection. In this study, we propose a novel classifier-independent feature selection method on the basis of the estimation of Bayes discrimination boundary. The experimental results on 12 real-world datasets showed the fundamental effectiveness of the proposed method. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. - Yuji Muto, Mineichi Kudo, Tetsuya MuraiJournal of Advanced Computational Intelligence and Intelligence Informatics 10 5 666 - 672 2006年 [査読有り][通常論文]
- 39 737 - 746 2006年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - 10 666 - 672 2006年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - 2 3 127 - 137 2006年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - Akira Tanaka, Hideyuki Imai, Mineichi Kudo, Masaaki MiyakoshiCOMPUTING ANTICIPATORY SYSTEMS 839 347 - + 2006年 [査読有り][通常論文]

Learning based on kernel machines is widely known as a powerful tool for various fields of information science. The kernel ridge regression is one of simple and classical kernel machines and it gives a foundation for other kernel machines such as the support vector machine. However, it has some problems such as arbitrariness of the model and theoretical validity of an ad hoc kernelization of the ridge regression. The essence of using a kernel in learning problems is that the unknown target is representable by a function belonging to the reproducing kernel Hilbert space corresponding to the adopted kernel. In this paper, on the basis of the essence, we give two identical interpretations, whose theoretical grounds are clarified, for the kernel ridge regression. One is the ridge regression on the reproducing kernel Hilbert space and the other is the parametric projection learning with a specific condition. - Akira Tanaka, Masashi Sugiyama, Hideyuki Imai, Mineichi Kudo, Masaaki MiyakoshiSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS 4109 862 - 870 2006年 [査読有り][通常論文]

Learning based on kernel machines is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. The efficacy of the model in kernel machines depends on the distance between the unknown true function and the linear subspace, specified by the training data set, of the reproducing kernel Hilbert space corresponding to an adopted kernel. In this paper, we propose a framework for the model selection of kernel-based learning machines, incorporating a class of kernels with an invariant metric. - Masafumi Yamada, Mineichi Kudo, Hidetoshi Nonaka, Jun Toyama18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS 533 - + 2006年 [査読有り][通常論文]

It has been gathering great deal of attention to provide personalized services in the home or in the small office. As a necessary technology for it, person recognition by pressure sensors on a chair is described. It exploits the hipprints of users. Such a method makes it possible to automatically login to a computer when the user sits. The recognition rate reached to 99.6% for 5 people and 98.4% for 10 people. In addition, a change of the hipprints is available to figure out roughly what he/she was doing. - Kazuaki Aoki, Toshiharu Watanabe, Mineichi KudoSystems and Computers in Japan 36 4 37 - 47 2005年04月 [査読有り][通常論文]

In pattern recognition, feature selection is effective for improving the performance of classifiers and reducing the measurement cost of features. In particular, by removing features with no discriminative information, an improvement can be expected in the estimation precision of classifier parameters, and as a result, higher performance classifiers can be constructed than when all of the features are used. Many of the feature selection techniques that have been proposed so far have attempted to select a feature subset that is common to all classes. However, it seems reasonable to assume that the optimum feature subset for classification differs for each set of classes to be discriminated. In this paper, the authors investigate the effectiveness of feature subsets that depend on sets of classes and use the extracted class-dependent feature subsets to construct a decision tree. In addition, they show the effectiveness of this method through character recognition experiments. © 2005 Wiley Periodicals, Inc. - Proc. of Data Engineering Workshop 2005 5C-j5 2005年 [査読無し][通常論文]
**コミュニティ集合族発見手法の比較** - Proc. of Data Engineering Workshop 2005 5C-o4 2005年 [査読無し][通常論文]
**構造と内容に基づくWebページからの評判抽出におけるパターンの構成法** - A Nakamura, M KudoADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS 3518 850 - 860 2005年 [査読有り][通常論文]

In this paper, we propose an efficient algorithm enumerating all frequent subtrees containing all special nodes that are guaranteed to be included in all trees belonging to a given data. Our algorithm is a modification of TreeMiner algorithm [10] so as to efficiently generate only candidate subtrees satisfying our constraints. We report mining results obtained by applying our algorithm to the problem of finding frequent structures containing the name and reputation of given restaurants in Web pages collected by a search engine. - Hidehiko Ino, Mineichi Kudo, Atsuyoshi NakamuraProc. of WWW2005 661 - 669 ACM 2005年 [査読有り][通常論文]
- Hidehiko Ino, Mineichi Kudo, Atsuyoshi NakamuraProceedings - International Workshop on Biomedical Data Engineering, BMDE2005 2005 154 - 157 2005年 [査読有り][通常論文]

Recently, researches on extraction of densely connected subgraphs, which are called communities, from the graph representing link structure in WWW, are very popular. However, few methods guarantee that extracted subgraphs satisfy community conditions which are strictly defined. In this paper, we consider the problem of extracting subgraphs that strictly satisfy the community conditions proposed in [3]. It is known that finding all such communities is computationally hard. As methods that possibly find many communities efficiently, we experimentally compared two methods, a method with a Gomory-Hu tree construction and a method with calculating edge-betweenness. We also proposed evaluation criterion for ranking found communities. © 2005 IEEE. - Proceedings of the Workshop on Feature Selection for Data Mining: Interfacing Machine Learning and Statistics 106 - 107 2005年 [査読無し][通常論文]
**Optimal Division for Feature Selection and Classification.** - M Yamada, J Toyama, M KudoKNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 4, PROCEEDINGS 3684 703 - 708 2005年 [査読有り][通常論文]

We argue how we can deal with pressure sensors for person recognition issue. It would appear that the information from the pressure sensors are relatively robust but weak. Using such sensors we argue the merit and possible use of these information. We describe: 1) to what degree and in what way information collected by pressure sensors on a chair is effective for person recognition, and 2) to what situation we can apply these sensor information and how practical they are. - Proceedings of the 4th International Workshop on Texture Analysis and Synthesis in conjunction with ICCV2005 91 - 100 2005年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - Proc. of Data Engineering Workshop 2005 5C-j5 2005年 [査読無し][通常論文]
**コミュニティ集合族発見手法の比較** - Proc. of Data Engineering Workshop 2005 5C-o4 2005年 [査読無し][通常論文]
**構造と内容に基づくWebページからの評判抽出におけるパターンの構成法** - Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Artificial Intelligence 3587 90 - 99 2005年 [査読無し][通常論文]
**The Convex Subclass Method: Combinatorial Classifier Bases on A Family of Convex Sets,** - Proceedings of the Workshop on Feature Selection for Data Mining: Interfacing Machine Learning and Statistics 106 - 107 2005年 [査読無し][通常論文]
**Optimal Division for Feature Selection and Classification.** - Taisuke Hosokawa, Mineichi KudoKnowledge-Based Intelligent Information and Engineering Systems, Lcture Notes in Computer Science, 3684 682 - 667 Springer 2005年 [査読有り][通常論文]
- H Hasegawa, M Kudo, A NakamuraKNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 4, PROCEEDINGS 3684 668 - 674 2005年 [査読有り][通常論文]

We consider the problem of extracting texts related to a given keyword from Web pages collected by a search engine. Recently, we proposed a method using both structural and content information[1, 2]. In our previous paper, we reported good extraction performance of our method only for Ramen-shop dataset written in Japanese. In this paper, we examine it for datasets of other kind of restaurants, and also for a dataset written in English. We discuss some modification for performance improvement. - Y Shidara, M Kudo, A NakamuraFoundations of Data Mining and Knowledge Discovery 6 161 - 170 2005年 [査読有り][通常論文]

We propose a novel method for mining generalized rules with high support and confidence. Using our method, we can obtain generalized rules in which the abstraction of attribute values is implicitly carried out without the requirement of additional information such as information on conceptual hierarchies. Our experimental results showed that the obtained rules not only have high support and confidence but also have expressions that are conceptually meaningful. - 電子情報通信学会論文誌 629 - 635 2005年 [査読無し][通常論文]
**両方向N-gram確率を用いた誤り文字検出法** - Naoto Abe, Mineichi KudoExtraction of Generalized Rules with Automated Attribute Abstraction, 3684 686 - 695 Springer 2005年 [査読有り][通常論文]
- Proceedings of the 4th International Workshop on Texture Analysis and Synthesis in conjunction with ICCV2005 91 - 100 2005年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Artificial Intelligence 3587 90 - 99 2005年 [査読無し][通常論文]
**The Convex Subclass Method: Combinatorial Classifier Bases on A Family of Convex Sets,** - 91 - 100 2005年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - 91 - 100 2005年 [査読無し][通常論文]
**A Hybrid BTF Model Based on Gaussian Mixtures** - 電子情報通信学会論文誌 629 - 635 2005年 [査読無し][通常論文]
**両方向N-gram確率を用いた誤り文字検出法** - H Tenmoto, M KudoSOFT COMPUTING AS TRANSDISCIPLINARY SCIENCE AND TECHNOLOGY 391 - 399 2005年 [査読有り][通常論文]

We regularize Gaussian mixture Bayesian (GMB) classifier in terms of the following two points: 1) class-conditional probability density functions, and 2) complexity as a classifier. For the former, we employ the Bayesian regularization method proposed by Ormoneit and Tresp, which is derived from the maximum a posteriori (MAP) estimation framework. For the latter, we use a discriminative MDL-based model selection method proposed by us. In this paper, we optimize the hyperparameters in 1) and 2) simultaneously with respect to the discriminative MDL criterion, aiming to auto-configure the hyperparameter setting for the best classification performance. We show the effectiveness of the proposed method through some experiments on real datasets. - Y Muto, M KudoROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, PRT 1, PROCEEDINGS 3641 692 - 700 2005年 [査読有り][通常論文]

In this paper, we discuss the most suitable "representation granularity", keeping several types of discernibility including individually discernibility and class discernibility. In the traditional "reduction" sense, the goal is to find the smallest number of attributes such that they enable us to discern each tuple or each decision class. However, once we pay attention to the number of attribute values too, that is, the size of each attribute, another criterion is needed. Indeed, we should ask ourselves about which one is better in the following two situations: 1) we can discern them with a single attribute of size ten, and 2) we can do this with two attributes of size five. This study answers this question with some criteria. Especially, we deal with continuous attributes. If we evaluate this difference in the light of understandability, we may prefer the latter, because they give more simple descriptions. Such a combination of simple nominal description helps us as a language or as a Kansei representation. To do this, we propose some criteria and algorithms to find near-optimal solutions for those criteria. In addition, we show some results for some databases in UCI Machine Learning Repository. - M Kudo, T MuraiROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, PRT 1, PROCEEDINGS 3641 234 - 243 2005年 [査読有り][通常論文]

According to the sizes of the attribute set and the information table, the information tables are categorized into three types of Rough Set problems, Pattern Recognition/Machine Learning problems, and Statistical Model Identification problems. In the first Rough Set situation, what we have seen is as follows: 1) The "granularity" should be taken so as to divide equally the unseen tuples out of the information table, 2) The traditional "Reduction" sense accords with the above insistence, and 3) The "stable" subsets of tuples, which are defined through a "Galois connection" between the subset and the corresponding attribute subset, may play an important role to capture some characteristics that can be read from the given information table. We show these with some illustrative examples. - Ichigaku Takigawa, Mineichi Kudo, Atsuyoshi NakamuraMachine Learning and Data Mining in Pattern Recognition, 4th International Conference, MLDM 2005, Leipzig, Germany, July 9-11, 2005, Proceedings 90 - 99 Springer 2005年 [査読有り][通常論文]
- H Tenmoto, M KudoKNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 4, PROCEEDINGS 3684 696 - 702 2005年 [査読有り][通常論文]

We propose a new method which extracts task groups on user's numerous messages such as e-mails and documents by finding the family of subsets that share common topics, sender/receivers, date or author. In addition, the method automatically gives labels to the groups by the common items. The biggest difference between the proposed method and conventional methods is that the conventional methods find completely divided clusters of documents, while the proposed method finds overlapping clusters of documents on the basis of subclass method. Therefore, the proposed method provides multiple views on the documents. We carried out experiments on the authors' e-mails and document files in order to confirm the basic property and effectiveness of the proposed method. - Mineichi Kudo, Atsuyoshi NakamuraFederation over the Web - International Workshop, Dagstuhl Castle, Germany, May 1-6, 2005. Revised Selected Papers 79 - 96 Springer 2005年 [査読有り][通常論文]
- Takigawa, I, M Kudo, J ToyamaIEEE TRANSACTIONS ON SIGNAL PROCESSING 52 3 582 - 591 2004年03月 [査読有り][通常論文]

Results of the analysis of the performance of minimum l(1)-norm solutions in underdetermined blind source separation, that is, separation of n sources from m ( < n) linearly mixed observations, are presented in this paper. The minimum l(1)-norm solutions are known to be justified as maximum a posteriori probability (MAP) solutions under a Laplacian prior. Previous works have not given much attention to the performance of minimum l(1)-norm solutions, despite the need to know about its properties in order to investigate its practical effectiveness. We first derive a probability density of minimum l(1)-norm solutions and some properties. We then show that the minimum l(1)-norm solutions work best in a case in which the number of simultaneous nonzero source time samples is less than the number of sensors at each time point or in a case in which the source signals have a highly peaked distribution. We also show that when neither of these conditions is satisfied, the performance of minimum l(1)-norm solutions is almost the same as that of linear solutions obtained by the Moore-Penrose inverse. Our results show when the minimum l(1)-norm solutions are reliable. - Michal Haindl, Jiri Grim, Petr Somol, Pavel Pudil, Mineichi KudoProceedings of the 17th International Conference on Pattern Recognition (ICPR2004) CD-ROM - 180 IEEE Computer Society 2004年 [査読有り][通常論文]
- M Kudo, T Hosokawa, J Toyama, H Tenmoto, A Nakamura8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL V, PROCEEDINGS V 65 - 68 2004年 [査読無し][通常論文]

In the past decades, person identification has been gathering a great deal of attention in a wide rage of applications. Among those techniques, many aim at achieving a high identification rate using several biometrics evidences such as fingerprint, iris, and finger vein. However these evidences require user's cooperation and are suitable for a "hard" identification in which any kind of impersonation is not allowed. In this study, we consider a "soft" identification, in which the system does require user-cooperation, at the expense of some degree of accuracy. On the contrary, the cost for gathering evidences becomes cheap and some weak evidences are used together The assumed applications are electronic devices on daily life. Through a soft identification, a device can be personalized to a user and the system provides some user-specific functions to the user In this study, a Bayesian network approach is investigated its potential for this goal. In addition, environmental evidence such as day and time are also incorporated to the system. - A Tanaka, Takigawa, I, H Imai, M Kudo, M MiyakoshiKNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS 3213 1058 - 1064 2004年 [査読有り][通常論文]

Kernel machines are widely known as powerful tools for various fields of information science. In general, they are designed based on a generalization criterion related to the complexity of the model and intuitive but ad hoe philosophy such as maximal margin principle shown in SVM. On the other hand, the project ion learning scheme was proposed in the field of neural networks. In the projection learning, the generalization ability is evaluated by the distance between the unknown target function and the estimated one. In,this paper, we construct projection learning based kernel machines and propose a method of making a kernel function that has necessary representability for the task. The method is reduced to a selection of an appropriate reproducing kernel Hilbert space from a series of monotone increasing subspaces. We also verify the efficacy of the proposed method by numerical examples. - M Kudo, H Imai, A Tanaka, T MuraiSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS 3138 885 - 893 2004年 [査読有り][通常論文]

A novel algorithm for finding the nearest neighbor was proposed. According to the development of modern technology, the demand is increasing in large-scale datasets with a large number of samples and a large number of features. However, almost all sophisticated algorithms proposed so far are effective only in a small number of features, say, up to 10. This is because in a high-dimensional space many pairs of samples share a same distance. Then the naive algorithm outperforms the others. In this study, we considered to utilize a sequential information of distances obtained by the examined training samples. Indeed, a combinatorial information of examined samples was used as bisectors between possible pairs of them. With this algorithm, a query is processed in O(alphabetand) for n samples in a d-dimensional space and for alpha,beta < 1, in expense of a preprocessing time and space in O(n(2)). We examined the performance of the algorithm. - Ichigaku Takigawa, Mineichi Kudo, Atsuyoshi Nakamura, Jun ToyamaIndependent Component Analysis and Blind Signal Separation, Lecture Notes in Computer Science 3195 192 - 200 Springer 2004年 [査読有り][通常論文]
- M Yamada, M KudoKNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS 3213 1065 - 1071 2004年 [査読有り][通常論文]

We argue how we should deal with some pieces of information each of which is not so strong for person recognition. On the basis of Dempster-Shafer theory, we introduce: 1) a new method of assigning a basic probability to nodes on a decision tree that is a basic expression of our current psychological status when we recieve an evidence, and 2) an update rule to combine several evidences presented sequentially. In person identification, the effectioness of these approaches is confirmed. - Structural, Syntactic and Statistical Pattern Recognition, Lecture Notes in Computer Science 3138 1043 - 1051 2004年 [査読無し][通常論文]
**Classifier-Independent Visualization of Supervised Data Structure Using a Graph** - International Journal of Computing Anticipatory Systems 14 311 - 320 2004年 [査読無し][通常論文]
**The Boosted/Bagged Subclass Method** - H Tenmoto, Y Mori, M KudoSTRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS 3138 1043 - 1051 2004年 [査読有り][通常論文]

Supervised data structures in high dimensional feature spaces are displayed as graphs. The structure is analyzed by normal mixture distributions. The nodes of the graph correspond the mean vectors of the mixture distributions, and the location is carried out by Sammon's nonlinear mapping. The thickness of the edges expresses the separability between the component distributions, which is determined by Kullback-Leibler divergence. From experimental results, it was confirmed that the proposed method can illustrate in which regions and to what extent it is difficult to classify samples correctly. Such visual information can be utilized for the improvement of the feature sets. - T Murai, M Sanada, Y Kudo, M KudoROUGH SETS AND CURRENT TRENDS IN COMPUTING 3066 103 - 108 2004年 [査読有り][通常論文]

Granular reasoning is a way of reasoning using granularized possible worlds and lower approximation in rough set theory. However, it can deal only with monotonicity. Then, the extended lower approximation in Ziarko's variable precision rough set model is introduced to describe nonmonotonic reasoning. - T Murai, Y Kudo, VN Huynh, A Tanaka, M Kudo2004 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, PROCEEDINGS 263 - 268 2004年 [査読有り][通常論文]

In this paper, firstly processes of classical inference are reviewed as granular reasoning from a point of view of reconstructing Kripke-style models with granularity. The essential point of the reconstruction is that some possible worlds are amalgamated to generate granules of worlds and vice versa. It is also called zoom reasoning systems. Then, the idea is applied for fuzzy reasoning processes by considering fuzzily granularized possible worlds. There linguistic truth values with linguistic hedges can be naturally introduced. - AOKI Kazuaki, WATANABE Toshiharu, KUDO MineichiIEICE transactions on information and systems 86 8 一般社団法人電子情報通信学会 2003年08月01日 [査読無し][通常論文]
- MORI Yasukuni, KUDO MineichiIEICE transactions on information and systems 86 8 一般社団法人電子情報通信学会 2003年08月01日 [査読無し][通常論文]
- A Nakamura, M Kudo, A Tanaka, K TanabeDISCOVERY SCIENCE, PROCEEDINGS 2843 393 - 401 2003年 [査読有り][通常論文]

We propose a modified version of our collaborative filtering method using restoration operators, which was proposed in [6]. Our previous method was designed so as to minimize expected squared error of predictions for user's ratings, and we experimentally showed that, for users who have evaluated only small number of items, mean squared error of our method is smaller than that of correlation-base methods. After further experiments, however, we found that, for users who have evaluated many items, the best correlation-based method has smaller mean squared error than our method. In our modified version, we incorporated an idea of projecting on a low-dimensional subspace with our method using restoration operators. We experimentally showed that our modification overcame the shortcoming stated above. - A Nakamura, M Kudo, A TanakaKNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS 2838 339 - 349 2003年 [査読有り][通常論文]

We propose a new collaborative filtering method that uses restoration operators. The problem of restoration by operators was originally studied in the field of digital image restoration [9]. We also consider the problem of selecting items that users should be asked to rate in order to achieve a small expected squared error, and we propose a greedy method as a solution of this problem. According to our experimental results, prediction performance of restoration operators is good when the number of observed ratings is small, and our greedy method outperforms random query item selection. - International Journal of Knowledge-Based Intelligent Engineering Systems 7 2 70 - 77 2003年 [査読無し][通常論文]
**Comparison of Low-Dimensional Mapping Techniques Based on Discriminatory Information** - Mineichi Kudo, Naoto Masuyama, Jun Toyama, Masaru ShimboPattern Recognition Letters 24 9-10 1213 - 1223 2003年 [査読有り][通常論文]
- 電子情報通信学会論文誌 J86-D-II-8 1156 - 1165 2003年 [査読無し][通常論文]
**クラスに依存した特徴集合を用いた決定木の設計** - 電子情報通信学会論文誌 J86-D-II-8 1166 - 1176 2003年 [査読無し][通常論文]
**グラフによるインタラクティブなデータ分析と決定木の構成** - Y Mori, M Kudo6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVI, PROCEEDINGS XVI 312 - 317 2002年 [査読無し][通常論文]

In pattern recognition, knowledge of the structure of pattern data can help us to know the sufficiency of features and to design classifiers. We have proposed a visualization method with the help of graph representation in which the structures of classes in the original feature space are correctly preserved. This method is different from conventional methods in that subsets of data points are used instead of individual data points. In this paper a method of exploratory data analysis is proposed using the graphical data mapping method. This approach is shown to be useful especially for large-class problems. - F Taniguchi, M Kudo6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVI, PROCEEDINGS XVI 329 - 332 2002年 [査読無し][通常論文]

The goal of pattern classification is to classify class-unknown samples correctly, in which a classifier is trained from a finite number of samples. Multiple classifier si-stems are trials to overcome to a single (strong) classifier by combining some (weak) classifiers. In this paper, we propose a novel combining method in which each element classifier is constructed by a random selection of both training samples and features. Experimental results on two real datasets are shown. As a result, the performance of a single classifier was improved by combining them with this method. - M Kudo6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVI, PROCEEDINGS XVI 305 - 311 2002年 [査読無し][通常論文]

In pattern recognition. feature selection is an important technique to improve the performance of classifiers designed from a limited number of training samples. So far many algorithms have been proposed, but one of the problems is that it is hard to determine the size of a feature subset to be selected. In this paper, we discuss this issue and propose a novel method to determine the size automatically. This method is applicable to large-scale problems with features over fifty. Through some experiments with synthetic data the approach is shown to work almost optimally. - Kazuaki Aoki, Mineichi KudoAdvances in Pattern Recognition, Lecture Notes in Computer Science 2396 761 - 769 Springer 2002年 [査読有り][通常論文]
- Naoto Abe, Mineichi Kudo, Masaru ShimboAdvances in Pattern Recognition, Lecture Notes in Computer Science 2396 470 - 479 Springer 2002年 [査読有り][通常論文]
- H Tenmoto, Y Mori, M KudoADVANCES IN LOGIC, ARTIFICIAL INTELLIGENCE AND ROBOTICS 85 104 - 111 2002年 [査読無し][通常論文]

Piecewise linear classifiers are utilized in order to visualize class structures in high-dimensional feature space. The feature space is split into many polyhedral regions by the component hyperplanes of the classifier, and the class label of each polyhedron is determined by taking majority rule on the number of samples. The intra- and inter-class relationships of the polyhedra are shown as a graph, drawing each polyhedron as a node and connecting the nodes by edges according to the adjacency of the regions. Compared to another graph representation method, the proposed method is superior in showing the inseparability or the closeness between the classes. - Y Mori, M KudoADVANCES IN LOGIC, ARTIFICIAL INTELLIGENCE AND ROBOTICS 85 112 - 119 2002年 [査読無し][通常論文]

In pattern recognition, knowledge of the structure of pattern data can help us to know the sufficiency of features and to design classifiers. We have proposed a graphical visualization method where the structure and separability of classes in the original feature space are almost correctly preserved. This method is especially effective for knowing which groups of classes are close and which groups are not. In this paper, we propose a method to group classes on the basis of this graphical analysis. Such a grouping. is exploited to design a decision tree in which a sample is classified into groups of classes at each node with a different feature subset, and is further divided into smaller groups, and finally reaches at one of the leaves consisting of single classes. The main characteristics of this method is to use different feature subsets in different nodes. This way is most effective to solve multi-class problems. An experiment with 30 characters (30 classes) was conducted to demonstrate the effectiveness of the proposed method. - Advances in Logic, Artificial Intelligence and Robotics 120 - 125 2002年 [査読無し][通常論文]
**Clustering Based on Gap and Structure** - Yoshinori Yanagihara, Masanori Kawakami, Mineichi Kudo, Jun Toyama, Masaru ShimboSystems and Computers in Japan 32 6 13 - 20 2001年06月15日 [査読有り][通常論文]

In image coding, DCT is a standard technique adopted in JPEG. However, block distortion or blurring of details is often observed when this method is used. To resolve these problems, we propose a new two-channel coding technique using a spline surface. This technique divides an image into two components: a low-frequency component that is represented as a smooth spline surface and a high-frequency component that is represented as the difference between the original image and the low-frequency component. Experimental results show that the proposed method is more effective in compressing images than other methods, while maintaining high quality of the images. - Proceedings of the Second International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001) CD-ROM No. 1724-168 2001年 [査読無し][通常論文]
**Clustering Consistent with Human Perception** - Proceedings of the Second International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001) CD-ROM No. 1724-169 2001年 [査読無し][通常論文]
**Error Analysis of MAP Solutions under Laplace Prior in Underdetermined Blind Source Separation** - Proceedings of the Second International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001) CD-ROM No. 1724-167 2001年 [査読無し][通常論文]
**Visualization of High-Dimensional Supervised Data Structure using Piecewise Linear Classifiers** - Proceedings of the Second International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001) CD-ROM No. 1724-116 2001年 [査読無し][通常論文]
**Comparison of Low-Dimensional Mapping Techniques Based on Discriminatory Information** - H Hayashi, M Kudo, J Toyama, M ShimboPATTERN ANALYSIS AND APPLICATIONS 4 1 20 - 27 2001年 [査読無し][通常論文]

A new technique for labelling natural scenes is proposed. This technique labels disjoint regions on an image of a natural scene an the basis of knowledge about the relationship among objects. The proposed technique consists of three stages: (1) segmentation, (2) initial labelling, and (3) label improvement. One of the most promising previous techniques uses simulated annealing to had the solution, while our technique uses local hill-climbing with enhanced knowledge for speeding Lip the professing. Local hill-climbing is known to he easy to be captured by a local minimum. We solved this problem by enhancing the knowledge being used as constraints for the search, Ou; knowledge represents 1-to-n relationships among regions, Fair wise relationships of regions, and relative locations uf the regions to the image. in addition, we introduced two region features: an entropy in intensity; and a linearity of contours of each region. The linearity evaluation aims to distinguish artificial objects from natural objects. The validity of the technique is supported by some experiments. These experiments chewed that the proposed technique is much faster with the almost same accurate. - 電子情報通信学会論文誌 D-II-3 439 - 447 2001年 [査読無し][通常論文]
**パターン認識問題における終端条件の付加によるk近隣法の高速化** - H Hayashi, M Kudo, J Toyama, M ShimboPATTERN ANALYSIS AND APPLICATIONS 4 1 20 - 27 2001年 [査読有り][通常論文]

A new technique for labelling natural scenes is proposed. This technique labels disjoint regions on an image of a natural scene an the basis of knowledge about the relationship among objects. The proposed technique consists of three stages: (1) segmentation, (2) initial labelling, and (3) label improvement. One of the most promising previous techniques uses simulated annealing to had the solution, while our technique uses local hill-climbing with enhanced knowledge for speeding Lip the professing. Local hill-climbing is known to he easy to be captured by a local minimum. We solved this problem by enhancing the knowledge being used as constraints for the search, Ou; knowledge represents 1-to-n relationships among regions, Fair wise relationships of regions, and relative locations uf the regions to the image. in addition, we introduced two region features: an entropy in intensity; and a linearity of contours of each region. The linearity evaluation aims to distinguish artificial objects from natural objects. The validity of the technique is supported by some experiments. These experiments chewed that the proposed technique is much faster with the almost same accurate. - 工藤 峰一, 新保 勝電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 100 507 1 - 6 一般社団法人電子情報通信学会 2000年12月07日 [査読無し][通常論文]

凸包の有限集合でクラス領域を近似し, 近似した領域への近さに基づいて識別を行う方法を提案する.凸包は, 着目したクラスの順練サンプルのみを含み, できるだけ大きなものになるような方式で構成される.サポートベクターマシンとの比較を行い, 低次元の場合, 性能はほぼ同程度であることを実験的に確認した。 - M Kudo, H Imai, M Shimbo15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS 2 29 - 33 2000年 [査読有り][通常論文]

The subclass method is a classifier based on approximation of class regions. It assumes that all classes are separable (but not necessarily linear separable). We extend the method so as to meet cases in which class-conditional probability density functions (PDFs) overlap each other. In this extension, the method becomes to a histogram approach for approximating PDFs, but the method allows overlapping of bins unlike usual histogram approaches. It is shown that this method is consistent in the meaning that its error rate approaches the Bayes error rate as the number of samples tends to infinity. It is also shown that the convergence rate is faster than that using a previous MDL-based histogram approach in the range of practical number of samples. - N Abe, M Kudo, J Toyama, M ShimboADVANCES IN PATTERN RECOGNITION 1876 668 - 676 2000年 [査読有り][通常論文]

Feature selection aims to find the most important feature subset from a given feature set without degradation of discriminative information. In general, we wish to select a feature subset that is effective for any kind of classifier. Such studies are called Classifier-Independent Feature Selection, and Novovicova et al.'s method is one of them. Their method estimates the densities of classes with Gaussian mixture models, and selects a feature subset using Kullback-Leibler divergence between the estimated densities, but there is no indication how to choose the number of features to be selected. Kudo and Sklansky (1997) suggested the selection of a minimal feature subset such that the degree of degradation of performance is guaranteed. hi this study, based on their suggestion, we try to find a feature subset that is minimal while maintainig a given Kullback-Leibler divergence. - M Kudo, P Somol, P Pudil, M Shimbo, J SklanskyADVANCES IN PATTERN RECOGNITION 1876 677 - 686 2000年 [査読有り][通常論文]

The performance and speed of three classifier-specific feature selection algorithms, the sequential forward (backward) floating search (SFFS (SBFS)) algorithm, the ASFFS (ASBFS) algorithm (its adaptive version), and the genetic algorithm (GA) for large-scale problems are compared. The experimental results showed that 1) ASFFS (ASBFS) has better performance than does SFFS (SBFS) but requires much computation time, 2) much training in CA with a larger number of generations or with a larger population size, or both, is effective, 3) the performance of SFFS (SBFS) is comparable to that of CA with less training, and the performance of ASFFS (ASBFS) is comparable to that of GA with much training, but in terms of speed CA is better than ASFFS (ASBFS) for large-scale problems. - H Tenmoto, M Kudo, M ShimboADVANCES IN PATTERN RECOGNITION 1876 511 - 520 2000年 [査読有り][通常論文]

A genetic algorithm is employed in order to select the appropriate number of components for mixture model classifiers. In this classifier, each class-conditional probability density function can be approximated well using the mixture model of Gaussian distributions. Therefore, the classification performance of this classifier depends on the number of components by nature. In this method, the appropriate number of components is selected on the basis of class separability, while a conventional method is based on likelihood. The combination of mixture models is evaluated by a classification oriented MDL (minimum description length) criterion, and its optimization is carried out using a genetic algorithm. The effectiveness of this method is shown through the experimental results on some artificial and real datasets. - M Kudo, J SklanskyPATTERN RECOGNITION 33 1 25 - 41 2000年01月 [査読有り][通常論文]

A comparative study of algorithms for large-scale feature selection (where the number of features is over 50) is carried out. In the study, the goodness of a feature subset is measured by leave-one-out correct-classification rate of a nearest-neighbor (1-NN) classifier and many practical problems are used. A unified way is given to compare algorithms having dissimilar objectives. Based on the results of many experiments, we give guidelines for the use of feature selection algorithms. Especially, it is shown that sequential floating search methods are suitable for small- and medium-scale problems and genetic algorithms are suitable for large-scale problems. (C) 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. - 電子情報通信学会論文誌 J83-D-II 1266 - 1273 2000年 [査読無し][通常論文]
**スプライン曲面を用いた画像の2チャネル符号化** - M Kudo, J Toyama, M ShimboPATTERN RECOGNITION LETTERS 20 11-13 1103 - 1111 1999年11月 [査読有り][通常論文]

A new method is proposed for classifying sets of a variable number of points and curves in a multidimensional space as time series. Almost all classifiers proposed so far assume that there is a constant number of features and they cannot treat a variable number of features. To cope with this difficulty, we examine a fixed number of questions like "how many points are in a certain range of a certain dimension", and we convert the corresponding answers into a binary vector with a fixed length. These converted binary vectors are used as the basis for our classification. With respect to curve classification, many conventional methods are based on a frequency analysis such as Fourier analysis, a predictive analysis such as auto-regression, or a hidden Markov model. However, their resulting classification rules are difficult to interpret. Tn addition, they also rely on the global shape of curves and cannot treat cases in which only one part of a curve is important for classification. We propose some methods that are especially effective for such cases and the obtained rule is visualized. (C) 1999 Elsevier Science B.V. All rights reserved. - Masanori Kawakami, Mineichi Kudo, Jun Toyama, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 451 - 454 IEEE 1999年 [査読有り][通常論文]
- J. Konishi, S. Shimba, Jun Toyama, Mineichi Kudo, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 518 - 521 IEEE 1999年 [査読有り][通常論文]
- T. Gotoh, Mineichi Kudo, Jun Toyama, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 455 - 458 IEEE 1999年 [査読有り][通常論文]
- Naoto Masuyama, Mineichi Kudo, Jun Toyama, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 443 - 446 IEEE 1999年 [査読有り][通常論文]
- Hiroki Hayashi, Mineichi Kudo, Jun Toyama, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 447 - 450 IEEE 1999年 [査読有り][通常論文]
- Hiroshi Tenmoto, Mineichi Kudo, Masaru ShimboProceedings of 3rd International Conference on Conventional and Knowledge-Based Intelligent Electronic Systems 439 - 442 IEEE 1999年 [査読有り][通常論文]
- H Tenmoto, M Kudo, M ShimboPATTERN RECOGNITION 31 11 1627 - 1634 1998年11月 [査読有り][通常論文]

A new method to construct a piecewise linear classifier is proposed. This method selects an appropriate number of hyperplanes of a piecewise linear classifier by MDL (Minimum Description Length) criterion. This method constructs the hyperplanes so as to keep the local error rate for a training set under a threshold. The threshold is determined automatically by the MDL criterion so as to avoid overfitting of the classifier to the training set. This method showed results better than those of a previous method in some experiments. (C) 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. - 阿部 澄枝, 高橋 伸幸, 工藤 峰一, 飯田 陽一, 新保 勝生物物理 38 2 日本生物物理学会 1998年09月07日 [査読無し][通常論文]
- M Kudo, Y Torii, Y Mori, M ShimboPATTERN RECOGNITION LETTERS 19 9 777 - 786 1998年07月 [査読有り][通常論文]

A discrimination method is proposed in which a class region is taken as a set of quasi convex hulls. A subclass method approximates a class region by a set of hyper-rectangles, while the proposed method uses a set of quasi convex hulls. The experimental results show the effectiveness of the proposed method. (C) 1998 Elsevier Science B.V. All lights reserved. - 天元 宏, 工藤 峰一, 新保 勝電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 98 127 39 - 43 一般社団法人電子情報通信学会 1998年06月19日 [査読無し][通常論文]

混合分布モデルを用いた識別規則において、最適な混合数の選択法を提案する。従来の多くの選択法はクラス毎に混合数を決定している。しかし, 本来の目的はクラス間の識別であることから、特に限られた訓練サンプル数の下では、クラス間の関係を考慮して混合数を決定することが有効である。本研究の選択基準は識別指向のMDL(Minimum Description Length;最小記述長)原理による。この計算法では、識別規則の訓練サンプル集合に対する誤識別を記述するのに必要なビット長と、モデル(識別規則)自身の複雑さを記述するビット長の和を評価する。これらのトレードオフにより、最適な混合数を決定する。人工データおよび実データに対する実験で提案法の有効性を確認した。 - 工藤 峰一, 溝井 俊明, 新保 勝電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 98 127 25 - 32 一般社団法人電子情報通信学会 1998年06月19日 [査読無し][通常論文]

オフライン手書き漢字認識を高速・高信頼度で行う手法の一部として, 知識に基づいた動的ストローク抽出方法を提案する。研究の目標は、この方法で抽出されたストローク情報に基づいて、対象文字が予測文字と同字種であるか否かを判定する(同定する)ことである。本報告では、動的マッチングにより、モデル文字の特徴点集合の相対位置関係を保存しつつ対象文字の特徴点集合を抽出する。さらに、特徴点を繋ぐことでストロークを推定し、推定したストロークを再び動的マッチングにより、より正確に対象文字のストロークに一致させる。最後に、ストローク間の平行性、各ストロークの直線性、はみ出しの具合、などのモデル文字の知識との照合により検出したストロークの妥当性を評価する。この方法により、必ず同定できるとは限らないものの、同定した場合、非常に誤りの少ない判定が可能になる。 - XZ Li, M Kudo, J Toyama, M ShimboIEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E81D 5 457 - 463 1998年05月 [査読無し][通常論文]

Many image-processing techniques are based on texture features or gradation features of the image. However, Landsat images are complex; they also include physical features of reflection radiation and heat radiation from land cover. In this paper, we describe a method of constructing a super-resolution image of Band 6 of the Landsat TM sensor, oriented to analysis of an agricultural area, by combining information (texture features, gradation features, physical features) from other bands. In this method, a knowledge-based hierarchical classifier is first used to identify land cover in each pixel and then the least-squares approach is applied to estimate the mean temperature of each type of land cover. By reassigning the mean temperature to each pixel, a finer spatial resolution is obtained in Band 6. Computational results show the efficiency of this method. - M Kudo, F Taniguchi, H Tenmoto, M Shimbo1998 SECOND INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ELECTRONIC SYSTEMS, KES '98, PROCEEDINGS, VOL 2 216 - 220 1998年 [査読有り][通常論文]

Some initial component densities are compared in a mixture model for pattern recognition. The EM algorithm is widely adopted in construction of a mixture density for approximating a class-conditional density. However the algorithm is very sensitive to the number of component densities and the initial component densities themselves. The initial component densities are obtained by a clustering method. We report the results of comparison between clustering methods yielding son-overlapping clusters and methods yielding overlapping clusters. - M Kudo, H Tenmoto, S Sumiyoshi, M ShimboFOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2 1 870 - 872 1998年 [査読有り][通常論文]

A classifier based an a mixture model is proposed The EM algorithm for construction of a mixture density is sensitive to the initial densities. It is also difficult to determine the optimal number of component densities. In this study we construct a mixture density on the basis of a hyper-rectangles found in the subclass method, in which the number of components is determined automatically. Experimental results show the effectiveness af this approach. - Y Mori, M Kudo, J Toyama, M ShimboFOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2 1 1724 - 1727 1998年 [査読有り][通常論文]

A method for visualizing the structure of classes using a graph is proposed. Unlike the previous approaches of mapping data points onto a plane, this method represents a few sets of data points as nodes of the graph. From the sizes of the nodes and the widths of the links connecting the nodes, we can know how complex the structure is and to what degree the classes are separable. The experimental results show that the approach is effective for revealing a hidden structure of classes in a high-dimensional space. - Proceedings of International Conference on Signal Processing and Communications 169 - 172 1998年 [査読無し][通常論文]
**Enhancing AVHRR Imagery to Estimate NDVI** - Proceedings of First International Conference on Geospatial Information in Agriculture and Forestry I 613 - 620 1998年 [査読無し][通常論文]
**Enhancement of Low Spatial Resolution Image with Wavelet Transform** - S Yanagi, M Kudo, M Shimbo1998 SECOND INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ELECTRONIC SYSTEMS, KES '98, PROCEEDINGS, VOL 2 230 - 235 1998年 [査読有り][通常論文]

In this paper, we show a positive result. for learnability of all arbitrary Disjunctive Normal Form (DNF) formula. We propose a learning: algorithm that requires a polynomial number of examples,les in the size of an unknown formula under Probably Approximately Correct (PAC) learning: with a subset query, while it is not polynomial time. Our algorithm is based on Valiant's well-known approach with respect to monotone-DNF. - M Sato, M Kudo, J Toyama, M ShimboKYBERNETIKA 34 4 467 - 472 1998年 [査読無し][通常論文]

Although a nonlinear discrimination function may be superior to linear or quadratic classifiers, it is difficult to construct such a function. In this paper, we propose a method to construct a nonlinear discrimination function using Legendre polynomials. The selection of an optimal set of Legendre polynomials is determined by the MDL (Minimum Description Length) criterion. Results using many real data show the effectiveness of this method. - M Kudo, J SklanskyKYBERNETIKA 34 4 429 - 434 1998年 [査読無し][通常論文]

Needs of feature selection in medium and large problems increases in many fields including medical and image processing fields. Previous comparative studies of feature selection algorithms are not satisfactory in problem size and in criterion function. In addition, no way has not shown to compare algorithms with different objectives. In this study, we propose a unified way to compare a large variety of algorithms. Our results show that the sequential floating algorithms promises for up to medium problems and genetic algorithms for medium and large problems. - H Tenmoto, M Kudo, M ShimboKYBERNETIKA 34 4 479 - 484 1998年 [査読無し][通常論文]

We propose a new method to construct piecewise linear classifiers. This method constructs hyperplanes of a piecewise linear classifier so as to keep the correct recognition rate over a threshold for a training set. The threshold is determined automatically by the MDL (Minimum Description Length) criterion so as to avoid overfitting of the classifier to the training set. The proposed method showed better results in some experiments than a previous method. - 1998年 [査読無し][通常論文]
**Kudo, M. and Sklansky, J.:"Classifier-Independent Feature Selection for Two-stage Feature Selection",Advances in Pattern Recognition, Lecture Notes in Computer Science, 1451, 548-554 (A. Amin, D. Dori, P. Pudil and H. Freeman(eds.), Springer), (1998)** - 1998年 [査読無し][通常論文]
**Tenmoto, H., Kudo, M. and Shimbo, M.:"MDL-based selection of the number of components in mixture models for pattern classification", Advances in Pattern Recognition, Lecture Notes in Computer Science, 1451, 831-836(A. Amin, D. Dori, P. Pudil and H. Fre・・・**

Tenmoto, H., Kudo, M. and Shimbo, M.:"MDL-based selection of the number of components in mixture models for pattern classification", Advances in Pattern Recognition, Lecture Notes in Computer Science, 1451, 831-836(A. Amin, D. Dori, P. Pudil and H. Freeman(eds.), Springer), (1998) - 1998年 [査読無し][通常論文]
**Sato, M., Kudo, M., Toyama, J. and Shimbo, M.:"Feature Selection for a Nonlinear Classifier",Advances in Pattern Recognition, Lecture Notes in Computer Science, 1451, 555-563(A. Amin, D. Dori, P. Pudil and H. Freeman(eds.), Springer), (1998)** - Hiroshi Tenmoto, Mineichi Kudo, Masaru ShimboAdvances in Pattern Recognition, Joint IAPR International Workshops SSPR '98 and SPR '98, Sydney, NSW, Australia, August 11-13, 1998, Proceedings 831 - 836 Springer 1998年 [査読有り][通常論文]
- Maiko Sato, Mineichi Kudo, Jun Toyama, Masaru ShimboAdvances in Pattern Recognition, Joint IAPR International Workshops SSPR '98 and SPR '98, Sydney, NSW, Australia, August 11-13, 1998, Proceedings 555 - 563 Springer 1998年 [査読有り][通常論文]
- Mineichi Kudo, Jack SklanskyAdvances in Pattern Recognition, Joint IAPR International Workshops SSPR '98 and SPR '98, Sydney, NSW, Australia, August 11-13, 1998, Proceedings 548 - 554 Springer 1998年 [査読有り][通常論文]
- 谷口 文威, 工藤 峰一, 村井 哲也, 新保 勝全国大会講演論文集 54 1 395 - 396 一般社団法人情報処理学会 1997年03月12日 [査読無し][通常論文]

パターン認識は「対象の特徴を抽出し、対象を特徴空間上のベクトルとしたものを識別規則を用いて分類する」ものである。特徴空間において分類すべき集合をクラスと呼ぶ。従来の識別規則は通常、すべてのサンプルを強制的にいづれかのクラスに割り当てる。しかし、限られた特徴空間では本質的に識別不能なサンプルが存在し、このような強制割り当ては好ましくない場合がある。本稿では、ラフ集合理論における下近似および上近似を用いてタラス領域のかなり確かな部分と曖昧性を含む部分を抽出する. これにより、曖昧な領域に落ちたサンプルの判定を延期し、常に判定を行うとは限らないが判定した場合にはかなりの確率で正しい判定を行う識別規則を構成する。下近似および上近似の推定は複数の識別規則の判定結果を統合することで行う。これは、近年研究されてきている識別子の統合の一つの方法論も与える。 - Proceedings of the Seventh Conference of the International Association for the Development of Interdisciplinary Research (AIDRI) 135 - 138 1997年 [査読無し][通常論文]
**A Rough-Set-Based Approach to Estimation of Class Regions in Pattern Recognition** - F Taniguchi, M Kudo, M ShimboFIRST INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ELECTRONIC SYSTEMS, PROCEEDINGS 1997 - KES '97, VOLS 1 AND 2 373 - 377 1997年 [査読有り][通常論文]

A technique to iind sure and ambiguous regions in a class is proposed. These regions are defined by lower approximations in rough set theory. Outputs of many classifiers are combined in order to make such lower approximations and give class labels to them. As an application of this technique; a classifier with few misclassification is proposed. - M Kudo, S Yanagi, M ShimboPATTERN RECOGNITION 29 4 581 - 588 1996年04月 [査読有り][通常論文]

A randomized algorithm is proposed for solving the problem of finding hyper-rectangles, sufficiently approximating the true region in each class. This method yields a suboptimal solution, but is more efficient than previous methods. The performance is analysed based on a criterion of PAC (Probably Approximately Correct) learning. Experimental results show that the proposed method can solve large problems which were not able to be solved previously. - Mineichi Kudo, Masaru ShimboProceedings - International Conference on Pattern Recognition 2 886 - 890 1996年 [査読有り][通常論文]

The MDL (minimum description length) criterion is used to select the best classifier among several types of classifiers on a given pattern recognition problem. Unlike previous studies, our technique can compare a wide variety of classifiers if we know a combinational property, viz., the Vapnick-Chervonenkis (VC) dimension. Three classifiers are compared using this criterion. Experimental results show the effectiveness of the method. © 1996 IEEE. - M Kudo, K Mizukami, Y Nakamura, M ShimboPATTERN RECOGNITION LETTERS 17 1 77 - 82 1996年01月 [査読有り][通常論文]

Membership queries are realized approximately in a character recognition problem using an artificial sample generator. A user plays a role in an oracle to answer membership queries. This realization enables a supplementary learning of the discrimination rules. By an experiment, the recognition rate was raised to 98.0 % from 97.3 %. - 電子情報通信学会論文誌 D--II 378 - 384 1995年 [査読無し][通常論文]
**ボルテラ級数を用いた非線形画像復元** - 電子情報通信学会論文誌 D--II 729 - 736 1994年 [査読無し][通常論文]
**統計的弛緩法による図形分節モデル** - M KUDO, M SHIMBOPATTERN RECOGNITION 26 6 891 - 901 1993年06月 [査読有り][通常論文]

A new technique is proposed to select features out of all available ones on the basis of structural indices of categories. In terms of hyper-rectangles including as many training samples of a category as possible, two characteristic indices are calculated which summarize its underlying distribution of samples. The hyper-rectangles and the indices are available in evaluating the degree of importance of features, and are used to increase the discrimination rates of discrimination rules by removing redundant features. The running time of the algorithm is linear order in the number of features. Experiments on artificial and real data attests its effectiveness. - M KUDO, S KITAMURAABE, M SHIMBO, Y IIDACOMPUTER APPLICATIONS IN THE BIOSCIENCES 8 4 367 - 376 1992年08月 [査読有り][通常論文]

The signals that direct the excision of introns from mammalian pre-mRNA are not yet well understood. However, at least three kinds of signals-5'-splice site signals, 3'-splice site signals and branch point signals-play important roles in the excision of introns. In the present paper we treat only the 5'-splice sites. In addition to a consensus sequence for 5'-splice signals, several methods have been proposed, based on a statistical model, and used to analyze relative importance of each nucleotide at each position. In our approach a nucleotide sequence is regarded as a string with symbols of 'A', 'T', 'G' and 'C'; important substrings of 5'-splice site sequences, called pattern sequences, are extracted. A pattern sequence expresses which nucleotide is needed at a limited number of positions around the 5'-splice site. It is observed that a particular pattern sequence matches predominantly 5'-splice site sequences nearest to the 5'-end of a gene and another pattern sequence matches predominantly the second nearest ones. Moreover, it is confirmed that the pattern sequences accurately predict authentic 5'-splice sites for unknown genes and explain some mutation examples. - Mineichi Kudo, Sumie Kitamura-Abe, Masaru Shimbo, Yôichi LidaBioinformatics 8 4 367 - 376 1992年08月 [査読有り][通常論文]

The signals that direct the excision of introns from mammalian pre-mRNA are not yet well understood. However, at least three lands of signals-5′-splice site signals, 3′-splice site signals and branch point signals -play important roles in the excision of introns. In the present paper we treat only the 5′-splice sites. In addition to a consensus sequence for 5′-splice signals, several methods have been proposed, based on a statistical model, and used to analyze relative importance of each nucleotide at each position. In our approach a nucleotide sequence is regarded as a string with symbols of 'A', 'T' 'G' and 'C' important substrings of 5′-splice site sequences, called pattern sequences, are extracted. A pattern sequence expresses which nucleotide is needed at a limited number of positions around the 5′-splice site. It is observed that a particular pattern sequence matches predominantly 5′-splice site sequences nearest to the 5′-end of a gene and another pattern sequence matches predominantly the second nearest ones. Moreover, it is confirmed that the pattern sequences accurately predict authentic 5′-splice sites for unknown genes and explain some mutation examples. © 1992 Oxford University Press. - M KUDO, M SHIMBOIEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS 19 5 1194 - 1199 1989年09月 [査読有り][通常論文]
- M KUDO, M SHIMBOPATTERN RECOGNITION 21 4 401 - 409 1988年 [査読有り][通常論文]
- M KUDO, Y IIDA, M SHIMBOCOMPUTER APPLICATIONS IN THE BIOSCIENCES 3 4 319 - 324 1987年11月 [査読有り][通常論文]
- Mineichi Kudo, Yôichi Lida, Masaru ShimboBioinformatics 3 4 319 - 324 1987年11月 [査読有り][通常論文]

The signals which direct the excision of introns from eukaryotic pre-mRNA are not yet well understood. In order to define the signals for 5'-splice sites of mRNA splicing, nucleotide sequences including 5'-splice junctions of mammalian pre-mRNAs are analysed by means of syntactic pattern analysis. Taking this approach, we infer the grammatical rules which specify 5'-splice sites and construct a finite automaton which is the recognizer of the nucleotide sequences at 5'-splice sites. By scanning the automaton along nucleotide sequences, we can identify the positions of 5'-splice junctions with a degree of discrimination of up to 94-97% in the known genes, while the degree of prediction is in the range 50-55% in new genes. © 1987 IRL Press Limited.

- 安井拓未, 中村篤祥, 中村篤祥, 田中章, 田中章, 工藤峰一, 工藤峰一 情報処理学会研究報告(Web) 2017 (MUS-116) Vol.2017‐MUS‐116,No.15,1‐4 (WEB ONLY) 2017年08月17日 [査読無し][通常論文]
- BACKHUS Jana, TAKIGAWA Ichigaku, IMAI Hideyuki, KUDO Mineichi, SUGIMOTO Masanori 人工知能学会全国大会論文集(CD-ROM) 30th (0) ROMBUNNO.3E4‐3 -3E43 2016年 [査読無し][通常論文]
- 渡辺 僚, 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115 (323) 167 -174 2015年11月26日 [査読無し][通常論文]
- Xavier Llado, Atsushi Imiya, David Mason, Constantino Carlos Reyes-Aldasorod, Kazuaki Aoki, Mineichi Kudo, Yu-Jin Zhang, Vasileios Argyriou PATTERN RECOGNITION LETTERS 54 I -I 2015年03月 [査読無し][通常論文]
- 渡辺 僚, 中村 篤祥, 工藤 峰一 人工知能基本問題研究会 96 29 -34 2015年01月13日 [査読無し][通常論文]
- Xavier Llado, Atsushi Imiya, David Mason, Constantiono Carlos Reyes Aldaraso, Kazuaki Aoki, Mineichi Kudo, Yu-Jin Zhang, Vasileios Argyriou PATTERN RECOGNITION LETTERS 48 2 -7 2014年10月 [査読無し][通常論文]
- 田畑 公次, 中村 篤祥, 工藤 峰一 人工知能基本問題研究会 92 35 -40 2014年01月30日 [査読無し][通常論文]
- 渡辺 僚, 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 (198) 9 -16 2013年09月03日 [査読無し][通常論文]
- 花田 博幸, 中村 篤祥, 工藤 峰一 人工知能基本問題研究会 89 25 -30 2013年02月28日 [査読無し][通常論文]
- 打矢 泰志, 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 108 (363) 213 -218 2008年12月11日 [査読無し][通常論文]

multi-armed bandit問題とは、異なるK個のスロットマシンから1台のマシンを選択するという試行を繰り返し行う状況において、総合利得を最大化するようにマシンを選択する問題である。ほとんどの従来手法では各スロットマシンから得られる報酬は確率的に定まるという仮定のもとに分析が行われてきた。一方、Auerらは報酬に確率的な仮定をおかない一般的な場合を考え、損失の上界が理論的に保証されたアルゴリズムを示した。本報告では、この問題を一度に複数のスロットマシンを選択できるように拡張し、分散投資効果について理論的に分析する。 - 高橋 哲自, 工藤 峰一, 中村 篤洋 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 108 (363) 37 -41 2008年12月11日 [査読無し][通常論文]

クラス領域の近似において凸包は有効であるものの,高次元で凸包を構成するのは計算量的に困難である.本稿では,次元に線形な計算量で済む近似凸包の構成法を提案する.加えて,近似凸包を含む多面体領域を用いた識別子の族から,与えられた問題に有効な識別子を選択する基準を検討する. - 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. COMP, コンピュテーション 108 (237) 9 -16 2008年10月03日 [査読無し][通常論文]

音のような長さ付きシンボルの列に対するアライメントとしてパッキングアライメントを提案する。更に音に重なりのある一般的な楽曲のモデルとして、終了位置(時間)と長さが付いたシンボルの列を考え、パッキングアライメントをそのような列に適用できるように拡張する。パッキングアライメントを用いることにより、曲のなんとなく似ている部分を自動的に抽出することが可能であると考えられる。バッハの人気曲を使った実験によれば、明らかには似ていると言えない、なんとなく似ている部分が実際に抽出された。 - 中村 篤祥, 工藤 峰一 数理解析研究所講究録 1599 (0) 162 -169 2008年05月 [査読無し][通常論文]
- 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. COMP, コンピュテーション 107 (219) 35 -42 2007年09月13日 [査読無し][通常論文]

複雑な領域を、複数の単純な領域の和集合で表現する方法は、わかりやすく扱いやすいため様々な分野で用いられる。工藤らは、多次元実数空間において、与えられた正例の部分集合で、それをすべて含み与えられた負例を1つも含まない超矩形が存在するような極大集合(部分クラス)をすべて求める、部分クラス問題を考えた[2],[4]。しかし、この問題を解く既存アルゴリズムの時間計算量が大き過ぎるため、ランダマイズド法によりできるだけ多くの重要な部分クラスを求める方法がとられてきた。本稿では、部分クラス族のVC次元に計算時間が依存し、VC次元が低い部分クラス族ほど高速に列挙できる、部分クラス列挙アルゴリズムを提案する。UCI Machine Learning Repositoryのデータセットを使った実験により、提案アルゴリズムが小規模な実データへ適用可能であることが検証された。 - 戸坂 央, 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. DE, データ工学 107 (114) 7 -12 2007年06月21日 [査読無し][通常論文]

ラベル付き木は計算機ネットワークやWebマイニング,バイオインフォマティクス,XML文書マイニング等様々な分野で扱われている.本稿ではこれらのデータからの基礎的なマイニング手法として,類似する部分木が頻出する部分木を発見する問題を扱う.問題の定式化を行った後に,この問題を効率良く解くアルゴリズムを提案する.実際のWebページを用いた実験により提案アルゴリズムが実用的な速度で動作することを確認した.また,Webページからのデータレコード抽出における予備実験では有望な結果が得られた. - 斉藤 智哉, 中村 篤祥, 工藤 峰一 情報科学技術レターズ 5 (0) 1 -4 2006年08月21日 [査読無し][通常論文]
- 斉藤 智哉, 中村 篤祥, 工藤 峰一 数理解析研究所講究録 1489 (0) 216 -222 2006年05月 [査読無し][通常論文]
- 瀧川 一学, 工藤 峰一 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 104 (524) 37 -42 2004年12月17日 [査読無し][通常論文]

与えられた正例の部分集合の最小包含球のうち,負例を一つも含まないような超球族をプロトタイプとして用いたノンパラメトリック多クラス識別法を提案する。本稿では軸平行超矩形族によるクラス被覆に基づく従来の部分クラス法を一般化して再定式化し,その枠組で定義を満たす部分クラス族を得る厳密アルゴリズムと効率的に実行可能な部分クラスを得る逐次追加型の確率的アルゴリズムを示し,その性質を議論する.また,罰則関数を用いたソフト識別への拡張方法を提示し,いくつかのデータでの予備実験の結果を示す. - 中村 篤祥, 工藤 峰一 情報処理学会研究報告. AL, アルゴリズム研究会報告 2004 (101) 67 -74 2004年10月14日 [査読無し][通常論文]

本論文では、与えられたラベル付き順序木の集合から、あらかじめ全ての木が含んでいることがわかっている特殊節点を全て含む部分木で、頻出するものを効率的に数えあげる方法を提案する。提案アルゴリズムはZakiのTreeMinerアルゴリズム[9]を、制約を育たすものだけ候補として効率的に生成するように改造したものである。また、同じラベルが多く存在する場合に大量のメモリを使用するという問題に対処する方法についても提案する。検索エンジンで集められたWebページに含まれる、レストランの名前と評判情報を含む頻出部分構造見つける問題に、提案アルゴリズムを適用することにより得られた結果につても報告する。 - 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. COMP, コンピュテーション 104 (339) 7 -14 2004年10月08日 [査読無し][通常論文]

本論文では、与えられたラベル付き順序本の集合から、あらかじめ全ての本が含んでいることがわかっている特殊節点を全て含む部分木で、頻出するものを効率的に数えあげる方法を提案する。提案アルゴリズムはZakiのTreeMinerアルゴリズム[9]を、制約を満たすものだけ候補として効率的に生成するように改造したものである。また、同じラベルが多く存在する場合に大量のメモリを使用するという問題に対処する方法についても提案する。検索エンジンで集められたWebページに含まれる、レストランの名前と評判情報を含む頻出部分構造見つける問題に、提案アルゴリズムを適用することにより得られた結果につても報告する。 - 瀧川 一学, 外山 淳, 工藤 峰一 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 103 (296) 113 -118 2003年09月09日 [査読無し][通常論文]

近年注目を集めている音声信号などの信号源分離において,マイクの個数が音源数より少ない場合などの系が劣決定な場合は,系を同定した後,その線形系の解を一意に決める必要がある.これには各時刻での最小l_1ノルム解がよく用いられており,通常各時刻で線形計画問題を解く必要がある.本研究では音声のようにサンプル数の大きいデータに対し,この最小l_1ノルム系列の構成を効率よく行う手法について検討する. - 河田 岳大, 中村 篤祥, 外山 淳, 工藤 峰一 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 103 (295) 1 -5 2003年09月01日 [査読無し][通常論文]

文字認識システムでは,認識精度を向上させるために言語情報を用いた誤り検出・訂正の方法が提案されている.我々はN-gram確率を用いた手法に着目し,両方向N-gram確率を用いて1文字の誤字を検出する方法を提案する.本方式では文脈確率と両方向N-gram確率のそれぞれの意味から自然に導出される方法を用いる.また,推定パターンに一致しないものについて原因と検出の可能性を検討する. - 設樂 洋爾, 中村 篤祥, 工藤 峰一 電子情報通信学会技術研究報告. PRMU, パターン認識・メディア理解 103 (295) 7 -11 2003年09月01日 [査読無し][通常論文]

データベースに潜む傾向をルールの形で抽出する手法は,可読性の高い出力が可能であり,デークマイニングの手法として有効である.精度の高い識別子を構成し,そこから規則を得る手法は多く提案されている.しかし,必ずしも精度の高い識別子から興味深いルールが得られるとは限らない.そこで,データに基づく属性値の抽象化とスクリーニングを組み合わせ,可読性の高いルールを抽出する手法を提案する.また,識別精度とルールの興味深さについて考察を行う.

- 2019年03月 一般社団法人 電子情報通信学会 フェロー称号授与

受賞者: 工藤 峰一 - 2018年11月 電子情報通信学会 研究奨励賞

赤外線センサーネットワークからの歩容情報抽出と個人認証受賞者: 椿野駿平;工藤峰一 - 2013年08月 The World Congress on Engineering Best Student Paper Award

Structual Change Point Detection for Evolutional Networks受賞者: 工藤 峰一 - 2009年 Outstanding Presentation Award
- 2005年 Best Paper Award
- 2001年12月 Pattern Recognition Society The Twenty-Seventh Annual Pattern Recognition Society Award

受賞者: Mineichi Kudo;Jack Sklansky - 2001年 The Twenty-Seventh Annual Pattern Recognition Society Award

- 文部科学省：科学研究費補助金(基盤研究(B))研究期間 : 2002年 -2005年代表者 : 工藤 峰一, 林 裕樹, 天元 宏, 外山 淳, 今井 英幸, 村井 哲也, 田中 章, 中村 篤祥, 天元 宏, 林 裕樹本課題の目的とした「大規模パターン認識系の実現」に対して、交付申請時に挙げた三種類課題に関する成果は以下の通りである。1.データ数に関する規模の問題:高い識別性能(汎化性能)を達成する問題から、性能を大きく損なわない範囲で計算量を削減する問題へと、問題の設定を移し、データ数に対して少ない計算量で学習する方式を考察した。特に、1)代表的な識別手法の一つである、k最近隣法をより高速に行う手法を開発し、2)データ数に対して線形の時間で構成できるプロトタイプの構成法を検討した(今後、継続検討).2.特徴数に関する規模の問題:識別性能を維持したまま識別に本質的な特徴のみを残す「特徴選択」手法に着目して各種の提案を行った。これまで提案されているて多くの手法では規模の増大に計算量的に十分対応できない。そこで,本研究では,識別に明らかに不必要な特徴を先に除去することを「識別子に依存しない特徴選択」と位置付け,その方法論を検討した.また,提案手法により検査すべき特徴の個数を予め減らし,その後従来の代表的な方法を適用することで,全体として高速な「識別子を特定した」特徴選択が実現できることを示した.基本的な方法論として二通り、加えて、選ぶべき特徴数の選択に関する研究を行った。3.クラスの数の多い場合の検討(クラス数に関する規模の問題):漢字認識のようにカテゴリ数が非常に多い場合,識別の困難さが増すことは明らかである.本研究では、新しい視点として,識別の困難なカテゴリのグループを(スーパークラスとして)まとめることで,より高性能な識別ができること、さらに、各スーパークラスを視覚化することで,困難さの所在を明らかできることを示した。また、実際の効果を類似漢字識別問題において示した.さらに、これまで検討してきたサブクラスを考慮に入れることで、与えられた分類問題の本質的を総合的に捉えられることを示した。
- 文部科学省：科学研究費補助金(奨励研究(A))研究期間 : 1995年 -1995年代表者 : 工藤 峰一所属性質問を用いた区分的線形識別関数の追加学習法の開発を目的とした。本研究では先だって、従来提案されてきた区分的線形識別関数の構成法の見直しを行った。これにより、従来の構成法が特徴数が多い場合に有効な識別規則を構成し得ないことがわかり、識別性能を向上させる新しい手法を開発し、性能を計算機シミュレーションにより確認した。また、所属性質問を用いて区分的線形識別関数の性能を向上させるために、簡単な人工データの場合に人工サンプルを用いた追加学習を試み、提案方法の基本的有効性を検討した。今後は、新しく開発した区分的線形識別関数構成法に対して所属性質問を授用し、文字認識あるいは音声認識において効果的な追加学習法を検討する予定である。
- 研究期間 : 1977年
**パターン認識に関する包括的研究** - 研究期間 : 1977年
**Pattern Recognition and Its Related Fields** **ユビキタス時代の柔らかなパターン認識、真に使いやすいインターフェースとは、学習アルゴリズムの性能と効率****Pattern Recognition in Ubiquitaous Era, What is really friendly interface ? Performace and Efficiency Analysis of Learning Algorithms\**

- 卒業論文開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部
- 大学院共通授業科目（一般科目）：自然科学・応用科学開講年度 : 2020年課程区分 : 修士課程開講学部 : 大学院共通科目キーワード : 機械学習、統計的方法，教師あり学習、パターン認識、教師なし学習、回帰、線形モデル
- 情報認識学特論開講年度 : 2020年課程区分 : 修士課程開講学部 : 情報科学研究科キーワード : 機械学習、統計的方法，教師あり学習、パターン認識、教師なし学習、回帰、線形モデル
- 情報認識学特論開講年度 : 2020年課程区分 : 修士課程開講学部 : 情報科学院キーワード : 機械学習、統計的方法，教師あり学習、パターン認識、教師なし学習、回帰、線形モデル
- コンピュータサイエンス特別研究開講年度 : 2020年課程区分 : 博士後期課程開講学部 : 情報科学研究科キーワード : コンピュータサイエンス、知識処理、数理科学、人工知能、機械学習
- 情報認識学特論開講年度 : 2020年課程区分 : 博士後期課程開講学部 : 情報科学研究科キーワード : 機械学習、統計的方法，教師あり学習、パターン認識、教師なし学習、回帰、線形モデル
- 情報認識学特論開講年度 : 2020年課程区分 : 博士後期課程開講学部 : 情報科学院キーワード : 機械学習、統計的方法，教師あり学習、パターン認識、教師なし学習、回帰、線形モデル
- インターンシップⅠ開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 想像的人材育成、実践的人材育成、就業体験、国内外インターンシップ
- インターンシップⅡ開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 想像的人材育成、実践的人材育成、就業体験、国内外インターンシップ
- 情報代数とオートマトン開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 集合、命題論理、述語論理、証明技法、関係、写像、オートマトン、形式言語
- 情報代数学開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 集合、命題論理、述語論理、証明技法、関係、写像、オートマトン、形式言語
- 科学技術英語演習開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 英語読解力，英語作文力，科学技術英語表現
- 計算理論開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 命題論理，述語論理，非古典論理
- 記号論理開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 命題論理，述語論理，非古典論理
- コンピュータサイエンス演習Ⅰ開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 集合、論理、関係、写像、オートマトン、形式言語、数の表現（2 進数，補数，浮動小数点），2進数の演算アルゴリズム，計算モデルと計算量，ベクトル空間，行列計算（ガウスの消去法，LU 分解），固有値計算，誤差つき計算
- 情報理工学演習Ⅱ開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 集合、論理、関係、写像、オートマトン、形式言語、数の表現（2 進数，補数，浮動小数点），2進数の演算アルゴリズム，ベクトル空間，行列計算（ガウスの消去法，LU 分解），固有値計算，誤差つき計算
- 情報理論開講年度 : 2020年課程区分 : 学士課程開講学部 : 工学部キーワード : 情報量、エントロピー、情報源、通信路、符号化、データ圧縮