Masaharu YOSHIOKA
Faculty of Information Science and Technology, Computer Science and Information Technology, Knowledge Software Science
Professor
Wikipedia is one of the largest knowledge resources available via the Internet. Several methods have been proposed for extracting knowledge and information from Wikipedia, but methods based on the Wikipedia category structure cannot use the whole structure because of the complexity of the relationships among categories. In this paper, we briefly review our previous research on analyzing Wikipedia categories and introduce a framework called "Wikipedia Category Ontology" (WCO) that aims to serve as a basis for interpreting the Wikipedia category structure. It is based on a classification of category types and relationship types and is available online in the form of Linked Open Data at http://wcontology.org/. We also demonstrate the system using Wikipedia category analysis scenarios.
Wikipedia categories contain information useful for classifying Wikipedia pages, and classification methods using this category information have been proposed. However, because various types of information are mixed in Wikipedia categories, categories that are inappropriate to assign together may be attached to the same page. In this research, we select more appropriate classes for pages with multiple categories using the Wikipedia Category Ontology, which defines the types of Wikipedia categories and their parent-child relationships. Using this method, we rank the adequacy of the candidate classes for a page that belongs to multiple categories. To confirm the effectiveness of this method, we compare its results with the classification of SHINRA: Wikipedia Structuring Project.
In the NTCIR-15 QA Lab-PoliInfo-2, we are planning a Stance Classification Task that estimates whether each group in the Tokyo Metropolitan Assembly is for or against each bill. This paper describes the outline of the task and the construction of its data set. It then analyzes whether remarks expressing approval or disapproval of bills are present, and what forms of expression they take, in the 2018 minutes of the Tokyo Metropolitan Assembly, which are part of the data used in the task.
Many technical documents have been created and accumulated in companies across industries for the purpose of standardizing technologies and passing them on. These documents have fixed formats and can be searched with document retrieval systems. However, because many of them are written for specific fields, younger engineers who do not know the appropriate keywords find it difficult to locate documents related to their tasks, whereas experienced engineers who know that such documents exist find it easy. In this study, we conduct feasibility studies to reveal these problems, using the documents of a construction company as an example, and introduce the basic idea of a new technical document retrieval system that uses construction steps and conditions.
Stock price forecasting is performed in various forms, such as technical analysis and fundamental analysis using time series data and financial indicators, and text analysis using news information. Considering the increasing number of algorithmic trades based on technical analysis, we propose a method for predicting stock prices by machine learning using multiple technical indicators. In doing so, we classify and evaluate stocks according to whether technical analysis fits them well or poorly. We also consider how to combine multiple technical indicators.
Because Wikipedia is an online encyclopedia that covers a wide variety of topics, there have been several attempts to extract knowledge from its contents. Wikipedia categories are one type of information that provides a topical index for the articles, and they are organized in a hierarchical manner. However, because the hierarchical structure of Wikipedia categories has different characteristics from a hierarchy designed for knowledge representation, it cannot yet be fully and correctly utilized. In this paper, we propose a method for extracting a hierarchical structure whose characteristics are similar to those of a knowledge representation hierarchy.
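In this essay, we examine product liability for artifacts equipped with artificial intelligence that are used in open environments, such as self-driving cars. To discuss responsibility in open environments, we believe a framework is needed in which the various possible situations and their associated risks are recognized and appropriately shared between manufacturers and users. Taking the trolley problem as an example, we consider a framework that lets manufacturers and users appropriately recognize risks, and the role that artificial intelligence should play in realizing it.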
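Wikipedia, an encyclopedia on the Web, is a source of knowledge about a wide variety of subjects, but the quality of its information depends on its editors. In this study, we propose a method for analyzing the consistency of the Wikipedia category structure using structured information extracted by DBPedia, and we present examples of the analysis.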
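Wikipedia categories provide classifications that are useful to readers, and the categories are expressed as a lattice-like hierarchy. This hierarchy contains various kinds of relationships, including hierarchical relationships between concepts. In this presentation, we analyze characteristic patterns in the category hierarchy and, based on the results, discuss a Japanese Wikipedia ontology.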
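In this paper, for the task of collecting web pages through a search engine API, we propose a method that collects web pages that are as diverse as possible in terms of the topic distribution obtained with a topic model. We show that, by using the proposed method to collect web pages strongly related to topics in news and blogs, a more diverse topic distribution can be observed.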
We have been working on the project "Exploring knowledge for nanodevice development" and have proposed a framework for annotating useful information (e.g., source materials and evaluation parameters) for analyzing papers on nanocrystal device development. In this paper, we conduct clustering experiments on nanocrystal device research papers, once on papers annotated with automatic annotation results and once on non-annotated papers using a bag-of-words approach, and then compare the results to discuss the usefulness of the automatic annotation.
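In this paper, we apply topic identification with a topic model to time-series collections of news articles and of blog posts about the earthquake disaster, and we analyze the correlation of topics between news and blogs. In particular, we report the results of analyzing the correlation between how coherent the subject matter is within the sets of news articles and blog posts that closely correspond to each topic and the presence or absence of time-series bursts.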
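Just as a description of Tokyo does not add the explanation that it is in Japan, information that includes place names is often written without explicitly stating the containment relationships among those places. However, when searching for information related to these place names, such implicitly assumed relationships must be used. In this study, we propose a method for extracting from Wikipedia the place-name containment relationships that are useful for such searches.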
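In the development of nanocrystal devices, the manufacturing process strongly affects device properties, so repeated trial-and-error experiments are required. However, knowledge about this trial and error is rarely made explicit, so development depends on tacit knowledge, the "craftsmanship" of experts. In this study, we propose a system for utilizing experiment record data that supports making this tacit knowledge explicit, using records of device fabrication experiments as the information source.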
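In this research, we are developing NSContrast, which uses machine translation systems to perform contrastive comparison of multilingual news published from various countries. Because translation errors and differences in notation strongly affect the system's analysis, normalization is necessary. In this paper, we generate a translation dictionary using Wikipedia's interlanguage links, handle notation variants using redirect relationships, and evaluate the approach in actual operation.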
During design, a designer uses various computational tools, such as geometric modeling systems, analysis tools, and databases. To support these design processes, a system that can integrate such computational tools is required. For this purpose, we have been developing the Knowledge Intensive Engineering Framework (KIEF) system. KIEF enables the designer to build, evaluate, and modify design models on the system by providing him/her with a wide variety of engineering knowledge. A common ontology plays an important role in integrating tools and representing engineering knowledge. This paper describes the structure of the ontology in KIEF and how it is used in modeling processes on KIEF.
In this study, we propose a framework of an integrated modeling environment for design. Since a mechanical engineering design process requires various kinds of design object models, such as a geometric model, a kinematic model, and a finite element model, this environment should maintain consistency among these models. Furthermore, commercial design tools are used in practice in design processes, and these must be integrated as well. To deal with multiple design object models, we have formalized a concept called a metamodel. A metamodel is a model that represents relationships among the concepts used in various design object models. This paper proposes a framework for a pluggable metamodel mechanism that makes it possible to plug in existing design tools, support modeling on the metamodel, and maintain consistency among the tools' models. We describe a prototype system and illustrate it with an example of ball screw design.