Knowledge Data Engineering Lab.
Masaki Aono
Information retrieval, data mining, intelligent processing for 3-D/image/video/text, text mining

The Knowledge Data Engineering (KDE) Laboratory conducts research on massive, intelligent multimedia data processing, where multimedia ranges from 3D models, images, and video to text available on the Web or stored as collections of big data. Information retrieval and automatic annotation of multimedia (3D models, images, and video) based on machine learning are our two central areas of intelligent data processing, with feature extraction as our main concern. We also conduct research on the Semantic Web and time series data mining. In the Semantic Web, our chief interest is how to use ontologies wisely to make things machine-understandable and to connect two concepts that at first appear unrelated; ontology learning is also included. In time series data mining, we work with massive time series data acquired by sensors and loggers and attempt to find interesting correlations hidden inside the data. We are also embarking on potential applications of deep learning with our GPGPU machines.

Intelligent multimedia data processing

We are investigating high-performance, high-accuracy 3D shape retrieval and automatic 3D annotation, with the main focus on feature extraction. In addition, we have started investigating image annotation and captioning, together with applications of deep learning on GPGPU hardware. We have filed several patents on this research and have gained recognition through SHREC (the SHape REtrieval Contest). Over the past several years in SHREC, we achieved world No. 1 accuracy on several tracks, including general 3D object retrieval and 3D shape retrieval from 2D sketches. We also provide a 3D shape benchmark, the Toyohashi Shape Benchmark (TSB), for public research use.
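
To give a flavor of feature extraction for shape retrieval, the sketch below computes the classic D2 shape distribution (a histogram of distances between randomly sampled point pairs) as a simple retrieval descriptor. This is a standard textbook technique, not the laboratory's actual feature set; the point clouds and parameters are invented for illustration.

```python
import numpy as np

def d2_descriptor(points, n_pairs=10000, bins=32, rng=None):
    """D2 shape distribution: a histogram of distances between randomly
    sampled point pairs, usable as a simple 3D retrieval feature."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(d, bins=bins, range=(0, d.max() + 1e-9))
    return hist / (hist.sum() + 1e-12)  # normalize so models of different size compare

# Toy usage: a cube-like cloud versus a flattened copy of it.
cube = np.random.default_rng(0).random((500, 3))
flat = cube * np.array([1.0, 1.0, 0.01])
f1, f2 = d2_descriptor(cube, rng=1), d2_descriptor(flat, rng=1)
dist = np.linalg.norm(f1 - f2)  # dissimilar shapes -> distant descriptors
```

Retrieval then reduces to nearest-neighbor search over such descriptor vectors.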

Data mining

Using massive time series data, we investigate hidden rules among multiple series, potential correlations, optimization, and linear and non-linear regression models, drawing on diverse research tools including machine learning and multivariate analysis.
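
As a minimal illustration of finding a hidden relation between two series, the sketch below searches for the lag that maximizes Pearson correlation and then fits a linear regression at that lag. The data are synthetic stand-ins for sensor logs; the lab's actual tools are far richer.

```python
import numpy as np

# Hypothetical sensor series: y is x delayed by 3 steps, plus small noise.
rng = np.random.default_rng(42)
x = np.cumsum(rng.standard_normal(500))
y = np.roll(x, 3) + 0.1 * rng.standard_normal(500)

def best_lag(a, b, max_lag=10):
    """Return the shift of b that maximizes Pearson correlation with a."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        bb = np.roll(b, -lag)
        scores[lag] = np.corrcoef(a[max_lag:-max_lag], bb[max_lag:-max_lag])[0, 1]
    return max(scores, key=scores.get)

lag = best_lag(x, y)  # recovers the 3-step delay
# Least-squares linear regression of y on the lag-aligned x.
xa = np.roll(x, lag)
slope, intercept = np.polyfit(xa[10:-10], y[10:-10], 1)
```

The same lag-scan-then-regress pattern extends to many series and to non-linear models.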

Semantic Web

Ontology alignment and ontology learning based on metadata on the Web are our main concerns in this research, including medical applications and semantic search applications.

Web mining

Spam Web page mining for security risks, graph-theoretic link mining, and general Web mining seeking new applications for collective intelligence are included in this research.

Text mining

Estimating emotions, judgements, and evaluations from text has been the main focus of this research. Cross-media approaches, such as combining image and text to augment cross-media search, are also of interest. In addition, patent data retrieval and clustering have been investigated. Natural language processing technologies (in both English and Japanese) are naturally included here.
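
As a toy baseline for the emotion-estimation task, the sketch below scores text with a hand-made polarity lexicon and simple negation flipping. The word lists are invented for illustration; real systems use learned models and much larger resources.

```python
# A minimal lexicon-based sentiment sketch (lexicon invented for illustration).
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "poor", "terrible", "sad", "hate"}
NEGATORS = {"not", "never", "no"}

def sentiment_score(text):
    """Sum polarity-word contributions, flipping the next polarity word
    after a negator (e.g. 'not good' counts as negative)."""
    score, flip = 0, 1
    for tok in text.lower().split():
        word = tok.strip(".,!?")
        if word in NEGATORS:
            flip = -1
        elif word in POSITIVE:
            score += flip
            flip = 1
        elif word in NEGATIVE:
            score -= flip
            flip = 1
    return score
```

A positive score suggests positive sentiment, negative the reverse, zero neutral.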

Applied Mathematics and Network Lab.
Kyoji Umemura
Mitsuo Yoshida
Applied statistics, natural language processing, social media, e-learning

The importance of computer networks grows day by day. To make the best use of these networks, we need technologies for handling the data and content they carry. Our approach is to apply statistical processing methods and to verify their effectiveness by developing application systems. Our research fields include information retrieval, natural language processing, data mining, social network systems, and e-learning systems. Some systems have already become commercial products through joint research with companies.

E-learning system in network era

More and more video content suitable for education is available these days. We believe that computer technology can provide more effective content than simple video. One example is to remove the image of the lecturer and provide a clear image of the blackboard. This kind of content makes note-taking easier, because the lecturer's body tends to cover the characters on the blackboard.

Natural language processing for information retrieval

It is crucial for information retrieval systems to decide whether each term is important for retrieval. We have developed a method that extracts important terms from documents using statistical analysis. Unlike commonly used approaches, this method requires no dictionaries, only collections of data. The method is the result of joint research with a company and is used in commercial products.
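
One widely known dictionary-free statistic in this spirit is TF-IDF, which ranks terms using only the document collection itself. The sketch below is a generic illustration of that idea, not the proprietary method described above; the toy corpus is invented.

```python
import math
from collections import Counter

def tfidf_terms(docs, top_k=3):
    """Rank each document's terms by TF-IDF, computed purely from the
    collection (no external dictionary needed)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                      # document frequency of each term
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    results = []
    for toks in tokenized:
        tf = Counter(toks)
        scored = {w: (c / len(toks)) * math.log(n / df[w]) for w, c in tf.items()}
        results.append(sorted(scored, key=scored.get, reverse=True)[:top_k])
    return results

docs = [
    "shape retrieval uses shape features",
    "text mining extracts terms from text",
    "features and terms support retrieval",
]
top = tfidf_terms(docs)  # top[0][0] is the most distinctive term of doc 0
```

Terms that are frequent in one document but rare across the collection rise to the top, which is exactly the "important for retrieval" judgement the paragraph describes.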

Handling large scale data from social networks

We have been operating a data collection system for social media (Twitter), where a wide variety of information is posted. As a result, we can handle almost all geotagged tweets and retweets in the world. We are conducting research to make the most of this data; one example is estimating the location of tweets that lack location information from other information in the tweet.
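
One very simple signal for such location estimation is place names mentioned in the tweet text. The sketch below matches tokens against a tiny invented gazetteer; production systems combine many more signals (user profiles, social links, timing) and learned models.

```python
# Toy sketch: infer a tweet's location from place names in its text,
# using an invented mini-gazetteer of (latitude, longitude) entries.
GAZETTEER = {
    "toyohashi": (34.77, 137.39),
    "tokyo": (35.68, 139.69),
    "osaka": (34.69, 135.50),
}

def guess_location(tweet):
    """Return (lat, lon) of the first recognized place name, or None."""
    for tok in tweet.lower().replace("#", " ").split():
        coords = GAZETTEER.get(tok.strip(".,!?"))
        if coords:
            return coords
    return None

loc = guess_location("Beautiful morning in #Toyohashi today!")
```

Even this naive matcher geolocates a useful fraction of tweets, which is why text content is a standard feature in location inference.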

Language Data Mining and Algorithm Laboratory
Shigeru Masuyama
Natural language processing, automatic text summarization, information retrieval, question answering, analysis of causal expression and its application

Intellectual activity support using natural language processing

Text summarization, information retrieval, and question answering are fundamental technologies for taking advantage of the huge amount of machine-readable documents. In automatic text summarization, we have obtained systematic results on sentence compression that deletes adnominal clauses, utilizing dependency structure and statistical methods so that each sentence in the given text is shortened.
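
The core idea of clause-deletion compression can be sketched on a toy dependency representation: drop every subtree attached by an adnominal-clause relation. The token format and the "acl" label below are illustrative assumptions, not the lab's actual parser output, and real systems also use statistics to decide which clauses are safe to delete.

```python
# Toy dependency-based sentence compression: remove subtrees whose root
# attaches to its head via an adnominal-clause relation ("acl").
def compress(tokens):
    """tokens: list of (word, head_index, relation); the root's head is -1."""
    drop = {i for i, (_, _, rel) in enumerate(tokens) if rel == "acl"}
    changed = True
    while changed:  # propagate: a token whose head is dropped is dropped too
        changed = False
        for i, (_, head, _) in enumerate(tokens):
            if i not in drop and head in drop:
                drop.add(i)
                changed = True
    return " ".join(w for i, (w, _, _) in enumerate(tokens) if i not in drop)

# "The man who smiled left": "who smiled" is an adnominal clause on "man".
sent = [("The", 1, "det"), ("man", 4, "nsubj"), ("who", 3, "nsubj"),
        ("smiled", 1, "acl"), ("left", -1, "root")]
summary = compress(sent)  # -> "The man left"
```

The surviving tokens form a shorter sentence that preserves the main predication.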

Text mining and information extraction

Semantic processing is indispensable for achieving truly "smart" computers. We have proposed a general-purpose method for extracting an event and its cause, and applied it to the extraction of causal expressions for traffic accidents, performance factors of companies, and patent mining. We have also analyzed causal expressions and succeeded in classifying them into five basic patterns.

Algorithm engineering

As a study of algorithm design challenging the problems that have emerged with the arrival of the advanced information society of the 21st century, we study graph and network algorithms for the Internet and mobile communication; fault tolerance, network reliability, and distributed systems as foundations of the ubiquitous society; and operation control algorithms for automated guided vehicles (AGVs). Studies on scheduling for railways and for sports such as baseball have also begun.

Learning and Inference Systems Laboratory
Kazuho Watanabe
Bayesian inference, learning algorithm, rate-distortion theory, data visualization

Machine learning techniques are widely used in applications such as pattern recognition and robot control. We study fundamental theories of machine learning on the basis of statistical and information-theoretic methods, and apply them to data analysis problems.

Analysis and development of statistical learning methods

Bayesian inference provides a framework for solving learning and inference problems. We aim to analyze and develop learning and inference methods, and apply them to problems such as data analysis and visualization.
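
In standard textbook form (not specific to this laboratory's methods), the framework combines a prior $p(\theta)$ with the likelihood of the observed data $\mathcal{D}$ to give a posterior, from which predictions follow by averaging:

```latex
p(\theta \mid \mathcal{D})
  = \frac{p(\mathcal{D} \mid \theta)\, p(\theta)}
         {\int p(\mathcal{D} \mid \theta')\, p(\theta')\, \mathrm{d}\theta'},
\qquad
p(x_{\mathrm{new}} \mid \mathcal{D})
  = \int p(x_{\mathrm{new}} \mid \theta)\, p(\theta \mid \mathcal{D})\, \mathrm{d}\theta
```

Analyzing and approximating these integrals is where much of the algorithmic work in Bayesian learning lies.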

Rate-distortion theory (lossy data compression)

Rate-distortion functions show the minimum information content required for reconstructing compressed data under allowed distortion levels. We aim to evaluate rate-distortion functions of distortion measures used in practical learning algorithms and information sources modeling real data generation processes.
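
In the standard information-theoretic formulation, the rate-distortion function is the minimum mutual information over all reproduction channels meeting the distortion budget:

```latex
R(D) \;=\; \min_{p(\hat{x} \mid x)\,:\; \mathbb{E}[d(X,\hat{X})] \le D} I(X; \hat{X})
```

For example, for a Gaussian source with variance $\sigma^2$ under squared-error distortion, $R(D) = \tfrac{1}{2}\log(\sigma^2/D)$ for $0 < D \le \sigma^2$, and $R(D) = 0$ otherwise.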

Computational Linguistics Laboratory
Hitoshi Isahara
Natural Language Processing, Machine Translation, Lexical Semantics, Creative Content

Language is the core of human intellectual activity. We aim to realize computer systems that understand natural language as humans do, through the investigation of human linguistic functions. Toward this goal, we conduct the following studies.

Study to realize practical use of machine translation

Machine translation is one application of natural language processing technology. Although the accuracy of machine translation systems improves day by day, it is not yet perfect. We are studying techniques to make full use of machine translation systems in real-world translation processes, including translation of business documents, by standardizing input sentences, automatically acquiring and using translation dictionaries, and developing post-editing technology. We have also launched a new collaborative project with IT companies and local governments in Japan, aiming to enable even small tourism businesses to publish their own information to attract foreign visitors.

Study to acquire linguistic knowledge from real data

The meaning of a word should be determined by how it is used in actual documents. We are studying technology that automatically acquires semantic relationships between words from large amounts of data such as news articles and web documents; such information can be used to simulate human association. In addition, we are studying the extraction of salient words and phrases from documents. We also conduct research and development on advancing natural language processing technology using the results obtained.
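
A classic corpus-driven signal of word relatedness is pointwise mutual information (PMI) over co-occurrence counts. The sketch below illustrates the general idea on a tiny invented corpus; it is not the laboratory's specific method.

```python
import math
from collections import Counter

def pmi_pairs(sentences):
    """Pointwise mutual information for word pairs co-occurring in a
    sentence: log of observed co-occurrence over independence baseline."""
    word_c, pair_c, total = Counter(), Counter(), 0
    for s in sentences:
        toks = sorted(set(s.lower().split()))
        total += 1
        word_c.update(toks)
        for i in range(len(toks)):
            for j in range(i + 1, len(toks)):
                pair_c[(toks[i], toks[j])] += 1
    return {(a, b): math.log(c * total / (word_c[a] * word_c[b]))
            for (a, b), c in pair_c.items()}

sents = ["dogs bark loudly", "dogs bark at cats", "cats purr softly", "birds sing"]
pmi = pmi_pairs(sents)  # high PMI for pairs like ("bark", "dogs")
```

Pairs with high PMI co-occur far more often than chance predicts, which is the kind of association signal used to approximate human word association.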

Study to generate and interpret creative contents by computer models

Computing creative content is an important topic in artificial intelligence. Humans can easily understand multi-modal content such as comics and picture books, but it is difficult for computers to understand their stories and emotional information. To solve this problem, we have proposed several ways of analyzing the process of creation. Currently, this theme consists of three sub-themes: building an application with a graphical user interface for supporting novel writing based on two types of templates; analyzing relationships between comic (manga) features and the semantics of stories using deep convolutional neural networks; and generating manuals based on learners' activity.

Applied Information Systems Laboratory
Masatoshi TSUCHIYA
Natural language processing, Web information system, user authentication

We study the implementation technology of real-scale applied information systems, through investigation, design, and operation of real university information systems. In particular, we focus on systems to aid human intellectual activities.

Automatic summarization of lecture speech using lecture slides

To cope with students' varying capacities, e-learning content that includes lecture slides and lecture speech is helpful. Unfortunately, it is quite difficult to follow a lecture speech when skipping back and forth through it. This research theme therefore aims to solve that problem through automatic summarization of lecture speech using the structural information of lecture slides.

Improvement of availability of real-scale applied information systems

Stability and robustness of real-scale applied information systems are important. However, it is quite difficult to implement stable and robust information systems that satisfy given limitations of human and time resources. We are examining ways to improve the availability of information systems through visualization of their running status.


Copyright (c) Toyohashi University of Technology All rights reserved.