|Knowledge Data Engineering Lab.||Information retrieval, data mining, intelligent processing for 3-D/image/video/text, text mining|
The Knowledge Data Engineering (KDE) Laboratory conducts research on intelligent processing of massive multimedia data, where multimedia ranges from 3D models, images, and video to text available on the Web or kept in storage as big data collections. Information retrieval and machine-learning-based automatic annotation of multimedia (3-D models, images, and video) are our two main research areas in intelligent data processing, with feature extraction as our central concern. We also conduct research on the Semantic Web and time series data mining. In the Semantic Web, our main interest is how to use ontologies to make content understandable and to connect concepts that appear unrelated. Ontology learning is also included. In time series data mining, we use massive time series data acquired by sensors and loggers and attempt to find interesting correlations hidden in the data. We are also exploring potential applications of deep learning on our GPGPU machines.
Intelligent multimedia data processing
We are investigating 3D shape retrieval and automatic 3D annotation with high performance and accuracy. The main focus has been on feature extraction. In addition, we have started investigating image annotation and captioning, together with applications of deep learning on GPGPU. We have filed several patents on this research and have gained recognition through SHREC (the SHape Retrieval Contest). Over the past several years, we have achieved the world's highest accuracy in several SHREC tracks, including general 3-D object retrieval and 3-D shape retrieval from 2-D sketches. We also provide a 3-D shape benchmark, the Toyohashi Shape Benchmark (TSB), for public research use.
Using massive time series data, we investigate hidden rules across multiple time series, potential correlations, optimization, and linear and non-linear regression models, drawing on a range of research tools that includes machine learning and multivariate analysis.
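As a minimal illustration of what searching for a potential correlation between two time series can look like, the sketch below scans a few time lags and reports the lag with the strongest Pearson correlation. It is not the laboratory's method; the function names `pearson` and `best_lag` and the sensor readings are hypothetical, chosen only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(vx * vy)


def best_lag(xs, ys, max_lag):
    """Return (lag, r) with the largest |r| over lags 0..max_lag.

    A lag of k compares xs[t] with ys[t + k], i.e. ys follows xs by k steps.
    """
    best = (0, pearson(xs, ys))
    for lag in range(1, max_lag + 1):
        r = pearson(xs[:-lag], ys[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical readings: room_temp repeats heater two time steps later.
heater = [0, 1, 0, 3, 5, 2, 0, 4, 1, 0, 2, 6]
room_temp = [9, 9] + heater[:-2]
lag, r = best_lag(heater, room_temp, max_lag=4)
print(lag, round(r, 3))
```

Real sensor data would usually need detrending and a significance check before a lagged correlation is treated as meaningful.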
Ontology alignment and ontology learning based on metadata on the Web have been our main concerns in this research. Medical applications and semantic search applications are included.
Spam Web page mining for security risks, graph-theoretic link mining, and general Web mining seeking new applications for collective intelligence are included in this research.
Estimating emotions, judgements, and evaluations from text has been the main focus of this research. Cross-media approaches that combine image and text to augment cross-media search are also of interest. In addition, patent data retrieval and clustering have been investigated. Natural language processing technologies (for both English and Japanese) are naturally included here.
|Applied Mathematics and Network Lab.||Applied statistics, natural language processing, social media, e-learning|
Computer networks grow more important every day. To make the best use of these networks, we need technologies to handle the data and content on them. Our approach is to apply statistical processing methods and to verify their effectiveness by developing application systems. Our research field includes information retrieval, natural language processing, data mining, social network systems, and e-learning systems. Some systems have already become commercial products through joint research with companies.
E-learning systems in the network era
More and more video content suitable for education is available these days. We believe that computer technology can provide more effective content than simple video. One example is to remove the image of the lecturer and to provide a clear image of the blackboard. This kind of content makes note taking easier, because in ordinary video the lecturer tends to block the characters written on the blackboard.
Natural language processing for information retrieval
It is crucial for information retrieval systems to decide whether each term is important for retrieval. We have developed a method to extract important terms from documents using statistical analysis. Unlike commonly used approaches, this method does not require dictionaries, only collections of data. This method is the result of joint research with a company and is used in commercial products.
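To make the idea of dictionary-free, statistics-based term weighting concrete, the sketch below ranks the terms of one document by TF-IDF, a standard weighting scheme that needs only a document collection. This is a stand-in for illustration and not the method described above, whose details the text does not give; the function name `tfidf_top_terms` and the sample documents are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
import math
from collections import Counter


def tfidf_top_terms(documents, doc_index, k=3):
    """Rank the terms of one document by TF-IDF against the whole collection."""
    # Assumption: whitespace tokenization, so no dictionary is needed.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tf = Counter(tokenized[doc_index])
    total = sum(tf.values())
    scores = {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


docs = [
    "the retrieval system ranks the documents",
    "the lecture video shows the blackboard",
    "the retrieval method extracts important terms from the documents",
]
top = tfidf_top_terms(docs, doc_index=1, k=4)
print(top)
```

Terms that occur in every document, such as "the" here, receive a score of zero and fall to the bottom of the ranking without any stop-word list.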
Handling large scale data from social networks
We have been operating a system that collects data from social media, especially Twitter, where a wide variety of information is posted. As a result, we can handle almost all of the world's tweets and retweets that carry location information. We are conducting research to make the most of this data. One example is estimating the location of tweets that lack location information by using other information in the tweet.
|Language Data Mining and Algorithm Laboratory||Natural language processing, automatic text summarization, information retrieval, question answering, analysis of causal expression and its application|
Intellectual activity support using natural language processing
Text summarization, information retrieval, and question answering are fundamental technologies for taking advantage of the huge number of machine-readable documents. In automatic text summarization, we have obtained systematic results for sentence-level summarization, which shortens each sentence in a given text by deleting adnominal clauses using the dependency structure and statistical methods.
Text mining and information extraction
Semantic processing is indispensable for achieving the "smart" computer. We have proposed a general-purpose method for extracting an event and its cause, and have applied it to extracting causal expressions about traffic accidents, to identifying performance factors of companies, and to patent mining. We have also analyzed causal expressions and succeeded in classifying them into five basic patterns.
We study algorithm design to address problems emerging with the arrival of the advanced information society of the 21st century. Topics include graph and network algorithms for the Internet and mobile communication; network reliability and fault tolerance in distributed systems, which form the foundation of the ubiquitous society; and operation control algorithms for automated guided vehicles (AGVs). We have also begun studying scheduling for railways and for sports such as baseball.
|Learning and Inference Systems Laboratory||Bayesian inference, learning algorithm, rate-distortion theory, data visualization|
Machine learning techniques are widely used for various applications such as pattern recognition and robot control. We study fundamental theories of machine learning on the basis of statistical and information theory methods, and apply them to data analysis problems.
Analysis and development of statistical learning methods
Bayesian inference provides a framework for solving learning and inference problems. We aim to analyze and develop learning and inference methods, and apply them to problems such as data analysis and visualization.
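As a small worked example of the Bayesian framework, the sketch below updates a Beta prior on an unknown success probability after observing Bernoulli outcomes, using the standard conjugate update. It is a textbook case chosen for illustration, not a method attributed to the laboratory; the function names and the observation counts are hypothetical.

```python
def beta_posterior(alpha, beta, successes, failures):
    """Posterior Beta parameters after Bernoulli observations.

    With a Beta(alpha, beta) prior, the posterior is
    Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (1.0, 1.0)  # Beta(1, 1) is the uniform prior on [0, 1]
post = beta_posterior(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*post)
print(post, round(posterior_mean, 3))
```

The posterior mean lies between the prior mean (0.5) and the observed frequency (0.7), and moves toward the observed frequency as more data arrive.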
Rate-distortion theory (lossy data compression)
Rate-distortion functions show the minimum information content required for reconstructing compressed data under allowed distortion levels. We aim to evaluate rate-distortion functions of distortion measures used in practical learning algorithms and information sources modeling real data generation processes.
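For a concrete instance, the sketch below evaluates the well-known closed form for a Bernoulli(p) source under Hamming distortion, R(D) = H(p) - H(D) for 0 <= D < min(p, 1 - p) and R(D) = 0 otherwise, where H is the binary entropy in bits. This is a textbook case for illustration only, not one of the distortion measures or sources studied by the laboratory; the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rate_distortion_bernoulli(p, d):
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if d < 0.0:
        raise ValueError("distortion must be non-negative")
    if d >= min(p, 1.0 - p):
        return 0.0  # this distortion is reachable without sending any bits
    return binary_entropy(p) - binary_entropy(d)


# A fair-coin source: lossless coding needs 1 bit per symbol,
# and the required rate falls as more distortion is allowed.
for d in (0.0, 0.05, 0.11, 0.25, 0.5):
    print(d, round(rate_distortion_bernoulli(0.5, d), 3))
```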
|Applied Information Systems Laboratory||Natural language processing, Web information system, user authentication|
We study the implementation technology of real-scale applied information systems, through investigation, design, and operation of real university information systems. In particular, we focus on systems to aid human intellectual activities.
Automatic summarization of lecture speech using lecture slides
To accommodate students of differing abilities, e-learning content that includes lecture slides and lecture speech is helpful. Unfortunately, it is quite difficult to listen to a lecture speech while skipping from place to place. This research theme therefore aims to solve the problem through automatic summarization of lecture speech using the structural information of lecture slides.
Improvement of availability of real-scale applied information systems
Stability and robustness are important for real-scale applied information systems. However, it is quite difficult to implement stable and robust information systems within given limits on personnel and time. We are examining ways to improve the availability of information systems through visualization of their running status.