Backup of Research(No. 10) - LANGUAGE MEDIA PROCESSING LAB

Research

Language is the most reliable medium of human intellectual activity. Our objective is to establish the technology and the academic discipline for handling and understanding language with computers in a manner as close as possible to that of humans. Our work covers syntactic analysis, semantic analysis, context analysis, text comprehension, text generation, and dictionary systems, which we use to develop application systems such as machine translation and information retrieval.

Fundamental Studies on Text Understanding

To make computers understand language, it is essential to give them world knowledge. This was a very hard problem ten years ago, but dramatic progress in computing power and networks has made it possible to acquire knowledge from massive amounts of text. We have successfully acquired linguistic patterns of predicate-argument structures from automatic parses of 7 billion Japanese sentences crawled from the Web, using grid computing machines. Using this knowledge, we study text understanding, i.e., recognizing the relationships between words and phrases in text.
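As a rough illustration of what collecting predicate-argument patterns from parses can look like, here is a minimal sketch. It assumes the parser output has already been reduced to (predicate, case marker, argument) triples. The toy triples and the function names `collect_patterns` and `most_typical_argument` are invented for this example and do not describe the lab's actual pipeline or data.

```python
from collections import Counter, defaultdict

# Toy stand-in for automatic parses: each item is (predicate, case_marker, argument).
# These triples are invented for illustration, not data from the lab's corpus.
PARSED_TRIPLES = [
    ("eat", "wo", "bread"),
    ("eat", "wo", "rice"),
    ("eat", "ga", "child"),
    ("eat", "wo", "bread"),
    ("read", "wo", "book"),
    ("read", "ga", "student"),
]


def collect_patterns(triples):
    """Count how often each argument fills each case slot of each predicate."""
    patterns = defaultdict(Counter)
    for predicate, case, argument in triples:
        patterns[(predicate, case)][argument] += 1
    return patterns


def most_typical_argument(patterns, predicate, case):
    """Return the most frequent filler of a slot, or None if the slot is unseen."""
    fillers = patterns.get((predicate, case))
    if not fillers:
        return None
    return fillers.most_common(1)[0][0]


patterns = collect_patterns(PARSED_TRIPLES)
print(most_typical_argument(patterns, "eat", "wo"))
```

At Web scale the same counting would have to be distributed over many machines, but the aggregated result is still a table from predicate slots to their observed fillers.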

Machine Translation

To bring automatic translation by computers to the level of human translation, we have been studying next-generation machine translation methodology based on text understanding and a large collection of translation examples. We have already achieved practical translation in the domain of travel conversation and built a translation-aid system that can be used by expert patent translators. Since 2006, we have participated in the Japanese-Chinese translation project supported by the Special Coordination Funds for Promoting Science and Technology. Please see KyotoEBMT and Project on Practical Implementation of Japanese to Chinese-Chinese to Japanese Machine Translation for more details.

Search Engine Infrastructure based on Deep Natural Language Processing

The essential purpose of information retrieval is not merely to retrieve a relevant document but to acquire the information or knowledge it contains. We have been developing a next-generation information retrieval infrastructure based on the following deep natural language processing techniques: precise processing based on predicate-argument structures rather than words, identification of the variety of linguistic expressions, and a bird's-eye view of search results provided through clustering and interaction.
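The difference between matching on words and matching on predicate-argument structures can be shown with a small sketch. It assumes each text has already been analyzed into (predicate, role, argument) triples. The role labels, the sample texts, and the functions `word_match` and `structure_match` are hypothetical and are not taken from the lab's search engine.

```python
def word_match(query_words, doc_words):
    """True if every query word appears somewhere in the document."""
    return set(query_words) <= set(doc_words)


def structure_match(query_triples, doc_triples):
    """True if every (predicate, role, argument) triple of the query is in the document."""
    return set(query_triples) <= set(doc_triples)


# Hypothetical analyses of "dog bites man" (query and doc_a) and "man bites dog" (doc_b).
query = {
    "words": ["dog", "bite", "man"],
    "triples": [("bite", "agent", "dog"), ("bite", "patient", "man")],
}
doc_a = {
    "words": ["dog", "bite", "man"],
    "triples": [("bite", "agent", "dog"), ("bite", "patient", "man")],
}
doc_b = {
    "words": ["man", "bite", "dog"],
    "triples": [("bite", "agent", "man"), ("bite", "patient", "dog")],
}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    by_words = word_match(query["words"], doc["words"])
    by_structure = structure_match(query["triples"], doc["triples"])
    print(name, by_words, by_structure)
```

Both documents contain all the query words, so word matching cannot tell them apart. Only the first document shares the query's predicate-argument structure.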

JST CREST Research Area “Advanced Core Technologies for Big Data Integration”: Establishment of Knowledge-Intensive Structural Natural Language Processing and Construction of Knowledge Infrastructure

PhD Alumni

Shuhei KURITA (栗田修平) (Mar.2019)
Neural Approaches for Syntactic and Semantic Analysis (cue)
Tomohiro SAKAGUCHI (坂口智洋) (Mar.2019)
Anchoring Events to the Time Axis toward Storyline Construction (cue)
Raj Dabre (Mar.2018)
Exploiting Multilingualism and Transfer Learning for Low Resource Machine Translation (cue)
Xun Wang (Nov.2017)
Entity-Centric Discourse Analysis and Its Applications (cue)
John Richardson (Sep.2016)
Improving Statistical Machine Translation with Target-Side Dependency Syntax (cue)
Gongye Jin (Mar.2016)
High-quality Knowledge Acquisition of Predicate-argument Structures for Syntactic and Semantic Analysis (cue)
Mo Shen (Mar.2016)
Exploiting Vocabulary, Morphological, and Subtree Knowledge to Improve Chinese Syntactic Analysis (cue)
Chenhui Chu (Mar.2015)
Integrated Parallel Data Extraction from Comparable Corpora for Statistical Machine Translation (cue)
Isao GOTO (後藤功雄) (May.2014)
Word Reordering for Statistical Machine Translation via Modeling Structural Differences between Languages (cue)
Masatsugu HANGYO (萩行正嗣) (Mar.2014)
Studies on Annotated Diverse Corpus Construction and Zero Reference Resolution in Japanese (cue)
Tomoko IZUMI (泉朋子) (Jan.2014)
Normalization and Similarity Recognition of Complex Predicate Phrases Based on Linguistically-Motivated Evidence (cue)
Jun HARASHIMA (原島純) (Mar.2013)
Studies on Re-ranking and Summarizing Search Results (cue)
Chikara HASHIMOTO (橋本力) (Sep.2011)
Knowledge Acquisition from the Web for Text Understanding (cue)
Fabien Cromieres (Mar.2011)
Using Scalable Run-Time Methods and Syntactic Structure in Corpus-Based Machine Translation (cue)
Yugo MURAWAKI (村脇有吾) (Mar.2011)
Automatic Acquisition of Japanese Unknown Morphemes (cue)
Toshiaki NAKAZAWA (中澤敏明) (Mar.2010)
Fully Syntactic Example-based Machine Translation (cue)
Koichi TAKEDA (武田浩一) (Mar.2010)
Building Natural Language Processing Applications Using Descriptive Models (cue)
Ryohei SASANO (笹野遼平) (Mar.2009; University of Tokyo)
Japanese Anaphora Resolution Based on Automatically Acquired World Knowledge
Manabu SASSANO (颯々野学) (Sep.2008)
Practical Use of Large Margin Classifiers in Natural Language Processing (cue)
Tomohide SHIBATA (柴田知秀) (Mar.2007; University of Tokyo)
Structural Understanding of Instruction Videos by Integrating Linguistic and Visual Information
Daisuke KAWAHARA (河原大輔) (Jul.2005)
Automatic Construction of Japanese Case Frames for Natural Language Understanding (cue)
Eiji ARAMAKI (荒牧英治) (Mar.2005; University of Tokyo)
Formalization and Realization of Example-based Machine Translation
Nobuhiro KAJI (鍜治伸裕) (Mar.2005; University of Tokyo)
Paraphrasing Written Language to Spoken Language
Yoji KIYOTA (清田陽司) (Nov.2004)
Dialog Navigator: A Navigation System from Vague Questions to Specific Answers based on Real-World Text Collections (cue)

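※ The full text of the thesis abstracts marked (cue) on this page can be found in 京都大学電気関係教室技術情報誌cue (cue, the technical information magazine of the Kyoto University electrical engineering departments; in Japanese).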