What's New

We will present the following paper at EACL2017 (2017/4/3-7) (12/4)

  • Gongye Jin, Daisuke Kawahara and Sadao Kurohashi: Improving Chinese Semantic Role Labeling using High-quality Surface and Deep Case Frames

We will present the following papers at COLING2016 (2016/12/11-16) (Updated: 11/7)

  • Kenji Yamauchi and Yugo Murawaki: Contrasting Vertical and Horizontal Transmission of Typological Features
  • Mo Shen, Wingmui Li, HyunJeong Choe, Chenhui Chu, Daisuke Kawahara and Sadao Kurohashi: Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language
  • Fabien Cromieres: Kyoto-NMT: A Neural Machine Translation implementation in Chainer (System Demonstration)
  • Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara and Sadao Kurohashi: SCTB: A Chinese Treebank in Scientific Domain (ALR12 2016)


  • Naoki Otani, Daisuke Kawahara, Sadao Kurohashi, Nobuhiro Kaji and Manabu Sassano: Large-Scale Acquisition of Commonsense Knowledge from a Quiz Game on a Dialogue System (OKBQA2016)
  • Fabien Cromieres, Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi: Kyoto University Participation to WAT 2016 (WAT2016)
  • Toshiaki Nakazawa, Chenchen Ding, Hideya MINO, Isao Goto, Graham Neubig and Sadao Kurohashi: Overview of the 3rd Workshop on Asian Translation (WAT2016)

Kyoto University & JST Trilingual Technical Term Dictionary (TriTechDict) has been released. (10/24)

SCTB: A Chinese Treebank in Scientific Domain has been released. (10/20)

W. Sakata received an incentive award at the IPSJ Kansai-Branch Convention for the following paper. (9/26)

  • W. Sakata, T. Shibata and S. Kurohashi: Improving Word Representations using Word Relational Knowledge and Patterns

Research Overview

Language is the most reliable medium of human intellectual activities. Our objective is to establish the technology and academic discipline for handling and understanding language with computers, in a manner as close as possible to that of humans. Our research covers syntactic analysis, semantic analysis, context analysis, text comprehension, text generation and dictionary systems, which we use to develop application systems such as machine translation and information retrieval.

Search Engine Infrastructure based on Deep Natural Language Processing


The essential purpose of information retrieval is not merely to retrieve a relevant document but to acquire the information or knowledge it contains. We have been developing a next-generation information retrieval infrastructure based on the following deep natural language processing techniques: precise processing based on predicate-argument structures rather than words, recognition of variation in linguistic expressions, and a bird's-eye view of search results provided through clustering and interaction.
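As a rough illustration of why matching on predicate-argument structures can be more precise than matching on words, consider the toy sketch below. It is not our search engine's implementation: the `PAS` tuple, the case labels "agent" and "object", and both matching functions are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class PAS(NamedTuple):
    """One predicate-argument relation (hypothetical, simplified form)."""
    predicate: str
    case: str       # assumed case label, e.g. "agent" or "object"
    argument: str


def word_match(query_words, doc_words):
    """Bag-of-words match: every query word occurs somewhere in the document."""
    return set(query_words) <= set(doc_words)


def pas_match(query_pas, doc_pas):
    """Structural match: every query relation occurs in the document."""
    return set(query_pas) <= set(doc_pas)


# Document: "man bites dog".  Query: "dog bites man".
doc_words = ["man", "bites", "dog"]
doc_pas = [PAS("bite", "agent", "man"), PAS("bite", "object", "dog")]

query_words = ["dog", "bites", "man"]
query_pas = [PAS("bite", "agent", "dog"), PAS("bite", "object", "man")]

word_hit = word_match(query_words, doc_words)  # same words, so this matches
pas_hit = pas_match(query_pas, doc_pas)        # roles are swapped, so it does not
```

The word-based check accepts the document because it only compares vocabulary, while the structure-based check rejects it because the agent and object are reversed.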

Machine Translation


To bring automatic translation by computers to the level of human translation, we have been studying next-generation machine translation methodology based on text understanding and a large collection of translation examples. We have already achieved practical translation in the domain of travel conversation and built a translation-aid system that can be used by patent translation experts.

Fundamental Studies on Text Understanding

To make computers understand language, it is essential to give them world knowledge. This was a very hard problem ten years ago, but acquiring knowledge from massive amounts of text has become possible thanks to dramatic advances in computing power and networks. Using grid computing machines, we have successfully acquired linguistic patterns of predicate-argument structures from automatic parses of 7 billion Japanese sentences crawled from the Web. By utilizing such knowledge, we study text understanding, i.e., recognizing the relationships between words and phrases in text.
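The idea of acquiring patterns from automatic parses can be sketched as a simple counting step. The example below is a minimal illustration under strong assumptions, not the actual acquisition pipeline: the input format (lists of predicate, case, argument triples), the case labels, and the function name `collect_patterns` are all hypothetical, and the data is a toy sample in English.

```python
from collections import Counter, defaultdict


def collect_patterns(parsed_sentences):
    """Count which arguments fill each (predicate, case) slot.

    parsed_sentences: iterable of sentences, each a list of
    (predicate, case, argument) triples (an assumed, simplified format).
    """
    patterns = defaultdict(Counter)
    for sentence in parsed_sentences:
        for predicate, case, argument in sentence:
            patterns[(predicate, case)][argument] += 1
    return patterns


# Toy stand-in for automatically parsed Web sentences.
toy_parses = [
    [("eat", "object", "bread"), ("eat", "agent", "child")],
    [("eat", "object", "rice")],
    [("eat", "object", "bread")],
    [("read", "object", "book")],
]

patterns = collect_patterns(toy_parses)

# The most frequent objects of "eat" in the toy data.
top_eat_objects = patterns[("eat", "object")].most_common(2)
```

With enough parsed text, frequency tables of this kind indicate which arguments typically accompany a predicate, which is one kind of knowledge usable for recognizing relationships between words and phrases.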