What's New

We will hold a briefing session (2024/5/11)

The following paper received the FY2023 Best Paper Award of the Journal of Natural Language Processing (2024/3).

  • Kazumasa Omura, Daisuke Kawahara, and Sadao Kurohashi:
    Building a Commonsense Inference Dataset based on Basic Events and its Application

We will present the following papers at LREC-COLING 2024 (2024/5).

  • Taishi Chika, Taro Okahisa, Takashi Kodama, Yin Jou Huang, Yugo Murawaki and Sadao Kurohashi:
    Domain Transferable Semantic Frames for Expert Interview Dialogues
  • Yikun Sun, Zhen Wan, Nobuhiro Ueda, Sakiko Yahata, Fei Cheng, Chenhui Chu and Sadao Kurohashi:
    Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese
  • Norizo Sakaguchi, Yugo Murawaki, Chenhui Chu and Sadao Kurohashi:
    Identifying Source Language Expressions for Pre-editing in Machine Translation
  • Nobuhiro Ueda, Hideko Habe, Yoko Matsui, Akishige Yuguchi, Seiya Kawano, Yasutomo Kawanishi, Sadao Kurohashi and Koichiro Yoshino:
    J-CRe3: A Japanese Conversation Dataset for Real-world Reference Resolution
  • Hao Wang, Tang Li, Chenhui Chu, Rui Wang and Pinpin Zhu:
    Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents
  • Rikito Takahashi, Hirokazu Kiyomaru, Chenhui Chu and Sadao Kurohashi:
    Abstractive Multi-Video Captioning: Benchmark Dataset Construction and Extensive Evaluation
  • Yugo Murawaki:
    Principal Component Analysis as a Sanity Check for Bayesian Phylolinguistic Reconstruction
  • Kazumasa Omura, Fei Cheng and Sadao Kurohashi:
    An Empirical Study of Synthetic Data Generation for Implicit Discourse Relation Recognition
  • Xiaotian Lu, Jiyi Li, Zhen Wan, Xiaofeng Lin, Koh Takeuchi and Hisashi Kashima:
    Evaluating Saliency Explanations in NLP by Crowdsourcing
  • Francois Meyer, Haiyue Song, Abhisek Chakrabarty, Jan Buys, Raj Dabre and Hideki Tanaka:
    NGLUEni: Benchmarking and Adapting Pretrained Language Models for Nguni Languages

We will present the following papers at EMNLP 2023 (2023/12).

  • Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li:
    Video-Helpful Multimodal Machine Translation
  • Shunya Kato, Shuhei Kurita, Chenhui Chu, Sadao Kurohashi:
    ARKitSceneRefer: Text-based Localization of Small Objects in Diverse Real-World 3D Indoor Scenes (Findings)
  • Hao Wang, Xiahua Chen, Rui Wang, Chenhui Chu:
    Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning
  • Hao Wang, Qingxuan Wang, Yue Li, Changqing Wang, Chenhui Chu, Rui Wang:
    DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading (Findings)
  • Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi:
    GPT-RE: In-context Learning for Relation Extraction using Large Language Models

We had a lab trip to Ehime. (2023/9/13-14)

[Photo: trip_20230913.jpg]

Research Overview

Language is the most reliable medium of human intellectual activity. Our objective is to establish the technology and the academic discipline for handling and understanding language by computers, in a manner as close as possible to that of humans. Our research covers syntactic analysis, semantic analysis, context analysis, text comprehension, text generation, and dictionary systems, which we use to build application systems such as machine translation and information retrieval.

Search Engine Infrastructure based on Deep Natural Language Processing

[Figure: TSUBAKI.png]

The essential purpose of information retrieval is not merely to retrieve a relevant document but to acquire the information or knowledge it contains. We have been developing a next-generation information retrieval infrastructure based on the following deep natural language processing techniques: precise matching based not on words but on predicate-argument structures, identification of divergent linguistic expressions that convey the same meaning, and a bird's-eye view of search results through clustering and interaction.
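The idea of matching on predicate-argument structures rather than bare words can be illustrated with a toy sketch. Everything here is hypothetical and highly simplified: each document is assumed to be pre-parsed into (predicate, case, argument) triples by an external analyzer, which is outside the scope of this sketch.

```python
from collections import defaultdict

def pas_index(parsed_docs):
    """Build an inverted index keyed by predicate-argument triples."""
    index = defaultdict(set)
    for doc_id, triples in parsed_docs.items():
        for triple in triples:
            index[triple].add(doc_id)
    return index

def search(index, query_triples):
    """Return documents that contain every predicate-argument triple in the query."""
    result = None
    for triple in query_triples:
        docs = index.get(triple, set())
        result = docs if result is None else result & docs
    return result or set()

# Two documents with the same words but different structures.
docs = {
    "d1": [("eat", "subj", "dog"), ("eat", "obj", "bone")],
    "d2": [("eat", "subj", "bone")],
}
index = pas_index(docs)
print(search(index, [("eat", "obj", "bone")]))  # only d1 matches structurally
```

A word-based index would return both documents for the query "eat bone"; keying on triples distinguishes the bone being eaten from the bone doing the eating.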

Machine Translation

[Figure: EBMT.png]

To bring automatic translation by computers to the level of human translation, we have been studying a next-generation methodology of machine translation based on text understanding and a large collection of translation examples. We have already achieved practical translation quality in the domain of travel conversation, and have constructed a translation-aid system used by expert patent translators.
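The example-based approach can be sketched in a few lines: given a collection of translation examples, translate a new input by retrieving the example whose source side is most similar and reusing its translation. This is a deliberately naive illustration; real systems align and recombine fragments from many examples, and the example pairs below are made up.

```python
import difflib

# Toy bilingual example collection (source -> translation); illustrative only.
examples = {
    "Where is the station?": "Eki wa doko desu ka?",
    "How much is this?": "Kore wa ikura desu ka?",
}

def translate(sentence, examples):
    """Return the translation of the stored example most similar to the input."""
    best = max(
        examples,
        key=lambda src: difflib.SequenceMatcher(
            None, sentence.lower(), src.lower()
        ).ratio(),
    )
    return examples[best]

print(translate("Where is the bus station?", examples))
```

Even this crude nearest-example lookup conveys the core intuition: translation quality grows with the coverage of the example collection rather than with hand-written rules.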

Fundamental Studies on Text Understanding

To make computers understand language, it is essential to give them world knowledge. This was a very hard problem ten years ago, but the drastic progress of computing power and networks has made it possible to acquire such knowledge from massive amounts of text. Using grid computing, we have acquired linguistic patterns of predicate-argument structures from automatic parses of 7 billion Japanese sentences crawled from the Web. By utilizing this knowledge, we study text understanding, i.e., recognizing the relationships between words and phrases in text.
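The pattern-acquisition step described above can be sketched as simple counting: across many parsed sentences, tally which arguments fill which case slots of each predicate, and keep the frequent combinations as knowledge. This is a minimal, hypothetical sketch; the parses are assumed to be given, and the threshold and data are illustrative.

```python
from collections import Counter

# Predicate-argument triples assumed to come from an external parser.
parses = [
    ("drink", "obj", "water"),
    ("drink", "obj", "coffee"),
    ("drink", "obj", "water"),
    ("read", "obj", "book"),
]

def acquire_case_frames(triples, min_count=2):
    """Keep predicate-argument patterns that occur at least min_count times."""
    counts = Counter(triples)
    return {t: c for t, c in counts.items() if c >= min_count}

print(acquire_case_frames(parses))  # {('drink', 'obj', 'water'): 2}
```

At Web scale the same counting, applied to billions of parses, yields case frames such as "drink takes drinkable liquids as its object", which is exactly the kind of world knowledge text understanding needs.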

Policy Regarding Acceptance of Students from Outside

Master course

PhD course

Access