Intelligence Science and Technology Course, Graduate School of Informatics, Kyoto University
* OpenCourseWare Parallel Corpus (オープンコースウェア対訳コーパス)/Coursera Parallel Corp...
** Update history [#g075a542]
- 2021: Added the Chinese-English parallel dataset
- 2020: Added the Japanese-English parallel dataset
** Description [#c9d7e6af]
This resource contains Japanese-English and Chinese-Engli...
- Description of the files in the Japanese-English dataset:
-- train.ja & train.en:
--- 40,770 parallel sentences extracted and aligned from ...
-- dev.ja & dev.en:
--- 541 manually checked parallel sentences, which could ...
-- test.ja & test.en:
--- 2,005 manually checked parallel sentences, which coul...
- Description of the files in the Chinese-English dataset:
-- train.zh & train.en:
--- 40,074 parallel sentences extracted and aligned from ...
-- dev.zh & dev.en:
--- 865 manually checked parallel sentences, which could ...
-- test.zh & test.en:
--- 2,009 manually checked parallel sentences, which coul...
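The file layout above suggests a simple loading routine. The following is a minimal sketch, not part of the released resource: it assumes each file is UTF-8 plain text with one sentence per line and that line i of the source file aligns with line i of the target file. The function names `read_parallel` and `split_has_expected_size`, and the directory name in the usage comment, are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def read_parallel(src_path: Path, tgt_path: Path) -> List[Tuple[str, str]]:
    """Read two line-aligned files into (source, target) sentence pairs.

    Assumption: UTF-8 text, one sentence per line, same line order in both files.
    """
    with open(src_path, encoding="utf-8") as f:
        src = [line.rstrip("\n") for line in f]
    with open(tgt_path, encoding="utf-8") as f:
        tgt = [line.rstrip("\n") for line in f]
    if len(src) != len(tgt):
        raise ValueError(
            f"line count mismatch: {src_path} has {len(src)}, "
            f"{tgt_path} has {len(tgt)}"
        )
    return list(zip(src, tgt))


# Sentence-pair counts as stated in the description above.
EXPECTED_JA_EN = {"train": 40770, "dev": 541, "test": 2005}
EXPECTED_ZH_EN = {"train": 40074, "dev": 865, "test": 2009}


def split_has_expected_size(
    data_dir: Path, split: str, src_lang: str, tgt_lang: str, expected: int
) -> bool:
    """Return True if the split holds exactly `expected` sentence pairs."""
    pairs = read_parallel(
        data_dir / f"{split}.{src_lang}", data_dir / f"{split}.{tgt_lang}"
    )
    return len(pairs) == expected


# Usage (hypothetical directory name):
#   data_dir = Path("coursera_ja_en")
#   for split, n in EXPECTED_JA_EN.items():
#       print(split, split_has_expected_size(data_dir, split, "ja", "en", n))
```

Failing on a line-count mismatch rather than truncating to the shorter file is deliberate: a silent truncation would hide a misaligned download.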
** Sample [#b6b769d8]
- Japanese-English parallel sentences
Ja: 誰かに何をすべきか言うのか、 私がこれをしたら起こ...
En: It's the difference between telling someone what ...
- Chinese-English parallel sentences
Zh: 如今, 云计算包括虚拟化数据中心, 虚拟机和应用程序...
En: Today, cloud computing involves virtualized datac...
** Download [#h7c72293]
*** Japanese-English: [#ld1eacaa]
https://github.com/shyyhs/CourseraParallelCorpusMining/bl...
*** Chinese-English: [#u3834cd9]
https://github.com/shyyhs/CourseraParallelCorpusMining/bl...
** GitHub [#c35c6d03]
https://github.com/shyyhs/CourseraParallelCorpusMining
** Reference [#hff08bcb]
Haiyue Song, Raj Dabre, Atsushi Fujita and Sadao Kurohashi.
Coursera Corpus Mining and Multistage Fine-Tuning for Imp...
Proceedings of the 12th International Conference on Langu...
*** bib [#na444d6c]
@inproceedings{song-etal-2020-coursera,
title = "{C}oursera Corpus Mining and Multistage Fine...
author = "Song, Haiyue and
Dabre, Raj and
Fujita, Atsushi and
Kurohashi, Sadao",
booktitle = "Proceedings of the 12th Language Resourc...
month = may,
year = "2020",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://www.aclweb.org/anthology/2020.lrec-1.4...
pages = "3640--3649",
language = "English",
ISBN = "979-10-95546-34-4",
}
** Contact [#s0d94ebb]
song AT nlp.ist.i.kyoto-u.ac.jp