Culinary Interview Dialog Corpus (CIDC)
CIDC is an interview dialogue corpus in the culinary domain in which interviewers play an active role in eliciting culinary knowledge from cooking experts. Collected with a video conferencing tool, the corpus contains 308 dialogues, each about 10 to 15 minutes long. To understand the impact of the interlocutors’ skill levels, we divide the experts into “professionals” and “enthusiasts” and the interviewers into “skilled interviewers” and “unskilled interviewers.” The collection of this corpus was supported by project JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).
Among the collected dialogues, the first 20 (Interview_1 to Interview_20) form a preliminary collection recorded under tentative condition settings. Based on an analysis of this preliminary collection, the final conditions were set and the main collection was recorded.
Download
If you wish to use the Culinary Interview Dialog Corpus (CIDC), please fill in the following application form. We will contact you via email.
Application form
Contents
- Meta-information (info.csv)
- The meta-information of each interview dialogue.
This includes the dialogue ID (column 1), expert information (columns 2-5), interviewer information (columns 6-9), pre-interview preparation materials (columns 10-13), and the post-interview questionnaire results (columns 14-20). Please refer to the technical report (NEDO_CIDC_report.pdf) for further detail.
- Video files of the cooking interviews (video/)
- The video files (.mp4) of the interview dialogues recorded with Zoom.
- Audio files of the cooking interviews (audio/)
- The audio files (.wav) of the interview dialogues recorded with Zoom. These include the audio of the expert (Interview_#_s.wav), the audio of the interviewer (Interview_#_i.wav), and the combined audio of both interlocutors (Interview_#.wav). For the preliminary collection, only the combined audio of both interlocutors was collected.
- Transcription files of the cooking interviews (transcript/)
- The transcription files (.xml) of the interview dialogues. Please refer to the technical report (NEDO_CIDC_report.pdf) for further detail.
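The layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the distributed corpus: the helper names (`is_preliminary`, `audio_paths`, `split_row`) and the group labels are hypothetical, and it assumes that `#` in the audio file names stands for the dialogue number and that every row of info.csv has exactly 20 columns. Whether info.csv has a header row is not stated here, so the sketch works on a single already-parsed row.

```python
from pathlib import Path

# Interview_1 .. Interview_20 form the preliminary collection.
PRELIMINARY_NUMBERS = range(1, 21)

# 1-indexed, inclusive column ranges as listed for info.csv.
# The group labels are hypothetical names chosen for this sketch.
COLUMN_GROUPS = {
    "dialogue_id": (1, 1),
    "expert": (2, 5),
    "interviewer": (6, 9),
    "preparation": (10, 13),
    "questionnaire": (14, 20),
}


def is_preliminary(number: int) -> bool:
    """Return True if the dialogue belongs to the preliminary collection."""
    return number in PRELIMINARY_NUMBERS


def audio_paths(number: int, root: str = "audio") -> dict:
    """Build the expected audio file paths for one dialogue.

    Assumes '#' in the file names is the dialogue number. The preliminary
    collection only has the combined recording.
    """
    base = Path(root)
    paths = {"combined": base / f"Interview_{number}.wav"}
    if not is_preliminary(number):
        paths["expert"] = base / f"Interview_{number}_s.wav"
        paths["interviewer"] = base / f"Interview_{number}_i.wav"
    return paths


def split_row(row: list) -> dict:
    """Split one info.csv row (a list of 20 cells) into the column groups."""
    if len(row) != 20:
        raise ValueError(f"expected 20 columns, got {len(row)}")
    return {
        name: row[start - 1:end]
        for name, (start, end) in COLUMN_GROUPS.items()
    }
```

For example, `audio_paths(5)` yields only the combined file, while `audio_paths(21)` also yields the expert and interviewer tracks. Rows passed to `split_row` could come from Python's standard `csv.reader`; the meaning of the individual columns within each group is documented in the technical report.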
Licence
CIDC is distributed under the terms of the CC BY-NC-SA 4.0 licence (https://creativecommons.org/licenses/by-nc-sa/4.0/deed.ja).
References
- [1] Taro Okahisa, Ribeka Tanaka, Takashi Kodama, Yin Jou Huang and Sadao Kurohashi. (2022). “Constructing a Culinary Interview Dialogue Corpus with Video Conferencing Tool.” In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3131–3139.
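- [2] Taro Okahisa, Ribeka Tanaka, Takashi Kodama, Yin Jou Huang and Sadao Kurohashi. (2022). “A Culinary Interview Dialogue Corpus Collected Using a Web Conferencing System” (in Japanese). The 28th Annual Meeting of the Association for Natural Language Processing.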