SCTB: A Chinese Treebank in Scientific Domain

Update History

2016/10/20: Distributed the first version (V1).

Description

SCTB is a phrase-structure Chinese treebank. The raw sentences were selected from Chinese scientific papers. The annotation process follows that of CTB (Xue et al., 2005), with the exception of the segmentation standard: we apply a Chinese word segmentation standard based on character-level POS patterns (Shen et al., 2016). The first release (V1) contains 5,133 sentences (138,781 words).

Sample

( (IP (NP (PN 它们)) (VP (ADVP (AD 常)) (VP (VP (ADVP (AD 同时)) (VP (VV 并存))) (CC 并) (VP (ADVP (AD 相互))
(VCD (VP (VV 发展) (VV 转化)))))) (PU 。)) )
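
The sample above uses CTB-style bracketing, so it can be read with a small recursive parser. The following is a minimal sketch, not part of the SCTB distribution: the helper names (`tokenize`, `parse`, `tagged_words`) are hypothetical, and it assumes each tree is a balanced bracketed string in which every word sits in a `(POS word)` node, as in the sample.

```python
import re

# The sample tree from this page, joined onto one logical line.
SAMPLE = (
    "( (IP (NP (PN 它们)) (VP (ADVP (AD 常)) (VP (VP (ADVP (AD 同时)) (VP (VV 并存))) "
    "(CC 并) (VP (ADVP (AD 相互)) (VCD (VP (VV 发展) (VV 转化)))))) (PU 。)) )"
)


def tokenize(text):
    """Split a bracketed tree into '(', ')' and label/word tokens."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse(tokens):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        items = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ')'
        return items

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def tagged_words(tree):
    """Yield (word, POS) pairs from preterminal nodes such as ['PN', '它们']."""
    if len(tree) == 2 and all(isinstance(item, str) for item in tree):
        yield tree[1], tree[0]
        return
    for child in tree:
        if isinstance(child, list):
            yield from tagged_words(child)


if __name__ == "__main__":
    for word, tag in tagged_words(parse(tokenize(SAMPLE))):
        print(word, tag)
```

Nested lists keep the sketch dependency-free. A treebank toolkit with a bracketed-tree reader could be used instead if richer tree operations are needed.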

License

Copyright (C) Kurohashi-Kawahara Lab. and Japan Science and Technology Agency (JST). You can use all the data under the terms of the Creative Commons Attribution 3.0 Unported license.

Download

SCTB.tar.gz

Acknowledgment

This work is supported by the JST MT project "Project on Practical Implementation of Japanese to Chinese-Chinese to Japanese Machine Translation." We greatly appreciate the work of the two annotators, Ms. Fumio Hirao and Mr. Teruyasu Ueki. We thank Mr. Frederic Bergeron for his contribution to the annotation interface. We are also very grateful to Dr. Mo Shen for discussions on the annotation standards.

Reference

Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara and Sadao Kurohashi.
SCTB: A Chinese Treebank in Scientific Domain,
Proceedings of the 12th Workshop on Asian Language Resources (ALR12 2016), Osaka, Japan, (2016.12)

Contact and Bug Report

MAIL: chu at pa.jst.jp