Mo Shen
Ph.D. Student at Kyoto University
S208, Eng. Bldg. No.3,
Kyoto University,
Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
E-mail: msmoshen at gmail.com
Last update: 08/18/2014
Summary
- I am currently a Ph.D. candidate at the Language & Knowledge
Engineering Lab at the Graduate School of Informatics, Kyoto
University. My advisor is Prof. Sadao Kurohashi. My research
interests include syntactic parsing, morphological analysis for Asian
languages, and cognitive modeling of language.
Education
- Kyoto University
- Doctor of Philosophy (Ph.D.), Computational Linguistics, 2012 – 2015 (Expected)
- Kyoto University
- Master of Science (M.S.), Computational Linguistics, 2010 – 2012
- Hong Kong Baptist University
- Bachelor of Science (B.S.), Mathematics, 2006 – 2010
Publications (International Journals and Conferences, Peer Reviewed)
- Mo Shen,
Daisuke Kawahara and Sadao Kurohashi. 2014. Dependency Parse Reranking
with Rich Subtree Features. IEEE Transactions on Audio, Speech, and
Language Processing, 22(7): 1208-1218.
- Mo Shen,
Hongxiao Liu, Daisuke Kawahara, and Sadao Kurohashi.
2014. Chinese Morphological Analysis with Character-level POS Tagging.
In proceedings of the 52nd Annual Meeting of the Association for
Computational Linguistics (ACL 2014), Short Paper, pages 253–258, Baltimore, USA.
- Mo Shen, Daisuke
Kawahara, and Sadao Kurohashi. 2013. Chinese
Word Segmentation by Mining Maximized Substrings. In proceedings of the
6th International Joint Conference on Natural Language Processing
(IJCNLP 2013), pages 171-179, Nagoya, Japan.
- Mo Shen, Daisuke
Kawahara and Sadao Kurohashi. 2012. A Reranking
Approach for Dependency Parsing with Variable-sized Subtree Features.
In proceedings of the 26th Pacific Asia Conference on Language, Information
and Computing (PACLIC 26), pages 308-317, Bali, Indonesia.
Publications (Domestic Conference)
- Mo Shen, Daisuke
Kawahara and Sadao Kurohashi. 2014. Chinese
Unknown Word Extraction by Mining Maximized Substrings. In proceedings
of the 20th Annual Meeting of the Association for Natural Language
Processing (NLP2014), pp.384-387, Sapporo, Japan.
- Mo Shen, Daisuke
Kawahara and Sadao Kurohashi. 2013. Dependency
Parse Reranking Based-on Subtree Extraction. In proceedings of the 19th
Annual Meeting of the Association for Natural Language Processing
(NLP2013), pp.58-61, Nagoya, Japan.
Presentations
- Towards Fully Lexicalized Dependency Parsing for Korean. At the
13th International Conference on Parsing Technologies (IWPT2013), Nara,
Japan. 2013/11.
- A Reranking Approach for Dependency Parsing with Variable-sized
Subtree Features. At Microsoft Research Forum 2012 at Kyoto University,
Kyoto, Japan. 2012/12.
- Dependency Subtree Reranking with Rich Subtree-based Features. At
Kyoto University 35th IST Seminar, Kyoto, Japan. 2012/07.
Professional Activity
- Reviewer, IEEE
Transactions on Audio, Speech, and Language Processing, 2014.
Software
- SKP
- A high-performance
multilingual dependency parser written in C++, developed as a crucial component of the
Kyoto Example-Based Dependency-to-Dependency Translation Framework
(KyotoEBMT).
- KyotoMorph
- A joint
Chinese word segmentation and part-of-speech tagging system written in C++, featuring a
semi-supervised segmentation technique which explores large-scale texts
for word boundary information, and an unknown word extractor which
performs efficient Chinese word extraction and automatic lexicon
compilation from web texts.
- CUWE
- A
Chinese unknown word extractor, which can efficiently scan and choose
reliable word candidates from a million sentences in a couple of
minutes. The output can be directly compiled into a machine-readable
dictionary that benefits other language processing systems.
Language Resources
- Kyoto-U
Chinese Web Corpus
- A Chinese corpus automatically built and maintained using web
texts, which currently contains over 2 billion sentences labeled with
word segmentation, part-of-speech tagging, chunking, and dependency
parsing information.
- Chinese
Treebank with CharPOS
- An augmented version of Penn Chinese Treebank 5.0 (CTB5) with
full character-level part-of-speech annotation.
Awards
- 2009: Honorable Mention in the 2009 Mathematical Contest in
Modeling (MCM2009)
- 2010: Japanese Government (MEXT) Scholarship
- 2012: MEXT Honors Scholarship for Privately Financed
International Students
- 2014: Murata Scholarship
Programming Skills
- Code on a daily basis: C++, Python, Perl
- Familiar with: Java, C
- Code as a hobby: Matlab, Prolog
Language Proficiency
- Chinese: Native
- English: TOEIC 990/990 (2014/07); TOEFL 100/120 (2009/08)
- Japanese: JLPT N1, 180/180 (2013/12)