SOAS University of London

Tibetan Studies at SOAS

Tibetan in Digital Communication - Outputs

Publications

In addition to the corpora and software tools linked to above, this research project produced the following publications.

Hill, Nathan W. (2016) 'Tibetan part-of-speech conundrums: maṅ and yun riṅ.' Rocznik Orientalistyczny, 68 (2). pp. 65-72.

Hill, Nathan W. and Di, Jiang, eds. (2016) Himalayan Linguistics 15.1 (Special Issue on Tibetan Natural Language Processing). Berkeley, CA: University of California Press.

Garrett, Edward and Hill, Nathan W. and Kilgarriff, Adam and Vadlapudi, Ravikiran and Zadoks, Abel (2015) 'The contribution of corpus linguistics to lexicography and the future of Tibetan dictionaries.' Revue d'Etudes Tibétaines, 32. pp. 51-86.

Garrett, Edward and Hill, Nathan W. (2015) 'Constituent order in the Tibetan noun phrase.' SOAS Working Papers in Linguistics, 17. pp. 35-48.

Hill, Nathan W. and Zadoks, Abel (2015) 'Tibetan √lan ‘reply’.' Journal of the Royal Asiatic Society of Great Britain & Ireland (Third Series), 25 (1). pp. 117-121.

Garrett, Edward and Hill, Nathan W. and Zadoks, Abel (2014) 'A Rule-based Part-of-speech Tagger for Classical Tibetan.' Himalayan Linguistics, 13 (1). pp. 9-57.

Garrett, Edward and Hill, Nathan W. and Zadoks, Abel (2013) 'Disambiguating Tibetan verb stems with matrix verbs in the indirect infinitive construction.' Bulletin of Tibetology, 49 (2). pp. 35-44.

Presentations

Hill, Nathan W. (2015) Tibetan Corpus Linguistics: present obstacle and future prospects. Backdoor Broadcasting Company. [Audio]

Hill, Nathan W. (2015) Tibetan Corpus Linguistics: our progress so far. Backdoor Broadcasting Company. [Audio]

Hill, Nathan W. (2014) Tibetan Word Breaking and Part of Speech Categories. Backdoor Broadcasting Company. [Audio]

Hill, Nathan W. (2014) A rule based tagger for Classical Tibetan. Backdoor Broadcasting Company. [Audio]

Computational Outputs

Hill, Nathan W., & Garrett, Edward. (2017). A part-of-speech (POS) lexicon of Classical Tibetan for NLP [Data set]. Zenodo. http://doi.org/10.5281/zenodo.574876

Garrett, Edward, & Hill, Nathan W. (2017). A rule based Tibetan part-of-speech (POS) tagger for the creation of gold standard training data [Data set]. Zenodo. http://doi.org/10.5281/zenodo.574882

Hill, Nathan W., & Garrett, Edward. (2017). A part-of-speech (POS) tagged corpus of Classical Tibetan [Data set]. Zenodo. http://doi.org/10.5281/zenodo.574878