Bangor Autoglosser

The Bangor Autoglosser automatically glosses (POS-tags) CHAT files containing Welsh, Spanish and English text.

Getting the code

The code (licensed under the GPL v3) is available on GitHub.

Note that the code is not properly packaged, because much of the work was done ad hoc. (For a smaller, cleaner implementation, try the Gàidhlig autoglosser.) Hopefully this will be remedied (at least for Welsh) as part of the work on the new CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - National Corpus of Contemporary Welsh).

For publications about the autoglosser, see the publications page.

The databundle referred to in the ISB8 presentation is available here.

The corpora

The Siarad corpus
The Patagonia corpus
The Miami corpus

The support of the Arts and Humanities Research Council (AHRC), the Economic and Social Research Council (ESRC), the Higher Education Funding Council for Wales (HEFCW) and the Welsh Government is gratefully acknowledged.