Bangor Autoglosser

The Bangor Autoglosser automatically glosses (POS-tags) CHAT files containing Welsh, Spanish and English.

Getting the code

The code (licensed under the GPL v3) is available on GitHub.

A new version, Autoglosser2, is now available, focussed on written Welsh. This version has tidier code and is much faster (22,000 glosses/minute); see the manual.

For publications about the autoglosser, see the publications page.

The databundle referred to in the ISB8 presentation is available here.

Contact us

The corpora

The Siarad corpus
The Patagonia corpus
The Miami corpus

Research Team


The support of the Arts and Humanities Research Council (AHRC), the Economic and Social Research Council (ESRC), the Higher Education Funding Council for Wales (HEFCW) and the Welsh Government is gratefully acknowledged.