
PoSTWITA

PoSTWITA is the shared task on Part-of-Speech tagging of Twitter posts held at EVALITA 2016. Its content was extracted from the SENTIPOLC corpus. The PoSTWITA dataset consists of Italian tweets, tokenized and annotated at the PoS level with a tagset inspired by the Universal Dependencies scheme.

After the task took place, the PoSTWITA corpus was used in a new, independent project to develop a Twitter-based Italian treebank fully compliant with Universal Dependencies, thus becoming PoSTWITA-UD. In particular, the first core of the resource was automatically annotated in out-of-domain parsing experiments using different parsers. The output with the best results was then revised by two annotators to produce the final version of the resource. PoSTWITA-UD has been available in the official UD repository since the v2.1 release.
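Universal Dependencies treebanks are distributed in the CoNLL-U format, so the PoS layer of a resource like PoSTWITA-UD can be read with a few lines of code. The following is a minimal sketch, assuming the standard ten-column, tab-separated CoNLL-U layout. The function name `read_upos` and the sample sentence are hypothetical and used only for illustration; the sample is not taken from the corpus.

```python
from collections import Counter

# Made-up sentence in CoNLL-U layout (NOT from PoSTWITA-UD).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = (
    "# text = Vado al mare\n"
    "1\tVado\tandare\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2-3\tal\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\ta\ta\tADP\t_\t_\t4\tcase\t_\t_\n"
    "3\til\til\tDET\t_\t_\t4\tdet\t_\t_\n"
    "4\tmare\tmare\tNOUN\t_\t_\t1\tobl\t_\t_\n"
    "\n"
)


def read_upos(conllu_text):
    """Yield (form, upos) pairs for the syntactic words in CoNLL-U text."""
    for line in conllu_text.splitlines():
        # Skip sentence separators and comment lines such as "# text = ...".
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        token_id, form, upos = cols[0], cols[1], cols[3]
        # Skip multiword-token ranges ("2-3") and empty nodes ("1.1").
        if "-" in token_id or "." in token_id:
            continue
        yield form, upos


# Count how often each universal PoS tag occurs in the sample.
counts = Counter(upos for _, upos in read_upos(SAMPLE))
```

To apply the same function to a real treebank file, read the file as UTF-8 text and pass its contents to `read_upos`.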

Authors:

Cristina Bosco
Fabio Tamburini
Andrea Bolioli
Alessandro Mazzei

Publications:

Cristina Bosco, Fabio Tamburini, Andrea Bolioli, Alessandro Mazzei
Overview of the EVALITA 2016 Part Of Speech on TWitter for ITAlian task
CLiC-it 2016 / EVALITA 2016
