PoSTWITA is the shared task on Part-of-Speech tagging of Twitter posts held at EVALITA 2016. Its content was extracted from the SENTIPOLC corpus. The PoSTWITA dataset consists of Italian tweets, tokenized and annotated at the PoS level with a tagset inspired by the Universal Dependencies scheme.
After the task took place, the PoSTWITA corpus was used in a new, independent project to develop a Twitter-based Italian treebank fully compliant with Universal Dependencies, thus becoming PoSTWITA-UD. In particular, the first core of the resource was annotated automatically through out-of-domain parsing experiments with different parsers. The best-scoring output was then revised by two annotators to produce the final version of the resource. PoSTWITA-UD has been available in the official UD repository since the v2.1 release.
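Treebanks in the UD repository are distributed in the CoNLL-U format, so the PoS annotation can be read with a few lines of code. The following is a minimal sketch, not part of the official tooling: the function name `read_upos` and the two-token sample are hypothetical and invented for illustration (the sample is not taken from PoSTWITA-UD). It assumes the standard ten tab-separated CoNLL-U columns, with the word form in column 2 and the universal PoS tag in column 4.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U layout (invented, not from PoSTWITA-UD).
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = Buongiorno mondo\n"
    "1\tBuongiorno\tbuongiorno\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tmondo\tmondo\tNOUN\t_\t_\t1\tvocative\t_\t_\n"
    "\n"
)


def read_upos(text):
    """Return a list of sentences, each a list of (form, upos) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Not a well-formed token line; ignore it in this sketch.
            continue
        if not cols[0].isdigit():
            # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
            continue
        current.append((cols[1], cols[3]))
    if current:
        sentences.append(current)
    return sentences


sentences = read_upos(SAMPLE)
tag_counts = Counter(upos for sent in sentences for _, upos in sent)
print(tag_counts)
```

To apply the same function to a released file, pass it the file contents read with UTF-8 encoding in place of `SAMPLE`.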
Authors:
Publications: