The Hate Speech Detection task
consists in automatically annotating messages from Twitter and Facebook.
The dataset proposed for the task is the result of a joint effort by two
research groups to harmonize the annotation previously applied to two
different datasets: the first is a collection of Facebook comments
developed by one of the groups, and the second is a Twitter corpus, the
Italian Hate Speech Corpus, developed by the Turin group.
The annotation scheme has thus been simplified: it includes only a binary
value indicating whether hateful content is present in a given
tweet or Facebook comment.
The task organizers also created this harmonized scheme with a view to
a cross-domain evaluation, in which one dataset is used for training and the
other for testing the system.
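
The cross-domain setting can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy `FACEBOOK` and `TWITTER` lists stand in for the real resources, `train_majority` is a placeholder baseline rather than a system proposed for the task, and macro-averaged F1 is used only as an example metric, since the text above does not specify one.

```python
from collections import Counter

# Hypothetical toy data: (text, label) pairs, where 1 = hateful, 0 = not hateful.
FACEBOOK = [("fb comment one", 1), ("fb comment two", 1), ("fb comment three", 0)]
TWITTER = [("tweet one", 0), ("tweet two", 0), ("tweet three", 0), ("tweet four", 1)]


def train_majority(examples):
    """Placeholder model: always predicts the most frequent training label."""
    majority = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda text: majority


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores over the two binary labels."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
        fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
        fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_domain_eval(train_set, test_set, train_fn):
    """Train on one domain and score the resulting model on the other."""
    model = train_fn(train_set)
    gold = [label for _, label in test_set]
    pred = [model(text) for text, _ in test_set]
    return macro_f1(gold, pred)


if __name__ == "__main__":
    print("Facebook -> Twitter:", cross_domain_eval(FACEBOOK, TWITTER, train_majority))
    print("Twitter -> Facebook:", cross_domain_eval(TWITTER, FACEBOOK, train_majority))
```

Because both resources share the same binary label, the same evaluation function can be applied in either direction simply by swapping the training and test sets.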
It is worth pointing out, however, that despite their joint use in the task,
the resources are still maintained separately; the Turin group is therefore
responsible for the Twitter dataset only.