The Dagstuhl-15512 ArgQuality Corpus
An English corpus for studying the assessment of argumentation quality. It contains 320 online debate portal arguments, annotated for 15 different quality dimensions by three annotators. [zip v1 1mb] [zip v2 1mb]
In version 2, the annotated XMI files follow a new underlying type system in which each quality dimension is represented by its own annotation. This annotation contains not only the majority score of the respective dimension (as in version 1), but also the mean score and the scores of all annotators. We recommend using version 2.
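As an illustration of the three values stored per dimension in version 2, the following minimal Python sketch derives a majority score and a mean score from a list of annotator scores. The function name `aggregate_scores`, the example scores, and the median fallback for the case without a strict majority are assumptions made for this sketch; they are not taken from the corpus or its type system.

```python
from collections import Counter
from statistics import mean, median


def aggregate_scores(scores):
    """Summarize one quality dimension from per-annotator scores.

    Hypothetical helper: the corpus files already contain these values,
    so this only illustrates how they relate to each other.
    """
    top_score, top_count = Counter(scores).most_common(1)[0]
    if top_count > len(scores) / 2:
        # More than half of the annotators agree on this score.
        majority = top_score
    else:
        # Assumed fallback when no strict majority exists (not from the corpus).
        majority = median(scores)
    return {"majority": majority, "mean": mean(scores), "scores": list(scores)}


# Example with three made-up annotator scores for one dimension.
summary = aggregate_scores([2, 3, 2])
print(summary["majority"], round(summary["mean"], 2), summary["scores"])
```

Keeping all individual scores next to the aggregated ones makes it possible to apply a different aggregation later without going back to the raw annotations.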
The Webis-ArgRank-17 Dataset
An English benchmark dataset for studying argument relevance. It contains 32 rankings as well as a ground-truth argument graph with more than 30,000 argument units. In addition, we provide the source code to reproduce our ranking experiments based on the dataset. [zip 13mb]
The Webis-Editorials-16 Corpus
An English corpus with 300 news editorials from three online news portals, annotated for the types of all argumentative discourse units. [zip 5mb]
The ArguAna TripAdvisor Corpus
An English corpus for studying local sentiment flows and aspect-based sentiment analysis. It contains 2100 hotel reviews, balanced with respect to the reviews’ sentiment scores. All reviews are segmented into subsentence-level statements, each of which has been manually classified as a fact, a positive opinion, or a negative opinion. In addition, all hotel aspects mentioned in the reviews have been annotated as such. [zip v1 with software 10mb] [zip v1 8mb] [zip v2 8mb]
The corpus is free to use for scientific purposes, but not for commercial applications. In version 2, the annotated XMI files follow a new underlying type system that is more easily extendable. Note that some adaptations of the software of version 1 are necessary to make it work with version 2.
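To make the notion of a local sentiment flow concrete, the sketch below turns the statement-level classes of one review into a numeric sequence. The label strings, the numeric mapping, and the function name `sentiment_flow` are assumptions for illustration only; the actual type and feature names in the XMI files may differ.

```python
# Hypothetical label names and values; not taken from the corpus type system.
LABEL_TO_VALUE = {"positive": 1.0, "fact": 0.0, "negative": -1.0}


def sentiment_flow(statement_labels):
    """Map the ordered statement labels of one review to a numeric flow."""
    return [LABEL_TO_VALUE[label] for label in statement_labels]


# Made-up review with four statements, in reading order.
labels = ["positive", "fact", "negative", "negative"]
flow = sentiment_flow(labels)
print(flow)
```

Representing a review as an ordered sequence, rather than as label counts, preserves where in the text the sentiment changes.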