Causes and Rationales for Unanswerability in the SQuAD 2.0 Dataset
This dataset contains two types of annotation for unanswerable questions in the SQuAD 2.0 dataset. For more information about SQuAD 2.0, visit the official page.
The first annotation type records why each question is unanswerable for the given context. Each question is classified into one of six classes, based on the original paper with minor edits. Detailed information for each class is described below.
Column | Description | Format | Example |
---|---|---|---|
qid | ID from SQuAD 2.0 | hash (len 24) | 5a678aa6f038b7001ab0c2a0 |
reason | why the question cannot be answered | {E, #, N, A, X, I} | E |
Name | Abbr | Description | Train | Test | Extended |
---|---|---|---|---|---|
Entity Swap | E | An entity is replaced with another entity. | 5818 | 1122 | 12597 |
Number Swap | # | A number or date is replaced with another number or date. | 1642 | 254 | 3167 |
Negation | N | A negation word is inserted or removed. | 1860 | 506 | 4099 |
Antonym | A | An antonym of a word in the context is used in the question. | 2818 | 593 | 7446 |
Mutual Exclusion | X | A word or phrase in the question is mutually exclusive with something in the context for which an answer is present. | 318 | 256 | 2942 |
No Information | I | Asks for a condition that is not satisfied by anything in the paragraph, or the paragraph does not imply any answer. | 841 | 375 | 2789 |
Total | | | 13297 | 3106 | 33040 |
- `causes/reason_gold_{train,test,full}.tsv` (seed data)
- `causes/reason_extended.tsv` (augmented data)
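A minimal sketch for loading the cause annotations, assuming the TSV files are tab-separated with a header row containing the `qid` and `reason` columns above (adjust with `header=None` and `names=...` if the files turn out to be headerless):

```python
import pandas as pd

# Human-readable names for the reason classes listed in the table above.
REASON_NAMES = {
    "E": "Entity Swap",
    "#": "Number Swap",
    "N": "Negation",
    "A": "Antonym",
    "X": "Mutual Exclusion",
    "I": "No Information",
}

reasons = pd.read_csv("causes/reason_gold_train.tsv", sep="\t")
reasons["reason_name"] = reasons["reason"].map(REASON_NAMES)
print(reasons["reason_name"].value_counts())
```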
The second annotation type contains word-level scores for each question, indicating how much each word contributes to the unanswerability of the question. Note that some of these values were not manually labeled by humans, so there may be some noise even though we performed manual checks.
Column | Description | Format | Example |
---|---|---|---|
qid | ID from SQuAD 2.0 | hash (len 24) | 5ad3ed86604f3c001a3ff7b3 |
question | question for the given qid | tokens separated by spaces | What royalty has n't attended Yale ? |
word_att | attention for each word | numbers separated by commas (,) | 0,0,0,1,0,0,0 (human-labeled); 0.008,0.029,0.108,0.997,0.012,0.006,0.0 (extended) |
- `rationales/word_att_gold_{train,test}.tsv` (seed data)
- `rationales/word_att_extended.tsv` (augmented data; only for unanswerable questions)
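A minimal sketch for aligning each question's tokens with its attention scores, assuming a header row with the `qid`, `question`, and `word_att` columns described above:

```python
import csv

with open("rationales/word_att_gold_train.tsv", newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        tokens = row["question"].split(" ")
        scores = [float(s) for s in row["word_att"].split(",")]
        # Tokens and scores are parallel sequences of the same length.
        assert len(tokens) == len(scores), row["qid"]
        # Words with the highest scores are the rationale for unanswerability.
        print(row["qid"], sorted(zip(tokens, scores), key=lambda p: -p[1])[:3])
```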
To generate seed data for word attention, we extract answerable/unanswerable question pairs from the SQuAD 2.0 dataset that share a common context and answer span. Words common to both questions in a pair are labeled 0, since they tend to be unimportant for determining the answerability of the question; all other words are labeled 1. A sketch of this heuristic follows.
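A minimal sketch of the seed-labeling heuristic, assuming the question pairs have already been extracted; `seed_word_labels` is a hypothetical helper name:

```python
def seed_word_labels(unanswerable_q: str, answerable_q: str) -> list[int]:
    # Words shared with the paired answerable question get 0, the rest get 1.
    common = set(answerable_q.split(" "))
    return [0 if w in common else 1 for w in unanswerable_q.split(" ")]

# Example: the inserted negation "n't" is the only word not shared.
print(seed_word_labels(
    "What royalty has n't attended Yale ?",
    "What royalty has attended Yale ?",
))  # -> [0, 0, 0, 1, 0, 0, 0]
```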
To extend the annotation beyond the human-labeled data, we apply tri-training (a proxy-label approach) to propagate the existing annotations to unlabeled instances.
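A minimal sketch of tri-training as a label-propagation scheme, assuming a generic `train(examples)` factory and a `model.predict(x)` interface (both hypothetical; the actual models and features used for the extended data are not specified here):

```python
import random

def tri_train(labeled, unlabeled, train, rounds=5):
    # Bootstrap-sample three initial training sets from the seed data.
    models = [train(random.choices(labeled, k=len(labeled))) for _ in range(3)]
    for _ in range(rounds):
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # When the other two models agree on an unlabeled instance,
            # use their shared prediction as a proxy label for model i.
            extra = [
                (x, models[j].predict(x))
                for x in unlabeled
                if models[j].predict(x) == models[k].predict(x)
            ]
            models[i] = train(labeled + extra)
    return models
```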