CLARITY
Unmasking Political Question Evasions

SemEval 2026 Challenge

Video Examples

Overview

Political discourse is rich in ambiguity, a feature that often serves strategic purposes. In high-stakes settings such as televised presidential debates or interviews, politicians frequently employ evasive communication strategies that leave audiences with multiple interpretations of what was actually said and misleading impressions of whether the requested information was conveyed. This phenomenon, known as equivocation or evasion in the academic literature, is well studied in political science but has received limited attention in computational linguistics. Bull (2003) reports that politicians gave clear responses to only 39-46% of questions during televised interviews, while non-politicians had a significantly higher reply rate of 70-89%. This stark contrast highlights the strategic nature of political communication and the need for automated tools to analyze response clarity at scale.

Our proposed CLARITY shared task aims to address this gap by introducing a computational approach to detecting and classifying response ambiguity in political discourse. Building on well-grounded theory of equivocation and leveraging recent advances in language modeling, we propose a task that challenges participants to automatically classify the clarity of responses in question-answer (QA) pairs extracted from presidential interviews.

What makes this task particularly compelling is its novel, two-level taxonomy, derived from our paper (Thomas et al., 2024), presented at EMNLP 2024:

  1. A high-level clarity/ambiguity classification.
  2. A fine-grained classification into 9 evasion techniques drawn from political discourse.

This hierarchical approach not only provides a deeper understanding of political discourse but also, as our preliminary experiments show, can lead to improved classification performance when the two levels are used in conjunction.

The CLARITY task will attract researchers from diverse communities, including NLP researchers interested in discourse analysis, semantic understanding, and reasoning over long contexts; fact-checking researchers seeking to detect whether an answer is factual but irrelevant; and practitioners working on downstream NLP tasks such as question answering and dialogue systems. Moreover, by providing a standardized dataset and evaluation framework, CLARITY will facilitate political speech discourse analysis at scale, allowing comparisons across politicians, time periods, and contexts. This will contribute to the development of more transparent and accountable political communication and provide insights to media analysts examining patterns in political interviews and press conferences.

Can your system unmask a seasoned politician’s dodge?

Clarity Classification Pipeline


Figure 1 - An example question-answer pair from an interview in our dataset, with its classification and an accompanying analysis from instruction-tuned Llama-70b.

Tasks & Evaluation

Task 1 - Clarity-level Classification

Given a question and an answer, classify the answer as Clear Reply, Ambiguous or Clear Non-Reply.


Task 2 - Evasion-level Classification

Given a question and an answer, classify the answer into one of the 9 evasion techniques.
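
To make the input and output of the two tasks concrete, below is a minimal baseline sketch for Task 1; the same pattern applies to Task 2 by swapping in the evasion labels. The file and column names are assumptions for illustration only, and the official dataset documentation is authoritative.

    # Minimal Task 1 baseline sketch (TF-IDF + logistic regression).
    # File and column names ("question", "answer", "clarity_label") are
    # assumptions for illustration; see the dataset documentation for the
    # actual schema.
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train = pd.read_csv("clarity_train.csv")          # hypothetical file name
    texts = train["question"] + " [SEP] " + train["answer"]

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(texts, train["clarity_label"])            # Clear Reply / Ambiguous / Clear Non-Reply

    test = pd.read_csv("clarity_test.csv")            # hypothetical file name
    preds = clf.predict(test["question"] + " [SEP] " + test["answer"])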


Evaluation

Both tasks will be evaluated using macro F1-score, ensuring balanced performance across all classes. Evaluation will be conducted on both the official test set and a held-out private evaluation set to ensure robust performance assessment.
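
For reference, macro F1 can be computed with scikit-learn as sketched below; the label strings are illustrative.

    from sklearn.metrics import f1_score

    # Macro F1 averages the per-class F1 scores, so every class counts
    # equally regardless of how frequent it is in the data.
    gold  = ["Clear Reply", "Ambiguous", "Clear Non-Reply", "Ambiguous"]
    preds = ["Clear Reply", "Clear Reply", "Clear Non-Reply", "Ambiguous"]
    print(f1_score(gold, preds, average="macro"))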

Registration is being conducted through the REGISTRATION FORM.

Timeline

  • July 15, 2025: Sample and Training data ready
  • July 31, 2025: Task announced at ACL 2025 in Vienna
  • January 10, 2026: Evaluation start Task 1
  • January 20, 2026: Evaluation end Task 1
  • January 21, 2026: Evaluation start Task 2
  • January 31, 2026: Evaluation end Task 2
  • February 2026: Paper submission due
  • March 2026: Notification to authors
  • April 2026: Camera ready due
  • Summer 2026: SemEval workshop

FAQ

Q: How do I participate in the CLARITY shared task?

Registration is conducted through the registration form linked above. Submission guidelines will be announced soon; please check back for updates.

Q: What is the format of the dataset?

The dataset consists of question-answer pairs extracted from presidential interviews, with annotations for both clarity levels and evasion techniques. Detailed format specifications are available in the dataset documentation.
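
As a rough illustration (all field names and values below are assumptions, not the official schema), a single record might look like this:

    # Hypothetical record layout, for illustration only; the official dataset
    # documentation takes precedence over these field names.
    example = {
        "question": "Will you release your tax returns before the election?",
        "answer": "What voters really care about right now is the economy.",
        "clarity_label": "Clear Non-Reply",                     # Task 1 annotation
        "evasion_label": "<one of the 9 evasion techniques>",   # Task 2 annotation
    }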

Q: Can I participate in only one of the two tasks?

Yes, participants can choose to participate in Task 1 (Clarity-level Classification), Task 2 (Evasion-level Classification), or both tasks.

Q: The evasion_label column in the test dataset is empty; what is the ground truth for Task 2?

The 'evasion_label' column is empty because each of the three annotators provided their own label. A prediction that matches any of the annotator1, annotator2, or annotator3 labels is considered correct.
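
A minimal sketch of applying that "any annotator counts as correct" rule to your own Task 2 predictions (the file name is hypothetical, and the official scoring procedure remains authoritative):

    # The annotator1/annotator2/annotator3 column names come from the FAQ
    # answer above; how this rule feeds into the official macro-F1 scoring
    # is determined by the organizers' scoring script.
    import pandas as pd

    test = pd.read_csv("clarity_test.csv")                # hypothetical file name
    preds = ["<evasion technique>"] * len(test)           # replace with your Task 2 predictions

    gold = test[["annotator1", "annotator2", "annotator3"]].values.tolist()
    correct = [p in g for p, g in zip(preds, gold)]
    print(sum(correct) / len(correct))                    # fraction judged correct under this rule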

Q: What evaluation metrics will be used?

Both tasks will be evaluated using macro F1-score to ensure balanced performance across all classes.

Q: Is there a limit on the number of submissions?

Submission guidelines including limits will be provided closer to the evaluation period. Please stay tuned for updates.

Q: Where can I find more technical details about the task?

Please refer to our EMNLP 2024 paper for detailed information about the dataset construction, annotation process, and baseline experiments.

Cite

@misc{thomas2024isaidthatdataset,
  title={"I Never Said That": A dataset, taxonomy and baselines on response clarity classification},
  author={Konstantinos Thomas and Giorgos Filandrianos and Maria Lymperaiou and Chrysoula Zerva and Giorgos Stamou},
  year={2024},
  eprint={2409.13879},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2409.13879}
}