A major problem with the ubiquity of Generative AI is that it has become very easy to generate fake scientific papers. This can erode public trust in science and undermine its foundations: are we standing on the shoulders of robots? The Detecting Automatically Generated Papers (DAGPAP) competition aims to encourage the development of robust, reliable AI-generated scientific text detection systems, drawing on a diverse dataset that spans a number of scientific domains and a variety of machine learning models; a minimal baseline sketch follows the organizer list below.
Savvas Chamezopoulos, Elsevier
Dan Li, Elsevier
Anita de Waard, Elsevier
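As a rough, unofficial illustration of the kind of system the task calls for, the sketch below trains a simple generated-vs-human text classifier on TF-IDF character n-grams. The file name and column names ("text", "label") are assumptions for illustration, not part of any DAGPAP data release.

```python
# Minimal baseline sketch for AI-generated text detection (not the official
# DAGPAP baseline). Assumes a CSV with columns "text" and "label",
# where label is 0 for human-written and 1 for machine-generated text.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("dagpap_train.csv")  # hypothetical file name
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
)

# Character n-grams are a cheap, domain-agnostic signal for distinguishing
# generated from human-written scientific prose; a real submission would
# likely go well beyond this.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5), max_features=50000)
clf = LogisticRegression(max_iter=1000)

clf.fit(vectorizer.fit_transform(X_train), y_train)
print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```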
You are invited to participate in the shared task “Context25: Evidence and Grounding Context Identification for Scientific Claims”, co-located with the 5th Workshop on Scholarly Document Processing (SDP 2025) to be held at ACL 2025. Competition participants are also invited to submit papers describing their findings.
Interpreting scientific claims in the context of empirical findings is a valuable practice, yet extremely time-consuming for researchers. Interpreting a claim requires identifying key results in research papers that provide supporting evidence, and contextualizing those results with associated methodological details (e.g., measures and sample). In this shared task, we are interested in automating the identification of key results (evidence) as well as additional grounding context, to make claim interpretation more efficient (see the retrieval sketch below, after the organizer list).
Joel Chan (University of Maryland)
Matthew Akamatsu (University of Washington)
Aakanksha Naik (Allen Institute for AI)
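As a rough, unofficial illustration of the evidence-identification step, the following sketch ranks candidate sentences from a paper by embedding similarity to a claim using sentence-transformers. The model name and all example sentences are invented for illustration.

```python
# Illustrative sketch (not the official Context25 pipeline): rank candidate
# sentences from a paper by semantic similarity to a claim, as a first step
# toward evidence identification.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Invented claim and candidate sentences, for illustration only.
claim = "Actin polymerization drives endocytic vesicle internalization."
candidate_sentences = [
    "Vesicle internalization rates dropped by 60% when actin assembly was inhibited.",
    "Samples were imaged on a spinning-disk confocal microscope.",
    "Participants completed a 12-item survey on research practices.",
]

claim_emb = model.encode(claim, convert_to_tensor=True)
sent_embs = model.encode(candidate_sentences, convert_to_tensor=True)
scores = util.cos_sim(claim_emb, sent_embs)[0]

# Highest-scoring sentences are candidate evidence; grounding context
# (measures, sample, etc.) could be retrieved analogously.
for score, sent in sorted(zip(scores.tolist(), candidate_sentences), reverse=True):
    print(f"{score:.3f}  {sent}")
```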
Scholarly articles convey valuable information not only through unstructured text but also via (semi-)structured figures such as charts and diagrams. Automatically interpreting the semantics of the knowledge encoded in these figures can benefit downstream tasks such as question answering (QA). In the SciVQA challenge, participants will develop multimodal systems capable of efficiently processing both visual QA pairs (i.e., those addressing attributes such as colour, shape, or size) and non-visual QA pairs, based on images of scientific figures and their captions.
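For illustration only (this is not the SciVQA baseline), the sketch below answers a visual question about a figure image with an off-the-shelf VQA pipeline from the transformers library. The model choice, image path, and question are assumptions, and a competitive system would also condition on the figure caption.

```python
# Illustrative sketch only: answering a visual question about a figure image
# with a generic off-the-shelf VQA model. Model name, image path, and
# question are assumptions.
from transformers import pipeline

vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

answers = vqa(
    image="figure_3.png",  # hypothetical figure image
    question="What colour is the line with the highest accuracy?",
    top_k=3,
)
for a in answers:
    print(f"{a['score']:.3f}  {a['answer']}")
```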
The ClimateCheck shared task focuses on fact-checking claims from social media about climate change against peer-reviewed scholarly articles. Participants will retrieve relevant publications from a corpus of 400,000 climate research articles and classify each abstract as supporting, refuting, or not having enough information about the claim. Training data will include human-annotated claim-publication pairs, and the evaluation will combine nDCG@K and Bpref for retrieval and F1 score for classification. The task aims to develop models that link social media claims to scientific evidence, promoting informed and evidence-based discussions on climate change.
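To make the evaluation concrete, the following sketch computes nDCG@K, a simplified Bpref, and macro F1 on toy data. All rankings and labels are invented, and the official scorer may differ in detail.

```python
# Sketch of the evaluation metrics mentioned above (retrieval: nDCG@K and
# Bpref; classification: F1). Toy data only.
import numpy as np
from sklearn.metrics import ndcg_score, f1_score

# --- Retrieval: nDCG@K --------------------------------------------------
# Relevance of the top-10 retrieved abstracts for one claim (1 = relevant).
true_relevance = np.array([[1, 0, 1, 0, 0, 1, 0, 0, 0, 0]])
predicted_scores = np.array([[0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]])
print("nDCG@10:", ndcg_score(true_relevance, predicted_scores, k=10))

# --- Retrieval: simplified Bpref ----------------------------------------
def bpref(ranked_judgments):
    """Simplified Bpref over a ranked list of judged documents
    (1 = relevant, 0 = non-relevant). Assumes at least R judged
    non-relevant documents; the official scorer may differ."""
    R = sum(ranked_judgments)
    if R == 0:
        return 0.0
    score, nonrel_seen = 0.0, 0
    for j in ranked_judgments:
        if j == 1:
            score += 1 - min(nonrel_seen, R) / R
        else:
            nonrel_seen += 1
    return score / R

print("Bpref:", bpref([1, 0, 1, 0, 0, 1, 0, 0, 0, 0]))

# --- Classification: F1 over {supports, refutes, not enough info} -------
gold = ["supports", "refutes", "not_enough_info", "supports"]
pred = ["supports", "supports", "not_enough_info", "supports"]
print("Macro F1:", f1_score(gold, pred, average="macro"))
```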
Software plays an essential role in scientific research and is considered one of the crucial entity types in scholarly documents. However, software is usually not cited formally in academic documents, resulting in a wide variety of informal mentions. Automatically identifying and disambiguating software mentions, their related attributes, and the purpose of each mention contributes to better understanding, accessibility, and reproducibility of research, but is a challenging task. We are extending the first iteration of our shared task, SOMD 2024, with new challenges.
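As a deliberately naive illustration of software mention tagging (not a competitive approach, since the task exists precisely because mentions are informal and varied), the sketch below uses spaCy's rule-based EntityRuler with a few hand-picked software names; a real system would use a learned token classifier.

```python
# Naive illustration of software mention tagging (not a SOMD submission):
# a rule-based tagger using spaCy's EntityRuler. Real systems would need to
# generalize to unseen, informally cited software.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "SOFTWARE", "pattern": "SPSS"},
    {"label": "SOFTWARE", "pattern": [{"LOWER": "scikit"}, {"ORTH": "-"}, {"LOWER": "learn"}]},
    {"label": "SOFTWARE", "pattern": "ImageJ"},
])

doc = nlp("Statistical analyses were performed in SPSS; images were processed with ImageJ.")
for ent in doc.ents:
    print(ent.text, ent.label_)
```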