Software Mention Detection (SOMD) 2025
Software plays an essential role in scientific research and is considered one of the crucial entity types in scholarly documents. However, software is usually not cited formally in academic documents, resulting in various informal software mentions. Automatic identification and disambiguation of software mentions, their related attributes, and the purpose of each mention contribute to better understanding, accessibility, and reproducibility of research, but this remains a challenging task (Schindler et al., 2021).
This competition invites participants to develop a system that detects software mentions and their attributes as named entities in scholarly texts and classifies the relations between these entity pairs (see the participation link below). The dataset consists of sentences from full-text scholarly documents annotated with named entities and relations. It covers various software types, such as operating systems and applications, and attributes such as URLs and version numbers.
This task emphasizes the joint learning of Named Entity Recognition (NER) and Relation Extraction (RE) (Hennen et al., 2024; Huguet Cabot & Navigli, 2021; Wadden et al., 2019; Ye et al., 2022) to improve computational efficiency and model accuracy, moving away from traditional pipeline approaches (Zeng et al., 2014; Zhang et al., 2017). Effective integration of NER and RE has been shown to boost performance significantly (Li & Ji, 2014).
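To make the joint task concrete, the sketch below shows what one annotated sentence might look like as a Python data structure; the sentence, entity types, and relation name are illustrative assumptions based on the types mentioned above, not the official SOMD annotation schema.

```python
# Hypothetical annotated sentence (illustrative labels, not the official schema):
# NER identifies software mentions and their attributes; RE links attribute
# entities to the software mention they describe.
example = {
    "sentence": "We analysed the data with SPSS 22.0 for Windows.",
    "entities": [
        {"id": 0, "text": "SPSS",    "type": "Application"},      # software mention
        {"id": 1, "text": "22.0",    "type": "Version"},          # attribute
        {"id": 2, "text": "Windows", "type": "OperatingSystem"},  # software mention
    ],
    "relations": [
        {"head": 1, "tail": 0, "type": "Version_of"},  # "22.0" is the version of "SPSS"
    ],
}
```

A joint model predicts both the entity spans and the relations between them in a single pass, rather than running a separate relation classifier on the output of an NER pipeline.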
Competition Platform and Phases
Platform: Participants will submit their entries on the Codabench platform. Please follow this Link to Participate.
The competition will proceed in two phases:
- Phase I: Participants will develop their models using a training set drawn from the same distribution as the first test set.
- Phase II: The second test set, consisting of scholarly documents sampled from computer science journals in PubMed Central, will test the generalization of the developed systems to out-of-distribution data.
Dataset
The dataset is made available on the competition platform.
Evaluation
We evaluate submissions using the F1 score, which combines the precision and recall of both Named Entity Recognition (NER) and Relation Extraction (RE). We will calculate the macro-averaged F1 score using exact-match criteria (Nakayama, 2018) for each of the two test phases.
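For orientation, the following is a minimal sketch of an entity-level, exact-match, macro-averaged F1 computation with the seqeval library (Nakayama, 2018) cited above; the BIO labels are illustrative rather than the official SOMD tag set, and the evaluation script on the platform is authoritative.

```python
# Minimal sketch: exact-match (strict), macro-averaged F1 with seqeval.
# The BIO labels below are illustrative, not the official SOMD tag set.
from seqeval.metrics import f1_score
from seqeval.scheme import IOB2

y_true = [["B-Application", "I-Application", "O", "B-Version"]]
y_pred = [["B-Application", "I-Application", "O", "O"]]

# mode="strict" with an explicit tagging scheme counts an entity as correct
# only if both its boundaries and its type match exactly; average="macro"
# averages the per-type F1 scores.
print(f1_score(y_true, y_pred, average="macro", mode="strict", scheme=IOB2))
```

Here the Application entity is matched exactly (F1 = 1.0) while the Version entity is missed (F1 = 0.0), giving a macro F1 of 0.5.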
Competition Timeline Overview
- Competition Registration starts on February 24, 2025
- First phase: dataset release (train and test data): February 27, 2025
- First phase ends (submission closes): March 18, 2025
- Second phase data release: March 18, 2025
- Competition ends (Phase II submission closes): April 4, 2025
- Paper submission deadline: April 17, 2025
- Notification of acceptance: May 1, 2025
- Camera-ready paper deadline for workshop: May 16, 2025
- Workshop date: July 21-August 1, 2025
Paper Submission Guidelines
- Paper Submission Portal: Submit your paper via the following link: Submission Portal
- Formatting Guidelines: Your paper must be formatted according to the official ACL submission guidelines. For further details, please refer to: ACL Submission Details
- ACL Template: Please use the official ACL template available at: ACL Template on GitHub
- Paper Length Options: You may submit either a long paper (8 pages) or a short paper (4 pages).
- Reproducible Model Requirement: In addition to your paper, you must submit a reproducible model that incorporates your system, provided via a GitHub repository. Ensure that the repository is public and contains all files, documentation, and instructions necessary to fully reproduce your results.
- Final Notes: Ensure that your submission is complete and adheres to all the guidelines above. Submissions that do not comply with the formatting or reproducibility requirements may be rejected.
Organizers
Sharmila Upadhyaya (GESIS Leibniz Institut für Sozialwissenschaften, Germany)
Wolfgang Otto (GESIS Leibniz Institut für Sozialwissenschaften, Germany)
Frank Krueger (Wismar University of Applied Sciences, Germany)
Stefan Dietze (GESIS Leibniz Institut für Sozialwissenschaften, Cologne & Heinrich-Heine-University Düsseldorf, Germany)
For inquiries: somd25@googlegroups.com. Join our Google Group for updates and discussions related to the competition.
Funding
This work has received funding through the DFG project NFDI4DS (no. 460234259).
We wish to thank NFDI4DS for both funding and support. Special thanks go to all institutions and actors committed to the association and its goals.
For more information about NFDI4DS, visit the website.
References
- Hennen, M., Babl, F., & Geierhos, M. (2024). ITER: Iterative Transformer-based Entity Recognition and Relation Extraction. In Findings of EMNLP 2024, 11209-11223. DOI: 10.18653/v1/2024.findings-emnlp.655
- Huguet Cabot, P.-L., & Navigli, R. (2021). REBEL: Relation Extraction By End-to-end Language Generation. In Findings of EMNLP 2021, 2370-2381. DOI: 10.18653/v1/2021.findings-emnlp.204
- Li, Q., & Ji, H. (2014). Incremental Joint Extraction of Entity Mentions and Relations. In Proceedings of ACL 2014, 402-412. DOI: 10.3115/v1/P14-1038
- Nakayama, H. (2018). seqeval: A Python framework for sequence labeling evaluation. Software available from https://github.com/chakki-works/seqeval
- Schindler, D., Bensmann, F., Dietze, S., & Krüger, F. (2021). SoMeSci - A 5 Star Open Data Gold Standard Knowledge Graph of Software Mentions in Scientific Articles. In Proceedings of CIKM 2021, 4574-4583. DOI: 10.1145/3459637.3482017
- Wadden, D., Wennberg, U., Luan, Y., & Hajishirzi, H. (2019). Entity, Relation, and Event Extraction with Contextualized Span Representations. In Proceedings of EMNLP-IJCNLP 2019, 5783-5788. DOI: 10.18653/v1/D19-1585
- Ye, D., Lin, Y., Li, P., & Sun, M. (2022). Packed Levitated Marker for Entity and Relation Extraction. In Proceedings of ACL 2022, 4904-4917. DOI: 10.18653/v1/2022.acl-long.337
- Zeng, D., Liu, K., Lai, S., Zhou, G., & Zhao, J. (2014). Relation Classification via Convolutional Deep Neural Network. In Proceedings of COLING 2014. ACL Anthology
- Zhang, Y., Zhong, V., Chen, D., Angeli, G., & Manning, C. D. (2017). Position-aware Attention and Supervised Data Improve Slot Filling. In Proceedings of EMNLP 2017, 35-45. DOI: 10.18653/v1/D17-1004