Shared Tasks

Quick links

  • MSLR22: Multi-Document Summarization for Literature Reviews
  • DAGPap22: Detecting automatically generated scientific papers
  • LongSumm 2022: Generating Long Summaries for Scientific Documents
  • SV-Ident 2022: Survey Variable Identification in Social Science Publications
  • Scholarly Knowledge Graph Generation
  • Multi Perspective Scientific Document Summarization

MSLR22: Multi-Document Summarization for Literature Reviews

Systematic literature reviews aim to comprehensively summarize evidence from all available studies relevant to a question. In the context of medicine, such reviews constitute the highest quality evidence used to inform clinical care. However, reviews are expensive to produce manually; (semi-)automation via NLP may facilitate faster evidence synthesis without sacrificing rigor. We introduce the task of multi-document summarization for generating review summaries. This task uses two datasets of review summaries derived from the scientific literature [1][2]. Participating teams are evaluated using automated and human evaluation metrics. We also encourage contributions which extend this task and dataset, e.g., by proposing scaffolding tasks, methods for model interpretability, and improved automated evaluation methods in this domain.
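
The task description above mentions automated evaluation without prescribing a metric; as a rough illustration only (not part of the official task definition), a generated review summary can be scored against a reference with ROUGE using the rouge-score package. The texts below are invented placeholders.

```python
# Illustrative only: scoring a generated review summary against a reference
# with ROUGE, a common automated metric for summarization.
# Requires: pip install rouge-score
from rouge_score import rouge_scorer

reference = "Evidence from the included trials suggests the intervention reduces symptoms."
generated = "The included studies suggest that the intervention reduces symptom severity."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, generated)
for name, result in scores.items():
    print(f"{name}: precision={result.precision:.3f} recall={result.recall:.3f} f1={result.fmeasure:.3f}")
```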

Details about data access, task evaluation, and more are available here.

Please join our mailing list to receive updates or email lucyw@allenai.org to be added to our Slack workspace.

  1. Jay DeYoung, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl, and Lucy Lu Wang. (2021). "MS2: A Dataset for Multi-Document Summarization of Medical Studies." EMNLP.
  2. Byron C. Wallace, Sayantani Saha, Frank Soboczenski, and Iain James Marshall. (2020). "Generating (factual?) narrative summaries of RCTs: Experiments with neural multi-document summarization." AMIA Annual Symposium.

Organizers

Lucy Lu Wang, Allen Institute for AI

Jay DeYoung, Northeastern University

Byron Wallace, Northeastern University

Please email mslr-organizers@googlegroups.com to contact the organizers.


DAGPap22: Detecting automatically generated scientific papers

There are increasing reports that research papers can be written by computers, which raises a number of concerns (e.g., see [1]). In this challenge we explore the state of the art in detecting automatically generated papers. We frame the detection problem as a binary classification task: given an excerpt of text, label it as either human-written or machine-generated. To this end we will provide a corpus of over 4,000 automatically written papers, based on the work by Cabanac et al. [2], as well as documents collected by our publishing and editorial teams. As a control, we will provide a corpus of openly accessible human-written papers from the same scientific domains as the generated documents.
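
As a minimal sketch of the binary framing above (not a shared-task baseline), the snippet below trains a simple bag-of-words classifier with scikit-learn; the excerpts and labels are invented toy examples, not data from the task corpus.

```python
# Sketch: label text excerpts as human-written (0) or machine-generated (1).
# Toy data only; a real system would be trained on the shared-task corpus.
# Requires: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "We evaluate the proposed method on three benchmark datasets.",   # human-written (toy)
    "The profound learning of the counterfeit neural organization.",  # machine-generated (toy)
]
train_labels = [0, 1]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)

print(clf.predict(["The colossal information handling of the fake brain network."]))
```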

We also encourage contributions that aim to extend this dataset with other computer-generated scientific papers, or papers that propose valid metrics to assess automatically generated papers against those written by humans.

  1. Holly Else. (2021). "'Tortured phrases' give away fabricated research papers." Nature.
  2. Guillaume Cabanac, Cyril Labbé, and Alexander Magazinov. (2021). "Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals."

Organizers

Anita de Waard, Elsevier

Yury Kashnitsky, Elsevier

Guillaume Cabanac, University of Toulouse

Cyril Labbé, Université Grenoble Alpes

Alexander Magazinov, Yandex


LongSumm 2022: Generating Long Summaries for Scientific Documents

Most of the existing work on scientific document summarization focuses on generating short, abstract-like summaries. The LongSumm task focuses instead on generating high-quality long summaries for scientific literature. This is the third iteration of LongSumm [1]. At SDP 2021, LongSumm received 50 submissions from 8 different teams. Evaluation results are reported on a public leaderboard.

  1. Iz Beltagy, Arman Cohan, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Keith Hall, Drahomira Herrmannova, Petr Knoth, Kyle Lo, Philipp Mayr, Robert M. Patton, Michal Shmueli-Scheuer, Anita de Waard, Kuansan Wang, and Lucy Lu Wang. (2021). "Proceedings of the Second Workshop on Scholarly Document Processing." Association for Computational Linguistics.

Organizers

Guy Feigenblat, Piiano Privacy Solutions

Michal Shmueli-Scheuer, IBM Research


SV-Ident 2022: Survey Variable Identification in Social Science Publications

For this shared task, we focus on concepts specific to social science literature, namely survey variables. We build on the original work of [1], [2] and propose an evaluation exercise on the task of "Variable Detection and Linking". Survey variable mention identification in texts can be seen as a multi-label classification problem: Given a sentence in a document (in our case: a scientific publication in the social sciences), and a list of unique variables (from a reference vocabulary of survey variables), the task is to classify which variables, if any, are mentioned in each sentence.

We split the task into two sub-tasks: a) variable detection and b) variable disambiguation. Variable detection deals with identifying whether a sentence contains a variable mention or not, whereas variable disambiguation focuses on identifying which variable from the vocabulary is specifically mentioned in a certain sentence.
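
One possible (purely illustrative) approach to the disambiguation sub-task, assuming each variable in the vocabulary has a short textual description, is to embed the sentence and the variable descriptions and rank candidates by cosine similarity. The model name, sentence, and variables below are invented for the sketch and are not part of the task definition.

```python
# Sketch: rank candidate survey variables for a sentence by embedding similarity.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

sentence = "Respondents were asked how much they trust the national parliament."
variables = {
    "v001": "Trust in parliament",
    "v002": "Satisfaction with democracy",
    "v003": "Frequency of political discussion",
}

sent_emb = model.encode(sentence, convert_to_tensor=True)
var_embs = model.encode(list(variables.values()), convert_to_tensor=True)
scores = util.cos_sim(sent_emb, var_embs)[0].tolist()

for (var_id, label), score in sorted(zip(variables.items(), scores), key=lambda x: -x[1]):
    print(f"{var_id}\t{label}\t{score:.3f}")
```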

This task is organized by the VAriable Detection, Interlinking and Summarization (VADIS) project.

Link to the SV-Ident 2022 page (more info to come): https://vadis-project.github.io/sv-ident-sdp2022/

  1. Andrea Zielinski and Peter Mutschke. (2017). "Mining social science publications for survey variables." Proceedings of the Second Workshop on NLP and Computational Social Science. Association for Computational Linguistics (ACL).
  2. Andrea Zielinski and Peter Mutschke. (2018). "Towards a gold standard corpus for variable detection and linking in social science publications." Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC). European Language Resources Association (ELRA).

Organizers

Simone Paolo Ponzetto, University of Mannheim

Andrea Zielinski, Fraunhofer ISI

Tornike Tsereteli, University of Stuttgart

Yavuz Selim Kartal, GESIS

Philipp Mayr, GESIS


Scholarly Knowledge Graph Generation

With the demise of the widely used Microsoft Academic Graph (MAG) [1], [2] at the end of 2021, the scholarly document processing community is facing a pressing need to replace MAG with an open-source, community-supported service. A number of challenging data processing tasks are essential for the scalable creation of a comprehensive scholarly graph, i.e., a graph of entities including, but not limited to, research papers, their authors, research organisations, and research themes. This shared task will evaluate three key sub-tasks involved in the generation of a scholarly graph:

  1. extracting research themes from scholarly documents,
  2. document deduplication, i.e. identifying and linking different versions of the same scholarly document, and
  3. affiliation mining, i.e. linking research papers or their metadata to the organisational entities that produced them.

Test and evaluation data will be supplied by the CORE aggregator [3].
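
As a rough sketch of the deduplication sub-task (not an official baseline), two metadata records can be treated as versions of the same document if their DOIs match or, failing that, if their normalised titles are sufficiently similar. The records and threshold below are invented for illustration.

```python
# Sketch of a simple deduplication heuristic using only the standard library:
# match on DOI when available, otherwise fall back to fuzzy title similarity.
import re
from difflib import SequenceMatcher

def normalise(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", text.lower())).strip()

def same_document(rec_a, rec_b, threshold=0.9):
    if rec_a.get("doi") and rec_a.get("doi") == rec_b.get("doi"):
        return True
    ratio = SequenceMatcher(None, normalise(rec_a["title"]), normalise(rec_b["title"])).ratio()
    return ratio >= threshold

record_a = {"title": "A Survey of Scholarly Document Processing", "doi": None}          # toy record
record_b = {"title": "A survey of scholarly document processing.", "doi": "10.1234/example"}  # toy record
print(same_document(record_a, record_b))
```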

  1. Kuansan Wang, Zhihong Shen, Chiyuan Huang, Chieh-Han Wu, Yuxiao Dong, and Anshul Kanakia. (2020). "Microsoft academic graph: When experts are not enough." Quantitative Science Studies. MIT Press Direct.
  2. Drahomira Herrmannova and Petr Knoth. (2016). "An analysis of the microsoft academic graph." D-Lib Magazine.
  3. Petr Knoth and Zdenek Zdrahal. (2012). "Core: three access levels to underpin open access." D-Lib Magazine.

Preregistration

Pre-register your team here and we'll keep you posted with competition updates and timelines.

Organizers

Petr Knoth, Open University

David Pride, Open University

Ronin Wu, Iris.ai

Drahomira Herrmannova


Multi Perspective Scientific Document Summarization

Generating summaries of scientific documents is known to be a challenging task. The majority of existing work in summarization assumes a single best gold summary for each document. Having only one gold summary negatively impacts our ability to evaluate the quality of summarization systems, as writing summaries is a subjective activity. At the same time, annotating multiple gold summaries for scientific documents can be extremely expensive, as it requires domain experts to read and understand long scientific documents. This shared task will enable exploring methods for generating multi-perspective summaries. We introduce a novel summarization corpus, leveraging data from scientific peer reviews to capture diverse perspectives from the reader's point of view.
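
The evaluation protocol is not spelled out here; one natural option when multiple gold summaries are available is to score a system summary against each reference and keep, for example, the best ROUGE value. A minimal sketch with invented toy texts and the rouge-score package:

```python
# Sketch: evaluate one system summary against several gold summaries by taking
# the best ROUGE-L F1 over references. Toy texts only.
# Requires: pip install rouge-score
from rouge_score import rouge_scorer

references = [
    "The paper proposes a new pretraining objective for long documents.",
    "A pretraining method tailored to long scientific documents is introduced.",
]
system_summary = "The authors introduce a pretraining objective designed for long scientific documents."

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
best_f1 = max(scorer.score(ref, system_summary)["rougeL"].fmeasure for ref in references)
print(f"Best ROUGE-L F1 over references: {best_f1:.3f}")
```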

Organizers

Guy Feigenblat, Piiano, Israel

Michal Shmueli-Scheuer, IBM Research AI, Haifa Research Lab, Israel

Arman Cohan, Allen Institute for AI, Seattle, USA

Tirthankar Ghosal, Charles University, Czech Republic



Contact: sdproc2022@googlegroups.com

Sign up for updates: https://groups.google.com/g/sdproc-updates

Follow us: https://twitter.com/SDProc
