Technical University of Darmstadt and head of the UKP Lab
Iryna Gurevych is a German computer scientist. She is Professor at the Department of Computer Science of the Technical University of Darmstadt and Director of the Ubiquitous Knowledge Processing (UKP) Lab. She has a strong background in information extraction, semantic text processing, machine learning and innovative applications of NLP to the social sciences and humanities. Iryna Gurevych has published over 300 publications in international conferences and journals and is a member of the programme and conference committees of more than 50 high-level conferences and workshops (ACL, EACL, NAACL, etc.). She is the holder of several awards, including the Lichtenberg-Professorship Career Award and the Emmy-Noether Career Award (both in 2007). In 2021 she received the first LOEWE-professorship of the LOEWE programme. She was selected as an ACL Fellow in 2020 for her outstanding work in natural language processing and machine learning, and she has been the Vice-President-Elect of the ACL since 2021.
While modern language models do a great job at finding documents, extracting information from them and generating natural-sounding language, progress in helping humans read, connect, and make sense of interrelated long texts has been far more limited. Funded by the European Research Council, the InterText project brings natural language processing (NLP) forward by developing a general framework for modeling and analyzing fine-grained relationships between texts – intertextual relationships. This crucial milestone for AI would allow tracing the origin and evolution of texts and ideas and enable a new generation of AI applications for text work and critical reading. Using the scientific domain as a prototypical model of collaborative knowledge construction anchored in text, this talk will provide an overview of UKP Lab’s past and ongoing research demonstrating our intertextual approach to NLP in the scientific domain. Specifically, we will highlight two lines of our work. The first is related to task design, practical applications and the intricacies of data collection in the peer-review domain. The second is about scientific text generation targeting (i) citation text and (ii) attitude and theme-guided rebuttals. To conclude, we will briefly describe our ongoing efforts towards fine-grained linking of multiple documents, temporal analysis of scientific datasets and research novelty modeling.
University of Copenhagen
Anna Rogers is an assistant professor in the Center for Social Data Science at the University of Copenhagen. She is currently also a visiting researcher with the RIKEN Center for Computational Science (Japan). Her main research area is Natural Language Processing, in particular model analysis and evaluation of natural language understanding systems.
Research practices in our field and others are being actively reshaped by new tools based on large language models. For every step of the traditional research pipeline, from experimentation to writing, commercial ‘solutions’ are already being actively marketed. This talk will discuss to what extent that marketing is realistic, how research practices seem to be changing, and how all of this interacts with considerations of publication ethics and security.
University of Illinois at Urbana-Champaign, USA
Heng Ji is a professor in the Computer Science Department of the University of Illinois at Urbana-Champaign. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Information Extraction and Knowledge Base Population. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. The awards she has received include the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, the PACLIC 2012 Best Paper Runner-up, the "Best of ICDM 2013" paper award, the "Best of SDM 2013" paper award, an ACL 2018 Best Demo paper nomination, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014, and Bosch Research Awards in 2014-2018. She has coordinated the NIST TAC Knowledge Base Population task since 2010. She is an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing. She has served as Program Committee Co-Chair of NAACL-HLT 2018, NLP-NABD 2018, NLPCC 2015, CSCKG 2016 and CCL 2019, and as senior area chair for many conferences. She has led several multi-institute research efforts, including the DARPA DEFT Tinker Bell team of seven universities and the DARPA KAIROS RESIN team of six universities. She was the task leader of the U.S. ARL projects on information fusion and knowledge network construction between 2009 and 2019. She was invited by the Secretary of the Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030.
There exist approximately 166 billion small molecules, with 970 million deemed druglike. Despite this vast pool, only 89 tyrosine kinase inhibitors are currently approved across global healthcare systems. This scarcity underscores the urgent need for innovative approaches, calling upon the NLP community to contribute significantly to medicine. However, the challenges are manifold. Existing large language models (LLMs) alone are insufficient due to their tendency to generate erroneous claims confidently (hallucinate). Moreover, traditional knowledge bases do not adequately address the issue; none of the 89 kinase inhibitors are documented in popular human-constructed databases. This gap persists because chemistry language diverges significantly from natural language, demanding specialized domain knowledge, multimodal information integration, and long-context understanding. Using drug discovery as a case study, I will present our approaches to tackling these challenges and turning an AI agent into a Medicinal Chemist. I will share preliminary results from animal testing conducted on drug variants proposed by AI algorithms. Furthermore, I advocate for a paradigm shift towards ‘slow science’, emphasizing the integration of feedback loops from molecule synthesis and animal testing. This new paradigm aims to evaluate AI techniques in scientific contexts, moving beyond the chasing of precision/recall scores on leaderboards that is prevalent in the current computer science community.
Northwestern University and Allen Institute for AI, USA
Doug is a Senior Research Scientist at AI2. He is currently on leave from Northwestern University, where he is an Associate Professor of Computer Science. His research focuses on information extraction, natural language processing, and machine learning. Outside of work, he enjoys spending time with family, exploring the outdoors, and watching movies.
Natural language processing (NLP) has made major strides in recent years, due to the increasing capabilities of large language models. However, using NLP to power real applications is still challenging: the best models are expensive to apply at scale and are still prone to errors. I’ll describe recent lessons we’ve learned on the Semantic Scholar team as we’ve built and deployed NLP-powered applications aimed at accelerating science, including PDF content extraction, automatically constructed topic pages for science, and complex question answering. While recent NLP breakthroughs do enable exciting new experiences, fully delivering on the potential of this technology will require solving multiple open research problems.