Knowledge Base Construction from Pre-Trained Language Models (KBC-LM)

Workshop @ 22nd International Semantic Web Conference (ISWC 2023)

Language models such as ChatGPT, BERT, and T5 have demonstrated remarkable results in numerous AI applications. Research has shown that these models implicitly capture vast amounts of factual knowledge in their parameters, resulting in strong performance in knowledge-intensive applications. The seminal paper "Language Models as Knowledge Bases?" sparked interest in the spectrum between language models and knowledge graphs, leading to a diverse range of research on the use of LMs for knowledge base construction. This research includes:

  1. utilizing pre-trained language models for knowledge base completion and construction tasks,
  2. performing information extraction tasks, like entity linking and relation extraction, and
  3. utilizing knowledge graphs to support language-model-based applications.

The 1st Workshop on Knowledge Base Construction from Pre-Trained Language Models (KBC-LM) aims to give space to the emerging academic community that investigates these topics, host extended discussions around the LM-KBC Semantic Web challenge, and enable an informal exchange among researchers and practitioners.

Important Dates

Papers due: July 31, 2023
Notification to authors: August 31, 2023
Camera-ready deadline: September 14, 2023
Workshop date: November 6 or 7, 2023


We invite contributions on the following topics:

  • Entity recognition and disambiguation with LMs
  • Relation extraction with LMs
  • Zero-shot and few-shot knowledge extraction from LMs
  • Consistency of LMs
  • Knowledge consolidation with LMs
  • Comparisons of LMs for KBC tasks
  • Methodological contributions on training and fine-tuning LMs for KBC tasks
  • Evaluations of downstream capabilities of LM-based KGs in tasks like QA
  • Designing robust prompts for large language model probing

Submissions can be novel research contributions or already published papers (the latter will be presentation-only and not part of the workshop proceedings). Novel research papers can be either full papers (ca. 8-12 pages) or short papers presenting smaller or preliminary results (typically 3-6 pages). We also accept demo and position papers. Also check out the LM-KBC challenge for further options to contribute to the workshop.

Submission and Review Process

Papers will be peer-reviewed by at least three researchers in a double-blind process. Selected papers will be published on CEUR (upon authors' agreement). Submissions need to be formatted according to the CEUR workshop proceedings template. Papers can be submitted directly via OpenReview.

Robert Bosch GmbH has signaled that it would likely sponsor a best paper award of 500 euros.

Tentative Schedule

The idea of this workshop is to provide a focused venue for informal exchange around LMs for KB construction. In particular, we plan to devote space to extended presentations by participants of the LM-KBC ISWC challenge (the 1-hour ISWC conference slot will only allow presentations by the winners). Furthermore, we plan to make this a hybrid event to lower the entry barrier for participants.

10:00-10:15 Welcome
10:15-11:00 Keynote 1
11:00-11:30 Coffee Break
11:30-12:30 Paper Presentations
12:30-13:30 Lunch Break
13:30-14:00 Keynote 2 (perhaps hybrid)
14:00-15:00 LM-KBC challenge presentations
15:00-15:20 Coffee Break
15:20-16:00 Paper Presentations
16:00-16:15 Closing Ceremony


Jan-Christoph Kalo is a postdoctoral researcher at the Learning and Reasoning Group at the Vrije Universiteit Amsterdam. His research focus is on machine learning methods, particularly language models, for knowledge base construction and knowledge management. He is co-organizing this year's LM-KBC challenge at ISWC and has been part of the PC of ISWC and ESWC.

Simon Razniewski is a research scientist in the NLP and Neuro-Symbolic AI group at the Bosch Center for AI. He has previously organized the Wikidata workshop @ ISWC and the LM-KBC challenge 2022. He has also held senior roles in program committees of major conferences such as IJCAI'21 and EACL'23 (area chair), and ISWC'20 and CIKM'20 (senior PC member).

Sneha Singhania is a PhD student at the Max Planck Institute for Informatics. Her research focuses on enabling machines to select and utilize high-quality information sources, with an awareness of unknowns, for knowledge base (KB) construction. She co-organized the LM-KBC'22 challenge and has held PC roles at ACL, CIKM, ISWC, and ESWC.

Jeff Z. Pan is a chair of the Knowledge Graph Group at the Alan Turing Institute and is a member of the School of Informatics at The University of Edinburgh. He is an Editor of the Journal of Web Semantics (JoWS) and a Programme Chair of the 19th International Semantic Web Conference (ISWC 2020), the premier international forum for the Semantic Web / Knowledge Graph / Linked Data communities.

Selected References

  • Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, and Sebastian Riedel. "Language Models as Knowledge Bases?", EMNLP, 2019
  • Sneha Singhania, Tuan-Phong Nguyen, Simon Razniewski. "LM-KBC: Knowledge Base Construction from Pre-trained Language Models", CEUR-WS, 2022
  • Blerta Veseli, Sneha Singhania, Simon Razniewski, Gerhard Weikum. "Evaluating Language Models for Knowledge Base Completion", ESWC, 2023
  • Dimitrios Alivanistos, Selene Baez Santamaria, Michael Cochez, Jan-Christoph Kalo, Emile van Krieken, and Thiviyan Thanapalasingam. "Prompting as Probing: Using Language Models for Knowledge Base Construction", LM-KBC, 2022
  • Simon Razniewski, Andrew Yates, Nora Kassner, Gerhard Weikum. "Language Models As or For Knowledge Bases", DL4KG@ISWC, 2021
  • Jan-Christoph Kalo and Leandra Fichtel. "KAMEL: Knowledge Analysis with Multitoken Entities in Language Models", AKBC, 2022
  • Roi Cohen, Mor Geva, Jonathan Berant, and Amir Globerson. "Crawling the Internal Knowledge-Base of Language Models", Findings of EACL, 2023
  • Adam Roberts, Colin Raffel, and Noam Shazeer. "How Much Knowledge Can You Pack Into the Parameters of a Language Model?", EMNLP, 2020