November 11–12th, 2024 — Baltimore, MD

Knowledge Base Construction from Pre-trained Language Models

Challenge @ 23rd International Semantic Web Conference (ISWC 2024)

News

22.04.24: Full 2024 dataset released.
19.03.24: First information on 2024 dataset available.
13.03.24: Website up!

Introduction

LM-KBC Challenge @ ISWC 2024

Task Description

Pretrained language models (LMs) like ChatGPT have advanced a range of semantic tasks and have also shown promise for knowledge extraction from the models themselves. Although several works have explored this ability in a setting called probing or prompting, the viability of knowledge base construction from LMs remains underexplored. In the 3rd edition of this challenge, we invite participants to build actual disambiguated knowledge bases from LMs, for given subjects and relations. In contrast to existing probing benchmarks like LAMA (Petroni et al., 2019), we make no simplifying assumptions on relation cardinalities, i.e., a subject-entity can stand in relation with zero, one, or many object-entities. Furthermore, submissions need to go beyond ranking predicted surface strings and must materialize disambiguated entities in the output, which will be evaluated using the established KB metrics of precision and recall.

Formally, given the input subject-entity (s) and relation (r), the task is to predict all the correct object-entities ({o1, o2, ..., ok}) using LM probing.
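As an illustration of how set-valued predictions can be scored with precision and recall, here is a minimal sketch. It is not the official evaluation script: the function name, the use of opaque entity ID strings, and the convention that an empty prediction has precision 1 and an empty gold set has recall 1 are all assumptions made for this example.

```python
from typing import Iterable, Tuple


def precision_recall_f1(
    predicted: Iterable[str], gold: Iterable[str]
) -> Tuple[float, float, float]:
    """Score one (subject, relation) pair by comparing sets of object-entity IDs."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)

    # Assumed convention (not taken from the challenge text):
    # an empty prediction counts as fully precise, and an empty
    # gold set counts as fully recalled.
    precision = true_positives / len(pred_set) if pred_set else 1.0
    recall = true_positives / len(gold_set) if gold_set else 1.0

    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical entity IDs, used only to exercise the function.
p, r, f = precision_recall_f1(["E1", "E2"], ["E1", "E3", "E4"])
```

Under this convention, a system that correctly predicts no objects for a subject with a null value receives full credit, while any spurious prediction for such a subject drives precision to zero.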

Special Features

This year, we impose a 10B parameter limit for participating systems, which ensures that no team can outperform others simply through monetary investment.

We will look at a smaller set of 5 relations (last year: 20), with very distinctive features:

  1. countryLandBordersCountry: Null values possible (e.g., Iceland)
  2. personHasCityOfDeath: Null values possible
  3. seriesHasNumberOfEpisodes: Object is numeric
  4. awardWonBy: Many objects per subject (e.g., 224 Physics Nobel prize winners)
  5. companyTradesAtStockExchange: Null values possible
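The three kinds of answers listed above (no object, a single numeric object, many objects) can all be held in one record shape. The sketch below is only one possible representation: the `Prediction` class, its field names, and the JSON-lines serialization are hypothetical and are not taken from the challenge's data format, and every value other than the relation names and the Iceland example is a made-up placeholder.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Prediction:
    """One predicted answer set for a (subject, relation) pair (hypothetical schema)."""
    subject: str
    relation: str
    # An empty list stands for a null value, i.e. no object-entity at all.
    objects: List[str] = field(default_factory=list)

    def to_json_line(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)


predictions = [
    # Null value: Iceland has no land borders.
    Prediction("Iceland", "countryLandBordersCountry"),
    # Numeric object, kept as a string; subject and value are placeholders.
    Prediction("Example Series", "seriesHasNumberOfEpisodes", ["12"]),
    # Many objects per subject; the IDs are placeholders.
    Prediction("Example Award", "awardWonBy", ["E1", "E2", "E3"]),
]

for prediction in predictions:
    print(prediction.to_json_line())
```

Keeping `objects` as a list in every case means downstream code never has to special-case null or single-valued relations.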

Calls

Call for Participants

Important Dates

Activity                                      Dates
Dataset (train and dev) release               30 March 2024
Release of test dataset                       15 July 2024
Submission of test output and systems         19 July 2024
Submission of system description              2 August 2024
Winner announcement                           16 August 2024
Presentations@ISWC (hybrid)                   November 11 or 12, 2024

Submission Details

Participants are required to submit:

  1. A system implementing the LM probing approach, uploaded to a public GitHub repo
  2. The output for the test dataset subject entities, in the same GitHub repo
  3. A system description in PDF format (5-12 pages, CEUR workshop style), mentioning the GitHub repo.

Organization

Challenge Organizers

Jan-Christoph Kalo

VU Amsterdam

Tuan-Phong Nguyen

MPI for Informatics

Simon Razniewski

Bosch Center for AI

Bohui Zhang

King's College London

Contact

For general questions or discussion, please use the Google Group.

Past Editions

Our challenge has been running since 2022. For more information on past editions, please visit the corresponding websites: