Task Description

Pretrained language models (LMs) such as ChatGPT have advanced a range of semantic tasks and have also shown promise for extracting knowledge from the models themselves. Although several works have explored this ability in a setting called probing or prompting, the viability of knowledge base construction from LMs remains underexplored. In the 2nd edition of this challenge, we invite participants to build actual disambiguated knowledge bases from LMs, for given subjects and relations. In crucial difference to existing probing benchmarks like LAMA (Petroni et al., 2019), we make no simplifying assumptions on relation cardinalities: a subject-entity can stand in relation with zero, one, or many object-entities. Furthermore, submissions need to go beyond ranking predicted surface strings and must materialize disambiguated entities in the output, which will be evaluated using the established KB metrics of precision and recall.
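To make the set-based evaluation concrete, here is a minimal sketch of per-pair precision and recall over sets of disambiguated object-entities (e.g., Wikidata QIDs). The function name, the QIDs, and the convention for empty sets are illustrative assumptions, not the official challenge scorer.

```python
# Illustrative sketch: set-based precision/recall for one (subject, relation)
# pair, where both the prediction and the gold answer are sets of
# disambiguated entity identifiers. This is NOT the official scorer.

def precision_recall(predicted: set, gold: set) -> tuple:
    """Precision and recall for one (subject, relation) pair.

    Assumed convention: if the gold set is empty and the model also
    predicts nothing, the prediction is fully correct (P = R = 1.0).
    """
    if not predicted and not gold:
        return 1.0, 1.0
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 1.0
    return precision, recall

# Example: two of three gold objects found, nothing spurious predicted
p, r = precision_recall({"Q64", "Q1055"}, {"Q64", "Q1055", "Q1718"})
```

Because relations may have zero objects, the empty-set case matters: a model that hallucinates objects for such subjects loses precision, while one that correctly abstains scores perfectly on that pair.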
Formally, given an input subject-entity (s) and a relation (r), the task is to predict all the correct object-entities using LM probing.
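A typical probing pipeline turns each (subject, relation) pair into a natural-language prompt and parses the LM's answer into a list of object surface strings. The sketch below illustrates this; the relation names, templates, and parsing convention are assumptions for illustration only, and the returned strings would still need to be disambiguated into entities afterwards.

```python
# Illustrative probing sketch: build a prompt for a (subject, relation) pair
# and parse a model's free-text answer into object surface strings.
# The relation names and templates below are hypothetical examples.

RELATION_TEMPLATES = {
    "CountryBordersCountry": (
        "Which countries share a land border with {subject}? "
        "Answer with a comma-separated list, or 'None'."
    ),
    "PersonHasEmployer": (
        "Which organizations employ {subject}? "
        "Answer with a comma-separated list, or 'None'."
    ),
}

def build_prompt(subject: str, relation: str) -> str:
    """Fill the relation's template with the subject-entity label."""
    return RELATION_TEMPLATES[relation].format(subject=subject)

def parse_objects(answer: str) -> list:
    """Split a free-text answer into object surface strings.

    'None' (or an empty answer) maps to the zero-cardinality case,
    i.e., the subject stands in the relation with no object-entity.
    """
    answer = answer.strip()
    if not answer or answer.lower() == "none":
        return []
    return [o.strip() for o in answer.split(",") if o.strip()]
```

The surface strings returned by `parse_objects` are only half the task: each must then be linked to a unique entity identifier, since the evaluation compares disambiguated entities rather than strings.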
The challenge comes with three tracks:
- Track 1: a small-model track with low computational requirements
- Track 2: an open track, where participants can use any LM of their choice
- Track 3: a discovery track, where participants need to discover knowledge for emerging entities
Organizers

- Sneha Singhania, Max Planck Institute for Informatics, Germany
- Jan-Christoph Kalo, VU Amsterdam
- Simon Razniewski, Bosch Center for AI
- Jeff Z. Pan, University of Edinburgh