LM-KBC @ ISWC 2022

🔔 News

16-11-2022: Proceedings published at https://ceur-ws.org/Vol-3274
16-08-2022: Winning systems in each track announced.
04-07-2022: Final deadline extension until July 26, 23:59:59 AoE time.
11-07-2022: Test subject entities have been released here. Submit your predictions on CodaLab to get a confirmed score now (optional, final leaderboard will be split by track, multiple submissions possible).
04-07-2022: Submission deadline extended to July 21 (to accommodate the changes below).
02-07-2022: Data format and evaluation scripts have been updated; please pull again (and read our announcement here).

Task Description

Pre-trained language models (LMs) have advanced a range of semantic tasks and have also shown promise for knowledge extraction from the models themselves. Although several works have explored this ability in a setting called LM probing, using prompting or prompt-based learning (Liu et al., 2021), the viability of knowledge base construction from LMs has not yet been explored. In this challenge, we invite participants to build actual knowledge bases from LMs for given subjects and relations. In contrast to existing probing benchmarks like LAMA (Petroni et al., 2019), we make no simplifying assumptions about relation cardinalities, i.e., a subject-entity can stand in relation with zero, one, or many object-entities. Furthermore, submissions need to go beyond just ranking the predictions and make concrete decisions about which outputs to materialize. The outputs are evaluated using the established KB metric of F1-score.

Formally, given the input subject-entity (s) and relation (r), the task is to predict all the correct object-entities ({o₁, o₂, ..., o_k}) using LM probing.

The challenge comes with two tracks:

Track 1: a BERT (BERT-base or BERT-large) track with low computational requirements.
Track 2: an open track, where participants can use any LM of their choice (e.g., RoBERTa, Transformer-XL, GPT-2, BART, etc.).

🏆 Winners

Track	System	Authors	Avg. Precision	Avg. Recall	Avg. F1-score
1	Task-specific Pre-training and Prompt Decomposition for Knowledge Graph Population with Language Models	Tianyi Li, Wenyu Huang, Nikos Papasarantopoulos, Pavlos Vougiouklis, Jeff Z. Pan	0.766	0.566	0.550
2	Prompting as Probing: Using Language Models for Knowledge Base Construction	Dimitrios Alivanistos, Selene Baez Santamaria, Michael Cochez, Jan-Christoph Kalo, Thiviyan Thanapalasingam, Emile van Krieken	0.798	0.690	0.676

Dataset

We release a dataset (train and development) for a diverse set of 12 relations, each covering a different set of subject-entities, along with a complete list of ground-truth object-entities per subject-relation pair. The total number of object-entities varies for a given subject-relation pair. The subject-relation-object triples in the train dataset can be used for training or probing the language models in any form, while the development dataset can be used for hyperparameter tuning. Further details on the relations are given below:

Relation	Description	Examples
`CountryBordersWithCountry`	country (`s`) shares a land border with another country (`o`)	(Canada, CountryBordersWithCountry, [[USA, United States of America]]); (Norway, CountryBordersWithCountry, [Finland, Sweden, Russia]); (Mauritius, CountryBordersWithCountry, [])
`CountryOfficialLanguage`	country (`s`) has an official language (`o`)	(Belarus, CountryOfficialLanguage, [Belarusian, Russian]); (Seychelles, CountryOfficialLanguage, [French, English, [Seychellois Creole, Creole]]); (Bosnia and Herzegovina, CountryOfficialLanguage, [Bosnian, Serbian, Croatian])
`RiverBasinsCountry`	river (`s`) has a basin in a country (`o`)	(Elbe, RiverBasinsCountry, [Germany, Poland, Austria, [Czech Republic, Czechia]]); (Drin, RiverBasinsCountry, [Albania]); (Chari, RiverBasinsCountry, [[Central African Republic, Africa], Cameroon, Chad])
`StateSharesBorderState`	state (`s`) of a country shares a land border with another state (`o`)	(Oregon, StateSharesBorderState, [California, Idaho, Washington, Nevada]); (Florida, StateSharesBorderState, [Georgia]); (Mexico City, StateSharesBorderState, [[State of Mexico, Mexico], Morelos])
`ChemicalCompoundElement`	chemical compound (`s`) consists of an element (`o`)	(Water, ChemicalCompoundElement, [Hydrogen, Oxygen]); (Borax, ChemicalCompoundElement, [Boron, Oxygen, Sodium]); (Calomel, ChemicalCompoundElement, [Mercury, Chlorine])
`PersonInstrument`	person (`s`) plays an instrument (`o`)	(Chester Bennington, PersonInstrument, []); (Ringo Starr, PersonInstrument, [Guitar, Drum, [Percussion Instrument, Percussion]]); (Leeteuk, PersonInstrument, [Piano])
`PersonLanguage`	person (`s`) speaks a language (`o`)	(Bruno Mars, PersonLanguage, [Spanish, English]); (Aamir Khan, PersonLanguage, [Hindi, Urdu, English]); (Alicia Keys, PersonLanguage, [English])
`PersonEmployer`	person (`s`) is or was employed by a company (`o`)	(Susan Wojcicki, PersonEmployer, [Google]); (Steve Wozniak, PersonEmployer, [[Apple Inc, Apple], [Hewlett-Packard, HP], University of Technology Sydney, [Atari, Atari Inc]]); (Jacqueline Novogratz, PersonEmployer, [UNICEF, World Bank, Chase Bank])
`PersonProfession`	person (`s`) held a profession (`o`)	(Nicolas Sarkozy, PersonProfession, [Lawyer, Politician, Statesperson]); (Shakira, PersonProfession, [[Singer-Songwriter, Singer, Songwriter], Guitarist]); (Eminem, PersonProfession, [Rapper])
`PersonPlaceOfDeath`	person (`s`) died at a location (`o`)	(Elvis Presley, PersonPlaceOfDeath, [Graceland]); (Kofi Annan, PersonPlaceOfDeath, [Bern]); (Angela Merkel, PersonPlaceOfDeath, [])
`PersonCauseOfDeath`	person (`s`) died due to a cause (`o`)	(John Lewis, PersonCauseOfDeath, [[Pancreatic Cancer, Cancer]]); (Pierre Nkurunziza, PersonCauseOfDeath, [[Covid-19, Covid]]); (Neil deGrasse Tyson, PersonCauseOfDeath, [])
`CompanyParentOrganization`	company (`s`) has another company (`o`) as its parent organization	(Apple Inc, CompanyParentOrganization, []); (Abarth, CompanyParentOrganization, [[Stellantis Italy, Stellantis]]); (Hitachi, CompanyParentOrganization, [])

Each row in the dataset files constitutes one triple, consisting of (1) the subject-entity, (2) the relation, and (3) the list of all possible object-entities. For (3), we sometimes provide multiple aliases for an object-entity, where outputting any one of them is sufficient for that entity. In particular, to facilitate the use of LMs like BERT (which are constrained to single-token predictions), we provide a valid single-token form for multi-token object-entities wherever such a form is meaningful. Please read the Data format section for more details. When a subject has zero valid objects, the ground truth is an empty list, e.g., (Apple Inc, CompanyParentOrganization, []).
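The row structure described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the released file format: the `Triple` class and the `alias_groups` and `matches` helpers are hypothetical names, and the case-insensitive comparison is an assumption of this sketch. The example rows are taken from the relation table.

```python
from dataclasses import dataclass
from typing import List, Union

# An object-entity is either a single name or a list of interchangeable aliases.
ObjectEntity = Union[str, List[str]]


@dataclass
class Triple:
    """One dataset row (hypothetical in-memory form, not the official file format)."""
    subject: str
    relation: str
    objects: List[ObjectEntity]


def alias_groups(triple: Triple) -> List[List[str]]:
    """Normalize every object-entity to a list of aliases."""
    return [[o] if isinstance(o, str) else list(o) for o in triple.objects]


def matches(prediction: str, aliases: List[str]) -> bool:
    """Return True if the prediction equals any alias of the entity.

    Case-insensitive comparison is an assumption made for this sketch.
    """
    return prediction.strip().lower() in {a.lower() for a in aliases}


rows = [
    Triple("Canada", "CountryBordersWithCountry", [["USA", "United States of America"]]),
    Triple("Water", "ChemicalCompoundElement", ["Hydrogen", "Oxygen"]),
    Triple("Apple Inc", "CompanyParentOrganization", []),  # zero valid objects
]

for row in rows:
    print(row.subject, row.relation, alias_groups(row))
```

Normalizing every object-entity to an alias list means that single names, alias groups, and empty ground truths can all be handled by the same matching code.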

Dataset Characteristics

For each of the 12 relations, the number of unique subject-entities in the train, dev, and test sets is 100, 50, and 50, respectively. The minimum and maximum number of object-entities per subject-entity for each relation is given below as [min, max]. If the minimum value is 0, then a subject-entity can have zero valid object-entities for that relation.

Relation	Train [min, max]	Dev [min, max]	Test [min, max]
`CountryBordersWithCountry`	[0, 17]	[0, 14]	[0, 11]
`CountryOfficialLanguage`	[1, 4]	[1, 15]	[1, 11]
`StateSharesBorderState`	[1, 14]	[1, 15]	[1, 14]
`RiverBasinsCountry`	[1, 6]	[1, 10]	[1, 9]
`ChemicalCompoundElement`	[2, 6]	[2, 6]	[2, 6]
`PersonLanguage`	[1, 6]	[1, 5]	[1, 7]
`PersonProfession`	[1, 23]	[1, 19]	[1, 20]
`PersonInstrument`	[0, 7]	[0, 14]	[0, 7]
`PersonEmployer`	[1, 8]	[1, 8]	[1, 8]
`PersonPlaceOfDeath`	[0, 1]	[0, 1]	[0, 1]
`PersonCauseOfDeath`	[0, 3]	[0, 2]	[0, 2]
`CompanyParentOrganization`	[0, 5]	[0, 1]	[0, 3]

Download dataset

Task Evaluation

We use a standard KBC evaluation metric, the macro-averaged F1-score (which combines precision and recall), to compare the predicted object-entities with the true object-entities on the hidden test dataset. We release a baseline implementation and an evaluation script. The baseline model probes the BERT language model using a sample prompt like "China shares border with [MASK]", and selects as outputs the object-entities predicted in the [MASK] position with a likelihood of at least 0.5. This baseline achieves a 31.08% F1-score (averaged across all subject-entities and relations) on the hidden test dataset. Participants can use the evaluation script to compute the F1-score and assess the performance of their systems. For more details on LM probing and the baseline method, please check the released notebook.
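The thresholding step and the macro-averaged metric described above can be sketched roughly as follows. This is not the released baseline or evaluation script: the function names are hypothetical, the candidate format (dicts with `token_str` and `score` keys, similar to what a fill-mask style prober returns) is an assumption, the scores in the example are made up for illustration, and the conventions for empty lists (precision 1 when nothing is predicted, recall 1 when there is nothing to find) are assumptions of this sketch.

```python
from typing import Dict, List, Sequence, Tuple


def select_objects(candidates: Sequence[Dict], threshold: float = 0.5) -> List[str]:
    """Keep the candidates whose likelihood is at least `threshold`.

    `candidates` is assumed to be a list of dicts with "token_str" and "score" keys.
    """
    return [c["token_str"].strip() for c in candidates if c["score"] >= threshold]


def precision_recall_f1(predicted: Sequence[str],
                        gold: Sequence[Sequence[str]]) -> Tuple[float, float, float]:
    """Score one subject-relation pair; `gold` holds one alias list per entity."""
    preds = {p.lower() for p in predicted}
    gold_sets = [{a.lower() for a in aliases} for aliases in gold]

    # A gold entity is found if any of its aliases was predicted.
    found = sum(1 for aliases in gold_sets if aliases & preds)
    # A prediction is correct if it matches an alias of some gold entity.
    correct = sum(1 for p in preds if any(p in aliases for aliases in gold_sets))

    # Assumed conventions: nothing predicted -> precision 1; nothing to find -> recall 1.
    precision = correct / len(preds) if preds else 1.0
    recall = found / len(gold_sets) if gold_sets else 1.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_average(scores: Sequence[Tuple[float, float, float]]) -> Tuple[float, ...]:
    """Average precision, recall, and F1 over all subject-relation pairs."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Made-up candidate scores for a prompt about Norway (illustration only).
norway_preds = select_objects([
    {"token_str": "sweden", "score": 0.70},
    {"token_str": "finland", "score": 0.50},
    {"token_str": "denmark", "score": 0.60},
    {"token_str": "russia", "score": 0.20},
])

scores = [
    precision_recall_f1(norway_preds, [["Finland"], ["Sweden"], ["Russia"]]),
    precision_recall_f1([], []),  # Mauritius: empty prediction, empty ground truth
]
print(macro_average(scores))
```

Because the average is taken over per-pair scores, the macro-averaged F1 is not the harmonic mean of the averaged precision and recall, and it can fall below both.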

Submission Details

Participants are required to submit:

A system implementing the LM probing approach.
The output for the test dataset subject-entities.
A system description in PDF format (5-12 pages, LNCS style).

All materials must be uploaded to EasyChair. Additionally, there will be an optional CodaLab live leaderboard that participants can submit to. The test dataset is initially hidden to preserve the integrity of the results, and will be released 10 days before the final deadline. The output files for the test subject-entities must be formatted as described here, and submitted along with the system and its description. The top-performing systems will get an opportunity to present their ideas and results during the ISWC 2022 conference, and the challenge proceedings will be submitted to the CEUR publication system.

Organizers

Sneha Singhania

Max Planck Institute for Informatics
Germany

Tuan-Phong Nguyen

Max Planck Institute for Informatics
Germany

Simon Razniewski

Max Planck Institute for Informatics
Germany

Presentation Schedule

Time	Team
10:30-10:40	Introduction (Singhania et al.)
10:40-10:50	Track 1 Winner (Li et al.)
10:50-11:00	Track 2 Winner (Alivanistos et al.)
11:00-11:05	Team 3 (Ning and Celebi)
11:05-11:10	Team 4 (Fang et al.)
11:10-11:15	Team 5 (Dalal et al.)
11:15-11:30	Summary and discussion

Important Dates

Activity	Dates
Dataset (train and dev) release	17 May 2022
Test subject release	11 July 2022
System + test dataset predictions + system description submission	26 July 2022 (originally 14 July)
Winner announcement	16 August 2022
ISWC invitations	16 August 2022
ISWC presentations	23-27 October 2022

Knowledge Base Construction from Pre-trained Language Models (LM-KBC)
