Welcome to GPTKB
A large general-domain knowledge base constructed entirely from a large language model
105 million triples
2.9 million entities
100x lower cost

Overview


This interface presents GPTKB, a large general-domain knowledge base (KB) constructed entirely from a large language model (LLM). It demonstrates the feasibility of large-scale KB construction from LLMs, while highlighting specific challenges around entity recognition, entity and property canonicalization, and taxonomy construction.

Based on GPT-4o-mini, GPTKB contains 105 million triples for more than 2.9 million entities, at a cost 100x lower than that of previous knowledge base construction (KBC) projects.

GPTKB is a landmark for two fields:
  • For NLP, it provides, for the first time, constructive insights into the knowledge (or beliefs) of LLMs.
  • For the Semantic Web, it shows novel ways forward for the long-standing challenge of general-domain KB construction.
Using the search field, you can find entities by name and browse their triples. Alternatively, you can write SPARQL queries (see the sketch below) or download the whole KB as a TTL file. Paper and code are also available.
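
For programmatic access, the sketch below shows how one might query a GPTKB SPARQL endpoint from Python using the SPARQLWrapper library. The endpoint URL is an assumption (substitute the address published on this site), and the query deliberately uses no GPTKB-specific vocabulary, so it only illustrates the general access pattern rather than the service's exact schema.

  # Minimal sketch: querying a SPARQL endpoint with SPARQLWrapper.
  # ENDPOINT is a placeholder, not a confirmed GPTKB address.
  from SPARQLWrapper import SPARQLWrapper, JSON

  ENDPOINT = "https://example.org/gptkb/sparql"  # hypothetical endpoint URL

  sparql = SPARQLWrapper(ENDPOINT)
  sparql.setReturnFormat(JSON)
  sparql.setQuery("""
      SELECT ?s ?p ?o
      WHERE { ?s ?p ?o }
      LIMIT 10
  """)

  results = sparql.query().convert()
  for b in results["results"]["bindings"]:
      print(b["s"]["value"], b["p"]["value"], b["o"]["value"])

The downloaded TTL dump can likewise be explored offline, for example by parsing it into an rdflib Graph with format="turtle".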

Main paper


If you use this data, please cite the following paper:

Yujia Hu, Shrestha Ghosh, Tuan-Phong Nguyen, and Simon Razniewski
GPTKB: Building Very Large Knowledge Bases from Language Models
arXiv, 2024