
GPT-2

OpenAI · Open Source · Verified

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. It was pre-trained on WebText, a dataset of 8 million web pages, and was a direct scale-up of GPT-1 with a ten-fold increase in both parameter count and training dataset size. OpenAI initially withheld the larger checkpoints over concerns about potential misuse, releasing the model in stages before publishing the full 1.5B-parameter version in November 2019.

Released: 2019-02-14 · 1.5B parameters · Decoder-only Transformer · MIT license

Specifications

Parameters: 1.5B
Architecture: Decoder-only Transformer
License: MIT
Context Window: 1,024 tokens (see the usage sketch after this list)
Training Data Cutoff: December 2017
Type: text
Modalities: text
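Because the weights are open under the MIT license, the model can be run locally. The sketch below loads a GPT-2 checkpoint and generates a continuation while keeping prompt plus output inside the 1,024-token context window; it assumes the Hugging Face transformers and PyTorch packages and their "gpt2"/"gpt2-xl" model IDs, which are not part of the original OpenAI release, and the sampling settings are illustrative.

```python
# Minimal sketch: load a GPT-2 checkpoint and generate text.
# Assumes the Hugging Face `transformers` and `torch` packages are installed;
# the IDs "gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl" map to the four variants.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "gpt2"       # smallest variant; use "gpt2-xl" for the 1.5B model
CONTEXT_WINDOW = 1024   # GPT-2's fixed context length, in tokens
MAX_NEW_TOKENS = 100    # length of the generated continuation

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

prompt = "In a shocking finding, scientists discovered"
# Truncate the prompt so prompt + continuation fit in the context window.
inputs = tokenizer(prompt, return_tensors="pt",
                   truncation=True, max_length=CONTEXT_WINDOW - MAX_NEW_TOKENS)

outputs = model.generate(
    **inputs,
    max_new_tokens=MAX_NEW_TOKENS,
    do_sample=True,                      # sample rather than greedy decode
    top_k=40,                            # top-k sampling, as in OpenAI's released samples
    pad_token_id=tokenizer.eos_token_id, # GPT-2 has no pad token; reuse EOS
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```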

Advanced Specifications

Model Family: GPT
Finetuned From: None (trained from scratch)
API Access: Not Available
Chat Interface: Not Available
Multilingual Support: No
Variants: 117M parameters (12 layers), 345M parameters (24 layers), 762M parameters (36 layers), 1.5B parameters (48 layers)
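For reference, the four released variants differ in depth and width. The sketch below collects the per-variant hyperparameters: parameter and layer counts from the card above, hidden sizes as reported in the GPT-2 paper, and Hugging Face model IDs and attention-head counts taken from the published configs (the IDs and head counts are assumptions of this sketch, not part of the card).

```python
# Per-variant hyperparameters for the four GPT-2 sizes.
GPT2_VARIANTS = {
    "gpt2":        {"params": "117M", "n_layer": 12, "n_embd": 768,  "n_head": 12},
    "gpt2-medium": {"params": "345M", "n_layer": 24, "n_embd": 1024, "n_head": 16},
    "gpt2-large":  {"params": "762M", "n_layer": 36, "n_embd": 1280, "n_head": 20},
    "gpt2-xl":     {"params": "1.5B", "n_layer": 48, "n_embd": 1600, "n_head": 25},
}

for name, cfg in GPT2_VARIANTS.items():
    print(f"{name}: {cfg['params']} parameters, {cfg['n_layer']} layers, "
          f"d_model={cfg['n_embd']}, {cfg['n_head']} attention heads")
```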

Capabilities & Limitations

Capabilities: text generation, translation, summarization, question answering (all demonstrated zero-shot via prompting; see the sketch after this list)
Known Limitations: becomes repetitive or nonsensical with long passages; lacks coherence in longer texts; resource-intensive deployment
Notable Use Cases: AI Dungeon text adventures; r/SubSimulatorGPT2 subreddit; code autocompletion; counselor training simulations
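The capabilities above were demonstrated zero-shot in the GPT-2 paper by casting each task as plain text continuation rather than by fine-tuning; for summarization, the paper appends "TL;DR:" after the article and samples 100 tokens with top-k = 2. A minimal sketch of that prompting pattern, again assuming the Hugging Face transformers API rather than the original TensorFlow release:

```python
# Zero-shot summarization by prompt formatting: append "TL;DR:" to the article
# and let the model continue, as described in the GPT-2 paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "gpt2-xl"    # the paper uses the full 1.5B model; "gpt2" is a lighter download

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

article = (
    "A long news article would go here. GPT-2 was released by OpenAI in 2019 "
    "and generates text by predicting the next token in a sequence."
)
prompt = article + "\nTL;DR:"   # the task is induced purely by the prompt format

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=900)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,                  # the paper generates 100 tokens
    do_sample=True,
    top_k=2,                             # top-k sampling with k = 2, per the paper
    pad_token_id=tokenizer.eos_token_id,
)
# Keep only the newly generated tokens, i.e. the model's "summary".
summary = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True)
print(summary.strip())
```

Translation and question answering were induced the same way in the paper, by formatting example sentence pairs or a question directly into the prompt.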

Related Models