
GPT-1

OpenAI · Open Source · Verified

Generative Pre-trained Transformer 1 (GPT-1) was the first of OpenAI's large language models: a 12-layer decoder-only Transformer trained in two stages, unsupervised generative pre-training on the BookCorpus dataset followed by supervised fine-tuning on each target task. This approach improved the state of the art on 9 of the 12 NLP tasks studied in the original paper, including natural language inference, question answering, and semantic similarity.
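The 117M figure can be sanity-checked from the architecture. A rough back-of-envelope count, assuming the hyperparameters reported in the original paper (hidden size 768, feed-forward size 3072, a 40,478-token BPE vocabulary, and 512 learned positional embeddings; none of these appear on this card), lands close to the number quoted below:

```python
# Rough parameter count for GPT-1, using hyperparameters from the original
# paper (assumptions not listed on this card): d_model=768, d_ff=3072,
# 12 layers, 40,478-token BPE vocabulary, 512 learned positional embeddings.
d_model, d_ff, n_layers = 768, 3072, 12
vocab_size, n_positions = 40_478, 512

token_embeddings = vocab_size * d_model          # tied with the output softmax
position_embeddings = n_positions * d_model

# Per transformer block: Q/K/V/output projections plus the two FFN matrices
# (with biases) and two LayerNorms.
attention = 4 * (d_model * d_model + d_model)
feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
layer_norms = 2 * (2 * d_model)
per_block = attention + feed_forward + layer_norms

total = token_embeddings + position_embeddings + n_layers * per_block
print(f"{total / 1e6:.1f}M parameters")          # ~116.5M, i.e. the ~117M quoted
```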

Release Date: 2018-06-01

Specifications

Parameters: 117M
Architecture: 12-layer decoder-only Transformer with 12 masked self-attention heads
License: MIT
Context Window: 512 tokens (see the truncation sketch below)
Training Data Cutoff: 2018
Type: text
Modalities: text
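Because of the 512-token context window, longer inputs have to be truncated or chunked before they reach the model. A minimal sketch, assuming the "openai-gpt" checkpoint and tokenizer from the Hugging Face transformers library (a distribution channel this card does not name):

```python
# Minimal sketch: enforcing GPT-1's 512-token context window with the
# Hugging Face "openai-gpt" tokenizer (an assumption -- this card does not
# say where the open-source weights are hosted).
from transformers import OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")

long_text = "some document " * 1000              # far longer than 512 BPE tokens
encoded = tokenizer(
    long_text,
    truncation=True,
    max_length=512,                              # the context window listed above
    return_tensors="pt",
)
print(encoded["input_ids"].shape)                # torch.Size([1, 512])
```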

Benchmark Scores

Story Cloze Test: 86.5
RACE: 59
MultiNLI: 81.8
GLUE: 72.8

Advanced Specifications

Model Family: GPT
API Access: Not Available
Chat Interface: Not Available
Multilingual Support: No

Capabilities & Limitations

Capabilities: natural language inference, question answering, semantic similarity, text classification
Known Limitations: limited context window (512 tokens)
Notable Use Cases: text classification, question answering, semantic similarity assessment (illustrated in the sketch below)
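As an illustration of the semantic similarity use case: the original paper handled such tasks by supervised fine-tuning with task-specific input formats, but a quick, unsupervised approximation is to mean-pool the pre-trained hidden states and compare sentences by cosine similarity. The "openai-gpt" checkpoint name and the pooling scheme below are assumptions for a self-contained demo, not the method from the paper:

```python
# Illustration only: scoring semantic similarity with GPT-1's pre-trained
# hidden states. The paper fine-tuned on paraphrase data with delimiter-based
# input formats; this mean-pool + cosine-similarity scheme is a stand-in.
import torch
from transformers import OpenAIGPTModel, OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
model = OpenAIGPTModel.from_pretrained("openai-gpt").eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final-layer hidden states for one sentence."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

a = embed("A man is playing a guitar.")
b = embed("Someone is strumming a guitar.")
print(torch.cosine_similarity(a, b, dim=0).item())   # higher = more similar
```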

Related Models