
GPT-1

OpenAI · Open Source · Verified

Generative Pre-trained Transformer 1 (GPT-1) was the first of OpenAI's large language models, demonstrating that a 12-layer decoder-only Transformer could be pre-trained effectively on unlabeled text. It introduced a two-stage training process, generative pre-training on the BookCorpus dataset followed by supervised fine-tuning on each target task, and improved on the prior state of the art in 9 of the 12 NLP tasks studied.

Released 2018-06-01 · 117M parameters · 12-layer decoder-only Transformer with 12 masked self-attention heads · MIT license
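
Because the weights are MIT-licensed and no hosted API or chat interface exists, the usual way to try GPT-1 today is through the `openai-gpt` checkpoint published for the Hugging Face `transformers` library. The minimal sketch below assumes that checkpoint name and a PyTorch install, neither of which is specified on this page.

```python
# Minimal sketch: load the ported GPT-1 weights via Hugging Face transformers.
# The "openai-gpt" checkpoint name is an assumption (the commonly published
# community port), not something stated on this page.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-gpt")
model = AutoModelForCausalLM.from_pretrained("openai-gpt")

# Greedy continuation of a short prompt; GPT-1 was pre-trained for language
# modelling on BookCorpus, so completions tend to read like fiction prose.
inputs = tokenizer("The old house at the end of the street", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```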

Specifications

Parameters
117M
Architecture
12-layer decoder-only Transformer with 12 masked self-attention heads
License
MIT
Context Window
512 tokens
Training Data Cutoff
2018
Type
text
Modalities
text
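
The 117M parameter figure can be roughly reconstructed from the architecture above. The back-of-envelope sketch below uses hyperparameters from the original paper (768-dimensional states, 3072-dimensional feed-forward layers, a ~40k-token BPE vocabulary) that are not listed on this page, so treat those exact values as assumptions.

```python
# Back-of-envelope parameter count for GPT-1. d_model, d_ffn, and the
# vocabulary size come from the original paper, not from this page.
d_model, d_ffn = 768, 3072
n_layer, n_ctx = 12, 512
vocab = 40_478  # ~40,000 BPE merges plus special tokens

token_emb = vocab * d_model                    # token embeddings (tied with the output projection)
pos_emb = n_ctx * d_model                      # learned position embeddings, one per context slot

attn = 4 * (d_model * d_model + d_model)       # Q, K, V and output projections (+ biases)
ffn = 2 * (d_model * d_ffn) + d_ffn + d_model  # two feed-forward linear layers (+ biases)
layer_norms = 2 * (2 * d_model)                # two LayerNorms (scale + shift) per block
per_block = attn + ffn + layer_norms

total = token_emb + pos_emb + n_layer * per_block
print(f"{total / 1e6:.1f}M parameters")        # ≈ 116.5M, commonly reported as 117M
```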

Advanced Specifications

Model Family
GPT
API Access
Not Available
Chat Interface
Not Available
Multilingual Support
No

Capabilities & Limitations

Capabilities
natural language inference, question answering, semantic similarity, text classification
Known Limitations
limited context window (512 tokens)
Notable Use Cases
text classification, question answering, semantic similarity assessment
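
These capabilities were obtained by fine-tuning the pre-trained model with task-specific input transformations rather than task-specific architectures: structured inputs are flattened into a single token sequence bracketed by start, delimiter, and extract tokens. The sketch below illustrates the idea with placeholder token strings; in the paper these were learned special embeddings, so the literal strings are an assumption.

```python
# Hedged sketch of GPT-1's task-specific input transformations. The token
# strings below are placeholders for the learned start/delimiter/extract tokens.
START, DELIM, EXTRACT = "<start>", "<delim>", "<extract>"

def classification_input(text: str) -> str:
    # Single-sentence tasks (e.g. text classification): just wrap the text.
    return f"{START} {text} {EXTRACT}"

def entailment_input(premise: str, hypothesis: str) -> str:
    # Natural language inference: premise and hypothesis joined by a delimiter.
    return f"{START} {premise} {DELIM} {hypothesis} {EXTRACT}"

def similarity_inputs(a: str, b: str) -> list[str]:
    # Semantic similarity has no inherent ordering, so both orderings are
    # encoded and their final representations combined before the classifier.
    return [f"{START} {a} {DELIM} {b} {EXTRACT}",
            f"{START} {b} {DELIM} {a} {EXTRACT}"]
```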
