GPT-4.1
Flagship GPT model for complex tasks, well suited to problem solving across domains. It features major improvements in coding, instruction following, and long-context comprehension.
Specifications
- Architecture
- Decoder-only Transformer (with vision encoder for images)
- License
- Proprietary
- Context Window
- 1,047,576 tokens
- Max Output
- 32,768 tokens
- Training Data Cutoff
- Jun 2024
- Type
- multimodal
- Modalities
- text, vision
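As a rough illustration of how the context window and max output figures above might interact, the sketch below computes how many input tokens remain once room is reserved for the response. It assumes that input and output tokens share the same context window, which the specification above does not state explicitly. The function names `max_input_tokens` and `fits` are hypothetical helpers, not part of any SDK.

```python
# Figures taken from the specification list above.
CONTEXT_WINDOW = 1_047_576  # total tokens
MAX_OUTPUT = 32_768         # maximum tokens in a single response


def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return the input budget left after reserving space for the response.

    Assumption: prompt and completion tokens draw on one shared window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt of the given size leaves room for the response."""
    return 0 <= prompt_tokens <= max_input_tokens(reserved_output)


if __name__ == "__main__":
    print(max_input_tokens())   # 1014808
    print(fits(1_000_000))      # True
    print(fits(1_020_000))      # False
```

Token counts themselves would have to come from a tokenizer matched to the model; this sketch only covers the arithmetic.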
Benchmark Scores
A benchmark of over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD...
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...
Multi-IF evaluates LLMs on multi-turn and multilingual instruction following across 8 languages, wit...
A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI with 11.5...
Evaluates mathematical reasoning in visual contexts, combining vision and mathematical problem-solvi...
Tests reasoning on challenging problems from arXiv papers across multiple scientific domains....
Advanced Specifications
- Model Family
- GPT
- API Access
- Available
- Chat Interface
- Not Available
Capabilities & Limitations
- Capabilities
- code, math, reasoning, function calling, structured outputs, fine-tuning, distillation, predicted outputs, web search, file search, image generation, code interpreter, MCP, streaming, batch API, prompt caching
- Known Limitations
- Does not support audio modalities; does not support computer use; not available in the free tier
- Notable Use Cases
- coding assistant, document QA, agent systems, software engineering, extracting insights from large documents, customer support automation, legal document analysis, frontend development
- Function Calling Support
- Yes
- Tool Use Support
- Yes
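To make the function calling entry more concrete, here is a minimal sketch of the application side of a tool-call round trip, with no network request. The tool definition uses the nested JSON Schema shape accepted, to the best of my understanding, by the `tools` parameter of OpenAI's Chat Completions API; other endpoints may expect a different layout. The `get_weather` tool, its canned return value, and the `dispatch` helper are all hypothetical.

```python
import json

# Hypothetical tool definition; the nested "function" layout is an assumption
# about the Chat Completions request format, not taken from this page.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> dict:
    """Stub implementation that returns canned data instead of a real lookup."""
    return {"city": city, "temperature_c": 21}


# Maps tool names the model may request to local Python callables.
HANDLERS = {"get_weather": get_weather}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the requested tool and return its result as a JSON string.

    The model is expected to supply arguments as a JSON-encoded string.
    """
    handler = HANDLERS.get(name)
    if handler is None:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return json.dumps(handler(**arguments))


if __name__ == "__main__":
    print(dispatch("get_weather", '{"city": "Oslo"}'))
```

In a real integration, the string returned by `dispatch` would be sent back to the model as the tool result so it can compose a final answer.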