BERT
A bidirectional, encoder-only Transformer pre-trained for language understanding (it is not a generative model). Its masked-language-modeling pre-training made it highly influential across NLP tasks.
Released: 2018-10-01
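The masked-language-modeling objective means BERT predicts hidden tokens using context from both sides. A minimal sketch of querying that pre-training head, assuming the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint (both assumptions, not part of this card):

```python
# Minimal sketch, assuming Hugging Face `transformers` (with a PyTorch backend)
# is installed. The checkpoint name is an assumption; any MLM-pre-trained
# BERT checkpoint behaves the same way.
from transformers import pipeline

# "fill-mask" runs the masked-language-modeling head BERT was pre-trained with.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT fills in [MASK] using context from both the left and the right.
for prediction in unmasker("Paris is the [MASK] of France."):
    print(f"{prediction['token_str']:>10}  p={prediction['score']:.3f}")
```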
Specifications
- Parameters: 340M
- Architecture: Encoder-only Transformer
- License: Apache 2.0
- Context Window: 512 tokens (see the tokenization sketch after this list)
- Training Data Cutoff: 2018-10
- Type: text
- Modalities: text
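The 512-token context window is enforced at tokenization time: inputs longer than the window must be truncated (or split) before they reach the model. A minimal sketch, assuming the Hugging Face `transformers` tokenizer and the `bert-base-uncased` checkpoint (both assumptions; the limit applies to any BERT variant):

```python
# Minimal sketch, assuming Hugging Face `transformers` is installed.
# The checkpoint name is an assumption; every BERT tokenizer enforces
# the same 512-token maximum sequence length.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

long_text = "word " * 2000  # far longer than the model can attend to

# Truncate to the 512-token window; [CLS] and [SEP] count toward the limit.
encoding = tokenizer(long_text, truncation=True, max_length=512,
                     return_tensors="pt")
print(encoding["input_ids"].shape)  # torch.Size([1, 512])
```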
Benchmark Scores
- GLUE: 80.5 (BERT-Large, as reported in the original paper)
- MultiNLI: 86.7% accuracy
- SQuAD v1.1: 93.2 Test F1
- SQuAD v2.0: 83.1 Test F1
Advanced Specifications
- Model Family: BERT
- API Access: Not Available
- Chat Interface: Not Available
- Variants: BERT-Base, BERT-Large, BERT-Tiny
Capabilities & Limitations
- Capabilities: coreference resolution, polysemy resolution, question answering, sentiment classification (see the classification sketch after this list)
- Notable Use Cases: document QA, coding assistance, text classification
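For the classification-style capabilities above, BERT is normally fine-tuned by placing a small classifier head on top of its [CLS] representation. A minimal sketch of that forward pass, assuming Hugging Face `transformers`; the checkpoint and the two-label setup are assumptions, and the head shown here is untrained:

```python
# Minimal sketch of BERT as a sequence classifier, assuming Hugging Face
# `transformers` with a PyTorch backend. The checkpoint and num_labels are
# assumptions; the freshly initialized head must be fine-tuned on labeled
# task data (e.g. sentiment pairs) before its outputs are meaningful.
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # head weights are randomly initialized
model.eval()

inputs = tokenizer("This movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): one score per class
print(logits.softmax(dim=-1))
```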