WinoGrande

Category: Reasoning

Status: Pending Human Review

An adversarial Winograd Schema Challenge at scale.

Published: 2019

Score Range: 0-100

Top Score: 84.9

WinoGrande Leaderboard

Rank	Model	Provider	Score	Parameters	Released	Type
1	DeepSeek-V3	DeepSeek	84.9	671B total, 37B activated	2024-12-26	Text

About WinoGrande

Methodology

WinoGrande is scored by accuracy. Each problem is a sentence with a blank and two candidate fillers, and a model earns credit when it picks the correct one. Scores are reported on a scale of 0 to 100, where higher is better. Because every problem has two options, random guessing scores about 50. For details of the dataset construction and evaluation protocol, refer to the original paper.
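
As a rough illustration of how a score on the 0 to 100 scale can be computed, the sketch below assumes the score is plain accuracy over two-option fill-in-the-blank items. The `Item` class, the `accuracy_score` function, the field names, and the example sentences are all hypothetical and chosen for illustration. They are not taken from the benchmark data or from any official evaluation harness, and a real evaluation may differ in details such as few-shot prompting.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Item:
    """One two-option fill-in-the-blank problem (hypothetical schema)."""
    sentence: str  # contains a single "_" placeholder
    option1: str
    option2: str
    answer: str    # gold label: "1" or "2"


def accuracy_score(items: Sequence[Item], predictions: Sequence[str]) -> float:
    """Return the percentage (0-100) of items whose prediction matches the gold label."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.answer)
    return 100.0 * correct / len(items)


# Illustrative items only; these are NOT drawn from the WinoGrande dataset.
EXAMPLE_ITEMS = [
    Item("The trophy doesn't fit into the suitcase because _ is too large.",
         "trophy", "suitcase", "1"),
    Item("The trophy doesn't fit into the suitcase because _ is too small.",
         "trophy", "suitcase", "2"),
    Item("Ann lent her umbrella to Beth because _ had forgotten hers.",
         "Ann", "Beth", "2"),
    Item("The laptop outlasted the phone because _ had a larger battery.",
         "laptop", "phone", "1"),
]

# Hypothetical model outputs: the last prediction is wrong.
EXAMPLE_PREDICTIONS = ["1", "2", "2", "2"]

if __name__ == "__main__":
    print(accuracy_score(EXAMPLE_ITEMS, EXAMPLE_PREDICTIONS))  # prints 75.0
```

Under this assumption, the leaderboard's top score of 84.9 would correspond to answering roughly 85 of every 100 problems correctly.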