The Skill Rating
The skill rating is at the heart of WTO - a fair, transparent measure of ability whose method we publish openly, so students, parents, and schools can trust it.
It updates as students compete, showing real progress over time rather than a single pass-or-fail score.
Why a rating, not just a score
Established competitions earn prestige from a signal that persists across editions, not from a single day's result.
Credibility
A rating earned over many editions is far harder to fake than one lucky sitting - so the rank actually means something.
Motivation
Students get an intrinsic reason to keep going: improve your rating, climb the leaderboard, see real progress.
Comparability
One number on one scale makes results comparable across students, schools, and countries worldwide.
How it works
The rating works like a chess Elo system, with one twist: the student does not play another student - the student plays the questions. Every item in the bank carries its own difficulty on the same scale as the student's rating.
For each question, the model predicts the chance p that the student answers it correctly, given the student's rating R and the item's difficulty D:
p = 1 / ( 1 + 10 ^ ( ( D − R ) / 400 ) )
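As a minimal sketch, the prediction formula above can be written as a one-line function. The name `p_correct` is chosen here for illustration and is not taken from the WTO platform.

```python
def p_correct(rating: float, difficulty: float) -> float:
    """Predicted chance of a correct answer: p = 1 / (1 + 10^((D - R) / 400))."""
    return 1.0 / (1.0 + 10.0 ** ((difficulty - rating) / 400.0))


print(round(p_correct(1000, 1000), 2))  # 0.5: item matches the student's rating
print(round(p_correct(1000, 1400), 2))  # 0.09: item is 400 points harder
print(round(p_correct(1400, 1000), 2))  # 0.91: item is 400 points easier
```

A 400-point gap in either direction corresponds to roughly 10-to-1 odds, which is the same convention chess Elo uses.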
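Across the whole paper, the model compares how the student was expected to do (E, the sum of p over every question) against how they actually did (S, the number answered correctly), and nudges the rating toward the surprise, scaled by a sensitivity factor K: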
R_new = R + K × ( S − E )
Because difficulty is built in, beating a hard paper moves the rating more than beating an easy one - even for the same number of correct answers. The rating measures demonstrated skill against calibrated difficulty, not just luck of the draw.
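The update rule can be sketched for a whole paper as follows. This is an illustration under assumptions, not the platform's implementation: the function names, the four-question papers, their difficulties, and the value `k=6` are all hypothetical.

```python
def p_correct(rating, difficulty):
    """Predicted chance of a correct answer for one item."""
    return 1.0 / (1.0 + 10.0 ** ((difficulty - rating) / 400.0))


def update_rating(rating, difficulties, answers, k):
    """Apply R_new = R + K * (S - E) to one paper.

    difficulties: one difficulty per question, on the rating scale
    answers: one bool per question (True means correct)
    k: sensitivity factor (assumed value; supplied by the caller)
    """
    expected = sum(p_correct(rating, d) for d in difficulties)  # E
    actual = sum(answers)                                       # S
    return rating + k * (actual - expected)


# Hypothetical papers: the same 3-of-4 raw score at two difficulty levels.
answers = [True, True, True, False]
easy = update_rating(1000, [800, 900, 1000, 1100], answers, k=6)
hard = update_rating(1000, [1100, 1200, 1300, 1400], answers, k=6)
print(easy < hard)  # True: the harder paper yields the larger gain
```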
A worked example
A new student (rating 1000) answers 22 of 40 questions correctly. Watch what the same raw score is worth against an easier versus a harder paper:
| Paper | Expected | Actual | Rating change |
|---|---|---|---|
| Easier paper | 14.4 | 22 | +46 → 1046 |
| Harder paper | 6.0 | 22 | +96 → 1096 |
Identical raw score, a bigger rating gain on the harder paper - because the model knows one was tougher than the other.
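The table can be reproduced with a short script under two assumptions that the page does not state. First, every question on a paper is given one shared difficulty (1100 for the easier paper, 1300 for the harder one). Second, K = 6, a value inferred from the table's own figures.

```python
def p_correct(rating, difficulty):
    return 1.0 / (1.0 + 10.0 ** ((difficulty - rating) / 400.0))


RATING, QUESTIONS, CORRECT = 1000, 40, 22
K = 6  # assumption: inferred from the table, not a published constant


def rating_change(difficulty):
    """Return (expected correct, rounded rating change) for a uniform paper."""
    expected = QUESTIONS * p_correct(RATING, difficulty)
    return expected, round(K * (CORRECT - expected))


# Assumed uniform difficulties, picked to match the table's expected scores.
for label, difficulty in [("Easier paper", 1100), ("Harder paper", 1300)]:
    expected, change = rating_change(difficulty)
    print(f"{label}: expected {expected:.1f}, change +{change}, new rating {RATING + change}")
```

Under these assumptions the script yields expected scores of 14.4 and 6.0 and gains of +46 and +96, matching the table.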
Confidence grows with participation
A brand-new rating is a rough guess and should move fast; an established rating should move slowly so leaderboards stay stable. So the rating's sensitivity decreases as a student completes more editions, passing through three stages: provisional, then developing, then established. After long inactivity the rating eases gently toward the median, so boards reflect current, active skill.
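One way to picture the staged sensitivity and the inactivity adjustment is the sketch below. Every threshold and constant in it (the edition counts, the K values, the grace period, the 5% pull) is an illustrative placeholder; the page does not publish these numbers.

```python
def sensitivity(editions_completed: int) -> float:
    """Return a K factor that shrinks as the student completes more editions."""
    if editions_completed < 3:
        return 12.0  # provisional: rough guess, moves fast (placeholder value)
    if editions_completed < 10:
        return 8.0   # developing (placeholder value)
    return 4.0       # established: moves slowly (placeholder value)


def ease_toward_median(rating, median, inactive_editions, grace=4, rate=0.05):
    """Shrink the gap to the median by `rate` per missed edition beyond `grace`."""
    missed = max(0, inactive_editions - grace)
    return median + (rating - median) * (1.0 - rate) ** missed


print(sensitivity(1), sensitivity(5), sensitivity(20))  # 12.0 8.0 4.0
print(ease_toward_median(1400, 1000, inactive_editions=2))  # 1400.0: within grace
print(ease_toward_median(1400, 1000, inactive_editions=10) < 1400)  # True
```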
Difficulty is learned, not guessed
Item difficulties are not fixed by the author - they are calibrated from real responses after every edition. Items that prove easier than predicted are adjusted down, and vice versa. Students sharpen the questions, and well-calibrated questions produce fairer ratings: a genuine data flywheel.
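A simple way to sketch this calibration is to mirror the student update from the item's side. The function `recalibrate` and the step size `k_item` are assumptions made for illustration; the page does not specify the calibration procedure.

```python
def p_correct(rating, difficulty):
    return 1.0 / (1.0 + 10.0 ** ((difficulty - rating) / 400.0))


def recalibrate(difficulty, responses, k_item=2.0):
    """Adjust one item's difficulty from an edition's responses.

    responses: list of (student_rating, answered_correctly) pairs
    k_item: hypothetical step size for item updates
    """
    expected = sum(p_correct(r, difficulty) for r, _ in responses)
    actual = sum(1 for _, ok in responses if ok)
    # More correct answers than predicted means the item was easier than
    # its current difficulty implies, so the difficulty goes down.
    return difficulty - k_item * (actual - expected)


# Hypothetical item rated 1200, answered by four students rated 1000.
print(recalibrate(1200, [(1000, True)] * 4) < 1200)   # True: proved easier
print(recalibrate(1200, [(1000, False)] * 4) > 1200)  # True: proved harder
```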
From rating to rank
Rank is derived from the rating, not computed separately. The same number produces every leaderboard simply by choosing the population: global, country, state, city, and school. A student's rank is one plus the number of students ahead of them, and the top percentile worldwide earns World Rank Holder recognition.
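The rank rule can be sketched in a few lines. The `Student` record and the sample data are hypothetical, and "ahead" is assumed to mean a strictly higher rating, so tied students share a rank.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    rating: float
    country: str
    school: str


def rank(student, population):
    """One plus the number of students with a strictly higher rating."""
    return 1 + sum(1 for other in population if other.rating > student.rating)


# Hypothetical students for illustration only.
students = [
    Student("A", 1250, "IN", "North"),
    Student("B", 1100, "IN", "South"),
    Student("C", 1300, "US", "East"),
    Student("D", 1100, "US", "East"),
]
b = students[1]
print(rank(b, students))                                    # 3 (global)
print(rank(b, [s for s in students if s.country == "IN"]))  # 2 (country)
```

Choosing a leaderboard is just choosing which list is passed as `population`; the rating itself is never recomputed.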
- 1M+ learners worldwide
- 145+ countries reached
- Trusted since 2016
What the rating unlocks for you
The same transparent number means something different, and valuable, to each audience.
For parents
See real, motivating progress over time instead of a single pass-or-fail score - and a rank you can trust because it is hard to fake.
Why parents choose WTO
For schools
Benchmark every student on one transparent scale, compare cohorts fairly, and show measurable growth in board-ready reports.
See the school program
For partners
The rating engine is the moat: a defensible, calibrated benchmark you run on our platform without building any of the data science.
Frequently asked questions
The questions parents and schools ask most about the rating.
What is the WTO Skill Rating?
It is a single number, in the style of a chess Elo rating, that measures a student's demonstrated technology skill. Instead of playing another student, the student plays the questions: each item carries a calibrated difficulty on the same scale as the rating, and the rating moves toward the surprise between expected and actual performance.
Why a rating instead of a simple percentage score?
A percentage from one paper says little - it depends on how hard that paper was. The rating accounts for item difficulty, so beating a hard paper moves it more than beating an easy one. Earned across many editions, it is a far more credible, harder-to-fake measure than a single sitting.
Is the method public, or a black box?
It is published openly. The logistic expectation, the update rule, difficulty calibration, and how rank is derived from rating are all explained on this page, so students, parents, and schools can trust it.
Can my child's rating go down?
It can move both ways, but the design is forgiving: new ratings move fast and established ones move slowly, the word 'failed' is never used, and there is always a next edition. The goal is steady improvement, not punishment for one tough day.
How does the rating become a rank?
Rank is derived from the rating, not computed separately. The same number produces every leaderboard by choosing the population - global, country, state, city, or school - and a student's rank is one plus the number of students ahead of them.
Build a rating that proves real skill
Every edition your child sits makes the rating - and the rank - more meaningful.



