Robot dataset quality
We provide golden sets, tested annotators, hidden quality checks, diversity checks, weighted consensus, and proof from policy training.
The quality pyramid
A robot clip only helps training when the action is visible, the object is right, and the judgment comes from annotators who keep passing hidden checks.
Common, noisy clips sit at the base. EgoArena moves the useful clips up.
How EgoArena scores
Reviewers pass a short golden test before touching silver data.
Hidden golden clips inside earn mode catch drift over time.
Silver clips complete when enough trusted reviewers agree.
We train real models on the data and run rollouts, so the final quality signal is whether the resulting policy actually works.
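As a rough illustration of how hidden golden checks and weighted consensus could fit together, here is a minimal Python sketch. It is not EgoArena's implementation: the `Reviewer` class, the smoothed-accuracy trust formula, and the `min_weight` and `min_share` thresholds are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Reviewer:
    """Hypothetical reviewer record; field names are illustrative."""
    name: str
    golden_correct: int = 0  # hidden golden clips answered correctly
    golden_seen: int = 0     # hidden golden clips shown so far

    @property
    def trust(self) -> float:
        # Assumed formula: Laplace-smoothed accuracy on hidden golden clips,
        # so a reviewer with no history starts at 0.5 rather than 0 or 1.
        return (self.golden_correct + 1) / (self.golden_seen + 2)


def weighted_consensus(votes, min_weight=1.5, min_share=0.7):
    """Return the winning label, or None if the clip is not yet complete.

    votes: list of (Reviewer, label) pairs for one silver clip.
    min_weight and min_share are assumed thresholds, not EgoArena's.
    """
    totals = defaultdict(float)
    for reviewer, label in votes:
        totals[label] += reviewer.trust
    total = sum(totals.values())
    if total < min_weight:
        return None  # not enough trusted judgment yet
    label, weight = max(totals.items(), key=lambda kv: kv[1])
    return label if weight / total >= min_share else None


alice = Reviewer("alice", golden_correct=18, golden_seen=20)
bob = Reviewer("bob", golden_correct=9, golden_seen=10)
carol = Reviewer("carol", golden_correct=2, golden_seen=10)

# Two high-trust reviewers outweigh one low-trust dissenter.
print(weighted_consensus([(alice, "good"), (bob, "good"), (carol, "bad")]))
# A single vote does not carry enough weight to complete the clip.
print(weighted_consensus([(alice, "good")]))
```

Weighting votes by trust rather than counting heads means a reviewer who keeps failing hidden golden clips gradually loses influence, which is one way drift could be contained without removing anyone outright.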
For dataset builders
Submit raw egocentric data. EgoArena turns noisy clips into ranked, reviewable signal.