Research Products

Who are we?

LLM Stats is a leading independent AI evaluation platform with 200K+ monthly users, backed by Y Combinator. We are proud to serve the best researchers in the world.

Our products

Benchmarking Services

Semi-private evaluations across law, medicine, finance, cybersecurity, and more. All modalities and languages.

We work with your team to design the evaluation pipeline in under 48 hours.

Human Expert Preferences

500K+ preference pairs per month: multi-turn, expert-labeled, with a filesystem harness. Labels come from verified professionals doing real work on our platform, not from annotators.

RL Environments

Verifiable coding environments for multi-hour tasks. Tasks span ML research, refactoring, security, and finance, using real codebases in multiple programming languages.