AI Rankings

How we rank AI models and tools

We rank AI models and tools by combining public benchmarks, community reviews, pricing, and recency. Models and tools are evaluated separately because they serve different purposes and shouldn't be ranked against each other.

Weights

  • Benchmark performance (40%): public benchmarks such as LMSYS Arena Elo, MMLU, HumanEval, and GPQA.
  • Real-world reputation (30%): community reviews, expert writeups, and social media sentiment, aggregated with AI research tools.
  • Accessibility and price (20%): free tiers, pricing, API availability, and ease of use.
  • Recency (10%): frequency of recent updates and new-feature cadence.
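
To make the combination concrete, here is a minimal Python sketch of the weighted score, assuming 0-100 sub-scores; the field names (benchmarks, reputation, access, recency) are illustrative placeholders, not the exact fields of our pipeline.

    # Weights from the list above (illustrative sketch, not our exact pipeline).
    WEIGHTS = {
        "benchmarks": 0.40,  # benchmark performance
        "reputation": 0.30,  # real-world reputation
        "access":     0.20,  # accessibility and price
        "recency":    0.10,  # recency
    }

    def composite_score(subscores: dict[str, float]) -> float:
        """Weighted sum of 0-100 sub-scores, yielding a 0-100 composite."""
        return sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS)

    # Example: strong benchmarks, solid reputation, average price and recency signals.
    composite_score({"benchmarks": 92, "reputation": 85, "access": 70, "recency": 80})
    # 92*0.4 + 85*0.3 + 70*0.2 + 80*0.1 = 84.3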

Tiers

  • S tier: 90 and above
  • A tier: 80 to 89
  • B tier: 70 to 79
  • C tier: below 70
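
Continuing the sketch above, the cutoffs map a composite score to a tier like this (the function is illustrative):

    def tier(score: float) -> str:
        """Map a 0-100 composite score to a letter tier."""
        if score >= 90:
            return "S"
        if score >= 80:
            return "A"
        if score >= 70:
            return "B"
        return "C"

    tier(84.3)  # "A" -- the example score from the weights sketch lands in the A tier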

Weekly curation (not a full catalog)

  • We refresh research on a weekly cadence and intentionally keep each task list short — popularity and recency matter more than completeness.
  • New candidates are expected to show credible activity within roughly the last six months (releases, changelogs, or pricing updates).
  • Older entries can remain when they still show strong reputation signals (e.g. sustained community adoption) even if they are not brand-new.
  • We may flag entries for review when both recency and reputation signals look weak; that is guidance for editors, not an automatic delist (see the sketch after this list).
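
As a rough Python sketch of that review flag, assuming a 0-100 reputation signal and using the roughly-six-month window mentioned above (the field names and the reputation cutoff are illustrative assumptions):

    from datetime import date, timedelta

    RECENT_WINDOW = timedelta(days=183)  # roughly six months
    WEAK_REPUTATION = 60                 # assumed cutoff on a 0-100 reputation signal

    def flag_for_review(last_activity: date, reputation: float, today: date) -> bool:
        """Flag an entry for editor review when both signals look weak."""
        stale = (today - last_activity) > RECENT_WINDOW
        weak = reputation < WEAK_REPUTATION
        return stale and weak  # guidance for editors, not an automatic delist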

Limits we acknowledge

  • No ranking is fully objective — benchmarks do not capture all real-world use.
  • AI-collected community sentiment can contain errors; we verify before publishing.
  • We represent an opinion grounded in public data — not authoritative judgment.

We don't publish evaluations without verifiable sources.

Corrections

We respond to correction requests from product owners within 24 hours. Use "Report Incorrect Info" in the footer.

Last weight adjustment: 2026-04-28