
Benchmark

Japanese: ベンチマーク (benchimāku)

Intermediate · Core Concepts

A standardized test or dataset used to measure and compare the performance of different AI models on specific tasks.

Why It Matters

Because every model is scored on the same tasks in the same way, benchmarks let you compare accuracy, speed, and capability directly and choose the model that fits your needs.

Example in Practice

MMLU (Massive Multitask Language Understanding) tests how well models answer multiple-choice questions across 57 subjects.
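To make the idea of scoring concrete, here is a minimal Python sketch of how a multiple-choice benchmark might compute accuracy and compare two models. Everything in it is an assumption made for illustration: the four toy questions are not real MMLU items, and `always_first` and `always_last` are hypothetical stand-ins for real models.

```python
from typing import Callable, Dict, List

# Hypothetical toy items in a multiple-choice format (not real MMLU data).
# "answer" is the index of the correct choice.
ITEMS: List[dict] = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5", "6"], "answer": 1},
    {"question": "Chemical formula for water?", "choices": ["CO2", "O2", "H2O", "NaCl"], "answer": 2},
    {"question": "Largest planet in the Solar System?", "choices": ["Mars", "Venus", "Earth", "Jupiter"], "answer": 3},
    {"question": "Capital of France?", "choices": ["Rome", "Berlin", "Madrid", "Paris"], "answer": 3},
]

# A "model" here is any function that maps a question and its choices
# to the index of the choice it picks.
Model = Callable[[str, List[str]], int]


def accuracy(model: Model, items: List[dict]) -> float:
    """Return the fraction of items the model answers correctly."""
    correct = sum(
        1 for item in items
        if model(item["question"], item["choices"]) == item["answer"]
    )
    return correct / len(items)


def always_first(question: str, choices: List[str]) -> int:
    """Hypothetical baseline: always pick the first choice."""
    return 0


def always_last(question: str, choices: List[str]) -> int:
    """Hypothetical baseline: always pick the last choice."""
    return len(choices) - 1


# Score every model on the same items, then pick the highest scorer.
models: Dict[str, Model] = {"always_first": always_first, "always_last": always_last}
scores = {name: accuracy(fn, ITEMS) for name, fn in models.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
print("best:", best)
```

The key design point is that both models see identical questions and are graded by the same function, which is what makes their scores comparable.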
