| Model | GL Accuracy | CC Accuracy | Top-3 | Top-5 | Top-10 | Latency (P50) | Latency (P95) | Cost (USD) |
|---|---|---|---|---|---|---|---|---|
| {{ model_name }} | {% if llm_skipped %}N/A - GOOGLE_API_KEY not set | {% else %}{{ "%.1f"|format(r.exact_match_gl * 100) }}% | {% if model_key == 'llm' %}{{ "%.1f"|format(r.exact_match_cc * 100) }}%{% else %}N/A{% endif %} | {% if model_key == 'llm' %}N/A{% else %}{{ "%.1f"|format(r.top_3_accuracy * 100) }}%{% endif %} | {% if model_key == 'llm' %}N/A{% else %}{{ "%.1f"|format(r.top_5_accuracy * 100) }}%{% endif %} | {% if model_key == 'llm' %}N/A{% else %}{{ "%.1f"|format(r.top_10_accuracy * 100) }}%{% endif %} | {{ "%.0f"|format(r.latency_p50_ms) }}ms | {{ "%.0f"|format(r.latency_p95_ms) }}ms | ${{ "%.4f"|format(r.total_cost_usd) }} | {% endif %}
No benchmark results available.
{% endif %}

The confusion matrix shows prediction patterns for the Google embedding model. Low-frequency accounts are grouped into "Other".
Insufficient data to generate confusion matrix.
{% endif %}

Representative examples from each outcome category illustrate model behavior.
{% if showcase and showcase.categories %}
{% for category in showcase.categories %}
No examples found for this category.
{% endif %}

No showcase examples available.
{% endif %}