Public Beta available

The spreadsheet to stay on top of AI models.

Benchmark your prompts across every model in a few clicks. Compare quality, cost, and speed. Get alerted when a better one drops.

Example sheet: Ecommerce Prompts

Columns: Product name | Anthropic Claude Opus 4.7 | OpenAI GPT-5.4

Row 1: AeroMesh Ergonomic Office Chair

Anthropic Claude Opus 4.7 (1.2s, $0.003, score 4/5):
Experience unparalleled comfort with the AeroMesh office chair. Designed with breathable mesh and adaptive lumbar support, it keeps you cool and focused through long workdays.

OpenAI GPT-5.4 (0.9s, $0.005, score 3/5):
The AeroMesh Ergonomic Office Chair redefines workspace seating. Its innovative breathable material and dynamic backrest ensure optimal posture and temperature control for peak productivity.

Row 2: Lumina Smart Desk Lamp

Generating...
Waiting for input changes to settle...

Everything you need to pick the right model.

Spreadsheet-shaped

It looks like a spreadsheet, but every column is either a manual input or an AI generation. No formulas to write.

Multi-model comparison

Add columns for Claude, GPT, Gemini, Grok, Llama, and Mistral. Each maintains its own provider config, system prompt, and temperature.

Automatic regeneration

Edit an input cell and every dependent AI column regenerates automatically once your changes settle. The UI shows the generating state in real time.
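As a rough illustration of what "dependent columns" can mean, here is a minimal sketch in Python. It is not GenTable's implementation: the `depends_on` mapping, the column names, and the idea that one AI column may feed another are all assumptions made for the example.

```python
from collections import deque


def columns_to_regenerate(edited: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every column that directly or indirectly reads the edited column.

    `depends_on` maps a column name to the columns its prompt references.
    Results come back in breadth-first order from the edited column.
    """
    stale: list[str] = []
    seen = {edited}
    queue = deque([edited])
    while queue:
        current = queue.popleft()
        for column, sources in depends_on.items():
            if current in sources and column not in seen:
                seen.add(column)
                stale.append(column)
                queue.append(column)
    return stale


# Hypothetical sheet: two AI columns read inputs, a third reads an AI column.
depends_on = {
    "Description": ["Product name"],
    "Tagline": ["Product name", "Audience"],
    "Translated description": ["Description"],
}

stale = columns_to_regenerate("Product name", depends_on)
print(stale)  # ['Description', 'Tagline', 'Translated description']
```

Editing "Audience" in the same hypothetical sheet would mark only "Tagline" as stale, so unrelated columns keep their existing outputs.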

Automatic evaluation

Attach an LLM judge to any column. Score 1–5, pass/fail, or categorize each row. Tweak the judge prompt and we re-judge without regenerating the outputs.
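One way to picture re-judging without regenerating is to cache each generated output separately from the judge step. The sketch below assumes exactly that; the `Cell` class, the stub `fake_generate` and `fake_judge` functions, and the scoring rule are hypothetical stand-ins, not GenTable's API or a real model call.

```python
from typing import Callable


class Cell:
    """Caches generated outputs so a changed judge prompt reuses them."""

    def __init__(self, generate: Callable[[str], str], judge: Callable[[str, str], int]):
        self._generate = generate
        self._judge = judge
        self._output_cache: dict[str, str] = {}

    def output(self, prompt: str) -> str:
        # Only call the (expensive) generator the first time a prompt is seen.
        if prompt not in self._output_cache:
            self._output_cache[prompt] = self._generate(prompt)
        return self._output_cache[prompt]

    def score(self, prompt: str, judge_prompt: str) -> int:
        # The judge always runs, but on the cached output.
        return self._judge(judge_prompt, self.output(prompt))


calls = {"generate": 0}


def fake_generate(prompt: str) -> str:
    """Stand-in for a model call; counts how often it runs."""
    calls["generate"] += 1
    return f"output for: {prompt}"


def fake_judge(judge_prompt: str, output: str) -> int:
    """Stand-in for an LLM judge returning a 1-5 score."""
    return 5 if "concise" in judge_prompt else 3


cell = Cell(fake_generate, fake_judge)
first = cell.score("Describe the chair.", "Rate clarity from 1 to 5.")
second = cell.score("Describe the chair.", "Rate how concise it is from 1 to 5.")
print(first, second, calls["generate"])  # 3 5 1
```

The generator runs once even though the row was judged twice, which is the cost saving the feature describes.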

Cost & latency tracking

Every cell shows execution time and estimated cost based on token counts. Evaluate models holistically, not just on output quality.
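For a sense of how a cost estimate can follow from token counts, here is a minimal Python sketch. The `Pricing` class and the prices in it are made up for illustration; real provider rates differ and change over time, and this is not necessarily how GenTable computes its figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices in USD per million tokens (not real provider rates)."""
    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimated USD cost of one cell: tokens times price, summed over both directions."""
    total = input_tokens * pricing.input_per_mtok + output_tokens * pricing.output_per_mtok
    return total / 1_000_000


example = Pricing(input_per_mtok=2.0, output_per_mtok=10.0)
cost = estimate_cost(input_tokens=500, output_tokens=200, pricing=example)
print(f"${cost:.4f}")  # $0.0030
```

Because output tokens are usually priced higher than input tokens, a verbose model can cost more per cell even when its listed input price is lower.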

Model alerts

Get notified when a newly released model is faster or cheaper for your workloads. Stay current without re-running everything manually.

From prompt to benchmark in three steps.

No scripts, no dashboards. Just rows, columns, and real results.

01

Add test inputs

Paste your dataset or type inputs manually into the left-most columns. These are the variables for your prompts.

02

Configure AI columns

Add columns for any model. Set the system prompt, reference your input columns via variables, and tweak parameters.
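To make the idea of referencing input columns via variables concrete, here is a small Python sketch. The `{{ column name }}` placeholder syntax and the `render_prompt` helper are assumptions for illustration only; the page does not say what syntax GenTable actually uses.

```python
import re

# Hypothetical placeholder syntax: {{ Column name }}
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def render_prompt(template: str, row: dict[str, str]) -> str:
    """Replace each {{ column }} placeholder with that column's value for one row."""

    def substitute(match: re.Match) -> str:
        column = match.group(1)
        if column not in row:
            raise KeyError(f"unknown input column: {column}")
        return row[column]

    return PLACEHOLDER.sub(substitute, template)


row = {"Product name": "AeroMesh Ergonomic Office Chair"}
prompt = render_prompt("Write a product description for {{ Product name }}.", row)
print(prompt)  # Write a product description for AeroMesh Ergonomic Office Chair.
```

Raising on an unknown column, rather than leaving the placeholder in place, surfaces a typo in a column name before any model is called.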

03

Compare and evaluate

Outputs populate side by side. Add an LLM judge to score or pass/fail every row automatically, or mark winners yourself.

Start benchmarking in seconds.

GenTable is currently in open beta. Sign in with Google or GitHub, add your API keys, and start comparing models for free.

No credit card required during beta.