Aider LLM Leaderboards

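Aider works best with LLMs that are good at writing and editing code, and it uses benchmarks to evaluate an LLM's ability to follow instructions and edit code successfully without human intervention. Aider's polyglot benchmark tests LLMs on 225 challenging Exercism coding exercises across C++, Go, Java, JavaScript, Python, and Rust.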

Aider polyglot coding leaderboard

	Model	Percent correct	Cost	Command	Correct edit format	Edit format
	gemini-2.5-pro-preview-06-05 (32k think)	83.1%	$49.88	`aider --model gemini/gemini-2.5-pro-preview-06-05 --thinking-tokens 32k`	99.6%	diff-fenced
Dirname : 2025-06-06-16-36-21--gemini0605-32k-think-diff-fenced Test cases : 225 Model : gemini-2.5-pro-preview-06-05 (32k think) Edit format : diff-fenced Commit hash : f827f22 Thinking tokens : 32768 Pass rate 1 : 46.2 Pass rate 2 : 83.1 Pass num 1 : 104 Pass num 2 : 187 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 112 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2719961 Completion tokens : 4648227 Test timeouts : 0 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-06-05 --thinking-tokens 32k` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 200.3 Total cost : 49.8822
	o3 (high) + gpt-4.1	82.7%	$69.29	`aider --model o3 --architect`	100.0%	architect
Dirname : 2025-04-17-01-20-35--o3-mini-high-diff-arch Test cases : 225 Model : o3 (high) + gpt-4.1 Edit format : architect Commit hash : 80909e1-dirty Editor model : gpt-4.1 Editor edit format : editor-diff Pass rate 1 : 36.0 Pass rate 2 : 82.7 Pass num 1 : 81 Pass num 2 : 186 Percent cases well formed : 100.0 Error outputs : 9 Num malformed responses : 0 Num with malformed responses : 0 User asks : 166 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 0 Total tests : 225 Command : `aider --model o3 --architect` Date : 2025-04-17 Versions : 0.82.2.dev Seconds per case : 110.0 Total cost : 69.2921
	o3 (high)	81.3%	$21.23	`aider --model o3 --reasoning-effort high`	94.7%	diff
Dirname : 2025-06-25-21-04-24--o3-price-reduction-high Test cases : 225 Model : o3 (high) Edit format : diff Commit hash : c48fea6 Reasoning effort : high Pass rate 1 : 40.0 Pass rate 2 : 81.3 Pass num 1 : 90 Pass num 2 : 183 Percent cases well formed : 94.7 Error outputs : 25 Num malformed responses : 23 Num with malformed responses : 12 User asks : 116 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 3148932 Completion tokens : 2047615 Test timeouts : 2 Total tests : 225 Command : `aider --model o3 --reasoning-effort high` Date : 2025-06-25 Versions : 0.84.1.dev Seconds per case : 197.3 Total cost : 21.2259
	gemini-2.5-pro-preview-06-05 (default think)	79.1%	$45.60	`aider --model gemini/gemini-2.5-pro-preview-06-05`	100.0%	diff-fenced
Dirname : 2025-06-06-18-38-56--gemini0605-diff-fenced Test cases : 225 Model : gemini-2.5-pro-preview-06-05 (default think) Edit format : diff-fenced Commit hash : 4c161f9-dirty Pass rate 1 : 44.9 Pass rate 2 : 79.1 Pass num 1 : 101 Pass num 2 : 178 Percent cases well formed : 100.0 Error outputs : 4 Num malformed responses : 0 Num with malformed responses : 0 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 4 Prompt tokens : 2751296 Completion tokens : 4142197 Test timeouts : 1 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-06-05` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 175.2 Total cost : 45.5961
	o3	76.9%	$13.75	`aider --model o3`	93.8%	diff
Dirname : 2025-06-25-20-30-16--o3-price-reduction Test cases : 225 Model : o3 Edit format : diff Commit hash : c48fea6 Pass rate 1 : 40.9 Pass rate 2 : 76.9 Pass num 1 : 92 Pass num 2 : 173 Percent cases well formed : 93.8 Error outputs : 22 Num malformed responses : 22 Num with malformed responses : 14 User asks : 108 Lazy comments : 2 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2893189 Completion tokens : 1154767 Test timeouts : 1 Total tests : 225 Command : `aider --model o3` Date : 2025-06-25 Versions : 0.84.1.dev Seconds per case : 101.7 Total cost : 13.7517
	Gemini 2.5 Pro Preview 05-06	76.9%	$37.41	`aider --model gemini/gemini-2.5-pro-preview-05-06`	97.3%	diff-fenced
Dirname : 2025-05-07-19-32-40--gemini0506-diff-fenced-completion_cost Test cases : 225 Model : Gemini 2.5 Pro Preview 05-06 Edit format : diff-fenced Commit hash : 3b08327-dirty Pass rate 1 : 36.4 Pass rate 2 : 76.9 Pass num 1 : 82 Pass num 2 : 173 Percent cases well formed : 97.3 Error outputs : 15 Num malformed responses : 7 Num with malformed responses : 6 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-05-06` Date : 2025-05-07 Versions : 0.82.4.dev Seconds per case : 165.3 Total cost : 37.4104
	Gemini 2.5 Pro Preview 03-25	72.9%		`aider --model gemini/gemini-2.5-pro-preview-03-25`	92.4%	diff-fenced
Dirname : 2025-04-12-04-55-50--gemini-25-pro-diff-fenced Test cases : 225 Model : Gemini 2.5 Pro Preview 03-25 Edit format : diff-fenced Commit hash : 0282574 Pass rate 1 : 40.9 Pass rate 2 : 72.9 Pass num 1 : 92 Pass num 2 : 164 Percent cases well formed : 92.4 Error outputs : 21 Num malformed responses : 21 Num with malformed responses : 17 User asks : 69 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model gemini/gemini-2.5-pro-preview-03-25` Date : 2025-04-12 Versions : 0.81.3.dev Seconds per case : 45.3 Total cost : 0
	claude-opus-4-20250514 (32k thinking)	72.0%	$65.75	`aider --model claude-opus-4-20250514`	97.3%	diff
Dirname : 2025-05-25-20-40-51--opus4-diff-exuser Test cases : 225 Model : claude-opus-4-20250514 (32k thinking) Edit format : diff Commit hash : 9ef3211 Thinking tokens : 32000 Pass rate 1 : 37.3 Pass rate 2 : 72.0 Pass num 1 : 84 Pass num 2 : 162 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 6 Num with malformed responses : 6 User asks : 97 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2567514 Completion tokens : 363142 Test timeouts : 4 Total tests : 225 Command : `aider --model claude-opus-4-20250514` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 44.1 Total cost : 65.7484
	o4-mini (high)	72.0%	$19.64	`aider --model o4-mini`	90.7%	diff
Dirname : 2025-04-16-22-01-58--o4-mini-high-diff-exsys Test cases : 225 Model : o4-mini (high) Edit format : diff Commit hash : b66901f-dirty Pass rate 1 : 19.6 Pass rate 2 : 72.0 Pass num 1 : 44 Pass num 2 : 162 Percent cases well formed : 90.7 Error outputs : 26 Num malformed responses : 24 Num with malformed responses : 21 User asks : 66 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model o4-mini` Date : 2025-04-16 Versions : 0.82.1.dev Seconds per case : 176.5 Total cost : 19.6399
	DeepSeek R1 (0528)	71.4%	$4.80	`aider --model deepseek/deepseek-reasoner`	94.6%	diff
Dirname : 2025-06-06-16-47-07--r1-diff Test cases : 224 Model : DeepSeek R1 (0528) Edit format : diff Commit hash : 4c161f9-dirty Pass rate 1 : 34.4 Pass rate 2 : 71.4 Pass num 1 : 77 Pass num 2 : 160 Percent cases well formed : 94.6 Error outputs : 28 Num malformed responses : 15 Num with malformed responses : 12 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2644169 Completion tokens : 1842168 Test timeouts : 2 Total tests : 225 Command : `aider --model deepseek/deepseek-reasoner` Date : 2025-06-06 Versions : 0.84.1.dev Seconds per case : 716.6 Total cost : 4.8016
	claude-opus-4-20250514 (no think)	70.7%	$68.63	`aider --model claude-opus-4-20250514`	98.7%	diff
Dirname : 2025-05-25-19-57-20--opus4-diff-exuser Test cases : 225 Model : claude-opus-4-20250514 (no think) Edit format : diff Commit hash : 9ef3211 Pass rate 1 : 32.9 Pass rate 2 : 70.7 Pass num 1 : 74 Pass num 2 : 159 Percent cases well formed : 98.7 Error outputs : 3 Num malformed responses : 3 Num with malformed responses : 3 User asks : 105 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2671437 Completion tokens : 380717 Test timeouts : 3 Total tests : 225 Command : `aider --model claude-opus-4-20250514` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 42.5 Total cost : 68.6253
	claude-3-7-sonnet-20250219 (32k thinking tokens)	64.9%	$36.83	`aider --model anthropic/claude-3-7-sonnet-20250219 --thinking-tokens 32k`	97.8%	diff
Dirname : 2025-02-24-21-47-23--sonnet37-diff-think-32k-64k Test cases : 225 Model : claude-3-7-sonnet-20250219 (32k thinking tokens) Edit format : diff Commit hash : 60d11a6, 93edbda Pass rate 1 : 29.3 Pass rate 2 : 64.9 Pass num 1 : 66 Pass num 2 : 146 Percent cases well formed : 97.8 Error outputs : 66 Num malformed responses : 5 Num with malformed responses : 5 User asks : 5 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 1 Total tests : 225 Command : `aider --model anthropic/claude-3-7-sonnet-20250219 --thinking-tokens 32k` Date : 2025-02-24 Versions : 0.75.1.dev Seconds per case : 105.2 Total cost : 36.8343
	DeepSeek R1 + claude-3-5-sonnet-20241022	64.0%	$13.29	`aider --architect --model r1 --editor-model sonnet`	100.0%	architect
Dirname : 2025-01-23-19-14-48--r1-architect-sonnet Test cases : 225 Model : DeepSeek R1 + claude-3-5-sonnet-20241022 Edit format : architect Commit hash : 05a77c7 Editor model : claude-3-5-sonnet-20241022 Editor edit format : editor-diff Pass rate 1 : 27.1 Pass rate 2 : 64.0 Pass num 1 : 61 Pass num 2 : 144 Percent cases well formed : 100.0 Error outputs : 2 Num malformed responses : 0 Num with malformed responses : 0 User asks : 392 Lazy comments : 6 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --architect --model r1 --editor-model sonnet` Date : 2025-01-23 Versions : 0.72.3.dev Seconds per case : 251.6 Total cost : 13.2933
	o1-2024-12-17 (high)	61.7%	$186.50	`aider --model openrouter/openai/o1`	91.5%	diff
Dirname : 2024-12-21-19-23-03--polyglot-o1-hard-diff Test cases : 224 Model : o1-2024-12-17 (high) Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 23.7 Pass rate 2 : 61.7 Pass num 1 : 53 Pass num 2 : 139 Percent cases well formed : 91.5 Error outputs : 25 Num malformed responses : 24 Num with malformed responses : 19 User asks : 16 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model openrouter/openai/o1` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 133.2 Total cost : 186.4958
	claude-sonnet-4-20250514 (32k thinking)	61.3%	$26.58	`aider --model claude-sonnet-4-20250514`	97.3%	diff
Dirname : 2025-05-24-22-10-36--sonnet4-diff-exuser-think32k Test cases : 225 Model : claude-sonnet-4-20250514 (32k thinking) Edit format : diff Commit hash : e3cb907 Thinking tokens : 32000 Pass rate 1 : 25.8 Pass rate 2 : 61.3 Pass num 1 : 58 Pass num 2 : 138 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 10 Num with malformed responses : 6 User asks : 111 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2863068 Completion tokens : 1271074 Test timeouts : 6 Total tests : 225 Command : `aider --model claude-sonnet-4-20250514` Date : 2025-05-24 Versions : 0.83.3.dev Seconds per case : 79.9 Total cost : 26.5755
	claude-3-7-sonnet-20250219 (no thinking)	60.4%	$17.72	`aider --model sonnet`	93.3%	diff
Dirname : 2025-02-24-19-54-07--sonnet37-diff Test cases : 225 Model : claude-3-7-sonnet-20250219 (no thinking) Edit format : diff Commit hash : 75e9ee6 Pass rate 1 : 24.4 Pass rate 2 : 60.4 Pass num 1 : 55 Pass num 2 : 136 Percent cases well formed : 93.3 Error outputs : 16 Num malformed responses : 16 Num with malformed responses : 15 User asks : 12 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 0 Total tests : 225 Command : `aider --model sonnet` Date : 2025-02-24 Versions : 0.74.4.dev Seconds per case : 28.3 Total cost : 17.7191
	o3-mini (high)	60.4%	$18.16	`aider --model o3-mini --reasoning-effort high`	93.3%	diff
Dirname : 2025-01-31-20-42-47--o3-mini-diff-high Test cases : 224 Model : o3-mini (high) Edit format : diff Commit hash : b0d58d1-dirty Pass rate 1 : 21.0 Pass rate 2 : 60.4 Pass num 1 : 47 Pass num 2 : 136 Percent cases well formed : 93.3 Error outputs : 26 Num malformed responses : 24 Num with malformed responses : 15 User asks : 19 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 7 Total tests : 225 Command : `aider --model o3-mini --reasoning-effort high` Date : 2025-01-31 Versions : 0.72.4.dev Seconds per case : 124.6 Total cost : 18.1584
	Qwen3 235B A22B diff, no think, Alibaba API	59.6%		`aider --model openai/qwen3-235b-a22b`	92.9%	diff
Dirname : 2025-05-09-17-02-02--qwen3-235b-a22b.unthink_16k_diff Test cases : 225 Model : Qwen3 235B A22B diff, no think, Alibaba API Edit format : diff Commit hash : 91d7fbd-dirty Pass rate 1 : 28.9 Pass rate 2 : 59.6 Pass num 1 : 65 Pass num 2 : 134 Percent cases well formed : 92.9 Error outputs : 22 Num malformed responses : 22 Num with malformed responses : 16 User asks : 111 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 2816192 Completion tokens : 342062 Test timeouts : 1 Total tests : 225 Command : `aider --model openai/qwen3-235b-a22b` Date : 2025-05-09 Versions : 0.82.4.dev Seconds per case : 45.4 Total cost : 0.0
	DeepSeek R1	56.9%	$5.42	`aider --model deepseek/deepseek-reasoner`	96.9%	diff
Dirname : 2025-01-20-19-11-38--ds-turns-upd-cur-msgs-fix-with-summarizer Test cases : 225 Model : DeepSeek R1 Edit format : diff Commit hash : 5650697-dirty Pass rate 1 : 26.7 Pass rate 2 : 56.9 Pass num 1 : 60 Pass num 2 : 128 Percent cases well formed : 96.9 Error outputs : 8 Num malformed responses : 7 Num with malformed responses : 7 User asks : 15 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 5 Total tests : 225 Command : `aider --model deepseek/deepseek-reasoner` Date : 2025-01-20 Versions : 0.71.2.dev Seconds per case : 113.7 Total cost : 5.4193
	claude-sonnet-4-20250514 (no thinking)	56.4%	$15.82	`aider --model claude-sonnet-4-20250514`	98.2%	diff
Dirname : 2025-05-24-21-17-54--sonnet4-diff-exuser Test cases : 225 Model : claude-sonnet-4-20250514 (no thinking) Edit format : diff Commit hash : ef3f8bb-dirty Pass rate 1 : 20.4 Pass rate 2 : 56.4 Pass num 1 : 46 Pass num 2 : 127 Percent cases well formed : 98.2 Error outputs : 6 Num malformed responses : 4 Num with malformed responses : 4 User asks : 129 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Prompt tokens : 3460663 Completion tokens : 433373 Test timeouts : 7 Total tests : 225 Command : `aider --model claude-sonnet-4-20250514` Date : 2025-05-24 Versions : 0.83.3.dev Seconds per case : 29.8 Total cost : 15.8155
	gemini-2.5-flash-preview-05-20 (24k think)	55.1%	$8.56	`aider --model gemini/gemini-2.5-flash-preview-05-20`	95.6%	diff
Dirname : 2025-05-25-22-58-44--flash25-05-20-24k-think Test cases : 225 Model : gemini-2.5-flash-preview-05-20 (24k think) Edit format : diff Commit hash : a8568c3-dirty Thinking tokens : 24576 Pass rate 1 : 26.2 Pass rate 2 : 55.1 Pass num 1 : 59 Pass num 2 : 124 Percent cases well formed : 95.6 Error outputs : 15 Num malformed responses : 15 Num with malformed responses : 10 User asks : 101 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 3666792 Completion tokens : 2703162 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-05-20` Date : 2025-05-25 Versions : 0.83.3.dev Seconds per case : 53.9 Total cost : 8.5625
	DeepSeek V3 (0324)	55.1%	$1.12	`aider --model deepseek/deepseek-chat`	99.6%	diff
Dirname : 2025-03-24-15-41-33--deepseek-v3-0324-polyglot-diff Test cases : 225 Model : DeepSeek V3 (0324) Edit format : diff Commit hash : 502b863 Pass rate 1 : 28.0 Pass rate 2 : 55.1 Pass num 1 : 63 Pass num 2 : 124 Percent cases well formed : 99.6 Error outputs : 32 Num malformed responses : 1 Num with malformed responses : 1 User asks : 96 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 4 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2025-03-24 Versions : 0.78.1.dev Seconds per case : 290.0 Total cost : 1.1164
	Quasar Alpha	54.7%		`aider --model openrouter/openrouter/quasar-alpha`	98.2%	diff
Dirname : 2025-04-04-02-57-25--qalpha-diff-exsys Test cases : 225 Model : Quasar Alpha Edit format : diff Commit hash : 8a34a6c-dirty Pass rate 1 : 21.8 Pass rate 2 : 54.7 Pass num 1 : 49 Pass num 2 : 123 Percent cases well formed : 98.2 Error outputs : 4 Num malformed responses : 4 Num with malformed responses : 4 User asks : 187 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model openrouter/openrouter/quasar-alpha` Date : 2025-04-04 Versions : 0.80.5.dev Seconds per case : 14.8 Total cost : 0.0
	o3-mini (medium)	53.8%	$8.86	`aider --model o3-mini`	95.1%	diff
Dirname : 2025-01-31-20-27-46--o3-mini-diff2 Test cases : 225 Model : o3-mini (medium) Edit format : diff Commit hash : 2fb517b-dirty Pass rate 1 : 19.1 Pass rate 2 : 53.8 Pass num 1 : 43 Pass num 2 : 121 Percent cases well formed : 95.1 Error outputs : 28 Num malformed responses : 28 Num with malformed responses : 11 User asks : 17 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model o3-mini` Date : 2025-01-31 Versions : 0.72.4.dev Seconds per case : 47.2 Total cost : 8.8599
	Grok 3 Beta	53.3%	$11.03	`aider --model openrouter/x-ai/grok-3-beta`	99.6%	diff
Dirname : 2025-04-10-04-21-31--grok3-diff-exuser Test cases : 225 Model : Grok 3 Beta Edit format : diff Commit hash : 2dd40fc-dirty Pass rate 1 : 22.2 Pass rate 2 : 53.3 Pass num 1 : 50 Pass num 2 : 120 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 68 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model openrouter/x-ai/grok-3-beta` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 15.3 Total cost : 11.0338
	Optimus Alpha	52.9%		`aider --model openrouter/openrouter/optimus-alpha`	97.3%	diff
Dirname : 2025-04-10-19-02-44--oalpha-diff-exsys Test cases : 225 Model : Optimus Alpha Edit format : diff Commit hash : 532bc45-dirty Pass rate 1 : 21.3 Pass rate 2 : 52.9 Pass num 1 : 48 Pass num 2 : 119 Percent cases well formed : 97.3 Error outputs : 7 Num malformed responses : 6 Num with malformed responses : 6 User asks : 182 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model openrouter/openrouter/optimus-alpha` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 18.4 Total cost : 0.0
	gpt-4.1	52.4%	$9.86	`aider --model gpt-4.1`	98.2%	diff
Dirname : 2025-04-14-21-05-54--gpt41-diff-exuser Test cases : 225 Model : gpt-4.1 Edit format : diff Commit hash : 7a87db5-dirty Pass rate 1 : 20.0 Pass rate 2 : 52.4 Pass num 1 : 45 Pass num 2 : 118 Percent cases well formed : 98.2 Error outputs : 6 Num malformed responses : 5 Num with malformed responses : 4 User asks : 171 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 5 Total tests : 225 Command : `aider --model gpt-4.1` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 20.5 Total cost : 9.8556
	claude-3-5-sonnet-20241022	51.6%	$14.41	`aider --model claude-3-5-sonnet-20241022`	99.6%	diff
Dirname : 2025-01-17-19-44-33--sonnet-baseline-jan-17 Test cases : 225 Model : claude-3-5-sonnet-20241022 Edit format : diff Commit hash : 6451d59 Pass rate 1 : 22.2 Pass rate 2 : 51.6 Pass num 1 : 50 Pass num 2 : 116 Percent cases well formed : 99.6 Error outputs : 2 Num malformed responses : 1 Num with malformed responses : 1 User asks : 11 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 8 Total tests : 225 Command : `aider --model claude-3-5-sonnet-20241022` Date : 2025-01-17 Versions : 0.71.2.dev Seconds per case : 21.4 Total cost : 14.4063
	Grok 3 Mini Beta (high)	49.3%	$0.73	`aider --model xai/grok-3-mini-beta --reasoning-effort high`	99.6%	whole
Dirname : 2025-04-10-23-59-02--xai-grok3-mini-whole-high Test cases : 225 Model : Grok 3 Mini Beta (high) Edit format : whole Commit hash : 8ee33da-dirty Pass rate 1 : 17.3 Pass rate 2 : 49.3 Pass num 1 : 39 Pass num 2 : 111 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 64 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 0 Total tests : 225 Command : `aider --model xai/grok-3-mini-beta --reasoning-effort high` Date : 2025-04-10 Versions : 0.81.3.dev Seconds per case : 79.1 Total cost : 0.7346
	DeepSeek Chat V3 (prev)	48.4%	$0.34	`aider --model deepseek/deepseek-chat`	98.7%	diff
Dirname : 2024-12-25-13-31-51--deepseekv3preview-diff2 Test cases : 225 Model : DeepSeek Chat V3 (prev) Edit format : diff Commit hash : 0a23c4a-dirty Pass rate 1 : 22.7 Pass rate 2 : 48.4 Pass num 1 : 51 Pass num 2 : 109 Percent cases well formed : 98.7 Error outputs : 7 Num malformed responses : 7 Num with malformed responses : 3 User asks : 19 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 8 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2024-12-25 Versions : 0.69.2.dev Seconds per case : 34.8 Total cost : 0.3369
	gemini-2.5-flash-preview-04-17 (default)	47.1%	$1.85	`aider --model gemini/gemini-2.5-flash-preview-04-17`	85.3%	diff
Dirname : 2025-04-20-19-54-31--flash25-diff-no-think Test cases : 225 Model : gemini-2.5-flash-preview-04-17 (default) Edit format : diff Commit hash : 7fcce5d-dirty Pass rate 1 : 21.8 Pass rate 2 : 47.1 Pass num 1 : 49 Pass num 2 : 106 Percent cases well formed : 85.3 Error outputs : 60 Num malformed responses : 55 Num with malformed responses : 33 User asks : 82 Lazy comments : 1 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 5 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-04-17` Date : 2025-04-20 Versions : 0.82.3.dev Seconds per case : 50.1 Total cost : 1.8451
	chatgpt-4o-latest (2025-03-29)	45.3%	$19.74	`aider --model chatgpt-4o-latest`	64.4%	diff
Dirname : 2025-03-29-05-24-55--chatgpt4o-mar28-diff Test cases : 225 Model : chatgpt-4o-latest (2025-03-29) Edit format : diff Commit hash : 0decbad Pass rate 1 : 16.4 Pass rate 2 : 45.3 Pass num 1 : 37 Pass num 2 : 102 Percent cases well formed : 64.4 Error outputs : 85 Num malformed responses : 85 Num with malformed responses : 80 User asks : 174 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model chatgpt-4o-latest` Date : 2025-03-29 Versions : 0.79.3.dev Seconds per case : 10.3 Total cost : 19.7416
	gpt-4.5-preview	44.9%	$183.18	`aider --model openai/gpt-4.5-preview`	97.3%	diff
Dirname : 2025-02-27-20-26-15--gpt45-diff3 Test cases : 224 Model : gpt-4.5-preview Edit format : diff Commit hash : b462e55-dirty Pass rate 1 : 22.3 Pass rate 2 : 44.9 Pass num 1 : 50 Pass num 2 : 101 Percent cases well formed : 97.3 Error outputs : 10 Num malformed responses : 8 Num with malformed responses : 6 User asks : 15 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model openai/gpt-4.5-preview` Date : 2025-02-27 Versions : 0.75.2.dev Seconds per case : 113.5 Total cost : 183.1802
	gemini-2.5-flash-preview-05-20 (no think)	44.0%	$1.14	`aider --model gemini/gemini-2.5-flash-preview-05-20`	93.8%	diff
Dirname : 2025-05-26-15-56-31--flash25-05-20-24k-think Test cases : 225 Model : gemini-2.5-flash-preview-05-20 (no think) Edit format : diff Commit hash : 214b811-dirty Thinking tokens : 0 Pass rate 1 : 20.9 Pass rate 2 : 44.0 Pass num 1 : 47 Pass num 2 : 99 Percent cases well formed : 93.8 Error outputs : 16 Num malformed responses : 16 Num with malformed responses : 14 User asks : 79 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Prompt tokens : 5512458 Completion tokens : 514145 Test timeouts : 4 Total tests : 225 Command : `aider --model gemini/gemini-2.5-flash-preview-05-20` Date : 2025-05-26 Versions : 0.83.3.dev Seconds per case : 12.2 Total cost : 1.1354
	Qwen3 32B	40.0%	$0.76	`aider --model openrouter/qwen/qwen3-32b`	83.6%	diff
Dirname : 2025-05-08-03-20-24--qwen3-32b-default Test cases : 225 Model : Qwen3 32B Edit format : diff Commit hash : aaacee5-dirty, aeaf259 Pass rate 1 : 14.2 Pass rate 2 : 40.0 Pass num 1 : 32 Pass num 2 : 90 Percent cases well formed : 83.6 Error outputs : 119 Num malformed responses : 50 Num with malformed responses : 37 User asks : 97 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 12 Prompt tokens : 317591 Completion tokens : 120418 Test timeouts : 5 Total tests : 225 Command : `aider --model openrouter/qwen/qwen3-32b` Date : 2025-05-08 Versions : 0.82.4.dev Seconds per case : 372.2 Total cost : 0.7603
	gemini-exp-1206	38.2%		`aider --model gemini/gemini-exp-1206`	98.2%	whole
Dirname : 2024-12-22-18-43-25--gemini-exp-1206-polyglot-whole-2 Test cases : 225 Model : gemini-exp-1206 Edit format : whole Commit hash : b1bc2f8 Pass rate 1 : 19.6 Pass rate 2 : 38.2 Pass num 1 : 44 Pass num 2 : 86 Percent cases well formed : 98.2 Error outputs : 8 Num malformed responses : 8 Num with malformed responses : 4 User asks : 32 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 9 Total tests : 225 Command : `aider --model gemini/gemini-exp-1206` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 45.5 Total cost : 0.0
	Gemini 2.0 Pro exp-02-05	35.6%		`aider --model gemini/gemini-2.0-pro-exp-02-05`	100.0%	whole
Dirname : 2025-02-25-20-23-07--gemini-pro Test cases : 225 Model : Gemini 2.0 Pro exp-02-05 Edit format : whole Commit hash : 2fccd47 Pass rate 1 : 20.4 Pass rate 2 : 35.6 Pass num 1 : 46 Pass num 2 : 80 Percent cases well formed : 100.0 Error outputs : 430 Num malformed responses : 0 Num with malformed responses : 0 User asks : 13 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --model gemini/gemini-2.0-pro-exp-02-05` Date : 2025-02-25 Versions : 0.75.2.dev Seconds per case : 34.8 Total cost : 0.0
	Grok 3 Mini Beta (low)	34.7%	$0.79	`aider --model openrouter/x-ai/grok-3-mini-beta`	100.0%	whole
Dirname : 2025-04-10-18-47-24--grok3-mini-whole-exuser Test cases : 225 Model : Grok 3 Mini Beta (low) Edit format : whole Commit hash : 14ffe77-dirty Pass rate 1 : 11.1 Pass rate 2 : 34.7 Pass num 1 : 25 Pass num 2 : 78 Percent cases well formed : 100.0 Error outputs : 3 Num malformed responses : 0 Num with malformed responses : 0 User asks : 73 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 5 Total tests : 225 Command : `aider --model openrouter/x-ai/grok-3-mini-beta` Date : 2025-04-10 Versions : 0.81.2.dev Seconds per case : 35.1 Total cost : 0.7856
	o1-mini-2024-09-12	32.9%	$18.58	`aider --model o1-mini`	96.9%	whole
Dirname : 2024-12-22-21-26-35--polyglot-o1mini-whole Test cases : 225 Model : o1-mini-2024-09-12 Edit format : whole Commit hash : 37df899 Pass rate 1 : 5.8 Pass rate 2 : 32.9 Pass num 1 : 13 Pass num 2 : 74 Percent cases well formed : 96.9 Error outputs : 8 Num malformed responses : 8 Num with malformed responses : 7 User asks : 27 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model o1-mini` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 34.7 Total cost : 18.577
	gpt-4.1-mini	32.4%	$1.99	`aider --model gpt-4.1-mini`	92.4%	diff
Dirname : 2025-04-14-21-27-53--gpt41mini-diff Test cases : 225 Model : gpt-4.1-mini Edit format : diff Commit hash : ffb743e-dirty Pass rate 1 : 11.1 Pass rate 2 : 32.4 Pass num 1 : 25 Pass num 2 : 73 Percent cases well formed : 92.4 Error outputs : 64 Num malformed responses : 62 Num with malformed responses : 17 User asks : 159 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 2 Total tests : 225 Command : `aider --model gpt-4.1-mini` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 19.5 Total cost : 1.9918
	claude-3-5-haiku-20241022	28.0%	$6.06	`aider --model claude-3-5-haiku-20241022`	91.1%	diff
Dirname : 2024-12-21-21-46-27--polyglot-haiku-diff Test cases : 225 Model : claude-3-5-haiku-20241022 Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 7.1 Pass rate 2 : 28.0 Pass num 1 : 16 Pass num 2 : 63 Percent cases well formed : 91.1 Error outputs : 31 Num malformed responses : 30 Num with malformed responses : 20 User asks : 13 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 9 Total tests : 225 Command : `aider --model claude-3-5-haiku-20241022` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 31.8 Total cost : 6.0583
	chatgpt-4o-latest (2025-02-15)	27.1%	$14.37	`aider --model chatgpt-4o-latest`	93.3%	diff
Dirname : 2025-02-15-19-51-22--chatgpt4o-feb15-diff Test cases : 223 Model : chatgpt-4o-latest (2025-02-15) Edit format : diff Commit hash : 108ce18-dirty Pass rate 1 : 9.0 Pass rate 2 : 27.1 Pass num 1 : 20 Pass num 2 : 61 Percent cases well formed : 93.3 Error outputs : 66 Num malformed responses : 21 Num with malformed responses : 15 User asks : 57 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 2 Total tests : 225 Command : `aider --model chatgpt-4o-latest` Date : 2025-02-15 Versions : 0.74.3.dev Seconds per case : 12.4 Total cost : 14.3703
	QwQ-32B + Qwen 2.5 Coder Instruct	26.2%		`aider --model fireworks_ai/accounts/fireworks/models/qwq-32b --architect`	100.0%	architect
Dirname : 2025-03-07-15-11-27--qwq32b-arch-temp-topp-again Test cases : 225 Model : QwQ-32B + Qwen 2.5 Coder Instruct Edit format : architect Commit hash : 52162a5 Editor model : fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct Editor edit format : editor-diff Pass rate 1 : 9.8 Pass rate 2 : 26.2 Pass num 1 : 22 Pass num 2 : 59 Percent cases well formed : 100.0 Error outputs : 122 Num malformed responses : 0 Num with malformed responses : 0 User asks : 489 Lazy comments : 8 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model fireworks_ai/accounts/fireworks/models/qwq-32b --architect` Date : 2025-03-07 Versions : 0.75.3.dev Seconds per case : 137.4 Total cost : 0
	gpt-4o-2024-08-06	23.1%	$7.03	`aider --model gpt-4o-2024-08-06`	94.2%	diff
Dirname : 2024-12-30-20-44-54--gpt4o-ex-as-sys-clean-prompt Test cases : 225 Model : gpt-4o-2024-08-06 Edit format : diff Commit hash : 09ee197-dirty Pass rate 1 : 4.9 Pass rate 2 : 23.1 Pass num 1 : 11 Pass num 2 : 52 Percent cases well formed : 94.2 Error outputs : 21 Num malformed responses : 21 Num with malformed responses : 13 User asks : 65 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model gpt-4o-2024-08-06` Date : 2024-12-30 Versions : 0.70.1.dev Seconds per case : 16.0 Total cost : 7.0286
	gemini-2.0-flash-exp	22.2%		`aider --model gemini/gemini-2.0-flash-exp`	100.0%	whole
Dirname : 2024-12-22-20-08-13--gemini-2.0-flash-exp-polyglot-whole Test cases : 225 Model : gemini-2.0-flash-exp Edit format : whole Commit hash : b1bc2f8 Pass rate 1 : 11.6 Pass rate 2 : 22.2 Pass num 1 : 26 Pass num 2 : 50 Percent cases well formed : 100.0 Error outputs : 1 Num malformed responses : 0 Num with malformed responses : 0 User asks : 9 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 8 Total tests : 225 Command : `aider --model gemini/gemini-2.0-flash-exp` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 12.2 Total cost : 0.0
	qwen-max-2025-01-25	21.8%		`OPENAI_API_BASE=https://dashscope-intl.aliyuncs.com/compatible-mode/v1 aider --model openai/qwen-max-2025-01-25`	90.2%	diff
Dirname : 2025-01-28-16-00-03--qwen-max-2025-01-25-polyglot-diff Test cases : 225 Model : qwen-max-2025-01-25 Edit format : diff Commit hash : ae7d459 Pass rate 1 : 9.3 Pass rate 2 : 21.8 Pass num 1 : 21 Pass num 2 : 49 Percent cases well formed : 90.2 Error outputs : 46 Num malformed responses : 44 Num with malformed responses : 22 User asks : 23 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 9 Total tests : 225 Command : `OPENAI_API_BASE=https://dashscope-intl.aliyuncs.com/compatible-mode/v1 aider --model openai/qwen-max-2025-01-25` Date : 2025-01-28 Versions : 0.72.4.dev Seconds per case : 39.5
	QwQ-32B	20.9%		`aider --model fireworks_ai/accounts/fireworks/models/qwq-32b`	67.6%	diff
Dirname : 2025-03-06-17-40-24--qwq32b-diff-temp-topp-ex-sys-remind-user-for-real Test cases : 225 Model : QwQ-32B Edit format : diff Commit hash : 51d118f-dirty Pass rate 1 : 8.0 Pass rate 2 : 20.9 Pass num 1 : 18 Pass num 2 : 47 Percent cases well formed : 67.6 Error outputs : 145 Num malformed responses : 143 Num with malformed responses : 73 User asks : 17 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 4 Total tests : 225 Command : `aider --model fireworks_ai/accounts/fireworks/models/qwq-32b` Date : 2025-03-06 Versions : 0.75.3.dev Seconds per case : 228.6 Total cost : 0.0
	gemini-2.0-flash-thinking-exp-01-21	18.2%		`aider --model gemini/gemini-2.0-flash-thinking-exp-01-21`	77.8%	diff
Dirname : 2025-01-21-22-51-49--gemini-2.0-flash-thinking-exp-01-21-polyglot-diff Test cases : 225 Model : gemini-2.0-flash-thinking-exp-01-21 Edit format : diff Commit hash : 843720a Pass rate 1 : 5.8 Pass rate 2 : 18.2 Pass num 1 : 13 Pass num 2 : 41 Percent cases well formed : 77.8 Error outputs : 182 Num malformed responses : 180 Num with malformed responses : 50 User asks : 26 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 2 Test timeouts : 7 Total tests : 225 Command : `aider --model gemini/gemini-2.0-flash-thinking-exp-01-21` Date : 2025-01-21 Versions : 0.72.2.dev Seconds per case : 24.2 Total cost : 0.0
	gpt-4o-2024-11-20	18.2%	$6.74	`aider --model gpt-4o-2024-11-20`	95.1%	diff
Dirname : 2024-12-30-20-57-12--gpt-4o-2024-11-20-ex-as-sys Test cases : 225 Model : gpt-4o-2024-11-20 Edit format : diff Commit hash : 09ee197-dirty Pass rate 1 : 4.9 Pass rate 2 : 18.2 Pass num 1 : 11 Pass num 2 : 41 Percent cases well formed : 95.1 Error outputs : 12 Num malformed responses : 12 Num with malformed responses : 11 User asks : 53 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 12 Total tests : 225 Command : `aider --model gpt-4o-2024-11-20` Date : 2024-12-30 Versions : 0.70.1.dev Seconds per case : 12.1 Total cost : 6.7351
	DeepSeek Chat V2.5	17.8%	$0.51	`aider --model deepseek/deepseek-chat`	92.9%	diff
Dirname : 2024-12-21-20-56-21--polyglot-deepseek-diff Test cases : 225 Model : DeepSeek Chat V2.5 Edit format : diff Commit hash : a755079-dirty Pass rate 1 : 5.3 Pass rate 2 : 17.8 Pass num 1 : 12 Pass num 2 : 40 Percent cases well formed : 92.9 Error outputs : 42 Num malformed responses : 37 Num with malformed responses : 16 User asks : 23 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 5 Test timeouts : 5 Total tests : 225 Command : `aider --model deepseek/deepseek-chat` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 184.0 Total cost : 0.5101
	Qwen2.5-Coder-32B-Instruct	16.4%		`aider --model openai/Qwen2.5-Coder-32B-Instruct`	99.6%	whole
Dirname : 2024-12-26-00-55-20--Qwen2.5-Coder-32B-Instruct Test cases : 225 Model : Qwen2.5-Coder-32B-Instruct Edit format : whole Commit hash : b51768b0 Pass rate 1 : 4.9 Pass rate 2 : 16.4 Pass num 1 : 11 Pass num 2 : 37 Percent cases well formed : 99.6 Error outputs : 1 Num malformed responses : 1 Num with malformed responses : 1 User asks : 33 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 6 Total tests : 225 Command : `aider --model openai/Qwen2.5-Coder-32B-Instruct` Date : 2024-12-26 Versions : 0.69.2.dev Seconds per case : 42.0 Total cost : 0.0
	Llama 4 Maverick	15.6%		`aider --model nvidia_nim/meta/llama-4-maverick-17b-128e-instruct`	99.1%	whole
Dirname : 2025-04-06-08-39-52--llama-4-maverick-17b-128e-instruct-polyglot-whole Test cases : 225 Model : Llama 4 Maverick Edit format : whole Commit hash : 9445a31 Pass rate 1 : 4.4 Pass rate 2 : 15.6 Pass num 1 : 10 Pass num 2 : 35 Percent cases well formed : 99.1 Error outputs : 12 Num malformed responses : 2 Num with malformed responses : 2 User asks : 248 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model nvidia_nim/meta/llama-4-maverick-17b-128e-instruct` Date : 2025-04-06 Versions : 0.81.2.dev Seconds per case : 20.5 Total cost : 0.0
	yi-lightning	12.9%		`aider --model openai/yi-lightning`	92.9%	whole
Dirname : 2024-12-23-01-11-56--yi-test Test cases : 225 Model : yi-lightning Edit format : whole Commit hash : 2b1625e Pass rate 1 : 5.8 Pass rate 2 : 12.9 Pass num 1 : 13 Pass num 2 : 29 Percent cases well formed : 92.9 Error outputs : 87 Num malformed responses : 72 Num with malformed responses : 16 User asks : 107 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 6 Total tests : 225 Command : `aider --model openai/yi-lightning` Date : 2024-12-23 Versions : 0.69.2.dev Seconds per case : 146.7 Total cost : 0.0
	command-a-03-2025-quality	12.0%		`OPENAI_API_BASE=https://api.cohere.ai/compatibility/v1 aider --model openai/command-a-03-2025-quality`	99.6%	whole
Dirname : 2025-03-14-23-40-00--cmda-quality-whole2 Test cases : 225 Model : command-a-03-2025-quality Edit format : whole Commit hash : a1aa63f Pass rate 1 : 2.2 Pass rate 2 : 12.0 Pass num 1 : 5 Pass num 2 : 27 Percent cases well formed : 99.6 Error outputs : 2 Num malformed responses : 1 Num with malformed responses : 1 User asks : 215 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 4 Total tests : 225 Command : `OPENAI_API_BASE=https://api.cohere.ai/compatibility/v1 aider --model openai/command-a-03-2025-quality` Date : 2025-03-14 Versions : 0.77.1.dev Seconds per case : 85.1 Total cost : 0.0
	Codestral 25.01	11.1%	$1.98	`aider --model mistral/codestral-latest`	100.0%	whole
Dirname : 2025-01-13-18-17-25--codestral-whole2 Test cases : 225 Model : Codestral 25.01 Edit format : whole Commit hash : 0cba898-dirty Pass rate 1 : 4.0 Pass rate 2 : 11.1 Pass num 1 : 9 Pass num 2 : 25 Percent cases well formed : 100.0 Error outputs : 0 Num malformed responses : 0 Num with malformed responses : 0 User asks : 47 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 4 Total tests : 225 Command : `aider --model mistral/codestral-latest` Date : 2025-01-13 Versions : 0.71.2.dev Seconds per case : 9.3 Total cost : 1.9834
	openhands-lm-32b-v0.1	10.2%		`aider --model openrouter/all-hands/openhands-lm-32b-v0.1`	95.1%	whole
Dirname : 2025-04-19-14-43-04--o4-mini-patch Test cases : 225 Model : openhands-lm-32b-v0.1 Edit format : whole Commit hash : c08336f Pass rate 1 : 4.0 Pass rate 2 : 10.2 Pass num 1 : 9 Pass num 2 : 23 Percent cases well formed : 95.1 Error outputs : 55 Num malformed responses : 41 Num with malformed responses : 11 User asks : 166 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 11 Total tests : 225 Command : `aider --model openrouter/all-hands/openhands-lm-32b-v0.1` Date : 2025-04-19 Versions : 0.82.2.dev Seconds per case : 195.6 Total cost : 0.0
	gpt-4.1-nano	8.9%	$0.43	`aider --model gpt-4.1-nano`	94.2%	whole
Dirname : 2025-04-14-22-46-01--gpt41nano-diff Test cases : 225 Model : gpt-4.1-nano Edit format : whole Commit hash : 71d1591-dirty Pass rate 1 : 3.1 Pass rate 2 : 8.9 Pass num 1 : 7 Pass num 2 : 20 Percent cases well formed : 94.2 Error outputs : 20 Num malformed responses : 20 Num with malformed responses : 13 User asks : 316 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 8 Total tests : 225 Command : `aider --model gpt-4.1-nano` Date : 2025-04-14 Versions : 0.81.4.dev Seconds per case : 12.0 Total cost : 0.4281
	Qwen2.5-Coder-32B-Instruct	8.0%		`aider --model openai/Qwen/Qwen2.5-Coder-32B-Instruct # via hyperbolic`	71.6%	diff
Dirname : 2024-12-22-13-22-32--polyglot-qwen-diff Test cases : 225 Model : Qwen2.5-Coder-32B-Instruct Edit format : diff Commit hash : 6d7e8be-dirty Pass rate 1 : 4.4 Pass rate 2 : 8.0 Pass num 1 : 10 Pass num 2 : 18 Percent cases well formed : 71.6 Error outputs : 158 Num malformed responses : 148 Num with malformed responses : 64 User asks : 132 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 2 Total tests : 225 Command : `aider --model openai/Qwen/Qwen2.5-Coder-32B-Instruct # via hyperbolic` Date : 2024-12-22 Versions : 0.69.2.dev Seconds per case : 84.4 Total cost : 0.0
	gemma-3-27b-it	4.9%		`aider --model openrouter/google/gemma-3-27b-it`	100.0%	whole
Dirname : 2025-03-15-01-21-24--gemma3-27b-or Test cases : 225 Model : gemma-3-27b-it Edit format : whole Commit hash : fd21f51-dirty Pass rate 1 : 1.8 Pass rate 2 : 4.9 Pass num 1 : 4 Pass num 2 : 11 Percent cases well formed : 100.0 Error outputs : 3 Num malformed responses : 0 Num with malformed responses : 0 User asks : 181 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 1 Test timeouts : 3 Total tests : 225 Command : `aider --model openrouter/google/gemma-3-27b-it` Date : 2025-03-15 Versions : 0.77.1.dev Seconds per case : 79.7 Total cost : 0.0
	gpt-4o-mini-2024-07-18	3.6%	$0.32	`aider --model gpt-4o-mini-2024-07-18`	100.0%	whole
Dirname : 2024-12-21-18-41-18--polyglot-gpt-4o-mini Test cases : 225 Model : gpt-4o-mini-2024-07-18 Edit format : whole Commit hash : a755079-dirty Pass rate 1 : 0.9 Pass rate 2 : 3.6 Pass num 1 : 2 Pass num 2 : 8 Percent cases well formed : 100.0 Error outputs : 0 Num malformed responses : 0 Num with malformed responses : 0 User asks : 36 Lazy comments : 0 Syntax errors : 0 Indentation errors : 0 Exhausted context windows : 0 Test timeouts : 3 Total tests : 225 Command : `aider --model gpt-4o-mini-2024-07-18` Date : 2024-12-21 Versions : 0.69.2.dev Seconds per case : 17.3 Total cost : 0.3236
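
The percentage columns in the table appear to follow directly from the raw counts in each detail row. The sketch below is a minimal illustration, not aider's own benchmark code: the function names are hypothetical, and the formula for the well-formed percentage is an assumption inferred from the published figures (it reproduces the values listed for several rows). It recomputes the gpt-4o-2024-08-06 entry from its counts.

```python
def pass_rate(pass_num: int, total_tests: int) -> float:
    """Share of exercises solved, as a percentage rounded to one decimal."""
    return round(100 * pass_num / total_tests, 1)


def percent_well_formed(num_with_malformed: int, total_tests: int) -> float:
    """Share of test cases with no malformed response (assumed formula)."""
    return round(100 * (1 - num_with_malformed / total_tests), 1)


# Counts copied from the gpt-4o-2024-08-06 detail row.
total_tests = 225
print(pass_rate(11, total_tests))            # Pass rate 1 -> 4.9
print(pass_rate(52, total_tests))            # Pass rate 2 -> 23.1
print(percent_well_formed(13, total_tests))  # well-formed -> 94.2
```

Because every run uses the same 225 test cases, a difference of one solved exercise moves the pass rate by roughly 0.44 percentage points, which is worth keeping in mind when comparing closely ranked models.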

By Paul Gauthier, last updated June 09, 2025.
