Solve — ranked installable tools for agent jobs-to-be-done

For every task an agent needs to do (OCR PDFs, parse Brazilian NFS-e, extract invoices, scrape web, etc.) — the best skill, MCP server, vendor API, or local binary, ranked on real-world corpora with reproducible evals.

Solve — what to install for the job

Your agent hit a capability gap. It needs a tool, not a cloud service. Which skill? Which MCP server? Which vendor API? Which local binary?

/solve/ answers that with reproducible evals on real-world corpora. Each task has:

A ranked list of installable candidates (skills, MCPs, APIs, local binaries)
A scorecard across five dimensions: word accuracy (where relevant), layout preservation, latency p50, cost per 10 runs, install friction
Exact install commands you can paste
Alternatives considered and dropped, with rationale (so you trust what made the cut)
A reproducible command — re-run the eval yourself

This is not a tool marketplace. ClawHub, PulseMCP, and Smithery distribute tools. /solve/ ranks them.

Available tasks

Task	Top pick	Corpus / sources	Last verified
pdf-text-extraction-mcp — extract text from any PDF	Surya	10 real-world docs: native-text PDFs, Brazilian NFS-e invoices, boletos, phone-photo receipts	2026-04-21
nfs-e-extraction — parse Brazilian NFS-e invoices into typed JSON (prestador, tomador, CNPJs, valor, ISS)	auxiliar-nfs-e + Surya	2-doc São Paulo corpus, 41/41 fields	2026-04-23
cnpj-enrichment-mcp — CNPJ → CNAE + regime tributário (Simples / MEI) tax-registry enrichment	auxiliar-cnpj-fetch	5 ranked sources (BrasilAPI, CNPJá, CNPJ.ws, ReceitaWS, auxiliar gateway cascade)	2026-04-29

More tasks ship as walkthroughs run. Each is driven by a real agent problem — not a generic benchmark.

Agent integration

Call /solve/ via the auxiliar-mcp MCP server — one claude mcp add away:

claude mcp add auxiliar -- npx auxiliar-mcp

Then your agent can query:

solve_task(task_slug="pdf-text-extraction-mcp")
# aliases resolve: "pdf", "ocr", "nfs-e", "boleto", "receipt-parsing", "bookkeeping-ocr", "invoice-extraction", "document-ai"

list_solve_tasks()
# discovers every /solve/ task with top pick + categories

The MCP response includes the answer, the full scorecard, alternatives considered, and a pointer to the human-readable page on auxiliar.ai.

Methodology

Each walkthrough follows a five-stage protocol: Discovery → Corpus + Ground Truth → Runner → Score → Publish. Ground truth is LLM-drafted, human-finalized. Scores are deterministic where possible (word accuracy via jiwer, token F1, latency, cost, install friction rubric) and use an LLM judge only for layout preservation. Full methodology: docs/proposals/agent-upgrade-engine.md (renamed solve-engine 2026-04-23).
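The deterministic text metrics can be sketched in a few lines. The walkthroughs compute word accuracy with jiwer; the version below is a dependency-free stand-in, assuming word accuracy means 1 - WER (word-level edit distance divided by reference length) and token F1 is a multiset overlap on whitespace tokens. The function names and the tokenization are illustrative assumptions, not the project's actual runner.

```python
from collections import Counter


def word_edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance over word tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - WER, clamped at 0."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 1.0 if not hyp else 0.0
    wer = word_edit_distance(ref, hyp) / len(ref)
    return max(0.0, 1.0 - wer)


def token_f1(reference: str, hypothesis: str) -> float:
    """Harmonic mean of token precision and recall (multiset overlap)."""
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Clamping at zero matters because WER can exceed 1 when the OCR output contains many spurious words, which would otherwise yield a negative accuracy.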

Why this exists

The standard agent workflow when hitting a capability gap: ask the LLM, Google “best X for Y”, read marketing blog posts, install something, hope it works. The result: uncalibrated recommendations, outdated data, unmeasured accuracy.

/solve/ closes that loop by running the eval once per task, end-to-end, against real documents — and then exposing the result to agents via MCP, CLI, and SEO-indexed Hugo content so they can find it however they search.

All ranked tasks

CNPJ → CNAE + regime tributário (Simples / MEI) — tax-registry enrichment for Brazilian bookkeeping agents, ranked — top pick: auxiliar-cnpj-fetch (verified 2026-04-29)
Tax-registry enrichment for Brazilian bookkeeping agents: given a list of CNPJs (e.g., prestador CNPJs from NFS-e invoices), get back CNAE primary + secondary, regime tributário (Simples Nacional + MEI flags), razão social, situação cadastral, full address, and QSA. Top pick: auxiliar-cnpj-fetch, which you call directly with no install and no token, just a curl POST to https://api.auxiliar.ai/api/invoke/fetch_cnpj. It uses a multi-provider cascade (BrasilAPI → CNPJ.ws) for resilience; the same gateway is available as an MCP tool when your host speaks MCP.
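The cascade idea (try BrasilAPI first, fall back to CNPJ.ws) can be sketched as below. This illustrates the pattern only and is not the gateway's code: `fetch_with_cascade`, the provider callables, and the record fields are all hypothetical, and the stubs stand in for real HTTP clients.

```python
from typing import Callable

# Hypothetical provider signature: takes a CNPJ, returns a record or raises.
Provider = Callable[[str], dict]


def fetch_with_cascade(cnpj: str, providers: list[tuple[str, Provider]]) -> dict:
    """Try each provider in order and return the first successful record."""
    errors: list[str] = []
    for name, fetch in providers:
        try:
            record = fetch(cnpj)
        except Exception as exc:  # any provider failure moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        return {"source": name, **record}
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers (assumed names and fields) simulating an outage and a success.
def brasilapi_stub(cnpj: str) -> dict:
    raise TimeoutError("simulated outage")


def cnpjws_stub(cnpj: str) -> dict:
    return {"cnpj": cnpj, "razao_social": "EXEMPLO LTDA"}


result = fetch_with_cascade(
    "00000000000191",
    [("brasilapi", brasilapi_stub), ("cnpj.ws", cnpjws_stub)],
)
```

Recording which provider answered (the `source` key here) keeps the result auditable when providers disagree on a field.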
How to choose an Open Finance API in Brazil: an honest guide by use case (verified 2026-05-10)
How to choose a Brazilian Open Finance API for a financial product or AI agent. We do not pick a winner; we route by use case. Per-user AI agent → Cumbuca MCP. B2B SaaS / fintech aggregation → Pluggy, Belvo, Klavi. Lending / credit-decisioning → Klavi. Multi-country LatAm → Belvo. OF + payment initiation (Pix automático) → Quanto. Full BaaS + Pix + OF stack → Celcoin.
AI bank reconciliation in Brazil: what to install and the agent recipe - top pick: cumbuca-of-data-mcp (verified 2026-05-10)
How an AI agent reconciles checking-account and credit-card transactions against an external source of truth: an accounting spreadsheet, a list of expected receipts, or issued invoices. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe: monthly windows to work around the Bacen pagination cap, matching on amount + date + counterparty (CPF/CNPJ), and mismatches routed to manual review. It operates on the normalized Bacen spec and is data-source-agnostic.
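A minimal sketch of the two mechanical pieces of that recipe: splitting a date range into monthly windows, and matching on amount + date + counterparty. The `Entry` fields, the two-day date tolerance, and the helper names are assumptions for illustration; the actual recipe is on the task page.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Entry:
    amount_cents: int
    day: date
    counterparty: str  # CPF or CNPJ digits (assumed field)


def monthly_windows(start: date, end: date) -> list[tuple[date, date]]:
    """Split [start, end] into calendar-month windows with inclusive bounds."""
    windows = []
    cursor = start
    while cursor <= end:
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        windows.append((cursor, min(end, next_month - timedelta(days=1))))
        cursor = next_month
    return windows


def reconcile(bank: list[Entry], expected: list[Entry], tolerance_days: int = 2):
    """Return (matched pairs, unmatched bank entries, unmatched expected entries)."""
    remaining = list(expected)
    matched, unmatched_bank = [], []
    for tx in bank:
        hit = next(
            (e for e in remaining
             if e.amount_cents == tx.amount_cents
             and e.counterparty == tx.counterparty
             and abs((e.day - tx.day).days) <= tolerance_days),
            None,
        )
        if hit is None:
            unmatched_bank.append(tx)  # divergence: goes to manual review
        else:
            matched.append((tx, hit))
            remaining.remove(hit)  # each expected entry matches at most once
    return matched, unmatched_bank, remaining
```

Amounts are kept as integer cents so that equality checks are exact, which a float comparison would not guarantee.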
Find forgotten subscriptions in your Brazilian credit card — what to install, with the recipe — top pick: cumbuca-of-data-mcp (verified 2026-05-07)
How an AI agent audits a Brazilian user's credit-card and account transactions for recurring charges. Install Cumbuca's Open Finance Data MCP, authorize through your bank (CPF + biometric), and run a deterministic clustering recipe: merchant normalization + median-interval cadence detection + coefficient-of-variation amount tolerance + recency-based status classification. Production-ready as of May 2026 within Cumbuca's MVP scope (statements + credit-card transactions, single account, ~5 queries/day, BR banks only).
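The four steps of that clustering recipe can be sketched as follows. Every threshold here (a 0.15 coefficient-of-variation cap, at least three charges, "active" meaning the last charge is within 1.5 cadences of today) and the normalization rule are assumptions chosen for illustration, not the published recipe's values. Amounts are assumed to be positive.

```python
import re
from collections import defaultdict
from datetime import date
from statistics import mean, median, pstdev


def normalize_merchant(raw: str) -> str:
    """Uppercase, drop digits and punctuation, collapse whitespace (assumed rule)."""
    cleaned = re.sub(r"[^A-Z ]+", " ", raw.upper())
    return " ".join(cleaned.split())


def find_subscriptions(transactions, today, max_cv=0.15, min_charges=3):
    """transactions: iterable of (merchant, day, amount) tuples."""
    groups = defaultdict(list)
    for merchant, day, amount in transactions:
        groups[normalize_merchant(merchant)].append((day, amount))

    results = []
    for merchant, charges in groups.items():
        if len(charges) < min_charges:
            continue
        charges.sort()
        days = [d for d, _ in charges]
        amounts = [a for _, a in charges]
        intervals = [(b - a).days for a, b in zip(days, days[1:])]
        cadence = median(intervals)           # median-interval cadence
        cv = pstdev(amounts) / mean(amounts)  # coefficient of variation
        if cadence == 0 or cv > max_cv:
            continue
        age = (today - days[-1]).days         # recency of the last charge
        status = "active" if age <= 1.5 * cadence else "lapsed"
        results.append({"merchant": merchant, "cadence_days": cadence,
                        "status": status})
    return results
```

Using the median interval instead of the mean keeps one skipped or late charge from distorting the detected cadence.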
NFS-e field extraction for agents — ranked by field accuracy on Brazilian São Paulo invoices — top pick: auxiliar-nfs-e + Surya (verified 2026-04-23)
Structured-field NFS-e parser for Brazilian agents. 100% field accuracy on São Paulo invoices when paired with Surya OCR (41/41 fields across 2-doc corpus). Also scored: Google Document AI (88%), Tesseract (63%). Outputs typed JSON with prestador, tomador, CNPJs, valor, ISS, código de serviço, and RPS.
PDF text extraction for Claude Code agents — what to install, ranked by accuracy — top pick: surya (verified 2026-04-21)
Ranked installable OCR tools for Claude Code / Cursor / Claude Desktop / OpenClaw agents parsing PDFs, Brazilian NFS-e invoices, boletos, and phone-photo receipts. Surya leads on word accuracy (76.9%) on a 10-document real-world corpus. Tesseract 5 runs 14× faster. Google Document AI wins on mobile-captured receipts.
Where did my money go last month? What to install and the agent recipe - top pick: cumbuca-of-data-mcp (verified 2026-05-10)
How an AI agent answers 'pra onde foi meu dinheiro mês passado?' ('where did my money go last month?') in Brazilian Portuguese. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe (MCC rules + description heuristics) to categorize card and account transactions, aggregate by category, compare with the previous month, and list the 10 largest. No ML, no external dictionary, data-source-agnostic over the normalized Bacen shape.
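A rule-based categorizer of this kind can be sketched in plain Python. The three MCC entries, the keyword list, the transaction dict keys (`mcc`, `description`, `amount`), and the function names are illustrative assumptions; the real rule tables are on the task page.

```python
from collections import defaultdict

# Illustrative rules only: a tiny subset chosen for the example.
MCC_RULES = {"5411": "groceries", "5812": "restaurants", "4121": "transport"}
KEYWORD_RULES = [("UBER", "transport"), ("IFOOD", "restaurants"),
                 ("MERCADO", "groceries")]


def categorize(tx: dict) -> str:
    """MCC rule first, then description keyword, else 'other'."""
    category = MCC_RULES.get(tx.get("mcc") or "")
    if category:
        return category
    description = tx.get("description", "").upper()
    for keyword, cat in KEYWORD_RULES:
        if keyword in description:
            return cat
    return "other"


def totals_by_category(transactions: list[dict]) -> dict[str, float]:
    """Sum transaction amounts per category."""
    totals: dict[str, float] = defaultdict(float)
    for tx in transactions:
        totals[categorize(tx)] += tx["amount"]
    return dict(totals)


def monthly_report(current: list[dict], previous: list[dict], top_n: int = 10) -> dict:
    """Totals for the current month, change vs. the previous month, largest items."""
    now, before = totals_by_category(current), totals_by_category(previous)
    delta = {c: now.get(c, 0.0) - before.get(c, 0.0) for c in set(now) | set(before)}
    largest = sorted(current, key=lambda tx: tx["amount"], reverse=True)[:top_n]
    return {"totals": now, "delta_vs_previous": delta, "largest": largest}
```

Checking the MCC before the description means account transactions, which often carry no MCC, fall through to the keyword heuristics.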
Print an agent-native CLI for any API — Printing Press, with the recipe — top pick: printing-press (verified 2026-05-09)
How a developer gives an AI agent a token-efficient CLI for any API. Install Printing Press, run /printing-press <api>, and the generator emits a Go CLI + Claude Code skill + OpenClaw skill + MCP server — sharing one local SQLite mirror with FTS5 search, compound commands, and 60–80% token compression via --compact. Plus when to use this vs. a hand-built MCP.