Solve — ranked installable tools for agent jobs-to-be-done
For every task an agent needs to do (OCR PDFs, parse Brazilian NFS-e, extract invoices, scrape web, etc.) — the best skill, MCP server, vendor API, or local binary, ranked on real-world corpora with reproducible evals.
Solve — what to install for the job
Your agent hit a capability gap. It needs a tool, not a cloud service. Which skill? Which MCP server? Which vendor API? Which local binary?
/solve/ answers that with reproducible evals on real-world corpora. Each task has:
- A ranked list of installable candidates (skills, MCPs, APIs, local binaries)
- A scorecard across five dimensions: word accuracy (where relevant), layout preservation, latency p50, cost per 10 runs, install friction
- Exact install commands you can paste
- Alternatives considered and dropped, with rationale (so you trust what made the cut)
- A reproducible command — re-run the eval yourself
This is not a tool marketplace. ClawHub, PulseMCP, and Smithery distribute tools. /solve/ ranks them.
Available tasks
| Task | Top pick | Corpus / sources | Last verified |
|---|---|---|---|
| pdf-text-extraction-mcp — extract text from any PDF | Surya | 10 real-world docs: native-text PDFs, Brazilian NFS-e invoices, boletos, phone-photo receipts | 2026-04-21 |
| nfs-e-extraction — parse Brazilian NFS-e invoices into typed JSON (prestador, tomador, CNPJs, valor, ISS) | auxiliar-nfs-e + Surya | 2-doc São Paulo corpus, 41/41 fields | 2026-04-23 |
| cnpj-enrichment-mcp — CNPJ → CNAE + regime tributário (Simples / MEI) tax-registry enrichment | auxiliar-cnpj-fetch | 5 ranked sources (BrasilAPI, CNPJá, CNPJ.ws, ReceitaWS, auxiliar gateway cascade) | 2026-04-29 |
More tasks ship as walkthroughs run. Each is driven by a real agent problem — not a generic benchmark.
Agent integration
Call /solve/ via the auxiliar-mcp MCP server — one claude mcp add away:
claude mcp add auxiliar -- npx auxiliar-mcp
Then your agent can query:
solve_task(task_slug="pdf-text-extraction-mcp")
# aliases resolve: "pdf", "ocr", "nfs-e", "boleto", "receipt-parsing", "bookkeeping-ocr", "invoice-extraction", "document-ai"
list_solve_tasks()
# discovers every /solve/ task with top pick + categories
The MCP response includes the answer, the full scorecard, alternatives considered, and a pointer to the human-readable page on auxiliar.ai.
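For MCP hosts that are configured through a JSON file rather than `claude mcp add`, the same server can usually be declared under an `mcpServers` key. The sketch below only builds and prints that entry. It assumes your host follows the common `mcpServers` / `command` / `args` convention; the helper name `mcp_server_entry` is illustrative, and the file location varies by host.

```python
import json


def mcp_server_entry(name: str, command: str, args: list[str]) -> dict:
    """Build a config fragment in the common `mcpServers` JSON convention."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}


# Mirrors `claude mcp add auxiliar -- npx auxiliar-mcp` from the text above.
config = mcp_server_entry("auxiliar", "npx", ["auxiliar-mcp"])

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
```

Merge the printed fragment into whatever config file your host reads, rather than overwriting the file.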
Methodology
Each walkthrough follows a five-stage protocol: Discovery → Corpus + Ground Truth → Runner → Score → Publish. Ground truth is LLM-drafted, human-finalized. Scores are deterministic where possible (word accuracy via jiwer, token F1, latency, cost, install friction rubric) and use an LLM judge only for layout preservation. Full methodology: docs/proposals/agent-upgrade-engine.md (renamed solve-engine 2026-04-23).
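The deterministic text metrics named above can be approximated in a few lines. The sketch below is a pure-Python stand-in, not the project's scorer: it assumes word accuracy means 1 − WER over whitespace-split tokens (the source computes it via jiwer) and that token F1 is the harmonic mean of bag-of-tokens precision and recall. All function names are illustrative.

```python
from collections import Counter


def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Word-level Levenshtein distance: substitutions + deletions + insertions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - WER, clamped at 0 when errors exceed the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - word_errors(ref, hyp) / len(ref))


def token_f1(reference: str, hypothesis: str) -> float:
    """Bag-of-tokens F1: order is ignored, duplicate tokens are counted."""
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Word accuracy penalizes reordering and insertions, while token F1 does not, which is one reason a scorecard might report both.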
Why this exists
The standard agent workflow when hitting a capability gap: ask the LLM, Google “best X for Y”, read marketing blog posts, install something, hope it works. The result: uncalibrated recommendations, outdated data, unmeasured accuracy.
/solve/ closes that loop by running the eval once per task, end-to-end, against real documents — and then exposing the result to agents via MCP, CLI, and SEO-indexed Hugo content so they can find it however they search.
All ranked tasks
- CNPJ → CNAE + regime tributário (Simples / MEI): tax-registry enrichment for Brazilian bookkeeping agents, ranked. Top pick: auxiliar-cnpj-fetch (verified 2026-04-29)
  Tax-registry enrichment for Brazilian bookkeeping agents: given a list of CNPJs (e.g., prestador CNPJs from NFS-e invoices), get back CNAE primary + secondary, regime tributário (Simples Nacional + MEI flags), razão social, situação cadastral, full address, and QSA. Top pick: auxiliar-cnpj-fetch. Call it directly with no install and no token, just a curl POST to https://api.auxiliar.ai/api/invoke/fetch_cnpj. Multi-provider cascade (BrasilAPI → CNPJ.ws) for resilience; the same gateway is available as an MCP tool when your host speaks MCP.
- How to choose an Open Finance API in Brazil: an honest guide by use case (verified 2026-05-10)
  How to choose a Brazilian Open Finance API for a financial product or AI agent. We do not pick a winner; we route by use case. Per-user AI agent → Cumbuca MCP. B2B SaaS / fintech aggregation → Pluggy, Belvo, Klavi. Lending / credit-decisioning → Klavi. Multi-country LatAm → Belvo. OF + payment initiation (Pix automático) → Quanto. Full BaaS + Pix + OF stack → Celcoin.
- AI bank reconciliation in Brazil: what to install, with the agent recipe. Top pick: cumbuca-of-data-mcp (verified 2026-05-10)
  How an AI agent reconciles checking-account and credit-card transactions against an external source of truth: an accounting spreadsheet, a list of expected receivables, or issued invoices. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe: monthly windows to work around the Bacen pagination cap, matching on amount + date + counterparty (CPF/CNPJ), and discrepancies routed to manual review. Operates on the normalized Bacen spec and is data-source-agnostic.
- Find forgotten subscriptions in your Brazilian credit card: what to install, with the recipe. Top pick: cumbuca-of-data-mcp (verified 2026-05-07)
  How an AI agent audits a Brazilian user's credit-card and account transactions for recurring charges. Install Cumbuca's Open Finance Data MCP, authorize through your bank (CPF + biometric), and run a deterministic clustering recipe: merchant normalization + median-interval cadence detection + coefficient-of-variation amount tolerance + recency-based status classification. Production-ready as of May 2026 within Cumbuca's MVP scope (statements + credit-card transactions, single account, ~5 queries/day, BR banks only).
- NFS-e field extraction for agents, ranked by field accuracy on São Paulo invoices. Top pick: auxiliar-nfs-e + Surya (verified 2026-04-23)
  Structured-field NFS-e parser for Brazilian agents. 100% field accuracy on São Paulo invoices when paired with Surya OCR (41/41 fields across a 2-doc corpus). Also scored: Google Document AI (88%), Tesseract (63%). Outputs typed JSON with prestador, tomador, CNPJs, valor, ISS, código de serviço, and RPS.
- PDF text extraction for Claude Code agents: what to install, ranked by accuracy. Top pick: surya (verified 2026-04-21)
  Ranked installable OCR tools for Claude Code / Cursor / Claude Desktop / OpenClaw agents parsing PDFs, Brazilian NFS-e invoices, boletos, and phone-photo receipts. Surya leads on word accuracy (76.9%) on a 10-document real-world corpus. Tesseract 5 runs 14× faster. Google Document AI wins on mobile-captured receipts.
- "Where did my money go last month?": what to install, with the agent recipe. Top pick: cumbuca-of-data-mcp (verified 2026-05-10)
  How an AI agent answers 'pra onde foi meu dinheiro mês passado?' ("where did my money go last month?") in Brazilian Portuguese. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe (MCC rules + description heuristics) to categorize card and account transactions, aggregate by category, compare with the previous month, and list the 10 largest. No ML, no external dictionary, data-source-agnostic over the normalized Bacen shape.
- Print an agent-native CLI for any API: Printing Press, with the recipe. Top pick: printing-press (verified 2026-05-09)
  How a developer gives an AI agent a token-efficient CLI for any API. Install Printing Press, run /printing-press <api>, and the generator emits a Go CLI + Claude Code skill + OpenClaw skill + MCP server, all sharing one local SQLite mirror with FTS5 search, compound commands, and 60–80% token compression via --compact. Also covers when to use this vs. a hand-built MCP.
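The recurring-charge recipe named in the subscriptions entry above (merchant normalization, median-interval cadence, coefficient-of-variation amount tolerance, recency-based status) can be sketched roughly as follows. This is an illustration, not the published recipe: the function names, the `(date, description, amount)` input shape, and every threshold (`max_cv=0.1`, `min_charges=3`, the 1.5× cadence cutoff) are assumptions chosen for the example.

```python
import re
from collections import defaultdict
from datetime import date
from statistics import mean, median, pstdev


def normalize_merchant(description: str) -> str:
    """Uppercase, drop digits and punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[\W\d_]+", " ", description.upper()).split())


def find_recurring(transactions, today: date, max_cv: float = 0.1, min_charges: int = 3):
    """transactions: iterable of (date, description, amount) tuples (assumed shape)."""
    by_merchant = defaultdict(list)
    for when, description, amount in transactions:
        by_merchant[normalize_merchant(description)].append((when, abs(amount)))

    results = []
    for merchant, charges in sorted(by_merchant.items()):
        if len(charges) < min_charges:
            continue
        charges.sort()
        dates = [d for d, _ in charges]
        amounts = [a for _, a in charges]
        # Cadence: median gap in days between consecutive charges.
        cadence = median((b - a).days for a, b in zip(dates, dates[1:]))
        avg = mean(amounts)
        if cadence <= 0 or avg == 0:
            continue
        # Amount tolerance: coefficient of variation = std dev / mean.
        if pstdev(amounts) / avg > max_cv:
            continue
        # Recency: still active if the last charge is within 1.5 cadences.
        days_since_last = (today - dates[-1]).days
        status = "active" if days_since_last <= 1.5 * cadence else "lapsed"
        results.append({"merchant": merchant, "cadence_days": cadence, "status": status})
    return results
```

Using the median gap rather than the mean keeps one skipped or late charge from distorting the detected cadence.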