Solve — ranked installable tools for agent jobs-to-be-done

For every task an agent needs to do (OCR PDFs, parse Brazilian NFS-e, extract invoices, scrape web, etc.) — the best skill, MCP server, vendor API, or local binary, ranked on real-world corpora with reproducible evals.

Solve — what to install for the job

Your agent hit a capability gap. It needs a tool, not a cloud service. Which skill? Which MCP server? Which vendor API? Which local binary?

/solve/ answers that with reproducible evals on real-world corpora. Each task has:

  • A ranked list of installable candidates (skills, MCPs, APIs, local binaries)
  • A scorecard across five dimensions: word accuracy (where relevant), layout preservation, latency p50, cost per 10 runs, install friction
  • Exact install commands you can paste
  • Alternatives considered and dropped, with rationale (so you trust what made the cut)
  • A reproducible command — re-run the eval yourself
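
One way to picture the scorecard is as a small record per candidate. The sketch below is illustrative only: the field names, the units (seconds, USD), the integer friction rubric, and the ordering rule in `rank` are all assumptions, not the schema /solve/ actually uses, and the sample numbers are made up.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Scorecard:
    """One candidate's scores on the five dimensions (hypothetical field names)."""
    candidate: str
    word_accuracy: Optional[float]  # 0..1, None where the task has no text output
    layout_preservation: float      # 0..1
    latency_p50_s: float            # median seconds per run (assumed unit)
    cost_per_10_runs_usd: float     # assumed currency
    install_friction: int           # assumed rubric: lower means easier to install


def latency_p50(samples_s: list[float]) -> float:
    """p50 latency is the median of the per-run timings."""
    return median(samples_s)


def rank(cards: list[Scorecard]) -> list[Scorecard]:
    """Assumed ordering: higher word accuracy first, lower latency breaks ties."""
    return sorted(cards, key=lambda c: (-(c.word_accuracy or 0.0), c.latency_p50_s))


# Made-up candidates and numbers, for illustration only.
cards = [
    Scorecard("tool-a", 0.70, 0.8, latency_p50([1.0, 3.0, 2.0]), 0.0, 1),
    Scorecard("tool-b", 0.90, 0.9, latency_p50([20.0, 30.0, 25.0]), 0.0, 2),
]
print([c.candidate for c in rank(cards)])
```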

This is not a tool marketplace. ClawHub, PulseMCP, and Smithery distribute tools. /solve/ ranks them.

Available tasks

Task | Top pick | Corpus / sources | Last verified
pdf-text-extraction-mcp — extract text from any PDF | Surya | 10 real-world docs: native-text PDFs, Brazilian NFS-e invoices, boletos, phone-photo receipts | 2026-04-21
nfs-e-extraction — parse Brazilian NFS-e invoices into typed JSON (prestador, tomador, CNPJs, valor, ISS) | auxiliar-nfs-e + Surya | 2-doc São Paulo corpus, 41/41 fields | 2026-04-23
cnpj-enrichment-mcp — CNPJ → CNAE + regime tributário (Simples / MEI) tax-registry enrichment | auxiliar-cnpj-fetch | 5 ranked sources (BrasilAPI, CNPJá, CNPJ.ws, ReceitaWS, auxiliar gateway cascade) | 2026-04-29

More tasks ship as walkthroughs run. Each is driven by a real agent problem — not a generic benchmark.

Agent integration

Call /solve/ via the auxiliar-mcp MCP server — one claude mcp add away:

claude mcp add auxiliar -- npx auxiliar-mcp

Then your agent can query:

solve_task(task_slug="pdf-text-extraction-mcp")
# aliases resolve: "pdf", "ocr", "nfs-e", "boleto", "receipt-parsing", "bookkeeping-ocr", "invoice-extraction", "document-ai"

list_solve_tasks()
# discovers every /solve/ task with top pick + categories

The MCP response includes the answer, the full scorecard, alternatives considered, and a pointer to the human-readable page on auxiliar.ai.
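
An MCP host normally builds the wire message for you, but it can help to see what a call to `solve_task` looks like underneath. The sketch below serializes a `tools/call` request in the JSON-RPC 2.0 shape that the MCP specification defines. The tool name and the `task_slug` argument come from the example above; the request id and the helper name `tools_call_request` are arbitrary choices for illustration, and the response shape is deliberately not modeled.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,  # arbitrary id chosen by the client
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


msg = tools_call_request(1, "solve_task", {"task_slug": "pdf-text-extraction-mcp"})
print(msg)
```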

Methodology

Each walkthrough follows a five-stage protocol: Discovery → Corpus + Ground Truth → Runner → Score → Publish. Ground truth is LLM-drafted, human-finalized. Scores are deterministic where possible (word accuracy via jiwer, token F1, latency, cost, install friction rubric) and use an LLM judge only for layout preservation. Full methodology: docs/proposals/agent-upgrade-engine.md (renamed solve-engine 2026-04-23).
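
To make the deterministic scores concrete, here is a minimal sketch of the two text metrics. The walkthroughs compute word accuracy with jiwer; this version reimplements word-level edit distance with the standard library only so it runs without dependencies. Treating word accuracy as 1 minus the word error rate (clamped at 0), splitting tokens on whitespace, and the SQuAD-style bag-of-tokens F1 are assumptions about the exact definitions, not confirmed details of the runner.

```python
from collections import Counter


def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Word-level Levenshtein distance: substitutions + deletions + insertions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - WER, clamped to [0, 1]."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - word_errors(ref, hyp) / len(ref))


def token_f1(reference: str, hypothesis: str) -> float:
    """Harmonic mean of token precision and recall over multiset overlap."""
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

With one substituted word out of four, both `word_accuracy("a b c d", "a x c d")` and `token_f1("a b c d", "a x c d")` come out at 0.75.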

Why this exists

The standard agent workflow when hitting a capability gap: ask the LLM, Google “best X for Y”, read marketing blog posts, install something, hope it works. The result: uncalibrated recommendations, outdated data, unmeasured accuracy.

/solve/ closes that loop by running the eval once per task, end-to-end, against real documents — and then exposing the result to agents via MCP, CLI, and SEO-indexed Hugo content so they can find it however they search.

All ranked tasks

  • CNPJ → CNAE + regime tributário (Simples / MEI) — tax-registry enrichment for Brazilian bookkeeping agents, ranked — top pick: auxiliar-cnpj-fetch (verified 2026-04-29)
    Tax-registry enrichment for Brazilian bookkeeping agents — given a list of CNPJs (e.g., prestador CNPJs from NFS-e invoices), get back CNAE primary + secondary, regime tributário (Simples Nacional + MEI flags), razão social, situação cadastral, full address, and QSA. Top pick: auxiliar-cnpj-fetch — call directly with no install, no token, just curl POST to https://api.auxiliar.ai/api/invoke/fetch_cnpj. Multi-provider cascade (BrasilAPI → CNPJ.ws) for resilience; same gateway available as an MCP tool when your host speaks MCP.
  • How to choose an Open Finance API in Brazil — an honest guide by use case (verified 2026-05-10)
    How to choose a Brazilian Open Finance API for a financial product or AI agent. We do not pick a winner; we route by use case. Per-user AI agent → Cumbuca MCP. B2B SaaS / fintech aggregation → Pluggy, Belvo, Klavi. Lending / credit-decisioning → Klavi. Multi-country LatAm → Belvo. OF + payment initiation (Pix automático) → Quanto. Full BaaS + Pix + OF stack → Celcoin.
  • Bank reconciliation with AI in Brazil — what to install and the agent recipe — top pick: cumbuca-of-data-mcp (verified 2026-05-10)
    How an AI agent reconciles checking-account and credit-card transactions against an external source of truth: an accounting spreadsheet, a list of expected receipts, or issued invoices. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe: monthly windows to work around the Bacen pagination cap, matching on amount + date + counterparty (CPF/CNPJ), and discrepancies routed to manual review. It operates on the normalized Bacen-spec shape, so it is data-source-agnostic.
  • Find forgotten subscriptions on your Brazilian credit card — what to install, with the recipe — top pick: cumbuca-of-data-mcp (verified 2026-05-07)
    How an AI agent audits a Brazilian user's credit-card and account transactions for recurring charges. Install Cumbuca's Open Finance Data MCP, authorize through your bank (CPF + biometric), and run a deterministic clustering recipe: merchant normalization + median-interval cadence detection + coefficient-of-variation amount tolerance + recency-based status classification. Production-ready as of May 2026 within Cumbuca's MVP scope (statements + credit-card transactions, single account, ~5 queries/day, BR banks only).
  • NFS-e field extraction for agents — ranked by field accuracy on Brazilian São Paulo invoices — top pick: auxiliar-nfs-e + Surya (verified 2026-04-23)
    Structured-field NFS-e parser for Brazilian agents. 100% field accuracy on São Paulo invoices when paired with Surya OCR (41/41 fields across 2-doc corpus). Also scored: Google Document AI (88%), Tesseract (63%). Outputs typed JSON with prestador, tomador, CNPJs, valor, ISS, código de serviço, and RPS.
  • PDF text extraction for Claude Code agents — what to install, ranked by accuracy — top pick: surya (verified 2026-04-21)
    Ranked installable OCR tools for Claude Code / Cursor / Claude Desktop / OpenClaw agents parsing PDFs, Brazilian NFS-e invoices, boletos, and phone-photo receipts. Surya leads on word accuracy (76.9%) on a 10-document real-world corpus. Tesseract 5 runs 14× faster. Google Document AI wins on mobile-captured receipts.
  • Where did my money go last month? — what to install and the agent recipe — top pick: cumbuca-of-data-mcp (verified 2026-05-10)
    How an AI agent answers 'pra onde foi meu dinheiro mês passado?' ('where did my money go last month?') in Brazilian Portuguese. Install the Cumbuca Open Finance Data MCP, have the user authorize a bank via Open Finance (CPF + biometrics), and the agent runs a deterministic recipe (MCC rules + description heuristics) to categorize card and account transactions, aggregate by category, compare with the previous month, and list the 10 largest. No ML, no external dictionary, and data-source-agnostic over the normalized Bacen shape.
  • Print an agent-native CLI for any API — Printing Press, with the recipe — top pick: printing-press (verified 2026-05-09)
    How a developer gives an AI agent a token-efficient CLI for any API. Install Printing Press, run /printing-press <api>, and the generator emits a Go CLI + Claude Code skill + OpenClaw skill + MCP server — sharing one local SQLite mirror with FTS5 search, compound commands, and 60–80% token compression via --compact. Plus when to use this vs. a hand-built MCP.
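
The forgotten-subscriptions entry in the list above names four steps: merchant normalization, median-interval cadence detection, coefficient-of-variation amount tolerance, and recency-based status. A minimal sketch of how those steps could fit together follows. It is not the published recipe: the normalization rule, the 25 to 35 day "monthly" window, the 0.15 CV threshold, the 1.5× recency factor, and every function name are assumptions chosen for illustration, and the sample charges are made up.

```python
from datetime import date
from statistics import mean, median, pstdev


def normalize_merchant(raw: str) -> str:
    """Crude merchant key (assumed rule): uppercase, letters only, first two words."""
    words = "".join(ch if ch.isalpha() else " " for ch in raw.upper()).split()
    return " ".join(words[:2])


def classify(charges: list[tuple[date, float]], today: date,
             max_cv: float = 0.15, min_charges: int = 3) -> str:
    """Return 'active', 'lapsed', or 'not-recurring' for one merchant's charges."""
    if len(charges) < min_charges:
        return "not-recurring"
    charges = sorted(charges)
    amounts = [amount for _, amount in charges]
    if mean(amounts) <= 0:
        return "not-recurring"
    # Cadence: median gap in days between consecutive charges.
    gaps = [(b[0] - a[0]).days for a, b in zip(charges, charges[1:])]
    cadence = median(gaps)
    # Amount tolerance: coefficient of variation (std dev / mean).
    cv = pstdev(amounts) / mean(amounts)
    if cv > max_cv or not 25 <= cadence <= 35:  # assumed monthly window
        return "not-recurring"
    # Recency: still active if the last charge is within 1.5 cadences of today.
    days_since_last = (today - charges[-1][0]).days
    return "active" if days_since_last <= 1.5 * cadence else "lapsed"


# Made-up charges for a single normalized merchant.
sample = [
    (date(2026, 1, 5), 39.90),
    (date(2026, 2, 5), 39.90),
    (date(2026, 3, 5), 39.90),
    (date(2026, 4, 6), 44.90),
]
print(normalize_merchant("netflix.com 1234"), classify(sample, date(2026, 4, 20)))
```

Using the median gap rather than the mean keeps one late or skipped charge from shifting the detected cadence, and the CV tolerance lets a small price increase (as in the last sample charge) still count as the same subscription.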