LOCO-Agent
Open source for the agent spend era

Cost firewall for AI agents.

LOCO-Agent is the open-source scheduler, budget circuit breaker, and cost attribution layer for teams running agentic AI in production. Wrap the calls you already make. See who spent what. Decide who gets the next expensive slot.

$ pip install loco-agent
$ loco doctor
found: anthropic, openai, google-adk, langchain
suggested: shared scheduler with capacity=3

$ LOCO_LOG=pretty python production_agents.py
[ENQUEUE] security model=opus   team=soc     workflow=incident-review
[GRANT]   security score=0.91   budget=critical-path remaining
[WAIT]    growth   model=sonnet queue=17.0   reason=capacity
[FLAG]    support  mode=downgrade budget=exceeded
[ATTR]    soc      workflow=incident-review model=opus cost=5.0

$ python - <<'PY'
print(scheduler.metrics.attribution.cost_by_team())
PY
{'soc': 91.0, 'support': 38.0, 'growth': 19.0}
7 framework adapters
4D: team, workflow, model, agent
0 required core deps
486 tests on GitHub main

Built for builders who read the invoice.

Every agent framework makes it easier to call a model. LOCO is for the moment after that: when the demo becomes a system, traffic spikes, premium models get expensive, and someone asks where the tokens went.

01 / Own the scheduler

No black-box traffic cop.

LOCO is a Python library you run in your app. The load equation is documented, deterministic, testable, and small enough to understand.

02 / Label the work

Cost starts at the task.

Attach team, workflow, model, session, tenant, and outcome metadata where the agent actually does work.
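To make the labeling idea concrete, here is a minimal standalone sketch of how labeled cost records could roll up by any one dimension. It does not use LOCO's API: the `TaskLabel` class, the `cost_by` helper, and the sample workflow names are hypothetical, chosen only to mirror the metadata fields named above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskLabel:
    """Hypothetical label set; fields mirror the metadata named above."""
    team: str
    workflow: str
    model: str
    session: str = ""
    tenant: str = ""
    outcome: str = ""


def cost_by(records, dimension):
    """Sum cost over (label, cost) pairs, grouped by one label field."""
    totals = defaultdict(float)
    for label, cost in records:
        totals[getattr(label, dimension)] += cost
    return dict(totals)


# Illustrative data only, not real measurements.
records = [
    (TaskLabel("soc", "incident-review", "opus"), 5.0),
    (TaskLabel("support", "ticket-triage", "sonnet"), 1.5),
    (TaskLabel("soc", "incident-review", "sonnet"), 0.5),
]

print(cost_by(records, "team"))
print(cost_by(records, "model"))
```

Because the label travels with each record, the same data answers the per-team question and the per-model question without a second pass over the logs.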

03 / Govern before spend

Budgets are runtime policy.

Set limits per agent, team, or tenant, then reject, alert, or flag downgrade paths before the call becomes a surprise line item.
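A rough sketch of what "govern before spend" can look like: the check runs on projected spend before the call, and the configured mode decides what happens on overrun. The `Budget` and `Decision` names below are hypothetical and are not LOCO's implementation; only the reject, alert, and downgrade options come from the text above.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REJECT = "reject"
    ALERT = "alert"
    DOWNGRADE = "downgrade"


class Budget:
    """Hypothetical per-agent budget with a configurable overrun action."""

    def __init__(self, max_cost, mode):
        self.max_cost = max_cost
        self.mode = mode  # Decision returned once the limit would be exceeded
        self.spent = 0.0

    def check(self, estimated_cost):
        """Decide before the call runs, based on projected total spend."""
        if self.spent + estimated_cost <= self.max_cost:
            return Decision.ALLOW
        return self.mode

    def record(self, actual_cost):
        """Add the real cost once the call has completed."""
        self.spent += actual_cost


budget = Budget(max_cost=25.0, mode=Decision.DOWNGRADE)
budget.record(24.0)
print(budget.check(0.5).value)  # fits under the limit
print(budget.check(2.0).value)  # would exceed the limit
```

Checking an estimate up front is the design choice that matters here: a limit enforced only after the call returns can report an overrun but cannot prevent it.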

04 / Route by pressure

Priority rules do not survive bursts.

LOCO re-scores waiters on each release, so urgent work can climb while long-running batch jobs still make progress.
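The sketch below illustrates re-scoring on release with a simple aging rule. The formula (weight plus a bonus proportional to wait time) and the names `Waiter`, `score`, `pick_next`, and `aging_rate` are assumptions made for illustration; LOCO's documented load equation may differ.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    weight: float
    enqueued_at: float


def score(waiter, now, aging_rate=0.1):
    # Hypothetical formula: base weight plus a bonus that grows with wait time.
    return waiter.weight + aging_rate * (now - waiter.enqueued_at)


def pick_next(waiters, now):
    """On slot release, re-score every waiter and grant the highest score."""
    best = max(waiters, key=lambda w: score(w, now))
    waiters.remove(best)
    return best


# A fresh urgent task outranks a batch job that has waited only briefly.
queue = [Waiter("batch", 1.0, 0.0), Waiter("urgent", 3.0, 10.0)]
print(pick_next(queue, now=10.0).name)

# After a long wait, the same batch job outranks a newly arrived urgent task.
queue = [Waiter("batch", 1.0, 0.0), Waiter("urgent", 3.0, 40.0)]
print(pick_next(queue, now=40.0).name)
```

A static priority queue would let the batch job starve for as long as urgent work keeps arriving. Recomputing scores at each release is what lets waiting time eventually count.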

05 / Bring your framework

Adapters, not lock-in.

Anthropic, OpenAI, Google ADK, LangChain, CrewAI, Bedrock, AutoGen, and plain async Python compete for the same shared slots.
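As a simplified picture of shared slots, the sketch below has callers from different frameworks contend for one `asyncio.Semaphore`. This is a stand-in, not LOCO's scheduler: a plain semaphore grants slots in arrival order rather than by score, and the caller names and `run_in_slot` helper are hypothetical.

```python
import asyncio


async def run_in_slot(slots, name, state):
    """Hold one shared slot for the duration of a simulated model call."""
    async with slots:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stands in for the real LLM call
        state["active"] -= 1
    return name


async def main():
    slots = asyncio.Semaphore(3)  # one pool, regardless of framework
    state = {"active": 0, "peak": 0}
    callers = ["anthropic", "openai", "langchain", "crewai", "plain-async"]
    done = await asyncio.gather(*(run_in_slot(slots, c, state) for c in callers))
    return done, state["peak"]


done, peak = asyncio.run(main())
print(done)
print("peak concurrency:", peak)
```

Five callers finish, but no more than three ever run at once, which is the property a shared pool is meant to guarantee across adapters.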

06 / Ship the dashboard

Ops should see the queue.

Export scheduler state, wait time, utilization, policy violations, trust scores, and cost attribution into the stack you already operate.

The CFO view and the SRE view are the same trace.

LOCO connects the dispatch decision to the spend story: who waited, which model ran, which budget was touched, and whether the outcome was worth the tokens.

  • Cost by team, workflow, model, agent, and session
  • Token-to-outcome tracking for ROI attribution
  • Trust scoring and multi-tenant isolation
  • Prometheus metrics plus an importable Grafana dashboard
[Figure: LOCO-Agent Grafana dashboard showing cost, queue depth, wait time, utilization, trust scores, and policy panels]
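One way to read "token-to-outcome tracking" is as cost per successful outcome. The sketch below computes that ratio from a flat list of events; the `cost_per_success` helper, the event tuple shape, and the sample values are assumptions for illustration, not LOCO's data model.

```python
def cost_per_success(events):
    """events: iterable of (workflow, cost, succeeded) tuples.

    Returns total spend divided by successful runs per workflow,
    or None when a workflow has spend but no successes.
    """
    spend, wins = {}, {}
    for workflow, cost, succeeded in events:
        spend[workflow] = spend.get(workflow, 0.0) + cost
        wins[workflow] = wins.get(workflow, 0) + (1 if succeeded else 0)
    return {w: (spend[w] / wins[w] if wins[w] else None) for w in spend}


# Illustrative data only.
events = [
    ("incident-review", 5.0, True),
    ("incident-review", 3.0, False),
    ("ticket-triage", 1.0, False),
]
print(cost_per_success(events))
```

Failed runs still count toward spend, so a workflow that retries often shows a higher cost per success even when each individual call is cheap.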

Same policy across the agent zoo.

Your LangChain batch job, ADK webhook handler, OpenAI assistant, and Anthropic analyst should not each invent their own concurrency, budget, and attribution rules.

Anthropic SDK
OpenAI SDK
Google ADK
LangChain
CrewAI
AWS Bedrock
AutoGen

Wrap one call. Keep control.

Start with loco.wrap() around any async LLM call. Add adapters, budgets, tenant pools, policy enforcement, and dashboards as the system grows.

import asyncio
import loco

# your_model_client is a placeholder: substitute your own async LLM client.
async def call_llm(prompt: str):
    return await your_model_client.generate(prompt)

async def main():
    loco.configure(capacity=3, budget_mode="downgrade")
    loco.set_budget("support", max_cost=25.0)

    await loco.wrap(
        call_llm,
        agent_id="support",
        weight=2.0,
        prompt="summarize customer thread",
    )

    attr = loco.get_scheduler().metrics.attribution
    print(attr.cost_by_model())

asyncio.run(main())