The shift happened fast. Twelve months ago, "Claude agent" meant a chatbot that could answer questions. Today it means an autonomous system that clones repositories, runs security scans, grades its own output against a rubric, and posts the results to Slack before you wake up.

Building a capable agent is one problem. Deploying it so it actually runs reliably in the real world is a completely different one.

Depending on where you sit (solo developer shipping personal automations, team building custom pipelines, or enterprise running high-stakes production workflows), there are three primary deployment paths for Claude agents as of May 2026. Each solves a different problem. Each has tradeoffs worth understanding before you commit.

Option 1 · Zero infrastructure

1. Claude Code Loops and Cloud Routines

Best for: Scheduled automation, recurring developer tasks, and zero-infrastructure deployments.

If you want Claude to do something on a schedule without managing servers, external cron jobs, or Docker containers, this is where you start.

Claude Code actually has three scheduling tiers, and understanding the difference matters:

CLI /loop is the simplest. Inside an active Claude Code terminal session, you tell Claude to repeat a task every N minutes. The catch: the loop dies the moment you close the terminal. Sessions are capped at 50 tasks, tasks expire automatically after 7 days, and nothing persists across restarts. This is useful for "keep running this while I work" scenarios and nothing else.

Desktop Scheduled Tasks survive terminal closures because they run through the Claude Code desktop app. Your machine still needs to be on, but you do not need an active session. Think of these as local cron jobs with AI judgment built in.

Cloud Routines are the real game changer. Anthropic launched these on April 14, 2026, and they run entirely on Anthropic's cloud infrastructure. Your laptop can be closed. The routine keeps running. You define a prompt, connect one or more GitHub repositories, set up environment variables for API keys, attach connectors (Slack, Linear, Google Drive, GitHub), and pick a trigger.

Three trigger types are available: scheduled (hourly, daily, weekdays, weekly, or custom cron with a one-hour minimum interval), API (an HTTP POST endpoint with a bearer token that you call from your own systems), and GitHub (fires on repository events like pull requests or releases, with optional filters).
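To make the API trigger concrete, here is a minimal sketch of the calling side. Only "HTTP POST with a bearer token" comes from the description above; the endpoint URL, the payload shape, and the token value are hypothetical placeholders, so copy the real values from your routine's trigger settings.

```python
import json
import urllib.request

# Hypothetical placeholder URL, not a real Anthropic endpoint.
ROUTINE_TRIGGER_URL = "https://example.invalid/routines/my-routine/trigger"


def build_trigger_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that carries a bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# The payload fields are an assumption made for illustration.
request = build_trigger_request(
    ROUTINE_TRIGGER_URL, "example-token", {"reason": "deploy finished"}
)
# To actually fire the trigger: urllib.request.urlopen(request, timeout=30)
```

Building the request separately from sending it keeps the token handling easy to test without touching the network.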

When a trigger fires, Anthropic spins up a fresh Claude Code container, clones your repos, loads your prompt, runs until completion, and either commits to a branch, opens a PR, posts a message, or calls a tool. There is no human in the loop during execution. No permission prompts. No approval dialogs. The session runs autonomously.

Real world deployments that actually exist

The single most valuable routine reported across multiple teams is the nightly dependency upgrade PR. A routine runs at 3 AM, checks package.json or requirements.txt for safe minor and patch version bumps, applies the upgrade on a new branch, runs the test suite, and opens a PR if tests pass. You wake up to a ready-to-merge PR instead of a manual chore.
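What counts as a "safe" bump is worth pinning down before a routine acts on it. The sketch below is one possible rule, assuming plain MAJOR.MINOR.PATCH version strings: accept a candidate only if it is newer and keeps the same major version. The function names are illustrative and not part of any Claude tooling.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string; pre-release tags are not handled."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_safe_bump(current: str, candidate: str) -> bool:
    """True when candidate is newer than current and keeps the same major version."""
    cur = parse_version(current)
    cand = parse_version(candidate)
    return cand > cur and cand[0] == cur[0]
```

For example, `is_safe_bump("1.4.2", "1.5.0")` accepts a minor bump, while `is_safe_bump("1.4.2", "2.0.0")` rejects a major one. Packages on 0.x versions often treat minor releases as breaking, so a stricter rule may be appropriate there.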

Another pattern gaining traction: documentation drift detection. A weekly routine scans merged PRs since the last run, flags documentation that references changed APIs, and opens update PRs against the docs repository. One team described it as finally solving the problem of docs that are always six months behind the code.

For CI/CD, routines can run smoke checks against new builds, scan error logs for regressions, and post a go/no-go verdict to a release channel before the deploy window closes. The New Stack described this pattern as making routines the next-generation version of /schedule.

The limits you need to know

Daily run caps vary by plan: Pro gets 5 runs per day, Max gets 15, and Team or Enterprise gets 25. Routines draw down the same subscription limit as interactive Claude Code sessions, so heavy automation competes with your interactive usage. Organizations with extra usage enabled can go past the cap on metered overage.
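A quick planning check helps here. The sketch below uses the caps quoted above to test whether a set of routines fits a plan's daily allowance. The routine names and run counts are made up, and the check ignores metered overage.

```python
# Daily run caps as quoted in this article.
DAILY_RUN_CAPS = {"pro": 5, "max": 15, "team": 25, "enterprise": 25}


def total_daily_runs(runs_per_day: dict[str, int]) -> int:
    """Sum the planned runs per day across all routines."""
    return sum(runs_per_day.values())


def fits_daily_cap(plan: str, runs_per_day: dict[str, int]) -> bool:
    """True when the planned routines stay within the plan's daily cap."""
    return total_daily_runs(runs_per_day) <= DAILY_RUN_CAPS[plan.lower()]


# Hypothetical schedule: one nightly job plus a smoke check every four hours.
planned = {"dependency-upgrade": 1, "smoke-check": 6}
```

With this schedule, `fits_daily_cap("pro", planned)` is false (7 runs against a cap of 5), while the same schedule fits on Max.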

Every Cloud Routine execution is a fresh clone. It cannot read your local .env.local or local databases. If your task needs local state, use Desktop Scheduled Tasks instead. Credentials must be configured in the Routine's Environment Variables settings.
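Because nothing from your machine travels with the clone, scripts that a routine runs should fail fast when a credential is missing instead of failing halfway through. A minimal sketch, assuming the credential arrives as an ordinary environment variable (the variable name in the usage note is hypothetical):

```python
import os


def require_env(name: str) -> str:
    """Return a required environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name!r}; "
            "set it in the routine's environment settings."
        )
    return value
```

Calling `require_env("EXAMPLE_API_KEY")` at the top of a script turns a silent misconfiguration into an immediate, readable error.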

By default, Claude can only push to branches prefixed with claude/. This is a safety barrier worth keeping unless you have robust downstream review processes.

Option 2 · Developer controlled

2. The Claude Agent SDK on Serverless Platforms or Containers

Best for: Developer-controlled pipelines, custom tool integrations, production applications, and anything that outgrows scheduled tasks.

When your automation needs more than a recurring script (custom tools, persistent state, parallel execution, integration into your own application), the Claude Agent SDK is the deployment path.

The SDK was originally called the Claude Code SDK. Anthropic renamed it to the Claude Agent SDK in September 2025 after realizing it powers far more than coding. As Anthropic's own engineering blog put it:

By giving Claude access to the user's computer (via the terminal), it had what it needed to write code like programmers do. But this has also made Claude effective at non-coding tasks.

The SDK wraps the same agent loop, context management, and tool execution that power Claude Code into a library you embed in your own Python or TypeScript applications. It includes built-in tools for reading files, running commands, and editing code, so your agent can start working immediately without you implementing tool execution from scratch.

Where people actually deploy these

Modal is the most commonly referenced serverless platform for Agent SDK deployments. Their gVisor-isolated sandboxes support over 50,000 concurrent sessions with fast startup times, and they power over 10,000 teams running Claude-related agent workflows. The serverless architecture means you pay only for active compute, with automatic scale-to-zero when agents are not running.

Trigger.dev handles event-driven deployments where agents fire in response to webhooks. AWS Lambda and Vercel Functions work for shorter-lived agent tasks but impose timeout constraints that can be tight for complex agent runs.

For higher security requirements, the SDK supports ephemeral Docker containers. The pattern: create a new container for each user task, run the agent inside it, destroy the container when complete. Anthropic's own secure deployment guide documents filesystem restrictions using OS primitives (bubblewrap on Linux, sandbox-exec on macOS) and network controls through a built-in proxy.
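The create, run, destroy pattern maps onto a single `docker run --rm` invocation. The sketch below only assembles the command. The image name, container name, and agent entry point are hypothetical, and the hardening flags are one reasonable starting set, not Anthropic's documented configuration.

```python
def ephemeral_container_command(
    image: str,
    task_id: str,
    agent_command: list[str],
    network: str = "none",
) -> list[str]:
    """Assemble a docker command for a one-shot, locked-down agent container."""
    return [
        "docker", "run",
        "--rm",                        # delete the container when the agent exits
        "--name", f"agent-task-{task_id}",
        "--network", network,          # "none" blocks all traffic, including API calls
        "--read-only",                 # root filesystem is immutable
        "--tmpfs", "/tmp",             # scratch space that vanishes with the container
        "--cap-drop", "ALL",           # drop Linux capabilities the agent does not need
        "--memory", "2g",
        image,
        *agent_command,
    ]


# Hypothetical image and entry point, for illustration only.
command = ephemeral_container_command(
    "my-agent-image:latest", "1234", ["python", "run_agent.py"]
)
# subprocess.run(command, check=True) would start the container.
```

With `--network none` the agent cannot reach the Claude API on its own, so a real deployment needs some controlled route out, such as a proxy. That is why the network is a parameter here and not hard-coded.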

Real world patterns in production

A common deployment is the automated code review agent. It clones a repository when a PR opens, reads diff files, runs security scans, and posts structured review comments to GitHub through MCP tool integrations. AWS shipped an Agent Plugin for AWS Serverless in March 2026 that packages exactly this kind of workflow into a reusable skill installable across Claude Code, Cursor, and other compatible tools.

One production patterns guide documented the five things that separate a working demo from a production deployment:

  • Durable state — conversation logs go to Postgres or Redis, not in memory
  • Hard cost caps — per task, per user, and per tenant budgets
  • Circuit breakers — a maxTurns property to prevent stuck loops
  • Tool permissioning — the smallest set of tools that can do the task
  • Evaluation hooks — offline evals plus online monitoring on every production turn
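Three of the items above (cost caps, a turn limit, and tool permissioning) can be enforced by one small guard object that your application consults on every turn. This is an application-level sketch with invented names, not the SDK's own API. In practice the turn limit would also be set through the SDK's maxTurns option.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run goes past its turn or cost limit."""


@dataclass
class RunGuard:
    max_turns: int
    max_cost_usd: float
    allowed_tools: frozenset[str]
    turns: int = 0
    cost_usd: float = 0.0

    def check_tool(self, tool_name: str) -> None:
        """Reject any tool that is not on the allowlist."""
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"Tool {tool_name!r} is not permitted for this task")

    def record_turn(self, turn_cost_usd: float) -> None:
        """Count one turn and its cost, then stop the run if a limit is crossed."""
        self.turns += 1
        self.cost_usd += turn_cost_usd
        if self.turns > self.max_turns:
            raise BudgetExceeded(f"Turn limit of {self.max_turns} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"Cost cap of ${self.max_cost_usd:.2f} exceeded")
```

One guard per task covers the per-task budget. Per-user and per-tenant caps would need the running totals stored somewhere durable, which ties back to the first item in the list.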

The dominant cost is tokens, not infrastructure. Containers run at roughly $0.05 per hour. For most agent workloads, the Claude API bill dwarfs the compute bill.
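A back-of-the-envelope calculation shows the gap. The container rate below is the figure quoted above. The token prices and the size of the run are illustrative assumptions, so substitute current pricing and your own measurements.

```python
CONTAINER_USD_PER_HOUR = 0.05        # figure quoted in this article

# Illustrative assumptions, not current list prices.
INPUT_USD_PER_MILLION_TOKENS = 3.00
OUTPUT_USD_PER_MILLION_TOKENS = 15.00


def compute_cost(run_minutes: float) -> float:
    """Container cost for a run of the given length."""
    return CONTAINER_USD_PER_HOUR * run_minutes / 60


def token_cost(input_tokens: int, output_tokens: int) -> float:
    """Model cost for a run, given its token counts."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION_TOKENS
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION_TOKENS
    )


# Hypothetical 20-minute run that reads 400k tokens and writes 30k.
run_compute = compute_cost(20)
run_tokens = token_cost(400_000, 30_000)
```

Under these assumptions the run costs under two cents in compute and about $1.65 in tokens, a gap of roughly two orders of magnitude.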

The tradeoffs

You own everything. Infrastructure, sandboxing, state persistence, credential management, prompt injection defense. The SDK deliberately does not give you opinionated infrastructure because those decisions belong in your application. That is the right design, but it means the distance between a working demo and a production system is real engineering work.

Starting June 15, 2026, Agent SDK usage on subscription plans draws from a new monthly Agent SDK credit, separate from interactive usage limits. If you are planning a deployment, check the updated billing structure.

Option 3 · Enterprise production

3. Claude Managed Agents

Best for: Enterprise production, long-running asynchronous tasks, self-improving workflows, and complex multi-agent coordination.

For organizations that need highly autonomous agents without building custom orchestration infrastructure, Anthropic launched Claude Managed Agents on April 8, 2026, followed by a major feature update on May 6 at the Code with Claude developer conference.

The pitch: you define the agent's persona, tools, and success criteria. Anthropic provisions an isolated, secure cloud container where Claude executes tasks over minutes or hours. You do not manage the VM, the runtime, or the orchestrator. You deploy a definition, trigger runs, and read results.

What shipped on May 6 and why it matters

Three features moved Managed Agents from "hosted agent runtime" to "self-improving agent platform."

Outcomes is the rubric grading system. You write a description of what a successful output looks like. A separate Claude instance (the grader) evaluates the agent's work against your criteria in its own context window. Because the grader has no exposure to the agent's reasoning, it catches quality gaps the agent rationalized away. When something fails, the grader pinpoints what needs to change and the agent takes another pass.
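The produce, grade, revise loop described here is easy to picture in code. The sketch below is not the Managed Agents API. The `produce` and `grade` callables stand in for the agent and the separate grader, and every name is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Grade:
    passed: bool
    feedback: str


def run_with_rubric(
    task: str,
    rubric: str,
    produce: Callable[[str, Optional[str]], str],
    grade: Callable[[str, str], Grade],
    max_attempts: int = 3,
) -> tuple[str, Grade, int]:
    """Retry until the grader passes the output or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        draft = produce(task, feedback)
        # The grader receives only the output and the rubric, never the agent's reasoning.
        result = grade(draft, rubric)
        if result.passed:
            break
        feedback = result.feedback
    return draft, result, attempt


# Stand-ins for model calls, so the loop can be exercised offline.
def fake_produce(task: str, feedback: Optional[str]) -> str:
    return f"{task} with citations" if feedback else task


def fake_grade(draft: str, rubric: str) -> Grade:
    ok = "citations" in draft
    return Grade(passed=ok, feedback="" if ok else "Add citations.")


final_draft, final_grade, attempts = run_with_rubric(
    "summary", "must include citations", fake_produce, fake_grade
)
```

Keeping the grader's inputs limited to the draft and the rubric mirrors the separation the paragraph above describes.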

Anthropic reported that Outcomes alone improved task success by up to 10 percentage points over standard prompting with no model change. Wisedocs, a document processing company, built a quality check agent on Managed Agents using Outcomes to grade each review against their internal guidelines. Their document reviews now run 50% faster while staying aligned with team standards.

Dreaming is the self-improving memory system. A scheduled background process reviews past agent sessions and memory stores, extracts patterns (recurring mistakes, convergent workflows, team preferences), and curates the agent's long-term memory. You control how much autonomy it has: dreaming can update memory automatically or require your review before changes land.

Harvey, the legal AI company, reported a roughly 6x increase in task completion rates after switching on dreaming for legal document workflows. They found that dreaming worked best when paired with a tight Outcomes rubric, so any drift in memory gets caught by the grader on the next run.

Multiagent Orchestration lets a lead agent break a job into pieces and delegate each one to specialist subagents running in parallel. The subagents work on a shared filesystem and contribute to the lead agent's context. The lead agent can check back in with subagents mid-workflow because events are persistent and every agent remembers what it has done.
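The fan-out and collect shape of this pattern can be sketched with asyncio. The subagent here is a stub that returns a string, and the roles and subtasks are invented. Managed Agents handles the delegation itself, so this shows only the control flow, not its API.

```python
import asyncio


async def run_subagent(role: str, subtask: str) -> str:
    """Stand-in for a specialist subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{role}] done: {subtask}"


async def lead_agent(job: dict[str, str]) -> list[str]:
    """Delegate every subtask concurrently and collect results in job order."""
    results = await asyncio.gather(
        *(run_subagent(role, subtask) for role, subtask in job.items())
    )
    return list(results)


# Hypothetical job split, for illustration only.
reports = asyncio.run(
    lead_agent({"researcher": "collect sources", "writer": "draft intro"})
)
```

`asyncio.gather` returns results in the order the tasks were passed in, so the lead agent can match each report to its subtask without extra bookkeeping.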

Spiral (from the Every media company) uses this pattern for their writing platform. The lead agent runs on Haiku for fast triage and delegation. Subagents running on Opus handle the actual drafting. Outcomes enforces editorial quality standards against the user's voice profile stored in memory. Only drafts that clear the bar get returned.

The real constraints

  • Maximum 24-hour runs. Hard ceiling. Longer workflows must checkpoint and restart.
  • No GPU workloads. If your agent needs local ML inference, it has to call an external service.
  • Limited language runtimes preinstalled. Containers ship with Python 3.12 and Node 22 by default. Go, Rust, Ruby, and Java require custom container definitions that add 2 to 3 minutes to cold start.
  • MCP servers must be declared at deploy time. You cannot have the agent dynamically add servers mid run.
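Working within the run ceiling means making the workflow resumable. The sketch below is one generic way to do that: record progress after every item, stop cleanly once a time budget is spent, and pick up from the saved index on the next run. The checkpoint file location is an assumption, so store it somewhere that survives a restart.

```python
import json
import time
from pathlib import Path
from typing import Callable, Sequence


def run_with_checkpoints(
    items: Sequence[str],
    checkpoint_path: Path,
    process: Callable[[str], None],
    budget_seconds: float,
) -> bool:
    """Process items in order; return True when all are done, False if the budget ran out."""
    if checkpoint_path.exists():
        state = json.loads(checkpoint_path.read_text())
    else:
        state = {"next_index": 0}
    started = time.monotonic()
    for index in range(state["next_index"], len(items)):
        if time.monotonic() - started >= budget_seconds:
            return False  # stop before the ceiling; the next run resumes here
        process(items[index])
        state["next_index"] = index + 1
        checkpoint_path.write_text(json.dumps(state))
    return True
```

Setting `budget_seconds` comfortably below the ceiling leaves room for the last item to finish before the run is cut off.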

Managed Agents runs exclusively on Anthropic's managed cloud (or Claude Platform on AWS) and incurs a session-hour infrastructure fee on top of standard token costs. AI Magicx reported that, for most workloads, the runtime meter barely registers because token usage dwarfs it.

Dreaming is in research preview (access by request). Outcomes and multiagent orchestration are in public beta.


Which Path Fits Your Situation

The decision tree is straightforward.

If your task is "do X on a schedule" and does not need custom tools or persistent state beyond what a GitHub repository provides, Cloud Routines handle it with zero infrastructure. Start here. Most people overengineer their first agent deployment when a well-written routine would have solved the problem.

If you need custom tools, your own hosting, persistent state in a database, or integration into an existing application, the Agent SDK gives you full control. You build and maintain the infrastructure, but you also control every decision. This is the path for teams building agent-powered products.

If you need self-improving agents, rubric-graded outputs, multi-agent coordination, or you simply do not want to build orchestration plumbing, Managed Agents is the enterprise play. The infrastructure cost is minimal compared to token spend, and the Outcomes and Dreaming features are genuinely hard to replicate from scratch. Harvey's 6x completion rate improvement and Wisedocs' 50% faster reviews are the kind of numbers that justify the platform lock-in.
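The three branches above can be written down as a small function. The field names are invented. The precedence is also an assumption, since the article does not say what to do when needs from two branches overlap: here Managed Agents needs win over Agent SDK needs.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    custom_tools: bool = False
    own_hosting: bool = False
    database_state: bool = False
    app_integration: bool = False
    self_improving: bool = False
    rubric_grading: bool = False
    multi_agent: bool = False
    avoid_orchestration_work: bool = False


def choose_path(needs: Needs) -> str:
    """Map a set of requirements to one of the three deployment paths."""
    if (
        needs.self_improving
        or needs.rubric_grading
        or needs.multi_agent
        or needs.avoid_orchestration_work
    ):
        return "Managed Agents"
    if (
        needs.custom_tools
        or needs.own_hosting
        or needs.database_state
        or needs.app_integration
    ):
        return "Agent SDK"
    return "Cloud Routines"  # the default: a scheduled task with no extra needs
```

The default branch reflects the advice to start with a routine unless a concrete requirement rules it out.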

A common path that Anthropic themselves recommend: prototype with the Agent SDK locally, then move to Managed Agents for production. The agent logic (prompts, tool definitions, evaluation criteria) stays portable. The infrastructure becomes someone else's problem.

The agents themselves will keep getting smarter. The deployment infrastructure is what determines whether that intelligence actually ships.