Oct 25, 2023 · 5 min read

Why I moved AI workloads from Beanstalk to Fargate

After six months on Elastic Beanstalk, the developer experience pushed me to ECS + Fargate. For containerised AI services with token budgets and eval gates, the case is even stronger.

Alessandro Merola
CTO
Engineering

I picked Elastic Beanstalk for an AWS deployment because it looked like the simplest place to start. After six months of running real production traffic on it, including LLM proxies and retrieval workers, I had collected enough small papercuts that I moved the workload to ECS + Fargate. The migration paid for itself in the first month. Here is the comparison I would make again today.

The AWS deployment ladder, simplified

  • Elastic Beanstalk: the simplest tier, focused on deploying code with minimal configuration
  • ECS (Elastic Container Service): a middle tier focused on orchestrating Docker containers, with Fargate for serverless compute or EC2 for self-managed hosts
  • EKS (Elastic Kubernetes Service): the most powerful tier, built on Kubernetes and aimed at platform-scale workloads

For most workloads that do not need Kubernetes, including the AI services most product teams ship, ECS is the right place to land. It gives you containers, autoscaling, and clean integration with the rest of the AWS estate without the operational weight of a full Kubernetes platform.

Fargate vs EC2 for the underlying compute

ECS gives you two hosting options. Fargate is fully serverless: you hand AWS a container and it runs it without you managing instances. EC2 mode means managing the underlying hosts yourself. I picked Fargate because the time saved on instance management more than paid for the slightly higher per-task cost, and because AI workloads benefit from the predictable cold-start and autoscaling story Fargate provides.
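To make "hand AWS a container" concrete, here is a minimal sketch of a Fargate task definition, built as a plain Python dict in the shape the ECS RegisterTaskDefinition API expects. It is an illustration rather than the configuration I actually run: the function name, the `llm-proxy` service, the image URI, the role ARN, and the secret ARN are all hypothetical placeholders.

```python
import json


def fargate_task_definition(family, image, region, log_group,
                            execution_role_arn, env=None, secrets=None,
                            cpu="512", memory="1024", port=8080):
    """Build a request body for ECS RegisterTaskDefinition (Fargate launch type)."""
    container = {
        "name": family,
        "image": image,
        "essential": True,
        "portMappings": [{"containerPort": port, "protocol": "tcp"}],
        # Plain configuration, stored in the task definition itself.
        "environment": [
            {"name": key, "value": value}
            for key, value in sorted((env or {}).items())
        ],
        # Secrets are referenced by ARN (Secrets Manager or SSM Parameter
        # Store) and injected when the container starts.
        "secrets": [
            {"name": key, "valueFrom": arn}
            for key, arn in sorted((secrets or {}).items())
        ],
        # The awslogs driver ships stdout/stderr to CloudWatch Logs.
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": family,
            },
        },
    }
    return {
        "family": family,
        "networkMode": "awsvpc",  # required for Fargate tasks
        "requiresCompatibilities": ["FARGATE"],
        "cpu": cpu,        # task-level CPU units, as a string
        "memory": memory,  # task-level memory in MiB, as a string
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [container],
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    task = fargate_task_definition(
        family="llm-proxy",
        image="123456789012.dkr.ecr.eu-west-1.amazonaws.com/llm-proxy:latest",
        region="eu-west-1",
        log_group="/ecs/llm-proxy",
        execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
        env={"REQUEST_TIMEOUT_SECONDS": "30"},
        secrets={"MODEL_API_KEY": "arn:aws:secretsmanager:eu-west-1:123456789012:secret:model-api-key"},
    )
    print(json.dumps(task, indent=2))
```

The printed JSON can be saved to a file and registered with `aws ecs register-task-definition --cli-input-json file://task.json`, or the dict can be passed to boto3 as `boto3.client("ecs").register_task_definition(**task)`. There is no instance type, AMI, or host patching anywhere in it, which is the whole point of Fargate.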

Five things that pushed me off Beanstalk

  • Logs: pulling logs in Beanstalk meant requesting archives and digging through multiple groups; on ECS, the awslogs driver streams container output to CloudWatch Logs in near real time, which is how logs should work, especially when you are debugging an LLM request that hit a 30-second timeout
  • Configuration updates: Beanstalk updates across multiple instances were unreliable enough to leave services in a half-broken state; ECS rolls updates predictably with healthy/unhealthy tracking built in
  • GitHub integration: Beanstalk has no first-party GitHub Actions support, so deploys depend on community actions; ECS has official actions that AWS maintains
  • CI/CD: Beanstalk's pipeline is restrictive; modifying environment variables or running database migrations during a deploy is awkward at best. On ECS, environment variables and secret references live in the task definition and a migration can run as a one-off task before the service updates, which also covers the per-deploy secret rotation that AI services need
  • Auto-scaling: Beanstalk's auto-scaling controls are unintuitive; ECS service auto-scaling is straightforward and reliable, and it can scale on the metrics that matter for AI workloads once you publish them to CloudWatch: queue depth on a retrieval worker, concurrent inference requests, token spend per minute
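As an illustration of the auto-scaling point, here is a sketch of a target-tracking policy that scales an ECS service on a custom CloudWatch metric. The dict follows the shape of the Application Auto Scaling PutScalingPolicy request. The metric name `BacklogPerTask`, the `AIWorkers` namespace, the `ServiceName` dimension, and the cluster and service names are assumptions: the worker would have to publish that metric itself. The `estimated_tasks` helper is back-of-envelope arithmetic for reasoning about a target value, not the algorithm AWS actually runs.

```python
import math


def backlog_scaling_policy(cluster, service, namespace, metric_name,
                           target_per_task, scale_out_cooldown=60,
                           scale_in_cooldown=300):
    """Build a PutScalingPolicy request that tracks a custom per-task metric."""
    return {
        "PolicyName": f"{service}-{metric_name}-target",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Keep the metric close to this value by adding or removing tasks.
            "TargetValue": float(target_per_task),
            "CustomizedMetricSpecification": {
                "MetricName": metric_name,   # hypothetical, app-published
                "Namespace": namespace,      # hypothetical namespace
                "Dimensions": [{"Name": "ServiceName", "Value": service}],
                "Statistic": "Average",
            },
            # Scale out quickly, scale in slowly to avoid flapping.
            "ScaleOutCooldown": scale_out_cooldown,
            "ScaleInCooldown": scale_in_cooldown,
        },
    }


def estimated_tasks(backlog, target_per_task, min_tasks, max_tasks):
    """Rough task count needed to hold `target_per_task` items per task."""
    wanted = math.ceil(backlog / target_per_task)
    return max(min_tasks, min(max_tasks, wanted))


if __name__ == "__main__":
    policy = backlog_scaling_policy(
        cluster="ai-prod",            # hypothetical cluster name
        service="retrieval-worker",   # hypothetical service name
        namespace="AIWorkers",
        metric_name="BacklogPerTask",
        target_per_task=10,
    )
    print(policy["ResourceId"])
    print(estimated_tasks(backlog=95, target_per_task=10,
                          min_tasks=1, max_tasks=20))
```

With boto3, the service is first registered as a scalable target via `register_scalable_target` on the `application-autoscaling` client, and the dict is then passed to `put_scaling_policy(**policy)`. Target tracking works best when the metric falls as tasks are added, which is why the sketch tracks backlog per task rather than raw queue depth.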

Beanstalk served its purpose as a starting point. The moment your deploy story has more requirements than 'push code, run app', and an AI service almost always does, it is time to graduate.

The migration was worth it, especially for AI workloads

Six months on, the operational story is calmer, the deploy pipeline is faster, and the platform scales without surprises. For AI services with token budgets, eval gates, and per-tenant rate limits, ECS + Fargate is the cleanest middle ground between 'too constrained' and 'too much platform'. It is not the right answer for every workload, but for the kind of containerised AI service most teams ship today, it is the one I would reach for first.
