Every architect’s first instinct is to add a load balancer. “We need horizontal scaling!” they cry. After 47 years of server management, I know the truth: one powerful server beats ten mediocre ones every time.

The Complexity Trap

Look at what “modern” architecture requires:

User → Load Balancer → Server 1 ─┐
                    → Server 2 ─┼→ Session Store → Database
                    → Server 3 ─┘
                    → Health Checks
                    → Auto-scaling Logic
                    → Service Discovery
                    → Configuration Sync

Now look at my architecture:

User → The Server → Database

One of these has 47 potential failure points. One has 3. Choose wisely.

The Cost of Distributed Systems

Component               Monthly Cost
Load balancer           $20
Server 1 (small)        $50
Server 2 (small)        $50
Server 3 (small)        $50
Redis for sessions      $30
DevOps engineer time    $10,000
Total                   $10,200

vs.

Component               Monthly Cost
One beefy server        $200
Total                   $200

XKCD 1205 (“Is It Worth the Time?”) says it all: the time spent on “scalable” infrastructure never pays off.
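
For anyone who wants to check my arithmetic, here is a small sketch that totals the two bills above. The line items are copied straight from the tables; the `hardware_only` helper is my own addition, and it assumes "DevOps engineer time" is the only people cost on the bill.

```python
# Monthly line items copied from the two tables above (USD).
distributed = {
    "Load balancer": 20,
    "Server 1 (small)": 50,
    "Server 2 (small)": 50,
    "Server 3 (small)": 50,
    "Redis for sessions": 30,
    "DevOps engineer time": 10_000,
}
single = {"One beefy server": 200}


def monthly_total(items: dict[str, int]) -> int:
    """Sum a bill of monthly line items."""
    return sum(items.values())


def hardware_only(items: dict[str, int]) -> int:
    """Same total with the people cost left out (assumes it is the only one)."""
    return sum(cost for name, cost in items.items() if "engineer" not in name)


if __name__ == "__main__":
    print(monthly_total(distributed))  # 10200
    print(monthly_total(single))       # 200
    print(hardware_only(distributed))  # 200
```

Strip out the engineer and the two hardware bills come to the same $200. The whole gap is people time, which is exactly the part nobody puts in the architecture diagram.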

The PHB Myth

The Pointy-Haired Boss in Dilbert loves buzzwords like “highly available” and “auto-scaling.” What he doesn’t understand is that availability comes from simplicity, not complexity. The fewer things that can break, the fewer things will break.
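
The "fewer things that can break" argument can be put in numbers with a rough sketch. It assumes every component sits in the critical path, fails independently, and has no redundancy; the 0.999 per-component figure is hypothetical, picked only for illustration. Redundant servers behind a balancer do not multiply this way, so treat it as the pessimistic reading of the diagram, not a full reliability model.

```python
import math

HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def series_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures and no redundancy.
    """
    return math.prod(component_availabilities)


def downtime_hours_per_year(availability):
    """Convert an availability fraction into expected hours down per year."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    per_component = 0.999  # hypothetical availability of each box
    for count in (3, 7):
        chain = series_availability([per_component] * count)
        hours = downtime_hours_per_year(chain)
        print(f"{count} components in series: {chain:.4%} up, {hours:.1f} h/year down")
```

Under those assumptions each extra box in the path shaves a little more off the total, which is the whole point of keeping the path short.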

The Vertical Scaling Solution

Why buy 10 small servers when you can buy 1 large server?

# Their setup:
10x t3.small (2 vCPU, 2 GB RAM each)
= 20 vCPU, 20 GB RAM
+ coordination overhead
+ network latency
+ configuration management
+ session synchronization

# My setup:
1x c5.4xlarge (16 vCPU, 32 GB RAM)
= More RAM, sustained (non-burstable) CPU, less complexity
+ No coordination needed
+ No network hops between app servers
+ No configuration drift
+ Sessions just work
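
The two setups above can be tallied with a few lines of Python. The instance figures are the ones quoted in the comparison; the `Fleet` class is just a hypothetical container for the sums.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    """A group of identical instances (illustrative helper, not a cloud API)."""
    name: str
    count: int
    vcpus_each: int
    ram_gb_each: int

    @property
    def total_vcpus(self) -> int:
        return self.count * self.vcpus_each

    @property
    def total_ram_gb(self) -> int:
        return self.count * self.ram_gb_each


# Figures as quoted in the comparison above.
theirs = Fleet("t3.small", count=10, vcpus_each=2, ram_gb_each=2)
mine = Fleet("c5.4xlarge", count=1, vcpus_each=16, ram_gb_each=32)

if __name__ == "__main__":
    for fleet in (theirs, mine):
        print(f"{fleet.count}x {fleet.name}: "
              f"{fleet.total_vcpus} vCPU, {fleet.total_ram_gb} GB RAM")
```

On paper the fleet has the higher vCPU count (20 against 16). The single box wins on memory and on everything that does not show up in a spec sheet: no coordination, no drift, no session shuffling.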

The “But What If It Goes Down?” Question

Your load balancer can also go down. Now you need redundant load balancers. And redundant connections between them. And failover logic. And health check services for the health check services.

My single server strategy:

# If the service goes down:
1. Get alert
2. SSH into server
3. Restart service
4. Done

# Time: 3 minutes
# Complexity: None
# Sleep lost: 8 minutes (including waking up)

Session Affinity: A Confession

You know what load balancer users end up doing? Sticky sessions. Route the same user to the same server. Congratulations, you’ve reinvented “one server per user group” with extra steps.

# Load balancer config (after 6 months of "scaling")
upstream backend {
    ip_hash;  # Stick to the same server
    server 10.0.0.1;
    server 10.0.0.2;
    server 10.0.0.3;
}

# What you actually have: 3 single servers with a router in front
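
To see why `ip_hash` amounts to pinning users to servers, here is a minimal Python sketch of the idea. The nginx documentation says the key is the first three octets of an IPv4 address (or the whole IPv6 address); the SHA-256 hash and the modulo pick below are my stand-ins, not nginx's actual algorithm, and `pick_backend` is a hypothetical name.

```python
import hashlib
import ipaddress

# Backend addresses from the upstream block above.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def sticky_key(client_ip: str) -> bytes:
    """First three octets for IPv4, the full address for IPv6."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        return addr.packed[:3]
    return addr.packed


def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map a client to one backend, always the same one for the same key.

    SHA-256 is a stand-in hash chosen for this sketch only.
    """
    digest = hashlib.sha256(sticky_key(client_ip)).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    for ip in ("203.0.113.7", "203.0.113.200", "198.51.100.4"):
        print(ip, "->", pick_backend(ip))
```

Every client in the same /24 lands on the same machine, every time. Three single servers, one router.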

The Database Bottleneck Truth

Here’s what nobody tells you: your database is the bottleneck, not your application servers. Adding more app servers just means more connections to the same database, making it slower.

Before load balancer:
1 server → 10 DB connections → Database handles fine

After load balancer:
10 servers → 100 DB connections → Database crying
           → Connection pooling needed
           → pgbouncer deployment
           → More complexity
           → Same throughput (or worse)
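
The connection math above can be sketched as follows. It assumes PostgreSQL (I mentioned pgbouncer, after all) with its stock defaults of `max_connections = 100` and `superuser_reserved_connections = 3`; the pool size of 10 per app server is the figure from the diagram, and the function names are made up for this example.

```python
def total_db_connections(app_servers: int, pool_size_per_server: int) -> int:
    """Connections the database sees when every pool is fully open."""
    return app_servers * pool_size_per_server


def headroom(app_servers: int, pool_size_per_server: int,
             max_connections: int = 100, reserved: int = 3) -> int:
    """Connections left over for everything else; negative means refused clients.

    Defaults are PostgreSQL's stock max_connections and
    superuser_reserved_connections; check your own configuration.
    """
    used = total_db_connections(app_servers, pool_size_per_server)
    return max_connections - reserved - used


if __name__ == "__main__":
    print(headroom(app_servers=1, pool_size_per_server=10))   # 87
    print(headroom(app_servers=10, pool_size_per_server=10))  # -3
```

With one server there is room to spare. With ten, the pools alone ask for more than the default configuration hands out, before a single query has run.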

Real High Availability

Want real high availability? Here’s my setup:

# Primary server: handles everything
server1.company.com

# "Backup" server: a cronjob that checks if primary is up
# If primary dies, I get a text message
# Then I fix it
# Uptime: 99.7%, or about 26 hours down per year (good enough for any realistic SLA)
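
The "backup server" check could look something like this in Python. It is a sketch, not my actual cron job: port 443 is an assumption, a successful TCP connect is a crude definition of "up", and `send_text` is a hypothetical placeholder you would wire to whatever SMS or paging service you use.

```python
import socket


def is_up(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def send_text(message: str) -> None:
    """Hypothetical placeholder: replace with a real SMS or paging call."""
    print(f"ALERT: {message}")


def check_primary(host: str = "server1.company.com", port: int = 443,
                  probe=is_up, alert=send_text) -> bool:
    """Probe the primary once and alert if it does not answer."""
    up = probe(host, port)
    if not up:
        alert(f"{host}:{port} is not answering")
    return up
```

Run `check_primary()` from cron on the backup box once a minute and the monitoring stack is done. The `probe` and `alert` parameters exist only so the logic can be exercised without touching the network.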

The Auto-Scaling Lie

“But auto-scaling handles traffic spikes!”

Traffic spikes are predictable:

  • Monday morning: people checking work emails
  • Black Friday: you knew this was coming
  • Product launch: you planned this
  • Viral moment: by the time auto-scaling kicks in, it’s over

Plan capacity ahead. Don’t trust automation to save you in real-time.
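
Planning ahead is mostly division. Here is a rough sketch, assuming you can estimate peak requests per second and how many requests one core handles; both inputs are hypothetical numbers you would have to measure yourself, and the 80% utilization target simply mirrors the CPU threshold I use when deciding whether to act.

```python
import math


def cores_needed(peak_rps: float, rps_per_core: float,
                 target_utilization: float = 0.8) -> int:
    """Cores required to serve peak_rps while staying at or under the target load."""
    if rps_per_core <= 0:
        raise ValueError("rps_per_core must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return math.ceil(peak_rps / (rps_per_core * target_utilization))


if __name__ == "__main__":
    # Hypothetical Black Friday estimate: 1,200 requests/s at 100 requests/s per core.
    print(cores_needed(peak_rps=1200, rps_per_core=100))  # 15
```

In that made-up example the answer is 15 cores, so a 16 vCPU machine covers the spike with the headroom already baked in. If the number comes out larger than any machine you can buy, that is your signal, not a traffic graph at 3 a.m.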

My Scaling Philosophy

def handle_scaling_needs():
    # The helpers called here stand in for manual work; none is a real API.
    # CPU is re-measured before each step, so a fix that works stops the escalation.
    if get_cpu_usage() > 80:
        # Step 1: Optimize code
        fix_that_n_plus_one_query()

    if get_cpu_usage() > 80:
        # Step 2: Add caching
        add_redis_maybe()

    if get_cpu_usage() > 80:
        # Step 3: Bigger server
        upgrade_instance_size()

    if get_cpu_usage() > 80:
        # Step 4: You're Twitter now
        # Maybe consider load balancing
        # But probably just optimize more
        pass

The author’s production server is a single machine named “GODZILLA” with 128 GB RAM. It has been running for 7 years without horizontal scaling. The power bill is concerning.