How to Build Nginx Smart Proxy Image with Multi-Domain Support


Prologue: The Vision

In the vast digital landscape where countless applications struggle to find their voice, where developers battle with manual configurations, and where SSL certificates expire silently in the night—a vision was born. A vision to create something extraordinary: a reverse proxy that thinks, adapts, and evolves. A system that doesn’t just route traffic, but understands the intricate dance between domains and services, orchestrating them with the precision of a maestro conducting a symphony.

This is the story of the Nginx Smart Proxy—a journey from concept to reality, from manual configuration to automated intelligence, from single domain to infinite possibilities.

The Genesis – Understanding the Challenge

The Old World

Imagine a world where every new application required a new nginx configuration file. Where adding a domain meant manually crafting server blocks, upstream definitions, and SSL certificates. Where a single typo could bring down entire services. Where certificate renewals were a constant source of anxiety, and where scaling meant exponential configuration complexity.

This was the world we inherited. A world of repetitive tasks, human error, and technical debt that accumulated like snow in a blizzard.

The Awakening

The moment of realization came not in a flash of lightning, but in the quiet frustration of a developer staring at yet another nginx configuration file. There must be a better way, the thought echoed. What if we could automate this? What if we could make it intelligent? What if we could make it understand the relationship between domains and services without us having to explain it every time?

And so, the seed was planted. The goal was clear: Create an intelligent, automated reverse proxy system that could handle multiple domains and multiple applications without manual intervention.


The Grand Design – Features That Define Us

The Foundation: Multi-Domain Architecture

The cornerstone of our creation was the ability to handle an effectively unlimited number of domains and applications simultaneously: not just two or three, but as many as the underlying infrastructure could support. Each domain would be independent, each application isolated, yet all orchestrated by a single, intelligent system.

The Architecture:

  • One container, infinite possibilities
  • Dynamic configuration generation from simple environment variables
  • Automatic service discovery and routing
  • Independent health monitoring for each service

Feature 1: Automatic Configuration Generation

The Magic of Automation

In the old world, adding a new domain meant:

  1. Creating a new nginx server block
  2. Defining upstream configurations
  3. Configuring SSL certificates
  4. Testing the configuration
  5. Reloading nginx
  6. Praying nothing broke

In our new world, you simply write:

DOMAINS=app1.com:app1,app2.com:app2,api.com:api
SERVICES=app1:app1:8080,app2:app2:8081,api:api:3000

And the system thinks. It understands. It generates.

The generate-config.py script became our artisan, crafting perfect nginx configurations from these simple declarations. It reads the template, understands the structure, and weaves together upstream blocks, HTTP redirects, HTTPS configurations, and proxy settings with the precision of a master craftsman.

What We Built:

  • Intelligent template parsing system
  • Dynamic upstream block generation for each service
  • Automatic HTTP-to-HTTPS redirect configuration
  • Per-domain SSL certificate integration
  • Independent server blocks for each domain
  • Health check endpoints for every service

The Result: Zero manual nginx configuration files. Zero configuration errors. Zero guesswork.

Feature 2: Intelligent SSL Certificate Management

The Silent Guardian

SSL certificates are the guardians of the web, but they are fickle. They expire, they need renewal, they require DNS validation. In the old world, this was a constant source of anxiety. In our world, it’s automated perfection.

The SSL Symphony:

  • Automatic certificate generation using Let’s Encrypt
  • Cloudflare DNS-01 challenge integration (no HTTP verification needed)
  • Intelligent renewal scheduling (checks every 3 days by default)
  • Per-domain certificate management
  • Automatic nginx reload after certificate renewal
  • Graceful failure handling and retry logic

The Magic Behind the Scenes:

When a container starts, the system performs a delicate dance:

  1. It checks each domain for existing certificates
  2. For missing certificates, it generates them automatically
  3. For existing certificates, it checks expiration dates
  4. If renewal is needed (within 30 days), it renews automatically
  5. It verifies DNS records point to the correct server
  6. It creates Cloudflare DNS records for validation
  7. It waits for DNS propagation (30 seconds)
  8. It retrieves the certificate
  9. It reloads nginx to use the new certificate

All of this happens silently, in the background, while your applications continue running seamlessly.
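
To make steps 3 and 4 concrete, here is a minimal Python sketch of the expiration check, assuming a certificate path in the Let's Encrypt layout and the 30-day threshold described above (the shipped implementation is bash, shown in the renewal deep dive later):

import datetime
import subprocess

def days_until_expiry(cert_path: str) -> int:
    """Days until the certificate at cert_path expires (negative if already expired)."""
    out = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", cert_path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()                     # e.g. "notAfter=Mar  1 12:00:00 2025 GMT"
    not_after = datetime.datetime.strptime(out.split("=", 1)[1], "%b %d %H:%M:%S %Y %Z")
    return (not_after - datetime.datetime.utcnow()).days

# Renew when 30 days or fewer remain (step 4 above)
if days_until_expiry("/etc/letsencrypt/live/app1.com/fullchain.pem") <= 30:
    print("renewal needed")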

Feature 3: Service Isolation & Fault Tolerance

The Art of Resilience

One of the most critical insights we had was understanding that services must be independent. If one application fails, it should not bring down others. This is not just a feature—it’s a philosophy of resilience.

The Isolation Mechanism:

Each domain has its own:

  • Independent upstream configuration
  • Separate server block
  • Isolated error handling
  • Individual logging
  • Independent health checks

The Test That Proved Everything:

We stopped app1 while app2 and api continued running. The result was beautiful in its simplicity:

  • app1.com → 502 Bad Gateway (expected, service is down)
  • app2.com → Perfect response (unaffected)
  • api.com → Perfect response (unaffected)

This wasn’t just a test—it was proof that our architecture was sound, that our design was resilient, that our system understood the concept of graceful degradation.

The Development Journey – Building the Pieces

The Entrypoint Script: The Conductor

The multi-domain-entrypoint.sh script became the conductor of our orchestra. With 674 lines of carefully crafted bash, it orchestrates the entire startup sequence:

The Opening Sequence:

  1. Configuration Detection: It checks whether you’re using environment variables or JSON files
  2. Validation: It ensures all required variables are present
  3. Domain Parsing: It intelligently parses comma-separated domain:service pairs
  4. Service Parsing: It validates service:host:port triples with port range checking
  5. Error Handling: It provides clear, actionable error messages

The SSL Ballet:

  • Domain IP verification (ensures DNS points to your server)
  • Certificate generation (for missing certificates)
  • Certificate renewal (for expiring certificates)
  • Background renewal process (runs continuously)
  • Lock file management (prevents concurrent renewals)

The Configuration Generation:

  • Calls the Python script with proper parameters
  • Handles LOCAL mode (no SSL for development)
  • Monitors configuration files for changes
  • Automatically reloads nginx when configs change

The Python Generator: The Artisan

The generate-config.py script is where the magic happens. With 314 lines of Python elegance, it transforms simple configuration into complex nginx configurations.

The Generation Process:

  1. Template Loading: It reads the nginx template with handlebars-like syntax
  2. Service Processing: It generates upstream blocks for each service
  3. Domain Processing: It creates HTTP and HTTPS server blocks for each domain
  4. Local Mode Detection: It intelligently switches between HTTP-only and HTTPS configurations
  5. Template Replacement: It uses regex to replace template sections with generated content
  6. Cleanup: It removes any remaining template conditionals

The Innovation:

We added support for --local flag, which transforms the entire configuration:

  • In LOCAL mode: HTTP-only, no SSL, direct service routing
  • In Production mode: HTTPS with SSL, HTTP-to-HTTPS redirects, security headers

This dual-mode operation allows the same system to work seamlessly in development and production.

The Template System: The Blueprint

The multi-domain.conf.template is our blueprint. It uses a handlebars-inspired syntax that our Python script understands:

{{#each services}}
upstream {{name}} {
    server {{host}}:{{port}};
}
{{/each}}

{{#each domains}}
server {
    server_name {{name}};
    location / {
        proxy_pass http://{{service}};
    }
}
{{/each}}

This template is transformed into actual nginx configuration with all the complexity hidden from the user.

The Challenges We Overcame

Challenge 1: Template Replacement Logic – The Deep Dive

The Problem: The template used handlebars syntax ({{#each}}{{#if}}), but we needed to replace these with actual generated content using regex. This is not as simple as it sounds. The template contains multiple sections that need to be replaced in a specific order, and the replacement must preserve the structure of the nginx configuration.

The Initial Attempt – Why It Failed:

Our first naive approach was simple string replacement:

config_content = config_content.replace('{{#each services}}', '\n'.join(upstream_blocks))

This failed catastrophically because:

  1. Multiple occurrences: The template had multiple {{#each}} blocks (services, domains for HTTP, domains for HTTPS)
  2. Nested content: The template markers surrounded actual content that needed to be preserved or replaced intelligently
  3. Order dependency: Upstream blocks must come before server blocks, but simple replacement doesn’t guarantee order
  4. Whitespace preservation: Nginx is sensitive to whitespace, and simple replacement doesn’t preserve formatting

The Evolution – Building the Solution:

Step 1: Understanding the Regex Pattern

The regex pattern we developed is deceptively simple but powerful:

r'\{\{#each\s+services\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}'

Let’s break this down:

  • \{\{ – Escaped opening braces (literal {{)
  • #each – The handlebars directive
  • \s+services\s* – Matches “services”, requiring at least one whitespace character before it and allowing optional whitespace after
  • \}\} – Escaped closing braces (literal }})
  • \s*\n – Optional whitespace followed by newline
  • (.*?) – Non-greedy capture group – This is crucial! It captures the content between the markers
  • \n\s*\{\{/each\}\} – Newline, optional whitespace, closing marker

The re.DOTALL flag is critical – it makes . match newlines, which is essential for multi-line replacements.
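
To see the capture group in action, here is a small, self-contained demo using the same pattern on the template fragment from earlier:

import re

template = """{{#each services}}
upstream {{name}} {
    server {{host}}:{{port}};
}
{{/each}}"""

pattern = r'\{\{#each\s+services\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}'
match = re.search(pattern, template, flags=re.DOTALL)
print(match.group(1))
# upstream {{name}} {
#     server {{host}}:{{port}};
# }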

Step 2: The Replacement Strategy

We developed a three-phase replacement strategy:

Phase 1: Upstream Blocks

# Replace upstream section first (it's independent)
config_content = re.sub(
    r'\{\{#each\s+services\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
    '\n'.join(upstream_blocks),
    config_content,
    flags=re.DOTALL
)

Why first? Upstream blocks must appear before server blocks in nginx configuration. The order matters.

Phase 2: HTTP Server Blocks

# Replace HTTP section (first occurrence)
config_content = re.sub(
    r'\{\{#each\s+domains\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
    '\n'.join(http_blocks),
    config_content,
    flags=re.DOTALL,
    count=1  # Only replace the FIRST occurrence
)

The count=1 parameter is crucial – it ensures we only replace the HTTP section, not the HTTPS section that comes later.

Phase 3: HTTPS Server Blocks (or Removal in LOCAL mode)

if not self.local_mode:
    # Replace HTTPS section (second occurrence)
    config_content = re.sub(
        r'\{\{#each\s+domains\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
        '\n'.join(https_blocks),
        config_content,
        flags=re.DOTALL,
        count=1  # Only replace the SECOND occurrence (after HTTP was replaced)
    )
else:
    # In LOCAL mode, remove HTTPS section entirely
    config_content = re.sub(
        r'\{\{#each\s+domains\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
        '',
        config_content,
        flags=re.DOTALL,
        count=1
    )

Step 3: Cleanup – Handling Conditional Markers

The template also contains conditional markers like {{#if service}} and {{else}}. These need to be removed after replacement:

# Remove {{#if service}} markers
config_content = re.sub(r'\{\{#if\s+service\s*\}\}\s*\n', '', config_content)
# Remove {{else}} markers
config_content = re.sub(r'\n\s*\{\{\s*else\s*\}\}\s*\n', '\n', config_content)
# Remove {{/if}} markers
config_content = re.sub(r'\n\s*\{\{/if\}\}\s*\n', '\n', config_content)

The Complexity Behind the Simplicity:

What looks like simple regex replacement is actually a carefully orchestrated sequence:

  1. Order matters: Upstream → HTTP → HTTPS (or remove HTTPS)
  2. Count matters: count=1 ensures we replace the correct occurrence
  3. Flags matter: re.DOTALL is essential for multi-line matching
  4. Cleanup matters: Conditional markers must be removed after replacement

The Result:

  • Perfect nginx configuration generation
  • Preserved whitespace and formatting
  • Correct order of directives
  • No template artifacts left behind
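
Putting the three phases together, here is a condensed, runnable sketch of the replacement sequence; the one-line toy blocks stand in for the real generated content:

import re

template = """{{#each services}}
upstream {{name}} {
    server {{host}}:{{port}};
}
{{/each}}

{{#each domains}}
server { listen 80; server_name {{name}}; }
{{/each}}

{{#each domains}}
server { listen 443 ssl; server_name {{name}}; }
{{/each}}"""

EACH = r'\{\{#each\s+%s\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}'

upstreams = ["upstream app1 {\n    server app1:8080;\n}"]
http_blocks = ["server { listen 80; server_name app1.com; return 301 https://$host$request_uri; }"]
https_blocks = ["server { listen 443 ssl; server_name app1.com; }"]

# Phase 1: upstream blocks (independent, replaced first)
out = re.sub(EACH % "services", "\n".join(upstreams), template, flags=re.DOTALL)
# Phase 2: HTTP server blocks (first {{#each domains}} occurrence only)
out = re.sub(EACH % "domains", "\n".join(http_blocks), out, flags=re.DOTALL, count=1)
# Phase 3: HTTPS server blocks (the remaining occurrence)
out = re.sub(EACH % "domains", "\n".join(https_blocks), out, flags=re.DOTALL, count=1)
print(out)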

Real-World Example:

Input Template:

{{#each services}}
upstream {{name}} {
    server {{host}}:{{port}};
}
{{/each}}

{{#each domains}}
server {
    server_name {{name}};
    {{#if service}}
    location / {
        proxy_pass http://{{service}};
    }
    {{else}}
    location / {
        return 503;
    }
    {{/if}}
}
{{/each}}

Generated Output (for app1:app1:8080, app2:app2:8081, domain app1.com:app1):

upstream app1 {
    server app1:8080;
}
upstream app2 {
    server app2:8081;
}

server {
    server_name app1.com;
    location / {
        proxy_pass http://app1;
    }
}

The Magic: All template markers are gone, replaced with actual nginx configuration. The system thinks about what needs to be generated and generates it perfectly.

LOCAL Mode Configuration – The Deep Dive

The Problem: The template always generated HTTPS configurations, but in LOCAL mode, we don’t have SSL certificates. This is a fundamental architectural challenge. In production, we want HTTPS with SSL certificates. In development, we want HTTP without SSL complexity. The same system must work in both environments.

The Initial Problem – Why It Failed:

When we first tested the system locally, we saw this error:

nginx: [emerg] cannot load certificate "/etc/letsencrypt/live/app1.local/fullchain.pem": BIO_new_file() failed

This happened because:

  1. The template always generated HTTPS server blocks
  2. HTTPS blocks require SSL certificates
  3. Local development doesn’t have SSL certificates
  4. Nginx failed to start because it couldn’t find the certificates

We tried several workarounds:

  • Workaround 1: Generate self-signed certificates for local development
    • Problem: Complex setup, still requires certificate management
  • Workaround 2: Comment out SSL directives in local mode
    • Problem: Nginx still tries to load certificates if ssl_certificate directive exists
  • Workaround 3: Use different templates for local and production
    • Problem: Code duplication, maintenance nightmare

The Solution – The Dual-Mode Architecture:

We implemented a dual-mode system that fundamentally changes how configuration is generated:

Step 1: The local_mode Parameter

We added a local_mode boolean parameter to the NginxConfigGenerator class:

class NginxConfigGenerator:
    def __init__(self, config_file, template_file, output_file, local_mode=False):
        self.config_file = config_file
        self.template_file = template_file
        self.output_file = output_file
        self.local_mode = local_mode  # The magic flag
        self.config = None

This flag is checked at multiple points in the generation process to alter behavior.

Step 2: HTTP Block Generation – The Conditional Logic

In LOCAL mode, HTTP blocks serve directly (no redirect). In production mode, HTTP blocks redirect to HTTPS:

# Generate HTTP blocks
http_blocks = []
for domain in self.config['domains']:
    service_name = domain.get('service', '')
    
    if self.local_mode:
        # LOCAL MODE: Serve directly on HTTP (no redirect)
        if service_name:
            service_config = f"""    # Route to appropriate backend service
    location / {{
        proxy_pass http://{service_name};
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header HTTPS "off";  # Important: Set to "off" in local mode
        
        # Proxy timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }}"""
        else:
            service_config = """    # Default response if no service is configured
    location / {
        return 503 "Service not configured for this domain";
    }"""
        
        http_block = f"""server {{
    listen 80;
    listen [::]:80;
    server_name {domain['name']};

    # Logging
    access_log /var/log/nginx/{domain['name']}_access.log logger-json;
    error_log /var/log/nginx/{domain['name']}_error.log;

{service_config}

    # Health check endpoint
    location /health {{
        access_log off;
        return 200 "healthy\\n";
        add_header Content-Type text/plain;
    }}
}}"""
    else:
        # PRODUCTION MODE: Redirect HTTP to HTTPS
        http_block = f"""server {{
    listen 80;
    listen [::]:80;
    server_name {domain['name']};

    # Redirect all HTTP traffic to HTTPS
    location / {{
        return 301 https://$host$request_uri;
    }}
}}"""
    http_blocks.append(http_block)

The Key Differences:

  • HTTP behavior: LOCAL mode serves directly; production mode redirects to HTTPS
  • SSL certificates: not required in LOCAL mode; required in production mode
  • HTTPS blocks: removed entirely in LOCAL mode; generated normally in production mode
  • Proxy header: HTTPS "off" in LOCAL mode; HTTPS "on" in production mode
  • Ports: 80 only in LOCAL mode; 80 (redirect) plus 443 (serve) in production mode

Step 3: HTTPS Block Removal in LOCAL Mode

The critical insight: In LOCAL mode, we don’t just skip HTTPS blocks—we remove them entirely:

# Replace HTTPS section (second occurrence) - only if not in LOCAL mode
if not self.local_mode:
    # PRODUCTION: Replace HTTPS section with generated blocks
    config_content = re.sub(
        r'\{\{#each\s+domains\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
        '\n'.join(https_blocks),
        config_content,
        flags=re.DOTALL,
        count=1
    )
else:
    # LOCAL MODE: Remove HTTPS section entirely
    config_content = re.sub(
        r'\{\{#each\s+domains\s*\}\}\s*\n(.*?)\n\s*\{\{/each\}\}',
        '',  # Empty string - removes the section completely
        config_content,
        flags=re.DOTALL,
        count=1
    )

Why This Matters:

If we left the HTTPS section in the template (even empty), nginx might still try to load SSL certificates. By removing it entirely, we ensure nginx never sees SSL directives in local mode.

Step 4: Entrypoint Integration

The entrypoint script must pass the --local flag to the Python script:

generate_nginx_config() {
    log_info "Generating nginx configuration..."
    
    # Create combined config file
    local combined_config="/tmp/combined_config.json"
    python3 -c "
import json
domains_data = json.load(open('$DOMAINS_CONFIG'))
services_data = json.load(open('$SERVICES_CONFIG'))
combined = {
    'domains': domains_data.get('domains', []),
    'services': services_data.get('services', [])
}
with open('$combined_config', 'w') as f:
    json.dump(combined, f, indent=2)
"
    
    # Generate nginx configuration
    local local_flag=""
    if [[ "$LOCAL" == "true" ]]; then
        local_flag="--local"  # The magic flag
    fi
    
    if python3 /usr/local/bin/generate-config.py \
        --config "$combined_config" \
        --template "$NGINX_TEMPLATE" \
        --output "$NGINX_CONF" \
        $local_flag; then  # Pass the flag
        log_info "Nginx configuration generated successfully."
        return 0
    else
        log_error "Failed to generate nginx configuration."
        return 1
    fi
}

The Watcher Integration – The Critical Detail:

The watcher (for configuration monitoring) must also pass the --local flag:

monitor_config_changes() {
    log_info "Starting configuration file monitoring..."
    
    local local_flag=""
    if [[ "$LOCAL" == "true" ]]; then
        local_flag="--local"
    fi
    
    # Export LOCAL flag so it's available in the bash -c command
    export LOCAL_FLAG="$local_flag"
    export NGINX_TEMPLATE_PATH="$NGINX_TEMPLATE"
    export NGINX_CONF_PATH="$NGINX_CONF"
    
    watchexec -r -w "$DOMAINS_CONFIG" -w "$SERVICES_CONFIG" -w "$NGINX_TEMPLATE" \
        -- bash -c '
            echo "Configuration file changed, regenerating nginx config..."
            # Critical: pass the --local flag here too
            python3 /usr/local/bin/generate-config.py \
                --config /tmp/combined_config.json \
                --template "$NGINX_TEMPLATE_PATH" \
                --output "$NGINX_CONF_PATH" \
                $LOCAL_FLAG && \
            nginx -s reload && \
            echo "Nginx configuration reloaded successfully" || \
            echo "Failed to regenerate nginx configuration"
        ' &
    # ...
}

Why This Matters:

If the watcher doesn’t pass the --local flag, configuration changes in local mode would regenerate HTTPS blocks, causing nginx to fail when it tries to reload.

The Result:

LOCAL Mode Output:

# Upstream blocks
upstream app1 {
    server app1:8080;
}

# HTTP server blocks (no redirect, serves directly)
server {
    listen 80;
    listen [::]:80;
    server_name app1.local;
    
    location / {
        proxy_pass http://app1;
        proxy_set_header HTTPS "off";
        # ... proxy settings ...
    }
    
    location /health {
        return 200 "healthy\n";
    }
}

# No HTTPS blocks at all!

Production Mode Output:

# Upstream blocks
upstream app1 {
    server app1:8080;
}

# HTTP server blocks (redirect to HTTPS)
server {
    listen 80;
    listen [::]:80;
    server_name app1.com;
    
    location / {
        return 301 https://$host$request_uri;
    }
}

# HTTPS server blocks (serves with SSL)
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    http2 on;
    server_name app1.com;
    
    ssl_certificate /etc/letsencrypt/live/app1.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/app1.com/privkey.pem;
    # ... SSL settings ...
    
    location / {
        proxy_pass http://app1;
        proxy_set_header HTTPS "on";
        # ... proxy settings ...
    }
}

The Magic: The same codebase, the same template, but completely different behavior based on a single flag. This is intelligent design.

Multi-Domain Parsing – The Deep Dive

The Problem: Parsing comma-separated domain:service pairs and validating them correctly. This sounds simple, but it’s actually quite complex. The input format is flexible (app1.com:app1,app2.com:app2), but we need to handle:

  • Empty entries
  • Missing colons
  • Invalid service names
  • Domains without services
  • Whitespace in various places
  • Port numbers in services

The Input Format:

Users provide configuration in two formats:

Format 1: Environment Variables

DOMAINS="app1.com:app1,app2.com:app2,api.com:api"
SERVICES="app1:app1:8080,app2:app2:8081,api:api:3000"

Format 2: JSON Files

{
  "domains": [
    {"name": "app1.com", "service": "app1"},
    {"name": "app2.com", "service": "app2"},
    {"name": "api.com", "service": "api"}
  ],
  "services": [
    {"name": "app1", "host": "app1", "port": 8080},
    {"name": "app2", "host": "app2", "port": 8081},
    {"name": "api", "host": "api", "port": 3000}
  ]
}

The Challenge:

The entrypoint script must parse environment variables and convert them to JSON format, which the Python script then validates. This requires careful error handling at multiple levels.

Step 1: Domain Parsing – The Bash/Python Hybrid

The entrypoint script uses a Python one-liner embedded in bash to parse domains:

parse_domains_from_env() {
    local domains_json="/tmp/domains_from_env.json"
    
    python3 -c "
import json
import sys
domains = []
errors = []
if '$DOMAINS':
    for i, domain_entry in enumerate('$DOMAINS'.split(',')):
        domain_entry = domain_entry.strip()
        if not domain_entry:
            continue  # Skip empty entries
        
        if ':' in domain_entry:
            parts = domain_entry.split(':', 1)  # Split only on first ':'
            domain_name = parts[0].strip()
            service_name = parts[1].strip() if len(parts) > 1 else ''
            
            if not domain_name:
                errors.append(f'Domain entry {i+1} \"{domain_entry}\" has empty domain name')
                continue
            
            domains.append({
                'name': domain_name,
                'service': service_name
            })
        else:
            # Domain without service - valid use case (returns 503)
            domains.append({
                'name': domain_entry.strip(),
                'service': ''
            })

if errors:
    print('ERROR: Invalid domain entries:', file=sys.stderr)
    for error in errors:
        print(f'  {error}', file=sys.stderr)
    sys.exit(1)

if not domains:
    print('ERROR: No valid domains found in DOMAINS environment variable', file=sys.stderr)
    sys.exit(1)

with open('$domains_json', 'w') as f:
    json.dump({'domains': domains}, f, indent=2)
"
    
    if [ $? -ne 0 ]; then
        log_error "Failed to parse domains from environment variable"
        exit 1
    fi
    
    echo "$domains_json"
}

The Key Insights:

  1. Empty Entry Handling: if not domain_entry: continue – Skips empty entries gracefully
  2. Split on First Colon: split(':', 1) – Prevents issues if domain names contain colons
  3. Whitespace Stripping: domain_entry.strip() – Handles whitespace in various places
  4. Domains Without Services: Valid use case – returns 503 when accessed
  5. Error Collection: Collects all errors before reporting – better user experience
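
Insight 2 deserves a quick demonstration, because it is the detail most easily gotten wrong:

print("app1.com:app1".split(":", 1))        # ['app1.com', 'app1']
print("app1.com:app1:extra".split(":", 1))  # ['app1.com', 'app1:extra'] – everything after the first colon survives intact
print("  status.local  ".strip())           # 'status.local'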

Step 2: Service Parsing – Port Validation

Service parsing is more complex because it includes port validation:

parse_services_from_env() {
    local services_json="/tmp/services_from_env.json"
    
    python3 -c "
import json
import sys
services = []
errors = []
if '$SERVICES':
    for i, service_entry in enumerate('$SERVICES'.split(',')):
        service_entry = service_entry.strip()
        if not service_entry:
            continue  # Skip empty entries
        
        parts = service_entry.split(':')
        if len(parts) < 3:
            errors.append(f'Service entry {i+1} \"{service_entry}\" is invalid. Expected format: name:host:port')
            continue
        
        try:
            port = int(parts[2].strip())
            if port < 1 or port > 65535:
                errors.append(f'Service entry {i+1} \"{service_entry}\" has invalid port {port}. Port must be between 1 and 65535')
                continue
            
            services.append({
                'name': parts[0].strip(),
                'host': parts[1].strip(),
                'port': port
            })
        except ValueError:
            errors.append(f'Service entry {i+1} \"{service_entry}\" has invalid port \"{parts[2].strip()}\". Port must be a number')
            continue

if errors:
    print('ERROR: Invalid service entries:', file=sys.stderr)
    for error in errors:
        print(f'  {error}', file=sys.stderr)
    sys.exit(1)

if not services:
    print('ERROR: No valid services found in SERVICES environment variable', file=sys.stderr)
    sys.exit(1)

with open('$services_json', 'w') as f:
    json.dump({'services': services}, f, indent=2)
"
    
    if [ $? -ne 0 ]; then
        log_error "Failed to parse services from environment variable"
        exit 1
    fi
    
    echo "$services_json"
}

The Port Validation Logic:

  1. Format Validation: Checks for the name:host:port format (rejects entries with fewer than three colon-separated parts)
  2. Type Validation: int(parts[2].strip()) – Converts to integer, catches non-numeric values
  3. Range Validation: if port < 1 or port > 65535 – Ensures valid port range
  4. Error Messages: Clear, actionable error messages with entry number and actual value

Step 3: Cross-Validation – Service Name Validation

The Python script validates that domain service references exist in the services list:

def validate_config(self):
    """Validate the configuration structure"""
    # ... basic structure validation ...
    
    # Get list of valid service names
    valid_service_names = {service['name'] for service in self.config['services']}
    
    # Validate each domain
    for i, domain in enumerate(self.config['domains']):
        # ... domain structure validation ...
        
        # Validate service name if provided
        service_name = domain.get('service', '')
        if service_name and service_name not in valid_service_names:
            print(f"Error: Domain {i} ('{domain['name']}') references service '{service_name}' which doesn't exist in services list")
            print(f"Available services: {', '.join(sorted(valid_service_names))}")
            sys.exit(1)

The Validation Flow:

  1. Collect Valid Services: Creates a set of valid service names for O(1) lookup
  2. Check Each Domain: For each domain, checks if its service reference exists
  3. Helpful Error Messages: Shows which domain has the problem and lists available services

Real-World Examples:

Example 1: Valid Configuration

DOMAINS="app1.com:app1,app2.com:app2"
SERVICES="app1:app1:8080,app2:app2:8081"

Parsed Result:

{
  "domains": [
    {"name": "app1.com", "service": "app1"},
    {"name": "app2.com", "service": "app2"}
  ],
  "services": [
    {"name": "app1", "host": "app1", "port": 8080},
    {"name": "app2", "host": "app2", "port": 8081}
  ]
}

Example 2: Domain Without Service

DOMAINS="app1.com:app1,status.com"
SERVICES="app1:app1:8080"

Parsed Result:

{
  "domains": [
    {"name": "app1.com", "service": "app1"},
    {"name": "status.com", "service": ""}
  ],
  "services": [
    {"name": "app1", "host": "app1", "port": 8080}
  ]
}

Generated Nginx Config for status.com:

server {
    listen 80;
    server_name status.com;
    
    location / {
        return 503 "Service not configured for this domain";
    }
}

Example 3: Invalid Service Reference

DOMAINS="app1.com:nonexistent"
SERVICES="app1:app1:8080"

Error Output:

Error: Domain 0 ('app1.com') references service 'nonexistent' which doesn't exist in services list
Available services: app1

Example 4: Invalid Port

SERVICES="app1:app1:invalid"

Error Output:

ERROR: Invalid service entries:
  Service entry 1 "app1:app1:invalid" has invalid port "invalid". Port must be a number

Example 5: Port Out of Range

SERVICES="app1:app1:99999"

Error Output:

ERROR: Invalid service entries:
  Service entry 1 "app1:app1:99999" has invalid port 99999. Port must be between 1 and 65535

The Result:

  • Robust parsing handles edge cases gracefully
  • Clear error messages help users fix issues quickly
  • Validation prevents invalid configurations from reaching nginx
  • Support for domains without services enables flexible use cases

The Magic: The system doesn’t just parse—it validates. It doesn’t just validate—it helps. Error messages are clear, actionable, and include context.

Certificate Renewal for Multiple Domains – The Deep Dive

The Problem: Renewing certificates for multiple domains without blocking or causing conflicts. This is a complex problem that involves:

  • Certificate expiration detection
  • Renewal scheduling
  • Concurrent renewal prevention
  • Error handling and retry logic
  • Nginx reload coordination
  • Background process management

The Challenge – Why It’s Complex:

  1. Certificate Expiration: Let’s Encrypt certificates expire after 90 days. We need to renew them before expiration (typically 30 days before).
  2. Multiple Domains: Each domain has its own certificate. We can’t renew all at once—we need to handle each individually.
  3. Concurrent Renewals: If multiple renewal processes run simultaneously, they can conflict or cause rate limiting issues.
  4. Rate Limiting: Let’s Encrypt has rate limits. Too many renewal attempts can temporarily block renewals.
  5. Nginx Coordination: After renewal, nginx must reload to use the new certificate. This must happen atomically.

The Solution – The Renewal Architecture:

Step 1: Expiration Detection

The system checks certificate expiration dates:

check_certificate_expiration() {
    local domain=$1
    local cert_file="/etc/letsencrypt/live/$domain/fullchain.pem"
    
    if [ ! -f "$cert_file" ]; then
        return 1  # Certificate doesn't exist
    fi
    
    # Get expiration date
    local expiration_date=$(openssl x509 -enddate -noout -in "$cert_file" 2>/dev/null | cut -d= -f2)
    if [ -z "$expiration_date" ]; then
        return 1  # Can't read expiration date
    fi
    
    # Convert to epoch time
    local expiration_epoch=$(date -d "$expiration_date" +%s 2>/dev/null)
    local current_epoch=$(date +%s)
    local days_until_expiration=$(( (expiration_epoch - current_epoch) / 86400 ))
    
    # Check if renewal is needed (within 30 days)
    if [ $days_until_expiration -le $CERT_RENEWAL_THRESHOLD ]; then
        return 0  # Renewal needed
    else
        return 1  # No renewal needed
    fi
}

The Logic:

  1. Check if certificate file exists
  2. Extract expiration date using OpenSSL
  3. Convert to epoch time for comparison
  4. Calculate days until expiration
  5. Return true if within renewal threshold (30 days)

Step 2: Lock File Mechanism – Preventing Concurrent Renewals

The system uses a lock file to prevent concurrent renewals:

cleanup_stale_locks() {
    local lock_file="/tmp/cert_renewal.lock"
    if [[ -f "$lock_file" ]]; then
        local pid=$(cat "$lock_file" 2>/dev/null)
        if [[ -n "$pid" ]] && ! kill -0 "$pid" 2>/dev/null; then
            log_warn "Found stale lock file from PID $pid. Removing..."
            rm -f "$lock_file"
        fi
    fi
}

acquire_renewal_lock() {
    local lock_file="/tmp/cert_renewal.lock"
    local pid=$$
    
    # Cleanup stale locks first
    cleanup_stale_locks
    
    # Try to create lock file atomically
    if (set -C; echo $pid > "$lock_file") 2>/dev/null; then
        return 0  # Lock acquired
    else
        return 1  # Lock already held
    fi
}

release_renewal_lock() {
    local lock_file="/tmp/cert_renewal.lock"
    rm -f "$lock_file"
}

The Lock Mechanism:

  • Atomic Creation: set -C ensures the file is created atomically (fails if file exists)
  • PID Tracking: Lock file contains the PID of the process holding the lock
  • Stale Lock Detection: Checks if the PID is still running before removing stale locks
  • Automatic Cleanup: Removes lock file after renewal completes
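
For comparison, the same atomic create-or-fail primitive in Python (a sketch only, not part of the shipped scripts):

import os

LOCK_FILE = "/tmp/cert_renewal.lock"

def acquire_renewal_lock() -> bool:
    """Atomically create the lock file; fail if another process already holds it."""
    try:
        # O_CREAT | O_EXCL is the same create-or-fail primitive that `set -C` gives bash
        fd = os.open(LOCK_FILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock:
        lock.write(str(os.getpid()))     # record the holder's PID, as the bash version does
    return True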

Step 3: Individual Domain Renewal

The system renews each domain individually:

renew_certificate_for_domain() {
    local domain=$1
    
    log_info "Checking certificate for $domain..."
    
    # Check if renewal is needed
    if ! check_certificate_expiration "$domain"; then
        log_info "Certificate for $domain is valid. No renewal needed."
        return 0
    fi
    
    log_info "Certificate for $domain expires soon. Renewal needed."
    
    # Try to acquire lock
    if ! acquire_renewal_lock; then
        log_warn "Renewal already in progress. Skipping $domain."
        return 1
    fi
    
    # Renew certificate
    log_info "Renewing certificate for $domain..."
    
    if certbot certonly \
        --force-renewal \
        --non-interactive \
        --agree-tos \
        --email "$CERTBOT_EMAIL" \
        --dns-cloudflare \
        --dns-cloudflare-credentials /etc/cloudflare.ini \
        --dns-cloudflare-propagation-seconds 30 \
        -d "$domain"; then
        log_info "Certificate renewed successfully for $domain"
        release_renewal_lock
        return 0
    else
        log_error "Failed to renew certificate for $domain"
        release_renewal_lock
        return 1
    fi
}

The Renewal Process:

  1. Check if renewal is needed
  2. Acquire lock (prevents concurrent renewals)
  3. Run certbot with Cloudflare DNS-01 challenge
  4. Release lock after completion
  5. Return success/failure status

Step 4: Background Renewal Process

The system runs a background process that continuously checks for certificate renewals:

autorenew_certificates() {
    log_info "Starting certificate auto-renewal process..."
    
    # Convert renewal interval to seconds
    local interval_seconds
    case "$RENEWAL_INTERVAL" in
        *d) interval_seconds=$((${RENEWAL_INTERVAL%d} * 86400)) ;;
        *h) interval_seconds=$((${RENEWAL_INTERVAL%h} * 3600)) ;;
        *m) interval_seconds=$((${RENEWAL_INTERVAL%m} * 60)) ;;
        *) interval_seconds=259200 ;;  # Default: 3 days
    esac
    
    log_info "Renewal check interval: $interval_seconds seconds ($RENEWAL_INTERVAL)"
    
    while true; do
        log_info "Checking certificates for renewal..."
        
        local renewal_failed=false
        local domains_renewed=0
        
        # Renew certificates for each domain
        # (DOMAINS is a comma-separated list of domain:service pairs,
        #  so split on ',' and strip the ":service" suffix)
        local entries entry domain
        IFS=',' read -ra entries <<< "$DOMAINS"
        for entry in "${entries[@]}"; do
            domain="${entry%%:*}"
            if renew_certificate_for_domain "$domain"; then
                domains_renewed=$((domains_renewed + 1))
            else
                renewal_failed=true
            fi
        done
        
        # Reload nginx if any certificates were renewed
        if [ $domains_renewed -gt 0 ]; then
            log_info "Reloading nginx to use renewed certificates..."
            if nginx -s reload; then
                log_info "Nginx reloaded successfully"
            else
                log_error "Failed to reload nginx"
                renewal_failed=true
            fi
        fi
        
        # Wait for next check
        log_info "Next renewal check in $interval_seconds seconds..."
        sleep $interval_seconds
    done
}

The Background Process:

  1. Converts renewal interval to seconds (supports d, h, and m suffixes)
  2. Runs in infinite loop
  3. Checks each domain for renewal
  4. Reloads nginx if any certificates were renewed
  5. Waits for next check interval
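
The suffix handling in step 1 is compact enough to restate in Python, as a sketch for clarity (the shipped code is the bash case statement above):

def interval_to_seconds(interval: str) -> int:
    """Mirror of the bash case statement: '3d', '12h', '30m'; default 3 days."""
    units = {"d": 86400, "h": 3600, "m": 60}
    if interval and interval[-1] in units:
        return int(interval[:-1]) * units[interval[-1]]
    return 259200  # default: 3 days

assert interval_to_seconds("3d") == 259200
assert interval_to_seconds("12h") == 43200
assert interval_to_seconds("30m") == 1800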

Step 5: Error Handling and Retry Logic

The system handles errors gracefully:

renew_certificate_for_domain() {
    local domain=$1
    local max_retries=3
    local retry_count=0
    
    while [ $retry_count -lt $max_retries ]; do
        if certbot certonly ...; then
            return 0  # Success
        fi
        
        retry_count=$((retry_count + 1))
        if [ $retry_count -lt $max_retries ]; then
            log_warn "Renewal attempt $retry_count failed. Retrying in 60 seconds..."
            sleep 60
        fi
    done
    
    log_error "Failed to renew certificate for $domain after $max_retries attempts"
    return 1
}

The Error Handling:

  • Retry Logic: Attempts renewal up to 3 times
  • Fixed Backoff: Waits 60 seconds between retries
  • Graceful Degradation: Continues with other domains even if one fails
  • Logging: Logs all attempts and failures for debugging

The Result:

Certificate Renewal Flow:

1. Background process starts
2. Waits for renewal interval (3 days default)
3. For each domain:
   a. Check expiration date
   b. If renewal needed:
      - Acquire lock
      - Renew certificate
      - Release lock
4. If any certificates renewed:
   - Reload nginx
5. Wait for next check

The Magic: The system doesn’t just renew certificates—it manages them. It prevents conflicts, handles errors gracefully, and coordinates nginx reloads automatically. The renewal process is invisible to the user, running silently in the background.

The Features That Define Us

Feature Set 1: Configuration Flexibility

Environment Variables or JSON Files?

Why choose? Our system supports both:

Method 1: Environment Variables (Simple, Docker-friendly)

DOMAINS=app1.com:app1,app2.com:app2
SERVICES=app1:app1:8080,app2:app2:8081

Method 2: JSON Files (Complex configurations, version control)

{
  "domains": [
    {"name": "app1.com", "service": "app1"},
    {"name": "app2.com", "service": "app2"}
  ]
}

Both methods produce identical results. The system detects which method you’re using and adapts accordingly.
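
Sketched in Python, the detection rule is simple; the variable names follow this article, and the real logic lives in multi-domain-entrypoint.sh:

import os

# If both environment variables are set, use them; otherwise fall back to JSON files
if os.environ.get("DOMAINS") and os.environ.get("SERVICES"):
    print("Using environment variable configuration")
else:
    print("Falling back to JSON configuration files")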

Feature Set 2: Intelligent Error Handling

Validation at Every Step:

  1. Domain Validation:
    • Ensures domain names are not empty
    • Validates service references exist
    • Checks for duplicate domains
  2. Service Validation:
    • Verifies port numbers are valid integers
    • Ensures ports are in valid range (1-65535)
    • Checks for required fields (name, host, port)
  3. Configuration Validation:
    • Tests nginx configuration before applying
    • Provides clear error messages
    • Suggests fixes for common issues

The Error Messages Are Poetry:

Instead of cryptic errors, users see:

ERROR: Domain entry 2 "app.com:" has empty service name
ERROR: Service entry 3 "app:app:invalid" has invalid port "invalid". Port must be a number
ERROR: Domain 1 ('app.com') references service 'nonexistent' which doesn't exist in services list
Available services: app1, app2, api

Feature Set 3: Dynamic Configuration Updates

The Watcher:

Our system doesn’t just generate configuration once—it watches for changes. Using watchexec, it monitors:

  • Domain configuration files
  • Service configuration files
  • Nginx template files

When any of these change, it automatically:

  1. Regenerates the nginx configuration
  2. Tests the new configuration
  3. Reloads nginx if the test passes
  4. Logs the result

The Result: Zero-downtime configuration updates. Change a domain, and within seconds, the new configuration is live.

The Testing Odyssey – Proving Ourselves

The First Test: Does It Even Work?

Every great journey begins with a single test. For us, it was a simple question: Does it work?

We created three test applications:

  • app1.local → A simple HTML page
  • app2.local → Another simple HTML page
  • api.local → A third simple HTML page

Each running in its own container, each on its own port, each waiting to be discovered.

The Test Setup:

We crafted a docker-compose.yml file that brought together:

  • The nginx smart proxy
  • Three test applications
  • Proper networking
  • Environment variable configuration

The Moment of Truth:

We started the containers. We watched the logs. We saw the magic happen:

[INFO] Parsing domains from environment variables...
[INFO] Found 3 domains: app1.local, app2.local, api.local
[INFO] Parsing services from environment variables...
[INFO] Found 3 services: app1, app2, api
[INFO] Generating nginx configuration...
[INFO] Nginx configuration generated successfully.
[INFO] Starting nginx...

And then we tested (with each .local hostname resolving to the Docker host, for example via /etc/hosts):

curl http://app1.local:8080
curl http://app2.local:8080
curl http://api.local:8080

The Result: Perfect. All three applications responded correctly. Each domain routed to its correct service. The system worked.

But we weren’t done. We had to test the edge cases.

The Resilience Test: What Happens When One Service Fails?

The Hypothesis: If one service fails, others should continue working.

The Test:

  1. Start all three services
  2. Verify all three work
  3. Stop app1
  4. Test all three again

The Result:

  • app1.local → 502 Bad Gateway (expected, service is down)
  • app2.local → Perfect response (unaffected)
  • api.local → Perfect response (unaffected)

The Insight: This wasn’t just a test—it was proof that our architecture was sound. Services were truly isolated. One failure didn’t cascade to others. This is the essence of resilience.

The Configuration Update Test: Can We Change On The Fly?

The Hypothesis: Configuration changes should be applied without downtime.

The Test:

  1. Start the system with 2 domains
  2. Add a third domain to the configuration
  3. Watch the watcher detect the change
  4. Verify the new configuration is applied

The Result: Within seconds of changing the configuration file, the watcher detected the change, regenerated the nginx configuration, tested it, and reloaded nginx. The new domain was live. Zero downtime.

The Magic: This is what makes our system “smart.” It doesn’t just generate configuration once—it adapts. It evolves. It responds to change.

The Local Mode Test: Development vs Production

The Hypothesis: The same system should work in both development and production.

The Challenge: In development, we don’t have SSL certificates. We don’t want HTTPS redirects. We just want HTTP.

The Solution: The --local flag. A simple boolean that transforms the entire configuration:

  • LOCAL mode: HTTP-only, no SSL, direct routing
  • Production mode: HTTPS with SSL, HTTP-to-HTTPS redirects

The Test:

  1. Run in LOCAL mode (LOCAL=true)
  2. Verify HTTP-only configuration
  3. Verify no SSL certificate errors
  4. Verify direct service routing

The Result: Perfect. The system seamlessly adapted to the local environment. No SSL errors. No certificate issues. Just pure, simple routing.

The Insight: This dual-mode operation wasn’t just a feature—it was a philosophy. The same codebase, the same configuration, but different behavior based on context. This is intelligent design.

The Production Deployment – Going Live

The Preparation: What We Learned

Before going to production, we had to prepare. We had learned many lessons from local testing:

  1. Configuration is Critical: Small errors in configuration can cause big problems
  2. Validation is Essential: Every input must be validated
  3. Error Messages Matter: Clear error messages save hours of debugging
  4. Isolation is Key: Services must be independent
  5. Monitoring is Non-Negotiable: We need to know what’s happening

The Production Setup: The Real World

The Environment:

  • DigitalOcean Droplet
  • Docker and Docker Compose
  • Cloudflare DNS
  • Let’s Encrypt SSL certificates
  • Multiple domains and applications

The Configuration:

We used environment variables for simplicity:

DOMAINS=app1.com:app1,app2.com:app2,api.com:api
SERVICES=app1:app1:8080,app2:app2:8081,api:api:3000
CLOUDFLARE_EMAIL=your-email@example.com
CLOUDFLARE_API_KEY=your-api-key
LOCAL=false

The First Deployment:

We deployed. We watched the logs. We saw the magic happen:

[INFO] Parsing domains from environment variables...
[INFO] Found 3 domains: app1.com, app2.com, api.com
[INFO] Verifying domain IP addresses...
[INFO] app1.com resolves to 203.0.113.1 ✓
[INFO] app2.com resolves to 203.0.113.1 ✓
[INFO] api.com resolves to 203.0.113.1 ✓
[INFO] Checking SSL certificates...
[INFO] Generating certificate for app1.com...
[INFO] Certificate generated successfully for app1.com ✓
[INFO] Generating certificate for app2.com...
[INFO] Certificate generated successfully for app2.com ✓
[INFO] Generating certificate for api.com...
[INFO] Certificate generated successfully for api.com ✓
[INFO] Generating nginx configuration...
[INFO] Nginx configuration generated successfully.
[INFO] Starting nginx...
[INFO] Nginx started successfully.

The Result: Perfect. All three domains were live. All three had SSL certificates. All three were routing correctly.

The SSL Certificate Management: The Background Process

The Challenge: SSL certificates expire. They need to be renewed. This must happen automatically.

The Solution: A background renewal process that:

  1. Runs every 3 days by default
  2. Checks each domain’s certificate expiration
  3. Renews certificates that expire within 30 days
  4. Reloads nginx after successful renewal
  5. Logs all renewal attempts

The Implementation:

while true; do
    check_and_renew_certificates
    sleep 259200  # 3 days
done

The Result: Certificates are renewed automatically. We never worry about expiration. We never get certificate errors. The system maintains itself.

The Monitoring: Watching Over Production

What We Monitor:

  • Nginx access logs (JSON format for easy parsing)
  • Nginx error logs (domain-specific)
  • Certificate renewal status
  • Configuration generation success/failure
  • Service health endpoints

The Logs Tell a Story:

Every request is logged in JSON format:

{
  "time": "2024-01-15T10:30:45Z",
  "remote_addr": "203.0.113.50",
  "request": "GET / HTTP/1.1",
  "status": 200,
  "bytes_sent": 1024,
  "domain": "app1.com"
}

This makes it easy to:

  • Track traffic patterns
  • Debug issues
  • Monitor performance
  • Analyze usage

The Health Checks:

Every domain has a health check endpoint:

GET /health

This returns a simple 200 OK response, allowing load balancers and monitoring systems to verify the service is up.

The VPN/Private Access Challenge – Security Meets Flexibility

The Problem: One App Behind VPN, Others Public

The Scenario:

  • Three applications: app1.com, app2.com, api.com
  • Requirement: app1.com must only be accessible via VPN
  • Requirement: app2.com and api.com must remain publicly accessible

The Challenge: How do we implement this in a system designed for public access?

The Solution Space: Three Options

We explored three different approaches, each with its own trade-offs:

Option 1: Custom Nginx Template with IP Whitelist

The Concept: Create a custom nginx template that includes IP whitelist directives for specific domains.

The Implementation:

  1. Mark domains with VPN IP restrictions in configuration
  2. Modify generate-config.py to detect VPN IPs
  3. Generate allow/deny directives in the nginx configuration using a whitelist approach
  4. Allow specific VPN IPs first, then deny all other IPs

The Security Model: The implementation uses a whitelist approach where allowed VPN IPs are specified first, followed by a deny-all directive. This ensures that nginx processes the allow directives first, matching any VPN IPs, and then denies all other IPs. This order is critical because nginx processes allow/deny directives sequentially, and the first matching directive wins.

The Pros:

  • Fully integrated with the existing system
  • Single nginx instance
  • Automatic configuration generation

The Cons:

  • Requires code changes to generate-config.py
  • Requires template modifications
  • More complex configuration format

Option 2: Manual Post-Processing

The Concept: Generate the nginx configuration normally, then use sed to inject IP restrictions into specific server blocks.

The Implementation:

  1. Generate nginx configuration as usual
  2. Use sed to find specific server blocks
  3. Inject allow/deny directives before the location blocks
  4. Apply the modified configuration

The Pros:

  • No code changes required
  • Works with existing system
  • Simple post-processing script

The Cons:

  • Requires manual intervention
  • Less elegant than integrated solution
  • Potential for errors in sed patterns

Option 3: Separate Nginx Instance for VPN Apps

The Concept: Run a dedicated nginx container with hardcoded IP restrictions for VPN-only applications, while the main nginx-smart-proxy handles public apps.

The Implementation:

  1. Create a separate docker-compose.yml for VPN apps
  2. Use a custom nginx configuration with IP whitelist
  3. Run on a separate port or internal network
  4. Keep public apps on the main nginx-smart-proxy

The Pros:

  • Clean separation of concerns
  • No code changes required
  • Easy to maintain
  • Clear security boundaries

The Cons:

  • Requires two nginx instances
  • Slightly more complex deployment
  • More resource usage

The Decision: Why Option 3?

We chose Option 3 because:

  1. Security: Clear separation between public and private apps
  2. Maintainability: Each nginx instance has a single purpose
  3. Flexibility: Easy to add more VPN apps without affecting public apps
  4. Simplicity: No complex code changes or post-processing

The Implementation:

# docker-compose.vpn.yml
version: '3.8'
services:
  nginx-vpn:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - ./vpn-nginx.conf:/etc/nginx/nginx.conf:ro
      - ./vpn-ssl:/etc/nginx/ssl:ro
    networks:
      - vpn-network

networks:
  vpn-network:
    internal: true

The Configuration:

# vpn-nginx.conf
server {
    listen 80;
    server_name app1.com;

    # Allow VPN IP ranges only (allow/deny belong in server/location context)
    allow 10.0.0.0/8;      # Private network
    allow 172.16.0.0/12;   # Private network
    allow 192.168.0.0/16;  # Private network
    allow YOUR_VPN_IP_RANGE;  # Your specific VPN range
    deny all;

    location / {
        proxy_pass http://app1:8080;
        # ... proxy settings ...
    }
}

The Result: Clean, secure, maintainable. VPN apps are isolated. Public apps remain public. Everyone is happy.

The CI/CD Pipeline – Automating Everything

The Vision: Push and Deploy

The Goal: Every push to the main branch should automatically:

  1. Build the Docker image
  2. Push it to the registry
  3. Deploy it to production

The Reality: We built this using GitLab CI.

The Pipeline Configuration

The .gitlab-ci.yml:

build:
  stage: build
  script:
    - docker build -t $IMAGE_NAME:$CI_COMMIT_SHA .
    - docker push $IMAGE_NAME:$CI_COMMIT_SHA
    - docker tag $IMAGE_NAME:$CI_COMMIT_SHA $IMAGE_NAME:latest
    - docker push $IMAGE_NAME:latest
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      changes:
        - Dockerfile
        - generate-config.py
        - multi-domain-entrypoint.sh
        - etc/nginx.conf

The Magic: This pipeline:

  1. Builds the Docker image with a unique tag (commit SHA)
  2. Pushes it to the registry
  3. Tags it as latest
  4. Pushes the latest tag

The Trigger: Only runs when specific files change:

  • Dockerfile (Docker image changes)
  • generate-config.py (Configuration generation changes)
  • multi-domain-entrypoint.sh (Entrypoint script changes)
  • etc/nginx.conf (Nginx configuration changes)

The Result: Automated builds. Automated deployments. Zero manual intervention.

The Evolution: From Manual to Automatic

Before:

  1. Make code changes
  2. Build Docker image manually
  3. Push to registry manually
  4. Deploy to production manually
  5. Hope nothing breaks

After:

  1. Make code changes
  2. Push to GitLab
  3. Watch the pipeline run
  4. Verify the deployment
  5. Celebrate

The Time Saved: Hours. Literally hours. Every deployment.

The Lessons Learned – Wisdom from the Journey

Lesson 1: Validation is Non-Negotiable

The Insight: Every input must be validated. Every configuration must be checked. Every assumption must be verified.

The Implementation:

  • Domain name validation (not empty, valid format)
  • Service name validation (exists in services list)
  • Port number validation (integer, valid range)
  • Configuration file validation (syntax, structure)

The Result: Fewer errors. Faster debugging. More reliable system.

Lesson 2: Error Messages Are User Experience

The Insight: An error message is not just for debugging—it’s for the user. It should be clear, actionable, and helpful.

Before:

ERROR: Invalid configuration

After:

ERROR: Domain entry 2 "app.com:" has empty service name.
Available services: app1, app2, api
Please check your DOMAINS environment variable.

The Result: Users can fix errors themselves. Support tickets decrease. Everyone is happier.

Lesson 3: Local Mode is Essential

The Insight: Development and production are different environments. The system must adapt.

The Implementation:

  • --local flag for development mode
  • HTTP-only configuration in local mode
  • HTTPS configuration in production mode
  • Automatic mode detection

The Result: Same codebase, different behavior. Perfect for both environments.

Lesson 4: Isolation is Key to Resilience

The Insight: Services must be independent. One failure should not cascade.

The Implementation:

  • Independent upstream configurations
  • Separate server blocks
  • Isolated error handling
  • Individual health checks

The Result: System continues working even when individual services fail.

Lesson 5: Automation is the Future

The Insight: Manual tasks are error-prone. Automation is reliable.

The Implementation:

  • Automatic configuration generation
  • Automatic SSL certificate management
  • Automatic configuration updates
  • Automatic CI/CD pipeline

The Result: Less human error. More reliability. Faster deployments.

The Future Vision – Where We’re Going

The Roadmap: What’s Next?

Phase 1: Enhanced Monitoring (Current)

  • JSON log parsing
  • Health check endpoints
  • Basic metrics collection

Phase 2: Advanced Features (Planned)

  • Rate limiting per domain
  • Request/response transformation
  • Custom error pages
  • WebSocket support
  • HTTP/2 optimization

Phase 3: Intelligence (Future)

  • Automatic traffic analysis
  • Predictive scaling
  • Anomaly detection
  • Self-healing capabilities

The Vision: The Intelligent Reverse Proxy

The Goal: A reverse proxy that:

  • Learns from traffic patterns
  • Adapts to changing conditions
  • Optimizes itself automatically
  • Predicts and prevents issues
  • Self-heals when problems occur

The Path: Machine learning. Analytics. Automation. Intelligence.

The Dream: A system that doesn’t just route traffic—it understands it. It optimizes it. It protects it. It evolves with it.

The Journey Continues

This is not the end. This is just the beginning. The nginx-smart-proxy is alive. It’s evolving. It’s growing. Every day brings new challenges. Every challenge brings new solutions. Every solution brings new possibilities.

We started with a simple question: Can we automate this?

We answered with a resounding: Yes, and we can make it intelligent.

The journey continues. The code evolves. The system improves. The vision expands.

To those who will maintain this system:

  • Read the code. Understand the architecture. Appreciate the design.
  • Test thoroughly. Validate everything. Monitor continuously.
  • Document your changes. Explain your decisions. Share your insights.
  • Remember: This is not just code—it’s a living system. Treat it with respect.

To those who will use this system:

  • Use it wisely. Configure it carefully. Monitor it closely.
  • Trust the automation, but verify the results.
  • Embrace the simplicity, but understand the complexity beneath.
  • Remember: This is a tool. Use it well, and it will serve you well.

The Final Word:

In the vast digital landscape where applications struggle to find their voice, where developers battle with manual configurations, and where SSL certificates expire silently in the night—we have created something extraordinary. We have created intelligence. We have created automation. We have created the future.

The journey continues. The code evolves. The system improves. The vision expands.

Welcome to the nginx-smart-proxy. Welcome to the future.
