What Happened

SMO1.io experienced a production outage affecting two core flows: link management (create, list, update) and the marketing landing page. Both failures had the same root pattern: deployment configuration that worked once but drifted silently over time.

  • Database schema drift: A new feature (link protection with passwords/PINs) was merged with its migration file, but the migration was never executed on the production database. The code expected a protection_hint column that didn’t exist, so every query that referenced it failed.
  • DNS record gap: During a DNS provider migration to Cloudflare, the CNAME record for the API subdomain (purr.smo1.io) was not carried over. The A record pointing at Vercel also held an outdated IP, so the landing page origin returned errors.

Business Impact

  • All authenticated users were unable to create or view links
  • The public landing page at smo1.io showed “Internal Server Error”
  • Short link redirects (via Cloudflare Workers) continued working since they use a different API path

Resolution

  1. Auto-migrations on startup: SQL migrations are now embedded in the Go binary and run automatically when the server starts. This removes the manual step behind the “forgot to run migrations” class of failures.
  2. DNS records restored: The missing API CNAME was re-created and the outdated Vercel A record IP was corrected in Cloudflare.
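
The auto-migration step can be sketched roughly as follows. This is a minimal illustration, not the actual SMO1.io implementation: the file names, the column type, and the functions pendingMigrations, runMigrations, and demoFS are all hypothetical. In a real binary the files would typically come from an embed.FS populated by a //go:embed directive; since embed.FS implements fs.FS, the sketch accepts any fs.FS and uses an in-memory file system so it runs without files on disk. The apply callback stands in for executing the SQL and recording the migration as applied.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// pendingMigrations lists the .sql files in fsys that are not yet marked as
// applied. fs.ReadDir returns entries sorted by filename, so numbered
// prefixes (001_, 002_, ...) give a stable execution order.
func pendingMigrations(fsys fs.FS, applied map[string]bool) ([]string, error) {
	entries, err := fs.ReadDir(fsys, ".")
	if err != nil {
		return nil, fmt.Errorf("read migrations: %w", err)
	}
	var pending []string
	for _, e := range entries {
		name := e.Name()
		if e.IsDir() || !strings.HasSuffix(name, ".sql") || applied[name] {
			continue
		}
		pending = append(pending, name)
	}
	return pending, nil
}

// runMigrations applies each pending migration in order and stops at the
// first error, so the server can refuse to start on a half-migrated schema.
// It returns the names of the migrations that were applied successfully.
func runMigrations(fsys fs.FS, applied map[string]bool, apply func(name, body string) error) ([]string, error) {
	pending, err := pendingMigrations(fsys, applied)
	if err != nil {
		return nil, err
	}
	var ran []string
	for _, name := range pending {
		body, err := fs.ReadFile(fsys, name)
		if err != nil {
			return ran, fmt.Errorf("read %s: %w", name, err)
		}
		if err := apply(name, string(body)); err != nil {
			return ran, fmt.Errorf("apply %s: %w", name, err)
		}
		ran = append(ran, name)
	}
	return ran, nil
}

// demoFS stands in for an embed.FS. File names and SQL are hypothetical.
func demoFS() fs.FS {
	return fstest.MapFS{
		"001_create_links.sql":        {Data: []byte("CREATE TABLE links (id INTEGER PRIMARY KEY);")},
		"002_add_protection_hint.sql": {Data: []byte("ALTER TABLE links ADD COLUMN protection_hint TEXT;")},
	}
}

func main() {
	// Assumption: 001 was already applied in production, 002 was forgotten.
	applied := map[string]bool{"001_create_links.sql": true}
	ran, err := runMigrations(demoFS(), applied, func(name, body string) error {
		fmt.Printf("applying %s: %s\n", name, body)
		return nil // a real callback would execute body against the database
	})
	if err != nil {
		fmt.Println("startup aborted:", err)
		return
	}
	fmt.Println("applied:", ran)
}
```

Calling something like runMigrations before the HTTP listener starts means a schema that is behind the code gets caught at deploy time rather than on the first user query.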

Takeaway

Manual deployment steps (run migrations, check DNS records) are single points of failure. Every manual step that can be automated should be, especially database migrations, where a skipped step stays invisible until it breaks production. A deploy checklist helps, but embedding the check in the deploy process itself is more reliable.
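
One way to embed a DNS check in the deploy process is to compare the records the service depends on against what a resolver actually returns. The sketch below is an assumption-laden illustration, not SMO1.io's tooling: Record, LookupFunc, checkRecords, demoRecords, and demoLookup are hypothetical names, and the expected values are placeholder documentation addresses. The lookup is injected so the demo runs offline. Note that behind a proxying DNS provider a public lookup may not reveal the origin record, so a real check might need to query the provider's own record list instead.

```go
package main

import (
	"fmt"
	"sort"
)

// Record describes one DNS name and the values the deployment expects.
type Record struct {
	Name string
	Want []string
}

// LookupFunc resolves a name to its current values. Injecting it keeps the
// check independent of where the answer comes from.
type LookupFunc func(name string) ([]string, error)

// checkRecords returns one human-readable problem per missing or drifted
// record; an empty result means every expectation matched.
func checkRecords(records []Record, lookup LookupFunc) []string {
	var problems []string
	for _, r := range records {
		got, err := lookup(r.Name)
		if err != nil {
			problems = append(problems, fmt.Sprintf("%s: lookup failed: %v", r.Name, err))
			continue
		}
		if !sameValues(got, r.Want) {
			problems = append(problems, fmt.Sprintf("%s: got %v, want %v", r.Name, got, r.Want))
		}
	}
	return problems
}

// sameValues compares two string slices while ignoring order.
func sameValues(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	a = append([]string(nil), a...) // copy so the caller's slices stay untouched
	b = append([]string(nil), b...)
	sort.Strings(a)
	sort.Strings(b)
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// demoRecords holds placeholder expectations (hypothetical values).
func demoRecords() []Record {
	return []Record{
		{Name: "smo1.io", Want: []string{"203.0.113.10"}},
		{Name: "purr.smo1.io", Want: []string{"api-origin.example.com."}},
	}
}

// demoLookup simulates the incident: a stale A record and a missing CNAME.
func demoLookup(name string) ([]string, error) {
	if name == "smo1.io" {
		return []string{"203.0.113.99"}, nil
	}
	return nil, fmt.Errorf("no such host")
}

func main() {
	problems := checkRecords(demoRecords(), demoLookup)
	for _, p := range problems {
		fmt.Println("DNS check:", p)
	}
	// A real deploy step would exit non-zero here to block the release.
	fmt.Printf("%d problem(s) found\n", len(problems))
}
```

Run as a pipeline step, a check like this turns "someone remembers to verify DNS" into a gate that fails the deploy when an expected record is missing or stale.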