How we migrated thousands of Shopify users with zero downtime

two weeks ago our team met for the first time in Split, Croatia.

after a couple days of beach time, wine tours, restaurants, and loud guitar in our Airbnb, it was time to dig into "business" objectives.

four 18+ hour days in the terminal.

below is a recap of our first (and hopefully last) data migration from a LAMP app on Elastic Beanstalk + RDS MySQL to a Rails / Vue.js API on Heroku, using PostgreSQL.

why migrate?

before doing anything in life, ask why.

we got into the mess of 2 codebases when we acquired Notify, a Shopify-only app that showed off recent sales.

our first "big idea" was to integrate with other cart platforms like Magento, BigCommerce, WooCommerce, etc.

each of these took 1-3 weeks to launch, which kept us busy for 4 months.

in the background, however, we knew that every website benefits from social proof, so a separate team began building a standalone version of Notify on Rails.

a few months later we had 2 fully operational codebases, each with thousands of users and built in different stacks.

step 1 - trim scope

data migrations are a fancy way to "copy and paste" from one server to another.

given Fomo mostly consumes data (3rd party APIs, analytics stats, etc), we decided to skip a massive mapping procedure and instead trigger 'imports' of historical data, on-demand.

here's an example.

suppose a Shopify store has been using Fomo for 2.5 years, and has 50,000 orders in our database.

we could map over 50,000 orders and their attributes from our MySQL (source) database to our PostgreSQL (destination) database, but why?

since the average Fomo user only displays a few dozen recent events, triggering a "grab last 50" type of import covered 95% of the use cases.
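the "grab last 50" idea can be sketched in a few lines of Ruby. this is an illustrative reconstruction, not Fomo's actual import code -- the `LegacyOrder` struct and field names are assumptions:

```ruby
# hypothetical sketch of the on-demand import: instead of mapping a store's
# full order history, only pull the most recent events a user would display.
RECENT_EVENT_LIMIT = 50 # "grab last 50" covers the typical notification feed

# stand-in for a row from the legacy MySQL database (fields are illustrative)
LegacyOrder = Struct.new(:id, :customer_name, :city, :created_at)

# keep only the newest orders, mapped into the destination schema
def import_recent_orders(legacy_orders, limit: RECENT_EVENT_LIMIT)
  legacy_orders
    .sort_by(&:created_at)
    .last(limit)
    .map do |order|
      { external_id: order.id, first_name: order.customer_name, location: order.city }
    end
end
```

because imports run on-demand, a store with 50,000 orders costs the same to migrate as a store with 50.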

by copy-pasting only the critical details of each user on the legacy codebase, such as name/email/website, we decreased our procedure completion time from weeks to hours.

step 2 - feature parity


Fomo did an OK job maintaining each codebase, but sometimes we got lazy and a feature on the PHP app was not on the Rails app, or vice versa.

prior to executing our migration, we made a few trade-offs...

  • PHP app allows users to manually import data -- won't build
  • PHP app has a 'product recommendations' integration -- won't build
  • PHP app has a 'classic' theme -- added, but can only be enabled by Fomo admin
  • PHP app supports mobile 'top screen' notifications -- built 1 day after migration (users complained)
  • etc

beyond these nuances, we also had to alert a few integration partners, including Judgeme and Shoelace, to prevent breaking changes.

for us, the most important aspect of a "done for you" migration is in preserving a user's settings and preferences.

in our case, there are millions of possible configurations, and it would be unwise to 'reset' migrated users to the defaults.

to manage this expectation, we wrote a MigrationService script and assured users through multiple emails that they would be migrated with the same config.
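to give a feel for what preserving preferences involves, here is a minimal sketch of the settings-mapping idea. the class name echoes the MigrationService mentioned above, but the setting keys are invented for illustration -- they are not Fomo's real columns:

```ruby
# hedged sketch: copy a legacy user's settings into the new schema so
# nobody gets reset to defaults. key names here are illustrative only.
class MigrationService
  # maps legacy (PHP app) setting keys to their new (Rails app) equivalents
  SETTING_MAP = {
    "template_html"    => "template",
    "display_duration" => "duration_seconds",
    "position"         => "placement"
  }.freeze

  def self.migrate_settings(legacy_settings)
    SETTING_MAP.each_with_object({}) do |(old_key, new_key), migrated|
      migrated[new_key] = legacy_settings[old_key] if legacy_settings.key?(old_key)
    end
  end
end
```

an explicit key map like this also doubles as documentation of every rename between the two schemas.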

step 3 - trim scope some more

beyond the end-user experience, several internal systems were implemented differently across our PHP and Rails apps and needed to be re-imagined.

  • each app was connected to its own Mixpanel instance
  • each app had its own Dunning management
  • each app had its own server-side and customer lifecycle mailers
  • each app was connected proprietarily to our BI / revenue dashboard

an end-to-end migration from our old app to the new one would likely have included steps to resolve each discrepancy above.

we opted to not worry about these components (for now), and focus all our efforts on a seamless migration for our customers.

to prevent legacy users from receiving an onslaught of "new user signup" emails, we patched our migration scripts with a naming convention that pointed all marketing messages to fake accounts @usefomo.com.
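the naming convention trick is simple enough to show inline. this helper is a hypothetical version of the patch -- the exact address format is an assumption, but the idea matches the text: migrated accounts get a placeholder address so lifecycle mailers can't reach real inboxes:

```ruby
# illustrative version of the email-rewriting patch: during migration,
# marketing/lifecycle messages are pointed at fake @usefomo.com accounts
# instead of the legacy user's real address. format is an assumption.
def migration_safe_email(user_id)
  "migrated-user-#{user_id}@usefomo.com"
end
```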

step 4 - scripting

thinking back to our initial decision to import 3rd party data on demand, we realized that the bulk operation of copy-pasting user accounts and websites could be done asynchronously, long before users are given access to the new Fomo dashboard.

rather than mess up our production PostgreSQL database, we pointed our scripts' write operations to local databases on our machines, which matched the schemas and ORM adapters of the production application.
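a dry-run toggle like the one described might look like this. the environment variable name and local database URL are assumptions, not Fomo's actual config:

```ruby
# sketch of the "dry run" setup: migration scripts write to a local
# Postgres that mirrors production's schema, unless explicitly told
# to target the real database.
def dry_run?
  ENV.fetch("MIGRATION_DRY_RUN", "true") == "true" # default to the safe mode
end

def destination_database_url
  if dry_run?
    "postgres://localhost:5432/fomo_migration_test" # local copy of prod schema
  else
    ENV.fetch("DATABASE_URL") # the real production database
  end
end
```

defaulting to the local database means a forgotten flag produces a harmless rehearsal instead of a production incident.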

this was illuminating and critical for finding edge case scenarios, such as:

  • boolean fields in the PHP app's MySQL schema that did not require a true/false value and were sometimes nil
  • grammar differences in column attribute names
  • syntax changes between platforms, such as "product-with-link" vs "title-with-link"
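the first and third edge cases above lend themselves to small normalization helpers. the `{{ }}` template delimiters and variable map are assumptions for illustration; the post only names the "product-with-link" vs "title-with-link" rename:

```ruby
# sketch of the cleanup those edge cases forced: coerce nil-able MySQL
# booleans, and translate renamed template variables between platforms.
TEMPLATE_VARIABLE_MAP = { "product-with-link" => "title-with-link" }.freeze

def normalize_boolean(value)
  # legacy boolean columns sometimes held nil; treat anything falsy as false
  !!value
end

def translate_template(template)
  TEMPLATE_VARIABLE_MAP.reduce(template) do |text, (old_var, new_var)|
    text.gsub("{{ #{old_var} }}", "{{ #{new_var} }}")
  end
end
```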

after some cleanup and further abstraction, mapping over the settings from our old server to the new one was simple.
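the original script isn't reproduced here, but its shape -- per the description above -- is roughly this. the field names are assumed, not Fomo's actual columns:

```ruby
# hypothetical shape of the mapping loop: iterate legacy users, copy their
# core fields, and carry settings across unchanged.
def migrate_users(legacy_users)
  legacy_users.map do |legacy|
    {
      name:     legacy[:name],
      email:    legacy[:email],
      website:  legacy[:website],
      settings: legacy[:settings] # preserved verbatim, never reset to defaults
    }
  end
end
```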

what this script did not do was change anything related to incoming webhooks or the Fomo JS snippet.

thus, new orders and Fomo notifications continued to I/O with the PHP MySQL database, and the PostgreSQL (new platform) database essentially became a 'follower' until the moment of truth.

with a few free development stores (thanks Shopify!), we successfully executed the 2nd half of the migration -- pointing users to the new dashboard -- in a small and controlled environment.

step 5 - live migration

now for the fun stuff.

a few hours after midnight (local Croatian hours), it was finally time to hit 'go' on the 2nd migration script.

this script was responsible for swapping the Shopify SSO "authentication URL," the Fomo JS script tag origin URLs, and the Shopify "new_order" webhook endpoint URLs.
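the webhook half of that swap can be sketched against Shopify's REST admin API. this is a hedged reconstruction, not Fomo's script: the endpoint path, API version, and destination URL are all illustrative, and the actual call shape is just standard `Net::HTTP`:

```ruby
require "net/http"
require "json"
require "uri"

# illustrative destination for the new platform's order webhooks
NEW_WEBHOOK_ADDRESS = "https://api.fomo.com/webhooks/shopify/orders".freeze

# the JSON body Shopify's webhook update endpoint expects
def webhook_update_payload(webhook_id)
  { webhook: { id: webhook_id, address: NEW_WEBHOOK_ADDRESS } }.to_json
end

# repoint one store's "new_order" webhook at the new platform
def update_order_webhook(shop_domain, access_token, webhook_id)
  uri = URI("https://#{shop_domain}/admin/webhooks/#{webhook_id}.json")
  request = Net::HTTP::Put.new(uri)
  request["X-Shopify-Access-Token"] = access_token
  request["Content-Type"] = "application/json"
  request.body = webhook_update_payload(webhook_id)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
end
```

the same PUT-per-store pattern applies to the script tag origin URLs; only the SSO authentication URL is a single app-wide setting rather than a per-store call.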

first, Fomo lead engineer Chris Bacon walked us through our new Shopify integration, which would become immediately available to App Store installs following the Authentication URL swap.

next, the 3 of us each cracked open 2 live database terminals (6 in total), running over 1,000 migrations per terminal for a combined 6,800 legacy user migrations.

for a given store, this is how our migration worked:

  1. a day before the overnight migration, the shop URL, owner name, email, etc were duplicated in the production PostgreSQL database
  2. at the moment of migration, the JS snippet was swapped, and recent orders were imported to the store's PostgreSQL bucket
  3. if a store clicked to open the Fomo app after this swap, they landed on the new Fomo dashboard, with every preference / setting carried over from the PHP app dashboard

note: Step 1 above took around 8 hours, and required several restarts due to Heroku's PostgreSQL console kicking us out.

step 6 - outcomes

as 6,000+ Shopify stores were being migrated by 6 live terminals, we monitored our servers, error logs, support desk, and live chat.

[screenshot: Fomo's New Relic logs during the migration]

all clear.

it was around this time, at 7am in Croatia, that we popped champagne.

summary

perhaps you're wondering why the migration needed to be in one sweep, vs one-by-one or self-service or scheduled over several days.

heck, why didn't we just tell users to expect 8 hours of downtime, so that we could migrate everyone in peace, in the middle of the afternoon?

in short, because math.

Fomo notifications are shown 530,000,000 times per month.

if we're not "on" for even 1 hour, 100s of thousands of potential customers (to our customers) won't make informed buying decisions.
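the back-of-envelope arithmetic behind that claim:

```ruby
# quick sanity check on the "because math" claim: roughly how many
# notification impressions vanish per hour of downtime? (30-day month)
IMPRESSIONS_PER_MONTH = 530_000_000
HOURS_PER_MONTH = 30 * 24 # 720 hours
MISSED_PER_HOUR = IMPRESSIONS_PER_MONTH / HOURS_PER_MONTH
# roughly 736,000 impressions lost for every hour offline
```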

Fomo's mission statement:

"to help consumers make better buying decisions online."

Fomo's vision statement:

"to give honest entrepreneurs the credibility they deserve."


we can't achieve either with downtime, so that wasn't a viable option.

(see how easy that is... making decisions based on shared principles?)

to Shopify users worldwide, enjoy Fomo 2.0!