Scaling Wix to over 70M users
[Timeline, 2006-2014. Milestones shown: Wix is founded, first funding, open beta, 1 M users, eCommerce, 10 M users, IPO, 50 M users, Mobile, App Market, Hive, Wix Worldwide, HTML 5]
Initial Architecture
Plan for gradual rewrite
First Challenge - 2008
• Server updates imposed downtime
• Two concerns
– Creating websites
– Viewing websites
• Different service level needed
Public vs Editor
• Public -> Public DB
• Editor -> Editor DB
MySQL is a better NoSQL
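The pattern described in editor's note #12 (one non-normalized table, primary-key access, JSON fields, GUID keys instead of auto-generated ones) can be sketched roughly as follows. This is a minimal illustration, not Wix's schema: `sqlite3` stands in for MySQL so the snippet runs anywhere, and the table name `sites`, the column names, and the helper functions are all hypothetical.

```python
import json
import sqlite3
import uuid

# Assumption: sqlite3 stands in for MySQL; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites ("
    " site_id TEXT PRIMARY KEY,"  # client-generated GUID, no auto-increment key
    " doc TEXT NOT NULL)"         # whole record serialized as a JSON text field
)


def put_site(doc: dict) -> str:
    """Insert a document under a fresh GUID and return that key."""
    site_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO sites (site_id, doc) VALUES (?, ?)",
        (site_id, json.dumps(doc)),
    )
    conn.commit()
    return site_id


def get_site(site_id: str):
    """Primary-key lookup only: no joins and no multi-statement transactions."""
    row = conn.execute(
        "SELECT doc FROM sites WHERE site_id = ?", (site_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


key = put_site({"title": "My site", "pages": ["home", "about"]})
print(get_site(key)["title"])
```

Because the key is generated by the writer rather than by the database, two masters can accept inserts without coordinating on a sequence, which is the property the note ties to master-master replication.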
Hosting
Co-Location
• Own and maintain your hardware
• Provision -> buy, deliver and install
• Reliable software and hardware
Managed Hosting
• Lease hardware and maintenance
• Overnight provisioning
• Reliable software and hardware
Cloud
• Instant lease hardware
• Instant provisioning, unlimited resources
• Reliable software, unreliable hardware
[Timeline, 2006-2014. Hosting locations shown: Austin, Amsterdam, Amazon, Google, Chicago, Tampa]
Wix Media
• 500 GByte of small files
– Hit IO limitations
• Need scalable solution
– Number of files
– HTTP connections
• Image manipulations
Wix Media Platform
• get 37D815B5.jpg -> CDN
• If not in CDN -> Wix data centers: Austin, Chicago (each drawn as groups labeled x36, x36, x32)
• One location acts as first fallback
Wix Media Platform
• get 37D815B5.jpg -> CDN
• If not in CDN -> Google Cloud, Austin, Chicago (the data centers drawn as groups labeled x36, x36, x32)
• Locations are tried with a first fallback and a second fallback
Wix Media Platform
• get 37D815B5.jpg -> CDN
• If not in CDN -> Google Cloud, Austin, Tampa, Amazon Cloud
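The lookup order in the Wix Media Platform diagrams (CDN first, then an origin location, then one or more fallbacks) can be sketched as an ordered chain of sources. This is an illustrative sketch only: the function `fetch_with_fallback`, the in-memory stores, and the choice of which location is down are hypothetical, and the slides do not say which location is primary.

```python
from typing import Callable, Optional, Sequence

# A source takes a file name and returns its bytes, or None on a miss.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(name: str, sources: Sequence[Fetcher]) -> bytes:
    """Try each source in order and return the first hit."""
    for source in sources:
        try:
            data = source(name)
        except ConnectionError:
            data = None  # treat an unreachable source like a miss
        if data is not None:
            return data
    raise FileNotFoundError(name)


# Hypothetical in-memory stand-ins for the CDN and two origin locations.
def make_store(files: dict) -> Fetcher:
    return lambda name: files.get(name)


def down(name: str) -> Optional[bytes]:
    raise ConnectionError("origin unreachable")


cdn = make_store({})                                    # cache miss
primary = down                                          # first origin is down
fallback = make_store({"37D815B5.jpg": b"jpeg-bytes"})  # fallback has the file

print(fetch_with_fallback("37D815B5.jpg", [cdn, primary, fallback]))
```

Adding a second fallback, as in the later diagrams, is then a matter of appending one more source to the list.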
Development Velocity - 2010
• Large and entangled codebase
• Hard feature rollout
• At the same time, the iPad was released
• We needed to enable Wix to move fast
People are the key
[Timeline, 2010-2016. Initiatives shown: CI / CD / TDD, DevOps, Scala, Wix Framework, Micro-Services, TDD Redux, Node.js, React, Angular, Companies & Guilds]
Wix Framework
• Modern
• Flash support
• TDD support
• DevOps support
Why CI / CD / TDD / DevOps?
• Fear of change
• Low quality
• Slow product development
• 3 months from dev to GA
"I want change" vs. "I want stability"
CI / CD / TDD / DevOps
• Small and fast changes
• Empower the developer
• Automate!!!
• Measure!!!
• Risk = Number of changes x Size of change x $$$ impact of change
Prepare Release
Deploy
Monitor
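The risk relation on the CI / CD slide (risk as the product of number of changes, size of change, and cost of impact) can be made concrete with a small calculation. The function name and all numbers here are hypothetical and only compare per-release risk under that simple product model.

```python
def release_risk(changes: int, size: float, impact: float) -> float:
    """Per-release risk = number of changes x size of change x cost of impact."""
    return changes * size * impact


# Hypothetical numbers: 60 changes batched into one release
# versus a release that ships a single change.
batched = release_risk(changes=60, size=1.0, impact=1.0)
single = release_risk(changes=1, size=1.0, impact=1.0)
print(batched, single)
```

Under this model, shipping one small change at a time keeps the risk carried by any single deployment low, which is the argument behind "small and fast changes".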
Micro-Services
• Over 100 micro-services at Wix
• A Micro Service is
– Independent deployment
– Independent OS process
– Independent database
• Size of a service – based on the team
Micro-Services
• The good news:
– Great risk mitigation
– Enforces separation of concerns
– Minimal blockers for deployment
• The bad news:
– More network hops - Increases % of failures
– Requires Back / Forward compatibility
– Distributed operations / transactions
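One common way to meet the back / forward compatibility requirement between independently deployed services is a tolerant reader: ignore fields you do not know, and give defaults to fields an older sender may omit. The sketch below assumes JSON payloads; the `SiteInfo` type and its fields are hypothetical and not taken from the talk.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class SiteInfo:
    site_id: str
    title: str = ""  # hypothetical later addition; the default keeps old payloads valid


def parse_site_info(payload: str) -> SiteInfo:
    """Build SiteInfo from JSON, dropping any fields this version does not know."""
    raw = json.loads(payload)
    known = {f.name for f in fields(SiteInfo)}
    return SiteInfo(**{k: v for k, v in raw.items() if k in known})


# Payload from an older sender: the newer field is missing (backward compatible).
print(parse_site_info('{"site_id": "abc"}'))
# Payload from a newer sender: an unknown field is ignored (forward compatible).
print(parse_site_info('{"site_id": "abc", "title": "Shop", "theme": "dark"}'))
```

With readers written this way, a producer and a consumer can be deployed in either order without breaking each other.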
Scala
• Moves Java developers out of their comfort zone
• Forces them to grow
• Makes them question how things are done
• End result – great innovation
Companies & Guilds
• Companies focus on products
• Guilds focus on technology
• What: company leader
• How: guild master
• Shown in diagram: Angular, React, Server, QA, Analysis, UX
Angular & React
• Modern & Productive
• Angular
– Applications – like my account
• React
– Websites – Wix sites are React
– React Templates for applications – Wix editor
Node.js
• Frontend server
• Complement Scala
– Not replace
• Let frontend devs take ownership of the full HTTP stack
– HTML, Ajax, Sockets, etc.
Questions?

Editor's Notes

  • #8 Built for fast development. We did not know what our business would be. We knew we would need to replace it, but did not know how hard that would be.
  • #10 Sites should never, ever have downtime! Sites should work as fast as possible, always! However, an editing system does not require this level of SLA.
  • #11 Releases of Editor features should have no impact on existing site operations! Solution: the two concerns evolve independently. The Public segment targets serving websites, has a mostly read-only usage pattern, and is a simple publishing system. Simple + read-only -> simpler to have higher SLA and DRP. MySQL is used as NoSQL: a single large table with XML text fields. The Editor segment exposes the editing APIs, user account and gallery management, and has a different release schedule from the Public segment.
  • #12 Use one non-normalized table with primary-key access and JSON fields. Immutable blobs, blog table with pointers. No transactions. No MySQL auto-generated keys. GUIDs for keys: no locks, and they enable master-master replication.
  • #14 Amsterdam for 3-way active-active -> failed. Doing 2-way active-active + service disruption on third.
  • #15 The “upload to app server, post-process files, copy to lighttpd server, serve by lighttpd” pattern proved inefficient, slow and error-prone. ls does not scale. We needed control over HTTP headers for caching.
  • #21 Train the people you already have. Hiring the right people is key to success. Hire only the best developers (only seniors). Don’t count only on the interview; you need to test actual coding. Hire people who will challenge you (no “yes men”). Get people you can trust with “root” access to production. Never stop hiring.