Principles and Practices in
Continuous Deployment
Mike Brittain
Engineering Director, Etsy
@mikebrittain mikebrittain.com/talks
“Continuous Deployment”
Process by which our team deploys software changes
to production services over 30 times per day.
Where we started
Principles for our Engineering team
Continuous Deployment
Business case
Five years ago…
2-3 weeks of code changes
Release and rollback plans
Traffic and infrastructure management (Ops)
6-14 hours
Five years ago…
“Deployment Army”
Stressful, especially when things go wrong
Long days and late nights
Scheduled downtime
pro·duc·tion [pruh-duhk-shuhn] (n)
1. This complex system of application code,
distributed services, servers, networking gear, etc.,
upon which we’re going to try to carefully apply a
complicated set of changes and hope that nothing
goes wrong. Cross your fingers… here goes.
Software for large-scale web sites has been
traditionally written by one group of people, then
released and operated by a different group.
These two groups have very different levels of
visibility into how the software works.
Stagnation
“…frequent and prolonged outages.”
2010 CAPACITY PLAN
First, Principles.
Innovate or die
Resolve scaling hurdles
Mean-time-to-recovery
“Quality is not just testing pre-release.
It also includes our adaptability and
response time.”
- Jeff Sussna at ALM Forum, 2014
Innovate or die
Resolve scaling hurdles
Mean-time-to-recovery
Healthy and talented engineering team
Autonomy, Mastery, Purpose.
“Drive: The surprising truth about what
motivates us.” ~Dan Pink, at RSA
http://youtu.be/u6XAPnuFjJc
Innovate or die
Resolve scaling hurdles
Mean-time-to-recovery
Healthy and talented engineering team
Stop stressing about releases
http://timothyfitz.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/
In a software release process Fail Fast means releasing undeployed code
as fast as possible, instead of waiting for a weekly release to break.
http://youtu.be/LdOe18KhtT4
SECRET WEAPON:

Hired as VP, Tech-Ops at Etsy
Continuous Deployment ~ vs ~ Continuous Delivery
Practice                                                      Continuous Deployment   Continuous Delivery
Frequent check-ins directly to mainline.                      ✓                       ✓
Continuous Integration and automated tests.                   ✓                       ✓
Keep the build green. We’re always ready to release.          ✓                       ✓
“One button” deploys.                                         ✓                       ✓
Business dictates when a build is deployed.                   -                       ✓
Every passing build is deployed to production.                ✓                       -
All enhancements are gated by Config Flags. (“Branch in code”) ✓                      ?
Most of the builds we deploy are
“dark” changes.
CSS rules and properties
Copy in templates (e.g. typos)
New, un-referenced code (e.g. classes, funcs, templates)
Code paths behind disabled config flags
etc…
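A minimal sketch of "branch in code", assuming a simple in-memory flag table. The flag name `checkout_blue_button` is taken from a later slide; `CONFIG_FLAGS`, `is_enabled`, and `render_checkout_button` are hypothetical names for illustration, not Etsy's actual API.

```python
# Hypothetical flag table: new code ships "dark" with its flag disabled.
CONFIG_FLAGS = {"checkout_blue_button": False}


def is_enabled(flag, flags=CONFIG_FLAGS):
    """Return True only if the flag exists and is switched on."""
    return bool(flags.get(flag, False))


def render_checkout_button(flags=CONFIG_FLAGS):
    # Both code paths live on mainline; the flag picks one at runtime.
    if is_enabled("checkout_blue_button", flags):
        return '<button class="blue">Checkout</button>'
    return '<button>Checkout</button>'
```

With the flag disabled, deploying this build changes nothing for visitors; flipping the flag is a separate, small config change.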
Continuous Delivery release pipeline
[Figure: Dev Team → Version Control → Build & Unit Tests → Automated Acceptance Tests → User Acceptance Tests → Release; arrows labeled Check in, Trigger, and Feedback.]
Source: http://en.wikipedia.org/wiki/Continuous_delivery
[Figure, continued: the Continuous Delivery release pipeline with further Check in, Trigger, Feedback, and Approval steps added, and the stages grouped under Dev / Integration, Staging, and Production.]
Assumptions:
Staging is a perfect reflection of
Production, with respect to
hardware, configurations, data,
overall load, capacity, etc.
Deploy process is infallible.
“What do you mean, ‘it’s not working in
production?’ I TESTED IT BEFORE WE
RELEASED!”
[Figure: the Continuous Delivery release pipeline across Dev / Integration, Staging, and Production, shown with many repeated Check in, Trigger, and Approval cycles and a Feedback step.]
"Because you’re integrating
so frequently, there is
significantly less back-tracking to discover where
things went wrong, so you can spend more time
building features.”
- ThoughtWorks
http://www.thoughtworks.com/continuous-integration
Where’s the bug?
In one of the numerous check-ins?
Missing unit tests?
Missing automated UA tests?
Missing manual UA tests?
Data out of sync?
Server configurations out of sync?
Capacity vs. current load?
Deployment script?
How will we know when something is wrong in production?
How long will it take to resolve the issue?
We aim to reduce fundamental surprise in
every release.
Furthermore, we optimize for detecting
and recovering from failures quickly.
Pre-production validation
Code deployed to de-pooled application (web) servers
touching prod services and databases.
Smoke tests
Integration tests
Functional tests
User-Acceptance (ad hoc)
Production validation
Exactly the same server configs, services and data as
pre-prod, but this is where we introduce application
code to live traffic.
Smoke tests (esp. over public hostnames)
User-Acceptance testing behind config flags
Gratuitous monitoring
Customer support and forums
Imagine that we’ll write 50K LOC/month.

Single release
Few opportunities for failure
Wide surface area (50,000 LOC)
High MTTR
All of the bugs we’ve written

Many releases
More opportunities for failure
Narrow surface area (< 100 LOC)
Low MTTR
A fraction of the bugs we’ve written per release
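A rough worked version of the surface-area comparison. The 50K LOC/month and 30 deploys/day figures come from the talk; the 20 working days per month is an assumption added only to make the arithmetic concrete.

```python
LOC_PER_MONTH = 50_000        # from the slide
DEPLOYS_PER_DAY = 30          # "over 30 times per day", from the talk
WORKING_DAYS_PER_MONTH = 20   # assumption, not from the talk


def loc_per_release(releases_per_month):
    """Average lines of code changed by each release."""
    return LOC_PER_MONTH / releases_per_month


single = loc_per_release(1)
many = loc_per_release(DEPLOYS_PER_DAY * WORKING_DAYS_PER_MONTH)

print(f"single release: {single:,.0f} LOC per release")  # 50,000
print(f"many releases:  {many:,.0f} LOC per release")    # 83
```

Under these assumptions each deploy carries well under 100 lines, which is consistent with the "narrow surface area" figure on the slide.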
Monitoring
PHP Warnings
Bug Reports and Help Requests
Deploy logs
Post-Mortems
Continuous Deployment release pipeline
[Figure: stages labeled Dev Team, Version Control, Build & Unit Tests (CI), Automated Acceptance Tests, User Acceptance Tests, Smoke Tests, Deploy (Prod), and Monitoring and Automated Alerts, grouped under Dev, Pre-Production (“Princess”), and Production; arrows labeled Check in, Trigger, Feedback, and Approval.]
“Allow buttons properly to inherit color
from their parent node.”
Five years ago…
2-3 weeks of code changes
Release and rollback plans
Traffic and infrastructure management (Ops)
6-14 hours
Five years ago…
“Deployment Army”
Stressful, especially when things go wrong
Long days and late nights
Scheduled downtime
Why do we do this?
Innovate or die.
Resolve scaling hurdles.
Mean-time-to-recovery.
Healthy and talented engineering team.
Stop stressing about releases.
Admin-launch and whitelist
Ramp-up public traffic
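One way the two rollout stages could be sketched, assuming each flag carries an admin switch, a whitelist, and a public percentage. The config keys (`admin`, `whitelist`, `percent`) and the functions `bucket` and `flag_on` are hypothetical; the talk does not show Etsy's actual implementation.

```python
import hashlib


def bucket(user_id, flag):
    """Map (flag, user) to a stable bucket in 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def flag_on(flag_cfg, flag, user_id, is_admin=False):
    """Decide whether this user sees the feature behind `flag`."""
    # Stage 1: admin-launch and whitelist.
    if flag_cfg.get("admin") and is_admin:
        return True
    if user_id in flag_cfg.get("whitelist", ()):
        return True
    # Stage 2: ramp up public traffic by raising "percent" from 0 to 100.
    return bucket(user_id, flag) < flag_cfg.get("percent", 0)


cfg = {"admin": True, "whitelist": {42}, "percent": 0}
print(flag_on(cfg, "header_redesign", 42))  # whitelisted user
```

Because the bucket is derived from a hash of the flag and user, a user who sees the feature at 10% keeps seeing it as the percentage grows.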
mainline
header_redesign
search_filter_custom_orders
checkout_blue_button
listing_css_refactor
www.etsy.com
beta01.etsy.com
beta02.etsy.com
beta03.etsy.com
US","region":"US","detected_currency_code":"USD","detected_language":"en-
US","detected_region":"US","accept-languages":"en-US","cdn-
provider":"","isMobileDevice":"0","isMobileSupported":"0","isMobileRequestIgnoreCookie":"0"
,"isTabletSupported":"0","isTouch":"0","isEtsyApp":"0","isPreviewRequest":"0","isChromeInst
antRequest":"0","isMozPrefetchRequest":"0","listing_ids":
[104073511,130604774,159651433,155451607,160523743,124025232,95186610,82967340,114692884,11
4767467,117266897,157579748],"scheduled_modules_content_ids":
[10808052776,10256029946],"primary_event":"1",".event_source":"web",".event_logger":"fronte
nd","php_ab_test_names":"translation_profiler.profiling;translation_profiler.logging;transl
ation_profiler.backend_event_logging;footer_redesign_20131201;international.languages.el;in
ternational.languages.ja;international.languages.no;international.languages.pl;internationa
l.languages.ro;international.languages.tr;simplified_locale_experience;full_site_ssl;admin_
toolbar;enabled_locale_subdirectories;affiliates.publishing.user_publishers;buyer_invites_r
ecipients;home_improvement;home_improvement.new_homepage;authoritative_items;refactored_foo
ter;conversations.rejuvination;contextual_homepage_recs.global;css_from_www;shrinkray.css;c
srf_nonce_refactor.allow_colon;csrf_nonce_refactor.reverse_order;csrf_nonce_refactor.no_enc
Analytics connected to config names
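A small sketch of how the `php_ab_test_names` field in an event like the one above could be tied back to config names, assuming one JSON event per line and a semicolon-separated list. The sample events are invented and heavily trimmed; only the field name and the two flag names come from the slide.

```python
import json
from collections import Counter


def ab_test_names(event):
    """Split the semicolon-separated config names attached to an event."""
    raw = event.get("php_ab_test_names", "")
    return [name for name in raw.split(";") if name]


def count_tests(event_lines):
    """Count how many events were logged under each config name."""
    counts = Counter()
    for line in event_lines:
        counts.update(ab_test_names(json.loads(line)))
    return counts


# Invented sample events for illustration only.
sample = [
    '{"primary_event": "1", "php_ab_test_names": "full_site_ssl;admin_toolbar"}',
    '{"primary_event": "1", "php_ab_test_names": "full_site_ssl"}',
]
print(count_tests(sample))
```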
Catapult
Observed impact
Time series data for
duration of the
experiment
Frank
Product Manager
“I want to find out whether
buyers will favor a single
price for the product that
includes shipping.”
https://www.etsy.com/shop/lucra
Eligibility requirements:
- Must be first page of visit
- Buyer & seller in same region
- etc…
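The two listed requirements could be expressed as a controller check along these lines. This is a sketch: the `Request` fields and the `eligible` function are hypothetical, and the requirements elided by "etc…" are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Request:
    page_in_visit: int   # 1 means the first page of the visit
    buyer_region: str
    seller_region: str


def eligible(req, flag_enabled):
    """Apply only the two eligibility rules named on the slide."""
    if not flag_enabled:
        # Flag disabled: the experiment code is deployed but dark.
        return False
    return req.page_in_visit == 1 and req.buyer_region == req.seller_region


print(eligible(Request(1, "US", "US"), flag_enabled=True))
```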
Time: < 8 hours
Staff: One
Design, config flag (disabled), eligibility code in
controller, template code, CSS, code review,
automated tests, deployed code, config flag enabled.
We do not bundle the item price and shipping cost
together today.
https://www.etsy.com/shop/lucra
Ambitious Product Goal
Monolithic
Building and measuring many
things at once.
Iterative
One thing at a time, our design
goal is always in sight.
Is this for me?
http://timothyfitz.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/
“Maybe this is just viable for a single developer … your site
will be down. A lot.”
etsystatus.com
@mikebrittain
[Chart: deployments per day, from the very end of 2009 to today, with series for app code and config files.]
$1.35 Billion Goods sold in 2013
60+ Million Unique visitors per month
175+ Committers, everyone deploys
http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/
Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet
Thank you.
Mike Brittain
Engineering Director, Etsy
@mikebrittain mikebrittain.com/talks

Principles and Practices in Continuous Deployment at Etsy