Creating the Dev/Test/PM/Ops Supertribe: From Visible Ops To DevOps
Gene Kim, CISA, TOCICO Jonah
Velocity Conference
June 15, 2011
Where Did The High Performers Come From?
Higher Performing IT Organizations Are More Stable, Nimble, Compliant And Secure
High performers maintain a posture of compliance
Fewest number of repeat audit findings
One-third amount of audit preparation effort
High performers find and fix security breaches faster
5 times more likely to detect breaches by automated control
5 times less likely to have breaches result in a loss event
When high performers implement changes…
14 times more changes
One-half the change failure rate
One-quarter the first fix failure rate
10x faster MTTR for Sev 1 outages
When high performers manage IT resources…
One-third the amount of unplanned work
8 times more projects and IT services
6 times more applications
Source: IT Process Institute, 2008
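
The change and recovery figures above are simple ratios over change and incident records. The following is a minimal Python sketch of how such measures could be computed. The `Change` and `Incident` record types, their field names, and the sample values are hypothetical illustrations, not ITPI's data or methodology.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Change:
    failed: bool  # assumption: True if the change caused an outage or was backed out


@dataclass
class Incident:
    severity: int               # assumption: 1 is the most severe ("Sev 1")
    time_to_restore: timedelta  # elapsed time from detection to service restored


def change_failure_rate(changes):
    """Fraction of changes that failed (0.0 when there are no changes)."""
    if not changes:
        return 0.0
    return sum(c.failed for c in changes) / len(changes)


def mean_time_to_restore(incidents, severity=1):
    """Average restore time for incidents of one severity, or None if there are none."""
    durations = [i.time_to_restore for i in incidents if i.severity == severity]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample records, for illustration only.
changes = [Change(True), Change(False), Change(False), Change(False)]
incidents = [
    Incident(1, timedelta(minutes=30)),
    Incident(1, timedelta(minutes=90)),
    Incident(2, timedelta(hours=5)),
]

rate = change_failure_rate(changes)
mttr = mean_time_to_restore(incidents)
```

Comparing these two numbers between teams or over time is one way to make claims such as "one-half the change failure rate" or "10x faster MTTR" checkable against an organization's own records.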
Common Traits of High Performers
Culture of…
Change management
  Integration of IT operations/security via problem/change management
  Processes that serve both organizational needs and business objectives
  Highest rate of effective change
Causality
  Highest service levels (MTTR, MTBF)
  Highest first fix rate (unneeded rework)
Compliance and continual reduction of operational variance
  Production configurations
  Highest level of pre-production staffing
  Effective pre-production controls
  Effective pairing of preventive and detective controls
Source: IT Process Institute
Visible Ops: Playbook of High Performers
The IT Process Institute has been studying high-performing organizations since 1999
What is common to all the high performers?
What is different between them and average and low performers?
How did they become great?
Answers have been codified in the Visible Ops Methodology
The “Visible Ops Handbook” is now available from the ITPI
www.ITPI.org
2007: Three Controls Predict 60% Of Performance
To what extent does an organization define, monitor and enforce the following?
Standardized configuration strategy
Process discipline
Controlled access to production systems
Source: IT Process Institute, 2008
The Darkest Moment In My Journey
Tough Love From Ari Balogh
Why Was I So Unsatisfied With The State Of IT Practice?
IT operations work continued to be viewed as tactical
Information security and compliance programs were sucking all the air out of the room (due to scoping problems)
The activation energy for successful improvement programs was still too high
The IT operations issues were overshadowed by development
Issues are amplified 10x in production: outages, findings, lawsuits
Technical debt builds up over time
IT operations is often the constraint in the organization
Linkage of IT performance to business performance not obvious enough
“Why doesn’t the business care? I found the pump handle!”

  • 30.
    Seeing The Bigger Problem
    Operations Sees…
      Fragile applications are prone to failure
      Long time required to figure out “which bit got flipped”
      Detective control is a salesperson
      Too much time required to restore service
      Too much firefighting and unplanned work
      Planned project work cannot complete
      Frustrated customers leave
      Market share goes down
      Business misses Wall Street commitments
      Business makes even larger promises to Wall Street
    Dev Sees…
      More urgent, date-driven projects put into the queue
      Even more fragile code put into production
      More releases have increasingly “turbulent installs”
      Release cycles lengthen to amortize “cost of deployments”
      Failing bigger deployments more difficult to diagnose
      Most senior and constrained IT ops resources have less time to fix underlying process problems
      Ever increasing backlog of infrastructure projects that could fix root cause and reduce costs
      Ever increasing amount of tension between IT Ops and Development
    These aren’t IT Operations problems… These are business problems!
  • 31.
    The Dreaded Disease
    IT Operations Constipatus (noun)
    Occurs when IT Operations creates fatal blockages in project flow. Creates blinding pain in Dev organization.
    Blockage worsens with chronic break/fix and security/compliance work, and when technical debt is never paid off.
    Causes host to lose energy, become unable to achieve organizational goals. Dangerous to CEOs.
    Photo credit: http://www.flickr.com/photos/keenepubliclibrary/2435790649/
  • 32.
    DevOps Can Break A Core Chronic Conflict In IT
    Every IT organization is pressured to simultaneously:
      Respond more quickly to urgent business needs
      Provide stable, secure and predictable IT service
    Words often used to describe ITIL process owners: “hysterical, irrelevant, bureaucratic, bottleneck, difficult to understand, not aligned with the business, immature, shrill, perpetually focused on irrelevant technical minutiae…”
    Source: The authors acknowledge Dr. Eliyahu Goldratt, creator of the Theory of Constraints and author of The Goal, has written extensively on the theory and practice of identifying and resolving core, chronic conflicts.
  • 33.
    Framed This Way, Help Can Come From A Surprising Place
    The VP Application Development will often have the following complaints:
      IT Operations is the bottleneck
      We complete the code, but it takes too long for IT Operations to get the code into production
      Environments are never available when we need them
      Releases often cause chaos and disruption to all the other production services
      Turbulent installs have become the norm: 30 min installs take 3 days
      Due to slow OS upgrades, applications delayed by 2 quarters
      We are always late getting features to market
  • 34.
    A Reframed IT Operations Problem Statement
    Increase flow from Dev to Production
      Increase throughput
      Decrease WIP
    Our goal is to create a system of operations that:
      Allows planned work to quickly move to production
      Ensures service is quickly restored when things go wrong
    How does this relate to Visible Ops?
      We focused much on “unplanned work”
      What’s happening to all the planned work?
      At any given time, what should IT Ops be working on?
      Now we are focusing on the flow of planned work
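
The pairing of "increase throughput" with "decrease WIP" can be made concrete with Little's Law, a standard queueing result that the slide itself does not name: average cycle time equals work in progress divided by throughput. The sketch below is a minimal illustration in Python. The function name and every number in it are hypothetical, chosen only to show the arithmetic.

```python
def average_cycle_time(wip: float, throughput: float) -> float:
    """Little's Law: average cycle time = work in progress / throughput.

    wip is the number of items in flight (e.g. releases waiting on Ops);
    throughput is items completed per unit of time (e.g. per week).
    The result is in the same time unit as throughput.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical figures: 40 releases in flight, 5 completed per week.
baseline = average_cycle_time(wip=40, throughput=5)

# Halving WIP at the same throughput halves the average wait.
less_wip = average_cycle_time(wip=20, throughput=5)

# Doubling throughput at the same WIP has the same effect.
more_flow = average_cycle_time(wip=40, throughput=10)

print(baseline, less_wip, more_flow)
```

Under these assumed numbers, either lever cuts the average time from Dev handoff to production in half, which is why the problem statement treats throughput and WIP together.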
  • 35.
  • 36.
    Goal #1: Decrease Cycle Time Of Releases
    Create determinism in the release process
    Move packaging responsibility to development
    Release early and often
    Decrease cycle time
    Reduce deployment times from 6 hours to 45 minutes
    Refactor deployment process that had 1300+ steps spanning 4 weeks
    Never again “fix forward,” instead “roll back,” escalating any deviation from plan to Dev
    Verify for all handoffs (e.g., correctness, accuracy, timeliness, etc…)
    Ensure environments are properly built before deployment begins
    Control code and environments down the preproduction runways
    Hold Dev, QA, Int, and Staging owners accountable for integrity
  • 37.
    Goal #2: Increase Production Rigor
    Define what work is and where work can come from
    Protect the integrity of the work queue (e.g., are checks being written that won’t clear?)
    To preserve and increase throughput, elevate preventive projects and maintenance tasks
    Document all work, changes and outcomes so that it is repeatable
    Ops builds Agile standardized deployment stories, to be completed after Dev sprints are complete
    Maintain adequate situational awareness so that incidents can be quickly detected and corrected
    Standardize unplanned work and escalations
    Always seek to eradicate unplanned work and increase throughput
    Lean Principle: “Better -> Faster -> Cheaper”
  • 38.
    Some Principles
    Because operations is constrained, it is always better to prevent than recover
    Operations work must be planned
    We strive to have continual situational awareness
    We will strive to control as many dimensions of our work as possible
    We ruthlessly seek to understand any deviations from normal
    We expect systems in operations to never stop working
    We never do one-offs (they must be exceptions, not the rule)
    We require determinism to enable resiliency
    We strive for the improvement and mastery of the environment
  • 39.
    Creating A System Of Operations
    Inj 1: Projects: ensure rapid project releases from Development
      Inj 1.1: Create effective centralized work demand queue
      Inj 1.2: Protect integrity of work queue (e.g., write only checks that will clear)
      Inj 1.3: Release early and often: freeze projects if necessary, choking materials release to reduce WIP, allow longer runways of work
      Inj 1.4: Elevate any deviations or incidents that stop flow of work
      Inj 1.5: Standardize product deployments with Development
      Inj 1.6: Continually seek ways to increase flow
    Inj 2: Ensure reliable IT operations
      Inj 2.1: When failures occur, detect/correct quickly inside the plant (e.g., production)
      Inj 2.2: Prevent failures (e.g., maintenance)
      Inj 2.3: Study and create projects to reduce/eradicate unplanned work
      Inj 2.4: Seek ways to increase production
    Inj 3: Subordinate infosec/PMO/etc. to enable Inj 1 & 2
  • 40.
    The Prescriptive DevOps Cookbook
    Capture and codify how to start and finish successful DevOps transformations
    Create isomorphic mapping between plant floors and IT shops
    Co-authoring with Patrick Debois, Mike Orzen, John Willis
    Describe in detail how to replicate the transformations described in “When IT Fails: The Novel”
    Goals
      How does IT Operations become a dependable partner?
      How does Dev become a dependable partner?
      How do Dev and Ops work together to solve business problems (and Infosec, too)?
  • 41.
    The Prescriptive DevOps Cookbook
    I am seeking fellow travelers who want to capture and codify the best known methods, patterns/anti-patterns, recipes and case studies of how to implement successful DevOps-style transformations.
    The Theory of Constraints Approach To Visible Ops
    Dr. Goldratt wrote The Goal in 1984, describing Alex’s challenge to fix his plant’s cost and due date issues within 90 days
    Some tenets that went against common wisdom:
      Every flow of work has a constraint/bottleneck
      Any improvement not made at the bottleneck is merely an illusion
      Fallacy of cost accounting as operational management tool
  • 42.
    When IT Fails: The Novel
    Day 1
    Steve Masters, CEO
    Dick Landry, CFO
    Parts Unlimited
    $4B revenue/year
  • 43.
    When IT Fails: The Novel
    Day 2
    Bill Palmer, VP IT Operations (promoted)
    Wes Davis, Director, Distributed Systems
    Patty McKee, Director, IT Service Support Services
    The payroll outage
      All salaried employees will get paid, but not the hourlies
      CISO put in tokenization application in the factories, breaking database query that uses SSN
      IT Ops thought it was a SAN firmware upgrade failure
      All HR apps go down
      CFO is on front page of news, apologizing to community
  • 44.
    When IT Fails: The Novel
    Day 4
    Chris Allers, VP Application Development
    Sarah Moulton, SVP Retail Products
    “We can deploy by next week by cutting some corners, but IT Ops is in the way… again…”
    “Bill, your team lacks a sense of urgency. We must go. We’ve already bought the newspaper ads – they’re bought, paid for and being printed…”
  • 45.
    When IT Fails: The Novel
    Day 3
    Nancy Mailer, Chief Audit Executive
    John Pesche, CISO
    IT Operations has 980 IT general control deficiencies on critical financial systems, potentially dooming financial statement to having a footnote. Needs management response in 1 week.
    Bill grapples with who to put on the project. 1 yr of work, just to fix issues, even without Phoenix.
  • 46.
    The Goal For IT: Day 10
    The Deployment
    Database conversion, the point of no return, taking 1000x longer.
    In store POS won’t come up by Sat 8am, maybe by next Tuesday
    Emptying shopping cart shows last successful order credit card #
  • 47.
    Call To Action
    If you’re interested in reviewing early versions of “When IT Fails: The Novel,” email me.
    If you’re interested in helping build or review the DevOps Cookbook, email me.
    I’m genek@realgenekim.me
    Thank you for allowing me to join your tribe!
  • 48.
    Resources
    From the IT Process Institute www.itpi.org
      Both Visible Ops Handbooks
      ITPI IT Controls Performance Study
    “Lean IT” by Orzen and Bell
      Winner of the Shingo Prize 2011
    “Inspired: How To Create Products That Customers Love” by Cagan
    “Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation” by Humble, Farley
    Follow Gene Kim
      @RealGeneKim
      mailto:genek@realgenekim.me
      http://realgenekim.me/blog
  • 50.
    About Gene Kim
    I’ve spent the last 12 years studying high performing IT organizations, trying to understand:
      What do they have in common?
      What is present in successful transformations, absent in unsuccessful transformations?
      How do we lower the activation energy required to create the transformations?
    Founder and former CTO of Tripwire, Inc.
    Co-author of Visible Ops Handbook, Security Visible Ops Handbook
    Active researcher
    Co-founder of IT Process Institute
    Committee member of Institute of Internal Auditors
    Leader of PCI Security Standards Council Scoping SIG

Editor's Notes

  • #11 How each side actively impedes the achievement of the other’s goals.
  • #12 http://www.flickr.com/photos/keenepubliclibrary/2435790649/
  • #32 Since 1986, I’ve been a QA engineer writing filesystem QA tests, system administrator, developer, infosec, process design, operations research, auditor. Incidentally, I almost moved to Seattle to be on the Microsoft NT network test team in 1991 (TCP/IP stack). For 13 years, I was the founder/CTO of Tripwire, but my primary passion is studying high performing IT operations and security organizations. When I met Chris 3 years ago, he helped me see clearly one of the primary obstacles for successful transformations. I’ll describe this later. First, let me talk about what I meant by “high performers” back in 1999.