Navigating the Transition from Relational to NoSQL Technology

Dipti Borkar
Director, Product Management
WHY TRANSITION TO NOSQL?
Two big drivers for NoSQL adoption

[Survey results]
  Lack of flexibility / rigid schemas:  49%
  Inability to scale out data:          35%
  Performance challenges:               29%
  Cost:                                 16%
  All of these:                         12%
  Other:                                11%

Source: Couchbase Survey, December 2011, n = 1351.
NoSQL catalog

                          Key-Value   Data Structure   Document             Column      Graph
  Cache (memory only)     memcached   redis
  Database (memory/disk)  membase                      couchbase, mongoDB   cassandra   Neo4j
DISTRIBUTED DOCUMENT DATABASES
Document Databases

•  Each record in the database is a self-describing document
•  Each document has an independent structure
•  Documents can be complex
•  All databases require a unique key
•  Documents are stored using JSON or XML or their derivatives
•  Content can be indexed and queried
•  Offer auto-sharding for scaling and replication for high-availability

Example document:

{
  "UUID": "21f7f8de-8051-5b89-86…",
  "Time": "2011-04-01T13:01:02.42…",
  "Server": "A2223E",
  "Calling Server": "A2213W",
  "Type": "E100",
  "Initiating User": "dsallings@spy.net",
  "Details":
    {
    "IP": "10.1.1.22",
    "API": "InsertDVDQueueItem",
    "Trace": "cleansed",
    "Tags":
      [
      "SERVER",
      "US-West",
      "API"
      ]
    }
}
  
COMPARING DATA MODELS
  
http://www.geneontology.org/images/diag-godb-er.jpg
Relational vs Document data model

[Figure: a table with columns C1–C4 vs. a collection of JSON documents]

Relational data model: Highly-structured table organization with
rigidly-defined data formats and record structure.

Document data model: Collection of complex documents with arbitrary,
nested data formats and varying "record" format.
  
Example: User Profile

User Info                         Address Info
KEY   First   Last     ZIP_id     ZIP_id   CITY   STATE   ZIP
1     Dipti   Borkar   2          1        DEN    CO      30303
2     Joe     Smith    2          2        MV     CA      94040
3     Ali     Dodson   2          3        CHI    IL      60609
4     John    Doe      3          4        NY     NY      10010

To get information about a specific user, you perform a join across two tables.
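The join described above can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch only; the table and column names mirror the slide's example, and SQLite stands in for whatever RDBMS you actually use.

```python
import sqlite3

# Build the two tables from the slide in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE user_info ("key" INTEGER, first TEXT, last TEXT, zip_id INTEGER)')
conn.execute("CREATE TABLE address_info (zip_id INTEGER, city TEXT, state TEXT, zip TEXT)")
conn.execute("INSERT INTO user_info VALUES (1, 'Dipti', 'Borkar', 2)")
conn.execute("INSERT INTO address_info VALUES (2, 'MV', 'CA', '94040')")

# Fetching one user's full profile requires a join across both tables.
row = conn.execute(
    """SELECT u.first, u.last, a.city, a.state, a.zip
       FROM user_info u JOIN address_info a ON u.zip_id = a.zip_id
       WHERE u."key" = 1"""
).fetchone()
print(row)  # ('Dipti', 'Borkar', 'MV', 'CA', '94040')
```

Every read of a complete profile pays for this join; the document model on the next slide avoids it by storing the profile as one record.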
  
Document Example: User Profile

The two tables combine into a single JSON document:

{
  "ID": 1,
  "FIRST": "Dipti",
  "LAST": "Borkar",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA"
}

All data in a single document
  
Making a Change Using RDBMS

[Figure: five related tables, linked by foreign keys]
  User Table:         User ID, First, Last, Zip, Country ID
  Photo Table:        User ID, Photo ID, Comment, Country ID
  Status Table:       User ID, Status ID, Text, Country ID
  Affiliations Table: User ID, Affl ID, Affl Name, Country ID
  Country Table:      Country ID, Country name
  
Making the Same Change with a Document Database

{
  "ID": 1,
  "FIRST": "Dipti",
  "LAST": "Borkar",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA",
  "STATUS":
    {
    "TEXT": "At Conf",
    "GEO_LOC": "134"
    },
  "COUNTRY": "USA"
}

Just add information to a document
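A minimal sketch of that "just add information" step, using a plain Python dict as a stand-in for a stored document (the field names follow the slide's example):

```python
import json

# The existing profile document, as on the earlier slide.
profile = {
    "ID": 1,
    "FIRST": "Dipti",
    "LAST": "Borkar",
    "ZIP": "94040",
    "CITY": "MV",
    "STATE": "CA",
}

# Adding status and country information is just a document update --
# no ALTER TABLE, no new tables, no new foreign keys.
profile["STATUS"] = {"TEXT": "At Conf", "GEO_LOC": "134"}
profile["COUNTRY"] = "USA"

print(json.dumps(profile, indent=2))
```

Other documents in the same bucket are unaffected: they simply do not carry the new fields until they are updated.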
  
Document modeling

Q
  •  Are these separate objects in the model layer?
  •  Are these objects accessed together?
  •  Do you need updates to these objects to be atomic?
  •  Are multiple people editing these objects concurrently?

When considering how to model data for a given application:
•  Think of a logical container for the data
•  Think of how data groups together
  
Document Design Options

•  One document that contains all related data
   –  Data is de-normalized
   –  Better performance and scale
   –  Eliminates client-side joins

•  Separate documents for different object types with cross references
   –  Data duplication is reduced
   –  Objects may not be co-located
   –  Transactions supported only on a document boundary
   –  Most document databases do not support joins
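The two options can be sketched with plain dicts standing in for a document store keyed by document ID. The `user::`/`post::` key names are illustrative, not mandated by any particular product:

```python
# A dict of document-ID -> document, standing in for a document store.
store = {}

# Option 1: one de-normalized document -- a single read returns everything.
store["user::1"] = {
    "name": "Dipti",
    "posts": [{"title": "Hello World", "body": "..."}],
}

# Option 2: separate documents with cross references -- less duplication,
# but reading related data takes multiple lookups (a client-side join),
# and updates spanning the two documents are not atomic.
store["user::2"] = {"name": "Joe", "post_ids": ["post::7"]}
store["post::7"] = {"title": "Hello World", "body": "..."}

user = store["user::2"]
posts = [store[pid] for pid in user["post_ids"]]  # the client-side join
```

The trade-off is exactly the bullet lists above: option 1 buys speed and atomicity at the cost of duplication; option 2 reduces duplication but pushes joins into the application.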
  
Document ID / Key selection

•  Similar to primary keys in relational databases
•  Documents are sharded based on the document ID
•  ID-based document lookup is extremely fast
•  Usually an ID can only appear once in a bucket

Q
  •  Do you have a unique way of referencing objects?
  •  Are related objects stored in separate documents?

Options
  •  UUIDs, date-based IDs, numeric IDs
  •  Hand-crafted (human readable)
  •  Matching prefixes (for multiple related objects)
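The three key styles can be sketched with the standard library. The `user::` and `event::` prefix conventions are illustrative examples, not a fixed rule:

```python
import uuid
import datetime

# UUID key: globally unique, but opaque.
generated = str(uuid.uuid4())

# Date-based key: e.g. one document per day for an event log.
dated = "event::" + datetime.date(2011, 12, 1).isoformat()

# Hand-crafted, human-readable key.
readable = "user::dipti"

# Matching prefixes group related objects: a profile and its sub-documents
# share a prefix, so their IDs can be derived from one another.
related = ["user::dipti", "user::dipti::posts", "user::dipti::badges"]
```

Because lookup by ID is the fast path, a key you can compute (rather than one you must query for) saves a round trip.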
  
Example: Entities for a Blog

•  User profile – the main pointer into the user data
   •  Blog entries
   •  Badge settings, like a twitter badge

•  Blog posts – contains the blogs themselves

•  Blog comments – comments from other users
  
Blog Document – Option 1 – Single document

{
  "_id": "jchris_Hello_World",
  "author": "jchris",
  "type": "post",
  "title": "Hello World",
  "format": "markdown",
  "body": "Hello from [Couchbase](http://couchbase.com).",
  "html": "<p>Hello from <a href=\"http: …",
  "comments": [
    { "format": "markdown", "body": "Awesome post!" },
    { "format": "markdown", "body": "Like it." }
  ]
}
  
Blog Document – Option 2 – Split into multiple docs

BLOG DOC
{
  "_id": "jchris_Hello_World",
  "author": "jchris",
  "type": "post",
  "title": "Hello World",
  "format": "markdown",
  "body": "Hello from [Couchbase](http://couchbase.com).",
  "html": "<p>Hello from <a href=\"http: …",
  "comments": [
    "comment1_jchris_Hello_world"
  ]
}

COMMENT
{
  "_id": "comment1_jchris_Hello_World",
  "format": "markdown",
  "body": "Awesome post!"
}
  
Threaded Comments

•  You can imagine how to take this to a threaded list

[Figure: blog document → comment list → first comment → reply-to comment → more comment lists]

Advantages
•  Only fetch the data when you need it
   •  For example, rendering part of a web page
•  Spread the data and load across the entire cluster
COMPARING SCALING MODELS
Relational Technology Scales Up

Application Scales Out
Just add more commodity web servers: system cost and application performance
grow together as users increase.

[Figure: web/app server tier scaling out as users grow]

RDBMS Scales Up
Get a bigger, more complex server: system cost keeps rising while application
performance won't scale beyond a certain point.

[Figure: relational database scaling up as users grow]

Expensive and disruptive sharding; doesn't perform at web scale.
Couchbase Server Scales Out Like App Tier

Application Scales Out
Just add more commodity web servers.

NoSQL Database Scales Out
Cost and performance mirror the app tier.

[Figure: Couchbase distributed data store scaling out as users grow]

Scaling out flattens the cost and performance curves.
  
EVALUATING NOSQL
  
The Process – From Evaluation to Go Live

No different from evaluating a relational database:

1  Analyze your requirements
2  Find solutions / products that match key requirements
3  Execute a proof of concept / performance evaluation
4  Begin development of application
5  Deploy in staging and then production

New requirements → New solutions
  
1  Analyze your requirements

Common application requirements:

•  Rapid application development
   –  Changing market needs
   –  Changing data needs
•  Scalability
   –  Unknown user demand
   –  Constantly growing throughput
•  Consistent Performance
   –  Low response time for better user experience
   –  High throughput to handle viral growth
•  Reliability
   –  Always online
  
2  Find solutions that match key requirements

NoSQL
•  Linear scalability
•  Schema flexibility
•  High performance

RDBMS
•  Multi-document transactions
•  Database rollback
•  Complex security needs
•  Complex joins
•  Extreme compression needs

Both / depends on the data: RDBMS + NoSQL
  
3  Proof of concept / Performance evaluation

Prototype a workload

•  Look for consistent performance…
   –  Low response times / latency
      •  For better user experience
   –  High throughput
      •  To handle viral growth
      •  For resource efficiency
•  … across
   –  Read-heavy / write-heavy / mixed workloads
   –  Clusters of growing sizes
•  … and watch for
   –  Contention / heavy locking
   –  Linear scalability
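A minimal sketch of the measurement side of such an evaluation. `do_operation` is a placeholder for a real read or write against the database under test; the point is to track tail latency and throughput, not just averages:

```python
import time
import statistics

def do_operation():
    """Placeholder for one workload operation (e.g. a get or set)."""
    pass

# Run the workload and record per-operation latency.
latencies = []
start = time.perf_counter()
for _ in range(10_000):
    t0 = time.perf_counter()
    do_operation()
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

throughput = len(latencies) / elapsed             # operations per second
p99 = statistics.quantiles(latencies, n=100)[98]  # 99th-percentile latency
```

Repeat the run for read-heavy, write-heavy, and mixed variants, and on clusters of growing size; contention or heavy locking typically shows up as a rising p99 long before the mean moves.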
  
3  Other considerations

Accessing data
   –  No standards exist yet
   –  Typically via SDKs or over HTTP
   –  Check if the programming language of your choice is supported

Consistency
   –  Consistent only at the document level
   –  Most document stores currently don't support multi-document transactions
   –  Analyze your application needs

Availability
   –  Each node stores active and replica data (Couchbase)
   –  Each node is either a master or a slave (MongoDB)
  
3  Other considerations

Operations
     –  Monitoring the system
     –  Backup and restore of the system
     –  Upgrades and maintenance
     –  Support

Ease of Scaling
     –  Ease of adding and reducing capacity
     –  Single node type
     –  App availability on topology changes

Indexing and Querying
     –  Secondary indexes (Map functions)
     –  Aggregates / grouping (Reduce functions)
     –  Basic querying
  	
  
                                                                                                 30	
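In this generation of document stores (Couchbase 2.0 views, CouchDB), a secondary index is declared as a map function and aggregates as a reduce function; the real functions are written in JavaScript and run server-side. A Python sketch of the model, with illustrative function names:

```python
# Secondary indexes from "map" functions, aggregates from "reduce" functions:
# map emits (key, value) pairs per document, the sorted keys form the index,
# and reduce folds the values per key.

def map_by_city(doc):
    # Emit one entry per document that has a CITY field.
    if "CITY" in doc:
        yield doc["CITY"], 1

def reduce_count(values):
    # Aggregate the emitted values per key (here: a simple count).
    return sum(values)

def build_view(docs, map_fn, reduce_fn=None):
    index = {}
    for doc in docs:
        for key, value in map_fn(doc):
            index.setdefault(key, []).append(value)
    if reduce_fn is None:
        return dict(sorted(index.items()))          # the secondary index itself
    return {k: reduce_fn(v) for k, v in sorted(index.items())}

docs = [
    {"ID": 1, "CITY": "MV"},
    {"ID": 2, "CITY": "SF"},
    {"ID": 3, "CITY": "MV"},
]
print(build_view(docs, map_by_city, reduce_count))  # {'MV': 2, 'SF': 1}
```

"Basic querying" then means range and key lookups against the sorted index, not ad-hoc SQL.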
  
4  Begin development

Data Modeling and Document Design



                                                   31	
  
5  Deploying to staging and production

•  Monitoring the system
     •  RESTful interfaces / easy integration with monitoring tools

•  High availability
     •  Replication
     •  Failover and auto-failover

•  Always online – even for maintenance tasks
     •  Database upgrades
     •  Software (OS) and hardware upgrades
     •  Backup and restore
     •  Index building
     •  Compaction
  

                                                                                                32	
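Because the management interface is RESTful, monitoring tools can simply poll it over HTTP and parse JSON. A sketch of the consuming side; the payload shape below is illustrative, not the actual API response (Couchbase's management REST API listens on port 8091):

```python
import json

# Illustrative node-status payload, similar in spirit to what a REST
# management API returns when a monitoring tool polls it.
sample = json.loads("""
{
  "nodes": [
    {"hostname": "10.0.0.1:8091", "status": "healthy"},
    {"hostname": "10.0.0.2:8091", "status": "unhealthy"}
  ]
}
""")

def unhealthy_nodes(payload):
    # Pick out the nodes a monitoring tool should alert on.
    return [n["hostname"] for n in payload["nodes"] if n["status"] != "healthy"]

print(unhealthy_nodes(sample))  # ['10.0.0.2:8091']
```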
  
Couchbase Server Admin Console

[Screenshot of the admin console]
                                              33	
  
34	
  
So are you being impacted by these?

Schema rigidity problems
  Q  •  Do you store serialized objects in the database?
     •  Do you have lots of sparse tables, with very few columns used by most rows?
     •  Do you find that your application developers frequently require schema changes due to constantly changing data?
     •  Are you using your database as a key-value store?

Scalability problems
  Q  •  Do you periodically need to upgrade to more powerful servers and scale up?
     •  Are you reaching the read/write throughput limit of a single database server?
     •  Is your server's read/write latency not meeting your SLA?
     •  Is your user base growing at a frightening pace?
  
                                                                                                                  35	
  
Is NoSQL the right choice for you?

Does your application need rich database functionality?

     •  Multi-document transactions
     •  Complex security needs – user roles, document-level security, authentication, authorization integration
     •  Complex joins across buckets / collections
     •  BI integration
     •  Extreme compression needs

If so, NoSQL may not be the right choice for your application



                                                                                                         36	
  
WHERE IS NOSQL A GOOD FIT?
  




                                                37	
  
Performance-driven use cases

•  Low latency
•  High throughput matters
•  Large number of users
•  Unknown demand with sudden growth of users/data
•  Predominantly direct document access
•  Workloads with a very high mutation rate per document (temporal locality)
•  Working set with heavy writes


                                                                           38	
  
Data-driven use cases

•  Support for unlimited data growth
•  Data with non-homogeneous structure
•  Need to change data structure quickly and often
•  3rd-party or user-defined structure
•  Variable-length documents
•  Sparse data records
•  Hierarchical data




                                                                                     39	
  
BRIEF OVERVIEW
COUCHBASE SERVER
  




                          40	
  
Couchbase Server 2.0

NoSQL Distributed Document Database
for interactive web applications


                                                     41	
  
Couchbase Server

Easy Scalability
     Grow the cluster without application changes, without downtime, with a single click

Consistent, High Performance
     Consistent sub-millisecond read and write response times with consistent high throughput

Always On 24x7x365
     No downtime for software upgrades, hardware maintenance, etc.
  




                                                                                 42	
  
Flexible Data Model

  {
      "ID": 1,
      "FIRST": "Dipti",
      "LAST": "Borkar",
      "ZIP": "94040",
      "CITY": "MV",
      "STATE": "CA"
  }

•  No need to worry about the database when changing your application
•  Records can have different structures; there is no fixed schema
•  Allows painless data-model changes for rapid application development
  
                                                                                                                   43	
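With no fixed schema, two records in the same bucket can differ in shape, and application code absorbs the difference. A small illustrative sketch; the fields beyond the slide's own example are made up:

```python
import json

# Two records with different structures stored side by side -- no migration
# was needed to add or drop fields in the newer one.
records = [
    json.loads('{"ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": "94040"}'),
    json.loads('{"ID": 2, "FIRST": "Alex", "COUNTRY": "US", "TAGS": ["beta"]}'),
]

def display_name(doc):
    # Read optional fields defensively instead of relying on a fixed schema.
    parts = [doc.get("FIRST"), doc.get("LAST")]
    return " ".join(p for p in parts if p)

print([display_name(d) for d in records])  # ['Dipti Borkar', 'Alex']
```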
  
         	
  
COUCHBASE SERVER
ARCHITECTURE
  




                              44	
  
Couchbase Server 2.0 Architecture

[Architecture diagram, rendered as text]

Client ports: 8092 (Query API), 11211 (Memcapable 1.0, via Moxi), 11210 (Memcapable 2.0)

Data Manager (on each node)
     –  Query Engine
     –  Moxi
     –  Memcached
     –  Couchbase EP Engine (storage interface)
     –  New Persistence Layer

Cluster Manager (Erlang/OTP; on each node, with one rebalance orchestrator per cluster)
     –  REST management API / Web UI (HTTP, port 8091)
     –  vBucket state and replication manager
     –  Global singleton supervisor
     –  Rebalance orchestrator
     –  Configuration manager
     –  Node health monitor
     –  Process monitor / heartbeat
     –  Erlang port mapper (4369), Distributed Erlang (21100–21199)
                                                                                                                                                                                                                                                                                                                                                                                45	
  
  
Couchbase deployment

[Diagram: each web application talks to the cluster through the Couchbase client library; data flows from the clients to the server nodes, with cluster management spanning all nodes]
                                                                           47	
  
Single node – Couchbase Write Operation

[Diagram: a write of Doc 1 from the app server lands in the managed cache first, then is placed on the replication queue (to other nodes) and on the disk queue for persistence]
                                  48	
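The write path above (acknowledge from the managed cache, then persist and replicate asynchronously via queues) can be sketched as follows; the class and method names are illustrative, not the real server internals:

```python
from collections import deque

class Node:
    """Toy single node: a managed cache in front of a disk, with async queues."""

    def __init__(self):
        self.cache = {}                      # managed cache (RAM)
        self.disk = {}                       # persisted copies
        self.disk_queue = deque()
        self.replication_queue = deque()

    def set(self, key, doc):
        self.cache[key] = doc                # (2) write lands in the managed cache
        self.disk_queue.append(key)          # (3) queued for persistence...
        self.replication_queue.append(key)   # ...and for replication to other nodes
        return "OK"                          # acknowledged before hitting disk

    def drain_disk_queue(self):
        # A background writer would do this continuously.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]

node = Node()
assert node.set("doc1", {"ID": 1}) == "OK"
assert "doc1" not in node.disk               # acknowledged but not yet persisted
node.drain_disk_queue()
assert node.disk["doc1"] == {"ID": 1}        # now on disk
```

This is why such systems are fast on writes, and also why a crash window exists between acknowledgement and persistence.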
  
Single node – Couchbase Update Operation

[Diagram: an update (Doc 1') from the app server replaces the copy in the managed cache, then is queued for replication and for disk; the previous version of Doc 1 remains on disk until the disk queue drains]
                                  49	
  
Single node – Couchbase Read Operation

[Diagram: a GET for Doc 1 from the app server is served straight from the managed cache; the copy on disk is not touched]
                                  50	
  
Single node – Couchbase Cache Eviction

[Diagram: a new write (Doc 6) arrives while the managed cache is full; already-persisted documents (Docs 2–5) are evicted from the cache to make room, while every document remains on disk]
                                                       51	
  
Single node – Couchbase Cache Miss

[Diagram: a GET for Doc 1 misses the managed cache; the document is read back from disk into the cache and then returned to the app server]
                                                                  52	
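Taken together, the read path of the last three slides is a read-through cache. A compact sketch; a plain LRU stands in for the server's actual eviction policy, which differs in detail:

```python
from collections import OrderedDict

class CachedStore:
    """Toy read-through cache over a disk dict, with LRU eviction."""

    def __init__(self, cache_size):
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.disk = {}
        self.misses = 0

    def set(self, key, doc):
        self.disk[key] = doc                 # simplified: write straight through
        self._cache_put(key, doc)

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)      # cache hit: served from RAM
            return self.cache[key]
        self.misses += 1                     # cache miss: fall through to disk
        doc = self.disk[key]
        self._cache_put(key, doc)            # repopulate the cache
        return doc

    def _cache_put(self, key, doc):
        self.cache[key] = doc
        self.cache.move_to_end(key)
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)   # evict the least recently used doc

store = CachedStore(cache_size=2)
for i in range(1, 4):
    store.set("doc%d" % i, {"ID": i})        # doc1 is evicted; doc2, doc3 cached
assert store.get("doc3")["ID"] == 3 and store.misses == 0   # hit
assert store.get("doc1")["ID"] == 1 and store.misses == 1   # miss -> disk
```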
  
Cluster wide – Basic Operation

[Diagram: two app servers, each embedding the Couchbase client library and a cluster map, issue reads/writes/updates against a three-node COUCHBASE SERVER CLUSTER; each server holds active docs plus replica docs]

•  Docs distributed evenly across servers
•  Each server stores both active and replica docs
   Only one server is active for a given doc at a time
•  Client library provides the app with a simple interface to the database
•  Cluster map provides a map of which server a doc is on
   The app never needs to know
•  App reads, writes, and updates docs
•  Multiple app servers can access the same document at the same time

User Configured Replica Count = 1
                                                                                                                                           53	
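How the client library turns a document key into a server can be sketched as follows. The 1,024-vBucket count matches Couchbase's default, but the hash and the round-robin map below are illustrative; the real client uses a CRC-based hash and a map published by the cluster:

```python
import hashlib

NUM_VBUCKETS = 1024  # Couchbase's default vBucket count

def vbucket_for(key):
    # Hash the key into a fixed vBucket space (illustrative hash choice).
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_VBUCKETS

def build_cluster_map(servers):
    # Assign every vBucket to exactly one active server.
    return {vb: servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)}

servers = ["server1", "server2", "server3"]
cluster_map = build_cluster_map(servers)

def server_for(key):
    # The app never needs to know: the client routes via the cluster map.
    return cluster_map[vbucket_for(key)]

assert server_for("user:1234") in servers
assert server_for("user:1234") == server_for("user:1234")  # deterministic routing
```

When the topology changes, only the map changes; application code keeps calling the same simple interface.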
  
Cluster wide – Add Nodes to Cluster

[Diagram: two servers join the three-node COUCHBASE SERVER CLUSTER; both app servers keep reading/writing through the client library and cluster map during the rebalance]

•  Two servers added
   One-click operation
•  Docs automatically rebalanced across the cluster
   Even distribution of docs
   Minimum doc movement
•  Cluster map updated
•  App database calls now distributed over a larger number of servers

User Configured Replica Count = 1
                                                                                                                                                       54	
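The "minimum doc movement" point can be made concrete: on rebalance, only the vBuckets needed to even out ownership are reassigned, and every other doc stays put. An illustrative sketch; the policy below is simplified, and real rebalancing also moves replicas and streams data in the background:

```python
NUM_VBUCKETS = 60  # small for illustration (Couchbase uses 1024)

def rebalance(assignment, servers):
    # Reassign only enough vBuckets to even out ownership across servers.
    target = len(assignment) // len(servers)
    counts = {s: 0 for s in servers}
    for owner in assignment.values():
        if owner in counts:
            counts[owner] += 1
    new_assignment = dict(assignment)
    for vb, owner in sorted(assignment.items()):
        if owner in counts and counts[owner] <= target:
            continue                          # this vBucket does not move
        dest = min(counts, key=counts.get)    # most under-loaded server
        if owner in counts:
            counts[owner] -= 1
        counts[dest] += 1
        new_assignment[vb] = dest
    return new_assignment

# Three servers own 20 vBuckets each; two servers join.
initial = {vb: ["s1", "s2", "s3"][vb % 3] for vb in range(NUM_VBUCKETS)}
after = rebalance(initial, ["s1", "s2", "s3", "s4", "s5"])

moved = sum(1 for vb in initial if initial[vb] != after[vb])
assert moved == 24  # only 8 vBuckets leave each old server; the other 36 stay put
```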
  
Cluster wide – Fail Over Node

[Diagram: the five-node cluster from the previous slide, with both app servers accessing docs through the client library and cluster map]

•  App servers accessing docs
•  Requests to Server 3 fail
            Doc	
  1	
      Doc	
               Doc	
  9	
       Doc	
     Doc	
  6	
      Doc	
  
                                                                                                                                                                           •  Cluster	
  detects	
  server	
  failed	
  
                                                                                                                                                                              Promotes	
  replicas	
  of	
  docs	
  to	
  
       Doc	
  2	
       Doc	
                   Doc	
  7	
       Doc	
            Doc	
  2	
      Doc	
               Doc	
  8	
       Doc	
                     Doc	
        ac&ve	
  
                                                                                                                                                                              Updates	
  cluster	
  map	
  
       Doc	
  1	
                               Doc	
  3	
  
                                                                                                                                                                           •  Requests	
  for	
  docs	
  now	
  go	
  to	
  
             REPLICA	
                                 REPLICA	
                        REPLICA	
                            REPLICA	
                 REPLICA	
              appropriate	
  server	
  

       Doc	
  4	
       Doc	
                   Doc	
  6	
       Doc	
            Doc	
  7	
      Doc	
               Doc	
  5	
      Doc	
      Doc	
  8	
      Doc	
     •  Typically	
  rebalance	
  	
  
                                                                                                                                                                              would	
  follow	
  
       Doc	
  1	
       Doc	
                   Doc	
  3	
       Doc	
            Doc	
  9	
      Doc	
               Doc	
  2	
                                 Doc	
  




                                                                        COUCHBASE	
  SERVER	
  	
  CLUSTER	
  


User	
  Configured	
  Replica	
  Count	
  =	
  1	
                                                                                                                                                                            55	
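The failover sequence above can be sketched as a toy simulation. Everything here is illustrative: the document and server names are invented, and a real cluster tracks partitions of documents rather than individual docs.

```python
# Toy sketch of failover: when a server is marked failed, the replicas of
# its active docs are promoted elsewhere and the cluster map is updated so
# clients re-route their requests (illustrative names throughout).
cluster_map = {"doc1": "server3", "doc2": "server1"}   # doc -> active server
replicas = {"doc1": "server4", "doc2": "server5"}      # doc -> replica server

def fail_over(failed_server):
    for doc, server in cluster_map.items():
        if server == failed_server:
            cluster_map[doc] = replicas[doc]   # promote the replica to active

fail_over("server3")
print(cluster_map["doc1"])   # server4
```

After the map update, requests for `doc1` go to the promoted replica; a rebalance would then restore the configured replica count.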
  
Indexing and Querying

(Diagram: App Servers 1 and 2 issue queries to Servers 1–3; each server holds active and replica docs)

•  Indexing work is distributed amongst nodes
•  Large data set possible
•  Parallelize the effort
•  Each node has index for data stored on it
•  Queries combine the results from required nodes

COUCHBASE SERVER CLUSTER
User Configured Replica Count = 1                                                                    56
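The scatter-gather pattern described above can be sketched in a few lines. The node names and index contents are invented for illustration; real per-node indexes are built by map functions over the documents each node stores.

```python
# Each node indexes only the documents it stores; a query fans out to the
# nodes (scatter) and merges their partial results (gather). Illustrative.
node_indexes = {
    "server1": {"city:MV": ["doc5", "doc2"]},
    "server2": {"city:MV": ["doc9"]},
    "server3": {"city:MV": ["doc4"]},
}

def query(key):
    results = []
    for index in node_indexes.values():     # scatter to every node
        results.extend(index.get(key, []))  # gather partial results
    return sorted(results)

print(query("city:MV"))   # ['doc2', 'doc4', 'doc5', 'doc9']
```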
  
Cross Data Center Replication (XDCR)

(Diagram: two Couchbase Server clusters, one in the NY DATA CENTER and one in the SF DATA CENTER; each cluster's Servers 1–3 hold active docs in RAM and on disk, and docs are replicated between the data centers)

                                                                                                     57
  
THANK YOU

DIPTI@COUCHBASE.COM
@DBORKAR

                                                                                                     58
  

Navigating the Transition from relational to NoSQL - CloudCon Expo 2012

  • 6.
    Document  Databases   • Each  record  in  the  database  is  a  self-­‐ describing  document     {   •  Each  document  has  an  independent   “UUID”:  “ 21f7f8de-­‐8051-­‐5b89-­‐86 “Time”:   “2011-­‐04-­‐01T13:01:02.42 “Server”:   “A2223E”, structure   “Calling   Server”:   “A2213W”, “Type”:   “E100”, “Initiating   User”:   “dsallings@spy.net”, •  Documents  can  be  complex     “Details”:   { “IP”:  “ 10.1.1.22”, •  All  databases  require  a  unique  key   “API”:   “InsertDVDQueueItem”, “Trace”:   “cleansed”, •  Documents  are  stored  using  JSON  or   “Tags”:   [ “SERVER”,   XML  or  their  deriva&ves   “US-­‐West”,   “API” ] •  Content  can  be  indexed  and  queried     } } •  Offer  auto-­‐sharding  for  scaling  and   replica&on  for  high-­‐availability   6  
  • 9.
    Relational vs Document data model

    Relational data model: highly-structured table organization with rigidly-defined data formats and record structure.
    Document data model: collection of complex documents with arbitrary, nested data formats and varying "record" format.
  • 10.
    Example: User Profile

    User Info table:
      KEY | First | Last   | ZIP_id
      1   | Dipti | Borkar | 2
      2   | Joe   | Smith  | 2
      3   | Ali   | Dodson | 2
      4   | John  | Doe    | 3

    Address Info table:
      ZIP_id | CITY | STATE | ZIP
      1      | DEN  | CO    | 30303
      2      | MV   | CA    | 94040
      3      | CHI  | IL    | 60609
      4      | NY   | NY    | 10010

    To get information about a specific user, you perform a join across the two tables.
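The join in this slide can be reproduced with an in-memory SQLite database; the table and column names follow the slide's example, and the snippet is only a sketch of the relational access pattern:

```python
import sqlite3

# Relational version of the user-profile lookup: user info and address info
# live in separate tables, so reading one profile requires a join.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE user_info (key INTEGER, first TEXT, last TEXT, zip_id INTEGER);
CREATE TABLE address_info (zip_id INTEGER, city TEXT, state TEXT, zip TEXT);
INSERT INTO user_info VALUES (1, 'Dipti', 'Borkar', 2);
INSERT INTO address_info VALUES (2, 'MV', 'CA', '94040');
""")

row = con.execute("""
    SELECT u.first, u.last, a.city, a.state, a.zip
    FROM user_info u JOIN address_info a ON u.zip_id = a.zip_id
    WHERE u.key = 1
""").fetchone()
print(row)   # ('Dipti', 'Borkar', 'MV', 'CA', '94040')
```

The document version on the next slide avoids the join entirely by keeping all profile fields in one record.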
  • 11.
    Document Example: User Profile

    {
      "ID": 1,
      "FIRST": "Dipti",
      "LAST": "Borkar",
      "ZIP": "94040",
      "CITY": "MV",
      "STATE": "CA"
    }

    All data in a single document.
  • 12.
    Making a Change Using RDBMS

    The change spans several tables:
    •  User Table (User ID, First, Last, Zip, Country ID), e.g. 1, Dipti, Borkar, 94040, 001
    •  Photo Table (User ID, Photo ID, Comment, Country ID), e.g. 2, d043, NYC, 001
    •  Status Table (User ID, Status ID, Text, Country ID), e.g. 1, a42, At conf, 134
    •  Affiliations Table (User ID, Affl ID, Affl Name, Country ID), e.g. 2, a42, Cal, 001
    •  Country Table (Country ID, Country name), e.g. 001, USA; 134, Sweden; referenced by all of the above
  • 13.
    Making the Same Change with a Document Database

    {
      "ID": 1,
      "FIRST": "Dipti",
      "LAST": "Borkar",
      "ZIP": "94040",
      "CITY": "MV",
      "STATE": "CA",
      "STATUS": { "TEXT": "At Conf", "GEO_LOC": "134" },
      "COUNTRY": "USA"
    }

    Just add information to a document.
  • 14.
    Document modeling

    When considering how to model data for a given application:
    •  Think of a logical container for the data
    •  Think of how data groups together

    Questions to ask:
    •  Are these separate objects in the model layer?
    •  Are these objects accessed together?
    •  Do you need updates to these objects to be atomic?
    •  Are multiple people editing these objects concurrently?
  • 15.
    Document Design Options

    •  One document that contains all related data
       - Data is de-normalized
       - Better performance and scale
       - Eliminate client-side joins
    •  Separate documents for different object types with cross references
       - Data duplication is reduced
       - Objects may not be co-located
       - Transactions supported only on a document boundary
       - Most document databases do not support joins
  • 16.
    Document ID / Key selection

    •  Similar to primary keys in relational databases
    •  Documents are sharded based on the document ID
    •  ID-based document lookup is extremely fast
    •  Usually an ID can only appear once in a bucket

    Questions to ask:
    •  Do you have a unique way of referencing objects?
    •  Are related objects stored in separate documents?

    Options:
    •  UUIDs, date-based IDs, numeric IDs
    •  Hand-crafted (human readable)
    •  Matching prefixes (for multiple related objects)
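The key-construction options listed in this slide can be sketched as follows; the key formats and names here are illustrative, not prescribed by any particular database:

```python
import uuid
import datetime

# 1. Opaque UUID key: unique, but carries no meaning.
uuid_key = str(uuid.uuid4())

# 2. Date-based ID: useful when keys should sort by time.
date_key = "event::" + datetime.date(2012, 10, 1).isoformat()

# 3. Matching prefixes: hand-crafted keys that tie related objects together,
#    so a user's documents can be found from the user key alone.
user_key = "user::dipti"
blog_key = user_key + "::blog::hello_world"

print(date_key)   # event::2012-10-01
print(blog_key)   # user::dipti::blog::hello_world
```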
  • 17.
    Example: Entities for a Blog

    •  User profile (the main pointer into the user data)
       - Blog entries
       - Badge settings, like a twitter badge
    •  Blog posts (contains the blogs themselves)
    •  Blog comments (comments from other users)
  • 18.
    Blog Document: Option 1, Single document

    {
      "_id": "jchris_Hello_World",
      "author": "jchris",
      "type": "post",
      "title": "Hello World",
      "format": "markdown",
      "body": "Hello from [Couchbase](http://couchbase.com).",
      "html": "<p>Hello from <a href=\"http: …",
      "comments": [
        { "format": "markdown", "body": "Awesome post!" },
        { "format": "markdown", "body": "Like it." }
      ]
    }
  • 19.
    Blog Document: Option 2, Split into multiple docs

    BLOG DOC:
    {
      "_id": "jchris_Hello_World",
      "author": "jchris",
      "type": "post",
      "title": "Hello World",
      "format": "markdown",
      "body": "Hello from [Couchbase](http://couchbase.com).",
      "html": "<p>Hello from <a href=\"http: …",
      "comments": [
        "comment1_jchris_Hello_World"
      ]
    }

    COMMENT DOC:
    {
      "_id": "comment1_jchris_Hello_World",
      "format": "markdown",
      "body": "Awesome post!"
    }
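With the post and its comments split into separate documents, the application resolves the comment references itself, i.e. a client-side join. In this sketch a plain dict stands in for the key-value bucket; a real client would issue one get per referenced key:

```python
# The bucket maps document IDs to documents (mirroring the slide's split).
bucket = {
    "jchris_Hello_World": {
        "type": "post",
        "title": "Hello World",
        "comments": ["comment1_jchris_Hello_World"],
    },
    "comment1_jchris_Hello_World": {
        "format": "markdown",
        "body": "Awesome post!",
    },
}

# Client-side join: fetch the post, then fetch each referenced comment doc.
post = bucket["jchris_Hello_World"]
comments = [bucket[cid] for cid in post["comments"]]
print(comments[0]["body"])   # Awesome post!
```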
  • 20.
    Threaded Comments

    You can imagine how to take this to a threaded list: the blog document points to a list of comments, the first comment points to its replies, and so on.

    Advantages:
    •  Only fetch the data when you need it, for example when rendering part of a web page
    •  Spread the data and load across the entire cluster
  • 21.
    COMPARING SCALING MODEL
  • 22.
    Relational Technology Scales Up

    •  Application tier scales out: just add more commodity web servers; cost and performance grow with the number of users
    •  RDBMS scales up: get a bigger, more complex server; won't scale beyond a point
    •  Expensive and disruptive sharding; doesn't perform at web scale
  • 23.
    Couchbase Server Scales Out Like App Tier

    •  Application scales out: just add more commodity web servers
    •  NoSQL database scales out: cost and performance mirror the app tier
    •  Scaling out flattens the cost and performance curves
  • 25.
    The Process: From Evaluation to Go Live

    No different from evaluating a relational database:
    1. Analyze your requirements
    2. Find solutions / products that match key requirements
    3. Execute a proof of concept / performance evaluation
    4. Begin development of application
    5. Deploy in staging and then production

    New requirements → new solutions
  • 26.
    1. Analyze your requirements

    Common application requirements:
    •  Rapid application development
       - Changing market needs
       - Changing data needs
    •  Scalability
       - Unknown user demand
       - Constantly growing throughput
    •  Consistent performance
       - Low response time for better user experience
       - High throughput to handle viral growth
    •  Reliability
       - Always online
  • 27.
    2. Find solutions that match key requirements

    NoSQL:
    •  Linear scalability
    •  Schema flexibility
    •  High performance

    RDBMS:
    •  Multi-document transactions
    •  Database rollback
    •  Complex security needs
    •  Complex joins
    •  Extreme compression needs

    Both / depends on the data
  • 28.
    3. Proof of concept / Performance evaluation

    Prototype a workload:
    •  Look for consistent performance…
       - Low response times / latency (for better user experience)
       - High throughput (to handle viral growth, for resource efficiency)
    •  … across
       - Read-heavy / write-heavy / mixed workloads
       - Clusters of growing sizes
    •  … and watch for
       - Contention / heavy locking
       - Linear scalability
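A proof of concept needs numbers, not impressions. This is a minimal sketch of the kind of harness such an evaluation uses: issue a batch of operations, record per-operation latency, and derive throughput. Here `store()` is a stand-in for a real database client call, and the in-memory dict is only a placeholder workload:

```python
import time

def store(db, key, value):
    # Stand-in for a real database write; replace with a client SDK call.
    db[key] = value

db, latencies = {}, []
start = time.perf_counter()
for i in range(10_000):
    t0 = time.perf_counter()
    store(db, f"doc::{i}", {"n": i})
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

# Throughput and tail latency are the figures to track across workloads
# and cluster sizes.
latencies.sort()
print(f"throughput: {len(latencies) / elapsed:,.0f} ops/s")
print(f"p99 latency: {latencies[int(0.99 * len(latencies))] * 1e6:.1f} us")
```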
  • 29.
    3. Other considerations

    Accessing data:
    •  No standards exist yet
    •  Typically via SDKs or over HTTP
    •  Check if the programming language of your choice is supported

    Consistency:
    •  Consistent only at the document level
    •  Most document stores currently don't support multi-document transactions
    •  Analyze your application needs

    Availability:
    •  Each node stores active and replica data (Couchbase)
    •  Each node is either a master or slave (MongoDB)
  • 30.
    3. Other considerations

    Operations:
    •  Monitoring the system
    •  Backup and restore of the system
    •  Upgrades and maintenance
    •  Support

    Ease of scaling:
    •  Ease of adding and reducing capacity
    •  Single node type
    •  App availability on topology changes

    Indexing and querying:
    •  Secondary indexes (map functions)
    •  Aggregates / grouping (reduce functions)
    •  Basic querying
  • 31.
    4. Begin development: data modeling and document design
  • 32.
    5. Deploying to staging and production

    •  Monitoring the system
       - RESTful interfaces / easy integration with monitoring tools
    •  High availability
       - Replication
       - Failover and auto-failover
    •  Always online, even for maintenance tasks
       - Database upgrades
       - Software (OS) and hardware upgrades
       - Backup and restore
       - Index building
       - Compaction
  • 33.
    Couchbase Server Admin Console
  • 35.
    So are you being impacted by these?

    Schema rigidity problems:
    •  Do you store serialized objects in the database?
    •  Do you have lots of sparse tables with very few columns being used by most rows?
    •  Do you find that your application developers require schema changes frequently due to constantly changing data?
    •  Are you using your database as a key-value store?

    Scalability problems:
    •  Do you periodically need to upgrade systems to more powerful servers and scale up?
    •  Are you reaching the read/write throughput limit of a single database server?
    •  Is your server's read/write latency not meeting your SLA?
    •  Is your user base growing at a frightening pace?
  • 36.
    Is NoSQL the right choice for you?

    Does your application need rich database functionality?
    •  Multi-document transactions
    •  Complex security needs: user roles, document-level security, authentication, authorization integration
    •  Complex joins across buckets / collections
    •  BI integration
    •  Extreme compression needs

    If so, NoSQL may not be the right choice for your application.
  • 37.
    WHERE IS NOSQL A GOOD FIT?
  • 38.
    Performance-driven use cases

    •  Low latency
    •  High throughput matters
    •  Large number of users
    •  Unknown demand with sudden growth of users/data
    •  Predominantly direct document access
    •  Workloads with a very high mutation rate per document (temporal locality)
    •  Working set with heavy writes
  • 39.
    Data-driven use cases

    •  Support for unlimited data growth
    •  Data with non-homogeneous structure
    •  Need to quickly and often change data structure
    •  3rd-party or user-defined structure
    •  Variable-length documents
    •  Sparse data records
    •  Hierarchical data
  • 41.
    Couchbase Server 2.0: NoSQL distributed document database for interactive web applications
  • 42.
    Couchbase Server

    •  Easy scalability: grow the cluster without application changes, without downtime, with a single click
    •  Consistent, high performance: consistent sub-millisecond read and write response times with consistent high throughput
    •  Always on, 24x7x365: no downtime for software upgrades, hardware maintenance, etc.
  • 43.
    Flexible Data Model

    {
      "ID": 1,
      "FIRST": "Dipti",
      "LAST": "Borkar",
      "ZIP": "94040",
      "CITY": "MV",
      "STATE": "CA"
    }

    •  No need to worry about the database when changing your application
    •  Records can have different structures; there is no fixed schema
    •  Allows painless data model changes for rapid application development
  • 44.
    COUCHBASE SERVER ARCHITECTURE
  • 45.
    Couchbase Server 2.0 Architecture

    (Diagram) Components on each node:
    •  Data Manager: Query API (port 8092), Memcapable 1.0 (11211), Memcapable 2.0 (11210), Moxi, Query Engine, Memcached, Couchbase EP Engine, storage interface, new persistence layer
    •  Cluster Manager (Erlang/OTP): REST management API / Web UI over HTTP (8091), vBucket state and replication manager, global singleton supervisor, rebalance orchestrator (one per cluster), configuration manager, node health monitor, process monitor, heartbeat; Erlang port mapper (4369), distributed Erlang (21100–21199)
  • 47.
    Couchbase deployment

    (Diagram: web application with the Couchbase client library connected to the cluster; data flow and cluster management are separate paths)
  • 48.
    Single node: Couchbase Write Operation

    (Diagram) A write of Doc 1 from the app server lands in the managed cache, then flows into the replication queue (to other nodes) and the disk queue (to disk) on the Couchbase server node.
  • 49.
    Single node: Couchbase Update Operation

    (Diagram) An update (Doc 1') replaces Doc 1 in the managed cache and flows through the replication queue and disk queue like a write.
  • 50.
    Single node: Couchbase Read Operation

    (Diagram) A GET for Doc 1 is served directly from the managed cache.
  • 51.
    Single node: Couchbase Cache Eviction

    (Diagram) When a new document (Doc 6) arrives and the managed cache is full, older documents are evicted from the cache; all documents remain on disk.
  • 52.
    Single node: Couchbase Cache Miss

    (Diagram) A GET for Doc 1 that is not in the managed cache is read from disk, placed back into the cache, and returned to the app server.
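The read path, eviction, and cache-miss behavior in these slides can be modeled as a tiny cache in front of a disk store. The cache size, the LRU policy, and the document names are illustrative only, not Couchbase's actual eviction scheme:

```python
from collections import OrderedDict

# Toy managed cache: serve hits from memory, fall back to disk on a miss,
# evict the least recently used entry when the cache overflows.
CACHE_SIZE = 2
cache = OrderedDict()
disk = {"doc1": "v1", "doc2": "v2", "doc3": "v3"}

def get(key):
    if key in cache:
        cache.move_to_end(key)        # cache hit: mark as recently used
        return cache[key]
    value = disk[key]                 # cache miss: read from disk
    cache[key] = value                # place the doc back into the cache
    if len(cache) > CACHE_SIZE:
        cache.popitem(last=False)     # evict the least recently used doc
    return value

get("doc1"); get("doc2"); get("doc3")
print(list(cache))   # ['doc2', 'doc3'] (doc1 was evicted, but stays on disk)
```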
  • 53.
    Cluster wide: Basic Operation

    •  Docs distributed evenly across servers
    •  Each server stores both active and replica docs; only one copy of a doc is active at a time
    •  Client library provides app with simple interface to database
    •  Cluster map provides map to which server a doc is on; app never needs to know
    •  App reads, writes, updates docs
    •  Multiple app servers can access the same document at the same time

    User Configured Replica Count = 1
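The cluster map lets every client compute which server holds a document without asking the cluster. A sketch of the idea: hash the document ID into a fixed number of partitions (Couchbase calls them vBuckets) and look the partition up in a shared map. The partition count, hash function, and server names here are illustrative, not Couchbase's actual scheme:

```python
import zlib

# Map each partition to a server; this table is the "cluster map" that the
# client library keeps in sync with the cluster (illustrative layout).
NUM_PARTITIONS = 16
cluster_map = {p: f"server-{p % 3 + 1}" for p in range(NUM_PARTITIONS)}

def server_for(doc_id: str) -> str:
    # Hash the key to a partition, then look up the partition's server.
    partition = zlib.crc32(doc_id.encode()) % NUM_PARTITIONS
    return cluster_map[partition]

# Every client computes the same answer for the same key.
assert server_for("user::dipti") == server_for("user::dipti")
print(server_for("user::dipti"))
```

When topology changes (rebalance, failover), only the map is updated; the app code never changes.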
  • 54.
    Cluster  wide  -­‐  Add  Nodes  to  Cluster   APP  SERVER  1   APP  SERVER  2   COUCHBASE  Client  Library   COUCHBASE  Client  Library       CLUSTER  MAP     CLUSTER  MAP     READ/WRITE/UPDATE   READ/WRITE/UPDATE   SERVER  1     SERVER  2     SERVER  3     SERVER  4     SERVER  5     •  Two  servers  added     ACTIVE     ACTIVE     ACTIVE     ACTIVE     ACTIVE   One-­‐click  opera&on   Doc  5   Doc   Doc  4   Doc   Doc  1   Doc   •  Docs  automa&cally   rebalanced  across   Doc  2   Doc   Doc  7   Doc   Doc  2   Doc   cluster   Even  distribu&on  of  docs   Minimum  doc  movement   Doc  9   Doc   Doc  8   Doc   Doc  6   Doc   •  Cluster  map  updated   REPLICA   REPLICA   REPLICA   REPLICA   REPLICA   •  App  database     Doc  4   Doc   Doc  6   Doc   Doc  7   Doc   calls  now  distributed     over  larger  number  of   Doc  1   Doc   Doc  3   Doc   Doc  9   Doc   servers     Doc  8   Doc   Doc  2   Doc   Doc  5   Doc   COUCHBASE  SERVER    CLUSTER   User  Configured  Replica  Count  =  1   54  
Cluster-wide: Fail Over Node

• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects the server has failed
  – Promotes replicas of its docs to active
  – Updates the cluster map
• Requests for those docs now go to the appropriate server
• Typically a rebalance would follow

[Diagram: Server 3 of a five-server cluster has failed; its active docs (e.g. Doc 1, Doc 2, Doc 3) are promoted from replicas held on the surviving servers, and the cluster map is updated. User-configured replica count = 1]

55
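The promotion step can be sketched as a pure function over the cluster map: any vBucket whose active copy lived on the failed node switches to its replica. The map shape below is an assumption for illustration; it also shows why a rebalance typically follows, since failed-over vBuckets are left without a replica.

```python
def fail_over(cluster_map, failed):
    """Promote replicas when a server fails; returns the updated map.

    cluster_map: dict vbucket_id -> {"active": server, "replica": server}.
    Sketch only -- vBuckets that lose a copy need a rebalance to regain
    redundancy, which is why failover is usually followed by one.
    """
    new_map = {}
    for vb, placement in cluster_map.items():
        active, replica = placement["active"], placement["replica"]
        if active == failed:
            # the replica copy on a surviving node becomes the active copy
            new_map[vb] = {"active": replica, "replica": None}
        elif replica == failed:
            # active copy survives but has lost its replica
            new_map[vb] = {"active": active, "replica": None}
        else:
            new_map[vb] = dict(placement)
    return new_map
```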
Indexing and Querying

• Indexing work is distributed amongst nodes
  – Large data sets possible
  – Parallelizes the effort
• Each node has an index for the data stored on it
• Queries combine the results from the required nodes

[Diagram: App servers issue a query to a three-server Couchbase cluster; each node answers from the index over its own active docs, and the partial results are combined. User-configured replica count = 1]

56
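The combine step is a scatter-gather merge: each node returns a sorted partial result over its own docs, and the client merges them into one global ranking. The per-node indexes below are made-up illustrative data, not a real query API.

```python
import heapq

# Each node indexes only the docs it stores; every list is already
# sorted by rating, descending (illustrative data, not a real API).
node_indexes = {
    "server1": [("beer::a", 4.5), ("beer::d", 4.1)],
    "server2": [("beer::b", 4.8), ("beer::e", 3.9)],
    "server3": [("beer::c", 4.6)],
}

def query_top(limit):
    """Merge each node's sorted partial result into one global ranking."""
    merged = heapq.merge(*node_indexes.values(), key=lambda kv: -kv[1])
    return [doc_id for doc_id, _rating in merged][:limit]
```

`heapq.merge` never materializes the full result, so each node's work stays parallel and the coordinator only pays for the merge.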
Cross Data Center Replication (XDCR)

[Diagram: Two three-server Couchbase clusters, one in the NY data center and one in the SF data center; docs move through RAM and disk on each NY server and are replicated to the corresponding SF servers.]

57
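The core idea of XDCR, pushing changed docs from one cluster to another so the newer copy wins, can be sketched with a revision counter. This is a toy model under assumed structures: the real protocol streams changes continuously and resolves conflicts on richer metadata than a single integer.

```python
def xdcr_replicate(source, dest):
    """Push doc mutations from one cluster to another; newest revision wins.

    source/dest: dict doc_id -> (revision, value). Sketch only.
    """
    for doc_id, (rev, value) in source.items():
        dest_rev = dest.get(doc_id, (0, None))[0]
        if rev > dest_rev:          # source copy is newer: replicate it
            dest[doc_id] = (rev, value)
    return dest
```

Running the same function in the other direction gives bidirectional replication: each side keeps whichever revision is higher, so both data centers converge.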
THANK YOU

DIPTI@COUCHBASE.COM
@DBORKAR

58