NoSQL: An Architect's Perspective
Eberhard Wolff
Architecture and Technology Manager
adesso AG




03.04.13
About me
►    Eberhard Wolff
►    Architecture & Technology Manager at adesso
►    adesso is a leading IT consultancy in Germany
►    Speaker
►    Author (e.g. first German Spring book)
►    Blog: http://ewolff.com
►    Twitter: @ewolff
►    eberhard.wolff@adesso.de
Back in the Days….




03.04.13   NoSQL from an Architect's Perspective
NoSQL Is All About the Persistence Question




Key-Value Stores
►    Maps keys to values, e.g. key 42 to the value "Some data"
►    Just a large globally available Map

►    i.e. not very powerful data model


►    No complex queries or indices
►    Just access by key


►    Redis: Think cache + Persistence
►    Riak: Think massive scale
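
The "large globally available Map" idea can be sketched in a few lines. This is a toy in-memory stand-in, not the API of Redis or Riak; the class name KeyValueStore and its methods are made up for illustration. The point is that the key is the only access path.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is opaque to the store: no schema, no index on its content.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Access by key is the only query the store offers.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


store = KeyValueStore()
store.put("42", "Some data")
value = store.get("42")
missing = store.get("43")  # unknown keys simply yield None
```

Anything beyond get/put/delete, such as "find all values containing X", would have to be built in application code.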




Wide Column
►    Add any "column" you like to a row
►    Not key-value - "key-(column-value)"
►    Column families are like tables
►    E.g. in the "Users" column family
     >  "someuser" → ("username" → "someuser"),
           ("email" → "someuser@example.com")
►    Columns named: indexing possible
►    So queries possible




►    Apache Cassandra
►    Amazon SimpleDB
►    Apache HBase
►    All tuned for large data sets
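
A minimal sketch of the "key-(column-value)" model, using plain Python dictionaries rather than the API of Cassandra, SimpleDB or HBase. The "city" column, the second user and the build_index helper are assumptions added for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# column family: row key -> {column name -> value}
ColumnFamily = Dict[str, Dict[str, str]]

users: ColumnFamily = defaultdict(dict)

# Rows in the same column family may carry different columns.
users["someuser"]["username"] = "someuser"
users["someuser"]["email"] = "someuser@example.com"
users["otheruser"]["username"] = "otheruser"
users["otheruser"]["city"] = "Berlin"  # hypothetical extra column


def build_index(family: ColumnFamily, column: str) -> Dict[str, List[str]]:
    """Map each value of a named column to the row keys that contain it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for row_key, columns in family.items():
        if column in columns:
            index[columns[column]].append(row_key)
    return index


# Because columns are named, they can be indexed and therefore queried.
email_index = build_index(users, "email")
rows_with_email = email_index["someuser@example.com"]
```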

Document Stores
►    Aggregates are typically stored as "documents" (key-value collection)
►    JSON quite common
►    No fixed schema
►    Indexes possible
►    Queries possible
     >  E.g. "find all baskets that contain the product 123"
►    Still great horizontal scalability
►    Relations might be modeled as links


►    MongoDB, CouchDB
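
The basket query from the list above can be sketched against plain JSON-like dictionaries. This is not MongoDB or CouchDB code; the field names (_id, customer, items, coupon) are assumptions chosen for the example.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]

# No fixed schema: the third basket has a field the others lack.
baskets: List[Document] = [
    {"_id": 1, "customer": "alice", "items": [{"product": 123, "qty": 2}]},
    {"_id": 2, "customer": "bob", "items": [{"product": 456, "qty": 1}]},
    {"_id": 3, "customer": "carol", "coupon": "SPRING",
     "items": [{"product": 123, "qty": 1}, {"product": 789, "qty": 4}]},
]


def baskets_containing(docs: List[Document], product_id: int) -> List[Document]:
    """Find all baskets that contain the given product."""
    return [
        doc for doc in docs
        if any(item["product"] == product_id for item in doc.get("items", []))
    ]


matching_ids = [doc["_id"] for doc in baskets_containing(baskets, 123)]
```

A real document store would answer such a query from an index on the nested field instead of scanning every document.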




Graph
►    Nodes with Properties
►    Typed relationships with properties


►    Ideal e.g. to model relations in a social network


►    Easy to find number of followers, degree of relation etc.
►    Hard to scale out


►    Neo4j
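
A small sketch of typed relationships with properties, and of the two queries mentioned above (number of followers, degree of relation). It uses a plain edge list and breadth-first search, not Neo4j's API; all names and data are made up.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# (source node, relationship type, target node, relationship properties)
Edge = Tuple[str, str, str, Dict[str, str]]

edges: List[Edge] = [
    ("alice", "FOLLOWS", "bob", {"since": "2012"}),
    ("carol", "FOLLOWS", "bob", {"since": "2013"}),
    ("bob", "FOLLOWS", "dave", {"since": "2011"}),
]


def follower_count(person: str) -> int:
    """Count incoming FOLLOWS relationships."""
    return sum(1 for _, rel, target, _ in edges
               if rel == "FOLLOWS" and target == person)


def degree_of_relation(start: str, goal: str) -> Optional[int]:
    """Hops between two people, ignoring direction; None if unconnected."""
    neighbours: Dict[str, Set[str]] = {}
    for source, _, target, _ in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for nxt in neighbours.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None
```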




NoSQL Benefits
Costs (Ops)
•  Scale out instead of scale up
•  Cheap hardware
•  Usually open source

Flexibility (Dev)
•  Schema in code, not in database
•  Easier to upgrade schema
•  Easier to handle heterogeneous data

No object/relational impedance mismatch (Dev)
•  NoSQL databases are more OO-like
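
One way "schema in code" can make upgrades easier is to migrate documents lazily when they are read. The sketch below assumes a hypothetical schema_version field and a made-up version 2 change that splits a "name" field; it is an illustration, not a feature of any particular database.

```python
from typing import Any, Dict

CURRENT_VERSION = 2  # hypothetical version number kept in application code


def upgrade(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document in the current schema version."""
    upgraded = dict(doc)
    if upgraded.get("schema_version", 1) < 2:
        # Assumed change in version 2: "name" is split into two fields.
        first, _, last = upgraded.pop("name", "").partition(" ")
        upgraded["first_name"] = first
        upgraded["last_name"] = last
        upgraded["schema_version"] = CURRENT_VERSION
    return upgraded


old_doc = {"name": "Ada Lovelace"}  # written before the change
new_doc = {"schema_version": 2, "first_name": "Grace", "last_name": "Hopper"}

upgraded_old = upgrade(old_doc)
upgraded_new = upgrade(new_doc)
```

Old and new documents can live side by side in the store, so no big-bang migration of all data is needed.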
Drivers

Driver         Trend                      Database type
Cost           Exponential Data Growth    Key Value
               Scale Out                  Wide Column
Flexibility    Semi-Structured Data       Document
               More Connected Data        Graph


Document-oriented Databases are the best NoSQL databases

For at least one definition of “best”



Document-oriented databases
►    Offer scale out (Cost)
     >  Unless you need huge amounts of data
►    Offer a rich and flexible data model (Flexibility)
     >  …and queries

►  Other databases have other sweet spots
   >  Huge data sets
   >  Graph structures
   >  Analyzing data
►  Niches or mainstream?




Polyglot Persistence in an E-commerce Application
►    Financial Data: RDBMS
     >  Needs transactions & reports. Data fit well in tables.
►    Product Catalog: Document Store
     >  Complex document-like data structures and complex queries
►    Shopping Cart: Key / Value
     >  High performance & scalability
     >  No complex queries
►    Recommendation: Graph
     >  Based on friends, their purchases and reviews




The NoSQL Game
High Score! 2700
►    Financial Data: RDBMS (0 points)
     >  Needs transactions & reports. Data fit well in tables.
►    Product Catalog: Document Store (1000 points)
     >  Complex document-like data structures and complex queries
►    Shopping Cart: Key / Value (900 points)
     >  High performance & scalability
     >  No complex queries
►    Recommendation: Graph (800 points)
     >  Based on friends, their purchases and reviews




Just Like the Patterns Game!
    Points for each Pattern used
    Extra points if one class implements multiple Patterns




This is not how Software Architecture works.




Why not?
More is worse!

More hardware

More Developer Skillz
•  Not necessarily bad

More Ops Trouble
•  Installation
•  Backup
•  Disaster Recovery
•  Monitoring
•  Optimizations


But: Polyglot Persistence Has a Point
Object-oriented Databases did it wrong
•  Strategy: Replace RDBMS
•  Enterprises will stick to RDBMS
•  Pure technology migration basically never happens
•  …only vendors think differently

Example: Archive Database
•  Store current data in RDBMS
•  Store archive in NoSQL (MongoDB)
•  Archive contains mainframe data
•  Benefit: Use flexibility to allow for many data formats
•  Benefit: No need to convert mainframe data
•  Benefit: Store lots of data cheaply
Complex Document Processing System




MongoDB
•  Document-oriented
•  Documents

Redis
•  Key/value in memory
•  Meta Data for quick access

elasticsearch
•  Search engine
•  Search index
Alternative: Only elasticsearch




elasticsearch
•  Stores original documents as well (like a key/value store)
•  Support for complex queries
•  Very powerful features also for data mining / analytics
Alternative: Only MongoDB




MongoDB
•  Now with (limited beta) fulltext search
•  Quite fast – memory mapped files
•  So why Redis?
•  Map/Reduce support
What about Redis?




Redis
•  Like a Swiss Army knife
•  Cache
•  Messaging
•  Central coordination in a distributed environment
Your Choice – a trade off!
           Typical architecture decision




Who Does What? RDBMS

  Developer / Architect
  ►  Schema design (at least partly)
  ►  Access code

  DBA
  ►  Performance tuning
  ►  Indices
  ►  Query optimization
  ►  Changes do not influence code




Data Access: RDBMS
DBA
Optimizations
•  Indices
•  Tablespaces
•  …
No need to change code

Architect/Developer
Data Model
•  Schema
•  Stored Procedures
Data Access
•  Queries
•  Other code
RDBMS separate data from data access
Indices
Joins and normalization allow flexible data access patterns
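
The separation can be seen in a small sketch with Python's built-in sqlite3 module. The table and index names are made up for the example. The query code stays the same before and after the index is added; only the way the database executes it may change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 5.5), (3, 2, 7.0)],
)

# Developer side: the access code, written once against normalized tables.
QUERY = """
SELECT c.name, SUM(o.total)
FROM customer c JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
"""


def totals_per_customer():
    return conn.execute(QUERY).fetchall()


before = totals_per_customer()

# DBA side: an optimization that needs no change in the access code.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

after = totals_per_customer()
```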




Data Access: MongoDB
DBA
Optimizations
•  Only basic indices
Other optimizations must be done in code

Architect/Developer
Data Model
•  Influences access patterns
Data Access
•  WriteConcerns: how much do you love your data?
•  Shard key
•  Consistency
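
How the data model influences access patterns can be sketched with plain dictionaries, without any MongoDB API. The order data and field names are assumptions. The same information is modeled twice: once embedded in one document, once as separate documents that reference each other.

```python
from typing import Any, Dict, List

# Variant A: order lines are embedded in the order document.
orders_embedded: Dict[int, Dict[str, Any]] = {
    1: {"customer": "alice", "lines": [{"product": 123, "qty": 2},
                                       {"product": 456, "qty": 1}]},
}

# Variant B: order lines are separate documents that reference the order.
orders_referenced: Dict[int, Dict[str, Any]] = {1: {"customer": "alice"}}
order_lines: List[Dict[str, Any]] = [
    {"order_id": 1, "product": 123, "qty": 2},
    {"order_id": 1, "product": 456, "qty": 1},
    {"order_id": 2, "product": 123, "qty": 5},
]


def lines_embedded(order_id: int) -> List[Dict[str, Any]]:
    """One lookup by key returns the whole aggregate."""
    return orders_embedded[order_id]["lines"]


def lines_referenced(order_id: int) -> List[Dict[str, Any]]:
    """Without joins in the store, the application filters the lines itself."""
    return [
        {"product": line["product"], "qty": line["qty"]}
        for line in order_lines
        if line["order_id"] == order_id
    ]


same_result = lines_embedded(1) == lines_referenced(1)
```

Both variants return the same lines, but the work moves: with embedding the store does it in one read, with references the "join" becomes application code.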
Cluster: RDBMS

►    Works somehow


►    A special setup of hardware and RDBMS software




DBA
Cluster: MongoDB
►    CAP theorem
     >  Consistency
     >  Availability
     >  Partition tolerance
     >  Choose any two
►    Deals with replication
►    MongoDB has master / slave replication
►    Write Concerns:
     >  Unacknowledged
     >  Acknowledged
     >  Journaled
     >  Some nodes in the replica set
►    Queries might go to master only or also slaves
►    Influences consistency

Architect/Developer
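
The write concern levels can be illustrated with a toy simulation. This is not the MongoDB driver API: the Node and ReplicaSet classes, the concern names as strings and the w parameter are assumptions made for this sketch. The return value is the number of nodes the caller knows to hold the write.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    memory: Dict[str, str] = field(default_factory=dict)
    journal: List[str] = field(default_factory=list)


@dataclass
class ReplicaSet:
    """Toy replica set: one master and two slaves (hypothetical model)."""
    master: Node = field(default_factory=Node)
    slaves: List[Node] = field(default_factory=lambda: [Node(), Node()])

    def write(self, key: str, value: str,
              concern: str = "acknowledged", w: int = 1) -> int:
        self.master.memory[key] = value
        if concern == "unacknowledged":
            return 0  # fire and forget: the caller gets no confirmation
        if concern == "acknowledged":
            return 1  # the master confirmed the in-memory write
        if concern == "journaled":
            self.master.journal.append(key)  # survives a master restart
            return 1
        if concern == "replicated":
            # Confirm only after w nodes (master included) hold the write.
            targets = self.slaves[: max(w - 1, 0)]
            for node in targets:
                node.memory[key] = value
            return 1 + len(targets)
        raise ValueError(f"unknown write concern: {concern}")


replica_set = ReplicaSet()
confirmed_by = replica_set.write("order:1", "paid", concern="replicated", w=2)
```

The stronger the concern, the more work happens before the call returns, which is the trade-off between safety and latency that the developer now has to choose.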
More Power and more Responsibility

                                               Architect

           DB Admin




Architects

►      Architecture has always been a multi-dimensional problem

►      Need to choose persistence technology

►      Need to think about operations

►      Need to do DBA work
NoSQL Is All About the Persistence Question




