CHANGING FOR MONGODB
GIVING SOME CONTEXT
SWITCHING TO MONGODB
WHAT DOES THE SWITCH TO MONGODB CHANGE?
• FOR DEVELOPERS & ARCHITECTS
• FOR OPS AND USERS OF THE PRODUCT
NUXEO PLATFORM
We provide a Platform that developers can use to
build highly customised content applications.
We provide the components and the tools to assemble them.
https://github.com/nuxeo
NUXEO PLATFORM & STORAGE
CONTENT REPOSITORY
STORAGE BACKEND
SIMPLIFY SOFTWARE ARCHITECTURE
OFFER EASY SCALABILITY OPTIONS
SMALL IMPACT ON DEVELOPMENT
SIMPLIFY ARCHITECTURE
MAKING WORK EASIER FOR OPS & ARCHITECTS
DEPLOYMENT
SCALABILITY
DEVELOPMENT COMPLEXITY
IMPEDANCE MISMATCH ISSUE
NO LAZY LOADING
NO CACHE
NO INVALIDATION
A LOT OF COMPLEXITY AND PROBLEMS AVOIDED !
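
The impedance mismatch can be illustrated with a minimal sketch. It assumes a hypothetical relational layout in which one logical document is spread over three tables, and compares it with the same content held as a single nested record. The table and field names are invented for illustration and are not taken from the Nuxeo schema.

```python
# Illustrative data only: table layout and field names are hypothetical.

# Relational layout: one logical document is spread over several tables.
hierarchy_rows = [{"id": "doc1", "parent_id": "root", "name": "contract.pdf"}]
property_rows = [{"id": "doc1", "title": "Contract", "creator": "alice"}]
tag_rows = [
    {"doc_id": "doc1", "label": "legal"},
    {"doc_id": "doc1", "label": "2016"},
]


def load_from_tables(doc_id):
    """Rebuild one document from three tables: three lookups, then assembly."""
    node = next(r for r in hierarchy_rows if r["id"] == doc_id)
    props = next(r for r in property_rows if r["id"] == doc_id)
    tags = [r["label"] for r in tag_rows if r["doc_id"] == doc_id]
    return {
        "id": node["id"],
        "parent_id": node["parent_id"],
        "name": node["name"],
        "title": props["title"],
        "creator": props["creator"],
        "tags": tags,
    }


# Document layout: the same content stored as one nested record.
document_store = {
    "doc1": {
        "id": "doc1",
        "parent_id": "root",
        "name": "contract.pdf",
        "title": "Contract",
        "creator": "alice",
        "tags": ["legal", "2016"],
    }
}


def load_from_document_store(doc_id):
    """Fetch the whole document with a single lookup."""
    return document_store[doc_id]
```

In the relational case, an ORM typically adds lazy loading and a cache to hide the extra lookups, and the cache then has to be invalidated. With the single-record layout there is nothing to assemble, which is the simplification these slides point at.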
IMPACT ON DEPLOYMENT
HYBRID STORAGE
• Large stream, large storage: attached blobs
• Flexible schema, write once / read many: audit log, activity log
• Complex structures, synchronous R/W: document properties and hierarchy
• Flexible schema, search: search index
CONSOLIDATED STORAGE
SINGLE CONSOLIDATED STORAGE
Structure, Blobs, Audit & Index
FEWER BUILDING BLOCKS TO PROVISION & CONFIGURE
Easier to deploy
EASY DEPLOYMENT
"BUILT-IN" DATA REDUNDANCY & FAULT TOLERANCE
Active / Active
SIMPLICITY?
• No ORM hell
• Single consolidated storage
• Out-of-the-box robust deployment
SCALABILITY
AVOID HEADACHES AT DEPLOYMENT TIME
IMPROVE END-USER EXPERIENCE
WILL I BE FASTER WITH MONGODB?
BUILT FOR SPEED
NO IMPEDANCE ISSUE
• Fewer backend calls
• No invalidation costs
DOCUMENT-LEVEL LOCKING
• No table-level concurrency
NATIVE DISTRIBUTED ARCHITECTURE
• Easy scale-out of reads
SPEED
https://benchmarks.nuxeo.com/continuous/index.html
Significant raw speed improvements for all use cases
More importantly: some use cases are handled much better
MORE THAN RAW PERFORMANCE
No cache
• Less memory per connection
• Can handle more connections
• Can handle more concurrent users
Handle more concurrent connections
Writes are not blocked by reads
• With SQL, read and write operations compete
• With MongoDB, write operations are not blocked
No side effect of impedance mismatch
Processing large sets of objects is challenging with an ORM
• Lazy loading
• Cache thrashing
Sample batch on 100,000 documents
• 750 documents/s with SQL backend (cold cache)
• 11,500 documents/s with MongoDB / WiredTiger: about 15x
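
As a quick check of the batch figures above, the following sketch derives the total batch duration and the speed-up from the two quoted throughputs. Only the three input numbers come from the slide; the function name is illustrative.

```python
# Inputs quoted on the slide.
DOCUMENT_COUNT = 100_000
SQL_DOCS_PER_SECOND = 750
MONGODB_DOCS_PER_SECOND = 11_500


def batch_seconds(document_count, docs_per_second):
    """Time needed to process the whole batch at a constant throughput."""
    return document_count / docs_per_second


sql_seconds = batch_seconds(DOCUMENT_COUNT, SQL_DOCS_PER_SECOND)
mongodb_seconds = batch_seconds(DOCUMENT_COUNT, MONGODB_DOCS_PER_SECOND)
speedup = MONGODB_DOCS_PER_SECOND / SQL_DOCS_PER_SECOND

print(f"SQL:     {sql_seconds:.1f} s")      # about 133 s
print(f"MongoDB: {mongodb_seconds:.1f} s")  # about 9 s
print(f"Speed-up: {speedup:.1f}x")          # about 15x
```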
Will I scale better with MongoDB?
SCALABILITY
SCALE OUT READS
• Leverage replica sets
• Read from secondaries
SCALE OUT WRITES
• Leverage sharding
• Spread writes
No impact at application level!
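
One way to see why reading from secondaries need not touch application code: with MongoDB, the read preference can be set in the connection string. The sketch below only builds such a string. `replicaSet` and `readPreference` are standard MongoDB connection string options; the host names, database name, and replica set name are placeholders, and how Nuxeo itself is configured is not shown in these slides.

```python
from urllib.parse import urlencode


def build_mongodb_uri(hosts, database, replica_set, read_preference):
    """Assemble a MongoDB connection string for a replica set."""
    options = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference}
    )
    return f"mongodb://{','.join(hosts)}/{database}?{options}"


# Placeholder hosts and names, for illustration only.
uri = build_mongodb_uri(
    hosts=["mongo1.example.com:27017", "mongo2.example.com:27017"],
    database="nuxeo",
    replica_set="rs0",
    read_preference="secondaryPreferred",
)
print(uri)
```

Switching between reading from the primary and reading from secondaries is then a configuration change rather than a code change.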
SCALABILITY
SCALE OUT TEST
Use massive read operations and queries.
• 2 Nuxeo nodes + 1 MongoDB node: 1850 docs/s (MongoDB CPU is the bottleneck, 800%)
• 2 Nuxeo nodes + 2 MongoDB nodes: 3400 docs/s (using read preferences)
SCALABILITY
SHARDING TEST
• 2 Nuxeo nodes + 1 MongoDB replica set: 11,000 docs/s
• 2 Nuxeo nodes + 3 MongoDB sharded replica sets: 27,400 docs/s
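
The scale-out and sharding figures can be turned into speed-up and scaling efficiency with a small calculation. The throughputs are the ones quoted on the slides. Treating the node counts (2 MongoDB nodes versus 1, 3 sharded replica sets versus 1) as the scaling factor is an assumption of this sketch.

```python
def scaling(base_rate, scaled_rate, factor):
    """Return (speed-up, efficiency) when resources are multiplied by `factor`."""
    speedup = scaled_rate / base_rate
    efficiency = speedup / factor
    return speedup, efficiency


# Read scale-out test: 1 MongoDB node versus 2 MongoDB nodes.
read_speedup, read_efficiency = scaling(1850, 3400, factor=2)

# Sharding test: 1 replica set versus 3 sharded replica sets.
shard_speedup, shard_efficiency = scaling(11_000, 27_400, factor=3)

print(f"Reads:    {read_speedup:.2f}x, efficiency {read_efficiency:.0%}")
print(f"Sharding: {shard_speedup:.2f}x, efficiency {shard_efficiency:.0%}")
```

Under that assumption, reads scale by roughly 1.84x on twice the nodes and sharded throughput by roughly 2.49x on three times the replica sets.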
DEVELOPMENT IMPACT
CHANGES FROM A DEV POINT OF VIEW
NEW STORAGE MODEL
• Document-level transactions
• No MVCC isolation
MEANS
• Different transaction paradigms
• Provide shared mitigation policies for critical use cases
CONSISTENCY IN OUR CONTEXT
Transactions cannot span multiple documents
• Atomic document operations are safe
• Large batch updates cannot be atomic
Multi-document transactions can be problematic
• Workflows or custom event handlers
FIND A WAY TO MITIGATE APPLICATION-LEVEL IMPACT
CONSISTENCY
TRANSIENT STATE MANAGER
• Run all operations in memory
• Populate an undo log
Recover application-level transaction management
• Commit / rollback model
"Read uncommitted" isolation
• Need to flush transient state for queries
• "Uncommitted" changes are visible to others
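
The transient state idea can be sketched as follows. This is a minimal illustration and not the Nuxeo implementation: the class name, method names, and the plain dict standing in for the storage backend are all assumptions. Writes stay in memory until a flush, each flushed write records the previous value in an undo log, and a rollback replays that log in reverse.

```python
import copy

_MISSING = object()  # marks "document did not exist before this write"


class TransientState:
    """Hypothetical sketch: in-memory writes plus an undo log over a dict store."""

    def __init__(self, store):
        self.store = store    # dict doc_id -> document, stands in for the backend
        self.pending = {}     # writes not yet flushed to the store
        self.undo_log = []    # (doc_id, previous value) for each flushed write

    def write(self, doc_id, document):
        self.pending[doc_id] = copy.deepcopy(document)

    def flush(self):
        """Push pending writes to the store, remembering what they replace."""
        for doc_id, document in self.pending.items():
            if doc_id in self.store:
                previous = copy.deepcopy(self.store[doc_id])
            else:
                previous = _MISSING
            self.undo_log.append((doc_id, previous))
            self.store[doc_id] = document
        self.pending.clear()

    def query(self, predicate):
        """Queries run against the store, so transient state is flushed first."""
        self.flush()
        return [doc for doc in self.store.values() if predicate(doc)]

    def commit(self):
        self.flush()
        self.undo_log.clear()

    def rollback(self):
        """Drop pending writes and undo flushed ones, most recent first."""
        self.pending.clear()
        for doc_id, previous in reversed(self.undo_log):
            if previous is _MISSING:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = previous
        self.undo_log.clear()


store = {"a": {"title": "old"}}
tx = TransientState(store)
tx.write("a", {"title": "new"})
tx.write("b", {"title": "draft"})
found = tx.query(lambda doc: doc["title"] == "new")  # forces a flush
visible_before_rollback = store["a"]["title"]        # other readers see "new"
tx.rollback()
```

After the query, the flushed values are visible to any other reader of the store even though nothing was committed, which is the "read uncommitted" behaviour described above. The rollback then restores the store to its original content.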
CHANGE INERTIA
New Model
New API
New Query system
PROVIDE AN EASY MIGRATION PATH
NUXEO APPROACH
High level API + Encapsulation
Storage Adapters
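
The combination of a high-level API with storage adapters can be sketched like this. The class and method names are hypothetical and do not reproduce the Nuxeo API. The point is only that application code talks to the repository, while the adapter behind it can be swapped.

```python
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """Hypothetical low-level contract that each storage backend implements."""

    @abstractmethod
    def save(self, doc_id, properties):
        ...

    @abstractmethod
    def load(self, doc_id):
        ...


class RowStorageAdapter(StorageAdapter):
    """Stands in for a SQL backend: one (doc_id, key, value) row per property."""

    def __init__(self):
        self.rows = []

    def save(self, doc_id, properties):
        self.rows = [row for row in self.rows if row[0] != doc_id]
        self.rows.extend((doc_id, key, value) for key, value in properties.items())

    def load(self, doc_id):
        return {key: value for row_id, key, value in self.rows if row_id == doc_id}


class DocumentStorageAdapter(StorageAdapter):
    """Stands in for a document backend: one record per document."""

    def __init__(self):
        self.documents = {}

    def save(self, doc_id, properties):
        self.documents[doc_id] = dict(properties)

    def load(self, doc_id):
        return dict(self.documents[doc_id])


class DocumentRepository:
    """High-level API used by application code, independent of the backend."""

    def __init__(self, adapter):
        self._adapter = adapter

    def create(self, doc_id, **properties):
        self._adapter.save(doc_id, properties)
        return self.get(doc_id)

    def get(self, doc_id):
        return self._adapter.load(doc_id)


# The same application code runs unchanged on either backend.
results = [
    DocumentRepository(adapter).create("d1", title="Spec", state="draft")
    for adapter in (RowStorageAdapter(), DocumentStorageAdapter())
]
```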
DOCUMENT REPOSITORY
Helps transitioning between storages
TAKEAWAYS
Changing for MongoDB
• Simplify architecture
• Offer simple scalability options
• Be an easy migration
Content Management + MongoDB
You should try Nuxeo!
ANY QUESTIONS?
THANK YOU!
https://github.com/nuxeo nuxeo.com/careers/ @damienmetzler

MongoDB Europe 2016 - Using MongoDB to Build a Fast and Scalable Content Repository
