OrientDB Distributed Architecture v2.0

Short history
In 2012, we had Master/Slave replication.
While it scaled up well on reads, users complained that the single Master node was a bottleneck.
It’s quite easy to scale up reads; the hard part is to scale up both reads and writes.
Copyright (c) - Orient Technologies LTD

How Master/Slave works
[Diagram: three clients (C), one Master node, and two Slave nodes. Writes go to the Master node, so the Master node is the bottleneck.]

Master/Slave
PROS:
- Relatively easy to develop
CONS:
- The master is the bottleneck for writes
- No matter how many servers you have, the throughput is limited by the Master node
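
The limitation above can be illustrated with a toy model. This is only a sketch: the function names and the per-node numbers are made up for illustration, and it assumes every node can serve the same number of operations per second.

```python
def master_slave_write_throughput(num_nodes: int, writes_per_node: int) -> int:
    """Writes per second of a Master/Slave cluster (toy model).

    Every write goes through the single Master node, so adding Slave
    nodes does not raise the write throughput.
    """
    if num_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return writes_per_node


def master_slave_read_throughput(num_nodes: int, reads_per_node: int) -> int:
    """Reads per second (toy model): any node can serve reads, so they scale."""
    if num_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return num_nodes * reads_per_node


# Hypothetical figures: 1000 operations per second per node.
for nodes in (1, 3, 10):
    print(nodes,
          master_slave_read_throughput(nodes, 1000),
          master_slave_write_throughput(nodes, 1000))
```

In this model the read column grows with the number of nodes while the write column stays flat, which is the bottleneck the slide describes.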

What happened to OrientDB's M/S architecture?
This is the old MASTER/SLAVE replication

2012: new architectural goals
- Multi-Master: all the nodes must accept writes
- Sharding: split data into multiple partitions
- Better Fail-Over
- Simplified configuration with Auto-Discovery

Auto-Discovery
[Diagram: a client (C) connected to a single Master node. The node finds no other nodes: "I’m the only one!"]

Auto-Discovery
[Diagram: a client (C) and two Master nodes. The two nodes discover each other: "Connected!"]

Clients see the distributed configuration
The updated distributed configuration is broadcast to all the connected clients.
[Diagram: a client (C) and two Master nodes.]

Auto-reconnect in case of failure
In case of failure, the clients auto-reconnect to the available nodes.
[Diagram: two clients (C) and two Master nodes.]
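
A minimal client-side sketch of the two behaviours above: the client keeps the list of nodes it receives from the cluster and, when its current node fails, retries on the other available nodes. This is not the OrientDB client API. `FailoverClient`, `NodeDown`, `on_configuration_update` and the `send` callable are hypothetical names used only for illustration.

```python
class NodeDown(Exception):
    """Raised by the (hypothetical) transport when a node cannot be reached."""


class FailoverClient:
    def __init__(self, nodes):
        self.nodes = list(nodes)  # nodes known from the distributed configuration
        self.current = self.nodes[0] if self.nodes else None

    def on_configuration_update(self, nodes):
        """Called when the cluster broadcasts an updated configuration."""
        self.nodes = list(nodes)

    def execute(self, send, request):
        """Send the request, preferring the current node, then any other node."""
        # Stable sort: the current node (key False) comes first.
        ordered = sorted(self.nodes, key=lambda n: n != self.current)
        for node in ordered:
            try:
                result = send(node, request)
            except NodeDown:
                continue  # try the next available node
            self.current = node  # stay on the node that answered
            return result
        raise NodeDown("no available nodes")
```

A usage example would pass a `send(node, request)` function that raises `NodeDown` for unreachable nodes; the client then ends up attached to the first node that answers.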

Auto-deploy of databases
Databases are automatically deployed to the new joining nodes.
[Diagram: clients (C) and two Master nodes, each node holding a copy of the database (DB).]

Classes rely on Cluster to store records
By default: 1 class -> 1 cluster
[Diagram: class Customer stored in cluster customer.]

Classes can be split into more clusters
Define multiple clusters and assign them to each node.
[Diagram: class Customer split into clusters customer_usa, customer_china, and customer_europe.]

Assign 1 cluster per Node
[Diagram: class Customer spread over three Master nodes, one cluster per node: customer_usa, customer_europe, customer_china.]
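
The class-to-cluster-to-node mapping can be sketched as two lookup tables. The cluster names come from the slides. The node names (`node1` to `node3`), the helper functions, and the idea of picking the cluster from a region suffix are assumptions made for illustration, not OrientDB's own selection logic.

```python
# One class, split into several clusters (names from the slides).
CLASS_CLUSTERS = {
    "Customer": ["customer_usa", "customer_europe", "customer_china"],
}

# One cluster per node; the node names are hypothetical.
CLUSTER_OWNER = {
    "customer_usa": "node1",
    "customer_europe": "node2",
    "customer_china": "node3",
}


def cluster_for(class_name: str, region: str) -> str:
    """Pick the cluster of a class by region suffix (illustrative rule)."""
    name = f"{class_name.lower()}_{region}"
    if name not in CLASS_CLUSTERS[class_name]:
        raise KeyError(f"class {class_name} has no cluster {name}")
    return name


def node_for(class_name: str, region: str) -> str:
    """Return the node that owns the cluster for this record."""
    return CLUSTER_OWNER[cluster_for(class_name, region)]


print(node_for("Customer", "europe"))
```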

What about sharding + replication?
We used a solution similar to RAID for hard drives

RAID for databases
Replica factor = 2
[Diagram: class Customer over three Master nodes, with clusters customer_china, customer_usa, and customer_europe.]

RAID for databases
Replica factor = 3
Each node owns all customers
[Diagram: three Master nodes, each holding customer_usa, customer_china, and customer_europe.]
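
A sketch of how a replica factor could map clusters onto nodes. The slides do not give the placement rule, so this assumes a simple round-robin scheme: cluster number i is stored on node i and on the following nodes, wrapping around. The function and node names are hypothetical.

```python
def place_replicas(clusters, nodes, replica_factor):
    """Return {node: [clusters]} using round-robin placement (an assumption)."""
    if not 1 <= replica_factor <= len(nodes):
        raise ValueError("replica factor must be between 1 and the node count")
    placement = {node: [] for node in nodes}
    for i, cluster in enumerate(clusters):
        # Store the cluster on replica_factor consecutive nodes.
        for k in range(replica_factor):
            placement[nodes[(i + k) % len(nodes)]].append(cluster)
    return placement


clusters = ["customer_usa", "customer_europe", "customer_china"]
nodes = ["node1", "node2", "node3"]

print(place_replicas(clusters, nodes, 2))  # every cluster lives on 2 nodes
print(place_replicas(clusters, nodes, 3))  # every node owns all customers
```

With three nodes, a replica factor of 3 puts every cluster on every node, which matches the "each node owns all customers" case. A replica factor of 2 keeps two copies of each cluster, so the loss of one node loses no data.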

Replication: under the hood
Client sends an INSERT request
[Diagram: a client (C) sends an INSERT to one of three Master nodes. Each node has its own Hazelcast (HZ) queue for requests.]
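
The request fan-out can be sketched with one queue per node. Python's in-process `queue.Queue` stands in for the Hazelcast distributed queues here, and the function names, node names, and request string are hypothetical. This illustrates the idea only, not OrientDB's implementation.

```python
import queue


def make_node_queues(node_names):
    """One request queue per node (stand-in for a distributed queue)."""
    return {name: queue.Queue() for name in node_names}


def distribute(request, node_queues):
    """Put the same request into the queue of every node."""
    for q in node_queues.values():
        q.put(request)


def drain(q):
    """Read everything currently waiting in a queue."""
    items = []
    while not q.empty():
        items.append(q.get_nowait())
    return items


queues = make_node_queues(["node1", "node2", "node3"])
distribute("INSERT INTO Customer SET name = 'Jay'", queues)
for name, q in queues.items():
    print(name, drain(q))
```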

Response handling
WriteQuorum = 2
[Diagram: three Master nodes, each with Hazelcast (HZ) queues for requests and responses. With WriteQuorum = 2, the node sends OK to the client (C).]
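
A minimal sketch of a write-quorum check: responses are counted as they arrive, and the caller can answer OK as soon as `write_quorum` nodes have returned the same result. The function name, node names, and result strings are hypothetical.

```python
from collections import Counter


def wait_for_quorum(responses, write_quorum):
    """Return the first result confirmed by write_quorum nodes, else None.

    responses: iterable of (node, result) pairs in arrival order.
    """
    counts = Counter()
    for _node, result in responses:
        counts[result] += 1
        if counts[result] >= write_quorum:
            return result  # quorum reached: OK can be sent to the client
    return None  # not enough matching responses


arrived = [("node1", "created #12:0"), ("node2", "created #12:0")]
print(wait_for_quorum(arrived, write_quorum=2))
```

Here the client can be answered after two matching responses, without waiting for the third node.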

Fix the unaligned node
[Diagram: three Master nodes, each with Hazelcast (HZ) queues for requests and responses. A Fix is sent to the node whose response does not match the others.]
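
One way to find the node that needs a fix is to compare all the responses and flag any node whose result differs from the most common one. This majority rule is an assumption for illustration, since the slides do not state how the winning result is chosen. The names are hypothetical.

```python
from collections import Counter


def find_unaligned(responses):
    """Return (winning_result, nodes_to_fix).

    responses: {node: result}. The most common result wins, which is an
    assumed rule. Nodes with any other result are reported as unaligned.
    """
    winner, _count = Counter(responses.values()).most_common(1)[0]
    to_fix = sorted(node for node, result in responses.items()
                    if result != winner)
    return winner, to_fix


winner, to_fix = find_unaligned(
    {"node1": "version 2", "node2": "version 2", "node3": "version 1"}
)
print(winner, to_fix)
```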

Linear and Elastic scalability on both reads & writes!
[Diagram: many clients (C) connected to many Master nodes.]

Hazelcast’s role
- Auto-Discovery (Multicast/TCP-IP/Amazon)
- Queues for requests and responses
- Store metadata in distributed Maps
- Distributed Locks

OrientDB’s Future Roadmap
- OrientDB 2.0 (Sept 2014) has even better performance: +300% improvement on all the distributed operations
- Pluggable conflict resolution strategy
- Auto-discovery also by Clients
