2016.03.09
Where does Geode fit in
modern system architectures?
Eitan Suez
• Eitan Suez
• Pivotal Consultant Instructor
• Teach GemFire, Cloud Native, PCF
• Prior to joining Pivotal, was Principal Consultant with ThoughtWorks
• Long-time software developer, based in Austin, TX
About Me
• Over the years have worked on many enterprise projects for a number of
customers
• First hands-on experience with Geode when consulting at Southwest
Airlines..
• ..in the role of technical lead on a multi-team project, where Geode played a
prominent role in the system architecture
Relationship with Geode
• gfsh
• OQL and the data browser
• PDX serialization
• Spring Data GemFire
• Learn how to do automated functional testing with it
My Journey
• At first, we were so focused on building features
• Regions were already defined by solutions architects; we treated them as
tables
• Didn’t pay close attention to the fact that we had:
• near-linear scale-out capabilities built-in with partitioned regions
• fault-tolerance with redundant data copies
• locators adding indirection, clients isolated from cluster specifics
Don’t immediately realize what you’ve got
Example: Queries against partitioned regions
[Diagram: a Client sends a query against a partitioned region to a Geode Distributed System of three Servers. Each Server holds part of the Partitioned Region; one Server hosts the Query Executor.]
Can go further with server-side functions
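The fan-out behind a query against a partitioned region can be illustrated with a small, self-contained sketch. This is not Geode's API or its actual routing logic: the plain dictionaries standing in for servers, the `crc32`-based `owner_of` rule, and the names `put`, `local_query`, and `query` are all assumptions made for illustration.

```python
from zlib import crc32

NUM_SERVERS = 3  # assumption: three servers, as in the diagram

# Each dict stands in for the slice of the partitioned region held by one server.
servers = [dict() for _ in range(NUM_SERVERS)]


def owner_of(key: str) -> int:
    """Pick the server that owns a key (hypothetical routing rule: stable hash)."""
    return crc32(key.encode("utf-8")) % NUM_SERVERS


def put(key: str, value: dict) -> None:
    """Store an entry on exactly one server, chosen by its key."""
    servers[owner_of(key)][key] = value


def local_query(partition: dict, predicate) -> list:
    """Runs on a single server, against its local data only."""
    return [value for value in partition.values() if predicate(value)]


def query(predicate) -> list:
    """Query executor: fan out to every partition, then merge the partial results."""
    results = []
    for partition in servers:
        results.extend(local_query(partition, predicate))
    return results


for i in range(10):
    put(f"customer-{i}", {"id": i, "city": "Austin" if i % 2 == 0 else "Dallas"})

austin = query(lambda customer: customer["city"] == "Austin")
print(sorted(customer["id"] for customer in austin))  # [0, 2, 4, 6, 8]
```

The caller never needs to know which server holds which entry. A server-side function would go one step further and run arbitrary logic next to the data instead of only a filter.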
• A Database, but in-memory?
• Can also double as a simple cache?
• A key-value store, but supports queries?
• Supports transactions
• Events?
Unique Combination of Features
• Briefly reviewed the traits of Apache Geode
• It takes time to “wrap one’s head around” the whole of this
product
Impressive feature set
So, what can you do with it?
• Specific to Java stack: O/RM and Hibernate
• can plug in as Hibernate L2 Cache
• Peer-to-peer configuration
Use Cases “in the Small”
• Can be an out-of-process cache server, like Redis, or memcached
➡ gemcached
• These are fine, but do not take advantage of the full feature set
Use Cases “in the Small”
Canonical Architecture
[Diagram: a Geode Distributed System with a Locator and Servers, each hosting Regions and Functions. Client connections are labeled “Queries, Transactions, Function Executions” and “Events, Continuous Queries”. A CacheLoader and an AsyncEventListener sit between the servers and a Backing Store.]
• On a couple of projects over the last couple of years, have been
exposed to CQRS
• At first it seemed strange, or overly complex. Didn't get it
• Kept asking myself:
• Why not start out simpler?
• Seems rather complicated
• It’s more work
Switching Gears
• Stands for Command Query Responsibility Segregation
• A Pattern
• deliberately not prescriptive regarding how you implement this
separation
• Separation all the way down to the database
• Germ of the idea came from Bertrand Meyer (of Eiffel fame), with the
concept of CQS (Command-Query Separation)
• Introduced and proposed by Greg Young
• Active .NET community, Udi Dahan among others
What is CQRS?
CQRS..
..tells you what,
not how,
but to answer why,
we are asked to look at what happens when you “go there”
When reads and writes are separate..
• With a single schema, you’re forced to optimize for one at the
expense of the other
• With two schemas, one can be optimized for reads and the
other for writes (have your cake and eat it too)
• relational model for writes
• denormalized views for reads
..can optimize reads and writes
Data-Representation Spectrum
Read-Optimized: denormalized views
Write-Optimized: 3rd normal form
Reading when your data is normalized
[Diagram: a Request flows through Controller, Services, and Repositories to a Relational Database. Labels along the way: multiple queries, lots of joins; results; transformations; compositions; views.]
Constantly reassembling views
• no joins necessary
• no transformations
• no need to reconstruct a view model for each
request
With a denormalized schema
[Diagram: Apache Geode holding Region: Customers, Region: Orders, Region: Products, . . .]
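The difference between reassembling a view on every request and reading a prebuilt one can be sketched in a few lines. Everything here is illustrative: the sample data, the function names, and the `order_views` dictionary standing in for a region are assumptions, not part of the talk or of Geode's API.

```python
# Normalized (write-optimized) tables: each fact is stored once.
customers = {1: {"name": "Ada"}}
products = {
    10: {"title": "Keyboard", "price": 50},
    11: {"title": "Mouse", "price": 20},
}
orders = {100: {"customer_id": 1, "lines": [(10, 1), (11, 2)]}}  # (product_id, qty)


def order_view_from_normalized(order_id: int) -> dict:
    """Reassemble the view: several lookups ("joins") plus a transformation."""
    order = orders[order_id]
    customer = customers[order["customer_id"]]
    lines = [
        {
            "title": products[product_id]["title"],
            "qty": qty,
            "subtotal": products[product_id]["price"] * qty,
        }
        for product_id, qty in order["lines"]
    ]
    return {
        "order_id": order_id,
        "customer": customer["name"],
        "lines": lines,
        "total": sum(line["subtotal"] for line in lines),
    }


# Denormalized (read-optimized) store: the view is built once, when the data
# changes, and kept under its key. A dict stands in for a region here.
order_views = {100: order_view_from_normalized(100)}


def order_view_from_region(order_id: int) -> dict:
    """Serving a request is a single key lookup: no joins, no transformation."""
    return order_views[order_id]


print(order_view_from_region(100)["total"])  # 90
```

The cost of assembling the view has not disappeared; it has moved from read time, where it is paid on every request, to write time, where it is paid once per change.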
• Can scale reads and writes independently
• many systems have a profile where reads outnumber writes at a 100:1
ratio
• Read and write sides can be implemented with entirely different tools and
technologies
• Read-side can stay up when write-side is temporarily down
..more benefits
• Commands are semantic, in the language of the business, not
REST CRUD:
AddToCart, AddPaymentMethod, ChangeAddress
• Command handling can be asynchronous
• enqueue commands
• can scale command handling
Command Side
See: Udi Dahan
Clarified CQRS
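A rough sketch of the command side, assuming an in-process queue and a single worker thread. The command names `AddToCart` and `ChangeAddress` come from the slide; their fields, the `handle` function, and the in-memory `carts` and `addresses` stores are hypothetical.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AddToCart:
    cart_id: str
    sku: str
    quantity: int


@dataclass(frozen=True)
class ChangeAddress:
    customer_id: str
    new_address: str


commands = queue.Queue()  # commands are enqueued, then handled asynchronously
carts = {}                # hypothetical write-side state
addresses = {}


def handle(command) -> None:
    """Dispatch on the command's type: the name carries the business intent."""
    if isinstance(command, AddToCart):
        cart = carts.setdefault(command.cart_id, {})
        cart[command.sku] = cart.get(command.sku, 0) + command.quantity
    elif isinstance(command, ChangeAddress):
        addresses[command.customer_id] = command.new_address
    else:
        raise ValueError(f"unknown command: {command!r}")


def worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        command = commands.get()
        try:
            if command is None:
                return
            handle(command)
        finally:
            commands.task_done()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

commands.put(AddToCart("cart-1", "sku-42", 2))
commands.put(AddToCart("cart-1", "sku-42", 1))
commands.put(ChangeAddress("cust-7", "1 Main St, Austin, TX"))
commands.join()      # block until every enqueued command has been handled
commands.put(None)   # stop the worker
worker_thread.join()

print(carts)  # {'cart-1': {'sku-42': 3}}
```

A single worker keeps the sketch free of races on the shared dictionaries. Scaling command handling to several workers would need some extra discipline, for example routing all commands for one cart to the same worker.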
• The LOG
• Append-only, no mutation
• Immutable storage, doesn't destroy history
• Activity just a stream of events
• tables are projections, can be derived entirely from log
• views can be recreated at will
• multiple views
Event Sourcing
See: Martin Kleppmann
Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
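As a minimal sketch of these ideas, the following keeps an append-only list of immutable events and derives two independent views from it. The `Event` shape, the event kinds, and the projection functions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event cannot be mutated once recorded
class Event:
    kind: str      # "Deposited" or "Withdrawn" (hypothetical event kinds)
    account: str
    amount: int


log = []  # the log: only ever appended to, never updated in place


def append(event: Event) -> None:
    log.append(event)


def project_balances(events) -> dict:
    """One view of the log: current balance per account."""
    balances = {}
    for event in events:
        sign = 1 if event.kind == "Deposited" else -1
        balances[event.account] = balances.get(event.account, 0) + sign * event.amount
    return balances


def project_activity_counts(events) -> dict:
    """A second, independent view derived from the same log."""
    counts = {}
    for event in events:
        counts[event.account] = counts.get(event.account, 0) + 1
    return counts


append(Event("Deposited", "a", 100))
append(Event("Withdrawn", "a", 30))
append(Event("Deposited", "b", 50))

print(project_balances(log))         # {'a': 70, 'b': 50}
print(project_activity_counts(log))  # {'a': 2, 'b': 1}
```

Either view can be thrown away and recomputed at will, because the log, not the view, is the source of truth.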
• Data in Motion vs Data at Rest
• Entire History vs Snapshot in time
• Source of truth vs derived information
• materialized views, caches, indexes,
aggregations
The Log / Table Duality
See: Jay Kreps
The Log: What every software engineer should know about
real-time data's unifying abstraction
See: Martin Kleppmann
Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
• ..in test environments to reproduce bugs
• ..in dev environments to test an upcoming release
• ..in production to “undo” a bug
• ..in production for blue-green type deployments
• Can transition to a new schema/representation of data in your regions
because you've come up with a radically different user interface for navigating
that information.
Replaying the Log
See: Greg Young
CQRS and Event Sourcing
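Two of the replay scenarios above can be sketched with one small `replay` function: stopping at an earlier sequence number (to reproduce or "undo" a bug), and feeding the same events into a different projection (a new representation of the data). The event tuples, the `upto` parameter, and both `apply_*` functions are assumptions made for this example.

```python
# Hypothetical event log: (sequence number, kind, payload).
events = [
    (1, "ItemAdded", {"cart": "c1", "sku": "A", "qty": 1}),
    (2, "ItemAdded", {"cart": "c1", "sku": "B", "qty": 2}),
    (3, "ItemRemoved", {"cart": "c1", "sku": "A"}),
    (4, "ItemAdded", {"cart": "c2", "sku": "A", "qty": 5}),
]


def replay(events, apply, upto=None) -> dict:
    """Rebuild a view from scratch, optionally stopping after sequence `upto`."""
    state = {}
    for seq, kind, data in events:
        if upto is not None and seq > upto:
            break
        apply(state, kind, data)
    return state


def apply_cart_contents(state, kind, data) -> None:
    """Existing representation: current contents of each cart."""
    cart = state.setdefault(data["cart"], {})
    if kind == "ItemAdded":
        cart[data["sku"]] = cart.get(data["sku"], 0) + data["qty"]
    elif kind == "ItemRemoved":
        cart.pop(data["sku"], None)


def apply_sku_demand(state, kind, data) -> None:
    """New representation: total quantity ever added, per SKU."""
    if kind == "ItemAdded":
        state[data["sku"]] = state.get(data["sku"], 0) + data["qty"]


print(replay(events, apply_cart_contents))          # {'c1': {'B': 2}, 'c2': {'A': 5}}
print(replay(events, apply_cart_contents, upto=2))  # {'c1': {'A': 1, 'B': 2}}
print(replay(events, apply_sku_demand))             # {'A': 6, 'B': 2}
```

The third call is the schema-transition case: the new view is populated entirely from history, without touching the old one.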
Diagram by

“Exploring CQRS and Event Sourcing”, msdn
✓ update caches when new events come in
✓ invalidate caches proactively - ensure data
in caches remains fresh
✓ inverts the cache loader concept
✓ serving data from fast, in-memory caches
✓ regions contain “ViewModel” objects
[Diagram: Events feed Projection Updates, which maintain Views in Regions. A Geode Distributed System holds Regions containing View Models. The Read Side relays view models to the UI with little to no transformation, using Queries and Events / Continuous Queries.]
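One way to picture "inverting the cache loader concept" is to contrast a cache that pulls on a miss with a view store that is pushed to as events arrive. The sketch below uses plain Python classes; `LoaderCache`, `ViewRegion`, and the event shape are hypothetical stand-ins, not Geode's CacheLoader or region API, and the loader cache deliberately has no expiry or invalidation.

```python
backing_store = {"order-1": {"status": "PLACED"}}


class LoaderCache:
    """Cache-loader style: on a miss, pull the value from the backing store."""

    def __init__(self, load):
        self._load = load
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._load(key)
        return self._data[key]


class ViewRegion:
    """Inverted style: a projection pushes view models in as events arrive."""

    def __init__(self):
        self._data = {}

    def on_event(self, event: dict) -> None:
        view = dict(self._data.get(event["order_id"], {}))
        view["status"] = event["status"]
        self._data[event["order_id"]] = view

    def get(self, key):
        return self._data.get(key)


# Pull model: the cached copy goes stale when the source changes afterwards.
loader_cache = LoaderCache(lambda key: dict(backing_store[key]))
first = loader_cache.get("order-1")["status"]
backing_store["order-1"]["status"] = "SHIPPED"
stale = loader_cache.get("order-1")["status"]

# Push model: every event updates the view, so reads see the latest state.
region = ViewRegion()
region.on_event({"order_id": "order-1", "status": "PLACED"})
region.on_event({"order_id": "order-1", "status": "SHIPPED"})
fresh = region.get("order-1")["status"]

print(first, stale, fresh)  # PLACED PLACED SHIPPED
```

In the push model the read path never reaches back to a source system, which is what lets the read side serve view models straight from memory.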
• Apache Geode as the read store in a CQRS system is a
particularly good fit:
• eager cache invalidation
• scalable and fast reads via..
• regions store denormalized views
• partitioned regions enable linear scale-out
• in-memory data supports low-latency reads
• Curious to learn how Geode is being applied in your work
Summary
• Martin Kleppmann
Stream processing, Event sourcing, Reactive, CEP … and making sense of it all
• Rx, Erik Meijer
Your Mouse is a Database
• Greg Young
CQRS and Event Sourcing
• Jay Kreps
The Log: What every software engineer should know about real-time data's unifying
abstraction
• Udi Dahan
Clarified CQRS
• Dominic Betts, Julian Dominguez, Grigori Melnik, Fernando Simonazzi, Mani
Subramanian
CQRS Journey
• Dannielle Burrow
Four Real World Use Cases For An In-Memory Data Grid
References & Attributions
Join the Apache Geode Community!
• Check out http://geode.incubator.apache.org
• Subscribe: user-subscribe@geode.incubator.apache.org
• Download: http://geode.incubator.apache.org/releases/
Thank you!

#GeodeSummit - Where Does Geode Fit in Modern System Architectures
