Ben Stopford, Engineer, Confluent
What we’ll cover
• Event Driven Microservices
• The toolset: Kafka, KStreams, Connect
• 10 Principles for Streaming Services
Microservices in the Kafka Ecosystem
Microservices
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service]
So why Services?
The Monolith
Can we do reuse, encapsulation?
The Monolith
What happens when we grow?
Companies are inevitably
a collection of applications
They must work together to some degree
Inverse Conway Maneuver
The 'Inverse Conway Maneuver' recommends evolving your
team and organizational structure to promote your desired
architecture.
Org Structure
Software Architecture
Eschew shared, mutable
state
Service based approaches
separate state
But state must inevitably be
shared between services
Use a toolkit that embraces
decentralisation
Some simple patterns of
distributed systems
Request / Response
When do we need
Request Response?
Looking things up
What is in my shopping basket?
Event Driven
Async / Fire and Forget / Brokered
Businesses are often modeled
as a sequence of events.
[Diagram: sequence of events: Add to cart, Buy, Make Payment, Update Stock, Email Conf, Package Item, Dispatch, Fraud]
When do we need
Event Driven?
SOA / Microservices
Message Broker
Event Based Request/Response
Hybrids
Event-Based Request/Response
Request Response
[Diagram: “Mmm… New iPad”, Buy Items. UI Service, Order Service, Stock Service, Fulfilment Service, Fraud Detection, connected by REST etc]
Event Driven
[Diagram: “Mmm… New iPad”, Buy Items. UI Service, Order Service, Stock Service, Fulfilment Service, Fraud Detection, connected by a Message Broker]
Hybrid
[Diagram: “Mmm… New iPad”, Buy Items. UI Service, Order Service, Stock Service, Fulfilment Service, Fraud Detection, connected by a Message Broker, with REST for the UI Service]
As software engineers we are
inevitably affected by the tools we
surround ourselves with.
Languages, frameworks, even
processes all act to shape the
software we build.
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service]
The tools we choose have a big effect on
our architecture
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service, labelled Event Driven and Request Response]
Kafka is well suited to Event
Driven Architectures
It will lead you down that path
The Tool Set
a Distributed Log
What is a Distributed Log?
Shard on the way in
[Diagram: Producing Services, Kafka, Consuming Services]
Each shard is a queue
Consumers share load
Reduces to a globally ordered
queue
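The sharding idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `partition_for`, `produce` and `assign` are hypothetical names, the CRC32 hash is a stand-in for whatever partitioner the real client uses, and the round-robin assignment is a simplification of consumer group balancing.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical shard count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a shard deterministically from the key (stand-in hash, not Kafka's own)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(log, key, value):
    """Append the record to the end of the shard chosen by its key."""
    log[partition_for(key)].append((key, value))


def assign(num_partitions, consumers):
    """Spread shards across consumers round-robin; each shard has exactly one owner."""
    assignment = defaultdict(list)
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return dict(assignment)


log = defaultdict(list)
for i, key in enumerate(["order-1", "order-2", "order-1", "order-3", "order-1"]):
    produce(log, key, f"event-{i}")
```

Records with the same key always land in the same shard, so their relative order is kept. With a single consumer, `assign` hands it every shard, which is the degenerate case the slide describes as reducing to one queue.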
Load Balanced Services
Fault Tolerant Services
Build ‘Always On’ Services
Rely on Fault
Tolerant Broker
Services can “Rewind &
Replay” the log
Rewind & Replay
Compacted Log
(retains only latest version)
[Diagram: three version histories: Versions 3 to 1, Versions 2 to 1, Versions 5 to 1]
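A minimal sketch of what "retains only latest version" means, assuming the log is a list of (key, value) pairs. The `compact` function is a hypothetical helper; in a real broker compaction runs in the background and older versions may remain visible for a while.

```python
def compact(log):
    """Keep only the last record for each key, preserving the order of the survivors."""
    last_offset = {key: offset for offset, (key, _) in enumerate(log)}
    return [
        (key, value)
        for offset, (key, value) in enumerate(log)
        if last_offset[key] == offset
    ]


log = [
    ("A", "v1"), ("B", "v1"), ("A", "v2"),
    ("C", "v1"), ("A", "v3"), ("B", "v2"),
]
compacted = compact(log)
```

Reading `compacted` from the start gives a table view: one current value per key.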
A database engine for
data-in-flight
Max(price)
From orders
where ccy=‘GBP’
over 1 day window
emitting every second
Continuously Running Queries
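The query on this slide could be approximated as below. This is a sketch under assumptions, not how a stream processor is implemented: `Order` and `max_gbp_price` are hypothetical names, the one-day window is treated as a fixed (tumbling) window, and a result is emitted on every matching event rather than every second.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

DAY_SECONDS = 24 * 60 * 60


@dataclass(frozen=True)
class Order:
    timestamp: int  # event time, in seconds
    ccy: str
    price: float


def max_gbp_price(orders: Iterable[Order]) -> Iterator[Tuple[int, float]]:
    """Yield (window_start, max_so_far) each time a GBP order arrives."""
    maxima = {}  # window start -> running maximum for that window
    for order in orders:
        if order.ccy != "GBP":
            continue  # where ccy = 'GBP'
        window_start = order.timestamp - order.timestamp % DAY_SECONDS
        maxima[window_start] = max(maxima.get(window_start, order.price), order.price)
        yield window_start, maxima[window_start]


orders = [
    Order(10, "GBP", 5.0),
    Order(20, "USD", 99.0),
    Order(30, "GBP", 7.5),
    Order(40, "GBP", 6.0),
    Order(DAY_SECONDS + 5, "GBP", 1.0),
]
results = list(max_gbp_price(orders))
```

Because the input is a generator-friendly iterable, the same function works over an unbounded source: it never needs to see the end of the data to produce an answer.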
What is a stream processing
engine?
[Diagram: Database (Data, Index, Query Engine; finite source) vs Stream Processor (Query Engine; infinite source)]
Windowing
For unordered or unpredictable streams
Sliding
Fixed (tumbling)
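To make the two window types concrete, here is a small sketch of how a timestamp might be mapped to windows. It assumes "sliding" means overlapping windows that advance by a fixed step; `tumbling_window` and `sliding_windows` are hypothetical helpers, and windows that would start before time zero are dropped.

```python
def tumbling_window(timestamp, size):
    """Return the single fixed window [start, end) that contains the timestamp."""
    start = timestamp - timestamp % size
    return (start, start + size)


def sliding_windows(timestamp, size, advance):
    """Return every window [start, start + size) containing the timestamp,
    where window starts are multiples of `advance`."""
    windows = []
    start = timestamp - timestamp % advance
    while start > timestamp - size and start >= 0:
        windows.append((start, start + size))
        start -= advance
    return sorted(windows)
```

An event belongs to exactly one tumbling window but may belong to several overlapping ones. When `advance` equals `size` the two functions agree.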
Features: similar to
database query engine
Join, Filter, Aggregate, View, Window
Stateful Stream Processing
[Diagram: a stream joined with a compacted stream. Stream Data, Stream-Tabular Data, Windowed Stream, Locally Cached Table (disk resident). Kafka, Kafka Streams]
Useful for Enrichment
[Diagram: Orders stream joined with Customers compacted stream. Kafka, Kafka Streams]
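The enrichment join can be sketched without any Kafka dependency. This is an illustration only: `join_orders_with_customers` and the event tuples are hypothetical, the two topics are modelled as one interleaved sequence, and the local dictionary stands in for the locally cached table.

```python
def join_orders_with_customers(events):
    """Enrich each order with the latest known customer record (inner join)."""
    customers = {}  # local table: latest value per customer id
    for topic, key, value in events:
        if topic == "customers":
            customers[key] = value  # newer versions overwrite older ones
        elif topic == "orders":
            customer = customers.get(value["customer_id"])
            if customer is not None:  # skip orders for unknown customers
                yield {"order_id": key, **value, "customer_name": customer["name"]}


events = [
    ("customers", "c1", {"name": "Ada"}),
    ("orders", "o1", {"customer_id": "c1", "item": "iPad"}),
    ("customers", "c1", {"name": "Ada L."}),
    ("orders", "o2", {"customer_id": "c1", "item": "case"}),
    ("orders", "o3", {"customer_id": "c9", "item": "pen"}),
]
enriched = list(join_orders_with_customers(events))
```

Each order is joined against the customer version that was current when the order arrived, so `o1` and `o2` see different names.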
Scales Out
Embeddable
[Diagram: Kafka with three Orders Service instances]
Kafka Connect
View Replication
Sometimes you need to
physically move data
Replicate it, so both copies
are identical
Iterate via regeneration
Kafka Connect
[Diagram: Kafka Connect, Kafka, Kafka Connect]
So…
Service Backbone
Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Stateful
Embeddable tool for data
manipulation
Adapt data streams
(transformation, stream maintenance,
analytic function)
Replicate Data Sources Exactly
Create Regenerable, Streaming Views
How do we actually do this?
10 (opinionated) principles for Streaming Services
1. Don’t use Kafka for
shopping carts!
(OK, you can, but use sparingly)
[Diagram: Shopping Cart, UI Service]
Broker/durability/broadcast add
little to request response
Do use Kafka for your core
business processing.
Think “business events”. An order was created, a payment
was received, a trade was booked etc. Pub/Sub.
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service]
2. Pick Topics with
Business Significance
Orders, Payments, Returns, Invoices
Give your messages meaningful
IDs and version them
OrdersService1-Order-1234-v2
• Include the service name (OrdersService1)
• Should relate to the real world (Order-1234)
• Should be versioned, if mutable (v2)
Note the key used for sharding in Kafka may not be this key
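One possible way to build and parse IDs of this shape is sketched below. The `MessageId` type, `parse_message_id` and `partition_key` are hypothetical, and the sketch assumes no component contains a hyphen. The last helper illustrates the note above: the sharding key can differ from the message ID, here dropping the version so that all versions of an order stay together.

```python
from typing import NamedTuple


class MessageId(NamedTuple):
    service: str
    entity: str
    entity_id: str
    version: int

    def __str__(self) -> str:
        return f"{self.service}-{self.entity}-{self.entity_id}-v{self.version}"


def parse_message_id(text: str) -> MessageId:
    """Split 'Service-Entity-Id-vN' into its parts (assumes no hyphens inside parts)."""
    service, entity, entity_id, version = text.split("-")
    if not version.startswith("v"):
        raise ValueError(f"missing version suffix in {text!r}")
    return MessageId(service, entity, entity_id, int(version[1:]))


def partition_key(message_id: MessageId) -> str:
    """Hypothetical sharding key: the entity without service name or version."""
    return f"{message_id.entity}-{message_id.entity_id}"


parsed = parse_message_id("OrdersService1-Order-1234-v2")
```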
3. Decouple publishers from
subscribers
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service]
Add Request/Response
only where needed
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service, with a REST link added]
4. Use the log to
regenerate state
Avoid journaling incoming events
Event Source side effects
• Use offsets to tie these back to the stream
• Store in:
  • Kafka
  • KStreams state store
  • Other DB
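A rough sketch of tying derived state back to the stream with offsets. It assumes a stock-level service whose state is a running total per item; `apply_events` is a hypothetical helper and the list stands in for a Kafka topic.

```python
def apply_events(log, state=None, last_offset=-1):
    """Fold every event after last_offset into state; return (state, new_last_offset)."""
    state = dict(state or {})
    for offset in range(last_offset + 1, len(log)):
        item, delta = log[offset]
        state[item] = state.get(item, 0) + delta
        last_offset = offset
    return state, last_offset


log = [("ipad", 10), ("case", 5), ("ipad", -1)]

# Build state once and remember how far into the log it reflects.
snapshot, offset = apply_events(log)

log.append(("ipad", -2))

# Either resume from the stored offset...
resumed, resumed_offset = apply_events(log, snapshot, offset)

# ...or throw the state away and regenerate it from the log.
rebuilt, rebuilt_offset = apply_events(log)
```

Because the stored offset says exactly which events the state includes, the service does not need its own journal of incoming events: the log already is one.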
5. Apply the Single Writer Principle
• Change at source (by calling that service)
• Let the change propagate back
• Keep local copies read only.
[Diagram: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service]
(1) Change Orders at source
(2) Let the change propagate through
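The two steps can be sketched as follows, assuming an in-memory list in place of the orders topic. `OrdersService` and `ReadOnlyOrdersView` are hypothetical classes: only the first appends to the log, and any other service keeps a local copy that it updates solely by reading the log.

```python
class OrdersService:
    """The single writer: every change to an order goes through this service."""

    def __init__(self, log):
        self._log = log
        self._versions = {}

    def update_order(self, order_id, status):
        version = self._versions.get(order_id, 0) + 1
        self._versions[order_id] = version
        self._log.append((order_id, {"status": status, "version": version}))


class ReadOnlyOrdersView:
    """A local copy held by another service; it has no write methods."""

    def __init__(self, log):
        self._log = log
        self._offset = 0
        self._orders = {}

    def poll(self):
        """Apply any changes that have propagated through the log."""
        while self._offset < len(self._log):
            order_id, order = self._log[self._offset]
            self._orders[order_id] = order
            self._offset += 1

    def get(self, order_id):
        return self._orders.get(order_id)


log = []
orders = OrdersService(log)
view = ReadOnlyOrdersView(log)

orders.update_order("1234", "CREATED")  # (1) change at source
before = view.get("1234")               # not yet propagated
orders.update_order("1234", "PAID")
view.poll()                             # (2) let the change propagate through
after = view.get("1234")
```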
6. Leverage keeping
datasets inside the broker
[Diagram: datasets held in the broker: customer, country, supplier, ex-rates]
Leverage keeping only the
latest version (table view)
[Diagram: three version histories: Versions 3 to 1, Versions 2 to 1, Versions 5 to 1]
Join & Process on the fly
[Diagram: Orders stream joined with Customers compacted stream. Kafka, Kafka Streams]
7. Prefer stream processing over
maintaining historic views
[Diagram: UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, Stock Service, with copies of Orders. Historic “copies” diverge]
Join & Process on the fly
[Diagram: Orders stream joined with Customers compacted stream. Kafka, Kafka Streams]
8. Sometimes you need
historic views.
=> Replicate & Keep Read Only
[Diagram: Replicate. Kafka Connect, Kafka, Kafka Connect. Keep me Read Only. Iterate]
Polyglot Persistence
9. Use Schemas
(especially if data is retained)
Schemaless data
doesn’t age well
Messages
Confluent Schema
Registry can help
[Diagram: Orders Service, Schema Registry, Email Service, Avro]
10. Consider “Stream
Management” Services
Stream
Management
Kafka
• Retaining data => Admin tasks
• Similar to the role of a DBA
• Data Migration
• Repartitioning
• Latest/versioned
• Environment Management
• CQRS
KStreams is a good
toolset for this
[Diagram: Stream Management. KStreams, Latest Stream, Versioned Stream]
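Repartitioning, one of the admin tasks listed above, might look like this in outline. It is a sketch under assumptions rather than a KStreams program: `partition_for` and `repartition` are hypothetical helpers, partitions are plain lists, and the CRC32 hash is a stand-in for the real partitioner.

```python
import zlib


def partition_for(key, num_partitions):
    """Deterministic stand-in for a partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def repartition(partitions, key_fn, num_partitions):
    """Re-key every record and write it into a new set of partitions."""
    out = [[] for _ in range(num_partitions)]
    for partition in partitions:
        for _, value in partition:
            new_key = key_fn(value)
            out[partition_for(new_key, num_partitions)].append((new_key, value))
    return out


# Orders keyed by order id, spread over two partitions.
orders_by_id = [
    [("o1", {"customer": "c1"}), ("o3", {"customer": "c2"})],
    [("o2", {"customer": "c1"})],
]

# The same orders keyed by customer, spread over four partitions.
orders_by_customer = repartition(orders_by_id, lambda v: v["customer"], 4)
```

After re-keying, all orders for one customer sit in the same partition. Ordering between records that came from different source partitions is not preserved, which is worth keeping in mind when a downstream service depends on it.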
So…
1. Don’t use Kafka for shopping carts!
2. Pick Topics with Business Significance
3. Decouple publishers from subscribers
4. Use the log to regenerate state
5. Apply the Single Writer Principle
6. Leverage keeping datasets inside the broker
7. Prefer stream processing over maintaining historic views
8. Sometimes you need historic views. => Replicate Read Only
9. Use Schemas
10. Consider “Stream Management” Services
Microservices push us away
from shared, mutable state
But state needs to be
communicated
In an increasingly data-heavy world
we need tools to do this efficiently
…and in real time.
We need a data-centric
toolset to do this
[Diagram: All Your Data as Immutable Streams (tables, streams). Event Driven Services, Stream Management Services, Streaming Services, Legacy CDC, View Replication, Polyglot Persistence]
Keep it simple,
Keep it moving
@benstopford
Thanks!

Microservices in the Apache Kafka Ecosystem

Editor's Notes

  • #4 Today we’re going to talk about two fields which sit somewhat apart: Stream Processing and Business Systems. Services encourage us to share responsibilities. To rely on others. To collaborate and evolve a business’s view of the world over time. A world that is inherently decentralised. Inherently spread across many interconnected systems. Spreading many disparate subsets of our business’s state. Yet our approach rarely reflects this. We tend to think in centralised ways, we accumulate data. We protect it. We control it. We hoard it. But it does not need to be this way.
  • #15 My name is Ben Stopford. I work as an engineer on Apache Kafka. In a previous life I did the most centralised thing you could possibly do. I built a single, central database for a large financial institution. This talk is about exploring the exact opposite approach to this problem. It’s about thinking of the world in terms of the evolution and provisioning of state across many systems. It’s about embracing decentralisation. It’s about thinking in streams.