Implementing Domain Events with Kafka
Devtalks
How DDD and Kafka help in integrating microservices
Bucharest, 12th of June 2020
Agenda
Challenges in integrating microservices
What is a Domain Event?
Implementation patterns for Domain Events
Kafka Delivery Guarantees & Configurations
Microservices benefits
• Agility in development
• Better understanding of the architecture
• Resilience in production
• Independent scalability of microservices
• Independent deployment of microservices
• Fault tolerance (failures in a microservice can be isolated from other microservices)
How can these benefits be achieved?
✓ Ensuring microservice autonomy
✓ Defining explicit contracts
Synchronous Integration between Microservices
▪ Temporal Coupling
▪ Behavioral Coupling*
➔ The services are not autonomous
➔ Compensation logic is complex and unreliable
➔ Circuit breakers, timeouts, retries etc. are of limited use
*Reference: http://iansrobinson.com/2009/04/27/temporal-and-behavioural-coupling/
4
Introduction to DDD
▪ Strategic Design:
✓ bounded contexts
✓ context map
▪ Tactical Design:
✓ entities
✓ value objects
✓ aggregates
✓ aggregate roots
✓ repositories
✓ domain services
✓ domain events
✓ application services
References: Evans, Eric. Domain-Driven Design: Tackling Complexity in the Heart of Software. Boston: Addison-Wesley, 2003; Vernon, Vaughn. Implementing Domain-Driven Design. Boston: Addison-Wesley, 2013.
Microservices Design Rules
1 microservice = 1 bounded context = 1 deployment unit
1 operation = 1 transaction = 1 aggregate (modify only one aggregate instance per transaction)
Prefer asynchronous communication between microservices over synchronous
request/reply integration ➔ integration based on domain events
Domain Events
• An event captures a change in the state of a single aggregate
• Events are immutable
• They are published by aggregates and contain the id of the associated aggregate root
• Each event should have a unique id that can be used for de-duplication (see the sketch below)
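As a minimal illustration (the OrderShippedEvent name and its fields are assumptions, not from the talk), a domain event in Java can be modeled as an immutable value object:

import java.time.Instant;
import java.util.UUID;

// Immutable event capturing one state change of a single Order aggregate.
public final class OrderShippedEvent {
    private final UUID eventId;          // unique id, usable for de-duplication
    private final UUID orderId;          // id of the publishing aggregate root
    private final Instant occurredAt;    // when the state change happened
    private final String trackingNumber; // the captured state change

    public OrderShippedEvent(UUID eventId, UUID orderId, Instant occurredAt, String trackingNumber) {
        this.eventId = eventId;
        this.orderId = orderId;
        this.occurredAt = occurredAt;
        this.trackingNumber = trackingNumber;
    }

    public UUID getEventId() { return eventId; }
    public UUID getOrderId() { return orderId; }
    public Instant getOccurredAt() { return occurredAt; }
    public String getTrackingNumber() { return trackingNumber; }
    // no setters: events are immutable
}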
Integration Rules between Bounded Contexts
▪ Via a publish/subscribe messaging infrastructure
▪ The bounded contexts interested in the data maintained by other bounded
contexts subscribe to the relevant events
▪ Integration via events also applies between different aggregate instances within the same bounded context
➔ Transactional Consistency within aggregate boundaries
➔ Eventual Consistency between different bounded contexts and between different
aggregate instances
Microservices Architecture based on DDD
[Diagram: a UI/API gateway calls four microservices over REST/SOAP; the microservices exchange domain events through an Event Bus, and each microservice owns its own data store.]
Decomposing the application into loosely coupled services:
- Explicit contract/interface
- Boundary alignment with business capabilities
- Asynchronous communication between microservices
- Microservices have their own storage (they are the authoritative source of data for their domain)
Main use cases for domain events
• Notifications other bounded contexts need to react to
• Data replication over events between bounded contexts
• CQRS (within the same bounded context)
Messaging requirements in view of DDD
▪ Support for the publish/subscribe pattern
▪ Support for competing consumers pattern
▪ Persistence and Durability
▪ At-least-once message delivery
▪ Partial message ordering (per aggregate)
Kafka Logical Architecture
▪ Replication + fault tolerance + partitioning + elastic scaling
▪ Single-leader system (all producers/consumers publish/fetch data to/from partition leaders)
▪ CP in terms of the CAP theorem, for both write and read operations
▪ Confluent Platform 5.4 introduces Follower Fetching, Observers, and Replica Placement ➔ fetching data from asynchronous replicas ➔ AP for read operations
[Diagram: four brokers spread across two availability zones. Each broker leads one topic partition and follows others (B1 leads partition 1 and follows 3 and 4; B2 leads 2 and follows 1 and 4; B3 leads 3 and follows 1 and 2; B4 leads 4 and follows 2 and 3); followers fetch from the leaders. Initial ISRs: partition 1 {B1, B2, B3}, partition 2 {B2, B3, B4}, partition 3 {B1, B3, B4}, partition 4 {B1, B2, B4}.]
Kafka Broker Configurations
min.insync.replicas: should be set to a value greater than 1
replica.lag.time.max.ms: how long a follower may lag before being dropped from the ISR
unclean.leader.election.enable: should be false
broker.rack: should reflect the data centers, regions or availability zones where brokers are placed
replica.selector.class (since Confluent 5.4): used by the broker to find the preferred read replica; by default it returns partition leaders
* Reference: https://docs.confluent.io/current/installation/configuration/broker-configs.html
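As a hedged sketch, these settings might appear in a broker's server.properties like this (all values are illustrative assumptions):

# a write needs at least 2 in-sync replicas before it is acknowledged with acks=all
min.insync.replicas=2
# never elect an out-of-sync replica as leader (avoids losing acknowledged writes)
unclean.leader.election.enable=false
# how long a follower may lag before being dropped from the ISR
replica.lag.time.max.ms=30000
# the availability zone this broker runs in
broker.rack=us-east-1a
# since Confluent Platform 5.4 / Apache Kafka 2.4: rack-aware follower fetching
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector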
Kafka Offsets
[Diagram: a partition log ordered by offset, from the begin offset (0, 1, 2, …) to the log end, with markers lc, cp, up and hw.]
▪ Begin offset: the first offset available in the log
▪ Last committed offset of a consumer (lc)
▪ Consumer's current position (cp); the records between lc and cp form the consumer's current batch of messages
▪ High watermark (hw): the last offset replicated to all in-sync replicas
▪ Under-replicated partition (up): how far a follower on a non-ISR broker has replicated
▪ Messages published with acks=0 or acks=1 and not yet fully replicated sit between the high watermark and the log end
How to choose the number of topics/partitions
▪ Number of partitions ≈ producer throughput / per-partition consumer throughput
▪ Limit the number of partitions per broker to 100 x b x r,
where:
b is the number of brokers in the Kafka cluster
r is the replication factor
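As an illustrative example (numbers are assumptions): if producers write 100 MB/s into a topic and a single consumer can process 20 MB/s, at least 100 / 20 = 5 partitions are needed; and with b = 3 brokers and replication factor r = 3, the rule of thumb above caps the partition count at 100 x 3 x 3 = 900 per broker.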
Why?
“Kafka broker uses only a single thread to replicate data from another broker, for all partitions that share replicas between the two brokers. (…) replicating 1000 partitions from one broker to another can add about 20 ms latency, which implies that the end-to-end latency is at least 20 ms.”*
* Reference: https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
How Kafka addresses DDD messaging requirements
Support for the publish/subscribe pattern ➔ topics
Support for competing consumers pattern ➔ consumer groups
Persistence and Durability ➔ replication factors, ISR, retention policies
At-least-once message delivery ➔ acks (on the producer side), default behavior of consumers (when auto-commit is enabled)
Partial message ordering (per aggregate) ➔ partitions
Application Requirements in View of DDD
▪ Domain events are self-contained messages (events capture the state changes
in domain entities and provide decision support for subscribers)
▪ Domain events are part of a microservice's public API/contract
▪ One topic per aggregate type
▪ Partitioning by aggregate/entity ID
▪ Transactional semantics when publishing events
▪ Event de-duplication on the subscriber side to achieve exactly-once delivery semantics (a sketch follows below)
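A minimal sketch of subscriber-side de-duplication, assuming each event carries a unique event id (class and method names are hypothetical; in production the processed ids would live in the subscriber's database, in the same transaction as the state change):

import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Idempotent handler: each event id is processed at most once,
// turning at-least-once delivery into effectively-once processing.
public final class DeduplicatingHandler {
    private final Set<UUID> processedEventIds = ConcurrentHashMap.newKeySet();

    public void handle(UUID eventId, Runnable businessLogic) {
        if (!processedEventIds.add(eventId)) {
            return; // duplicate delivery - skip
        }
        businessLogic.run();
    }
}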
Aggregate Processing
[Sequence diagram, within one bounded context: a client invokes a remote operation on an application service; the application service finds the aggregate through the repository and calls an aggregate root method, which calls an entity method and applies the state changes; the application service then saves the aggregate and publishes an event via a KafkaProducer carrying the event type, aggregate root id, event id, and the aggregate root/entity state changes.]
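A hedged Java sketch of this flow (the Order domain types and the topic name are assumptions added for illustration):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical domain types, stubbed so the sketch is self-contained.
interface OrderRepository { Order find(String id); void save(Order order); }
class Order {
    private String trackingNumber;
    String ship(String trackingNumber) {          // aggregate root method
        this.trackingNumber = trackingNumber;     // state change
        return "OrderShipped:" + trackingNumber;  // simplified event payload
    }
}

public class OrderApplicationService {
    private final OrderRepository repository;
    private final KafkaProducer<String, String> producer;

    public OrderApplicationService(OrderRepository repository, KafkaProducer<String, String> producer) {
        this.repository = repository;
        this.producer = producer;
    }

    // Remote operation: one transaction, one aggregate instance.
    public void ship(String orderId, String trackingNumber) {
        Order order = repository.find(orderId);                        // find aggregate
        String event = order.ship(trackingNumber);                     // apply state change
        repository.save(order);                                        // save aggregate
        producer.send(new ProducerRecord<>("orders", orderId, event)); // aggregate root id as key
    }
}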
Transactional Publishing Patterns
[Sequence diagram, within one bounded context: in a local transaction, the application service finds the aggregate through the repository, invokes an aggregate root method that changes its state, and saves both the aggregate and its events into an events table in the same database (the transactional outbox). A separate polling publisher loops over the events table, sends each event through a KafkaProducer, and updates its offset in the table.]
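A hedged sketch of the polling publisher (the events table schema, topic name and DataSource wiring are assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Reads unpublished events from the outbox table and publishes them to Kafka.
public class PollingPublisher {
    private final DataSource dataSource;
    private final KafkaProducer<String, String> producer;

    public PollingPublisher(DataSource dataSource, KafkaProducer<String, String> producer) {
        this.dataSource = dataSource;
        this.producer = producer;
    }

    public void pollOnce() throws Exception {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            try (PreparedStatement select = con.prepareStatement(
                     "SELECT id, aggregate_id, payload FROM events WHERE published = FALSE ORDER BY id");
                 ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    // aggregate id as the message key preserves per-aggregate ordering
                    producer.send(new ProducerRecord<>("orders",
                            rs.getString("aggregate_id"), rs.getString("payload"))).get(); // block until acked
                    try (PreparedStatement update = con.prepareStatement(
                             "UPDATE events SET published = TRUE WHERE id = ?")) {
                        update.setLong(1, rs.getLong("id"));
                        update.executeUpdate(); // the "update offset" step
                    }
                }
            }
            con.commit();
        }
    }
}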
Kafka Producer
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");
props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081"); // required by the Avro serializers
….
Producer<String, String> producer = new KafkaProducer<>(props);
….
Future<RecordMetadata> metadata = producer.send(new ProducerRecord<>("my-topic", key, value));
….
producer.close(); // flushes buffered records and releases resources
Kafka Producer Configurations
acks = all (the leader waits for the full ISR to acknowledge each write)
retries = 2147483647 (Integer.MAX_VALUE; rely on delivery.timeout.ms to bound retrying)
delivery.timeout.ms (upper bound on the total time to report success or failure of a send)
enable.idempotence = true (retries cannot introduce duplicates in the log)
max.in.flight.requests.per.connection = 1 (preserves ordering when retries occur)
* Reference: https://docs.confluent.io/current/installation/configuration/producer-configs.html
Kafka Consumer
try {
  while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Long.MAX_VALUE);
    for (ConsumerRecord<String, String> record : records) {
      System.out.println(record.offset() + ": " + record.value());
    }
    try {
      consumer.commitSync(); // saving new offsets if enable.auto.commit = false
    } catch (CommitFailedException e) {
      // application-specific failure handling
    }
  }
} catch (WakeupException e) {
  // ignore for shutdown
} finally {
  consumer.close();
}
Notes:
▪ Each poll returns up to max.poll.records records.
▪ Size max.poll.interval.ms > max.poll.records x record processing time.
▪ If enable.auto.commit = true, offsets are committed when the interval between the current poll and the last commit >= auto.commit.interval.ms.
Kafka Consumer Configurations
▪ group.id
▪ heartbeat.interval.ms & session.timeout.ms
▪ enable.auto.commit & auto.commit.interval.ms
▪ isolation.level
▪ max.poll.interval.ms
▪ max.poll.records
▪ client.rack (since Confluent 5.4; if the partition is rack-aware and the replica selector is set, the broker picks a “preferred read replica”)
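A hedged sketch of a consumer configured with these settings (all values are illustrative assumptions):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("group.id", "order-projection");      // competing consumers share one group
props.put("enable.auto.commit", "false");       // commit manually, after processing
props.put("isolation.level", "read_committed"); // skip records from aborted transactions
props.put("max.poll.records", "500");
props.put("max.poll.interval.ms", "300000");
props.put("session.timeout.ms", "10000");
props.put("heartbeat.interval.ms", "3000");
props.put("client.rack", "us-east-1a");         // lets the broker pick a preferred read replica
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);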
Exactly-once Stateful Processing
[Sequence diagram, within one bounded context: on startup the Kafka consumer selects its saved offsets from an offsets table; it then loops, polling the broker(s) for events and invoking the aggregate root method to apply each state change; the repository saves the aggregate to the aggregate state table and the offset manager updates the offsets table, both within a single local transaction against the same database.]
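A hedged sketch of the core step (table and column names are assumptions): the aggregate state and the consumer offset are updated in one local transaction, so a crash and subsequent reprocessing resume from offsets that always match the stored state.

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public class ExactlyOnceProcessor {
    private final DataSource dataSource;

    public ExactlyOnceProcessor(DataSource dataSource) { this.dataSource = dataSource; }

    public void process(ConsumerRecord<String, String> record) throws Exception {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false); // state change + offset update in ONE transaction
            try (PreparedStatement state = con.prepareStatement(
                     "UPDATE aggregate_state SET payload = ? WHERE aggregate_id = ?")) {
                state.setString(1, record.value());
                state.setString(2, record.key());
                state.executeUpdate();
            }
            try (PreparedStatement offset = con.prepareStatement(
                     "UPDATE offsets SET last_offset = ? WHERE topic = ? AND part = ?")) {
                offset.setLong(1, record.offset());
                offset.setString(2, record.topic());
                offset.setInt(3, record.partition());
                offset.executeUpdate();
            }
            con.commit(); // either both updates land, or neither
        }
    }
}

On startup, the consumer would read these stored offsets and seek() each assigned partition to them instead of relying on Kafka's committed offsets.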
Avro Consumers & Producers
[Diagram: the producer queries the Schema Registry for the schema id by schema content, then sends the Avro content plus the schema id to Kafka; the consumer reads the Avro content plus the schema id, fetches the writer's schema by id from the Schema Registry, and applies schema evolution.]
Avro
• Kafka records have a key and a value, and both can have a schema.
• There is a compatibility level (BACKWARD, FORWARD, FULL, NONE) setting for the Schema Registry as a whole and per individual subject. Versions are also managed per subject.
• Backward compatibility = consumers coded with a newer schema can read messages written with an older schema (see the example below).
• Forward compatibility = consumers coded with an older schema can read messages written with a newer schema.
• Full compatibility = a new schema version is both backward and forward compatible.
• None = no compatibility checks are performed.
• Avro schema evolution is an automatic transformation of messages from the schema used by producers to write into the Kafka log to the schema used by consumers to read them. The transformation occurs at consumption time, in the Avro deserializer.
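For illustration, a backward-compatible change on a hypothetical OrderShipped schema: the new (reader) schema adds a field with a default value, so consumers coded with it can still read messages written with the old schema:

{
  "type": "record",
  "name": "OrderShipped",
  "fields": [
    { "name": "orderId", "type": "string" },
    { "name": "trackingNumber", "type": "string" },
    { "name": "carrier", "type": "string", "default": "UNKNOWN",
      "doc": "new field; the default makes this change backward compatible" }
  ]
}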
The change type determines the order of releasing into production:
▪ Backward compatible: Consumers first, then Producers
▪ Forward compatible: Producers first, then Consumers (after they finish reading old messages)
▪ Fully compatible: order doesn't matter
▪ None: coordinated release
Subject Name Strategy
• A DDD aggregate can publish multiple event types, each capturing a distinct business intent
• To maintain correct event order, all of an aggregate's events are published to the same topic (using the aggregate id as the message key)
• Multiple event types require multiple schemas (one for each event type that can be published to the same topic)
• Producer-side configs:
  • schema.registry.url
  • key.subject.name.strategy (defines how to construct the subject name for message keys) = TopicNameStrategy, RecordNameStrategy or TopicRecordNameStrategy
  • value.subject.name.strategy (how to construct the subject name for message values) = TopicNameStrategy, RecordNameStrategy or TopicRecordNameStrategy
• Consumer-side configs:
  • specific.avro.reader = true (otherwise you get an Avro GenericRecord)
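A hedged sketch of the producer-side settings for publishing several event types to one aggregate topic (endpoint values are assumptions; RecordNameStrategy derives the subject from the record's fully-qualified name instead of the topic name):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("schema.registry.url", "http://localhost:8081");
props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
// one subject per event type, independent of the topic name:
props.put("value.subject.name.strategy",
        "io.confluent.kafka.serializers.subject.RecordNameStrategy");

// consumer side: deserialize into generated classes rather than GenericRecord
// props.put("specific.avro.reader", "true");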
Conclusions
Domain-Driven Design not only helps in decomposing a system into
microservices aligned with business capabilities, but also offers essential
lessons on:
✓ how to achieve resilience
✓ how to achieve scalability
by decoupling microservices using domain events
Thank you!
