How to choose the right messaging
service for your workload
O(n)
NP
P
NP-Hard
NP-Complete
Yan Cui
http://theburningmonk.com
@theburningmonk
AWS user since 2010
Developer Advocate @
Independent Consultant
advise
training delivery
AWS services are the new data structures.
Understanding their trade-offs is the new big O notation.
Event-Driven Architecture
event stuff happens
?
EventBridge
publishers
consumers
services
event
FIFO
Archiving
Replays
Schema Discovery
Scalability
Cost
Performance
Debugging
Error Handling
Kinesis
EventBridge
SNS
SQS
DynamoDB Streams
IoT Core
…
picking the right AWS service is a valuable skill
Scaling
What are the scaling constraints for the messaging service?
Google “{service name} quotas”
How does Lambda’s concurrency scale with throughput?
[Chart: Lambda concurrency vs. Msgs/s for SNS and EventBridge, SQS, and Kinesis]
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
https://amzn.to/2RudmGV
more concurrency is not always better…
if you want…
maximum throughput: SNS, EventBridge
precise control over throughput: Kinesis Provisioned
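One way to see why a provisioned Kinesis stream gives precise control: for a Kinesis event source, Lambda's concurrency is bounded by the shard count multiplied by the parallelization factor (1 to 10) on the event source mapping. The slides do not spell this out; the sketch below encodes that rule, and the function name is hypothetical.

```python
def kinesis_lambda_concurrency(shards: int, parallelization_factor: int = 1) -> int:
    """Upper bound on concurrent Lambda invocations for one Kinesis event source mapping."""
    if shards < 1:
        raise ValueError("a stream needs at least one shard")
    # Lambda allows between 1 and 10 concurrent batches per shard.
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization factor must be between 1 and 10")
    return shards * parallelization_factor


# 5 shards, default factor: at most 5 concurrent invocations.
print(kinesis_lambda_concurrency(5))
# 5 shards, factor of 10: at most 50 concurrent invocations.
print(kinesis_lambda_concurrency(5, 10))
```

Because both inputs are settings you choose, the ceiling on concurrency is known up front, unlike SNS or EventBridge where concurrency follows the incoming message rate.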
Downstream System
SNS
Lambda
use Reserved Concurrency to limit concurrency
Reserved Concurrency is taken out of the available regional Lambda concurrency
managing Reserved Concurrency for many functions is difficult and error-prone; it is easy to create more problems than it solves
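A rough sketch of the bookkeeping problem: every reservation shrinks the pool shared by all other functions in the region. The numbers are assumptions, not from the slides: a regional limit of 1,000 (the common default, which varies by account) and a floor of 100 unreserved concurrency that Lambda enforces. The function names used as keys are hypothetical.

```python
from typing import Dict

REGIONAL_LIMIT = 1000  # assumed regional concurrency quota; check your account
MIN_UNRESERVED = 100   # assumed floor Lambda keeps for functions without a reservation


def unreserved_concurrency(reservations: Dict[str, int],
                           regional_limit: int = REGIONAL_LIMIT) -> int:
    """Concurrency left for every function that has no reservation of its own."""
    remaining = regional_limit - sum(reservations.values())
    if remaining < MIN_UNRESERVED:
        raise ValueError(
            f"reservations leave {remaining} unreserved, below the floor of {MIN_UNRESERVED}"
        )
    return remaining


# Two reservations already take half of the regional pool.
print(unreserved_concurrency({"process-orders": 200, "send-emails": 300}))
```

With dozens of functions, each new reservation has to be checked against all the others, which is where the error-prone part comes in.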
Costs
always factor scale into the equation
Kinesis on-demand mode pricing
1 msg/s for a month, 1KB per msg
SNS: 1 x 60s x 60m x 24hr x 30days @ $0.50 per mil = $1.296
SQS: 1 x 60s x 60m x 24hr x 30days @ $0.40 per mil = $1.037
EventBridge: 1 x 60s x 60m x 24hr x 30days @ $1.00 per mil = $2.592
Kinesis Provisioned: 1 x 60s x 60m x 24hr x 30days @ $0.014 per mil + 24hrs x 30days @ $0.015 per shard per hr = $10.836
Kinesis On-Demand: 1KB x 60s x 60m x 24hr x 30days @ $0.08 per GB ingested + 24hrs x 30days @ $0.04 per stream per hr = $28.998

1,000 msg/s for a month, 1KB per msg
SNS: 1000 x 60s x 60m x 24hr x 30days @ $0.50 per mil = $1296.00
SQS: 1000 x 60s x 60m x 24hr x 30days @ $0.40 per mil = $1036.80
EventBridge: 1000 x 60s x 60m x 24hr x 30days @ $1.00 per mil = $2592.00
Kinesis Provisioned: 1000 x 60s x 60m x 24hr x 30days @ $0.014 per mil + 24hrs x 30days @ $0.015 per shard per hr = $47.088
Kinesis On-Demand: 1000KB x 60s x 60m x 24hr x 30days @ $0.08 per GB ingested + 24hrs x 30days @ $0.04 per stream per hr = $226.55
services that charge by uptime are order(s) of magnitude
cheaper when running at scale
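The arithmetic on the cost slides can be reproduced with a few lines of Python. This is a sketch using the prices quoted on the slides, which may not match current AWS pricing. It assumes a 30-day month, a single shard for provisioned Kinesis (as the slides do at both rates), and 1 GB = 1024 x 1024 KB. The function names are hypothetical.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30  # 2,592,000 seconds in a 30-day month
HOURS_PER_MONTH = 24 * 30              # 720 hours


def per_message_cost(msgs_per_sec: float, price_per_million: float) -> float:
    """Monthly cost for services billed purely per request (SNS, SQS, EventBridge)."""
    return msgs_per_sec * SECONDS_PER_MONTH / 1_000_000 * price_per_million


def kinesis_provisioned_cost(msgs_per_sec: float, shards: int = 1) -> float:
    """Per-request charge plus the hourly shard charge (single shard assumed)."""
    return per_message_cost(msgs_per_sec, 0.014) + HOURS_PER_MONTH * 0.015 * shards


def kinesis_on_demand_cost(msgs_per_sec: float, kb_per_msg: float = 1) -> float:
    """Per-GB ingestion charge plus the hourly stream charge."""
    gb_ingested = msgs_per_sec * kb_per_msg * SECONDS_PER_MONTH / (1024 * 1024)
    return gb_ingested * 0.08 + HOURS_PER_MONTH * 0.04


for rate in (1, 1000):
    print(f"{rate} msg/s")
    print("  SNS                ", round(per_message_cost(rate, 0.50), 3))
    print("  SQS                ", round(per_message_cost(rate, 0.40), 3))
    print("  EventBridge        ", round(per_message_cost(rate, 1.00), 3))
    print("  Kinesis Provisioned", round(kinesis_provisioned_cost(rate), 3))
    print("  Kinesis On-Demand  ", round(kinesis_on_demand_cost(rate), 3))
```

At 1 msg/s the hourly charges dominate and Kinesis is the most expensive option; at 1,000 msg/s the per-request charges dominate and the ranking flips.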
National broadcaster of Finland
500+ million events per day, peaks at 600K+ messages/min
Anahit Pogosova
decisions driven by cost efficiency
fallback in case Kinesis is down
Error Handling
0, 1, 2
DLQs only capture the failed event
DLQ
SNS, SQS, Lambda, EventBridge
prefer Lambda destinations over DLQs
(you can use both together, but there’s no clear reason for doing so)
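A minimal sketch of what configuring an on-failure destination might look like. The helper below only builds the request parameters. The parameter names follow the Lambda `PutFunctionEventInvokeConfig` API, which accepts 0, 1 or 2 retry attempts for asynchronous invocations. The helper name, function name and ARN are hypothetical.

```python
def on_failure_destination_config(function_name: str,
                                  destination_arn: str,
                                  max_retries: int = 2) -> dict:
    """Build parameters for routing failed async invocations to a destination."""
    # Asynchronous invocations can be retried 0, 1 or 2 times before giving up.
    if max_retries not in (0, 1, 2):
        raise ValueError("max_retries must be 0, 1 or 2")
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": max_retries,
        "DestinationConfig": {"OnFailure": {"Destination": destination_arn}},
    }


config = on_failure_destination_config(
    "process-orders",                                        # hypothetical function
    "arn:aws:sqs:eu-west-1:123456789012:process-orders-failures",  # hypothetical ARN
)
print(config)
```

With boto3 installed and credentials configured, the resulting dict could be passed as keyword arguments to the Lambda client's `put_function_event_invoke_config` call.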
Kinesis
EventBridge
SNS
SQS
DynamoDB Streams
DLQ
Observability
A measure of how well the internal state of an
application can be inferred from its external outputs
X-Ray
X-Ray doesn’t trace through many
popular messaging services.
Need a separate solution (e.g. correlation IDs) for Lambda logs.
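The correlation ID idea can be sketched without any AWS dependency: attach an ID to the first message, copy it onto every message derived from it, and include it in every log line. The envelope shape, key name and helper names below are hypothetical, not something the slides prescribe.

```python
import json
import uuid
from typing import Optional

CORRELATION_KEY = "x-correlation-id"  # hypothetical metadata key


def wrap_message(payload: dict, correlation_id: Optional[str] = None) -> dict:
    """Wrap a payload in an envelope, generating a correlation ID if none is given."""
    return {
        "payload": payload,
        "metadata": {CORRELATION_KEY: correlation_id or str(uuid.uuid4())},
    }


def forward(message: dict, new_payload: dict) -> dict:
    """Create a downstream message that carries the incoming correlation ID."""
    return wrap_message(new_payload, message["metadata"][CORRELATION_KEY])


def log_line(message: dict, text: str) -> str:
    """Structured log line that includes the correlation ID for later searching."""
    return json.dumps({"message": text,
                       CORRELATION_KEY: message["metadata"][CORRELATION_KEY]})


first = wrap_message({"orderId": 42})
second = forward(first, {"orderId": 42, "status": "paid"})
print(log_line(first, "order received"))
print(log_line(second, "payment processed"))
```

Searching the logs for one correlation ID then returns the lines from every function that took part in the transaction.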
all the indirect invocations are accounted for
all the Lambda logs from the transaction in chronological order
Step 1.
Step 2.
No manual instrumentation.
Better support for messaging services.
Supports containers and Lambda.
“which messaging service should I use?”
task vs event
Task: “Go do a thing”
Intended for a target receiver.
Often expects an answer.
Event: “Something happened”
Publishers are oblivious to subscribers.
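The distinction shows up in the shape of the messages. In the sketch below, a task names its receiver and may name a place to send the answer, while an event only records what happened and where it came from. The class and field names are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Task:
    """'Go do a thing': addressed to one receiver, may expect an answer."""
    action: str
    target: str                     # the intended receiver
    reply_to: Optional[str] = None  # where the answer should go, if one is expected
    body: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Event:
    """'Something happened': no receiver, the publisher does not know who listens."""
    name: str
    source: str
    detail: dict = field(default_factory=dict)


task = Task("resize_image", target="image-service", reply_to="resize-results")
event = Event("order_placed", source="orders", detail={"orderId": 42})
print(task)
print(event)
```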
EventBridge
Do Stuff!
order is lost
SNS FIFO
SQS FIFO
Do Stuff (in order)!
Other subscribers (not my problem!)
Kinesis Lambda
EventBridge
send domain events
service
https://theburningmonk.com/hire-me
Advise
Training Delivery
“Fundamentally, Yan has improved our team by increasing our
ability to derive value from AWS and Lambda in particular.”
Nick Blair
Tech Lead
@theburningmonk
theburningmonk.com
github.com/theburningmonk
