SCALING MQTT WITH KAFKA
Tim Kellogg
April 7, 2014
@kellogh
•  MQTT broker
•  Protocol onboarding
•  Cloud environment (we’re a startup)
•  Standard
•  Lightweight
o  As little as 2 bytes of overhead per message
•  Easy to parse
o  Length prefixed strings
•  Requires very little resources on client side
o  Broker keeps track of state
•  Reliable
o  QoS 1 & 2
o  Last Will & Testament messages
•  Secure
o  Username + Password
o  Tunnel over TLS
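To illustrate the "length prefixed strings" point above, here is a minimal sketch of how an MQTT string field can be written and read: a 2-byte big-endian length followed by the UTF-8 bytes. The function names are made up for this example and do not come from any particular MQTT library.

```python
import struct


def encode_mqtt_string(text: str) -> bytes:
    """Encode a string as a 2-byte big-endian length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("MQTT strings are limited to 65535 bytes")
    return struct.pack("!H", len(data)) + data


def decode_mqtt_string(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode one length-prefixed string; return it and the offset just past it."""
    (length,) = struct.unpack_from("!H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared string length")
    return buf[start:end].decode("utf-8"), end
```

Because the length comes first, a parser never has to scan for a terminator; it reads two bytes and then knows exactly how many more to consume.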
Publish / Subscribe
[Diagram: three publishers (Pub) publish to Topic/A, Topic/B, and Topic/C on one broker; three subscribers (Sub) each receive one of those topics.]
Topics
•  foo/bar/baz
•  com.example/device/17/thermo
•  Patterns
•  com.example/device/+/thermo
•  com.example/device/#
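The two wildcard patterns above can be sketched as a small matcher: `+` stands for exactly one topic level and `#` for all remaining levels. This is a simplified illustration (it ignores special cases such as topics starting with `$`), and `topic_matches` is a name invented for the example.

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a subscription pattern."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level,
            # and it matches everything from here on (including nothing).
            return i == len(pattern_levels) - 1
        if i >= len(topic_levels):
            return False  # pattern is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen, so both must have the same number of levels.
    return len(pattern_levels) == len(topic_levels)
```

For example, `topic_matches("com.example/device/+/thermo", "com.example/device/17/thermo")` is true, while the same pattern does not match `com.example/device/17/humidity`.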
Scaling Goals
•  More than 2 million connected publishers
•  More than 65,000 msg/s
•  Single subscriber
Scaling Goals
•  Amazon’s EC2
•  Horizontal scaling
o  Reduce cost
o  Plan for the future
o  Less impact from downtime
Problems with Scaling MQTT
Load Balancing
•  Which broker to connect to?
o  DNS load balancing
•  HAProxy
•  QoS 1-2 messages stored in Cassandra
o  Consistent hash ring
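As a rough illustration of the consistent hash ring idea mentioned above, the sketch below places each node at several points on a ring and assigns a key to the first node point at or after the key's hash. This is not Cassandra's actual partitioner; the class name, the use of MD5, and the `vnodes` parameter are assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """A minimal consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._hashes, self._hash(key))
        return self._ring[idx % len(self._ring)][1]
```

The useful property is that removing a node only moves the keys that node owned; every other key stays where it was, which matters when stored QoS 1-2 messages must be found again after the cluster changes.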
Single Subscriber
[Diagram sequence: (1) three publishers send Topic/A, Topic/B, and Topic/C to one broker, and a single subscriber subscribes to Topic/#; (2) the publishers are load balanced across three brokers; (3) load balancing is also added between the brokers and the subscriber; (4) three brokers and a single subscriber to Topic/#.]
Using HTTP POST From The Broker
[Diagram: three publishers send Topic/A, Topic/B, and Topic/C through load balancing to three brokers; each broker forwards messages by HTTP POST through a second load balancer to three servers.]
Benefits
•  Easy to load balance
•  Well known & well supported
Drawbacks
•  HTTP is heavy
o  Headers
o  Creating & destroying TCP connections
•  Subscriber servers must be available
•  Retry logic to guarantee delivery
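The "retry logic" drawback can be made concrete with a small sketch of what a broker would need around every HTTP POST: retry with exponential backoff until the subscriber server accepts the message. The `send` callable stands in for the actual HTTP request and is hypothetical, as are the parameter names and defaults.

```python
import time
from typing import Callable


def deliver_with_retry(
    send: Callable[[bytes], bool],
    payload: bytes,
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send(payload) until it succeeds or attempts run out.

    `send` returns True when the server accepted the message (assumption:
    it wraps an HTTP POST and maps a 2xx response to True).
    """
    for attempt in range(max_attempts):
        try:
            if send(payload):
                return True
        except OSError:
            pass  # treat a network error like a failed delivery
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return False
```

Even this small loop leaves open questions (where undelivered messages wait, what happens when the broker restarts), which is part of why the list above counts retry logic as a cost.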
Apache Kafka
•  Distributed log aggregation framework
•  Server to server
•  “Smart” clients
•  Apache ZooKeeper
•  Append-only files per topic
o  Client keeps track of what messages it’s processed
•  No topic wildcards
•  Key is used for out of band data
•  device/42/thermo → topic: device-thermo, key: 42
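The mapping in the last bullet can be sketched as a function that pulls one level out of the MQTT topic to use as the Kafka key and joins the remaining levels into a wildcard-free Kafka topic name. Which level holds the key is an assumption here (`key_index`, defaulting to the second level as in the example above); the function name is also invented for illustration.

```python
def mqtt_to_kafka(mqtt_topic: str, key_index: int = 1) -> tuple[str, str]:
    """Split an MQTT topic into a (kafka_topic, key) pair.

    The level at `key_index` (hypothetical convention) becomes the key;
    the other levels are joined with '-' to form the Kafka topic.
    """
    levels = mqtt_topic.split("/")
    if not 0 <= key_index < len(levels):
        raise ValueError("key_index is outside the topic's levels")
    key = levels[key_index]
    kafka_topic = "-".join(levels[:key_index] + levels[key_index + 1:])
    return kafka_topic, key
```

With this, `mqtt_to_kafka("device/42/thermo")` gives `("device-thermo", "42")`, so every device's thermostat readings land in one Kafka topic and the device id travels in the key.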
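Subscriber Group
[Diagram sequence: (1) three publishers send to one broker, which feeds a subscriber group; (2) three publishers are load balanced across three brokers, the brokers write to Kafka, and a subscriber group reads from Kafka.]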
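A subscriber group can share the work because Kafka divides a topic's partitions among the members of a consumer group. The sketch below shows the idea with a simple round-robin split; it is not Kafka's actual assignment algorithm, and the function and member names are made up for the example.

```python
def assign_partitions(partitions: int, members: list[str]) -> dict[str, list[int]]:
    """Spread partition numbers 0..partitions-1 over group members round-robin."""
    if not members:
        raise ValueError("a group needs at least one member")
    ordered = sorted(members)  # stable order so every member computes the same plan
    assignment: dict[str, list[int]] = {member: [] for member in ordered}
    for partition in range(partitions):
        assignment[ordered[partition % len(ordered)]].append(partition)
    return assignment
```

Each partition goes to exactly one member, so adding members (up to the number of partitions) spreads the fire hose across more machines.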
Results
•  Linear scaling for fire hose subscriber
•  At least 2 million clients
•  At least 65,000 msg/s
Wish List
•  Security
•  Configuration
Open Source IoT
The Book: Mastering The Internet of Things
Questions?
@kellogh
