Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Adam Zegelin
Founding Software Engineer, Instaclustr
Introduction
• Adam Zegelin
• Co-founded Instaclustr 5 years ago
• In Canberra, Australia
• Current focus is Cassandra on Kubernetes
• Instaclustr
• Managed Apache Cassandra, Spark and Kafka in the ☁️
 AWS, GCP, Azure & IBM
 3000 nodes under management
 24×7×365 support
• Consulting
 Schema & application design
 Workshops & Training
• 2nd-level on-call support for on-premise deployments
Agenda
• Introduction to Cassandra and Kafka
• Real-world Use Cases
• Worldpay
• Lendi
• Instaclustr
• Partitioning: the key to scale
• Fitting and architecting for your use case
• Linearly Scalable
• Always Available
• Multi-Region Data Store
• Apache Cassandra is the leading NoSQL operational
database for high-scale and high-reliability applications.
• Shared nothing peer-to-peer architecture provides
reliability up to 100% (with Instaclustr SLAs).
• replicated data and multiple nodes capable of fulfilling queries
 Node outage? Service just keeps running
• full online maintenance and in-place upgrades
• Low latency for operational applications
• Sub-10ms P95 reads and writes achievable
• Native active-active multi data center support
• Geographic distribution (to meet latency requirements)
• Disaster resilience
• Workload isolation (analytics)
• Cassandra is a data storage system, not an
analytics/query engine or place to run logic
Typical Use Cases
• High write to read ratio
• Data is rarely updated
• Including explicit deletes
• The Primary Key is known at read time
• Limited filtering & aggregation
• No JOINs or referential integrity
• Transaction logging
• Time series data
• IoT status and event history
• Health tracker data
• Order & package statuses & tracking
• Weather service history
• Messages and email envelopes
Queuing, Pub/Sub and Streaming at Scale
• Apache Kafka is a distributed streaming platform
• Publish and subscribe to streams of records
 Similar to a message queue or EMS
• Store streams of records
 Fault-tolerant
 Durable
• Process streams of records
 as they occur
 randomly, any position in the stream
• Replicated architecture
• High-level similarities to Cassandra
• Scalability
• Reliability
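The "store" and "process ... any position in the stream" points above can be pictured as an append-only log in which every record has an offset. The following is a minimal in-memory sketch, not Kafka's implementation; the `RecordLog` class and the event names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordLog:
    """A toy append-only log; a record's offset is its position in the list."""
    records: List[str] = field(default_factory=list)

    def append(self, record: str) -> int:
        """Store a record and return the offset it was written at."""
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[str]:
        """Return every record at or after the given offset."""
        return self.records[offset:]


log = RecordLog()
for event in ["page_view", "search", "add_to_cart", "checkout"]:
    log.append(event)

# A consumer that is caught up reads only the newest record...
latest = log.read_from(3)
# ...while another consumer can rewind and replay from an earlier position.
replay = log.read_from(1)
```

Because records are never modified in place, each reader only needs to remember its own offset, which is what lets different consumers process the same stream at different positions.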
Typical Use Cases
• As a message bus
• Loose coupling between producers and consumers
• Basis for micro-services
• As a commit log
• A store of logical transactions
• Populating analytical data stores or edge caches
• As a buffer
• Manage backpressure & workload spikes
And when combined with Kafka Streams/Spark Streaming…
• As the basis of a streaming architecture
• (near) real-time analytics
• Data processing pipelines
Typical Use Cases
cont’d
• Website activity tracking
• Page views
• Searches
• Other user actions
• Metrics
• Operational monitoring data
• Log aggregation
• Centralized logging
• Event sourcing
• Application state changes
• “we don't just want to see where we are, we also want to know
how we got there”
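As a rough illustration of event sourcing, the sketch below derives current state by replaying a list of state changes. The account names, amounts and the `replay_events` helper are assumptions made for this example, not part of the talk.

```python
from typing import Dict, List, Tuple

# Each event is (account, change); names and amounts are hypothetical.
Event = Tuple[str, int]


def replay_events(events: List[Event]) -> Dict[str, int]:
    """Rebuild balances by applying every recorded change in order."""
    balances: Dict[str, int] = {}
    for account, change in events:
        balances[account] = balances.get(account, 0) + change
    return balances


events: List[Event] = [("alice", 100), ("bob", 50), ("alice", -30)]

# "Where we are": replay the full history.
current = replay_events(events)
# "How we got there": replay a prefix to see the state at an earlier point.
after_two_events = replay_events(events[:2])
```

Keeping the events rather than only the final balances is what makes the earlier state recoverable.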
Case study
• Payment processor
• spun out of RBS in 2010
• merged with Vantiv in the US in Jan 2018 for USD 10.4B to form
WorldPay Inc.
• Processes
• >40 Million transactions per day
• for 400,000 merchants
• 42% of all UK non-cash transactions
Case study
cont’d
• Re-architecting of WorldPay’s XML Payment API
• facilitates ~40M transactions per month
• New architecture based on open source technologies
• including Cassandra and Kafka
• to provide scalability, availability and reduced costs
• New Idempotency Service
• first project to use the new architecture
• provides capabilities to ensure payments are not repeated
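The idea behind an idempotency check can be sketched as follows. This is not Worldpay's implementation: the `IdempotencyStore` class, the key format and the in-memory dictionary are all assumptions, and a real service would need the check-and-record step to be atomic in its data store.

```python
from typing import Dict


class IdempotencyStore:
    """Remembers the outcome of each request key so retries are not re-executed."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}
        self.charges = 0  # counts how many payments were actually executed

    def process_once(self, key: str, amount_pence: int) -> str:
        # A key seen before means this is a retry: return the stored outcome.
        if key in self._results:
            return self._results[key]
        # First time this key is seen: execute the payment and record the result.
        self.charges += 1
        result = f"charged {amount_pence}"
        self._results[key] = result
        return result


store = IdempotencyStore()
first = store.process_once("order-1001", 2500)
retry = store.process_once("order-1001", 2500)  # e.g. a client timeout and resend
```

Here the retry receives the same response as the first call, and only one charge is made.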
Case study
cont’d
• Challenges
• Tight deployment timeframe
• Very high availability expectations
• Low latency requirements
• Utilises Cassandra to provide highest levels of availability
and scalability
• 18 node cluster
• 3 AWS regions (in Europe)
• Leverages Cassandra's tuneable consistency
 QUORUM = strong consistency across regions
 still able to operate with a whole region unavailable
 Latency is tolerable (restricted to EU)
• Simple data model with atomic reads/writes
 fits well with Cassandra capability
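The QUORUM point can be checked with simple arithmetic. Cassandra's QUORUM level needs a majority of all replicas across data centers. The sketch below assumes a replication factor of 3 in each of the 3 regions; the talk does not state the actual replication factor, and the region names are placeholders.

```python
from typing import Dict, Set


def quorum(total_replicas: int) -> int:
    """Majority of replicas: floor(n / 2) + 1."""
    return total_replicas // 2 + 1


# Assumed replication factor per region (not stated in the talk).
replication: Dict[str, int] = {"region-a": 3, "region-b": 3, "region-c": 3}
total = sum(replication.values())
needed = quorum(total)


def can_serve_quorum(down_regions: Set[str]) -> bool:
    """True if the replicas in the surviving regions still form a quorum."""
    alive = sum(rf for region, rf in replication.items()
                if region not in down_regions)
    return alive >= needed


one_region_down = can_serve_quorum({"region-a"})
two_regions_down = can_serve_quorum({"region-a", "region-b"})
```

Under these assumptions 5 of 9 replicas are needed, so losing one region (6 replicas left) still satisfies QUORUM, while losing two (3 left) does not.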
Case study
cont’d
• Worked with Instaclustr to accelerate development and
time to stable service:
• Consulting engagement assisted with data model design
• Cassandra cluster run on Instaclustr managed service
 production ready in weeks
• Initial preference was to run on-prem
• security compliance
• did not expect cloud to meet latency requirements
• However, timeframes did not allow establishment of
internal deployment
• Used Instaclustr’s managed Cassandra service on AWS for
initial go-live.
• Now satisfied as a long-term solution
Case study
• Australia’s leading online home loan lender
• Processing over 90% of Australia’s online lending enquiries.
• Re-architecture of their platform following a major
funding round
• customer and data-centric
Case study
cont’d
• Integration-heavy environment
• Bespoke interfaces with banks, etc.
• Moving to a micro-services architecture
• Kafka as a message bus
• New architecture
• Decoupled application code from embedded data sets from
various business applications
• Unified data models from the various point solutions and
market segments
• Enabled extensive scale
 supports rapid and large growth in data as the consumer base
grows
Case study
• Cassandra
• Storage for monitoring metrics & events
• Custom collector
• RabbitMQ transport
 Will eventually move to Kafka as the transport
• Metrics are processed by Riemann
 Raises PagerDuty alerts, tickets, emails
 Writes to Cassandra
• Kafka
• Centralised logging
• Events are collected by fluentd
• Pumped into LogStash via Kafka
• Indexed via ElasticSearch
• Viewed with Kibana
Partitioning
The key to scale
• Partitioning
• using a key in your data to split the data across multiple
servers
• Manual partitioning is possible but painful
• Cassandra and Kafka make partitioning transparent
• needs conscious consideration
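A minimal sketch of key-based partitioning: hash the key and use the result to pick a server. This is a simplification; Cassandra places data with a token ring rather than a plain modulo, and the key names and server count here are hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_servers: int) -> int:
    """Map a key to a server index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


servers = 4
keys = [f"customer-{i}" for i in range(1000)]

# Count how many keys land on each server.
placement = Counter(partition_for(k, servers) for k in keys)
```

The same key always maps to the same server, so reads know where to look. The "conscious consideration" is choosing a key with enough distinct values that the data spreads evenly.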
Cassandra Cluster
Cluster
Data Center (optional)
Rack (optional, recommended)
Node
Partitioning
Cassandra Partitions
Queuing and Streaming at Scale
● Broker
○ Node/server/VM
● Topic
○ Logical grouping of data (category/feed/name)
○ Settings:
○ Replication
○ Partition count
○ Retention
○ Compaction
○ …
Kafka Brokers, Topics and Partitions
Partition
○ Subset of messages in a topic
■ Has a single master (leader) broker
■ Guarantees ordered delivery within that
subset
○ Number of partitions is set on topic creation
Kafka Topics and Partitions (cont’d)
• Messages are mapped to a partition by the Producer
• Randomly/round-robin
• Hash of record key
• Consumers are members of Consumer Groups
• Consumer Groups register to consume records from
Topics
• Each Consumer in a Consumer Group is the exclusive
consumer of a “fair share” of partitions in the topic.
Kafka Partitions in Action
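The two mechanisms above, producer-side partition selection and a "fair share" split of partitions within a consumer group, can be sketched as below. This is an illustration only: real Kafka clients use their own hash function and assignment strategies, `zlib.crc32` is a stand-in, and the partition count and consumer names are hypothetical.

```python
import zlib
from itertools import count
from typing import Dict, List, Optional

NUM_PARTITIONS = 6
_round_robin = count()  # shared counter for records without a key


def choose_partition(key: Optional[str]) -> int:
    """Keyed records hash to a fixed partition; keyless ones go round-robin."""
    if key is None:
        return next(_round_robin) % NUM_PARTITIONS
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def assign(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Give each consumer in a group an exclusive, roughly equal set of partitions."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


shares = assign(list(range(NUM_PARTITIONS)),
                ["consumer-a", "consumer-b", "consumer-c"])
```

Since all records with the same key land in one partition, and each partition has exactly one consumer in the group, records for a given key are processed in order by a single consumer.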
Fitting and architecting for your use case
Cassandra
• Big data
• one or more individually big (>1TB) tables
• Need to pre-determine read pattern
• at least to partition key
• Very low cost writes
• great for high write / read ratio use cases
• Ideal for small reads
• 1, 10, 100, 1000 rows at a time
• No limits to horizontal scaling (data size or ops/sec)
• provided you can find a partition key that fits.
• No relational integrity
• No Foreign Keys, no JOINs
• Limited filtering, aggregation
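To make "pre-determine the read pattern" concrete, here is a sketch of a time-series layout where the partition key is (sensor, day), so the known query "all readings for one sensor on one day" touches a single partition. It models the idea with a Python dictionary rather than Cassandra itself; the sensor names, the daily bucket and the helper functions are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, List, Tuple

PartitionKey = Tuple[str, str]       # (sensor_id, day)
Row = Tuple[datetime, float]         # (timestamp, value)

table: Dict[PartitionKey, List[Row]] = defaultdict(list)


def partition_key(sensor_id: str, ts: datetime) -> PartitionKey:
    """Bucket by day so a single partition does not grow without bound."""
    return (sensor_id, ts.strftime("%Y-%m-%d"))


def write(sensor_id: str, ts: datetime, value: float) -> None:
    """Writes are cheap appends to the partition the key maps to."""
    table[partition_key(sensor_id, ts)].append((ts, value))


def read_day(sensor_id: str, day: str) -> List[Row]:
    """The read supplies the full partition key; no scan of other partitions."""
    return sorted(table.get((sensor_id, day), []))


write("sensor-1", datetime(2018, 6, 1, 9, 0, tzinfo=timezone.utc), 20.5)
write("sensor-1", datetime(2018, 6, 1, 8, 0, tzinfo=timezone.utc), 19.0)
write("sensor-1", datetime(2018, 6, 2, 8, 0, tzinfo=timezone.utc), 18.2)
write("sensor-2", datetime(2018, 6, 1, 8, 0, tzinfo=timezone.utc), 30.1)

day_one = read_day("sensor-1", "2018-06-01")
```

A query that does not know the sensor and the day would have to look across partitions, which is the kind of read this model is not designed for.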
Fitting and architecting for your use case
Kafka
• Big data
• 5k+ messages/topic/second
• Not transactional
• unlike traditional MQ tech
• although exactly-once delivery is now available
• Kafka Streams is a very powerful tool for analysis and
mutations on data streams
Adam Zegelin
adam@instaclustr.com
Founding Software
Engineer

Editor's Notes

  • #11 Lower throughput system