1
Stefan Richter

@stefanrrichter



29.10.2016
A look at Flink 1.2 and beyond
Agenda
▪ Flink 1.2 feature overview & walkthrough
▪ Taking a closer look at two features:
▪ Queryable state
▪ Dynamic scaling
2
Feature Overview
Flink Release 1.2
3
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.2 Improvements
5
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Operations
Ecosystem
Application

Features
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Broader

Audience
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Security / Authentication - Flink 1.2
6
Authorized data access
Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …
Encrypted traffic between Flink Processes
• RPC, Data Exchange, Web UI, … - „SSL for all connections“
Largely contributed by
Prevent malicious users to hook into Flink jobs
Cluster Management - Flink 1.1
7
Standalone
Flink on Yarn
Cluster Management - Flink 1.2
8Mesos integration contributed by
Standalone
Flink on Yarn
Flink on Mesos
Cluster Management - Beyond 1.2
9
Efforts to seamlessly interoperate with various
cluster managers.
Generalized abstraction (FLIP-6).
Driven by and
Cluster Management - Beyond (ct’d)
10
TaskManagerJobManager
(1) Register
(2) Deploy Tasks
ResourceManager
(1) Request
slots
TaskManager
JobManager
(2) Start
TaskManager
(3) Register
(4) Deploy Tasks
Dispatcher
(0) Start
JobManager
Cluster Management - Beyond (ct’d)
10
TaskManagerJobManager
(1) Register
(2) Deploy Tasks
ResourceManager
(1) Request
slots
TaskManager
JobManager
(2) Start
TaskManager
(3) Register
(4) Deploy Tasks
Dispatcher
(0) Start
JobManager
Metrics
11
Metrics
▪ Rates
11
Metrics
▪ Rates
▪ Latency (operator)
11
Metrics
▪ Rates
▪ Latency (operator)
▪ Visualization in WebUI
11
Savepoint / Checkpoint Robustness
12
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
12
C S
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
12
C1 C2 C3
t
✘
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
▪ Skip failed Checkpoints
12
C1 C2 C3
t
✘
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
▪ Skip failed Checkpoints
▪ Backwards compatible
12
S
1.1 1.2
Processing Function
13
Stream SQL
Streaming API
Processing Function
Window
Operator
Timer
Handling
?
Problem: Implement custom windowing?
Processing Function
13
Stream SQL
Streaming API
Processing Function
Window
Operator
Timer
Handling
Interface ProcessingFunction:
void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception
Table API & Stream SQL
14
Example:
Table API & Stream SQL
▪ Group-windows
14
Example:
table
.groupBy('user')
.window(Session withGap
10.minutes on 'rowtime')
.select('uid', 'product.count')
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
14
Example:
EXISTS, VALUES, LIMIT
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
14
Example:
CURRENT_DATE, INITCAP, NULLIF
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
▪ More datatypes & better
integration
14
Example:
pojo.get('field')
pojo.flatten()
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
▪ More datatypes & better
integration
▪ User-defined scalar functions
14
Example:
table.
select('uid',
parseName('userJson'))
Many more improvements…
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
▪ Async IO operator: non-blocking queries to external systems
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
▪ Async IO operator: non-blocking queries to external systems
▪ Improved scalability, robustness + bugfixes
15
Queryable State
Flink 1.2
16
Queryable State - Motivation
17
Realtime
Queries
Periodically (every second)

flush new aggregates

to Redis
Queryable State - Motivation
18
Number of

Keys
Queryable State - Motivation
19
Realtime
QueriesWhere is the bottleneck?
Queryable State - Motivation
19
Writes to the key/value

store take too long
Realtime
QueriesWhere is the bottleneck?
Queryable State - Idea
20
Realtime
Queries
Archive
Database
Optional +
only at end of windows
“Streamprocessor
as a database“
Queryable State - Performance
21
Number of

Keys
Queryable State - Implementation
22
Query Client
State

Registry
window()
/
sum()
Job Manager Task Manager
ExecutionGraph
State Location Server
deploy
status
Query: /job/operation/state-name/key
State

Registry
Task Manager
(1) Get location of "key-partition"

for "operator" of" job"
(2) Look up

location
(3)

Respond location
(4) Query

state-name and key
local

state
register
window()
/
sum()
Queryable State Enablers
23
Queryable State Enablers
▪ Flink has state as a first class citizen
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
▪ State is continuous (not mini batched)
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
▪ State is continuous (not mini batched)
▪ State is scalable (e.g., embedded RocksDB state
backend)
23
Dynamic Scaling
Flink 1.2
24
Motivation - Changing Workloads
25
Motivation - Changing Workloads
25
Motivation - Changing Workloads
25
Motivation - Resource Adaption
26
time
Workload
Resources
Motivation - Resource Adaption
26
time
Workload
Resources
time
Workload
Resources
Motivation - Resource Adaption
26
+
time
Workload
Resources
time
Workload
Resources
Basic Idea
27
• Spread work across more workers to decrease workload
Scaling Stateless Jobs
28
Scale Up Scale Down
Source
Mapper
Sink
• Scale up: Deploy new tasks
• Scale down: Cancel running tasks
Scaling Stateful Jobs
29
?
• Problem 1: Which state to assign to new task?
• Problem 2: Read + filter whole state?
Non-keyed vs Keyed State
30
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Non-keyed vs Keyed State
30
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Repartitioning Non-keyed state
31
#1 #2
#3 #4
#1 #2
#3 #4
Flink 1.1:
T snapshot()
void restore(T)
Flink 1.2:
List<T> snapshot()
void restore(List<T>)
Idea: break up state into finer granules that can be redistributed independently
Example: Kafka Source Flink 1.1
32
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
?
Operator state is black box. How to repartition?
Example: Kafka Source Flink 1.2
33
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
?
?
?
Return a list of sub-states which can be freely repartitioned.
partitionId: 1, offset: 42
partitionId: 6, offset: 27
Example: Kafka Source Flink 1.2
34
partitionId: 3, offset: 10
Scale Out
partitionId: 1, offset: 42
partitionId: 6, offset: 27
Example: Kafka Source Flink 1.2
34
partitionId: 3, offset: 10
Scale Out
Example: Kafka Source Flink 1.2
35
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Scale In
Example: Kafka Source Flink 1.2
35
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Scale In
Non-keyed vs Keyed State
36
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Repartitioning Keyed State
▪ Split key space into
key groups
▪ Every key falls into
exactly one key group
▪ Assign key groups to
tasks
37
Key space
Key group #1 Key group #2
Key group #3Key group #4
One key
Repartitioning Keyed State (ct’d)
▪ Rescaling changes
key group assignment
▪ Maximum parallelism
defined by #key
groups
38
Current State in Flink 1.2
▪ Manual rescaling
1. Take savepoint
2. Restart job with adjusted parallelism and
savepoint
39
Next Steps beyond Flink 1.2
▪ Rescaling individual operators w/o restart
▪ Refactor Flink deployment and process
model (previously discussed)
▪ On-the-fly Scaling
40
Autoscaling Policies
41
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-
autoscale) already support auto-scaling
Conclusion
42
Conclusion
▪ Many great features in Flink 1.2
▪ Walkthrough
▪ Queryable State & Dynamic Scaling
42
Conclusion
▪ Many great features in Flink 1.2
▪ Walkthrough
▪ Queryable State & Dynamic Scaling
▪ Glimpse beyond the 1.2 release
42
43
Thank you!
@stefanrrichter
@ApacheFlink
@dataArtisans
Questions?
44

A look at Flink 1.2

  • 1.
  • 2.
    Agenda ▪ Flink 1.2feature overview & walkthrough ▪ Taking a closer look at two features: ▪ Queryable state ▪ Dynamic scaling 2
  • 3.
  • 4.
    Flink 1.1+ ongoingdevelopment 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 5.
    Flink 1.1+ ongoingdevelopment 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 6.
    Flink 1.1+ ongoingdevelopment 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 7.
    Flink 1.1+ ongoingdevelopment 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 8.
    Flink 1.1+ ongoingdevelopment 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 9.
    Flink 1.2 Improvements 5 SessionWindows(Stream) SQL Library
 enhancements Metric
 System Operations Ecosystem Application
 Features Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Broader
 Audience Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors
  • 10.
    Security / Authentication- Flink 1.2 6 Authorized data access Secured clusters with Kerberos-based authentication • Kafka, ZooKeeper, HDFS, YARN, HBase, … Encrypted traffic between Flink Processes • RPC, Data Exchange, Web UI, … - „SSL for all connections“ Largely contributed by Prevent malicious users to hook into Flink jobs
  • 11.
    Cluster Management -Flink 1.1 7 Standalone Flink on Yarn
  • 12.
    Cluster Management -Flink 1.2 8Mesos integration contributed by Standalone Flink on Yarn Flink on Mesos
  • 13.
    Cluster Management -Beyond 1.2 9 Efforts to seamlessly interoperate with various cluster managers. Generalized abstraction (FLIP-6). Driven by and
  • 14.
    Cluster Management -Beyond (ct’d) 10 TaskManagerJobManager (1) Register (2) Deploy Tasks ResourceManager (1) Request slots TaskManager JobManager (2) Start TaskManager (3) Register (4) Deploy Tasks Dispatcher (0) Start JobManager
  • 15.
    Cluster Management -Beyond (ct’d) 10 TaskManagerJobManager (1) Register (2) Deploy Tasks ResourceManager (1) Request slots TaskManager JobManager (2) Start TaskManager (3) Register (4) Deploy Tasks Dispatcher (0) Start JobManager
  • 16.
  • 17.
  • 18.
  • 19.
    Metrics ▪ Rates ▪ Latency(operator) ▪ Visualization in WebUI 11
  • 20.
  • 21.
    Savepoint / CheckpointRobustness ▪ Resume job from checkpoints 12 C S
  • 22.
    Savepoint / CheckpointRobustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery 12 C1 C2 C3 t ✘
  • 23.
    Savepoint / CheckpointRobustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery ▪ Skip failed Checkpoints 12 C1 C2 C3 t ✘
  • 24.
    Savepoint / CheckpointRobustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery ▪ Skip failed Checkpoints ▪ Backwards compatible 12 S 1.1 1.2
  • 25.
    Processing Function 13 Stream SQL StreamingAPI Processing Function Window Operator Timer Handling ? Problem: Implement custom windowing?
  • 26.
    Processing Function 13 Stream SQL StreamingAPI Processing Function Window Operator Timer Handling Interface ProcessingFunction: void flatMap(I value, Context ctx, Collector<O> out) throws Exception; void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception
  • 27.
    Table API &Stream SQL 14 Example:
  • 28.
    Table API &Stream SQL ▪ Group-windows 14 Example: table .groupBy('user') .window(Session withGap 10.minutes on 'rowtime') .select('uid', 'product.count')
  • 29.
    Table API &Stream SQL ▪ Group-windows ▪ More SQL operations 14 Example: EXISTS, VALUES, LIMIT
  • 30.
    Table API &Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions 14 Example: CURRENT_DATE, INITCAP, NULLIF
  • 31.
    Table API &Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions ▪ More datatypes & better integration 14 Example: pojo.get('field') pojo.flatten()
  • 32.
    Table API &Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions ▪ More datatypes & better integration ▪ User-defined scalar functions 14 Example: table. select('uid', parseName('userJson'))
  • 33.
  • 34.
    Many more improvements… ▪Kafka 0.10 (with watermarks) 15
  • 35.
    Many more improvements… ▪Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic 15
  • 36.
    Many more improvements… ▪Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job 15
  • 37.
    Many more improvements… ▪Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job ▪ Async IO operator: non-blocking queries to external systems 15
  • 38.
    Many more improvements… ▪Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job ▪ Async IO operator: non-blocking queries to external systems ▪ Improved scalability, robustness + bugfixes 15
  • 39.
  • 40.
    Queryable State -Motivation 17 Realtime Queries Periodically (every second)
 flush new aggregates
 to Redis
  • 41.
    Queryable State -Motivation 18 Number of
 Keys
  • 42.
    Queryable State -Motivation 19 Realtime QueriesWhere is the bottleneck?
  • 43.
    Queryable State -Motivation 19 Writes to the key/value
 store take too long Realtime QueriesWhere is the bottleneck?
  • 44.
    Queryable State -Idea 20 Realtime Queries Archive Database Optional + only at end of windows “Streamprocessor as a database“
  • 45.
    Queryable State -Performance 21 Number of
 Keys
  • 46.
    Queryable State -Implementation 22 Query Client State
 Registry window() / sum() Job Manager Task Manager ExecutionGraph State Location Server deploy status Query: /job/operation/state-name/key State
 Registry Task Manager (1) Get location of "key-partition"
 for "operator" of" job" (2) Look up
 location (3)
 Respond location (4) Query
 state-name and key local
 state register window() / sum()
  • 47.
  • 48.
    Queryable State Enablers ▪Flink has state as a first class citizen 23
  • 49.
    Queryable State Enablers ▪Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) 23
  • 50.
    Queryable State Enablers ▪Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it 23
  • 51.
    Queryable State Enablers ▪Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it ▪ State is continuous (not mini batched) 23
  • 52.
    Queryable State Enablers ▪Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it ▪ State is continuous (not mini batched) ▪ State is scalable (e.g., embedded RocksDB state backend) 23
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
    Motivation - ResourceAdaption 26 time Workload Resources
  • 58.
    Motivation - ResourceAdaption 26 time Workload Resources time Workload Resources
  • 59.
    Motivation - ResourceAdaption 26 + time Workload Resources time Workload Resources
  • 60.
    Basic Idea 27 • Spreadwork across more workers to decrease workload
  • 61.
    Scaling Stateless Jobs 28 ScaleUp Scale Down Source Mapper Sink • Scale up: Deploy new tasks • Scale down: Cancel running tasks
  • 62.
    Scaling Stateful Jobs 29 ? •Problem 1: Which state to assign to new task? • Problem 2: Read + filter whole state?
  • 63.
    Non-keyed vs KeyedState 30 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 64.
    Non-keyed vs KeyedState 30 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 65.
    Repartitioning Non-keyed state 31 #1#2 #3 #4 #1 #2 #3 #4 Flink 1.1: T snapshot() void restore(T) Flink 1.2: List<T> snapshot() void restore(List<T>) Idea: break up state into finer granules that can be redistributed independently
  • 66.
    Example: Kafka SourceFlink 1.1 32 partitionId: 1, offset: 42 partitionId: 3, offset: 10 partitionId: 6, offset: 27 ? Operator state is black box. How to repartition?
  • 67.
    Example: Kafka SourceFlink 1.2 33 partitionId: 1, offset: 42 partitionId: 3, offset: 10 partitionId: 6, offset: 27 ? ? ? Return a list of sub-states which can be freely repartitioned.
  • 68.
    partitionId: 1, offset:42 partitionId: 6, offset: 27 Example: Kafka Source Flink 1.2 34 partitionId: 3, offset: 10 Scale Out
  • 69.
    partitionId: 1, offset:42 partitionId: 6, offset: 27 Example: Kafka Source Flink 1.2 34 partitionId: 3, offset: 10 Scale Out
  • 70.
    Example: Kafka SourceFlink 1.2 35 partitionId: 1, offset: 42 partitionId: 6, offset: 27 partitionId: 3, offset: 10 Scale In
  • 71.
    Example: Kafka SourceFlink 1.2 35 partitionId: 1, offset: 42 partitionId: 6, offset: 27 partitionId: 3, offset: 10 Scale In
  • 72.
    Non-keyed vs KeyedState 36 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 73.
    Repartitioning Keyed State ▪Split key space into key groups ▪ Every key falls into exactly one key group ▪ Assign key groups to tasks 37 Key space Key group #1 Key group #2 Key group #3Key group #4 One key
  • 74.
    Repartitioning Keyed State(ct’d) ▪ Rescaling changes key group assignment ▪ Maximum parallelism defined by #key groups 38
  • 75.
    Current State inFlink 1.2 ▪ Manual rescaling 1. Take savepoint 2. Restart job with adjusted parallelism and savepoint 39
  • 76.
    Next Steps beyondFlink 1.2 ▪ Rescaling individual operators w/o restart ▪ Refactor Flink deployment and process model (previously discussed) ▪ On-the-fly Scaling 40
  • 77.
    Autoscaling Policies 41 • Latency •Throughput • Resource utilization • Kubernetes on GCE, EC2 and Mesos (marathon- autoscale) already support auto-scaling
  • 78.
  • 79.
    Conclusion ▪ Many greatfeatures in Flink 1.2 ▪ Walkthrough ▪ Queryable State & Dynamic Scaling 42
  • 80.
    Conclusion ▪ Many greatfeatures in Flink 1.2 ▪ Walkthrough ▪ Queryable State & Dynamic Scaling ▪ Glimpse beyond the 1.2 release 42
  • 81.
  • 82.