EVENT STREAM PROCESSING
with
MULTIPLE THREADS

Sylvain Hallé
Raphaël Khoury
Sébastien Gaboury
Université du Québec à Chicoutimi
The Problem

[Diagram: input event stream → P (computation) → output event stream]

1. Find generic ways to split P into parts that can be processed in parallel
2. Input events arrive one by one
3. Output events are produced one by one
The Problem

[Figure: the same computation, single-threaded vs. multi-threaded]
The System

Based on the composition (piping) of simple computing units called processors
The System

[Diagram: a Fork splits the input stream; one branch passes through CountDecimate (keep every n-th event), and both branches feed the + (sum) processor]

Fork fork = new Fork(2);
FunctionProcessor sum =
  new FunctionProcessor(Addition.instance);
CountDecimate decimate = new CountDecimate(n);
Connector.connect(fork, LEFT, sum, LEFT)
  .connect(fork, RIGHT, decimate, INPUT)
  .connect(decimate, OUTPUT, sum, RIGHT);
Pullable p = sum.getOutputPullable(OUTPUT);
while (p.hasNext() != NextStatus.NO) {
  Object o = p.next();
  ...
}
The System

Available processors include:
- Function: applies a function f to every input event
- Cumulative: computes the progressive "sum" Σ.f of all input events
- Trim: removes the first n input events
- Decimate: outputs every n-th event
- Group: encloses a group of connected processors
- Fork: duplicates a stream into multiple copies
- Slice: splits a stream into multiple sub-streams
- Window: applies a function to a sliding window of n events
- Filter: a first, Boolean stream decides if each event of a second stream should be output
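To make the palette's semantics concrete, here is a minimal sketch of the Decimate processor described above. This is an illustration only, not BeepBeep's actual CountDecimate implementation; the single-method `push` interface is a simplification.

```java
// Minimal sketch of Decimate: output every n-th input event, discard the
// rest. Illustrative only, not BeepBeep's CountDecimate code.
class Decimate<T> {
  private final int n;
  private int count = 0;

  Decimate(int n) { this.n = n; }

  /** Returns the event when it is the 1st, (n+1)-th, (2n+1)-th, ...; else null. */
  T push(T e) {
    boolean keep = (count % n == 0);
    count++;
    return keep ? e : null;
  }
}
```

With n = 3, events a, b, c, d produce a, null, null, d: only every third event is let through.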
The System

[Figure: LTL palette — processors for temporal operators (X, G, F, U) and related constructs, built by composing the basic processors above]
The Problem

4. Existing processors should not be changed!

Solution: enclose them within new, thread-aware processors
Thread Manager

Responsible for a pool of N threads

Runnable r = ...
ManagedThread t = m.tryNewThread(r);
if (t != null) t.start();
else r.run();

tryNewThread returns a new thread that can be started, or null if all N threads are busy.
If t is null, the desired processing must be done in the current thread.
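The manager's contract can be sketched in plain Java. This is an illustrative implementation, not BeepBeep's actual thread manager; only the names ThreadManager and tryNewThread come from the slide, and ManagedThread is simplified to java.lang.Thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the thread manager's contract: hand out a thread
// from a fixed budget of N, or null when all N are busy.
class ThreadManager {
  private final int capacity;                          // N
  private final AtomicInteger active = new AtomicInteger(0);

  ThreadManager(int n) { this.capacity = n; }

  /** Returns a startable thread, or null if all N threads are busy. */
  Thread tryNewThread(Runnable r) {
    // Reserve a slot atomically; give it back if the pool is exhausted
    if (active.incrementAndGet() > capacity) {
      active.decrementAndGet();
      return null;
    }
    return new Thread(() -> {
      try { r.run(); }
      finally { active.decrementAndGet(); }            // free the slot
    });
  }
}
```

As on the slide, the caller falls back to the current thread when null is returned: `if (t != null) t.start(); else r.run();`.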
Sequential Pulling

[Timeline: each ψ.pull synchronously triggers a φ.pull; φ computes e1 and returns it, then ψ processes it; the next ψ.pull repeats the cycle for e2, and so on. Only one of φ and ψ is ever working at a time.]
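The sequential case above can be sketched as follows. The Pullable interface here is a simplification of BeepBeep's (a single pull() method); the source and doubler processors are hypothetical stand-ins for φ and ψ.

```java
import java.util.Iterator;
import java.util.List;

// Sketch of sequential pulling: every pull on the downstream processor ψ
// synchronously triggers a pull on its upstream source φ, so the two
// never compute at the same time.
class SequentialPulling {
  interface Pullable<T> { T pull(); }

  /** Upstream φ: produces events from a fixed list, one per pull. */
  static <T> Pullable<T> source(List<T> events) {
    Iterator<T> it = events.iterator();
    return () -> it.hasNext() ? it.next() : null;
  }

  /** Downstream ψ: each pull blocks until φ has computed its event. */
  static Pullable<Integer> doubler(Pullable<Integer> phi) {
    return () -> {
      Integer e = phi.pull();  // ψ idles here while φ computes e1, e2, ...
      return e == null ? null : 2 * e;
    };
  }
}
```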
Pre-emptive Pulling

[Timeline: a wrapper W repeatedly calls φ.pull in its own thread, buffering e1, e2, e3 ahead of time; each ψ.pull is answered immediately by a W.pull on the buffer, so φ keeps computing while ψ works.]
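The wrapper W can be sketched with a background thread and a bounded buffer. This is an illustration of the idea only, not BeepBeep's pre-emptive pull processor; the real one handles end-of-stream and interruption more carefully, and the constructor parameters here (capacity, count) are assumptions for the sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Sketch of pre-emptive pulling: the wrapper W keeps calling φ.pull in
// its own thread and buffers the results, so a downstream pull is served
// immediately whenever an event is already waiting.
class PreemptiveWrapper<T> {
  private final BlockingQueue<T> buffer;

  PreemptiveWrapper(Supplier<T> phiPull, int capacity, int count) {
    buffer = new ArrayBlockingQueue<>(capacity);
    Thread puller = new Thread(() -> {
      for (int i = 0; i < count; i++) {
        try { buffer.put(phiPull.get()); }   // pre-emptive φ.pull
        catch (InterruptedException e) { return; }
      }
    });
    puller.setDaemon(true);
    puller.start();
  }

  /** ψ's side: take the next pre-computed event (waits only when empty). */
  T pull() {
    try { return buffer.take(); }
    catch (InterruptedException e) { throw new IllegalStateException(e); }
  }
}
```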
Pull Pipeline

[Timeline: a pipeline S pushes incoming events e1, e2, e3 to separate copies of φ (φ1, φ2, ...), each running in its own thread; the outputs e'1, e'2, e'3 are collected in input order and returned to ψ's successive pull calls.]
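The ordering discipline of the pipeline can be sketched with a thread pool and a FIFO of futures: each event gets its own evaluation of φ, but results are dequeued in input order. Illustrative only, not BeepBeep's actual pipeline processor; the pool size of 4 is an arbitrary choice for the sketch.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch of the pull pipeline: each incoming event is handed to its own
// copy of φ, evaluated in a thread pool; results are dequeued in input
// order, so the consumer sees e'1, e'2, e'3 even if a later copy
// finishes first.
class PullPipeline<I, O> {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);
  private final Queue<Future<O>> inFlight = new ArrayDeque<>();
  private final Function<I, O> phi;

  PullPipeline(Function<I, O> phi) { this.phi = phi; }

  /** Push event e to a fresh copy of φ, evaluated asynchronously. */
  void push(I e) { inFlight.add(pool.submit(() -> phi.apply(e))); }

  /** Pull the oldest result, blocking until that copy is done. */
  O pull() {
    try { return inFlight.remove().get(); }
    catch (Exception e) { throw new IllegalStateException(e); }
  }

  void shutdown() { pool.shutdown(); }
}
```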
Blocking Push

[Timeline: in Window ψ, push(e) pushes event e to the n copies of φ in sequence (φ1.push(e), ..., φn-1.push(e)); the window then collects e', shifts, copies, and pushes e' downstream.]

Event e is pushed to the n copies of φ sequentially. Each call to push is blocking.
Non-blocking Push

[Timeline: push(e) launches φ0.push(e), φ1.push(e), ..., φn-1.push(e) concurrently; the window then waits for every copy (φ0.waitFor, ..., φn-1.waitFor), collects e', shifts, copies, and pushes e' downstream.]

Calls to push are non-blocking; each runs in a separate thread.
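The fork/join shape of the non-blocking push can be sketched in plain Java: launch one thread per copy of φ, then wait for all of them (the slide's φi.waitFor) before collecting. This is an illustration of the pattern, not BeepBeep's NonBlockingProcessor; NonBlockingWindow and pushAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Sketch of the window's non-blocking push: processing for the n copies
// of φ is launched concurrently (one thread per copy), then the window
// waits for all of them before collecting the outputs.
class NonBlockingWindow {
  /** Apply phi to each window element in its own thread; wait; collect. */
  static <T, R> List<R> pushAll(Function<T, R> phi, List<T> window) {
    int n = window.size();
    List<R> out = new ArrayList<>(Collections.nCopies(n, (R) null));
    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      final int idx = i;
      Thread t = new Thread(() -> out.set(idx, phi.apply(window.get(idx))));
      t.start();                        // non-blocking: returns immediately
      threads.add(t);
    }
    for (Thread t : threads) {          // wait for every copy (φi.waitFor)
      try { t.join(); }
      catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
    return out;
  }
}
```

Writes to distinct indices of `out` are made visible by the `join()` calls, so no extra synchronization is needed in this sketch.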
3 new thread-aware processors

- Non-blocking push (NP)
- Pull pipeline (PP)
- Pre-emptive pull (PE)

Implemented in a separate palette, independent from the rest of the code
Multi-threading in a Query

Single-threaded version:

FunctionProcessor sum = new FunctionProcessor(
  new CumulativeFunction(Addition.instance));
WindowProcessor win = new WindowProcessor(sum, 10);

Multi-threaded version (with a 4-thread manager):

ThreadManager m = new ThreadManager(4);
FunctionProcessor sum = new FunctionProcessor(
  new CumulativeFunction(Addition.instance));
NonBlockingProcessor nbp =
  new NonBlockingProcessor(sum, m);
WindowProcessor win = new WindowProcessor(nbp, 10);
Experimental Results

A single thread-aware processor inserted in the query

Benchmark (CRV 2016)          Speedup
Auction bidding               1.07
Candidate selection           4.44
Endless bashing               1.45
Spontaneous Pingu creation    1.05
Turn around                   1.86
Experimental Results

Turn around

[Query diagram: XPath streams /a/b, //character[status=Walker]/id/text() and //character[status=Blocker]/id/text() feed processors p1 and p2 and functions f1, f2; the multi-threaded version inserts thread-aware processors backed by a 4-thread manager M.]
Take-home Points
Multi-threading capabilities added to BeepBeep 3
Existing processors do not need to be aware of the
presence of threads
Minimally intrusive: less than 5 lines to add multi-threading to part of a query
5% - 400% speedup on a small sample of queries
Requires manual tuning and intuition
Not all queries can gain from parallelism!
https://liflab.github.io/beepbeep-3
THE END
...but for how long?

https://www.researchgate.net/publication/318337325
Event Stream Processing with Multiple Threads