© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Java to deploy Deep Learning models with MXNet
Andrew Ayres
Amazon Web Services
Qing Lan
Amazon Web Services
Agenda
● Introduction to Deep Learning
● Apache MXNet
● Getting started with MXNet Java
● Technical Challenges
● Q&A

By the end of this session, you will understand what deep learning is and how you can start using the MXNet Java API today!
AI ⊃ Machine Learning ⊃ Deep Learning: deep learning is a subfield of machine learning, which is itself a subfield of AI.

“Can machines do what we can?” (Turing, 1950)

Traditional programming: Data + Rules → Answers
Machine learning: Data + Answers → Rules
Deep learning is a big deal
It has a growing impact on our lives: personalization, logistics, voice, and autonomous vehicles.
Deep learning is a big deal
It is now able to outperform humans at certain tasks.
Artificial neural network
Inspired by the brain’s neurons: we have ~100B of them, and ~1Q synapses.
An artificial neuron is a simple computational construct: it multiplies each input x_j by a learned weight w_j, sums the products, and passes the result through an activation function φ:

y = φ( Σ_{j=1}^{n} w_j x_j )
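To make the formula concrete, here is a minimal plain-Java sketch of a single neuron. This is an illustration only, not MXNet code, and the sigmoid is just one common choice for the activation φ:

// One artificial neuron: y = phi( sum_j w_j * x_j ), with sigmoid phi
public class Neuron {
    private final double[] weights; // w_1 ... w_n, learned during training

    public Neuron(double[] weights) { this.weights = weights; }

    public double forward(double[] inputs) {
        double sum = 0.0;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * inputs[j]; // weighted sum
        }
        return 1.0 / (1.0 + Math.exp(-sum)); // sigmoid activation
    }
}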
Neural network
A network stacks an input layer, one or more hidden layers (often many more), and an output layer.
• Non-linear
• Hierarchical feature learning
• Scalable architecture
• Computationally intensive
Fully connected layer
Networks are (mostly) composed of stacked layers.
A fully connected layer is composed of artificial neurons, each connected to every neuron in the previous layer and the next.
The connections have associated “weights”, learned during training.
Convolution layer
The operator “slides” a kernel across the input, computing a value at each position in the output tensor.
The kernel values are the learned parameters of the layer.
Convolution layers are effective in perceptual problems involving visual data.
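As an illustration of the sliding-kernel idea, here is a naive 2D convolution in plain Java (no padding, stride 1; real frameworks such as MXNet use heavily optimized implementations):

// Naive 2D convolution in plain Java (illustration only)
public static double[][] conv2d(double[][] input, double[][] kernel) {
    int kh = kernel.length, kw = kernel[0].length;
    int oh = input.length - kh + 1;    // output height
    int ow = input[0].length - kw + 1; // output width
    double[][] output = new double[oh][ow];
    for (int i = 0; i < oh; i++) {
        for (int j = 0; j < ow; j++) {
            double sum = 0.0;
            for (int ki = 0; ki < kh; ki++) {      // slide the kernel:
                for (int kj = 0; kj < kw; kj++) {  // multiply-accumulate at each offset
                    sum += input[i + ki][j + kj] * kernel[ki][kj];
                }
            }
            output[i][j] = sum; // one value per position in the output tensor
        }
    }
    return output;
}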
Convolution layer
When combined across multiple layers, convolution is able to learn hierarchical features with increasing levels of abstraction:
Layer 1 – edges, Layer 2 – curves, …, Layer N – classes.
Recurrent neural network layer
In some problems, predicting a given point in time depends on input context from an earlier time.
An RNN layer “remembers the past” – using loops!
RNNs are effective for time series and natural language problems.
Recurrent neural network layer
We can “unroll” an RNN layer through time to understand how it works: the same cell A is applied at every step t, consuming input x_t and the previous hidden state to produce output h_t.
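In the common vanilla formulation (an assumption here, since the slide only shows the diagram), each unrolled step computes

h_t = φ( W_x x_t + W_h h_{t-1} + b )

where the weights W_x, W_h and bias b are shared across all time steps – that sharing is exactly what the loop expresses.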
Training neural networks
Forward pass: input data flows through the neural network to produce an output, which is compared against the expected answer to compute a loss.
Backward pass: the loss is back-propagated through the network, and the weights are updated.
The forward-backward cycle repeats across multiple epochs; each epoch goes through the entire training dataset.
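To see the cycle end to end, here is a minimal, self-contained sketch in plain Java: gradient descent on a single sigmoid neuron learning AND (illustration only; a framework like MXNet computes the gradients for you):

// Minimal illustration (not MXNet): train one sigmoid neuron to compute AND
public class TrainingLoopSketch {
    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {1, 1}}; // toy dataset
        double[] labels = {0, 0, 0, 1};
        double[] w = {0.1, -0.1};
        double b = 0.0;
        double lr = 0.5; // learning rate

        for (int epoch = 0; epoch < 1000; epoch++) {       // repeat across epochs
            for (int i = 0; i < data.length; i++) {        // entire dataset each epoch
                // Forward pass: compute the output
                double z = w[0] * data[i][0] + w[1] * data[i][1] + b;
                double y = 1.0 / (1.0 + Math.exp(-z));     // sigmoid output
                // Backward pass: gradient of squared loss through the sigmoid
                double grad = (y - labels[i]) * y * (1 - y);
                // Update weights
                w[0] -= lr * grad * data[i][0];
                w[1] -= lr * grad * data[i][1];
                b -= lr * grad;
            }
        }
        System.out.printf("learned: w=[%.2f, %.2f], b=%.2f%n", w[0], w[1], b);
    }
}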
Apache MXNet
Apache MXNet - Background
● Framework for building, training, and deploying Deep Neural Nets
● Apache (incubating) open source project
● Created by academia (CMU and UW)
● Adopted by AWS as DNN framework of choice, Nov 2016
http://mxnet.apache.org
Apache MXNet - Highlights
Programmability
• Imperative, Symbolic, and Dynamic APIs
• Supports 8 languages, including 3 JVM languages (Scala, Java, Clojure)
Performance
• Optimized for CPU, GPU, ARM (and more)
• Highly scalable distributed training
• Quantization, Sparse, NCCL, and more…
MXNet ecosystem
• ONNX
• Toolkits to quickly get started: Gluon-CV, Gluon-NLP, Model Zoo
• Model Server
Why MXNet Java?
● Community interest in deploying models to Java
● Enable Java developers to use DL in their existing Java workflows without having to become experts
Blog post: https://medium.com/apache-mxnet/introducing-java-apis-for-deep-learning-inference-with-apache-mxnet-8406a698fa5a
Getting started with MXNet Java
● Step 1: Install Java 8
● Step 2: Try the MXNet Java Demo project
● Linux and macOS are supported
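For reference, the library comes in as a Maven dependency along these lines (the exact artifact name and version depend on your platform and the current release, so treat these coordinates as an example and check the MXNet website for the up-to-date ones):

<dependency>
    <groupId>org.apache.mxnet</groupId>
    <artifactId>mxnet-full_2.11-linux-x86_64-cpu</artifactId>
    <version>1.4.0</version>
</dependency>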
MXNet Java Inference Demo – Inference API
Single Shot Object Detection
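A condensed sketch of the demo’s SSD inference path with the Java Inference API is shown below. The model and image paths are placeholders, and details such as the 512x512 NCHW input shape follow the demo project as we recall it; check the demo repository for the authoritative version:

import org.apache.mxnet.infer.javaapi.ObjectDetector;
import org.apache.mxnet.infer.javaapi.ObjectDetectorOutput;
import org.apache.mxnet.javaapi.Context;
import org.apache.mxnet.javaapi.DType;
import org.apache.mxnet.javaapi.DataDesc;
import org.apache.mxnet.javaapi.Shape;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

public class SSDSketch {
    public static void main(String[] args) {
        List<Context> context = new ArrayList<>();
        context.add(Context.cpu());

        // The SSD demo model expects a 1x3x512x512 float input in NCHW layout
        Shape inputShape = new Shape(new int[]{1, 3, 512, 512});
        List<DataDesc> inputDescriptors = new ArrayList<>();
        inputDescriptors.add(new DataDesc("data", inputShape, DType.Float32(), "NCHW"));

        // "model/ssd_resnet50_512" is a placeholder prefix for the .json/.params files
        ObjectDetector objDet =
            new ObjectDetector("model/ssd_resnet50_512", inputDescriptors, context, 0);

        BufferedImage img = ObjectDetector.loadImageFromFile("images/dog.jpg");
        List<List<ObjectDetectorOutput>> output = objDet.imageObjectDetect(img, 3); // top 3

        for (ObjectDetectorOutput det : output.get(0)) {
            System.out.println(det.getClassName() + ": " + det.getProbability());
        }
    }
}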
MXNet Java API: Architecture
Frontend: language bindings for Python, Scala, Java, Clojure, Julia, Perl, and R.
Backend: a C API bridging to the high-performance C++ engine.
The Java (and Clojure) bindings are built on top of the Scala package, while keeping the high performance of the efficient backend.
Java Code Generation
Scenario
● 200+ math functions (operators), optimized for large datasets, are implemented in the C backend, and we want to expose them in Java
● Their names and arguments are not provided at compile time – they have to be discovered from the backend
● We don’t want to hand-write all of that code!
Java Code Generation
Solution
● Fetch all argument information from the C backend
● Map the C arguments to Java types
● Generate the Java functions automatically
● Remaining question: how to deal with operators that take many arguments?
Java Code Generation
Solution
● Operators with few arguments: call directly
// public static NDArray[] broadcast_mul(NDArray lhs, NDArray rhs, NDArray out)
NDArray[] result = NDArray.broadcast_mul(lhs, rhs, out);
Java Code Generation
Solution
● Operators with more arguments: use a parameter object, e.g. NDArray.RNN(RNNParam obj)
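As a sketch of the pattern (the constructor and setter names below are illustrative assumptions, not the exact generated signatures): required arguments go into the parameter object’s constructor, optional arguments are set afterwards, and the object is passed to the operator:

// Hypothetical illustration of the parameter-object pattern
RNNParam param = new RNNParam(data, parameters, state, stateSize, numLayers, mode);
param.setP(0.5);               // optional argument: dropout probability
param.setBidirectional(true);  // optional argument
NDArray[] out = NDArray.RNN(param);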
Java Memory Management
Scenario
● Some MXNet Java objects allocate off-heap (native) memory
● The Java GC does not release off-heap memory
● A couple of mechanisms are provided to prevent memory leaks
Java Memory Management
Mechanisms
Java Memory Management
PhantomReference
● MXNet objects (NDArray, Executor, Module) are tracked using phantom references
● When the Java objects lose all references, they are put into a reference queue
● We take advantage of this to do a pre-mortem cleanup, freeing the native memory corresponding to these objects
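The general shape of that pattern in plain Java looks something like the sketch below. This is a generic illustration, not MXNet’s actual internals; note that the references themselves must be kept strongly reachable (e.g. in a set), or they would be collected before they are enqueued:

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Tracks a native handle so it can be freed once the owning object is unreachable
class NativeResourceRef extends PhantomReference<Object> {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    static final Set<NativeResourceRef> LIVE = ConcurrentHashMap.newKeySet();

    private final long nativeHandle; // stored here: the referent is gone by cleanup time

    NativeResourceRef(Object owner, long nativeHandle) {
        super(owner, QUEUE);
        this.nativeHandle = nativeHandle;
        LIVE.add(this); // keep the reference itself strongly reachable
    }

    void freeNativeMemory() {
        // a real implementation would call into JNI here, e.g. nativeFree(nativeHandle)
        System.out.println("freeing native handle " + nativeHandle);
        LIVE.remove(this);
    }

    // Daemon thread: blocks until the GC enqueues a reference, then cleans up
    static void startCleanupThread() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    ((NativeResourceRef) QUEUE.remove()).freeNativeMemory();
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }
}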
Java Memory Management
ResourceScope
● ResourceScope implements AutoCloseable
● Native memory allocated within the scope is tracked and freed when the scope is closed
try (ResourceScope scope = new ResourceScope()) {
    NDArray test = NDArray.ones(new Shape(new int[]{2, 2}));
}
Java Memory Management
Dispose method
● You can also manually dispose of the native memory – just like hammering a nail into a board
● ...but you can hammer your finger too!
NDArray test = NDArray.ones(new Shape(new int[]{2, 2}));
test.dispose();
Contribute to Apache MXNet
● GitHub: https://github.com/apache/incubator-mxnet
● Subscribe to our developer mailing list:
dev@mxnet.incubator.apache.org
● Slack Channel: https://the-asf.slack.com and go to #mxnet
Thank you!
Andrew Ayres
Amazon Web Services
Qing Lan
Amazon Web Services
Q & A Session
