ICML 2017 Overview
& Some Topics
September 18th, 2017
Tatsuya Shirakawa
ABEJA, Inc. (Researcher)
- Deep Learning
- Computer Vision
- Natural Language Processing
- Graph Convolution / Graph Embedding
- Mathematical Optimization
- https://github.com/TatsuyaShiraka
tech blog → http://tech-blog.abeja.asia/
(featured posts: Poincaré Embeddings, Graph Convolution)
We are hiring! → https://www.abeja.asia/recruit/
→ https://six.abejainc.com/
1. ICML Intro & Stats
2. Trends and Topics
Table of Contents
3
1. ICML Intro & Stats
2. Trends and Topics
Table of Contents
4
International Conference on Machine Learning
• Top ML Conference
• 434 oral presentations over 3 days
• 9 parallel tracks
• 1,629 submitted papers
• 4 invited talks
• 9 tutorial talks
• 9 (parallel) × 3 (sessions) × 3 (days) = 81 sessions in the main conference
ICML 2017 in Sydney
5
Demos
15
Schedule
16
8/6   Tutorial Session        9 tutorials (3 parallel)     max attendable: 1/3
8/7   Main Conference Day 1   27 sessions (9 parallel)     max attendable: 1/9
8/8   Main Conference Day 2   27 sessions (9 parallel)     max attendable: 1/9
8/9   Main Conference Day 3   27 sessions (9 parallel)     max attendable: 1/9
8/10  Workshop Day 1          11 sessions (11 parallel)    max attendable: 1/11
8/11  Workshop Day 2          11 sessions (11 parallel)    max attendable: 1/11
1. ICML Intro & Stats
2. Trends and Topics
Table of Contents
17
• Deep learning is still the biggest trend
• Autonomous vehicles
• Health care / computational biology
• Human interpretability and visualization
• Multitask learning for small data or hard tasks
• Reinforcement learning
• Imitation learning (inverse reinforcement learning)
• Language and speech processing
• GANs / CNNs / RNNs / LSTMs are default options
• RNNs and their variants
• Optimization
• Online learning / bandits
• Time series modeling
• Applications Session
Some Trends (highly biased)
18
• Gluon is a new deep learning wrapper framework that integrates
dynamic DL frameworks (Chainer, PyTorch) and static DL frameworks
(Keras, MXNet) to get the best of both worlds via "hybridize" (sketched below)
• Great resources, including many of the latest models

https://github.com/apache/incubator-mxnet/tree/master/example
• Looks easy to write
• Alex Smola was the presenter
• … not very fast yet?

[Tutorial] Distributed Deep Learning with MXNet Gluon
19
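A minimal sketch of the "hybridize" idea, assuming MXNet is installed; the layer sizes and input shape are illustrative only. You define the network imperatively (define-by-run, as in Chainer/PyTorch), then call hybridize() to compile it into a static graph (as in Keras/MXNet symbolic):

```python
# Minimal Gluon "hybridize" sketch (illustrative layer sizes and input).
from mxnet import gluon, nd

net = gluon.nn.HybridSequential()
with net.name_scope():
    net.add(gluon.nn.Dense(128, activation="relu"))
    net.add(gluon.nn.Dense(10))
net.initialize()

net.hybridize()  # compile the define-by-run network into a static graph
out = net(nd.random.uniform(shape=(4, 64)))  # first call triggers compilation
print(out.shape)  # (4, 10)
```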
http://www-bcf.usc.edu/~liu32/icml_tutorial.pdf
• RNNs work well
• + pretraining (combining other clinics’ data)
• + expert-defined features
• + new models for missing data
• CNNs work well on image data and have achieved super-human accuracy
• Some features of health care data:
• Small sample sizes
• Missing values
• Medical domain knowledge
• Interpretation
• Use gradient-boosted trees to mimic deep learning models (cool idea! sketched below)
• Hard to annotate, even for experts
• “Big Small Data”
• Limited data available to train age-specific or disease-specific models
[Tutorial] Deep Learning Models for Health Care:
Challenges and Solutions
20
Future Directions:
- Modeling heterogeneous data sources
- Model interpretation
- More complex output
“Interpretable Deep Models for ICU Outcome Prediction”, 2016
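The "mimic" idea above is a form of model distillation. A hedged sketch follows (the function names and data are illustrative, not the tutorial's code): train any deep model, then fit gradient-boosted trees on its soft predictions so the tree ensemble can be inspected.

```python
# Sketch of mimic learning: distill a deep model into gradient-boosted
# trees (illustrative; assumes scikit-learn and a trained "teacher").
from sklearn.ensemble import GradientBoostingRegressor

def distill_to_gbt(teacher_predict, X_train):
    """Fit an interpretable GBT on the teacher's soft predictions."""
    soft_labels = teacher_predict(X_train)  # e.g. predicted risk scores
    student = GradientBoostingRegressor(n_estimators=200, max_depth=3)
    student.fit(X_train, soft_labels)
    return student

# Usage: student = distill_to_gbt(deep_model.predict, X_train)
# The student's feature importances then hint at what the deep model uses.
```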
• Deep neural networks are “black boxes”
• Sensitivity analysis methods can be applied
• e.g., Grad-CAM (sketched below)
[Tutorial] Interpretable Machine Learning
21
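For reference, a minimal Grad-CAM sketch in PyTorch (illustrative; in practice load trained weights and real image preprocessing; the choice of layer is an assumption). The class-score gradient is global-average-pooled to weight the feature maps:

```python
# Minimal Grad-CAM sketch (illustrative model, input, and layer choice).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18().eval()  # plug in trained weights in practice
acts, grads = {}, {}

layer = model.layer4  # last conv block (illustrative choice)
layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
score = model(x)[0].max()        # score of the top class
score.backward()

w = grads["g"].mean(dim=(2, 3), keepdim=True)  # global-average-pool grads
cam = F.relu((w * acts["a"]).sum(dim=1))       # weighted sum of feature maps
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:],
                    mode="bilinear", align_corners=False)  # upsample
```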
• Generating periodic patterns with GANs
• Local/global/periodic latent vectors (sketched below)
“Learning Texture Manifolds with the Periodic Spatial
GAN”
22
[Figure: example textures with varying periodicity, generated from local, global, and periodic latent vectors]
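A sketch of how a PSGAN-style latent field can be assembled (numpy, with illustrative dimensions; in the paper the wave frequencies are learned, here they are random stand-ins). Local noise varies per location, a global vector is broadcast everywhere, and the periodic part is sinusoids of the spatial coordinates:

```python
# Assembling a PSGAN-style latent field (illustrative dimensions).
import numpy as np

H, W = 16, 16                 # spatial size of the latent field
d_l, d_g, d_p = 20, 40, 4     # local / global / periodic dims

local_part = np.random.randn(H, W, d_l)                        # per-location noise
global_part = np.tile(np.random.randn(1, 1, d_g), (H, W, 1))   # one vector everywhere

K = np.random.randn(d_p, 2)                  # stand-in for learned frequencies
phi = np.random.uniform(0, 2 * np.pi, d_p)   # phases
ii, jj = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
periodic_part = np.sin(ii[..., None] * K[:, 0] + jj[..., None] * K[:, 1] + phi)

z = np.concatenate([local_part, global_part, periodic_part], axis=-1)
print(z.shape)  # (16, 16, 64), fed to a fully convolutional generator
```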
• Sequence revision with generative/inference models
• Generative model P(x, y, z) = P(x, y|z) P(z)
• x: input sequence, y: goodness of x, z: latent variable
• Inference models P(z|x), P(y|z)
• Input x0 → infer z0 → search for a better z (higher goodness F(z)) → decode the revised x (sketched below)
“Sequence to Better Sequence: Continuous Revision of
Combinatorial Structures”
23
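A hedged sketch of that revise-in-latent-space loop (PyTorch; `encoder`, `decoder`, and `goodness` are stand-ins for trained models, not the paper's code):

```python
# Revise a sequence by gradient ascent on F(z) in latent space.
import torch

def revise(x0, encoder, decoder, goodness, steps=50, lr=0.1):
    z = encoder(x0).detach().requires_grad_(True)  # infer z0 from the input
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -goodness(z)   # ascend the expected-goodness surrogate F(z)
        loss.backward()
        opt.step()
    return decoder(z)         # decode the improved sequence
```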
• Generating a new step chart from a raw audio track
“Dance Dance Convolution”
24
• A new stochastic algorithm and theoretical analysis for
sum-of-norms (SON) clustering
• SON (2011)
• Assigns a center to each data point and applies a
regularizer that pulls the centers together (objective below)
• Convex problem!
“Clustering by Sum of Norms: Stochastic Incremental
Algorithm, Convergence and Cluster Recovery”
25
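For reference, the SON objective has the following form (reconstructed from the standard formulation; λ controls how strongly centers are fused):

```latex
% Sum-of-norms clustering: each point x_i gets its own center u_i;
% the second term fuses centers, and points sharing a center form a cluster.
\min_{u_1,\dots,u_n}\;
  \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \lambda \sum_{i<j} \lVert u_i - u_j \rVert_2
```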
Image Compression using Deep Learning
• VAE (rough reconstruction) + GAN (refinement)
• Faster than JPEG on a GPU, but takes several seconds on a CPU
“Real-Time Adaptive Image Compression”
26
• Subgoals
• Breaking the problem up into subgoals
• Learning sub-policies to achieve them
• StreetLearn
• Transfer Learning
• Progressive Neural Networks
• Distral: Robust Multitask Reinforcement Learning
[Invited Talk] “Towards Reinforcement Learning in the
Complex World” - Raia Hadsell (DeepMind)
27
• GAN generators are well approximated (with high probability) by a
discrete distribution supported on finitely many samples
• Sufficient sample size: Õ(p log(p/ε) / ε²)
• p = discriminator size (number of parameters), ε = approximation error
• “The birthday paradox” test (sketched below):
• Sample m images from the generator
• Check whether any are (near-)duplicates
• Estimate the support size from the collision rate
“Generalization and Equilibrium in Generative Adversarial Nets” 

& “Do GANs actually learn the distribution? Some theory and empirics”
28
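A sketch of the birthday-paradox test (illustrative; `generate` and `near_duplicate` stand in for the GAN sampler and a human or heuristic duplicate check):

```python
# Birthday-paradox support-size test: duplicates among m samples
# suggest the generator's support is only about m^2.
import itertools

def estimate_support(generate, near_duplicate, m=400):
    samples = [generate() for _ in range(m)]
    collisions = sum(
        near_duplicate(a, b) for a, b in itertools.combinations(samples, 2)
    )
    return m * m if collisions > 0 else None  # None: support likely >> m^2
```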
• Deterministic Rounding vs. Stochastic Rounding
• Theoretical explanation of why SGD with stochastic
rounding does not converge well
• Each update is too noisy (rounding sketched below)
• Won the Google Best Student Paper Award
“Towards a Deeper Understanding of Training
Quantized Networks”
29
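To make the comparison concrete, here is a numpy sketch of the two rounding modes (the grid step `delta` is illustrative). Stochastic rounding is unbiased in expectation, which is exactly why each individual update carries extra noise:

```python
# Deterministic vs. stochastic rounding to a fixed-point grid.
import numpy as np

def round_deterministic(w, delta=0.1):
    return delta * np.round(w / delta)

def round_stochastic(w, delta=0.1):
    scaled = w / delta
    floor = np.floor(scaled)
    # Round up with probability equal to the fractional part, so that
    # E[round_stochastic(w)] == w (unbiased, but each update is noisy).
    return delta * (floor + (np.random.rand(*w.shape) < scaled - floor))

w = np.array([0.04, 0.13, -0.07])
print(round_deterministic(w), round_stochastic(w))
```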
• RL produces much better sequences than log-likelihood-based training
• Why is RL so effective? (beam search issues?)
“Sequence-Level Training of Neural Models for Visual
Dialog”
30
• Google’s Expander, which enhances a broad range of tasks using graph structure
• smart reply, personal assistant
• image recognition
• An integrated framework for
• zero-shot/one-shot learning
• multi-modal learning
• semi-supervised learning
• multi-task learning
• “Neural Graph Machines”
• introduce graph regularization into DL (sketched below)
• adjacent nodes (data points) are constrained to have nearby vector representations
Neural Graph Learning
31
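A hedged sketch of a graph-regularized loss in the spirit of Neural Graph Machines (PyTorch; `model.hidden`, `model.head`, `edges`, and `alpha` are hypothetical stand-ins, not the paper's exact formulation):

```python
# Supervised loss plus a penalty pulling neighbors' representations together.
import torch
import torch.nn.functional as F

def ngm_loss(model, x, y, edges, alpha=0.1):
    h = model.hidden(x)        # hypothetical hidden representations
    logits = model.head(h)     # hypothetical classification head
    sup = F.cross_entropy(logits, y)
    u, v = edges               # index tensors of adjacent node pairs
    reg = ((h[u] - h[v]) ** 2).sum(dim=1).mean()  # neighbor closeness
    return sup + alpha * reg
```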
2019: ICML + CVPR!
2021: Asia/Pacific!
Future ICMLs
32
Any Questions?
