Font Classification
with 5 Deep Learning Models Using TensorFlow
Alina Li Zhang
March 2019
TensorFlow User Group Toronto - Women in AI
What you can get from this presentation:
Build decent Deep Learning models
with a few lines of code in
TensorFlow.
SansSerif vs. Serif
Data Engineering
convert images to 36*36-pixel grayscale
- grayscale
- rgb
- rgba
add labels to dataset
- SansSerif 0
- Serif 1
split the dataset into two sets (train and validation) -> permute (see the sketch below)
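A minimal data-preparation sketch along these lines, assuming PIL/numpy and an illustrative folder layout (not the talk's exact code; the real pipeline is in the repo linked near the end):

import glob
import numpy as np
from PIL import Image

def load_images(pattern, label):
    # Convert each image to 36x36 grayscale, flatten to 1296 values in [0, 1]
    data, labels = [], []
    for path in glob.glob(pattern):
        img = Image.open(path).convert("L").resize((36, 36))
        data.append(np.asarray(img, dtype=np.float32).reshape(-1) / 255.0)
        labels.append(label)
    return np.array(data), np.array(labels)

# SansSerif -> 0, Serif -> 1, then one-hot encode for the softmax models
sans_x, sans_y = load_images("fonts/sans_serif/*.png", 0)
serif_x, serif_y = load_images("fonts/serif/*.png", 1)
X = np.concatenate([sans_x, serif_x])
Y = np.eye(2)[np.concatenate([sans_y, serif_y])]

# Permute, then split into train and validation sets
idx = np.random.permutation(len(X))
split = int(0.8 * len(X))
train_dataset, train_labels = X[idx[:split]], Y[idx[:split]]
valid_dataset, valid_labels = X[idx[split:]], Y[idx[split:]]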
5 Models
● Logistic regression
● Single hidden layer model
● Multiple hidden layer model
● Deep CNN with convolutional and pooling layer
● Deeper CNN with 2 conv and pooling layers
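The code on the following slides uses tf, np, math, and tqdm without showing imports or data loading; a minimal shared setup (an assumed setup, not shown in the deck; the exact one is in the repo linked near the end) would be:

import math
import numpy as np
import tensorflow as tf   # written against TensorFlow 1.x APIs
from tqdm import tqdm

# train_dataset, train_labels, valid_dataset, valid_labels come from the
# data engineering step: flattened 36x36 grayscale images and one-hot labels.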
Logistic regression model - build model
sess = tf.InteractiveSession()
# These will be inputs
## Input pixels, flattened
x = tf.placeholder("float", [None, 1296])
## Known labels
y_ = tf.placeholder("float", [None,2])
# Variables
W = tf.Variable(tf.zeros([1296,2]))
b = tf.Variable(tf.zeros([2]))
# Just initialize
sess.run(tf.global_variables_initializer())
# Define model
y = tf.nn.softmax(tf.matmul(x,W) + b)
### End model specification, begin training code
Logistic regression model - training
# Climb on cross-entropy
cross_entropy = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits_v2(
logits = y + 1e-50, labels = y_))
# How we train
train_step = tf.train.GradientDescentOptimizer(
0.02).minimize(cross_entropy)
…
# Actually train
epochs = 3000
train_acc = np.zeros(epochs//10)
test_acc = np.zeros(epochs//10)
for i in tqdm(range(epochs)):
    ...
    train_step.run(feed_dict={
        x: train_dataset,
        y_: train_labels})
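Later slides call accuracy.eval(...) without defining it; a standard definition consistent with this model (an assumption, not shown in the deck) is:

# Fraction of examples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))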
Logistic regression model - computed weights
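This slide showed the trained weights rendered as images; a sketch of how such a visualization could be produced (assuming matplotlib, and reusing W and sess from the code above):

import matplotlib.pyplot as plt

# One weight per pixel per class: reshape each column back to 36x36 to view it as an image
weights = sess.run(W)                  # shape (1296, 2)
fig, axes = plt.subplots(1, 2)
for cls, ax in enumerate(axes):
    ax.imshow(weights[:, cls].reshape(36, 36), cmap="gray")
    ax.set_title(["SansSerif", "Serif"][cls])
plt.show()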
Single hidden layer model
# Hidden layer
num_hidden = 128
W1 = tf.Variable(tf.truncated_normal([1296, num_hidden],
stddev=1./math.sqrt(1296)))
b1 = tf.Variable(tf.constant(0.1,shape=[num_hidden]))
h1 = tf.sigmoid(tf.matmul(x,W1) + b1)
# Output Layer
W2 = tf.Variable(tf.truncated_normal([num_hidden, 2],
stddev=1./math.sqrt(2)))
b2 = tf.Variable(tf.constant(0.1,shape=[2]))
# Just initialize
sess.run(tf.global_variables_initializer())
# Define model
y = tf.nn.softmax(tf.matmul(h1,W2) + b2)
### End model specification, begin training code
# Actually train
epochs = 20000
train_acc = np.zeros(epochs//10)
test_acc = np.zeros(epochs//10)
for i in tqdm(range(epochs), ascii=True):
    if i % 10 == 0:
        # Check accuracy on train set
        A = accuracy.eval(feed_dict={
            x: train_dataset,
            y_: train_labels})
        train_acc[i//10] = A
        # And now the validation set
        A = accuracy.eval(feed_dict={
            x: valid_dataset,
            y_: valid_labels})
        test_acc[i//10] = A
    train_step.run(feed_dict={
        x: train_dataset,
        y_: train_labels})
Single hidden layer model
The multiple hidden layer model
# Hidden layer 1
num_hidden1 = 256
W1 = tf.Variable(tf.truncated_normal([1296,num_hidden1],
stddev=1./math.sqrt(1296)))
b1 = tf.Variable(tf.constant(0.1,shape=[num_hidden1]))
h1 = tf.sigmoid(tf.matmul(x,W1) + b1)
# Hidden Layer 2
num_hidden2 = 64
W2 = tf.Variable(tf.truncated_normal([num_hidden1,
num_hidden2],stddev=2./math.sqrt(num_hidden1)))
b2 = tf.Variable(tf.constant(0.2,shape=[num_hidden2]))
h2 = tf.sigmoid(tf.matmul(h1,W2) + b2)
# Output Layer
W3 = tf.Variable(tf.truncated_normal([num_hidden2, 2],
stddev=1./math.sqrt(2)))
b3 = tf.Variable(tf.constant(0.1,shape=[2]))
# Just initialize
sess.run(tf.global_variables_initializer())
# Define model
y = tf.nn.softmax(tf.matmul(h2,W3) + b3)
### End model specification, begin training code
The multiple hidden layer model
Deep CNN with convolutional and pooling layer
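The convolution below reads from x_im, which never appears on the slides; presumably it is the flat 1296-pixel input reshaped back into a 36x36, single-channel image, e.g.:

# Reshape flattened pixels to [batch, height, width, channels] for conv2d
x_im = tf.reshape(x, [-1, 36, 36, 1])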
# Conv layer 1
num_filters = 4
winx = 5
winy = 5
W1 = tf.Variable(tf.truncated_normal(
[winx, winy, 1 , num_filters],
stddev=1./math.sqrt(winx*winy)))
b1 = tf.Variable(tf.constant(0.1,
shape=[num_filters]))
# 5x5 convolution, pad with zeros on edges
xw = tf.nn.conv2d(x_im, W1,
strides=[1, 1, 1, 1],
padding='SAME')
h1 = tf.nn.relu(xw + b1)
# 2x2 Max pooling, no padding on edges
p1 = tf.nn.max_pool(h1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='VALID')
# Need to flatten convolutional output for use in dense layer
p1_size = np.product(
[s.value for s in p1.get_shape()[1:]])
p1f = tf.reshape(p1, [-1, p1_size ])
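# With 36x36 inputs: SAME conv keeps 36x36, the 2x2 VALID pool gives 18x18, so p1_size = 18*18*4 = 1296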
# Dense layer
num_hidden = 32
W2 = tf.Variable(tf.truncated_normal(
[p1_size, num_hidden],
stddev=2./math.sqrt(p1_size)))
b2 = tf.Variable(tf.constant(0.2,
shape=[num_hidden]))
h2 = tf.nn.relu(tf.matmul(p1f,W2) + b2)
# Output Layer
W3 = tf.Variable(tf.truncated_normal(
[num_hidden, 2],
stddev=1./math.sqrt(num_hidden)))
b3 = tf.Variable(tf.constant(0.1,shape=[2]))
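The slide stops at the output-layer variables; presumably the model is completed the same way as the earlier ones, i.e. something like:

# Initialize and define the model (same pattern as the previous slides)
sess.run(tf.global_variables_initializer())
y = tf.nn.softmax(tf.matmul(h2, W3) + b3)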
Deep CNN with convolutional and pooling layer
Deeper CNN with 2 conv and pooling layers
# Conv layer 1
num_filters1 = 16
winx1 = 5
winy1 = 5
W1 = tf.Variable(tf.truncated_normal(
[winx1, winy1, 1 , num_filters1],
stddev=1./math.sqrt(winx1*winy1)))
b1 = tf.Variable(tf.constant(0.1,
shape=[num_filters1]))
# 5x5 convolution, pad with zeros on edges
xw = tf.nn.conv2d(x_im, W1,
strides=[1, 1, 1, 1],
padding='SAME')
h1 = tf.nn.relu(xw + b1)
# 2x2 Max pooling, no padding on edges
p1 = tf.nn.max_pool(h1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='VALID')
# Conv layer 2
num_filters2 = 4
winx2 = 3
winy2 = 3
W2 = tf.Variable(tf.truncated_normal(
[winx2, winy2, num_filters1, num_filters2],
stddev=1./math.sqrt(winx2*winy2)))
b2 = tf.Variable(tf.constant(0.1,
shape=[num_filters2]))
# 3x3 convolution, pad with zeros on edges
p1w2 = tf.nn.conv2d(p1, W2,
strides=[1, 1, 1, 1], padding='SAME')
h2 = tf.nn.relu(p1w2 + b2)   # new name so the first conv's h1 is not shadowed
# 2x2 Max pooling, no padding on edges
p2 = tf.nn.max_pool(h2, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='VALID')
Deeper CNN with 2 conv and pooling layers
# Need to flatten convolutional output
p2_size = np.product(
[s.value for s in p2.get_shape()[1:]])
p2f = tf.reshape(p2, [-1, p2_size ])
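# With 36x36 inputs: 36x36 -> 18x18 after pool 1 -> 9x9 after pool 2, so p2_size = 9*9*4 = 324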
# Dense layer
num_hidden = 32
W3 = tf.Variable(tf.truncated_normal(
[p2_size, num_hidden],
stddev=2./math.sqrt(p2_size)))
b3 = tf.Variable(tf.constant(0.2,
shape=[num_hidden]))
h3 = tf.nn.relu(tf.matmul(p2f,W3) + b3)
# Output Layer
W4 = tf.Variable(tf.truncated_normal(
[num_hidden, 2],
stddev=1./math.sqrt(num_hidden)))
b4 = tf.Variable(tf.constant(0.1,shape=[2]))
# Just initialize
sess.run(tf.global_variables_initializer())
# Define model
y = tf.nn.softmax(tf.matmul(h3,W4) + b4)
Deeper CNN with 2 conv and pooling layers
Why is accuracy decreasing?
source code: https://github.com/alinazhanguwo/fontClassification
Future Work
Expanding the source dataset (see the sketch below):
- introduce random noise
- flip images
- rotate images
- etc.
MORE DATA > FINE-TUNED ALGORITHM
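A minimal augmentation sketch along these lines, assuming the flattened 36x36 grayscale arrays from the data engineering step (numpy/scipy only, not the talk's code):

import numpy as np
from scipy.ndimage import rotate

def augment(flat_images, noise_std=0.05, max_angle=10):
    # Return noisy, horizontally flipped, and slightly rotated copies of 36x36 images
    imgs = flat_images.reshape(-1, 36, 36)
    noisy = np.clip(imgs + np.random.normal(0, noise_std, imgs.shape), 0.0, 1.0)
    flipped = imgs[:, :, ::-1]
    rotated = np.stack([
        rotate(im, np.random.uniform(-max_angle, max_angle),
               reshape=False, mode="nearest")
        for im in imgs])
    return np.concatenate([noisy, flipped, rotated]).reshape(-1, 1296)

# Labels are simply repeated for each augmented copy
aug_dataset = augment(train_dataset)
aug_labels = np.tile(train_labels, (3, 1))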
Summary - Model Evolution
Q & A

Editor's Notes

  • #8 Assign a weight to each pixel in the image, then take the weighted sum of those pixels (beta for the weights and X for the pixels). This gives us a score for that image being a particular font. Every font has its own set of weights, as fonts value pixels differently. To convert these scores into proper probabilities (represented by Y), we use the softmax function, which forces every output to lie between 0 and 1 and all outputs to sum to 1.
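    In the slides' notation (x for the flattened pixels, W for the weights, b for the bias), the model is simply $y = \mathrm{softmax}(xW + b)$, where $\mathrm{softmax}(z)_i = e^{z_i} / \sum_j e^{z_j}$.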
  • #9 Optimizing our model really means minimizing how wrong we are. With our labels in one-hot style, it is easy to compare them with the class probabilities predicted by the model. The categorical cross-entropy function is a formal way to measure this. While the exact statistics are beyond the scope of this talk, you can think of it as penalizing the model more for less accurate predictions. To compute it, we multiply our one-hot true labels element-wise with the natural log of the predicted probabilities, then sum these values and negate them.
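    Written out for a single example, with one-hot labels $t$ (y_ in the code) and predicted probabilities $p$ (y in the code): $H(t, p) = -\sum_i t_i \log p_i$; tf.reduce_mean then averages this over the batch.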
  • #10 After some small steps with basic computations, we successfully build a decent model with just logistic regression and a few lines of TensorFlow code.
  • #11 First, let's specify how many neurons we want with num_hidden = 128; this is essentially how many nonlinear combinations will get passed to the logistic regression at the end. To accommodate this, we also need to update the shapes of the W1 and b1 weight tensors. They are now feeding into our hidden neurons, so they need to match that shape: W1 = tf.Variable(tf.truncated_normal([1296, num_hidden], stddev=1./math.sqrt(1296))) b1 = tf.Variable(tf.constant(0.1,shape=[num_hidden])) The activation of the weighted sum is computed with the single h1 line, which multiplies the input pixels by their respective weights for each neuron, adds the neuron bias term, and puts the result through the sigmoid activation function; at this point, we have 128 intermediate values: h1 = tf.sigmoid(tf.matmul(x,W1) + b1) Now it's just your friendly logistic regression again; you already know what to do. These newly computed 128 features need their own set of weights and biases to compute a score on the output classes; that's W2 and b2, respectively. Note how the shape matches the 128 hidden neurons and the 2 output classes: W2 = tf.Variable(tf.truncated_normal([num_hidden, 2], stddev=1./math.sqrt(2))) b2 = tf.Variable(tf.constant(0.1,shape=[2])) sess.run(tf.global_variables_initializer()) We initialize all these weights with this strange truncated_normal call. With neural networks, we want a good spread of initial values so our weights can climb to meaningful values rather than just getting zeroed out. truncated_normal draws random values from a normal distribution with the given standard deviation, scaled to the number of inputs as is standard practice, but throws out values that are too extreme, hence the "truncated" part. With our weights and neurons all defined, we set up the final softmax model just as before, except we take care to use our 128 neurons, h1, as the input, along with the associated weights and biases, W2 and b2: y = tf.nn.softmax(tf.matmul(h1,W2) + b2)