KJ Ha

ML4PCOMP Assignment 3

As a continuation of last week's assignment, I focused on optimizing the model rather than on producing a physical output. I took several approaches to improving the model, which I describe below. While capturing gestures, I found that the circle and w gestures were harder to capture than punch and flex, so I switched back to the punch and flex classes from the gesture recognition tutorial.


0. Initial model


Graph the loss:


Graph the loss again, skipping a bit of the start



1. Collect more data


Modified (with larger dataset)


I initially collected 10 gestures for each class, as suggested by the tutorial. The two graphs above show the loss for the vanilla model from the tutorial. As the graphs indicate, the result was good enough: the IMU classifier performed quite accurately.


Later, I tripled the size of the dataset. The graphs below show the loss of the modified model, which was trained on approximately 30 gestures per class.

2. Seed randomization


Vanilla

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

print(f"TensorFlow version = {tf.__version__}\n")

# Set a fixed random seed value, for reproducibility, this will allow us to get
# the same random numbers each time the notebook is run
SEED = 1337
np.random.seed(SEED)
tf.random.set_seed(SEED)



Modified (adding more steps in seed randomization)

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
import random as rn
import os

os.environ['PYTHONHASHSEED'] = '0'

print(f"TensorFlow version = {tf.__version__}\n")

# Set a fixed random seed value, for reproducibility, this will allow us to get
# the same random numbers each time the notebook is run
SEED = 2020

# Setting the seed for numpy-generated random numbers
np.random.seed(42)

# Setting the seed for Python random numbers
rn.seed(SEED)

# Setting the seed for Tensorflow random numbers
tf.random.set_seed(SEED)

# Force Tensorflow to use a single thread
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.compat.v1.set_random_seed(SEED)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)

tf.compat.v1.keras.backend.set_session(sess)



Graph the loss


Graph the loss again, skipping a bit of the start


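I've added more seeding steps for better reproducibility. The vanilla code seeds only NumPy and TensorFlow, so other sources of randomness, such as Python's built-in random module, remain unseeded. The modified code also seeds Python's random module, sets PYTHONHASHSEED, and restricts TensorFlow to a single thread. tf.compat.v1.set_random_seed gives random number generation in the TensorFlow backend a well-defined initial state.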


For further details, please see: https://www.tensorflow.org/api_docs/python/tf/compat/v1/set_random_seed
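The seeding steps above can be collected into a single helper. The following is a minimal sketch, not the notebook's actual code: the name seed_everything is hypothetical, TensorFlow 2.x is assumed, and NumPy and TensorFlow are seeded only if they happen to be installed. Note that setting PYTHONHASHSEED from inside a running interpreter does not change that interpreter's own hash randomization; it only affects child processes started afterwards.

```python
import os
import random


def seed_everything(seed):
    """Seed every random number generator this sketch knows about.

    Hypothetical helper, not part of the tutorial notebook.
    """
    # Only child processes see this value; the current interpreter
    # already read PYTHONHASHSEED when it started.
    os.environ["PYTHONHASHSEED"] = str(seed)

    # Python's built-in generator
    random.seed(seed)

    # NumPy and TensorFlow are optional here, so the sketch also runs
    # on a machine where they are not installed.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # assumes TensorFlow 2.x
    except ImportError:
        pass


# Re-seeding with the same value replays the same sequence.
seed_everything(2020)
first_run = [random.random() for _ in range(3)]
seed_everything(2020)
second_run = [random.random() for _ in range(3)]
print(first_run == second_run)
```

Using one seed value for every generator is a design choice in this sketch; the modified notebook code above uses 42 for NumPy and 2020 for the others, which is equally reproducible.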



3. Layer reconfiguration


Vanilla

# build the model and train it
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(50, activation='relu')) # relu is used for performance
model.add(tf.keras.layers.Dense(15, activation='relu'))
model.add(tf.keras.layers.Dense(NUM_GESTURES, activation='softmax')) # softmax is used, because we only expect one gesture to occur per input
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
history = model.fit(inputs_train, outputs_train, epochs=600, batch_size=1, validation_data=(inputs_validate, outputs_validate))



Modified (layer reconfigured)

# build the model and train it
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(6, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.add(tf.keras.layers.Dense(12, activation='relu'))
model.add(tf.keras.layers.Dense(NUM_GESTURES, activation='softmax'))

model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
history = model.fit(inputs_train, outputs_train, epochs=600, batch_size=1, validation_data=(inputs_validate, outputs_validate))



Graph the loss

Graph the loss again, skipping a bit of the start


I've added a few more layers, including tf.keras.layers.Flatten(). As I discovered last week, the combination of ReLU and sigmoid activations has given me the best results so far.
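To get a feel for how the two layer stacks compare in size, the trainable parameters can be counted by hand: a Dense layer with n inputs and m units has n*m weights plus m biases. The sketch below assumes an input of 714 values per gesture (119 samples x 6 IMU axes, as in the tutorial) and two gesture classes. Both numbers are assumptions and should be adjusted to your own capture. It also assumes the input is already a flat vector, so Flatten contributes no parameters and is left out of the count.

```python
def dense_params(n_inputs, layer_units):
    """Count weights and biases for a stack of Dense layers."""
    total = 0
    for units in layer_units:
        total += n_inputs * units + units  # weights + biases
        n_inputs = units  # this layer's output feeds the next layer
    return total


# Assumed values, not taken from the write-up itself
INPUT_SIZE = 119 * 6   # samples per gesture x IMU axes
NUM_GESTURES = 2       # punch and flex

vanilla = dense_params(INPUT_SIZE, [50, 15, NUM_GESTURES])
modified = dense_params(INPUT_SIZE, [6, 1, 12, NUM_GESTURES])

print(f"vanilla:  {vanilla} parameters")
print(f"modified: {modified} parameters")
```

Under these assumptions the modified stack is considerably smaller, and its single-unit sigmoid layer acts as a bottleneck: everything the later layers see passes through one number.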



4. Hyperparameter tuning


Final version

# build the model and train it
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(6, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.add(tf.keras.layers.Dense(12, activation='relu'))
model.add(tf.keras.layers.Dense(NUM_GESTURES, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['mae'])
history = model.fit(inputs_train, outputs_train, epochs=600, batch_size=1, validation_data=(inputs_validate, outputs_validate))



Graph the loss


Graph the loss again, skipping a bit of the start

Finally, I changed some hyperparameters. The major changes are using Adam as the optimizer and categorical crossentropy as the loss function. I also tried several other optimizers, including RMSprop and SGD; of these, Adam gave me the most stable results.
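The difference between the two loss functions can be seen on a single prediction. This is a small worked example in plain Python rather than the Keras implementation, and the target and prediction values are made up for illustration.

```python
import math


def mse(target, predicted):
    """Mean squared error over the output vector."""
    return sum((t - p) ** 2 for t, p in zip(target, predicted)) / len(target)


def categorical_crossentropy(target, predicted):
    """Cross-entropy for a one-hot target and a vector of probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


target = [1.0, 0.0]      # one-hot label, e.g. "punch"
confident = [0.9, 0.1]   # illustrative predictions, not measured outputs
unsure = [0.6, 0.4]
wrong = [0.1, 0.9]

for name, pred in [("confident", confident), ("unsure", unsure), ("wrong", wrong)]:
    print(f"{name:9s} mse={mse(target, pred):.3f} "
          f"cce={categorical_crossentropy(target, pred):.3f}")
```

For probability outputs the MSE of a single prediction cannot exceed 1, while the cross-entropy keeps growing as the probability assigned to the true class approaches zero, so a confidently wrong prediction is penalized much more heavily.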




Final comparison: vanilla vs. modified model


Graph the loss again, skipping a bit of the start




One thing I've learned over the course of my journey with machine learning is that a large part of developing a model still relies on trial and error. Faster computation is therefore highly desirable, since it returns experimental results sooner. For that purpose, I found the TinyML workshop really helpful.


I still doubt whether the dataset, partly because of its small size, is appropriate for this kind of optimization. Even so, it was worth digging deeper into the code to gain a more solid understanding of how each line contributes to the model.


© 2020 Kyungjoo Ha. All Rights Reserved