June 25, 2025
Day 22 – GOOD DAY, GREAT DAY, WEDNESDAY
What I Learned
Today, I started the day by reviewing the performance metrics for the model I left training yesterday. Even though the accuracy was ~98%, I wasn't happy with the results: an accuracy that high is usually a sign of overfitting. So I browsed a few ensemble codebases on GitHub and, based on those, tweaked my custom CNN layers slightly. I added a few extra layers, batch-norm and ReLU, and bumped the dropout rate up a bit to curb the overfitting. Then I left the model training again.
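For my own reference, here is a minimal PyTorch sketch of the kind of block I mean (conv → batch-norm → ReLU, plus heavier dropout). The channel sizes and the exact dropout rate are illustrative placeholders, not my actual architecture:

```python
import torch
import torch.nn as nn

# Illustrative conv block: the batch-norm, ReLU, and dropout layers are
# the additions; 64 -> 128 channels and p=0.5 are placeholder values.
block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),  # convolution
    nn.BatchNorm2d(128),                           # batch-norm I added
    nn.ReLU(inplace=True),                         # ReLU activation
    nn.Dropout2d(p=0.5),                           # increased dropout
)

x = torch.randn(8, 64, 32, 32)  # dummy batch: (N, C, H, W)
print(block(x).shape)           # torch.Size([8, 128, 32, 32])
```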
While the model was training, I watched some videos on Optuna. I am almost done setting it up in my codebase; I just need to let the current model finish training, and then I can run the next training instance with Optuna. One piece of good news is that I got GPU access from my research lab back at Mississippi State University. We have a couple of NVIDIA A100s and 4090s there, and I got access to one of the machines, so training will take significantly less time than it currently does. I will only train the model at night, though, as I don't want to disturb the ongoing work my research team is doing on that machine.
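This is roughly what my Optuna setup will look like. The hyperparameter names and search ranges are placeholders I'm still deciding on, and the toy score here just stands in for the validation accuracy my real training loop will return:

```python
import optuna

def objective(trial):
    # Search spaces I plan to explore -- names and ranges are illustrative.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    dropout = trial.suggest_float("dropout", 0.2, 0.6)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64])

    # Placeholder score so this sketch runs end to end; in practice the
    # suggested values would feed a short training run and this would
    # return validation accuracy.
    return 1.0 - abs(lr - 1e-3) - abs(dropout - 0.4)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```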
Blockers
No issues faced.
Reflection
Today, I am quite happy, as I finally got access to a powerful GPU after weeks of trying. I was about to buy some cloud GPU time, but thankfully I got access to one for free. With this GPU, I could train my model in under an hour, maybe even half that, compared to the 10–12 hours the CBEIS lab PC currently takes. I am also excited to see Optuna work its magic on my model. Hyperparameter tuning is the only major job left for me at this point, so I hope Optuna performs well.