Experiment Management for Machine Learning
An average data scientist (ML practitioner, AI expert) spends a significant amount of time designing and running machine learning experiments (and waiting for them to complete).
This involves one or more of the following:
– trying out various training algorithms
– doing some feature engineering
– changing preprocessing steps to get more homogeneous data
– trying out different hyperparameter settings
– testing models on different datasets
Creating and running experiments involves a lot, but the only thing we seem equipped to keep track of is the source code of the best-performing experiments, not the other configuration parameters that actually constitute a machine learning experiment.
Because of this, we quite frequently hear phrases like:
“It was working yesterday” – a common reproducibility issue
“I don’t remember what the actual scores are, but using feature X didn’t help” – documentation issue
“I fixed a bug, but I ran so many previous experiments with that bug” – code dependency issue
“I am using the same parameters as experiment 4, why is it not working?” – reproducibility and documentation issue
In this talk I will go through the typical process that machine learning practitioners and data scientists follow, using Python and scikit-learn as a use case, and the recurring issues that we are starting to see with these processes.
I will describe best practices for documenting experiments so that they can be reproduced, and the tools and startups working in this space to fix the gaping issues in machine learning experiment management.
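To make the idea of documenting an experiment concrete, here is a minimal sketch in Python. It is not the method presented in the talk, only one possible illustration using the standard library. The function names (log_experiment, load_experiments, file_sha256), the record fields, and the JSON Lines log file are all assumptions made for this example. The params argument is any dictionary of settings; with scikit-learn it could be the result of an estimator's get_params().

```python
import hashlib
import json
import platform
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, so the exact dataset can be identified later."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_experiment(log_path, name, params, dataset_path, metrics, notes=""):
    """Append one experiment record as a JSON line and return the record.

    All field names here are hypothetical; adapt them to your own workflow.
    """
    record = {
        "name": name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "params": params,
        "dataset_sha256": file_sha256(dataset_path),
        "metrics": metrics,
        "notes": notes,
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        # default=str turns values that JSON cannot encode into their string form.
        handle.write(json.dumps(record, sort_keys=True, default=str) + "\n")
    return record


def load_experiments(log_path):
    """Read every logged experiment back as a list of dictionaries."""
    with open(log_path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Tiny demonstration with a throwaway dataset and made-up numbers.
    with tempfile.TemporaryDirectory() as workdir:
        dataset = Path(workdir) / "train.csv"
        dataset.write_text("feature_x,label\n0.1,0\n0.9,1\n", encoding="utf-8")
        log_file = Path(workdir) / "experiments.jsonl"

        log_experiment(log_file, "experiment-4", {"C": 1.0, "penalty": "l2"},
                       dataset, {"accuracy": 0.5}, notes="baseline")
        for entry in load_experiments(log_file):
            print(entry["name"], entry["params"], entry["metrics"])
```

Storing the parameters, a dataset fingerprint, and the scores next to each other is what lets a question like "I am using the same parameters as experiment 4, why is it not working?" be answered by comparing two records instead of relying on memory. Recording the code version (for example a version-control commit identifier) would be a natural extension, left out here to keep the sketch self-contained.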
Dr. Rutu Mulkar is the founder of Hunchera, and previously the founder of Ticary Solutions (acquired by Sigmoidal). She received her Ph.D. in Natural Language Processing from USC and contributed to IBM’s Watson system, which defeated humans on Jeopardy!
She is interested in solving problems related to Natural Language Processing, including Topic Modeling, Recommender Systems, Information Extraction, Semantics, and Search, and in applying them to domains such as SEO and healthcare.