EDA Tips and Tricks: Building Better Data Pipelines

Kajanan Sangaralingam | Chief Data Scientist @mobilewalla5482
Anindya Datta | Chief Executive Officer @mobilewalla5482

Feature engineering is the process of creating and selecting the features used to train your ML models. Your base data must be turned into features and attributes before it is useful in predictive modeling. Proper feature selection drives your model's predictive accuracy, resilience and output scale. Building the features used to train and operate your ML models can take 70-80% of the overall time and effort. Exploratory data analysis (EDA) is one of the most challenging and important steps in this process. If you don't have a deep understanding of your data, you can waste resources creating sub-optimal features, which are among the biggest causes of low-quality models.
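As a rough illustration of the kind of check EDA involves before any features are built, here is a minimal sketch in plain Python. It does not use Anovos; the function name `profile_columns` and the toy records are hypothetical and exist only for this example. It reports, for each column, the share of missing values, the number of distinct values and the most common value.

```python
from collections import Counter


def profile_columns(rows):
    """Return per-column missing rate, distinct count and most common value.

    `rows` is a list of dicts; None and "" are both treated as missing
    (an assumption made for this sketch).
    """
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        counts = Counter(present)
        profile[col] = {
            "missing_rate": (total - len(present)) / total if total else 0.0,
            "distinct": len(counts),
            "top_value": counts.most_common(1)[0][0] if counts else None,
        }
    return profile


# Hypothetical toy records, invented for illustration only.
rows = [
    {"device_id": "a1", "country": "SG", "age": 34},
    {"device_id": "a2", "country": "SG", "age": None},
    {"device_id": "a3", "country": "US", "age": 51},
    {"device_id": "a4", "country": "", "age": 29},
]

for name, stats in profile_columns(rows).items():
    print(name, stats)
```

A column with a high missing rate, or an identifier-like column with as many distinct values as rows, is usually a poor candidate for a feature as-is, and spotting that early is the point of the exercise.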

This session will dig into the key activities in EDA and discuss new ideas and tools for improving the process and its outcomes. The session will run in three parts: a 20-minute presentation taking a deep dive into EDA, 15 minutes installing Anovos (open-source libraries supporting EDA), and the remainder of the session walking through a hands-on case study performing EDA with Anovos.