Python Scripting: Train, Test, and Predict with ML models

Keshav Sarda
Jun 27, 2021

Created Python scripts for EDA, data cleaning, ML model training, and prediction.

Explored the data, applied feature engineering techniques, and trained models using Logistic Regression, Random Forest, and XGBoost. There are 3 separate training scripts, one per model, each producing a saved model file, an ROC curve, and a feature importance graph. A separate prediction script takes the saved model(s) and new unlabeled data and predicts classification probabilities.
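The train-then-predict flow described above can be sketched roughly as follows. This is a minimal illustration and not the code from the repository: it assumes scikit-learn and joblib, uses synthetic data in place of the project's dataset, and shows only a Random Forest. The file name `random_forest.joblib` and all variable names are hypothetical.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the labeled training data (assumption).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# --- Training script side ---
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Points of the ROC curve and the area under it, from held-out data.
test_proba = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, test_proba)
auc = roc_auc_score(y_test, test_proba)

# One importance score per input feature (the data behind an importance graph).
importances = model.feature_importances_

# Persist the fitted model so a separate script can load it later.
model_path = os.path.join(tempfile.mkdtemp(), "random_forest.joblib")
joblib.dump(model, model_path)

# --- Prediction script side ---
loaded = joblib.load(model_path)
X_new = X_test[:5]  # stands in for new unlabeled rows
new_proba = loaded.predict_proba(X_new)  # one column per class
```

The `fpr` and `tpr` arrays can be passed to any plotting library to draw the ROC curve, and `importances` to a bar chart for the feature importance graph.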

Preprocessed the data by joining on the appropriate key column. Explored and cleaned it by handling null and missing values. Applied PCA to reduce the feature dimension. Trained the 3 models with appropriate hyperparameters. Applied the inverse PCA transformation to the prediction data before using the model(s).
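A rough sketch of these preprocessing steps, assuming pandas and scikit-learn. The tables, the key column `id`, the median fill strategy, and the choice of 3 components are all hypothetical stand-ins. Note that this sketch reuses the PCA fitted on the training data to project new rows, which is one common way to keep training and prediction features consistent; the repository's scripts may handle the prediction data differently.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200

# Two hypothetical tables that share the key column "id".
features = pd.DataFrame(
    rng.normal(size=(n, 6)), columns=[f"f{i}" for i in range(6)]
)
features.insert(0, "id", np.arange(n))
features.loc[::25, "f2"] = np.nan  # inject some missing values
labels = pd.DataFrame(
    {
        "id": np.arange(n),
        "target": (features["f0"] + features["f1"] > 0).astype(int),
    }
)

# Join over the key column.
data = features.merge(labels, on="id", how="inner")

# Handle missing values; median fill is an assumption, not the author's choice.
feature_cols = [c for c in data.columns if c not in ("id", "target")]
medians = data[feature_cols].median()
data[feature_cols] = data[feature_cols].fillna(medians)

# Scale, reduce dimension with PCA, then fit a classifier.
pipeline = make_pipeline(
    StandardScaler(), PCA(n_components=3), LogisticRegression(max_iter=1000)
)
pipeline.fit(data[feature_cols], data["target"])

# New unlabeled rows go through the same fitted scaler and PCA.
new_data = pd.DataFrame(rng.normal(size=(5, 6)), columns=feature_cols)
new_data = new_data.fillna(medians)
proba = pipeline.predict_proba(new_data)[:, 1]
```

Wrapping the steps in a single pipeline object means the prediction script cannot accidentally skip or refit a transformation.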

Currently working on a single file that supplies the data, customizable input arguments, and the other requirements/resources needed to run these actions. Also looking to build a pipeline on the cloud.
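One possible shape for such a single entry point is a small command-line parser. This is only a sketch of the idea using the standard library's `argparse`; the argument names (`action`, `--data`, `--model`, `--output-dir`) and their defaults are hypothetical and do not come from the repository.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical train/predict entry point."""
    parser = argparse.ArgumentParser(
        description="Run a training or prediction action on a data file."
    )
    parser.add_argument("action", choices=["train", "predict"])
    parser.add_argument("--data", required=True, help="Path to the input data")
    parser.add_argument(
        "--model",
        choices=["logistic_regression", "random_forest", "xgboost"],
        default="random_forest",
    )
    parser.add_argument("--output-dir", default="outputs")
    return parser


# Example: parse a command line equivalent to
#   train --data train.csv --model xgboost
args = build_parser().parse_args(
    ["train", "--data", "train.csv", "--model", "xgboost"]
)
```

The parsed `args` could then be dispatched to the matching training or prediction routine.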

https://github.com/keshavs0305/scripting-training-and-prediction-tasks

Keshav Sarda

I am an enthusiastic Data Scientist with 2 years of experience in IT. I also have experience with cloud technology, along with a couple of AWS and GCP certifications.