Description
Freelance data science consultant with 4 years of hands-on experience in
machine learning, deep learning and predictive analytics.
- Experienced in Statistical Analysis, Visualization, Hypothesis Testing, Feature Engineering, Hyperparameter Tuning, Model Analysis and Evaluation.
- Sound knowledge of ML and DL algorithms: Regression, Neural Networks, CNNs, RNNs, Decision Trees, Random Forest, Gradient Boosting, XGBoost, AdaBoost.
- Solid 3 years of hands-on experience with Pandas, NumPy, SciPy, Matplotlib, Seaborn, Plotly, scikit-learn, OpenCV, TensorFlow and the Keras framework.
- Experienced in deployment tools: Flask and gunicorn.
- More than 3 years of experience in SQL scripting on SQL Server and MySQL: data extraction, manipulation, pipelining, automation and query tuning.
- Experienced in version control and remote repository managers such as GitLab and GitHub.
Projects Summary
1. Project: Skin Analysis App
- Duration: November 2020 – Present
- Role: Python Developer
- Environment: TensorFlow 2.x, OpenCV, Dlib, Python, pandas, scikit-learn, gunicorn
- Outline: Developed, tested and deployed multiple ML classification models for skin tone mapping and skin issue prediction from images. Developed image processing algorithms to compensate for poor lighting conditions in images.
2. Project: Web Scraping of Microsoft-affiliated companies in the USA
- Duration: August 2020 – October 2020
- Environment: Beautiful Soup, Selenium (Python), pandas
- Outline: Developed multiple scripts to scrape and map Microsoft-affiliated companies across the USA.
3. Project: Supervised Machine Learning Prediction for Customer Attrition
- Duration: November 2016 – April 2017
- Role: Solution Integrator, Ericsson
- Environment: Python, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn (Logistic Regression, Decision Tree-based Learning)
- Outline: The data set, collated from the operator, contained a mix of categorical and numerical attributes of postpaid customers. The goal was to predict whether customers switched away or not.
- Performed exploratory data analysis (feature distribution graphs, visual analysis), data cleaning, missing value imputation and deduplication using NumPy, Matplotlib, Seaborn, Plotly and categorization libraries. Generated correlation heatmaps and used PCA to combine features showing high multicollinearity.
- Applied label encoding, standard scaling (StandardScaler) and min-max scaling (MinMaxScaler).
- Prepared a baseline model using Logistic Regression with default hyperparameters.
- Applied feature selection techniques (chi-squared test, RFE, SelectKBest, Decision Trees) to select the most useful features.
- Evaluated oversampling models built using SMOTE + Logistic Regression and Recursive Feature Elimination + Logistic Regression.
- Evaluated decision-tree-based ML models (including hyperparameter tuning): AdaBoost, Random Forest, Gradient Boosting, XGBoost and CatBoost, and generated performance metrics for each.
- Deployed the model to the development environment and then to the production cluster using pickle, Flask and gunicorn.