Projects

I work with data to extract insights from complex datasets and support data-informed decision-making. My projects range from predictive machine learning models in Python and R to SQL queries for data manipulation, along with interactive dashboards and visualizations built in Power BI and Tableau. Below is a selection of my projects, covering statistical analysis, predictive modeling, and data visualization across several industries and applications.

Multilingual Customer Support Ticket Classification Using SVM

This project classifies multilingual customer support tickets using Support Vector Machines (SVM) to predict ticket types.

Key Steps:

Technologies: Python, scikit-learn, pandas, imblearn, matplotlib, seaborn

Key Insight: SVMs can effectively classify customer support tickets, with performance varying by language due to data characteristics and linguistic complexities. Further improvements are needed to enhance cross-linguistic generalizability.
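As a rough illustration of this kind of classifier, the sketch below trains a linear SVM on TF-IDF character n-grams, which is one common way to handle text in several languages with a single model. The tickets, the labels, and the choice of character n-grams are assumptions made for the example; they are not the project's data or its exact pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy tickets in English, German, and Spanish (not project data).
tickets = [
    "I was charged twice on my invoice",
    "Refund the payment on my invoice please",
    "Die Rechnung wurde doppelt abgebucht",
    "Cobro duplicado en mi factura",
    "The app crashes when I log in",
    "Login error after the latest update",
    "Die App stürzt beim Login ab",
    "La aplicación falla al iniciar sesión",
]
labels = ["billing"] * 4 + ["technical"] * 4

# Character n-grams avoid language-specific tokenization;
# class_weight="balanced" reweights classes when ticket types are uneven.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(class_weight="balanced"),
)
model.fit(tickets, labels)

predictions = model.predict(["Invoice charged twice", "App crashes at login"])
print(list(predictions))
```

Evaluating such a model separately per language, rather than only in aggregate, is one way to surface the language-dependent performance noted above.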

E-commerce Sales Analysis

This project analyzes e-commerce sales data using SQL, Excel, and Power BI.

Key Steps:

Technologies: Power BI, SQL, Excel

Banknote Authentication Using PCA and KNN

This project authenticates banknotes using K-Nearest Neighbors (KNN) on features reduced with Principal Component Analysis (PCA), aiming for high accuracy in detecting counterfeits.

Key Steps:

Technologies: Python, scikit-learn (PCA, KNN), pandas, NumPy, Google Colab

Key Insight: The model effectively detects counterfeit banknotes with near-perfect recall, minimizing false negatives in fraud detection.
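The general shape of a PCA plus KNN pipeline can be sketched as follows. The data here is synthetic, generated with `make_classification` as a stand-in for the banknote features, and the number of components and neighbors are illustrative assumptions rather than the project's tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 4 numeric features, binary label (1 = counterfeit).
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=3, n_redundant=1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale first so PCA and the KNN distance are not dominated by one feature.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_train, y_train)

# Recall on the positive class: the share of counterfeits that are caught.
recall = recall_score(y_test, model.predict(X_test))
print(f"recall: {recall:.3f}")
```

Recall is the metric to watch when a missed counterfeit (a false negative) is the costly error.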

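Email Classification Using Multinomial Naive Bayes

This project classifies emails using a Multinomial Naive Bayes model to predict labels (e.g., spam or not spam).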

Key Steps:

Technologies: Python, pandas, NLTK, scikit-learn, Google Colab

Key Insight: Utilizes a probabilistic model to classify emails, with room for improvement in handling class imbalance and model accuracy.
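A minimal sketch of a Multinomial Naive Bayes text classifier is shown below. The six emails and their labels are made up for illustration, and the preprocessing uses scikit-learn's built-in stop-word list rather than NLTK, which is an assumption of this example and not necessarily what the project does.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy emails (not project data).
emails = [
    "Win a free prize now",
    "Claim your free cash reward today",
    "Limited offer, win money fast",
    "Meeting moved to Monday morning",
    "Please review the attached project report",
    "Lunch with the team on Friday",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Word counts feed the multinomial likelihood; alpha=1.0 is Laplace smoothing,
# which keeps unseen words from forcing a zero probability.
model = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB(alpha=1.0))
model.fit(emails, labels)

new_emails = ["Free prize money", "Project meeting on Monday"]
predicted = list(model.predict(new_emails))
probabilities = model.predict_proba(new_emails)
print(predicted)
```

Because the model estimates class priors from the training labels, an imbalanced dataset shifts predictions toward the majority class, which is one place where the class-imbalance handling mentioned above matters.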

Value at Risk (VaR) Analysis of AAPL Stock

This project analyzes the Value at Risk (VaR) of Apple Inc. (AAPL) stock using historical, parametric, and Monte Carlo methods.

Key Steps:

Technologies: R, quantmod, PerformanceAnalytics, ggplot2, dplyr

Key Insight: A comparative analysis of different risk estimation techniques to assess potential losses.
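The three approaches can be sketched side by side. The project itself is written in R; this example uses Python's standard library so that all examples on this page share one language. The daily returns are simulated rather than real AAPL data, and the function names, the 95% confidence level, and the normality assumption in the Monte Carlo step are choices made for this illustration.

```python
import random
import statistics


def historical_var(returns, confidence=0.95):
    """Loss at the (1 - confidence) empirical quantile of the returns."""
    ordered = sorted(returns)
    index = int((1 - confidence) * len(ordered))
    return -ordered[index]


def parametric_var(returns, confidence=0.95):
    """Variance-covariance VaR, assuming normally distributed returns."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    z = statistics.NormalDist().inv_cdf(1 - confidence)  # negative z-score
    return -(mu + z * sigma)


def monte_carlo_var(returns, confidence=0.95, n_sims=10_000, seed=0):
    """Simulate returns from a fitted normal, then take the empirical quantile."""
    rng = random.Random(seed)
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    simulated = [rng.gauss(mu, sigma) for _ in range(n_sims)]
    return historical_var(simulated, confidence)


# Simulated daily returns as a placeholder for real price history.
rng = random.Random(42)
daily_returns = [rng.gauss(0.0005, 0.02) for _ in range(1000)]

hist = historical_var(daily_returns)
param = parametric_var(daily_returns)
mc = monte_carlo_var(daily_returns)
print(f"historical: {hist:.4f}  parametric: {param:.4f}  monte carlo: {mc:.4f}")
```

Each value is a one-day loss expressed as a fraction of the position. On real returns with fat tails, the historical estimate can differ noticeably from the two normal-based ones.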

Descriptive Statistics and Data Visualization of MPG Dataset

This project explores the MPG dataset through descriptive statistics and visualization techniques to analyze acceleration and other key attributes.

Key Steps:

Technologies: Python, pandas, matplotlib, seaborn, scipy

Key Insight: Statistical and visual exploration of vehicle acceleration patterns, highlighting distribution characteristics and differences by engine type.
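The descriptive part of such an analysis might look like the sketch below. The rows are hypothetical values invented for the example, not records from the MPG dataset, and using the `cylinders` column as a proxy for engine type is an assumption.

```python
import pandas as pd

# Hypothetical rows for illustration only; not values from the MPG dataset.
cars = pd.DataFrame(
    {
        "cylinders": [4, 4, 4, 6, 6, 8, 8, 8],
        "acceleration": [16.5, 15.0, 18.2, 15.5, 14.0, 12.0, 11.5, 10.0],
    }
)

# Count, mean, standard deviation, min, quartiles, and max in one call.
summary = cars["acceleration"].describe()

# Sample skewness: sign indicates which tail of the distribution is longer.
skewness = cars["acceleration"].skew()

# Compare central tendency and spread across engine sizes.
by_cylinders = cars.groupby("cylinders")["acceleration"].agg(["mean", "median", "std"])

print(summary)
print(by_cylinders)
```

The same grouped frame can be passed to a plotting library, for example as a box plot per cylinder count, to show the differences visually.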
