Projects

A collection of projects where I explore real-world problems through data, machine learning, and practical experimentation. Each one reflects my interest in turning raw data into insight and building models that hold up outside of a textbook.

Flight Delay Predictor

XGBoost · Time-Based Validation · Cost-Aware Decision Support · Operations Analytics

This project develops a flight delay risk model using operational and historical flight data, with a focus on how machine learning predictions should be used in real airline operations. An XGBoost model was trained and validated using time-based splits, achieving strong ranking performance (ROC-AUC ≈ 0.93).
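As a minimal sketch of what a time-based split means here (not the project's actual code), the idea is that every training row predates every validation row, so the model is never scored on flights from before its training window. The field name `flight_date`, the toy rows, and the cutoff date are all assumptions made for illustration.

```python
from datetime import date


def time_based_split(records, cutoff):
    """Return (train, valid): rows strictly before the cutoff, and rows on or after it."""
    train = [r for r in records if r["flight_date"] < cutoff]
    valid = [r for r in records if r["flight_date"] >= cutoff]
    return train, valid


# Hypothetical toy rows; the real dataset's schema is not shown in this write-up.
flights = [
    {"flight_date": date(2023, 3, 10), "delayed": 1},
    {"flight_date": date(2023, 1, 5), "delayed": 0},
    {"flight_date": date(2023, 2, 20), "delayed": 1},
    {"flight_date": date(2023, 4, 2), "delayed": 0},
]

train, valid = time_based_split(flights, cutoff=date(2023, 3, 1))
print(len(train), len(valid))
```

Unlike a random shuffle, this keeps information from later flights out of training, which is closer to how the model would be used in operations.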

Beyond standard model evaluation, the project explored threshold tuning, alert volume tradeoffs, and cost-based utility analysis to assess whether automated operational interventions would create value. Under realistic cost assumptions, automated alerts added limited operational benefit despite the model's strong ranking performance.
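The threshold and cost analysis can be sketched roughly as follows. This is an illustrative simplification, not the project's actual utility function: the names `benefit_tp` and `cost_fp`, their values, and the toy scores are assumptions. Each alert on a truly delayed flight earns a benefit, each false alert incurs a cost, and not alerting is the zero baseline.

```python
def expected_utility(scores, labels, threshold, benefit_tp, cost_fp):
    """Net utility of alerting on every flight whose score is at or above the threshold."""
    utility = 0.0
    for score, label in zip(scores, labels):
        if score >= threshold:
            utility += benefit_tp if label == 1 else -cost_fp
    return utility


def alert_volume(scores, threshold):
    """Number of flights that would trigger an alert at this threshold."""
    return sum(1 for score in scores if score >= threshold)


# Hypothetical predicted risks and outcomes (1 = delayed), for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]
thresholds = [0.1, 0.3, 0.5, 0.7, 0.85]

# Assumed costs: a false alert costs twice what a correct alert saves.
best = max(
    thresholds,
    key=lambda t: expected_utility(scores, labels, t, benefit_tp=1.0, cost_fp=2.0),
)

for t in thresholds:
    u = expected_utility(scores, labels, t, benefit_tp=1.0, cost_fp=2.0)
    print(t, alert_volume(scores, t), u)
```

In this toy setup, only the strictest threshold beats the never-alert baseline of zero, which shows how a model with good ranking can still yield little value from automated alerts once false-alert costs are included.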

As a result, the final system is designed as a decision-support tool that ranks flights by delay risk, enabling operations teams to prioritize attention and monitoring rather than triggering rigid automated actions. The project highlights the importance of aligning machine learning systems with operational constraints and real-world decision costs.

GitHub

NBA Player Valuation Using Machine Learning

Machine Learning · Salary Prediction · Roster Optimization · Sports Analytics

This project develops a machine learning–driven player valuation model using historical NBA performance data and contract information, with a focus on how analytics can be used to improve roster construction and contract decision-making. Multiple models were trained and optimized to predict player salaries based on on-court performance, advanced statistics, and role-specific indicators, achieving strong predictive accuracy across seasons.

The project also explored feature engineering, salary cap normalization, and performance archetype segmentation to understand how different player profiles are valued across the league. Cluster analysis was used to identify high-impact, low-cost player types and highlight inefficiencies in contract allocation, particularly among role players and high-variance contributors.
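Salary cap normalization can be sketched as expressing each salary as a share of that season's cap, so contracts from different seasons are comparable. This is a minimal illustration under assumed inputs: the season keys and cap values below are round hypothetical numbers, not real league figures, and the project's actual normalization may differ.

```python
def cap_share(salary, season, cap_by_season):
    """Express a salary as a fraction of the salary cap for its season."""
    return salary / cap_by_season[season]


# Hypothetical cap values chosen for easy arithmetic, not real NBA caps.
cap_by_season = {
    "season_a": 100_000_000,
    "season_b": 125_000_000,
}

# The same nominal salary is a smaller share once the cap has grown.
share_a = cap_share(25_000_000, "season_a", cap_by_season)
share_b = cap_share(25_000_000, "season_b", cap_by_season)
print(share_a, share_b)
```

Using cap share as the prediction target, rather than raw dollars, keeps a rising cap from being mistaken for a change in how players are valued.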

As a result, the final system is designed as a decision-support tool that highlights undervalued talent and potential overpayments, enabling teams to prioritize smarter spending, reduce contract risk, and improve long-term roster efficiency. The project demonstrates how machine learning can be applied to real-world strategic decisions in high-stakes, salary-constrained environments.
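One simple way to surface undervalued talent and potential overpayments is to compare the model's predicted value with the actual contract. The sketch below assumes both are expressed as cap shares; the field names `predicted_share` and `actual_share` and the three players are hypothetical, and this is not necessarily how the project computes its rankings.

```python
def rank_by_surplus(players):
    """Sort players by predicted minus actual cap share, largest surplus first."""
    return sorted(
        players,
        key=lambda p: p["predicted_share"] - p["actual_share"],
        reverse=True,
    )


# Hypothetical players and values, for illustration only.
players = [
    {"name": "Player B", "predicted_share": 0.20, "actual_share": 0.30},
    {"name": "Player A", "predicted_share": 0.15, "actual_share": 0.05},
    {"name": "Player C", "predicted_share": 0.10, "actual_share": 0.09},
]

ranked = rank_by_surplus(players)
for p in ranked:
    print(p["name"], round(p["predicted_share"] - p["actual_share"], 2))
```

Players at the top of the list are paid less than the model predicts for their production, while those at the bottom are candidates for overpayment review.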

GitHub