Data Science Portfolio
These projects span healthcare analytics, quantitative finance, and AI/ML implementation. Each one addresses a real-world problem with statistical methods and reports quantified results.
🎯 Results-Driven Approach: Combining healthcare analytics with statistical modeling to deliver measurable business impact across diverse industries.
Grant Buddy: AI-Powered Grant Assistant
💼 Business Impact: Streamlines the grant application process for resource-constrained nonprofits
AI agent that discovers relevant grants, analyzes organizational context, and generates draft grant applications for nonprofit organizations using RAG, OpenAI API, and LangChain.
Technologies Used:
🎯 Key Challenge Solved:
Balancing AI automation with nonprofit organizational authenticity and grant requirements
Care.com Marketplace Liquidity Analysis
💼 Business Impact: Directly influenced multimillion-dollar resource allocation strategies
Built executive-level strategic recommendations from four multi-target Ridge regression models that explain 66% of variance and identify a critical market saturation point of 10.64 across 435 congressional districts.
Technologies Used:
🎯 Key Challenge Solved:
Complex marketplace dynamics with non-linear relationships between supply and demand
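For readers less familiar with the "variance explained" figure, the sketch below shows how an R² score is typically computed. It is an illustration only, not the project's code: the function name `r_squared` and the toy numbers are hypothetical.

```python
def r_squared(actual, predicted):
    """Return the share of variance in `actual` explained by `predicted`."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    mean_actual = sum(actual) / len(actual)
    # Total variation of the observed values around their mean.
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    # Variation left unexplained by the model's predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    if ss_total == 0:
        raise ValueError("actual values have zero variance")
    return 1.0 - ss_residual / ss_total


# Hypothetical toy values, not taken from the marketplace analysis.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
score = r_squared(actual, predicted)
print(f"R^2 = {score:.2f}")
```

A score of 0.66 on this scale would correspond to the 66% figure quoted above.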
Healthcare Early Warning System
💼 Business Impact: Early detection system for preventing healthcare incidents and improving patient safety
Developed an anomaly detection system that analyzes 139 months of hospital data, achieving 100% sensitivity for emergency admission anomalies and 91% specificity, validated against real incident cases.
Technologies Used:
🎯 Key Challenge Solved:
Balancing sensitivity vs. specificity while minimizing false alarms in critical healthcare settings
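The trade-off can be made concrete with a small sketch. It assumes each month gets a numeric anomaly score and that months above a threshold are flagged. The scores, labels, and the function name `rates_at_threshold` are hypothetical and do not come from the hospital data.

```python
def rates_at_threshold(scores, is_anomaly, threshold):
    """Flag scores above `threshold` and return (sensitivity, specificity)."""
    flagged = [score > threshold for score in scores]
    pairs = list(zip(is_anomaly, flagged))
    true_pos = sum(1 for actual, flag in pairs if actual and flag)
    false_neg = sum(1 for actual, flag in pairs if actual and not flag)
    true_neg = sum(1 for actual, flag in pairs if not actual and not flag)
    false_pos = sum(1 for actual, flag in pairs if not actual and flag)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one anomaly and one normal point")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical anomaly scores: five normal months, then two true anomalies.
scores = [0.2, 0.4, 0.5, 0.9, 1.5, 2.8, 3.1]
is_anomaly = [False, False, False, False, False, True, True]

low = rates_at_threshold(scores, is_anomaly, threshold=1.0)
high = rates_at_threshold(scores, is_anomaly, threshold=3.0)
print("threshold 1.0:", low)
print("threshold 3.0:", high)
```

In this toy data the lower threshold catches both anomalies but raises one false alarm, while the higher threshold raises no false alarms but misses one anomaly.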
Global Poverty Dynamics Research
💼 Business Impact: Generated actionable policy insights for development research organizations
Comprehensive econometric analysis spanning 54 years (1970-2024) across 100+ countries, integrating six datasets with 2,705+ observations to explain 57.9% of the variance in global poverty reduction patterns.
Technologies Used:
🎯 Key Challenge Solved:
Harmonizing multiple datasets with varying methodologies and time coverage
Istanbul Tourism & Retail Analytics
💼 Business Impact: Actionable customer segmentation strategies for tourism and retail optimization
Applied K-means clustering and Random Forest regression to 2021-2023 customer data, identifying four distinct segments with spending ranging from $471 to $14,511, with item pricing accounting for 50% of predictive importance.
Technologies Used:
🎯 Key Challenge Solved:
Balancing model complexity with interpretability for business stakeholder communication
NYC Transit Equity Analysis - MHC × MTA Datathon
💼 Business Impact: Data-driven policy recommendations for expanding transit accessibility programs
Participated in the inaugural MHC × MTA Datathon, analyzing 10GB+ of Fair Fares ridership data across 6 NYC neighborhoods and identifying a 98% correlation between bus/subway usage and peak patterns to support policy recommendations.
Technologies Used:
🎯 Key Challenge Solved:
Processing massive real-time transit datasets while maintaining query performance and accuracy
Pokémon Franchise Analytics - Foundation Project
💼 Business Impact: Personal skill development and validation of programming capabilities
My first comprehensive Python project, analyzing global video game sales using the VGChartz dataset and PokeAPI integration, demonstrating OOP principles and revealing Game Boy platform dominance (80M+ units).
Technologies Used:
🎯 Key Challenge Solved:
Learning foundational programming concepts while handling real-world data integration complexities
Proven Skills Across Industries
Every project demonstrates measurable impact through advanced analytics, from 66% of variance explained in marketplace modeling to 100% sensitivity in healthcare anomaly detection.
Statistical Modeling
- Ridge Regression: 66% variance explained in marketplace analysis
- Time Series Analysis: CUSUM & ETS models for anomaly detection
- Econometric Modeling: 57.9% R² across 54 years of global data
- Panel Data Analysis: 2,705+ observations, 100+ countries
- Interaction Effects: Growth-distribution policy modeling
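As a rough illustration of the CUSUM idea listed above, here is a minimal one-sided (upper) CUSUM sketch. It is not the project's implementation: the function name `cusum_upper`, the parameter values, and the sample counts are all hypothetical, and the ETS component is not shown.

```python
def cusum_upper(values, target, slack, limit):
    """Return the indices where the upper CUSUM statistic exceeds `limit`.

    The statistic accumulates deviations above `target + slack` and is
    floored at zero, so small fluctuations around the target do not build up.
    """
    alarms = []
    cumulative = 0.0
    for index, value in enumerate(values):
        cumulative = max(0.0, cumulative + (value - target - slack))
        if cumulative > limit:
            alarms.append(index)
    return alarms


# Hypothetical monthly counts: stable at 10, then a sustained shift to 13.
monthly_counts = [10.0, 10.0, 10.0, 10.0, 13.0, 13.0, 13.0]
alarms = cusum_upper(monthly_counts, target=10.0, slack=1.0, limit=4.0)
print("alarm indices:", alarms)
```

A larger `limit` lowers the false-alarm rate at the cost of slower detection, which is the same sensitivity-versus-specificity trade-off described in the healthcare project.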
Machine Learning & AI
- RAG Implementation: LangChain + OpenAI for grant assistance
- Cambio Labs Project: Production-ready AI grant assistant
- Customer Segmentation: K-means clustering, 4 distinct personas
- Random Forest: Feature importance analysis for business insights
- Neural Networks: 95% accuracy on image classification (ML Fellowship)
Data Engineering & Analytics
- Clinical Data Analysis: 150+ hours at NYU Langone Health
- Anomaly Detection: 100% sensitivity, 91% specificity rates
- Big Data Processing: 10GB+ transit datasets, optimized SQL
- Multi-dataset Integration: Six complementary data sources
- Domain Expertise: Finance, policy analysis, and data science
Ready to Drive Business Impact with Data?
From 66% of variance explained in marketplace optimization to 100% sensitivity in anomaly detection, these projects demonstrate measurable ROI through advanced analytics. Let's discuss how similar methodologies can accelerate your business objectives.