I am currently pursuing a Master of Science in Data Science at the University of Washington. My journey started at UC Santa Barbara, where I double-majored in Psychological and Brain Sciences and Statistics and Data Science.
I am passionate about bridging the gap between technical complexity and human understanding. I specialize in transforming messy, high-dimensional data into meaningful stories and interpretable results that drive social impact and better decision-making.
Applied Materials • Santa Clara, CA
Bionic Vision Lab • Santa Barbara, CA
Python • API Integration • CI/CD • Plotly Dash
Developed an end-to-end grocery planning tool designed to reduce barriers to home cooking for individuals in food desert regions. The system enables users to select recipes from a database of 500k+ records and automatically generates a store-specific grocery list with real-time pricing and availability.
Interactive Data Storytelling • Tableau • Tableau Prep
Communities across the United States vary widely, but the need for quality medical care is universal. This project assesses patient satisfaction rates across the country to show communities how their local hospitals compare to nationwide standards. Using a top-down geographical approach, we visualized data from over 4,300 unique hospitals across 53 states and territories.
Data Engineering: Merged five years of HCAHPS datasets (1.6M+ records) in Python. We pivoted the data from wide to long format to standardize satisfaction indicators such as Nurse Communication and Cleanliness.
Geocoding: Used the Google Maps API to geocode each hospital to precise coordinates, so that every facility is placed accurately in the final visualization.
Insights: Identified that while clinical communication is generally rated well (3.41 stars), environmental factors like Quietness (2.97 stars) remain significant pain points for patients.
Usability: Conducted evaluations with healthcare professionals (nurses, pharmacists) to refine the "drill-down" navigation from state to county to specific facility.
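The wide-to-long pivot mentioned under Data Engineering can be sketched roughly as follows. This is a minimal illustration assuming pandas; the column names (`facility_id`, `year`, `nurse_communication`, `cleanliness`), the helper `to_long`, and the sample values are all hypothetical and are not taken from the actual HCAHPS files or the project code.

```python
import pandas as pd

# Hypothetical wide-format extract: one row per hospital per year,
# one column per satisfaction measure. Values are made up for illustration.
wide = pd.DataFrame({
    "facility_id": ["A001", "A002"],
    "year": [2022, 2022],
    "nurse_communication": [4, 3],
    "cleanliness": [3, 2],
})


def to_long(df: pd.DataFrame) -> pd.DataFrame:
    """Reshape so each row holds a single (hospital, year, measure) rating."""
    return df.melt(
        id_vars=["facility_id", "year"],  # identifier columns kept as-is
        var_name="measure",               # former column names become labels
        value_name="star_rating",         # former cell values become ratings
    )


long = to_long(wide)
print(long)
```

In long format every measure shares the same two columns, so yearly files with slightly different measure columns can be stacked and filtered by `measure` without special-casing each indicator.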
PySpark • Databricks • Distributed Computing
Leveraged Databricks and PySpark to process millions of records, analyzing how household demographics affect civic engagement. I built a scalable pipeline to categorize voter segments and visualized geographic patterns through custom choropleth maps.
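As a small-scale illustration of the segment-and-aggregate idea, the sketch below labels households by tenure and size and computes a turnout rate per segment. It deliberately uses pandas rather than PySpark so it runs anywhere; the column names, the segment rules in `label_segment`, and the six sample rows are assumptions made up for this example and do not reflect the project's data or results.

```python
import pandas as pd

# Made-up toy records; the real pipeline processed millions of rows in PySpark.
voters = pd.DataFrame({
    "household_size": [1, 1, 3, 4, 2, 1],
    "homeowner": [False, False, True, True, True, False],
    "voted": [0, 1, 1, 1, 0, 0],
})


def label_segment(row: pd.Series) -> str:
    """Hypothetical segment rule: tenure (owner/renter) x household size."""
    tenure = "owner" if row["homeowner"] else "renter"
    size = "single" if row["household_size"] == 1 else "multi"
    return f"{tenure}_{size}"


voters["segment"] = voters.apply(label_segment, axis=1)

# Turnout rate per segment = mean of the 0/1 voted flag within each group.
turnout = voters.groupby("segment")["voted"].mean()
print(turnout)
```

The same group-then-average pattern carries over to a distributed DataFrame; only the toy data and the segment rule would need to be replaced with the real ones.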
"Analysis revealed that homeownership is a primary driver of turnout, while single-person households showed the lowest probability of voting across all segments."