Data Science Alliance Research
Overview
I am currently working with the Data Science Alliance and a group of undergraduate HDSI students to build a model
to optimize the placement of food banks in San Diego County.
Research Goals
- Create an interactive dashboard for stakeholders to optimize food bank placement through data-driven decision making
- Develop a model incorporating demographics, income, and location data
- Build a scalable, open-source solution for wider adoption
Methodology
Coming Soon...
Key Findings
Coming Soon...
Impact & Applications
Coming Soon...
San Diego Supercomputer Center Research
Overview
At SDSC, my research focused on developing a machine learning model to predict and compress matrix
values derived from plasma physics simulations. The main objective was to create a model capable of estimating
matrix elements from input parameter tuples while reducing storage requirements. The project aimed to strike a balance
between prediction accuracy and data compression efficiency, contributing to more efficient data management for large-scale
computational simulations.
Research Goals
- Develop a neural network model that accurately predicts the elements of a 256×256 matrix from given input parameters.
- Optimize the model to minimize data storage needs without significantly sacrificing prediction accuracy.
- Enhance the generalization capability of the model for unseen data to ensure reliable performance across various scenarios.
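To make the storage goal concrete, here is a rough back-of-the-envelope sketch. It assumes the trained model replaces a set of stored 256×256 matrices and that matrix values and model weights are stored at the same numeric precision. The function name and the example counts (1,000 matrices, 500,000 weights) are hypothetical and are not figures from the project.

```python
MATRIX_SIDE = 256  # matrix dimension stated in the research goals


def compression_ratio(num_matrices, num_model_parameters):
    """Ratio of raw stored values to model weights (same precision assumed)."""
    if num_matrices <= 0 or num_model_parameters <= 0:
        raise ValueError("both counts must be positive")
    raw_values = num_matrices * MATRIX_SIDE * MATRIX_SIDE
    return raw_values / num_model_parameters


# Hypothetical example: 1,000 stored matrices versus a 500,000-weight model.
ratio = compression_ratio(num_matrices=1000, num_model_parameters=500_000)
print(f"{ratio:.3f}x fewer values to store")
```

A ratio above 1 means the model is smaller than the data it stands in for. The trade-off is that the reconstructed values are only approximate.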
Methodology
The research involved extensive data preparation and normalization to ensure high-quality inputs for training.
Neural network models were designed and trained to predict matrix elements based on parameter tuples, with iterative
adjustments made to the architecture to improve performance. The process included selecting appropriate training data,
tuning hyperparameters, and employing validation techniques to assess and refine the model. The primary focus was on balancing
the trade-off between prediction precision and data compression.
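The data preparation and validation steps can be sketched roughly as follows. This is a minimal standard-library illustration, not the project's actual pipeline: the helper names, the 80/20 split, and the toy parameter tuples are all assumptions. The one point it is meant to show is that normalization statistics come from the training split only, so the validation data stays unseen.

```python
import random
import statistics


def train_val_split(rows, val_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and hold out a validation subset."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_val = max(1, int(len(rows) * val_fraction))
    val_idx = set(indices[:n_val])
    train = [row for i, row in enumerate(rows) if i not in val_idx]
    val = [row for i, row in enumerate(rows) if i in val_idx]
    return train, val


def fit_standardizer(rows):
    """Compute per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 so constant columns do not cause division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply z-score scaling using previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical input parameter tuples (two parameters per sample).
params = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]

train, val = train_val_split(params)
means, stds = fit_standardizer(train)        # statistics from training data only
train_scaled = standardize(train, means, stds)
val_scaled = standardize(val, means, stds)   # reuse the training statistics
```

In a real workflow the same steps would more likely be done with an array library, and the scaled tuples would then feed the neural network during training and validation.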
Key Findings
- The model achieved approximately 85% accuracy on unseen data, demonstrating potential for predictive reliability.
- Over 90% of predicted matrix values fell within a tolerance of ±1 from actual values, indicating reasonable accuracy
for practical applications.
- Generalizing predictions to new, unseen data remained challenging, pointing to model robustness as the main area for
further work.
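The ±1 tolerance figure above is the share of predictions whose absolute error is at most 1. A minimal sketch of that calculation is below. The function name and the sample values are made up for illustration and are not data from the project.

```python
def fraction_within_tolerance(predicted, actual, tolerance=1.0):
    """Share of predictions whose absolute error is at most `tolerance`."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Made-up values: three of the four predictions are within ±1 of the target.
predicted = [0.5, 2.0, 3.9, 7.5]
actual = [0.0, 2.5, 5.0, 7.0]
print(fraction_within_tolerance(predicted, actual))  # prints 0.75
```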
Impact & Applications
This research contributes to data compression for computational simulations. By enabling
efficient estimation and storage of matrix values, it supports improved data management and resource allocation for
large-scale scientific simulations, particularly in plasma physics. The findings lay the groundwork for future advancements
in predictive modeling for high-dimensional data, potentially benefiting fields that rely on large-scale, complex simulations.