About Me
Skills
Education
University of California, Berkeley
Master of Molecular Science and Software Engineering
Expected Graduation: May 2025
CHEM 277B: Machine Learning
CHEM 281: Software Engineering for Scientific Computing
CHEM 274A: Python and C++ for Molecular Sciences
CS 267: Applications of Parallel Computing
DATA 200S: Principles and Techniques of Data Science
CHEM 274B: Software Engineering Fundamentals
Carnegie Mellon University
Bachelor of Science in Chemistry with University and College Honors
Minors: Creative Writing and Engineering Studies
Graduated: May 2023
09-563: Molecular Modeling and Computational Chemistry
19-443: R for Data Science, Technology, and Public Policy
15-110: Principles of Computing
09-231: Mathematical Methods for Chemists
09-322: Molecular Spectroscopy and Design
27-212: Defects in Materials
Experience
AI & Analytics Intern - Digital Endpoints and Patient-Centered Solutions
Genentech
January 2025 - Present
In this role, I am responsible for piloting MVPs to enable data science and AI democratization among real-world data and regulatory stakeholders
- Piloting MVPs through the Agile SDLC and transitioning validated prototypes to platform teams for scalable deployment
- Leading discovery sessions with Regulatory and RWD teams to define needs, map data flows, and blueprint GenAI solutions
- Partnering with regulatory scientists to convert health authority feedback into features for AI-driven clinical protocol evaluations
- Deploying an AI-driven information organization workflow to identify trends in health authority questions about clinical trial protocols
- Built LLM-based automation workflows to transform unstructured clinical trial data into modeling pipelines for decision-making
- Conducted large-scale data analysis on 20M+ medical records using BioBERT-based clustering to surface clinical decision signals
- Developed a 4-step AI-driven workflow to extract and organize information related to treatment adherence from doctors' notes
- Demonstrated a 10% prediction improvement over traditional ML methods in identifying trends and correlations in medical data
Team Lead - Open Innovation Squad
UC Berkeley
January 2025 - June 2025
In this role, I was responsible for leading a cross-functional team to deliver a comprehensive competitor, feature, and market analysis for an agentic AI workflow automation and data analysis product
- Conducted competitive benchmarking and user research to surface user needs, pain points, and guide product differentiation
- Facilitated iterative design thinking cycles and incorporated customer feedback to boost non-technical user engagement by 15%
- Defined KPIs and delivery timelines, aligning internal and client teams to achieve 4% monthly revenue growth and 7% satisfaction growth
- Coordinated communication across the client, internal team, and Open Innovation Board to maintain strategic and executional alignment
- Defined low-fidelity prototypes and feature sets to enhance AI functionalities, aligning with user needs and business objectives
- Analyzed AI and data regulations in deployment markets to comply with international legislation such as the GDPR and the EU AI Act
Laboratory Development Engineer
Emerald Cloud Lab
June 2023 - June 2024
In this role, I was responsible for designing, building, and deploying solutions to automate laboratory workflows
- Led the design and deployment of a customized magnetic tube rack system to minimize dead volume and improve lab efficiency
- Cut material resource usage by 25% by iterating on the magnetic rack to enhance ergonomic design and space utilization
- Reengineered experiment database on AWS to streamline magnetic rack data handling and change operator task prioritization
- Designed and installed customized IR temperature sensor holders to standardize positioning and data collection
- Led a team of 8 temporary workers to assemble and integrate various sensor arrays, accelerating the build-out by 15%
- Updated deprecated solvent bottle location objects using Wolfram Mathematica, speeding up storage tasks by 10%
Undergraduate Research Assistant
Kurnikova Group (CMU)
March 2022 - September 2023
I co-authored a paper detailing a machine learning workflow that reduces the computational cost of protein-ligand binding simulations by up to 85%
- Simulated complex alchemical thermodynamic cycles with molecular dynamics for RBFE / ABFE calculations on T4 lysozyme
- Scripted selection algorithms in Python to implement machine learning ligand orientation predictions for binding simulations
- Analyzed the electrostatic potential of the WD40 domain of LRRK2, identifying 4 target regions for future drug development
- Developed force field parameters for ligands using GAFF2 and derived atomic charges via the RESP method with Gaussian
- Utilized GPU clusters to perform high-performance computing for expensive protein-ligand simulations
Awards:
- 1st Place Capstone Presentation -- highest score in the last 5 years
Manufacturing Science and Technology Intern
Merck & Co.
May 2022 - August 2022
I used statistical and analytical techniques to double the speed of tablet stability assessments
- Designed a tablet stability experiment across temperature and humidity conditions to collect high-quality degradation data
- Conducted stability testing using analytical chemistry techniques, including solution preparation and LC-MS
- Modeled laboratory data in R with ridge and lasso regressions, using RMSE to assess the accuracy of the detection protocol
- Developed a Bayesian machine learning model to double the speed of stability assessments and degradation forecasts
Undergraduate Research Assistant
Jin Laboratory
December 2020 - August 2022
I designed synthetic pathways for novel small thiolated gold nanoclusters
- Utilized UV-Visible Spectroscopy and Photoluminescence Spectroscopy to analyze properties related to electronic structure
- Conducted experiments with techniques like aqueous-organic separation, ligand exchange, and thiol etching
- Probed vibrational levels in the nanoparticle's metallic core with THz Raman Spectroscopy and cryo-optical methods
Awards:
- 1st Place in Chemistry -- CMU Sigma Xi Quantitative Research Competition
- 2nd Place Overall -- CMU Sigma Xi Quantitative Research Competition
Poster Presentations:
- Synthesis and Characterization of Thiolate-Protected Gold Nanoclusters of Atomic Precision
- Presented at: ACS Fall 2022, ACS Regional Symposium Spring 2021, and CMU Meeting of the Minds
Technical Operations Chemistry Intern
Merck & Co.
May 2021 - August 2021
I employed my chemical expertise to resolve antibiotic manufacturing process deviations
- Conducted 50+ flow experiments on a column chromatography setup to confirm raw material functionality
- Documented experimental results in GMP format to support process change requests for raw materials specifications
- Performed weekly safety inspections, housekeeping walkthroughs, and flow tests on safety equipment
Projects
Click on a title to view the project code and/or publication!
LLM-Powered Information Extraction from Patient EHR Data
I created a 4-step, LLM-powered, low-code workflow that helps users quickly explore their data, reducing analysis time from days to hours. We aimed to extract treatment adherence insights from free-text doctor's notes and use them to train a machine learning model predicting visual acuity improvement after one year. I worked with product leaders and data scientists to refine the approach, which involves breaking down user queries into key factors, scoring each doctor's note accordingly, and combining the results with demographic data in a tabular format. This enabled us to identify serious comorbidities and patient independence as key predictors of treatment success.
Tech Stack / Techniques:
Compensated Summation, Cheminformatics Graph Walking, and Substructure Searching
For my final project in the MSSE CHEM 274A course, I developed a computational toolset for exploring molecular systems using techniques from computational chemistry. I implemented compensated summation, graph walking over molecular adjacency matrices, and substructure searches that hash functional groups. I also built a framework for reproducible, scalable scientific computations that is accessible through a user interface. This work highlights my ability to apply interdisciplinary skills to complex problems in the molecular sciences.
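As an illustration of the compensated (Kahan) summation idea named in this project, here is a minimal plain-Python sketch. It is not the project's code; the function names `naive_sum` and `kahan_sum` are hypothetical and chosen only for this example.

```python
import math


def naive_sum(values):
    """Plain left-to-right float accumulation, shown for comparison."""
    total = 0.0
    for value in values:
        total += value
    return total


def kahan_sum(values):
    """Kahan summation: carry a correction term for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for value in values:
        adjusted = value - compensation         # apply the previous correction
        new_total = total + adjusted            # low-order bits may be lost here
        compensation = (new_total - total) - adjusted  # recover what was lost
        total = new_total
    return total


# Illustrative data (an assumption, not from the project): many small terms.
values = [0.1] * 100_000
exact = math.fsum(values)  # correctly rounded reference sum from the stdlib
naive_error = abs(naive_sum(values) - exact)
kahan_error = abs(kahan_sum(values) - exact)
```

On inputs like this, the compensated version typically stays much closer to the `math.fsum` reference than plain accumulation does, which is the motivation for using it in long energy or property sums.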
Tech Stack / Techniques:
Leveraging Computer Vision to Efficiently Allocate Emergency Resources
I co-developed a logistic regression model and a convolutional neural network (CNN) using scikit-learn and TensorFlow to classify images with 90% accuracy. To handle a dataset of 20,000 images, I applied computer vision techniques such as Sobel edge filtering for feature extraction and to scale the model implementation. I further improved damage detection accuracy by 30% using Principal Component Analysis (PCA), hyperparameter tuning, 5-fold cross-validation, and gradient descent optimization.
Tech Stack / Techniques:
Predicting Regenerative Properties at the Genomic Level with Machine Learning
I automated extracting trimer counts and calculating the AT/GC ratio from over 100 genes using Pandas, NumPy, and Biopython. This, combined with the NCBI Command Line Tool, streamlined the data processing workflow. I designed a convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) layer using TensorFlow to classify regenerative genes from the NCBI database, achieving 75% accuracy. I visualized the trimers based on their regenerative importance, mapping prevalent amino acids and validating the results against existing literature. Additionally, I developed an API to access UniProt protein structures for known regenerative proteins and automated comparisons with AlphaFold.
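The two sequence features mentioned above can be sketched in a few lines of standard-library Python. This is an illustration only: the project used Pandas, NumPy, and Biopython, and the choice of overlapping trimers, the function names, and the example sequence are all assumptions made for this sketch.

```python
from collections import Counter


def trimer_counts(sequence):
    """Count overlapping 3-base windows (trimers) in a DNA sequence."""
    seq = sequence.upper()
    # A sequence shorter than 3 bases yields an empty Counter.
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def at_gc_ratio(sequence):
    """Return (A + T) / (G + C) for a DNA sequence."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if gc == 0:
        raise ValueError("sequence contains no G or C bases")
    return at / gc


# Made-up example sequence, not real gene data.
example = "ATGCGATATGC"
counts = trimer_counts(example)
ratio = at_gc_ratio(example)
```

Feature vectors like `counts` and `ratio` can then be collected per gene into a table before being passed to a classifier.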
Tech Stack / Techniques:
Collaborative Modular Banking Software for Transaction Management
I co-designed and implemented a banking system using Python to streamline account management and transaction processing. To improve efficiency, I refactored transactions, spending aggregation, and balance retrieval by utilizing dictionary lookups and binary search to reduce runtime. I also debugged account histories for joint accounts, creating a unified transaction log through efficient data aggregation and handling, ensuring seamless access and accurate transaction tracking across multiple users.
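To show how dictionary lookups and binary search can reduce runtime in this kind of system, here is a minimal sketch. It is not the project's implementation: the `Account` class, its methods, and the integer timestamps are hypothetical, and it assumes transactions are recorded in non-decreasing timestamp order.

```python
from bisect import bisect_left, bisect_right


class Account:
    """Hypothetical account holding a time-ordered transaction log."""

    def __init__(self, account_id):
        self.account_id = account_id
        self.balance = 0
        self.timestamps = []  # kept sorted because records arrive in order
        self.amounts = []     # positive = deposit, negative = spending

    def record(self, timestamp, amount):
        # Assumption: timestamps never decrease, so the list stays sorted.
        self.timestamps.append(timestamp)
        self.amounts.append(amount)
        self.balance += amount  # balance retrieval becomes a field read

    def spending_between(self, start, end):
        """Total spending with start <= timestamp <= end."""
        lo = bisect_left(self.timestamps, start)   # first index >= start
        hi = bisect_right(self.timestamps, end)    # first index > end
        return sum(-a for a in self.amounts[lo:hi] if a < 0)


# Dictionary keyed by account ID gives average O(1) account lookup.
accounts = {"acct-1": Account("acct-1")}
acct = accounts["acct-1"]
for ts, amount in [(1, 100), (2, -30), (5, -20), (9, -5)]:
    acct.record(ts, amount)
```

With this layout, locating the window of transactions takes two binary searches instead of a full scan, and only the entries inside the window are summed.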
Tech Stack / Techniques:
Automated Resource Allocation for Protein-Ligand Binding Simulations
I collected data by running both RBFE and ABFE calculations, simulating alchemical thermodynamic cycles for our benchmark system and our ligands. I also processed the data and helped guide the development of our automation framework. We submitted a manuscript detailing the automated framework. Our resource allocation algorithm saved up to 85% of computational cost for high-throughput protein-ligand binding free energy simulations.
Tech Stack / Techniques:
Bayesian Machine Learning Model Development for Tablet Stability Experiments
I developed a Bayesian machine learning model that reduced analytical data collection time by 50% for high-throughput tablet stability experiment design. I collaborated with the LC-MS team to create a more sensitive method for quantifying trace degradation levels and conducted a short-term stability study using this approach. With my model, I accurately predicted degradation patterns for up to two years with just three months of data, cutting the previous requirement of six months in half.
Tech Stack / Techniques:
Modeling the Impact of Electric Vehicles (EVs) on Greenhouse Gas Emissions
In this project, I aimed to gain insights into EV purchase trends and their environmental implications. I utilized GeoPlot to visualize the locations of purchases over a map of New York State, allowing me to identify trends based on population density and socioeconomic factors. Additionally, I employed linear regression to investigate the correlation between EV sales and GHG emissions, while applying elastic net regressions to enhance model accuracy by incorporating penalty terms for coefficients.
Tech Stack / Techniques: