Diya Vij

Berkeley, CA · (925) 785-4040 · diyavij@berkeley.edu · LinkedIn

Education

University of California, Berkeley

Expected May 2026

Bachelor's degree in Applied Mathematics and Statistics; minor in Data Science

Relevant coursework: Data Structures & Algorithms, Bayesian Statistics & Machine Learning, Principles & Techniques of Data Science, Probability, Regression Analysis, Design of Experiments, Abstract Linear Algebra, Numerical Analysis, Abstract Algebra, Stochastic Processes, Time Series, Game Theory, Human Contexts & Ethics of Data

Skills

Languages: SQL, Python, R, MATLAB

Applications & databases: Power BI, Tableau, Excel, MySQL, SQLite, Git, Google Cloud, R Shiny, Azure Data Services

Machine learning & tools: Pandas, Matplotlib, TensorFlow, scikit-learn, SAS, React, Flask, FastAPI

Hackathons: Winner, BayHacks 2021; Best Use of Twilio, AthenaHacks (USC) 2023; LAHacks (UCLA); TreeHacks (Stanford)

Highlighted experience and projects

Data Science & Machine Learning Intern — Regeneron Pharmaceuticals, Tarrytown, NY

Summer 2025
  • Processed high-dimensional accelerometer time series from patients with neurological disorders for downstream modeling.
  • Built a RAG application combining GPT-4 / LLMs with biomedical APIs (PubMed, ClinicalTrials.gov, UK Biobank) for literature review and cross-population analysis; applied semantic search and retrieval-augmented workflows to support digital biomarker hypotheses.
  • Trained and evaluated ML models on large time-series datasets to predict disease progression (85% accuracy), supporting go/no-go decisions in clinical development.
  • Presented at three cross-functional meetings and collaborated with medical directors and technical working groups; selected to present at an internal symposium for 150+ faculty and postdocs.

UC Berkeley Physics 188 Group Final — Predicting sleep apnea (DREAMT)

Fall 2025
  • Developed a CNN–Transformer hybrid in TensorFlow on ~150 GB multimodal time-series sensor data, achieving 78% accuracy and 0.86 ROC-AUC.
  • Engineered nine predictive features from raw ECG and respiratory signals (FFT, bandpass filtering, peak detection); trained four models (CNN, Transformer, MLP, Random Forest) with cross-validation.

Data Services Intern — University of California Information Technology

Spring–Fall 2024
  • Created, updated, and maintained Power BI reports for Facilities Financial Services across the UC system.
  • Within Administrative and Residential IT, contributed to Power BI development and rewrote SSIS ETL pipelines in Python, testing with masked data.

Data Science & Analytics Intern — 365Labs, Baton Rouge, LA

Summer 2023
  • Analyzed data in Power BI, queried Microsoft SQL Server, and presented findings to law enforcement stakeholders.
  • Cleaned and documented data; built data dictionaries to improve accessibility across 5+ applications.
  • Worked with the Chief Software Architect on integrating Microsoft Fabric and LLMs into the data science workflow.

Leadership & extracurricular activities

Marketing Lead & Vice President — Hackathons @ Berkeley

Aug 2024 – present
  • Helps run Cal Hacks (the world's largest collegiate hackathon) and the UC Berkeley AI Hackathon.
  • Manages social media, content, and outreach to 25,000+ applicants, 4,000+ hackers, and sponsors.
  • Organizes club events and Berkeley makerspace collaborations; plans a program budget of roughly $1.25M per year.

Officer — Mathematics Undergraduate Student Association

Aug 2024 – present
  • Leads reading groups on stochastic applications, game theory, and pure math topics.
  • Serves as undergraduate representative on the Berkeley Math Equity and Inclusion Committee; organizes events and math talks.

Student Consultant — UC & Haas Information Technology

Aug 2023 – present
  • Provides ~20 hours/week of support to faculty and staff (400+ users) for desktops, laptops, printers, and mobile devices.
  • Handles standard configurations, inventory, imaging, and device setup.

Director of Operations — Data Science Club

May 2023 – Aug 2024
  • Supported 150+ students on 30+ personal projects; taught exploratory data analysis and workflows.
  • Ran team meetings, alumni relations, operations, scheduling, and room bookings.