Jahnavi Kalpathy


Welcome to my portfolio! I am a
Data Analyst - Engineer - Lifelong Learner

About Me

I am a Data Analyst and Engineer interested in using data to identify trends and key insights that can shape strategy. I care about customer products, sustainability, and technical innovation. I hope to apply my skills in analysis, modeling, visualization, and ETL to collaborate with peers and answer data-driven questions.

My Skills
Regression
Classification
Natural Language Processing
Neural Networks
Python
Pandas
NumPy
Scikit-learn
SQL
R
dbt
Hex
BigQuery
Excel
TensorFlow
Data Visualization
Matplotlib
Seaborn
Plotly
Looker
Tableau
JupyterLab
GitHub
Google Cloud
AWS
PostgreSQL
System Validation
Error Analysis

Projects

Here are a few of the data projects I've worked on. Check out my GitHub profile for more, or view my resume for more work experience.

Volunteer Recruitment

Campaign Volunteer Recruitment

As a Data Analyst for the 2024 Harris-Walz campaign, I built reporting pipelines to track key metrics regarding volunteer recruitment and voter outreach. This included managing backend databases in BigQuery, creating SQL models in dbt, and building dashboards in Hex. These tools enabled our team to share key insights with leadership, driving strategic decision-making.

NYC buildings icon

Modeling Energy Efficiency

This was my capstone project. I collected data on thousands of buildings in New York City and built a model to predict each building's ENERGY STAR score from its size, primary use, location, and energy and water consumption. The ENERGY STAR score is a government-backed measure of a building's efficiency relative to a national benchmark for its building type.

Project Repo
China Study image

Community Health Study

The goal of this group project was to use data from The China Study to examine mortality rates in rural China and relate them to various predictors. We created a pipeline for a linear regression model that would allow us to select a mortality category and identify the lifestyle factors that had the greatest impact on it.
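The idea of picking one outcome and ranking the factors behind it can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes ordinary least squares on standardized columns, so that coefficient sizes are comparable, and the names `factor_a`, `factor_b`, `factor_c`, `outcome_x`, and `rank_predictors`, along with the synthetic data, are hypothetical stand-ins rather than anything from The China Study.

```python
import random
import statistics


def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(value - mean) / std for value in column]


def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rank_predictors(features, target):
    """Fit least squares on standardized data and rank predictors by |coefficient|."""
    names = list(features)
    cols = [standardize(features[name]) for name in names]
    y = standardize(target)
    # Normal equations (X^T X) beta = X^T y; no intercept is needed
    # because every column has been centered.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    return sorted(zip(names, beta), key=lambda pair: abs(pair[1]), reverse=True)


# Synthetic data (hypothetical): outcome_x depends strongly on factor_a,
# weakly and negatively on factor_b, and not at all on factor_c.
rng = random.Random(0)
n_rows = 200
features = {name: [rng.gauss(0, 1) for _ in range(n_rows)]
            for name in ("factor_a", "factor_b", "factor_c")}
targets = {
    "outcome_x": [3 * a - b + rng.gauss(0, 0.5)
                  for a, b in zip(features["factor_a"], features["factor_b"])],
}

selected = "outcome_x"  # the outcome category chosen for analysis
ranking = rank_predictors(features, targets[selected])
for name, coef in ranking:
    print(f"{name}: {coef:+.2f}")
```

Standardizing first is a design choice: without it, a coefficient's size reflects the units of its column as much as its influence on the outcome.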

Project Repo
Subreddit icons

NLP Subreddit Classification

The goal of this project was to create a Natural Language Processing (NLP) model that could classify the source of a Reddit post. I compared two subreddits, r/collapse and r/futurology, which both cover science and technology trends likely to shape the future, from climate change to AI. The two differ, however, in sentiment and optimism, and I wanted to see whether a model could pick up on that difference.
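A two-class text classifier of this kind can be sketched in a few lines. The model used in the project is not stated here, so this example assumes a bag-of-words multinomial Naive Bayes classifier with add-one smoothing as a stand-in; the example posts and the names `tokenize`, `train`, and `predict` are invented for illustration and are not taken from the project repo.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(posts):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in posts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training posts, labeled by the subreddit they imitate.
posts = [
    ("the climate crisis is accelerating and collapse feels inevitable", "collapse"),
    ("supply chains are failing and shortages are getting worse", "collapse"),
    ("new battery breakthrough promises cheaper clean energy", "futurology"),
    ("ai research could help cure diseases and improve lives", "futurology"),
]
model = train(posts)

print(predict(model, "a breakthrough in clean energy could improve lives"))
print(predict(model, "shortages are getting worse as the crisis is accelerating"))
```

A real comparison would train on thousands of posts and evaluate on a held-out set; four sentences only show the mechanics.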

Project Repo
Housing Price icon

Real-Estate Prices Regression

The goal of this project was to create a linear regression model that could predict the price of a home in Ames, Iowa. I used a well-known dataset containing over 80 features that describe and quantify different aspects of each home.

Project Repo

Education

General Assembly

Data Science Immersive Fellow

Oct 2022 - Feb 2023

500 hours of expert-led, hands-on instruction in data science, analytics, and machine learning


Massachusetts Institute of Technology

B.S. Mechanical Engineering

Aug 2012 - May 2017

Relevant Coursework: Global Engineering, Energy & Materials in Manufacturing, Design for Scale