
Hi, my name is
Vanshaj Gupta
Data Engineer | Software Engineer
Curiously captivated by the MYSTERIES hidden in data and systems.
I approach problems like puzzles - solving them piece by piece to build scalable solutions that drive meaningful business impact. My passions span AI, AUTOMATION, DATA ENGINEERING, and SOFTWARE DESIGN. Outside of tech, you'll find me solving Rubik's cubes, exploring entrepreneurship, following sports, or diving into music.
Let's connect if any of these resonate with you - I'm always up for a great conversation.

Experience
My professional journey in data engineering and analytics
Business and Data Analyst II
Arizona State University, Learning Enterprise
- Automated feedback data processing for 60+ courses by designing an ETL pipeline with Apache Airflow, Python, and Google Cloud to ingest API data into BigQuery, saving the design team 30+ hours per term
- Collaborated with 7+ clients to improve revenue reporting by developing an end-to-end Alteryx workflow, reducing invoice errors by 75% and manual reporting effort by 95%
- Led data architecture design by creating materialized views and data models in BigQuery using SQL, enabling 4+ Looker Studio dashboards that improved business intelligence reporting and decision-making
Data Engineer
Arizona State University, Learning Enterprise
- Performed data modeling on 300K+ Salesforce records in AWS Redshift using Airflow and Python, automating weekly pipelines that reduced operational cost and produced 4+ customized files to support team OKR reporting
- Enabled self-service analytics for 7+ cross-functional teams by building 12+ Tableau and Looker Studio dashboards with automated refresh schedules and interactive data visualizations, decreasing monthly data team requests by 65%
- Contributed to data architecture by migrating 8+ database tables from Star Schema to Relational Schema in AWS Redshift
- Improved data quality by developing Python and SQL validation scripts on AWS Redshift and PostgreSQL, reducing incorrect insights and system failures by 70%, while strengthening data governance through code reviews and schema documentation
- Accelerated monthly and quarterly reporting cycles by automating 30+ business intelligence reports across Salesforce, Excel, and Google Sheets, providing stakeholders with timely actionable insights to guide business decisions
Data Engineer
GCS Medical College, Hospital and Research Centre
- Enabled accurate healthcare analytics by automating ETL workflows with Apache Airflow and Python to ingest 500K+ electronic healthcare records into PostgreSQL, improving data quality by 35% for downstream reporting
- Supported targeted healthcare campaigns by developing demographic-based Power BI dashboards with SQL DirectQuery, increasing patient engagement by 25% and expanding reach across 10K+ patients
- Designed a financial analytics dashboard in Power BI for 100+ clients across 20+ services, monitoring revenue trends and equipping finance teams with insights for quarterly strategy reviews
Projects
Featured work and data projects


Cloud-Based Face Recognition System
End-to-end data pipeline processing real-time sales data from multiple sources. Implemented using AWS Lambda, Kinesis, and Redshift.



Skills
Tools and technologies I work with
Programming Languages
Data Visualization
Data Engineering
Databases and Query Tools
Cloud and DevOps
Certifications
Professional credentials and achievements

Alteryx Designer Core Certification
Alteryx
March 2025
Certified in Alteryx Designer with expertise in data preparation, blending, and analytics workflow automation to solve real-world business problems

Hands-On Essentials: Data Warehouse
Snowflake
April 2025
Completed Snowflake Hands-On Essentials workshop with practical experience in data warehousing fundamentals and cloud data platform operations
Let's Connect
I'm always interested in hearing about new opportunities, collaborations, or just having a chat about data. Feel free to reach out!
