Data Engineer | Analytics | Cloud Solutions

Transforming data into actionable insights.

Data engineer with a passion for designing, building, and optimizing large-scale data pipelines on cloud platforms. Specialized in ETL/ELT orchestration, big data analytics, and machine learning infrastructure. Currently a Data Analyst Research Assistant at USC, driving innovation through data-driven solutions.


Location: Los Angeles, CA

Email: sv51119@usc.edu

Phone: (213) 913-1568


About

Certified Associate Cloud Engineer with 4+ years of experience in data engineering and analytics. Passionate about leveraging cloud platforms (GCP, AWS) to solve complex data challenges and drive business impact. Currently pursuing a Master of Science in Analytics at USC (GPA: 3.8).

Infrastructure savings

$10M / year

Cloud migration (Hadoop → Dataproc/BigQuery)

Real-time throughput

150M+ / day

Kafka + PySpark streaming pipelines

BigQuery cost reduction

38%

Partitioning + clustering optimization

Data quality

99%+

Validation frameworks (SQL checks and the Data Validation Tool, DVT)

Core Strengths

Capabilities across cloud architecture, pipeline engineering, governance, and ML infrastructure.

Big Data & Cloud Architecture

GCP (BigQuery, Dataproc, GCS, Vertex AI), AWS (S3, EC2), Snowflake, and modern data stack patterns.

Pipeline Orchestration & Automation

Airflow/Composer, CI/CD with Git, API-based ingestion, event-driven processing, and production monitoring.

Data Quality & Governance

Validation checks, DVT, lineage, and audit-ready reporting, with an emphasis on reliability and compliance.

Analytics Enablement

dbt modeling (star schemas), Looker/LookML metrics layers, and Power BI/Tableau dashboards for stakeholders.

Performance & Cost Optimization

Query tuning, partitioning/clustering, Spark parameter optimization, and infrastructure right-sizing.

ML Pipelines & Deployment

Feature engineering, training workflows, Vertex AI pipelines, and reproducible experimentation (RAG / embeddings).

Why Hire Me

A track record of measurable impact and scalable delivery.

Proven Impact

$10M+ annual infrastructure savings through cloud modernization
$1.2M annual savings enabled via governed metrics layer insights
$23k annual AWS cost reduction through EC2 optimization

Scalable Engineering

Built pipelines processing 150M+ daily records from 30+ sources
Migrated 1,500+ tables to BigQuery with zero downtime
Eliminated 16 hours/day of manual operations via orchestration automation


Contact

Open to data engineering and analytics opportunities in Los Angeles or remote.

Get in touch

The best way to reach me is by email. I typically respond within 24–48 hours.

Email: sv51119@usc.edu

Phone: (213) 913-1568

LinkedIn: linkedin.com/in/sanjana-venkatesh-24557a199

GitHub: github.com/sanjana-v

What I’m looking for

Data Engineering / Analytics Engineering roles (GCP, Snowflake, BigQuery)
Pipeline orchestration, data modeling, performance optimization
ML infrastructure / feature pipelines / streaming (Kafka, Spark)
Stakeholder-facing analytics and metrics layer governance