Sanjana Venkatesh

Data Engineer | Analytics | Cloud Solutions

Transforming data into actionable insights.

Data engineer with a passion for designing, building, and optimizing large-scale data pipelines on cloud platforms. Specialized in ETL/ELT orchestration, big data analytics, and machine learning infrastructure. Currently a Data Analyst Research Assistant at USC, driving innovation through data-driven solutions.

Location: Los Angeles, CA

About

Certified Associate Cloud Engineer with 4+ years of experience in data engineering and analytics. Passionate about leveraging cloud platforms (GCP, AWS) to solve complex data challenges and drive business impact. Currently pursuing a Master of Science in Analytics at USC (GPA: 3.8).

  • Infrastructure savings: $10M / year (cloud migration, Hadoop → Dataproc/BigQuery)
  • Real-time throughput: 150M+ / day (Kafka + PySpark streaming pipelines)
  • BigQuery cost reduction: 38% (partitioning + clustering optimization)
  • Data quality: 99%+ (validation frameworks: SQL checks, DVT)

Core Strengths

Capabilities across cloud architecture, pipeline engineering, governance, and ML infrastructure.

Big Data & Cloud Architecture

GCP (BigQuery, Dataproc, GCS, Vertex AI), AWS (S3, EC2), Snowflake, and modern data stack patterns.

Pipeline Orchestration & Automation

Airflow/Composer, CI/CD with Git, API-based ingestion, event-driven processing, and production monitoring.

Data Quality & Governance

Validation checks, DVT, lineage, and audit-ready reporting, with an emphasis on reliability and compliance.

Analytics Enablement

dbt modeling (star schemas), Looker/LookML metrics layers, and Power BI/Tableau dashboards for stakeholders.

Performance & Cost Optimization

Query tuning, partitioning/clustering, Spark parameter optimization, and infrastructure right-sizing.

ML Pipelines & Deployment

Feature engineering, training workflows, Vertex AI pipelines, and reproducible experimentation (RAG / embeddings).

Why Hire Me

A track record of measurable impact and scalable delivery.

Proven Impact

  • $10M+ annual infrastructure savings through cloud modernization
  • $1.2M annual savings enabled via governed metrics layer insights
  • $23k annual AWS cost reduction through EC2 optimization

Scalable Engineering

  • Built pipelines processing 150M+ daily records from 30+ sources
  • Migrated 1,500+ tables to BigQuery with zero downtime
  • Eliminated 16 hours/day of manual operations via orchestration automation

Contact

Open to data engineering and analytics opportunities in Los Angeles or remote.

Get in touch

The best way to reach me is by email. I typically respond within 24–48 hours.

What I’m looking for

  • Data Engineering / Analytics Engineering roles (GCP, Snowflake, BigQuery)
  • Pipeline orchestration, data modeling, performance optimization
  • ML infrastructure / feature pipelines / streaming (Kafka, Spark)
  • Stakeholder-facing analytics and metrics layer governance