Financial Data Pipeline at JPMorgan

Distributed Spark pipelines for multi-terabyte datasets

Designed and deployed distributed Spark pipelines processing multi-terabyte financial datasets for real-time analytics. Built Python services and AWS workflows to support compliance, reporting, and internal AI-driven tools.

Company: JPMorgan Chase & Co.
Role: Software Engineer II
Duration: Jan 2022 - Aug 2025

Key Achievements

  • Scalable Data Pipelines: Built Apache Spark pipelines handling multi-terabyte financial datasets
  • Cloud Migration: Led migration of legacy systems to AWS infrastructure
  • Cost Optimization: Reduced operating costs by 40% through infrastructure improvements
  • System Reliability: Improved system uptime from 95% to 99.5%

Technologies

Apache Spark, Python, AWS, Databricks, Docker, Kubernetes

Recognition

  • Corporate Tech Champion Award (Apr 2025)
  • Early Career Recognition Award (Nov 2024)
  • Engineering Excellence Award (Aug 2023)
  • Innovation Hackathon Winner (Mar 2023)