Financial Data Pipeline at JPMorgan
Distributed Spark pipelines for multi-terabyte datasets
Designed and deployed distributed Spark pipelines processing multi-terabyte financial datasets for real-time analytics. Built Python services and AWS workflows to support compliance, reporting, and internal AI-driven tools.
Company: JPMorgan Chase & Co.
Role: Software Engineer II
Duration: Jan 2022 - Aug 2025
Key Achievements
- Scalable Data Pipelines: Built Apache Spark pipelines handling multi-terabyte financial datasets
- Cloud Migration: Led migration of legacy systems to AWS infrastructure
- Cost Optimization: Reduced operating costs by 40% through infrastructure improvements
- System Reliability: Improved system uptime from 95% to 99.5%
Technologies
Apache Spark, Python, AWS, Databricks, Docker, Kubernetes
Recognition
- Corporate Tech Champion Award (Apr 2025)
- Early Career Recognition Award (Nov 2024)
- Engineering Excellence Award (Aug 2023)
- Innovation Hackathon Winner (Mar 2023)