Job Description
As part of the Data and Technology Services practice, you will be responsible for ensuring the reliability, performance, and scalability of clients’ production systems, with a strong focus on troubleshooting, optimization, and operational excellence across AWS cloud infrastructure and big data environments.
Global technology megatrends, regulatory developments, the competitive landscape, and the rise of alternative data are changing the way capital markets operate. In Data and Technology Services operations, we partner with some of the largest financial services firms and corporations on this transformative journey. We are looking for professionals who are hands-on, passionate about the work, and ambitious to drive disruptive changes to global business models.
Key Responsibilities
• Act as a first responder to production incidents. Continuously monitor and troubleshoot live data pipelines and database systems for performance issues, failures, and anomalies.
• Ensure the Client Master remains the single source of truth for client data. Understand the enterprise-wide impact and dependencies of master data management (MDM) systems.
• Assess and manage data volume, system capacity, and interdependencies. Monitor data growth, project future storage and compute resource needs, and plan for horizontal scaling.
• Maintain and back up MDM schemas with an understanding of their unique constraints.
• Optimize performance and ensure high availability and reliability.
• Diagnose and resolve issues in Python-based ETL pipelines. Identify root causes such as memory bottlenecks, deadlocks, or DB connection timeouts. Analyze pipeline logic and dependencies to triage failures effectively.
• Understand AWS components involved in pipeline hosting (e.g., EC2, S3, CloudWatch, ELB, VPC).
• Support high-throughput environments with multiple data sources and sinks.
• Implement and manage high availability, disaster recovery, and replication strategies.
• Ensure data security, compliance, and access control using AWS-native tools and best practices.
• Participate in capacity planning, performance tuning, and incident resolution.
• Participate in post-incident reviews, update monitoring based on new learnings, and refine operational procedures.
Desired Experience & Qualifications
• 5+ years of experience as a DBA, including AWS production support.
• Strong expertise in working with production environments (first line of defense).
• Python triaging skills (debugging, root cause analysis, resolution).
• Proficient in assessing the enterprise-wide impact and dependencies of MDM systems.
• Experience implementing backup and restore strategies for all MDM and critical data schemas.