Principal Network Reliability Engineer at JPMorgan Chase
Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability. As a Principal Network Reliability Engineer at JPMorgan Chase within the Networking team, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure those NFRs are accounted for in your products’ design and test phases, that your service level indicators are effectively measuring customer experience, and that service level objectives are defined with stakeholders and implemented in production.
Job responsibilities:
Creates high quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
Provides advice and mentoring to other engineers and acts as a key resource for technologists seeking advice on technical and business-related issues
Demonstrates site reliability principles and practices every day and champions the adoption of site reliability throughout your team
Collaborates with others to create and implement observability and reliability designs for complex systems that are robust, stable, and do not incur additional toil or technical debt
Works toward becoming an expert on the applications and platforms in your remit while understanding their interdependencies and limitations
Evolves and debugs critical components of applications and platforms
Provides comprehensive and ongoing guidance, tools, and solutions to support the firm’s growth
Makes significant contributions to JPMorgan Chase’s site reliability community via internal forums, communities of practice, guilds, and conferences
Required qualifications, capabilities, and skills:
Formal training or certification on software engineering concepts and 8+ years of applied experience
Strong knowledge of algorithms, data structures, design patterns, and OOP
Proficiency in Python and popular frameworks for designing and implementing microservices-based architectures
Experience designing queues and events with message brokers such as Kafka, Redis, and RabbitMQ (Celery)
Experience designing and building high performing APIs
Good understanding of Elasticsearch and/or OpenSearch
Proficiency and experience in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
Hands-on practical experience delivering system design, application development, testing, and operational stability
Hands-on experience working on private cloud applications (Docker, Kubernetes, Pivotal Cloud Foundry)
Fair knowledge of ETL processes, the tools and infrastructure that support them, and the challenges involved
Experience managing logging, metrics, and traces in microservices-based applications (TICK and Prometheus stacks are a plus)
Experience adopting agile methodologies such as Scrum and Kanban in development teams
Experience adopting CI/CD practices and technologies for development projects
Ability to engage in coding, troubleshooting, and process automation, which make up 30% of work time
Ability to anticipate, identify, and troubleshoot defects found during testing
Preferred qualifications, capabilities, and skills:
Experience in SRE or DevOps roles
Understanding of and exposure to AWS Cloud Infrastructure is a plus
Prior experience with team-based development following a structured lifecycle
Prior experience with Agile Methodologies (Scrum, TDD)