Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.
As a Principal Network Reliability Engineer at JPMorgan Chase within the Networking team, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure those NFRs are accounted for in your products’ design and test phases, that your service level indicators are effectively measuring customer experience, and that service level objectives are defined with stakeholders and implemented in production.
Job responsibilities
Creates high-quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
Provides advice and mentoring to other engineers and acts as a key resource for technologists seeking advice on technical and business-related issues
Demonstrates site reliability principles and practices every day and champions the adoption of site reliability throughout your team
Collaborates with others to create and implement observability and reliability designs for complex systems that are robust, stable, and do not incur additional toil or technical debt
Works toward becoming an expert on the applications and platforms in your remit while understanding their interdependencies and limitations
Evolves and debugs critical components of applications and platforms
Provides comprehensive and ongoing guidance, tools, and solutions to support the firm’s growth
Makes significant contributions to JPMorgan Chase’s site reliability community via internal forums, communities of practice, guilds, and conferences
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 8+ years of applied experience
Strong knowledge of algorithms, data structures, design patterns, and OOP
At least 6 years of proficiency with Python and popular frameworks for designing and implementing microservices-based architectures; familiarity with best practices, design patterns, and TDD is a plus
Experience designing queue- and event-based systems using message brokers such as Kafka, Redis, and RabbitMQ (Celery)
Experience designing and building high-performing APIs. Desirable frameworks: FastAPI, Flask, Django
Experience with NoSQL database products; experience with graph databases is a plus
Good understanding of Elasticsearch and/or OpenSearch
Proficiency and experience in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
Hands-on practical experience delivering system design, application development, testing, and operational stability
Hands-on experience working on private cloud applications (Docker, Kubernetes, Pivotal Cloud Foundry)
Fair knowledge of ETL processes, the tools and infrastructure that support them, and the challenges involved
Experience managing logging, metrics, and traces in microservices-based applications (experience with the TICK and Prometheus stacks is a plus)
Experience adopting agile methodologies such as Scrum and Kanban in development teams
Experience adopting CI/CD practices and technologies for development projects, including managing multiple environments and designing branching strategies
Ability to engage in coding, troubleshooting, and process automation, which make up 30% of work time
Ability to anticipate, identify, and troubleshoot defects found during testing
Preferred qualifications, capabilities, and skills
Experience in SRE or DevOps roles
Understanding of and exposure to AWS Cloud Infrastructure is a plus
Prior experience with team-based development following a structured lifecycle
Prior experience with Agile Methodologies (Scrum, TDD)