Chief Senior Manager Site Reliability Engineering -Application (Production Support) Navi Mumbai Navi-Mumbai

Chief / Senior Manager - Site Reliability Engineering - Application (Production Support), Navi Mumbai

Openings: 1

Job ID: 135813

Employment Type: Full Time

Reference:

Work Experience: 12 to 14 years

CTC Salary: 30.00 LPA to 32.00 LPA

Function: IT Software - Application Programming / Maintenance

Industry: Banking/Financial Services

Qualification: BCA/BCS - Computers

Location: Navi Mumbai

Posted On: 04th Jun, 2026

Job Description:

Be responsible for production support and release management for the assigned applications (SRE C1): Elastic Stack (ELK), Application Performance Management (APM), and Disaster Recovery (DR).
Should possess excellent troubleshooting and analytical skills.
This senior leadership role requires strong technical expertise, strategic thinking, and proven experience in managing mission-critical systems at scale.

Elastic Stack (ELK) Cluster Lead
- Architect, deploy, and optimize ELK clusters for enterprise observability.
- Ensure log ingestion, parsing, and visualization meet compliance and performance standards.
- Drive automation for scaling, resilience, and performance tuning.

Application Performance Management (APM) Cluster Lead
- Define and implement APM strategy across critical applications.
- Lead deployment and integration of APM tools (Dynatrace, AppDynamics, New Relic, Datadog, etc.).
- Establish KPIs, SLAs, and proactive monitoring frameworks to ensure application reliability.
- Design synthetic monitoring for critical business journeys and key metrics.

Disaster Recovery (DR) Oversight
- Own DR strategy, planning, and execution for enterprise applications.
- Conduct regular DR drills, audits, and compliance checks.
- Align DR processes with business continuity and regulatory requirements.
- Ensure robust replication between primary and secondary sites.
- Oversee daily backups.
- Ensure all disaster recovery processes and documentation meet the obligations mandated by the regulators.
- Provide comprehensive audit reports for DR/DC environments.
- Lead the command structure when a disruptive event occurs and direct the recovery teams, such as network, database, and application.
- Coordinate the dissemination of critical information to senior management and external stakeholders.
- Conduct thorough evaluation of incidents to determine failures and issues.

SRE Practices
- Champion SRE principles: reliability, scalability, automation, and continuous improvement.
- Monitor error budgets, SLIs, SLOs, and SLAs for critical systems.
- Drive incident management, root cause analysis, and long-term remediation.

Key Skills :

Company Profile

A leading Non-Banking --- Company (NBFC) that caters to the growing needs of an aspirational India, serving both individual and business clients. Incorporated

Apply Now

Interested candidates are requested to apply for this job.
Recruiters will evaluate your candidature and will get in touch with you.
