Senior Incident Optimization Specialist – CPU, Memory, Unix, Linux, OS agents, Windows
- Job Req Id: 26931138
- Location(s): Chennai, Tamil Nadu, India
- Job Type: On-Site/Resident
- Posted: Mar. 17, 2026
Discover your future at Citi
Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.
Job Overview
Position Summary
The Senior Incident Optimization Specialist is a specialized technical leadership role requiring deep expertise in the technology foundations of core infrastructure services across platforms and devices such as CPU, Memory, Unix, Linux, OS agents, and Windows, which enable the enterprise's business and product applications to function.
This position is critical to the success of the Incident Reduction Program, providing delivery of solutions which optimize and automate operations workflows.
You will be responsible for building automated incident remediation workflows and achieving measurable incident reduction through intelligent alert optimization, correlation, and automation while preserving the critical observability required for business-critical mainframe applications and batch processing. This role offers the unique opportunity to modernize event management for legacy systems using cutting-edge AIOps platforms and automation technologies.
Key Responsibilities
- Incident & Alert Analysis: Conduct in-depth analysis of mainframe and batch processing alerts to identify chronic issues, reduce operational noise, and develop strategies to address high-volume incident generators, including recurring job failures.
- Intelligent Event Management: Design and implement domain-specific correlation, de-duplication, and suppression rules on AIOps and event management platforms. Develop logic that understands mainframe subsystem relationships and cascading batch job dependencies.
- Automation & Self-Healing: Architect and develop automation playbooks for incident data enrichment, automated job restarts, and self-healing capabilities for common mainframe and batch processing failures.
- Observability Enhancement: Assess monitoring gaps in mainframe and batch environments, proposing enhancements to ensure critical business processes have appropriate alerting coverage and align with enterprise standards.
- Cross-Functional Collaboration: Partner closely with mainframe operations, batch scheduling, and application development teams to validate correlation logic, define automation initiatives, and provide expert guidance on modern event management practices.
- Quality Assurance: Continuously validate the effectiveness of implemented rules and automation. Establish feedback loops with operational teams to conduct post-implementation reviews and iterative improvements.
Required Qualifications
- Education: Bachelor’s degree in Computer Science, Information Technology, Computer Engineering, or a related technical field.
- Experience: A minimum of 8 years of hands-on experience building and supporting the technology foundations of core infrastructure services across platforms and devices such as CPU, Memory, Unix, Linux, OS agents, and Windows.
- Event Management & Incident Reduction: Proven track record in event management, alert tuning, and incident reduction, with quantifiable results, within complex legacy environments spanning CPU, Memory, Unix, Linux, OS agents, and Windows platforms. Direct, hands-on experience with modern AIOps and event management platforms is required.
- Technical Expertise: Deep understanding of the technology foundations of core infrastructure services across platforms and devices such as CPU, Memory, Unix, Linux, OS agents, and Windows, which enable the enterprise's business and product applications to function.
- Automation & Scripting: Hands-on experience developing robust automation solutions using relevant scripting languages and modern automation frameworks.
- Data Analysis: Proficiency in log analysis, pattern recognition, and using query languages for data analysis on log aggregation platforms.
- Problem-Solving & Analytical Skills: Excellent analytical abilities with a systematic approach to troubleshooting complex
- Communication & Leadership: Exceptional communication skills with the ability to bridge Unix, Linux, Windows, and other technology teams, foster collaboration, and present technical concepts to diverse audiences.
Role Significance & Business Impact (Note)
This role is critical due to the scale and complexity of production data handled, requiring the Optimization Lead to examine and derive insights from approximately 4,000,000 (40 lakhs) operational data points annually across multiple technology sources.
At this scale:
• Manual or reactive analysis is not feasible
• Advanced data analytics and AI-driven frameworks become mandatory, not optional
• Accuracy in event correlation, prioritization, and response directly impacts system stability, customer experience, and operational cost
The Optimization Lead ensures that high-volume production data is transformed into actionable intelligence, enabling proactive issue identification, noise reduction, efficiency gains, and continuous improvement across production operations.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Infrastructure
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.