WHAT DOES AN INCIDENT MANAGER DO?

Published: Aug 27, 2025 - The Incident Manager owns and drives the Incident Management process by ensuring effective coordination, timely communication, and rapid service restoration while minimizing business impact. This role involves leading escalations, analyzing root causes, and facilitating post-incident reviews to implement corrective measures, reduce recurring issues, and enforce ITIL-aligned best practices. The Manager also collaborates with internal teams, external partners, and vendors to ensure adherence to SLAs, compliance requirements, and continuous service improvement.

A Review of Professional Skills and Functions for Incident Manager

1. Incident Manager Roles and Responsibilities

  • Assessment: Perform first-level assessment of technical and process deficiencies and follow up on investigations.
  • Leadership: Provide leadership to multiple support teams during major incidents to ensure immediate service restoration.
  • Process Management: Ensure proper lifecycle transition from Incident to Problem Management processes.
  • Issue Resolution: Resolve escalated matters and provide approvals.
  • Collaboration: Work closely with technical teams on incident details.
  • Trend Analysis: Perform incident trend analysis and systemic problem identification.
  • Framework Support: Support and promote the ITSM framework among teams and the wider IT organization.
  • Operations Management: Participate in Operations Management activities and special projects or initiatives.
  • Change Oversight: Approve and oversee emergency corrective actions and changes.
  • Root Cause Analysis: Participate in root cause analysis meetings.
  • Process Improvement: Contribute to the continual improvement of the Incident Management process.

2. Incident Manager Duties and Roles

  • Stakeholder Communication: Communicate to stakeholders about the start, updates, and resolution of incidents.
  • Coordination: Coordinate between IT teams and the business to resolve customer impact.
  • Impact Analysis: Coordinate among IT teams to determine the specific impact on customers.
  • War-Room Management: Chair virtual “war-rooms” with all relevant parties, sharing information via collaboration tools and voice conferencing.
  • Escalation: Escalate to Global Major Incident Management.
  • Guidance: Guide teams through root cause analysis and the definition of improvement actions.
  • Management Reporting: Create management overviews on process adherence and maturity across domains.
  • Problem Escalation: Recognize problem areas such as high incident recurrence or low process maturity and escalate appropriately.
  • Knowledge Management: Collect and organize information about IT services into a central knowledge base.

3. Incident Manager Responsibilities and Key Tasks

  • Incident Coordination: Coordinate and manage Escalated Incident process activities.
  • Reporting Support: Support Escalated Incident Management reporting, including KPIs and customer SLAs.
  • Process Execution: Drive standard execution of the Escalated Incident Management process.
  • Stakeholder Communication: Communicate with internal and customer stakeholders regarding status, resolution, and success criteria signoff.
  • Incident Recording: Record and classify received incidents and take immediate action to restore failed services in the shortest possible time.
  • Timeline Documentation: Record all timelines, activities, resources, and assets involved in the incident.
  • Record Association: Associate incidents with related records such as other incidents, changes, problems, knowledge articles, and known errors.
  • Root Cause Ownership: Own the Root Cause Analysis process and delegate information requests to technical resources involved in service restoration.
  • Action Planning: Create, define, and manage Technical Action Plans for escalated incidents.
  • Priority Focus: Focus work primarily on Sev1 incidents, Sev2 incidents, Problem Management, Root Cause reviews, and internal escalations.
  • Collaboration: Collaborate effectively across technical and support teams.

4. Incident Manager Roles and Details

  • Process Oversight: Oversee the incident management process and team members involved in the process.
  • Incident Prioritization: Prioritize incidents according to their urgency and impact on the business.
  • Documentation: Produce documents and maintain the internal knowledge base.
  • Policy Analysis: Analyze current policies and develop new procedures.
  • Performance Analysis: Analyze team performance, provide feedback, and implement process improvements.
  • Tool Analysis: Analyze the use of the ITSM tool (Zendesk) and coordinate changes to drive higher efficiency.
  • Team Coordination: Work closely with the Scrum Master and Quality Assurance teams to coordinate activities.
  • Requirement Definition: Define requirements for improvements in the knowledge base.
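
Prioritizing incidents by urgency and business impact is commonly done with a small lookup rule. The following is a minimal Python sketch of that idea, not a prescribed method: the three-level scale, the `Level` and `priority` names, and the P1 to P4 labels are all assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for both urgency and impact."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def priority(urgency: Level, impact: Level) -> str:
    """Map urgency and impact to a priority label (assumed P1-P4 scheme)."""
    score = urgency + impact  # ranges from 2 (most severe) to 6 (least severe)
    if score == 2:
        return "P1"
    if score == 3:
        return "P2"
    if score == 4:
        return "P3"
    return "P4"


# Hypothetical sample incidents: (description, urgency, impact).
incidents = [
    ("Payment service down", Level.HIGH, Level.HIGH),
    ("Slow report export", Level.LOW, Level.MEDIUM),
    ("Login errors for one region", Level.HIGH, Level.MEDIUM),
]

# Sort so the most severe incidents come first.
ranked = sorted(incidents, key=lambda i: priority(i[1], i[2]))
for name, urgency, impact in ranked:
    print(priority(urgency, impact), name)
```

In practice the thresholds would come from the organization's own incident criteria; the point of the sketch is only that a fixed, documented rule keeps prioritization consistent across incident managers.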

5. Incident Manager Key Accountabilities

  • Incident Handling: Handle and coordinate incidents with engineers.
  • Event Coordination: Coordinate events together with the monitoring team.
  • War Room Management: Establish war rooms for high-priority tickets.
  • Knowledge Coordination: Ensure coordination of the knowledge base.
  • Problem Management: Create and handle problem records.
  • Disaster Communication: Manage disaster communication and reporting.
  • Vendor Coordination: Coordinate with vendors on events, incidents, and problems.
  • Security Coordination: Coordinate security issues.
  • KPI Management: Manage statistics and KPIs.

6. Incident Manager General Responsibilities

  • Framework Management: Develop, maintain, and monitor conformance to the organization’s incident management framework, minimum standards, and associated tools.
  • Escalation Management: Act as the main point of contact for incident escalations and manage major incidents from inception through to resolution.
  • Stakeholder Engagement: Work with senior business stakeholders to ensure lessons are learned and improvements are implemented when issues occur.
  • Business Continuity: Support business continuity activities by ensuring plans are in place to recover critical services within recovery time objectives.
  • Post-Incident Analysis: Drive post-incident analysis to define root causes and ensure necessary corrective actions are implemented to prevent recurrence.
  • Committee Reporting: Support the creation of timely reporting for various committees, including operational and executive risk committees.
  • Service Protection: Play an integral role in ensuring critical services are protected and smoothly recovered when disruptions occur.
  • Project Collaboration: Collaborate with multiple project teams to understand potential impacts and develop robust incident response plans.
  • Management Reporting: Produce timely, informative management information suitable for presentation to the Executive Board.
  • Customer Remediation: Ensure the customer impact of incidents is fully understood and customer remediation is completed.

7. Incident Manager Role Purpose

  • Application Monitoring: Control and monitor the application landscape and systems in terms of end-to-end processing of business processes.
  • Procedure Development: Develop and implement procedures and tools for application monitoring.
  • Incident Management: Analyze, isolate, and communicate incidents, coordinate the search for and implementation of solutions, and initiate escalation mechanisms.
  • Process Optimization: Identify and eliminate weak points and optimize requirements in existing processes and systems.
  • Specialist Responsibility: Assume specialist responsibility and ensure knowledge transfer for individual applications.
  • On-Call Service: Perform periodic on-call service.
  • Emergency Support: Be involved in the emergency organization and projects of the division.
  • Project Participation: Participate in projects and programs of the division.

8. Incident Manager Essential Functions

  • Incident Management: Provide incident management for an assigned set of customers and services.
  • Service Monitoring: Monitor platform, network, application, and other service events.
  • Incident Evaluation: Evaluate and delegate existing incidents.
  • Action Prioritization: Prioritize and manage required actions.
  • Team Cooperation: Cooperate with the problem and change management teams.
  • Workflow Management: Track, re-direct, and escalate incident workflows.
  • Critical Resolution: Cover critical incidents and provide technical resolutions.
  • Tool Usage: Use service management tools.

9. Global Incident Manager Additional Details

  • Incident Response: Coordinate activities to respond to major incidents and serve as an escalation point.
  • Resource Direction: Direct support resources to where they are most required.
  • Incident Escalation: Escalate incidents when a lack of progress is identified.
  • Policy Management: Establish policies, systems, and procedures, and monitor compliance with them.
  • Trend Analysis: Analyze and report trends in incidents and contribute to the service improvement roadmap.
  • Relationship Management: Establish and maintain positive business relationships with customers, suppliers, and other stakeholders.
  • Process Measurement: Measure the effectiveness of ITSM processes.
  • Best Practice Application: Apply relevant standards and best practices (e.g., ITIL/COBIT).

10. Incident Manager Roles

  • Issue Management: Manage critical client issues by working closely with Engineering, clients, and partners.
  • Escalation Process: Define and document a detailed process for a 24x7 escalation practice as part of a new role at Blue Prism.
  • CRM Maintenance: Ensure the CRM system is updated with the latest activities and action plans.
  • Stakeholder Communication: Involve all stakeholders, maintaining clear communication with Sales, Engineering, Support Management, and customers.
  • Issue Resolution: Actively lead, drive, and resolve escalated issues.
  • Issue Review: Proactively review open issues to identify patterns that may lead to escalation.
  • Root Cause Analysis: Collaborate with colleagues to identify root causes of issues and communicate them to clients and Blue Prism management.
  • Post-Incident Review: Conduct post-incident reviews and report lessons learned to improve processes and customer experience.
  • Resource Coordination: Orchestrate and coordinate resources from Professional Services or other teams when on-site support is required.

11. Incident Manager Tasks

  • Customer Experience: Implement solutions to improve the customer experience.
  • Incident Participation: Participate actively in major incidents throughout their lifecycle to meet expected SLAs/SLOs through to resolution.
  • Incident Leadership: Provide leadership and technical guidance on major incident conference calls.
  • Proactive Approach: Adopt proactive approaches to eliminate problematic trends.
  • Post-Mortem Reporting: Produce and review post-mortem reports promptly.
  • Governance Facilitation: Facilitate governance meetings with various partners, including Help Desk, Network, and Field Services.
  • Team Coordination: Coordinate with internal teams, partners, and suppliers to establish communications and manage expectations.
  • Quality Control: Ensure quality control on problem and incident activities.
  • Account Management Support: Meet regularly with Customer Service Account Managers to present results and action plans related to problem management activities.
  • Business Communication: Provide communications to internal business groups and executives throughout the problem record lifecycle.
  • Service Improvement: Identify service improvement opportunities and analyze risk assessments.
  • Escalation Management: Act as the single point of contact for all customer escalations and service assurance-related issues.
  • On-Call Support: Be available for escalations outside of business hours on rotation.

12. Incident Manager Details and Accountabilities

  • Process Oversight: Oversee the incident management process and team members involved in resolving incidents.
  • Incident Response: Respond to reported service incidents, identify the cause, and initiate the incident management process.
  • Incident Prioritization: Prioritize incidents according to their urgency and impact on the business.
  • Documentation: Produce documentation around the incident management process, use of the service management tool, and other ad hoc documentation requests.
  • Engineer Collaboration: Collaborate with technical engineers to ensure that all protocols are diligently followed.
  • Incident Logging: Log all incidents and their resolutions in accordance with strict Service Level Agreements and ITIL principles.
  • Process Adjustment: Adjust the incident management process to ensure its effectiveness.
  • Management Communication: Communicate with upper management if major issues are found in the IT system.
  • Team Management: Manage incident team members by re-assigning workloads and re-scheduling non-urgent tasks.
  • Trend Analysis: Perform trend analysis of incident records to identify potential problems.
  • Knowledge Management: Create and update knowledge base articles.
  • Process Improvement: Identify and implement process improvements as directed by the management team.

13. Incident Manager Overview

  • Impact Assessment: Assess business impact and urgency, then declare major incidents or trigger business continuity procedures or disaster recovery invocation scripts as required.
  • Incident Resolution: Ensure that cross-domain incidents are resolved effectively, securing end-to-end Service Level Agreements and Service Management.
  • Escalation Point: Act as the escalation point where resolution ownership is disputed.
  • Process Guidance: Provide guidance and assistance to ensure a globally consistent approach to operational processes.
  • Communication Delivery: Deliver consistent communications within the scope of the process and services.
  • Reporting: Produce and deliver high-quality reports and communications.
  • Customer Advocacy: Ensure that the customer’s business interests are maintained above those of any specific domain or service owner.
  • Collaboration: Collaborate with various Service Delivery Owners, Incident Managers, and Regional/Global Change Managers.
  • Issue Resolution: Proactively identify operational issues and drive resolution by working directly with clients and service owners.
  • Subject Expertise: Act as the subject matter expert for Incident Management processes.
  • Process Support: Support the effective operation of the Major Incident Management process.

14. Incident Manager Job Description

  • Bridge Management: Manage major incident bridges where clients and third-party vendors are present.
  • Single Contact Point: Act as a single point of contact liaising between various technical teams and step in to manage the situation.
  • Status Communication: Communicate the status of issues and progress toward resolution periodically to both the client and internal management.
  • Escalation Management: Gauge situations during bridge calls and escalate as needed to engage additional resources or management, whether the issue is technical, communication-related, or ownership-related.
  • Conflict Management: Tactfully manage conflicts and help teams stay focused on issue resolution.
  • Action Documentation: Record, document, and share actions taken by personnel during bridge calls.
  • Post-Incident Analysis: Participate in post-incident analysis by providing inputs on successes and areas for improvement.
  • Trend Analysis: Analyze incident and request data to identify trends, follow up, reduce aging incidents, and identify process variances.
  • Team Coaching: Coach technical team members on defined processes.
  • Report Preparation: Prepare and present various reports internally and in client meetings.
  • Organizational Participation: Participate in organizational activities, including compliance drives, trainings, and audits.

15. Incident Manager Functions

  • Incident Facilitation: Drive incident management resolution by facilitating technical bridges throughout the event lifecycle.
  • Action Documentation: Document fix actions and provide detailed summaries to Client Delivery Representatives and CGI Executive Leadership.
  • Consistent Communication: Use internal tools and processes to communicate consistently with team members, technical support teams, client delivery, and CGI Executive Leadership.
  • Best Practice Management: Establish quality-focused best practices and processes for managing major incidents.
  • Operational Compliance: Manage operational compliance with terms and conditions through SLAs, KPIs, and other measures.
  • Problem Management: Identify problems, own results, and implement improvements.
  • Management Advising: Advise senior management during the incident lifecycle by performing notifications, ensuring process performance, identifying significant trends, and recommending actions.
  • Process Recommendations: Provide future recommendations for tool and process improvements.
  • Issue Anticipation: Anticipate potential issues, develop action plans to avoid them, and ensure proactive mitigation.
  • IT Value Delivery: Improve the value equation for IT capability delivery across the entire IT department.
  • Innovation: Contribute innovative ideas, propose improvements, and execute initiatives to drive the Infrastructure organization forward.

16. Incident Manager Accountabilities

  • Design Leadership: Lead design projects focused on Digital Experience Monitoring (DEM) to provide capabilities for monitoring and analyzing end-user experiences on mobile and web applications.
  • Agile Collaboration: Work closely with product and engineering teams to deliver a defined roadmap following agile principles.
  • User Research: Collaborate with the user research team to gather and synthesize qualitative and quantitative data from customers and partners.
  • Solution Delivery: Drive and iteratively deliver solutions that enable users to make quality decisions faster and with greater confidence.
  • Design Methods: Apply a variety of design methods to create concepts, storyboards, wireframes, mocks, and prototypes to explore and communicate design solutions.
  • Design Communication: Lead design reviews, communicate designs to leadership and engineering, and support roadmap and sprint planning.
  • UX Partnership: Partner with designers, researchers, product managers, and engineers to deliver compelling, people-centered UX solutions.
  • Design System Collaboration: Collaborate with the design technology team to influence and enhance the organization’s design system.

17. Incident Manager Job Summary

  • Problem Management: Coordinate and lead problem management activities and bridges, ensuring root causes and permanent corrective actions are identified for high-impact and high-visibility incidents.
  • Communication Facilitation: Facilitate communications across ITO service teams, escalating and communicating up the management chain.
  • Process Collaboration: Collaborate with subject matter experts to refine operating processes and procedures for more efficient service delivery and restoration.
  • Record Management: Ensure timely creation and updates of problem records with appropriate priority assigned.
  • Team Relationships: Maintain open communication and strong working relationships with all service teams participating in Problem Management, and help define ways to increase client satisfaction.
  • Ticket Review: Review problem ticket records for accuracy, generate reports, and facilitate Problem Reviews.
  • Trend Analysis: Develop trend analysis and prepare service improvement plans to address identified gaps.
  • Process Review: Meet with the Problem Process Owner to review, revise, and maintain Problem Management processes and procedures that reflect best practices.

18. Incident Manager Responsibilities

  • Escalation Management: Identify, contain, and resolve time-sensitive escalations.
  • Data Analysis: Perform analysis of data from a variety of sources to identify possible risk indicators.
  • Security Triage: Perform security incident triage, including determining scope, urgency, and potential impact, identifying the specific vulnerability, and making recommendations that enable expeditious remediation.
  • Trend Reporting: Perform escalation trend analysis and reporting.
  • Incident Handling: Perform real-time incident handling tasks to support worldwide operations.
  • Root Cause Analysis: Analyze escalations from various sources and determine possible root causes.
  • Incident Tracking: Track and document incidents from initial detection through final resolution.
  • Guidance Publication: Write and publish techniques, guidance, and reports on incident findings.
  • Threat Correlation: Coordinate with analysts to correlate threat assessment data.
  • Error Reporting: Write and publish Correction of Error Reports and after-action reviews.
  • Threat Monitoring: Monitor external data sources to maintain currency of threat conditions and determine which security issues may impact POE operations.
  • Executive Reporting: Present deep-dive documents and reports on policy and process gaps to senior executives.

19. Incident Manager Details

  • System Management: Manage the functionality and efficiency of a group of computers running on one or more operating systems.
  • System Security: Maintain the integrity and security of servers and systems.
  • Documentation: Maintain system documentation.
  • User Interaction: Interact with users and evaluate vendor products.
  • Installation and Recovery: Coordinate hardware and software installation and provide backup recovery.
  • Resource Monitoring: Monitor policies and standards for allocating computing resources.
  • System Testing: Participate in systems testing and document results.
  • User Training: Provide advice and training to end-users.
  • Technology Knowledge: Maintain current knowledge of relevant technologies.
  • Project Participation: Participate in special projects.

20. Incident Manager Duties

  • IMS Monitoring: Monitor IMS and report incidents with directives.
  • Ticket Management: Report, monitor, and update outage tickets and VIP tickets hourly.
  • COOP Monitoring: Report, monitor, and update COOP events and activities hourly.
  • Weather Monitoring: Report, monitor, and update weather events that impact services.
  • Impact Monitoring: Monitor AMs/DMs to determine possible impacts on the operations of services.
  • Ticket Cleanup: Monitor and clean up orphaned tickets.
  • Policy Compliance: Monitor and enforce compliance with the Incident Management Policy.
  • Policy Maintenance: Maintain the Incident Management Policy.
  • Situational Awareness: Work to improve situational awareness.
  • Aging Ticket Resolution: Monitor and facilitate the resolution of tickets older than 30 days.
  • Problem Management Support: Monitor and facilitate tasks for the Problem Management team.
  • Queue Control: Control, disseminate, and monitor ticket queues.
  • Service Communication: Draft service maintenance, outage, and restoral announcements.
  • After-Action Reporting: Facilitate the creation and submission of after-action reports for major incidents and outages.
  • Incident Coordination: Communicate and coordinate with Incident Managers and/or representatives from other contracts.
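
The aging-ticket duty above (tickets older than 30 days) can be supported by a simple filter over ticket records. The sketch below is a hedged Python illustration only: the ticket fields (`id`, `opened`, `status`), the sample data, and the `aging_tickets` helper are hypothetical and do not reflect any particular ticketing tool.

```python
from datetime import date, timedelta

AGING_THRESHOLD = timedelta(days=30)  # threshold taken from the duty above


def aging_tickets(tickets, today):
    """Return open tickets opened more than 30 days before `today`, oldest first."""
    old = [
        t for t in tickets
        if t["status"] == "open" and today - t["opened"] > AGING_THRESHOLD
    ]
    return sorted(old, key=lambda t: t["opened"])


# Hypothetical sample data; field names are assumptions for this sketch.
tickets = [
    {"id": "T-1", "opened": date(2025, 1, 2), "status": "open"},
    {"id": "T-2", "opened": date(2025, 2, 20), "status": "open"},
    {"id": "T-3", "opened": date(2025, 1, 10), "status": "closed"},
]

report_date = date(2025, 3, 1)
for t in aging_tickets(tickets, today=report_date):
    print(t["id"], (report_date - t["opened"]).days, "days old")
```

Passing the report date in explicitly, rather than reading the system clock inside the function, keeps the report reproducible and easy to check.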

21. Incident Manager Roles and Responsibilities

  • Incident Analysis: Analyze IT incidents and problems to proactively prevent the recurrence of incidents and issues.
  • Incident Reporting: Conduct thorough analysis and prepare Major Incident Reports (MIR) for every major incident after closure.
  • Procedure Updates: Update resolution procedures in the known issues section of the IT support runbook.
  • Problem Review: Conduct problem management review meetings with relevant members to identify triggers, causes, and preventative measures for major incidents.
  • Root Cause Analysis: Ensure causes of all major incidents are analyzed and root causes identified by coordinating with all parties in the problem management process.
  • Status Reporting: Provide periodic reports on the overall status of the Major Incident Management process.
  • Training Delivery: Conduct training and knowledge-sharing sessions for new and existing team members to minimize risk at all stages.
  • Documentation Development: Develop documentation for end-users, service desk support agents, and technical resources.
  • Team Coordination: Coordinate input from multiple cross-functional teams in real time and post-event.
  • Strategic Guidance: Provide strategic direction on problem management and incident management activities to drive enterprise-wide efficiencies.
  • Runbook Management: Drive runbook creation and improvements based on known issues and lessons learned from major incidents.
  • Crisis Response: Act quickly, pragmatically, and assertively under pressure to prioritize and resolve technical issues.
  • Decision Making: Initiate action and take responsibility for decisions to help IT achieve continuous availability.
  • Activity Planning: Plan, prioritize, and coordinate own and others’ activities effectively.
  • Process Documentation: Document processes and follow ITIL best practices.

22. Incident Manager Duties and Roles

  • Outage Restoration: Lead the restoration of all major network outages impacting business customers.
  • Incident Resolution: Own and manage the end-to-end resolution of all major and escalated priority incidents.
  • Customer Experience: Deliver exceptional customer experience through effective collaboration with internal and external departments, divisions, and customers.
  • Customer Satisfaction: Achieve required levels of customer satisfaction and advocacy.
  • Product Knowledge: Maintain a solid understanding of the products and services offered to the business customer base.
  • Root Cause Investigation: Assist with root cause investigation and drive ongoing improvements in incident management.
  • Escalation Management: Ensure timely resolution of escalated incidents for priority and sensitive business customers.
  • Incident Documentation: Provide post-incident documentation for high-priority incidents that meet defined criteria.
  • Outage Communication: Communicate potential major outages affecting business customers across the wider community.
  • Stakeholder Collaboration: Collaborate with all stakeholders to drive improvement, remedial, and preventative actions.
  • Incident Notifications: Produce and distribute Major Incident Notifications.
  • Conference Leadership: Organize and lead technical incident conference bridges.
  • Incident Triaging: Perform triaging activities to validate that major incident criteria have been met.
  • Delay Assessment: Assess the impact of resolver group acceptance delays and take appropriate action.
  • Technology Awareness: Stay updated on technology advancements and ITSM best practices.
  • Standards Compliance: Ensure adherence to industry standards and benchmarks.
  • Performance Reporting: Establish measures and reporting that reflect business, user, and IT support organization requirements.

23. Senior Incident Manager Responsibilities and Key Tasks

  • Process Ownership: Assume ownership of the Incident Management Process and ensure its adoption throughout the organization.
  • Process Development: Lead process development and refinement initiatives, socialize and train colleagues on the process, and fulfill internal audit obligations.
  • Data Custodianship: Act as the custodian of all incident-related data, ensuring accurate records to aid recovery efforts and support historical analysis/management information.
  • Playbook Management: Develop and maintain incident playbooks for common or anticipated scenarios.
  • Incident Response: Respond to reported incidents from multiple channels, including voice, email, and ITSM/alerting toolsets.
  • Impact Assessment: Ensure the impact of incidents is understood swiftly, with correct prioritization, categorization, and assignment to appropriate resolution groups.
  • Lifecycle Management: Manage all P1–P4 incidents throughout their lifecycle, ensuring progression and recovery in line with SLA/OLA commitments.
  • Major Incident Ownership: Take direct ownership of recovery efforts for Major Incidents (P1/P2) as Major Incident Manager (MIM) when on-call.
  • Incident Reporting: Create internal and external incident reports for all P1 Major Incidents.
  • Internal Communication: Issue internal communications related to incidents.
  • Crisis Escalation: Escalate incidents into the Crisis Management function/team in line with documented processes.
  • Monitoring Enhancement: Work with monitoring and operations teams to enhance monitoring capabilities and triage processes, aiming to eliminate or mitigate the impact of future incidents.
  • Performance Dashboards: Create and distribute Incident Performance Dashboards, including volume, cause, method of identification, recovery time, and impact.
  • Service Improvement: Contribute to the Continuous Service Improvement (CSI) framework to enhance service quality, client experience, and support service delivery.
  • Problem Collaboration: Collaborate closely with the Problem Management function to ensure problem resolution efforts are appropriately prioritized.
  • Policy Compliance: Perform all tasks and activities in accordance with organizational policies, procedures, and contractual commitments.
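
The dashboard measures named above (volume, cause, method of identification, and recovery time) can be derived from incident records with a short aggregation. This is a minimal Python sketch under stated assumptions: the record fields (`cause`, `detected_by`, `start`, `restored`), the sample data, and the `summarize` function are hypothetical, and real figures would come from the ITSM tool in use.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names are assumptions for this sketch.
incidents = [
    {"cause": "change", "detected_by": "monitoring",
     "start": datetime(2025, 3, 1, 9, 0), "restored": datetime(2025, 3, 1, 10, 30)},
    {"cause": "hardware", "detected_by": "customer",
     "start": datetime(2025, 3, 4, 14, 0), "restored": datetime(2025, 3, 4, 14, 30)},
    {"cause": "change", "detected_by": "monitoring",
     "start": datetime(2025, 3, 9, 22, 0), "restored": datetime(2025, 3, 9, 23, 0)},
]


def summarize(records):
    """Aggregate incident records into simple dashboard figures.

    Expects at least one record; mean() raises StatisticsError on empty input.
    """
    minutes = [(r["restored"] - r["start"]).total_seconds() / 60 for r in records]
    return {
        "volume": len(records),
        "by_cause": dict(Counter(r["cause"] for r in records)),
        "by_detection": dict(Counter(r["detected_by"] for r in records)),
        "mean_recovery_minutes": mean(minutes),
    }


print(summarize(incidents))
```

The mean is used here for brevity; a median or percentile may describe recovery time better when a few long outages skew the data.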

24. Incident Manager Roles and Details

  • Responsibility Delegation: Delegate management responsibilities to IM Supervisors, Senior and Lead Analysts while retaining ultimate responsibility.
  • Team Supervision: Supervise the team and coordinate resources effectively to ensure service levels are achieved.
  • Team Training: Train new team members and provide additional training and coaching.
  • Escalation Handling: Handle escalations from internal and external customers as well as team members, engaging the Director.
  • Training Materials: Create and maintain training materials and documentation for IM tools and processes.
  • Tool Administration: Act as an administrator of Service Desk, SharePoint, and other IM tools.
  • Meeting Management: Organize and lead team meetings.
  • Report Preparation: Prepare weekly and monthly reports and present them to management.
  • Efficiency Improvement: Identify opportunities for efficiency and development within the team and among individuals.
  • Staffing Management: Maintain proper staffing and shift coverage for IM based on business requirements.
  • Recruitment Support: Assist with recruiting, screening, and hiring new IM Analysts according to open positions.
  • Administrative Support: Perform administrative tasks to maintain smooth team operations and carry out other duties as required and assigned by the Director.

25. Incident Manager Key Accountabilities

  • Incident Response: Respond to internally identified and customer-reported system incidents, coordinating incident resolution rapidly and within defined service levels.
  • Incident Prioritization: Prioritize incidents according to their urgency and impact on the business.
  • Protocol Compliance: Collaborate with teams to ensure that all protocols are diligently followed.
  • Incident Documentation: Document the impact of incidents, root causes, and corrective actions.
  • Customer Communication: Communicate incident summary reports to customers and attend customer calls.
  • Team Collaboration: Work with internal and third-party teams to ensure actions are taken and completed to protect and improve services.
  • Management Reporting: Complete management reports related to incident trends, quality trends, and ticket trends.
  • Trend Analysis: Perform analysis and reporting of incident trend data to identify and eliminate root causes.
  • Service Improvement: Drive continual improvement of capabilities and service delivery across teams, ensuring cohesive and high-quality service in collaboration with other enterprise teams.
  • RCA Support: Establish criteria to capture incidents for trending and RCA, supporting proactive improvements to operational stability.
  • Governance Reviews: Establish and run periodic governance reviews (monthly, quarterly) to update stakeholders and secure their support on the progress of incident actions and stability improvements.
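
The prioritization duty in this list is often implemented as an impact/urgency matrix in ITIL-aligned practice. The sketch below is one possible form, not a method stated in the source: the three levels per dimension, the P1 to P5 labels, and the names LEVELS, priority, and sort_by_priority are all illustrative assumptions.

```python
# Hypothetical impact/urgency matrix. The level names and the P1-P5 scale
# are assumptions for illustration; real organizations define their own.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> str:
    """Map impact and urgency to a priority label (P1 is most critical)."""
    score = LEVELS[impact.lower()] + LEVELS[urgency.lower()] - 1
    return f"P{score}"


def sort_by_priority(incidents):
    """Return incidents ordered from most to least critical."""
    return sorted(incidents, key=lambda i: priority(i["impact"], i["urgency"]))
```

With this mapping, an incident with high impact and high urgency is rated P1, and one with low impact and low urgency is rated P5.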

26. Incident Manager General Responsibilities

  • Process Accountability: Take overall accountability for the execution of the Incident process and procedures for the assigned line of business.
  • Incident Validation: Validate the classification of an incident against established Incident Criteria.
  • Notification Management: Manage and perform all internal notifications, executive alerts, and escalation activities through service recovery of an incident according to the defined Incident Notification Timeframe.
  • Team Assembly: Assemble an Incident Team consisting of technical support personnel, management, and key stakeholders to develop, execute, monitor, and track an integrated resolution plan through service recovery.
  • Recovery Decisions: Make service restoration and recovery decisions, engaging the Delivery Centre management team.
  • Record Documentation: Ensure that the progress of incident recovery and all relevant times are documented in the associated Incident Records.
  • Bridge Facilitation: Initiate and facilitate incident bridge meetings and communications.
  • Crisis Engagement: Engage the continuity provider in the event of a Data Centre crisis or an exceptionally severe single-customer outage.
  • Resolution Planning: Consolidate an integrated resolution plan when multiple competencies or domains are engaged.
  • Status Updates: Obtain and provide status updates on incident recovery progress.
  • Service Confirmation: Confirm with the customer that the service has been restored to their satisfaction according to the Resolve and Close Incident policy and Incident Exit Criteria.

27. Incident Manager Role Purpose

  • Cybersecurity Response: Manage the response to cybersecurity incidents across the globe, taking responsibility for the timely mitigation of cyber threats and minimizing further risks to information assets and services.
  • Incident Coordination: Coordinate the actions of multiple business units during the response to cybersecurity incidents.
  • Stakeholder Communication: Provide timely and relevant updates to appropriate stakeholders and decision makers during incidents.
  • Post-Incident Review: Manage the completion of post-incident reviews, assessing the effectiveness of controls, detection, and response capabilities, and supporting required improvements with responsible owners.
  • Relationship Management: Cultivate close working relationships with regional Cybersecurity Leads, Business Information Risk Officers (BIROs), and Risk Managers to support and deliver remediation of security incidents.
  • Technology Awareness: Maintain strong awareness of technology trends and industry best practices to provide informed advice and guidance to business functions and IT teams.
  • Process Development: Support the development and maintenance of detailed processes and procedures for consistent management of cybersecurity incident responses.
  • Platform Enhancement: Support the continued technical enhancement of security platforms.
  • Capability Evolution: Evolve incident management and response capabilities and processes, including automation and orchestration.
  • Continuous Improvement: Promote a “self-critical” culture by effectively identifying weaknesses in people, processes, and technology, and ensuring they are addressed.
  • Knowledge Development: Foster a culture of individual self-improvement, expecting staff to maintain subject matter expertise within their focus area and broader cybersecurity.
  • Awareness Engagement: Support engagement of global business units and functions to uplift cybersecurity awareness and communicate organizational cybersecurity efforts.
  • Management Reporting: Produce management information related to the CSIRT mission, tailored to target audiences, supported by data and experienced analysis to enable informed decision-making.

28. Senior Incident Manager Essential Functions

  • Technical Leadership: Technically lead all aspects of critical incidents (S1–S3) by determining SMEs needed, identifying problems, and releasing or de-escalating after diagnosis while meeting SLAs.
  • Service Restoration: Focus on the fastest service restoration and recovery by managing bridges, communication channels, and sync points for the technical sub-teams leading investigations, including third-party vendors and internal engineering teams.
  • Process Integrity: Ensure the quality and integrity of the Major Incident Management process by interfacing with Service Delivery Managers, Support teams, and Development and Engineering teams.
  • Troubleshooting Guidance: Provide recommendations on troubleshooting and technology improvements to quickly resolve incidents, ensuring infrastructure and application stability.
  • Cross-Team Partnership: Partner with Support, Development, and Engineering teams to resolve complex or unique system issues beyond the scope of team members.
  • Technical Support: Provide technical support to team members to facilitate the resolution or escalation of technical issues.
  • Failure Point Analysis: Identify failure points impacting availability and accelerate mean-time-to-repair by addressing architecture, design, process improvements, software disciplines, and testing.
  • Stakeholder Interaction: Interact frequently with stakeholders across the organization to prioritize backlog items related to availability.

29. Incident Manager Additional Details

  • Process Improvement: Identify opportunities and take ownership of automation and continuous improvement of the Incident Management process and best practices.
  • Tool Feedback: Provide feedback and drive improvements with current tools and processes by directing initiatives to the appropriate group for proactive design changes, implementation, or business risk assessment of incident causal factors.
  • Program Backlog: Build and manage the Incident Management program backlog with mechanisms to distribute and track work across team members and partner teams.
  • Review Leadership: Lead Incident Review forums to ensure closed-loop ownership of actions and delivery of outcomes that prevent future incidents.
  • Best Practice Sharing: Develop mechanisms across Incident Management functions (Azure) to share and adopt best practices that improve execution during incidents.
  • KPI Improvement: Improve KPIs for measuring Incident Management phases by removing manual data entry points.
  • Executive Communication: Communicate proactively with Microsoft executive leadership, managers, engineering groups, and key stakeholders on active major incidents or crises.
  • Incident Triage: Perform incident triage when escalated, determining scope, urgency, and potential impact, identifying vulnerabilities, and making swift remediation recommendations.
  • Post-Incident Analysis: Drive deep-dive post-incident analysis of customer-impacting issues with Senior Incident Managers and respective teams, focusing on reducing the likelihood of recurrence.
  • Recovery Exercises: Participate in recovery implementation and testing exercises using scenario-based use cases to raise impact awareness and ensure remediation.
  • Timeline Documentation: Capture and record all incident timelines, data, and restoration efforts for handoff to Problem Management.
  • Cross-Team Resolution: Identify, explore, and drive cross-team efforts to proactively resolve issues that could impact customers.

30. Incident Manager Roles

  • Process Ownership: Own, execute, and drive the Incident Management processes by applying strong time management, organization, and planning skills to enable effective coordination and communication, facilitating restoration of agreed services to the business as quickly as possible while minimizing service impact.
  • Decision Leadership: Lead and manage the implementation of innovative and tactical decisions not covered by established procedures to ensure effective service restoration within the context of risk/benefit analysis.
  • Escalation Management: Own and execute escalations in line with Incident Management SLA thresholds with internal organizations, external partners, and suppliers.
  • Best Practice Adoption: Assist in driving Service Management best practices and ITIL process standardization.
  • Process Compliance: Ensure that Incident, Problem Management, and Vendor Management processes adhere to all applicable regulatory, security, and internal audit requirements.
  • Incident Review: Review all incidents across priorities to identify root causes, document accurate technical and business impact statements, define corrective measures, and ensure timely communications to key stakeholders internally and externally.
  • PIR Hosting: Host and participate in Post Incident Review (PIR) meetings with key participants and accountable parties to ensure organizational focus, identify root causes, and deliver eradication actions with the right ownership.
  • Problem Management: Create and progress new problem tickets for recurrent service issues as per the problem management process, ensuring timely closure.
  • Culture Building: Drive a culture focused on reducing repeat incidents and failed changes.

31. Incident Manager Tasks

  • Incident Command: Run and facilitate the incident response as the Incident Commander.
  • Incident Response: Respond to incidents as they occur within the assigned region.
  • Incident Documentation: Create and maintain incident documentation and data.
  • Logistics Coordination: Coordinate incident response logistics.
  • Communication Management: Ensure communications are detailed and timely for the correct audience.
  • Incident Reporting: Draft incident reports and assign them to appropriate parties for root cause analysis.
  • Retrospective Facilitation: Facilitate incident retrospective forums, including global audiences.
  • Remediation Management: Manage the remediation item process, including identification, ticket creation, tracking, follow-up, and reporting.
  • Prioritization Facilitation: Facilitate prioritization of incident remediation items to ensure identified open issues are resolved.
  • Project Support: Work consistently to keep related remediation projects moving forward.
  • Investigation Support: Support ongoing investigations and escalate high-risk discoveries.
  • RCA Management: Manage the root cause analysis process and provide details for reporting.

32. Incident Manager Details and Accountabilities

  • Incident Documentation: Ensure all incident details are accurate and fully documented across incident reports, tickets, and systems.
  • Threat Analysis: Apply functional knowledge of security monitoring, threat analysis, and trend analysis.
  • Threat Containment: Drive activities to contain immediate threats to Zendesk systems and data.
  • Process Management: Develop, manage, and maintain Zendesk’s security incident response processes and documentation.
  • Risk Remediation: Facilitate long-term risk remediation for Zendesk systems and data.
  • On-Call Support: Participate in an on-call rotation with other Incident Managers.
  • Performance Monitoring: Review operational metrics and drive team performance.
  • Reporting Contribution: Contribute to weekly reporting to ensure transparency across various Zendesk audiences.
  • Process Refinement: Refine and improve incident processes to drive the team toward SLO goals and ensure a consistent customer experience.
  • Risk Assessment: Assist in risk assessment activities.
  • Team Support: Support and back up other Incident Managers.
  • Continuous Improvement: Contribute to continuous improvement project work.

33. Incident Manager Overview

  • Incident Facilitation: Act as a facilitator during major incidents, crises, and other broadly impacting events.
  • Stakeholder Management: Engage key stakeholders and participants in major incident activity, managing their workload and priorities, to quickly obtain business impact assessments from service or application owners and identify mitigation plans.
  • Status Communication: Interact directly with key stakeholders to proactively communicate status on active major incidents or crises.
  • RCA Facilitation: Facilitate industry-standard Root Cause Analysis (RCA) exercises across all major incident and crisis stakeholders to initiate the problem management cycle.
  • Process Coordination: Coordinate and manage major incident management process activities.
  • Risk Escalation: Escalate risks and issues to the Major Incident Management Process Owner.
  • Reporting Support: Support major incident management reporting, including KPIs and customer SLAs.
  • Best Practice Support: Assist Major Incident Management Process Owners in driving service management best practices and ITIL process standardization.
  • Process Consistency: Ensure consistent end-to-end application of the major incident management process across the account.
  • Process Improvement: Support the identification and planning of major incident management process improvement projects.

34. Incident Manager Job Description

  • Process Implementation: Drive the implementation of the standard execution of the major incident management process.
  • Post-Mortem Reporting: Develop and deliver post-mortem reports for distribution to executive audiences.
  • Customer Engagement: Agree on issue definition, action plans, and success criteria with customers during emergencies.
  • Incident Recording: Record and classify received incidents, undertaking immediate efforts to restore failed data center services as quickly as possible.
  • Record Association: Associate incidents with other records such as related incidents, changes, problems, knowledge articles, and known errors.
  • Resolution Verification: Verify resolution with users and resolve incidents in ticketing tools.
  • Lifecycle Ownership: Own all incidents and service requests throughout their lifecycle.
  • Troubleshooting Documentation: Document troubleshooting steps and service restoration details.
  • Knowledge Management: Create and submit knowledge articles.
  • Liaison Role: Act as liaison between the data center and requestors.

35. Incident Manager Functions

  • Shift Operations: Work across multiple shifts in a 24x7x365 operational team to drive the efficiency and effectiveness of the incident management process.
  • Management Reporting: Produce management information, including KPIs and reports.
  • Network Updates: Provide weekly Network Health updates.
  • MTTR Reporting: Deliver weekly MTTR reports for Tier 2 and 3.
  • Incident Summaries: Prepare weekly summaries of severity 1 incidents with causes and resolutions.
  • Property Summaries: Prepare weekly summaries of troublesome properties with high call/ticket volumes.
  • Tenant Summaries: Prepare weekly summaries of tenant-related dispatches with causes and resolutions.
  • Property Reviews: Conduct monthly reviews of properties with a tickets-to-rooms ratio greater than 25%.
  • Process Monitoring: Monitor the effectiveness of incident management and recommend improvements.
  • Process Management: Drive, develop, manage, audit, and maintain the major incident process, procedures, and technology.
  • Time Minimization: Minimize resolution time.
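
The weekly MTTR reports and the monthly review of properties above the 25% tickets-to-rooms threshold both reduce to simple calculations. The sketch below shows one way to compute them. The source does not define MTTR or any data format, so treating MTTR as the mean of (resolved minus opened) and the field names opened, resolved, name, tickets, and rooms are assumptions made for illustration.

```python
from datetime import datetime
from statistics import mean


def mttr_hours(incidents):
    """Mean time to restore, in hours, over incidents that have been resolved.

    Assumes each incident is a dict with 'opened' and 'resolved' datetimes
    (hypothetical field names). Returns None when nothing is resolved yet.
    """
    durations = [
        (i["resolved"] - i["opened"]).total_seconds() / 3600
        for i in incidents
        if i.get("resolved") is not None
    ]
    return mean(durations) if durations else None


def properties_over_ratio(properties, threshold=0.25):
    """Names of properties whose tickets-to-rooms ratio exceeds the threshold."""
    return [
        p["name"]
        for p in properties
        if p["rooms"] > 0 and p["tickets"] / p["rooms"] > threshold
    ]
```

Open incidents are excluded from the mean rather than counted up to the current time; a report that should reflect still-open incidents would need a different convention.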

36. Incident Manager Accountabilities

  • Leadership Partnership: Partner with the leadership team to ensure all teams follow the incident management process for every event.
  • Decision Making: Make decisive, well-informed decisions to minimize incident resolution time.
  • Incident Prioritization: Prioritize and manage multiple incidents simultaneously.
  • Status Communication: Ensure timely and accurate communications are provided to upper management, advising on the current status and client impact.
  • Issue Resolution: Identify known issues and similar past incidents and collaborate with product support to resolve them.
  • Staff Training: Train and share knowledge in areas of expertise with junior staff members.
  • Incident Documentation: Ensure incidents are fully documented during and after the incident, including gathering and recording the full incident timeline.
  • Knowledge Management: Document symptoms and resolutions in the knowledge base.
  • Root Cause Reporting: Compile and distribute root cause analyses.
  • Process Compliance: Ensure adherence to all processes and procedures in compliance with policy.
  • Best Practice Support: Assist teams in maintaining industry best practices.
  • Service Improvement: Identify opportunities for service improvement and configuration standards.
  • MTTR Reduction: Minimize MTTR.

37. Incident Manager Job Summary

  • Single Point of Contact: Act as the Single Point of Contact for Incident Management.
  • Stakeholder Liaison: Liaise with business and IT stakeholders, both internal and external, during the Incident Management process.
  • Status Reporting: Participate in daily production support status calls and governance meetings, providing updates on demand.
  • Service Operations: Manage service operations in line with Incident Management best practices.
  • Bridge Management: Establish teleconference bridges for major incidents, involve relevant stakeholders, and chair discussions through to successful closure.
  • Notification Management: Initiate all incident-related notifications, such as ETR broadcasts, in a standard format.
  • Review Facilitation: Conduct Major Incident review meetings.
  • Escalation Validation: Periodically validate escalation mechanisms to ensure effectiveness.
  • Alert Configuration: Ensure notification methods are configured in the system to alert Level 2 teams in case of functional escalations.
  • Vendor Tracking: Track incidents escalated to third-party vendors through to closure.
  • Team Training: Ensure L1.5 application support teams and service desk personnel are trained on tickets that can be shifted left.

38. Incident Manager Responsibilities

  • RCA Coordination: Liaise with Problem Management teams to initiate RCA and implement permanent fixes.
  • Change Advisory: Actively participate in Change Advisory Board meetings, providing inputs for change impact analysis.
  • Emergency Fix Support: Work closely with Change and Release Management teams during emergency fixes, providing input for problem simulation.
  • Business Continuity: Participate and provide key inputs when invoking Business Continuity options during major outages.
  • Error Analysis: Analyze Known Error records with delivery teams to identify candidates for permanent fixes and tickets that can be resolved at Level 1 or through self-service.
  • SLA/OLA Input: Provide inputs to SLA/OLA baseline exercises specific to Incident Management.
  • Activity Coordination: Coordinate activities between multiple support groups to ensure adherence to SLAs and OLAs.
  • Process Adherence: Ensure consistent process adherence across all delivery teams.
  • SLA Monitoring: Monitor and ensure Incident and Availability SLAs are consistently met.
  • Metric Analysis: Analyze metrics data to identify service bottlenecks and provide inputs to the Service Improvement Plan.
  • Productivity Analysis: Perform periodic productivity analyses to drive operational efficiencies.

39. Incident Manager Details

  • Customer Commitments: Understand and meet customer agreements, including SLOs, SLAs, and other commitments.
  • Process Participation: Follow and actively participate in the improvement of established team processes.
  • Cost Reduction: Identify and adhere to cost reduction measures through continuous improvement and innovation.
  • ITIL Application: Apply knowledge of the ITIL framework, including Event, Incident, Change, and Problem Management.
  • Team Collaboration: Collaborate effectively with peers and multi-functional teams.
  • Knowledge Maintenance: Maintain the knowledge required to perform the role effectively.
  • Innovation Support: Actively share and develop innovation and automation to support continued improvement.
  • Incident Coverage: Provide management coverage and guidance on all P1, P2, and other high-visibility incidents.
  • Timely Notification: Notify the management team through e-page or appropriate mailer with incident numbers and bridge information within the agreed service level objectives.
  • Executive Updates: Provide internal and external executive-level updates to all stakeholders.
  • Troubleshooting Leadership: Ensure the incident team has an active voice and is driving troubleshooting efforts.
  • Task Management: Assign tasks and track follow-up actions to ensure accountability.
  • Resource Engagement: Engage additional resources to support incident resolution.
  • Unified Messaging: Collaborate with cross-functional teams, including Professional Services and Technical Support, to ensure unified messaging to customers.

40. Incident Manager Duties

  • RCA Development: Assist with the development and delivery of RCAs through collaboration with cross-functional teams.
  • Issue Escalation: Proactively identify and escalate issues to Problem Management.
  • Problem Solving: Work on complex problems requiring in-depth evaluation of multiple factors.
  • On-Call Support: Participate in an on-call rotation with other Incident Managers.
  • Operations Support: Assist Operations Managers with daily management tasks.
  • Performance Monitoring: Review operational metrics and drive team performance.
  • Process Improvement: Refine and improve incident processes to align the team with SLO goals and ensure a consistent customer experience.
  • Ticket Review: Participate in incident ticket reviews to provide ongoing feedback to the Incident Team.
  • Team Mentorship: Mentor and coach members of the Incident Team.
  • Team Support: Support and back up other Incident Managers.
  • Customer Relationships: Maintain productive and positive customer relationships.
  • Customer Onboarding: Assist with new customer onboarding to establish effective processes.
  • Service Reviews: Participate in the development and delivery of regular service reviews.
  • Escalation Management: Serve as an escalation point for both technical and political customer escalations.
  • Customer Engagement: Travel occasionally to attend customer meetings.

41. Incident Manager Roles and Responsibilities

  • Reactive Support: Deliver end-to-end reactive support to ensure positive outcomes, including healthy reactive case progression across all severities and effective management of Critical/Crisis Situations.
  • Escalation Handling: Align resources directly to customer accounts and Technical Account Managers (TAMs) to own escalation handling, reporting, trending, and customer communications, including non-account-aligned reactive support.
  • Customer Insights: Develop insights for TAMs to support customer health conversations and customize experiences for each customer.
  • Stakeholder Relationships: Build strong relationships with internal and customer stakeholders.
  • Trusted Advisor: Become a trusted advisor for aligned customers and TAMs.
  • Leadership Collaboration: Exhibit confident leadership by collaborating across multiple groups and organizations to achieve customer outcomes.
  • Effective Communication: Communicate timely and effectively to meet internal and external customer needs.
  • Value Articulation: Articulate the value of the Microsoft Services portfolio to key stakeholders.
  • Cross-Group Influence: Influence and lead actions effectively through cross-group collaboration.
  • Issue Escalation: Identify and escalate systemic issues and process breakdowns.
  • Process Improvement: Proactively identify and champion process and tools improvements, leading and participating in continuous improvement initiatives.
  • Case Reviews: Conduct regular case reviews of reactive cases owned by internal support organizations to assess health and status.

42. Incident Manager Duties and Roles

  • Case Support: Own reactive case support and partner with account TAMs on high-risk escalations to gather and analyze information to support customers.
  • Incident Management: Manage Critical Situation incidents by ensuring process adherence and proper escalation.
  • Crisis Management: Act as Crisis Manager in catastrophic outage situations by overcoming all challenges to support US Public Sector customers.
  • Risk Mitigation: Mitigate risks proactively.
  • Case Administration: Manage administrative casework effectively.
  • Problem Resolution: Drive cases to a healthy state with action-oriented problem resolution.
  • Internal Communication: Communicate case updates and action requests with internal Microsoft resources.
  • Customer Communication: Communicate directly with customers to promote case progress across multiple management levels, including executive leadership.
  • Expectation Management: Set proper expectations with customers for support.
  • Meeting Facilitation: Facilitate meetings between customers and internal support organizations to expedite action plans and ensure progress.
  • Reporting Delivery: Produce scheduled reports on case health metrics and status for delivery to customers.
  • Customer Meetings: Participate in and prepare content for scheduled customer meetings.
  • Decision Support: Collaborate on and prepare reports to support better decision-making during Critical Situation incidents.

43. Incident Manager Responsibilities and Key Tasks

  • Tactical Leadership: Report to the Duty Strategic Commander to provide 24/7 tactical-level incident management and leadership throughout the duty period to ensure a safe and efficient service.
  • Command Role: Operate as the Lead Tactical Commander with support from regional Tactical Commanders.
  • Resource Planning: Take a lead role in dynamic planning and scheduling of available resources in response to shortfalls to ensure optimum operational performance.
  • Performance Monitoring: Monitor live performance metrics to identify when intervention is required to maintain targets within agreed thresholds.
  • Tactical Decisions: Make tactical decisions to resolve operational issues adversely affecting performance.
  • Operational Command: Maximize operational capacity and provide effective, timely, and appropriate operational command responses for routine operations, declared Critical Incidents (CIs), Major Incidents (MIs), and Serious Incidents (SIs).
  • Demand Management: Manage demand and escalation effectively in response to peak activity and seasonal surges, implementing escalation arrangements in line with organizational plans.
  • Team Collaboration: Work with Duty Managers, frontline and hub-based Operations Officers, and Operational Commanders to ensure accountability for demand management, escalation actions, and command responsibilities.
  • Stakeholder Communication: Communicate effectively with key external stakeholders to assist local teams in resolving handover delays and other dynamic operational issues.
  • Initial Command: Adopt the initial role of Tactical Commander during declared CIs, MIs, and SIs until a formal command structure is established and a transfer of command can occur.
  • Duty Management: Act as the most senior manager on duty out of hours, making and being accountable for tactical-level decisions on behalf of the organization.
  • Staff Oversight: Operate as the senior accountable manager on duty out of hours, managing staff and operational issues at the appropriate level.
  • Performance Accountability: Be accountable for all aspects of performance and service delivery on a day-to-day basis.
  • Service Oversight: Oversee service delivery within hubs while on duty.

44. Incident Manager Roles and Details

  • Reactive Support: Drive positive outcomes end-to-end within reactive support, including healthy reactive support case progression and management across both standard and critical severity cases.
  • Escalation Management: Align resources directly to customer accounts to own all levels of escalation handling, reporting, problem management, and customer communications.
  • Insight Development: Partner closely with Customer Success Account Managers (CSAMs) to support the development of insights that inform solution and operational health program development.
  • Stakeholder Relationships: Build strong relationships with internal and customer-facing stakeholders as a key influencer and advocate for the customer.
  • Central Leadership: Partner with the Customer Success Account Manager (CSAM) to act as the trusted central commander within the Reactive Support Management space.
  • Confident Communication: Showcase confident leadership and communicate in a timely and professional manner while driving reactive support request health.
  • Customer Interaction: Interact directly with customers to gather business impact of support requests, provide status updates on case progress, and coordinate actions to improve health in at-risk or unhealthy support requests.
  • Process Improvement: Identify and champion process, tools, and service delivery program improvements to influence continuous improvement and a more connected customer experience.
  • Case Review: Review support requests using internal tools and analytical skills to identify cases requiring action.
  • Action Determination: Determine the best course of action to maintain healthy support request progression and resolution, considering customer-specific knowledge.
  • Risk Mitigation: Anticipate risks related to customer-specific workloads and solutions and take mitigation actions on demand.
  • Escalation Handling: Manage escalations by identifying cases requiring action and coordinating appropriate actions internally and externally to drive resolution.
  • Critical Support: Support the active management of select critical situation support requests during business hours.
  • Expectation Setting: Mitigate relationship risk through proactive expectation setting.
  • Trend Analysis: Analyze support request trends using internal tools, automation, and analytical skills to identify and confirm root causes, categorizing them following defined standards.
  • Operational Health: Leverage insights to help account teams maximize the value of customer Premier or Unified Support agreements through proactive services that drive Operational Health improvements.
  • Customer Recommendations: Identify trends and partner with Customer Success Account Managers (CSAMs) to build recommendations for customers.
  • Volume Analysis: Leverage purpose-built tooling and standard reporting to analyze support request volumes and trends.
  • Data-Driven Insights: Support customer conversations and various incident/problem management activities with data-driven insights.

45. Incident Manager Key Accountabilities

  • Cross-Functional Collaboration: Collaborate closely with Sourcing and Purchasing, Operations, Finance, and Customer Care teams to coordinate internal responses to supply incidents.
  • Issue Response: Provide prompt responses to urgent ingredient supply issues to prevent production impacts.
  • Ingredient Substitution: Identify substitute ingredients to mitigate supply shortages and draft relevant recipe updates.
  • Recipe Quality: Ensure impacted recipes remain delicious and satisfying for customers.
  • Menu Performance: Mitigate the impact of supply incidents on menu performance.
  • Cost Analysis: Analyze the cost implications of ingredient substitutions and identify mitigation strategies.
  • Process Management: Build and manage cross-functional processes for handling supply incidents.
  • Organizational Growth: Be part of an organization that is continually growing its technology and its people.

46. Incident Manager General Responsibilities

  • Troubleshooting Support: Play a hands-on role by taking incoming trouble calls and emails, troubleshooting problems, and providing resolutions.
  • Team Promotion: Promote the team internally so that it is recognized as the go-to resource for any incident.
  • Issue Analysis: Analyze technical issues and collect all necessary technical information for triage.
  • Partner Collaboration: Collaborate with partners to fully understand problems and set expectations.
  • Resolution Partnership: Proactively work with Engineering, Platform, and external partners to determine steps toward resolution and prevention of future issues.
  • Problem Solving: Take responsibility for problem-solving and demonstrate curiosity to identify the root cause of any problems.
  • Process Monitoring: Be an integral part of monitoring, reporting, and communicating the global performance of owned processes.
  • Data Collection: Use a variety of data collection techniques and systems to gather technology operations information.
  • Process Training: Prepare and train for future business and IT procedures, as well as ITIL process deployments.

47. Incident Manager Role Purpose

  • Process Ownership: Act as the country process owner for incident management, covering systems, training for operational departments, and local processes.
  • Escalation Management: Trigger management and/or technical escalations and follow up on them.
  • Impact Analysis: Coordinate and ensure the creation of technical and customer impact analyses for specific incidents.
  • Issue Escalation: Escalate blocking issues reported through the incident report.
  • Department Cooperation: Cooperate with internal departments to provide problem direction and support to operational teams.
  • Status Communication: Keep stakeholders continuously informed of problem status through the Incident Notification procedure.
  • Process Management: Manage incident management processes for the account, including day-to-day incident management and driving SLAs.
  • Operational Alignment: Work closely with operations teams to ensure incident management processes meet their operational needs.
  • Continuous Improvement: Continuously seek improvement opportunities in areas of responsibility and related activities.

48. Incident Manager Essential Functions

  • Incident Resolution: Lead and coordinate resolution efforts during major incidents.
  • Process Improvement: Implement and improve incident management processes.
  • Post-Incident Analysis: Conduct post-incident analyses and ensure preventive measures are adopted.
  • KPI Tracking: Track and report KPIs relevant to SLA performance.
  • Team Education: Educate internal teams on incident processes and best practices.
  • On-Call Support: Provide on-call coverage as part of a duty rotation.
  • Incident Remediation: Investigate, analyze, and remediate financial planning incidents to mandated standards.
  • Pipeline Management: Log and manage the incident pipeline in ITSM systems.
  • Cross-Functional Participation: Participate in working groups and cross-functional forums.
  • External Partnerships: Maintain strong working relationships with insurers and other external partners.
  • Stakeholder Communication: Manage stakeholder communications with clarity and transparency.

49. Incident Manager Additional Details

  • Shift Management: Organize the team's working shift systems and roster the team for work and training.
  • Team Supervision: Direct and supervise the team to analyze and manage incidents on time, escalating to higher management as needed.
  • Status Monitoring: Ensure monitoring and recording of incident management status and progress.
  • Stakeholder Coordination: Ensure relevant internal and external stakeholders are informed and coordinated.
  • Reporting Accuracy: Ensure investigation reports and periodic updates to relevant stakeholders are accurate and timely.
  • Information Analysis: Understand the implications of new information for both current and future problem-solving and decision-making.
  • Decision Making: Consider the relative costs and benefits of potential actions to choose the most appropriate one.
  • Reasoning Skills: Use logic and reasoning to identify the strengths and weaknesses of alternative solutions, conclusions, or approaches to problems and recommend the best course of action.
  • Problem Dissection: Dissect complex problems to develop and evaluate options and implement solutions.
  • Gap Analysis: Analyze existing policy and process gaps and identify solutions to close them.
  • Post-Incident Contribution: Participate as a key contributor in follow-up activities after any major incident, including post-incident reviews and residual recovery activity.
  • Incident Communication: Prepare major incident communications for distribution across the organization, as well as bespoke reports, briefings, and texts for senior leaders.
  • External Messaging: Liaise with communications teams and the press office to ensure they have the necessary information to manage external messages appropriately.
  • Team Contribution: Contribute to the objectives of the wider team.

50. Incident Manager Details and Accountabilities

  • Incident Management: Manage all severity 1 and 2 incidents from inception to resolution, ensuring quick and correct assessment of each issue, including identifying the impact on the specific Business Unit and Owners and engaging the appropriate resolver group.
  • Progress Updates: Provide prompt periodic progress updates to the appropriate parties until the root cause is identified and the issue is closed.
  • Resolution Coordination: Coordinate resolution efforts across multiple applications and groups.
  • Post Event Review: Conduct Post Event Reviews with associated teams involved in each event to drive continuous improvement.
  • Incident Metrics: Provide metrics on reported incidents and work with the SRE Management team to identify trends or issues requiring inclusion in the Problem Management agenda.
  • Process Contribution: Contribute input into the Incident Management Process by further defining process flow, documenting expectations, and assisting in training and guiding impacted parties to ensure adherence.
  • On-Call Support: Provide after-hours on-call support through rotational weekend shifts and extended hours.
  • Team Communication: Facilitate resolution by ensuring effective communication across multiple teams.
  • Outage Assessment: Quickly assess outage severity in terms of business impact and technical complexity.
  • Escalation Management: Notify, escalate, and communicate outage existence and status to senior management.
  • Investigation Coordination: Coordinate investigations and drive incidents to resolution or remediation.
  • Professional Delivery: Deliver high-quality, professional major incident management as part of a 24x7x365 Major Incident Management team.
  • Team Leadership: Lead, contact, and direct teams to effectively manage all incidents, ensuring maximum availability for all IT services for citizens and internal customers.
  • Customer Focus: Quickly understand customer issues from a business impact perspective, draw logical conclusions, make sensible suggestions aligned with strategic direction, and negotiate with suppliers to facilitate change.