Cutting Six Scrolls to One Glance, Where a Wrong Schedule Means Nobody Gets Paged

Area : B2B SaaS, Incident Management, BLR IND

When : 2022 - 2025

Role : Product Designer

TL;DR

The incident list was a doorway, not a workspace. I turned it into the place engineers could actually act.

I redesigned the first touchpoint of incident response, the list view, so on-call engineers could acknowledge, reassign, and reprioritize in one click without opening a single incident. The goal was simple: shrink the time between seeing an incident and acting on it, because at scale those seconds are downtime you pay for.

Problem

It started with our CEO being on call. He told me that acting on incidents from the list was painful because of the constant back and forth. The list wasn't a workspace. It was a doorway you had to walk through every single time.

Acknowledge

Reassign

Change priority

Snooze

Resolve

How can we make incidents more visible to engineers during the most critical moments, so they can acknowledge and respond faster?

This became the brief.

20

Conducted User Interviews with 20 OnCall Engineers

140

Collected 140 Session recordings to map user journey

5

Created a group of 5 Internal testers from different OnCall teams

I then spent a few days observing engineers internally as they managed incidents, to learn their core workflow. Here is what I found:

The list showed almost nothing, so engineers opened every incident just to triage it.

The only click area was the incident number, which made it very hard to open an incident.

Every action forced a context switch. Open, act, back out, repeat.

The things people reached for most took the most clicks to reach.

Core incident actions (Acknowledge, Resolve, Snooze) made instantly accessible from the incident list

Inline tagging introduced for faster incident organization

Incident title visibility increased to support up to 120 characters without truncation

Filters and relationship indicators introduced for collated incidents

Confirmation workflows added for critical actions like Acknowledge, Resolve, and Snooze

Bulk Acknowledge and Resolve actions introduced for faster incident management

Impact After the redesign

120%

Higher Incident Title Visibility, up to 140 characters

12 Clicks

Removed from common actions related to the Incident

8 Actions

Made available in the Incident List itself

12 Sec

Average Reduction in MTTA

In Numbers

A 40 to 50 second MTTA is generally considered good for highly automated SaaS teams, although elite SRE organizations often target under 30 seconds for critical production incidents.

Many organizations still operate in the 1 to 5 minute range depending on their maturity and on-call processes. Razorpay processes around $180B in annual payment volume and generated ₹3,783 Cr ($450M+) in revenue in FY25.

If a company that size sees ~50 production incidents a month, shaving 10 seconds off acknowledgement gives back over 8 minutes of engineering response time every month.

At industry estimates of around $2,000 per minute of downtime, that's roughly $16,000 (₹13-14 lakh) in operational impact. More importantly, those seconds can prevent customer disruption during critical payment flows.
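The back-of-envelope math above can be sketched in a few lines. This is only an illustration of the arithmetic: the function name `monthly_impact` and its parameters are hypothetical, and the inputs (50 incidents a month, 10 seconds saved, $2,000 per minute of downtime) are the assumed figures quoted above, not measured data.

```python
# Rough estimate of the monthly impact of a faster acknowledgement time.
# All inputs are illustrative assumptions taken from the text above.

def monthly_impact(incidents_per_month: int,
                   seconds_saved_per_incident: float,
                   downtime_cost_per_minute_usd: float) -> tuple[float, float]:
    """Return (minutes saved per month, estimated USD impact per month)."""
    minutes_saved = incidents_per_month * seconds_saved_per_incident / 60
    return minutes_saved, minutes_saved * downtime_cost_per_minute_usd


minutes, usd = monthly_impact(50, 10, 2_000)
print(f"{minutes:.1f} min saved, about ${usd:,.0f} per month")
# prints "8.3 min saved, about $16,667 per month"
```

Rounding the time saved down to a flat 8 minutes gives the roughly $16,000 figure used above. Swapping in a different incident count or cost per minute shows how sensitive the estimate is to those assumptions.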