a zoomed in view of the schedule ui showing the details of the schedule

Managing an on-call schedule can be a challenging task, especially for larger organisations with multiple departments and complex call rotations.

Intro
Detecting Issues
User Behaviour
Error Handling
Flow improvements
Sudden Changes
Changes & Results

How we built a space where every second matters and has to be precise

This is the story of how we came to build the next generation of the on-call schedule editor for Zenduty, and how I lost a bet with Shyam and played cricket for a month.
UI of oncall schedule editor
Chapter one

The what and why of an on-call manager

One of our clients, a major fintech company in India, ran a rotation with an eight-week cycle: one user is on call for a week and comes back again seven weeks later. They also used a buddy system, where a partner is assigned to the on-call engineer for any help. A Robin for the Batman.
So team managers create on-call rotations in which their users come up for their shifts during a specific interval.
An on-call schedule manager can streamline this process and help ensure that the right person is always available to respond to critical incidents.
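To make the eight-week rotation with a buddy concrete, here is a minimal sketch in Python. It is not how Zenduty computes schedules; the function name `on_call_pair`, the engineer names, the start date, and the rule that the buddy is simply the next person in the rotation are all assumptions made for illustration.

```python
from datetime import date, timedelta


def on_call_pair(engineers, rotation_start, day):
    """Return (primary, buddy) for the week that contains `day`."""
    if day < rotation_start:
        raise ValueError("day is before the rotation starts")
    # Each shift lasts one week, so count whole weeks since the start.
    week_index = (day - rotation_start).days // 7
    primary = engineers[week_index % len(engineers)]
    # Assumption: the buddy is the next engineer in the rotation.
    buddy = engineers[(week_index + 1) % len(engineers)]
    return primary, buddy


# Eight hypothetical users give an eight-week cycle.
engineers = [f"engineer-{n}" for n in range(1, 9)]
start = date(2024, 1, 1)  # arbitrary start date for the example

first_week = on_call_pair(engineers, start, start)
eight_weeks_later = on_call_pair(engineers, start, start + timedelta(weeks=8))
print(first_week, eight_weeks_later)
```

With eight people in the list, the same person is primary again exactly eight weeks after their first shift, which matches the one-week-on, seven-weeks-off pattern described above.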
Chapter two
How we decided to redesign our existing interface
Nobody knows our customers better than the customer success team. Anjana was heading our customer success team at the time, and we started seeing patterns of confusion around usability, navigation and error handling in schedules.
You can scroll to the bottom and verify the changes you made by looking at the schedule preview
Ohhh.. I didn't know there was a preview down there
Said some of the people who came to us to figure out why they were not getting alerts in the first place
note showing that too much space was not being utilised in the UI
note showing that users have to scroll all the way down to see the preview
Insights from our conversations with the customer support team led us to the first draft of ideas. They also gave us benchmarks for the things we needed to improve.
Chapter three

Navigation needed some work

Schedules are broken down further into layers, which give more precise control over on-call schedules. Schedules can be set for each day, every week, or in a way that suits your team.
We tracked user behaviour through event and scroll tracking and found that, on average, a user triggered 6 scroll events to see the changes they made in the layers. Some users were not even aware of the preview: they made changes and went straight to saving without seeing how those changes were reflected in the timeline.
image showing metrics of average events triggered by a user
This had two negative results:
- More effort to visualise and understand changes
- A higher error rate because of the inaccessible preview
metrics showing 11,000 users triggering scroll events and 6 events per user in a month
Chapter four

Error handling

Another category of support tickets was inaccurate schedule data. We used to see confusion about what appeared on the preview, because we failed to handle errors and edge cases.
comparison image - before. showing the lack of validation and error handling in the UI
comparison image - after. showing the improvement in error handling in the UI
For example, a single layer can have two or more restrictions. If two restrictions overlap, they create incorrect data. An accessible date picker for creating restrictions is extremely necessary and basic! But the bigger win comes when we tell the user that they made an error, exactly where it is, and precisely what it is.
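As an illustration of telling the user exactly where the error is, here is a minimal sketch of an overlap check in Python. The `Restriction` type, the `find_overlaps` function and the message wording are hypothetical and not taken from the Zenduty codebase; the sketch also assumes each restriction starts and ends on the same day.

```python
from dataclasses import dataclass
from datetime import time
from itertools import combinations


@dataclass(frozen=True)
class Restriction:
    start: time
    end: time  # assumed to be later than `start` on the same day


def find_overlaps(restrictions):
    """Return (index_a, index_b, message) for every overlapping pair."""
    errors = []
    for (i, a), (j, b) in combinations(enumerate(restrictions), 2):
        # Two intervals overlap when each starts before the other ends.
        if a.start < b.end and b.start < a.end:
            clash_start = max(a.start, b.start)
            clash_end = min(a.end, b.end)
            errors.append((
                i, j,
                f"Restrictions {i + 1} and {j + 1} overlap between "
                f"{clash_start:%H:%M} and {clash_end:%H:%M}",
            ))
    return errors


layer = [
    Restriction(time(9, 0), time(13, 0)),
    Restriction(time(12, 0), time(17, 0)),
    Restriction(time(18, 0), time(20, 0)),
]
overlaps = find_overlaps(layer)
for _, _, message in overlaps:
    print(message)
```

Returning the indexes alongside the message lets the interface highlight the two offending restrictions rather than showing a generic error.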
Chapter five

Coverage and Connections

A schedule only takes effect if it is connected to an escalation policy, which is another part of the team, and that escalation policy has to be attached to a service 🤯
image showing connections of the schedule with another EP
If the first part is broken then the entire incident management cycle is broken
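The chain from schedule to escalation policy to service can be checked with a few lines of Python. This is a sketch under assumptions: the `EscalationPolicy` and `Service` types, their fields, and the example names are invented for illustration and do not mirror the Zenduty data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationPolicy:
    name: str
    schedules: List[str]  # names of the schedules this policy uses


@dataclass
class Service:
    name: str
    escalation_policy: Optional[str]  # name of the attached policy, if any


def coverage_problems(schedule, policies, services):
    """List the broken links between a schedule and any service."""
    linked = [p.name for p in policies if schedule in p.schedules]
    if not linked:
        return [f"Schedule '{schedule}' is not connected to any escalation policy"]
    problems = []
    for policy in linked:
        if not any(s.escalation_policy == policy for s in services):
            problems.append(
                f"Escalation policy '{policy}' is not attached to any service"
            )
    return problems


policies = [EscalationPolicy("payments-ep", ["payments-primary"])]
services = [Service("payments-api", None)]
problems = coverage_problems("payments-primary", policies, services)
print(problems)
```

An empty list means the schedule reaches at least one service through every policy it is connected to.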
Chapter six

At the last minute, someone will suggest something that makes so much sense but is horrible to hear!

In our initial flow, the save button sat inside the layer edit panel. You make changes to a layer, save or discard, and then move on. That sounded simple, safe and logical.
image showing the save button before Dheeraj suggested moving it outside the layer edit panel
image showing the save button after Dheeraj suggested moving it outside the layer edit panel
Everybody seemed to be okay with this in prototypes and through multiple levels of testing. But in the end something lit up for Dheeraj, and he questioned the use case of editing multiple layers back to back. And it made so much sense!
It is completely unnecessary to send so many save requests. So we quickly changed the logic and moved the save button to the top. You make all your changes, save once or whenever you need to, and move on. And if you forget to save?
Image of a popup inside the screen of oncall schedule prompting the user to save changes
Image showing browser initiated pop up to prevent accidental closing of the page
We prompt you to save or discard before you make a mistake.
To make things a bit easier, we added a keyboard shortcut, Cmd+S, to save schedules on the go.
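The save-once flow from this chapter can be sketched as a small Python class that collects edits across layers, sends a single save request, and knows when to prompt. The `ScheduleEditor` class, its method names and the `send_save` callable are hypothetical; the real editor runs in the browser and its implementation is not shown in this post.

```python
class ScheduleEditor:
    """Collects layer edits and saves them in a single request."""

    def __init__(self, send_save):
        self._send_save = send_save  # hypothetical callable that performs the save
        self._pending = {}           # layer id -> unsaved changes

    def edit_layer(self, layer_id, changes):
        # Edits to several layers accumulate until the user saves.
        self._pending.setdefault(layer_id, {}).update(changes)

    @property
    def has_unsaved_changes(self):
        # The interface can use this to decide whether to show the prompt.
        return bool(self._pending)

    def save(self):
        if self._pending:
            self._send_save(self._pending)  # one request for all layers
            self._pending = {}

    def discard(self):
        self._pending = {}


requests = []  # stands in for the network in this example
editor = ScheduleEditor(requests.append)
editor.edit_layer("layer-1", {"rotation": "weekly"})
editor.edit_layer("layer-2", {"rotation": "daily"})
unsaved_before = editor.has_unsaved_changes
editor.save()
print(len(requests), unsaved_before, editor.has_unsaved_changes)
```

Two layers are edited back to back, yet only one save request is recorded, which is the behaviour Dheeraj's question pushed us towards.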
Chapter seven

Improving things and seeing changes

On the error handling side, we saw one error being triggered more often than any other, and moments like this were crucial for understanding and improving things. Users were triggering an error by not adding users while creating overrides. Our first idea for a fix was to auto-focus the user selection field by default, and that improved the error rate by 40%.
We saw more than a 50% improvement in the time it takes to create layers.
image showing improvement in time to create layer and override. It shows a 50% improvement after the updated schedule design
We didn't stop there. We gave users who preferred it the option to go back to the Legacy UI, but made one step mandatory: they have to tell us why they are going back, so that we can understand and improve the product further.
image showing the option to switch to the legacy UI
So far, only 1.69% of our users have tried going back to the old UI, and 40% of them came back again within a few days, reducing the bounce rate to under 0.59%.
image of how many people downgraded yet