Schedule Editor

Intro
Detecting Issues
User Behaviour
Error Handling
Flow improvements
Sudden Changes
Changes & Results

How we built a space where every second matters and has to be precise

This is the story of how we came to build the next generation of the on-call schedule editor for Zenduty, and how I lost a bet with Shyam and played cricket for a month.

Managing an on-call schedule can be a challenging task, especially for larger organisations with multiple departments and complex call rotations.

Chapter one

The what and why of an on-call manager

One of our clients, a major fintech company in India, followed a rotation of one shift every 8 weeks: a user is on call for a week and then comes back on call after 7 weeks. They also used a buddy system, where a partner is assigned to the on-call engineer for any help, a Robin for the Batman.

a zoomed-in view of the schedule UI showing the details of the schedule

So team managers create on-call rotations, in which their users come up for their shifts during a specific interval.
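A rotation like the fintech client's can be sketched in a few lines. This is only an illustration, not how Zenduty computes schedules: the function name `on_call_for`, the engineer names, and the start date are all made up, and the rule that the buddy is the next engineer in the rotation is an assumption (the client's actual pairing rule is not described here).

```python
from datetime import date

def on_call_for(day, start, engineers, shift_days=7):
    """Return (primary, buddy) for a given day of a simple round-robin rotation.

    Assumption: the buddy is the next engineer in the rotation order.
    """
    if day < start:
        raise ValueError("day is before the rotation starts")
    # Number of whole shifts that have elapsed since the rotation started.
    shift_index = (day - start).days // shift_days
    primary = engineers[shift_index % len(engineers)]
    buddy = engineers[(shift_index + 1) % len(engineers)]
    return primary, buddy

# Eight engineers with weekly shifts: one week on call, seven weeks off.
engineers = [f"engineer-{i}" for i in range(1, 9)]
rotation_start = date(2024, 1, 1)  # arbitrary Monday, for illustration only

first_week = on_call_for(date(2024, 1, 1), rotation_start, engineers)
eight_weeks_later = on_call_for(date(2024, 2, 26), rotation_start, engineers)
```

With eight people in the list, the same engineer is primary again exactly 56 days after their first shift.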

An on-call schedule manager can streamline this process and help ensure that the right person is always available to respond to critical incidents.

Chapter two

How we decided to redesign our existing interface

Nobody knows our customers better than the customer success team. Anjana was heading our customer success team at that time, and we started seeing patterns of confusion around usability, navigation and error handling in schedules.

You can scroll to the bottom and verify the changes you made by looking at the schedule preview

Ohhh.. I didn't know there was a preview down there

Said by some people when they came to us to figure out why they were not getting alerts in the first place

note showing that too much space in the UI was not being utilised

note showing users have to scroll all the way down to understand the preview

Insights from our conversations with the customer support team led us to draw up the first draft of ideas. They also helped us set benchmarks for the things to improve.

Chapter three

Navigation needed some work

Schedules are broken down further into layers, which give more precise control over on-call schedules. Schedules can be set for each day, every week, or in a way that suits your team.

When we tracked user behaviour through event and scroll tracking, we found that on average a user triggered 6 scroll events to see the changes they made in the layers. Some users were also unaware of the preview: they made changes and proceeded to save without ever seeing how those changes were reflected in the timeline.

image showing metrics of average events triggered by a user

This was creating two negative results:

- More effort to visualise and understand changes

- An increased error rate because of the inaccessible preview

metrics showing 11,000 users triggering scroll events and 6 events per user in a month

Chapter four

Error handling

Another category of support tickets was inaccurate schedule data. We used to see confusion about what appeared in the preview, because we failed to handle errors and edge cases.

comparison image (after) showing the improvement in error handling in the UI

For example, a single layer can have 2 or more restrictions. If two restrictions overlap, they create incorrect data. An accessible date picker for creating the restriction is extremely necessary and basic! But the bigger win comes when we let users know that they made an error, that it is exactly here, and that it is precisely this.
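To make the overlap case concrete, here is a minimal sketch of how overlapping restrictions in one layer could be detected and reported with a precise message. It is not the editor's actual validation code: the `Restriction` class, the `find_overlaps` function, and the message wording are all hypothetical, and restrictions are assumed to be simple start/end time ranges with an exclusive end.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Restriction:
    name: str
    start: datetime
    end: datetime  # exclusive: a restriction ending at 17:00 does not clash with one starting at 17:00

def find_overlaps(restrictions):
    """Return one message per overlapping pair, naming both restrictions and the clashing window."""
    errors = []
    ordered = sorted(restrictions, key=lambda r: r.start)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start, so nothing later can overlap `first` either
            clash_end = min(first.end, second.end)
            errors.append(
                f"'{first.name}' overlaps '{second.name}' "
                f"from {second.start:%Y-%m-%d %H:%M} to {clash_end:%Y-%m-%d %H:%M}"
            )
    return errors
```

Returning the names and the exact clashing window, rather than a generic "invalid restriction" message, is what lets the UI point at the exact place where the error is.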

Chapter five

Coverage and Connections

A schedule only takes effect if it is connected to an escalation policy (EP), which is another part of the team, and this escalation policy has to be attached to a service 🤯

Schedule defines OnCall rotation
Attach this schedule to an EP
EP defines the hierarchy for escalations
Attach this EP to a service
Service gets impacted
A user, or the user who is on call in the schedule attached to the EP, gets alerted
If they do not acknowledge, it escalates to the next level
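The chain above can be modelled with a small sketch. This is an illustration of the concept only, not Zenduty's data model or API: the class names, the `alert` function, and the user names are hypothetical, and the behaviour when a schedule resolves to nobody (the alert simply falls through to the next level) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Schedule:
    name: str
    on_call: Optional[str]  # who is on call right now; None models a broken or empty schedule

@dataclass
class EscalationPolicy:
    levels: list  # each level is either a user name (str) or a Schedule

@dataclass
class Service:
    name: str
    policy: EscalationPolicy

def alert(service, acknowledged_by):
    """Walk the EP levels in order and return everyone who was alerted.

    `acknowledged_by` is the set of users who would acknowledge the incident.
    Assumption: a level that resolves to nobody is skipped.
    """
    alerted = []
    for target in service.policy.levels:
        user = target.on_call if isinstance(target, Schedule) else target
        if user is None:
            continue  # nobody to alert at this level
        alerted.append(user)
        if user in acknowledged_by:
            break  # acknowledged, so no further escalation
    return alerted
```

In this sketch, an empty schedule at the first level means the on-call engineer is never alerted at all, which is why a broken schedule affects everything downstream.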

image showing connections of the schedule with another EP

If the first part is broken, then the entire incident management cycle is broken.

Chapter six

At the last minute, someone will suggest something that makes so much sense but is horrible to hear!

In our initial flow, we had the save button inside the layer edit panel: you make changes on a layer, you save or discard, and then you move on. That sounded simple, safe and logical.

image showing the save button before Dheeraj suggested moving it outside the layer panel

image showing the save button after Dheeraj suggested moving it outside the layer panel

Everybody seemed to be okay with this in prototypes and through multiple levels of testing. But in the end something lit up for Dheeraj, and he questioned the use case of editing multiple layers back to back. And it made so much sense!

It's completely unnecessary to send so many save requests. So we quickly changed the logic and moved the save action to the top, outside the layer panel. You make all your changes, save once or whenever you need to, and move on. Forgot to save things?

Image showing browser initiated pop up to prevent accidental closing of the page

We prompt you to save or discard before you make a mistake.

To make things a bit easier, we added a Cmd+S keyboard shortcut to save schedules on the go.
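The idea of collecting edits across layers and sending them in one save can be sketched as follows. This is a language-agnostic illustration written in Python, not the editor's front-end code: `ScheduleDraft`, its methods, and the `send_save` callable are hypothetical names, and layer configurations are reduced to plain values.

```python
class ScheduleDraft:
    """Collects edits across layers and sends them in a single save request."""

    def __init__(self, layers, send_save):
        self._saved = dict(layers)   # last saved configuration per layer id
        self._pending = {}           # unsaved edits per layer id
        self._send_save = send_save  # hypothetical callable that performs one save request

    def edit_layer(self, layer_id, config):
        if self._saved.get(layer_id) == config:
            # Edited back to the saved state, so there is nothing left to save for this layer.
            self._pending.pop(layer_id, None)
        else:
            self._pending[layer_id] = config

    @property
    def has_unsaved_changes(self):
        # A UI could use this flag to decide whether to show a save/discard prompt.
        return bool(self._pending)

    def save(self):
        """Send all pending edits in one request; return False if there was nothing to save."""
        if not self._pending:
            return False
        self._send_save(dict(self._pending))
        self._saved.update(self._pending)
        self._pending.clear()
        return True

    def discard(self):
        self._pending.clear()
```

Editing three layers back to back then costs one request instead of three, and the same `has_unsaved_changes` flag can drive both the leave-page prompt and the keyboard shortcut.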

Chapter seven

Improving things and seeing changes

On the error handling side, we saw one error being triggered more often than any other, and such moments were crucial for understanding and improving things. Users were triggering an error by not adding any users while creating overrides. Our first idea for a fix was to auto-focus the user selection field by default, and that made a 40% impact on the error rate.

We also saw an improvement of more than 50% in the time it takes to create layers.

image showing improvement in time to create layer and override. It shows a 50% improvement after the updated schedule design

We didn't stop there. We gave users who preferred it the option to go back to the legacy UI, but added a mandatory step where they tell us why they are going back, so that we can understand and improve the product further.

image showing the option to switch to the legacy UI

So far, only 1.69% of our users have tried going back to the old UI, and 40% of them came back again within a few days, reducing the bounce rate to under 0.59%.
