MOVEit Cloud - Cluster 1 North America Emergency Maintenance
Incident Report for Progress MOVEit Cloud
Postmortem

Overview 

This document provides a summary statement of the underlying cause of the file and folder access issues that some MOVEit Cloud customers experienced between July 12, 2021 and July 18, 2021. 

Why Did Customers Temporarily Lose Access to File Folders? 

 After detecting an issue with the file storage system on North American Cloud Cluster 1, we took immediate steps to mitigate the issue so that it could be resolved without any service interruption. 

Monitoring subsequently showed that the mitigation process was falling behind production file storage activities. To prevent a service outage, we made the decision to switch Cluster 1 over to a new file storage system. The process of switching over to the new file storage system left some customers temporarily without access to their historical files and folders while the synchronization process completed. 

What Was Done to Minimize the Impact to Customers’ Business? 

To minimize any adverse impact on customers, the Cloud Operations, Customer Support and Engineering teams provided advance notice (via https://status.moveitcloud.com/) of the pending emergency maintenance and of the possibility that some customers would experience invalid folder/file errors during the sync process. The teams also made an all-hands effort to expedite the process and to support customers through the move to the new file storage system.

What is Being Done to Prevent a Similar Interruption in the Future? 

After a comprehensive review of the situation, Progress has implemented changes to our infrastructure configuration and monitoring to guard against a recurrence of these problems.

Posted Jul 26, 2021 - 15:59 CDT

Resolved
Dear MOVEit Cloud Customers,

The North America Cluster 1 MOVEit Cloud file sync completed successfully today at 3pm Central time.

We apologize for the impact that this has had on our North America Cluster 1 customers over the past week, and we sincerely appreciate everyone's patience and understanding as we worked through this.

We will hold an internal postmortem this week and expect to have a Root Cause Analysis that we can share with customers by the end of the week. It will be posted to the status page when it is available.

Thank you,
The Progress MOVEit Cloud Team
Posted Jul 18, 2021 - 15:52 CDT
Update
Dear MOVEit Cloud Customers,

The files sync is still successfully running in the background and moving files. We expect this process to complete by the weekend's end and we will send out an additional update Sunday evening, after Sunday's maintenance.

Thanks in advance for your understanding,

The Progress MOVEit Cloud Team
Posted Jul 16, 2021 - 15:56 CDT
Update
Dear MOVEit Cloud Customers,

We found that a few customers were experiencing issues accessing folders. These issues have now been resolved for those customers.

If you are still experiencing an issue accessing folders in your Org, please reach out to MOVEit Support for immediate assistance.

Please note that files are still syncing in the background for some customers, so download errors are expected when trying to download files that haven't been synced over yet. We will send out an update when the sync completes.

The Progress MOVEit Cloud Team
Posted Jul 13, 2021 - 11:04 CDT
Update
Dear MOVEit Cloud Customers,

We are investigating reports of 500 errors when attempting to upload files for some users on Cluster 1 North America.

We will send out an update when this has been resolved.

The Progress MOVEit Cloud Team
Posted Jul 13, 2021 - 08:02 CDT
Update
Dear MOVEit Cloud customers,

The folder ID issues have now been resolved, and files will continue to sync in the background over the next few days.

If you're still experiencing issues accessing one or more folders, please reach out to MOVEit Support for assistance with the specific folder.

Thanks in advance for your continued understanding,

The Progress MOVEit Cloud Team
Posted Jul 13, 2021 - 04:00 CDT
Update
Dear MOVEit Cloud Customers,

Hourly update:

We're still in the process of resolving all of the folder ID errors and will continue to work on this until it's completely resolved.

Unfortunately, we don't have an ETA at this time, but we will post an update to the status page once this work has been completed.

Thanks in advance for your understanding,

The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 22:49 CDT
Update
Dear MOVEit Cloud customers,

I wanted to send out an update to let you know that we are still working behind the scenes to complete the folder restoration portion.

We will send out another update within the next hour on the status of the folder restoration, and will keep doing so until it is completed this evening.

Thanks in advance for your understanding,

The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 21:48 CDT
Update
Dear MOVEit Cloud customers,

I wanted to send out an update to let you know that we are still working behind the scenes to complete the folder restoration portion, which has taken longer than originally anticipated, and files are still successfully moving in the background.

We will send out an update hourly on the status of the folders portion, until that is completed this evening.

Thanks in advance for your understanding,

The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 20:41 CDT
Monitoring
Dear MOVEit Cloud customers,

The cut-over to the new share was successful and we are monitoring the data sync.

We will post another update this evening when all of the invalid folder errors have been resolved.

Expected impacts:

The initial cut-over only took a few minutes and the service was restored, but we do expect some customers to see invalid folder errors for 3-4 hours after the cut-over, and some file errors for the next few days as data is synced over in the background.

We understand the business impact that this can have and have taken every step to minimize downtime. If you have critical files that you are unable to download, we ask that you please reach back out to the sender and ask them to re-upload the files to avoid an impact to your business over the next few days.

To remain updated on the latest information on this issue and our emergency maintenance next steps, please check our status page. As information is available, it will be immediately posted here. Please note that our MOVEit Support team will only have the same updates that are made available to the status page.

 If you are required to contact Support for special instructions we will indicate that on our status page.

Thanks in advance for your understanding,
The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 15:21 CDT
Update
Dear MOVEit Cloud Customers,

We will be starting our emergency maintenance at 2:30pm Central time for Cluster 1, and we will post an update when the cut-over portion is complete.

Expected impacts:

The initial cut-over will only take a few minutes and the service will be restored but we do expect some customers to see invalid folder errors for 3-4 hours after the cut-over, and some file errors for the next few days as data is synced over in the background.

We understand the business impact that this can have and have taken every step to minimize downtime. If you have critical files that you are unable to download, we ask that you please reach back out to the sender and ask them to re-upload the files to avoid an impact to your business over the next few days.

How do I tell which cluster I'm on?

1) Click on the "Help" link.
2) In the "About" section, look for "Server Name".
NAC1* server names are Cluster 1 in the US
NAC2* is Cluster 2 in the US
NAC3* is Cluster 3 in the US
EUW* is Europe
UK1* is UK
The N# part of the server name tells you which node you are connected to in that cluster.
Additionally, it shows the IP address that MOVEit Cloud sees you connecting from.
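
For customers who want to check many server names at once, the prefix rules above can be sketched in Python. This is only an illustration, not a Progress-provided tool: the function name `describe_server` and the sample server names are hypothetical, and the exact position of the N# node part within a real server name is an assumption, since this notice does not spell out the full format.

```python
import re

# Prefix-to-cluster mapping taken from the list in this notice.
CLUSTER_PREFIXES = {
    "NAC1": "Cluster 1 (US)",
    "NAC2": "Cluster 2 (US)",
    "NAC3": "Cluster 3 (US)",
    "EUW": "Europe",
    "UK1": "UK",
}


def describe_server(server_name):
    """Return (cluster, node) for a Server Name, or (None, None) if unrecognized."""
    name = server_name.strip().upper()
    for prefix, cluster in CLUSTER_PREFIXES.items():
        if name.startswith(prefix):
            # Assumption: the node is written as "N" followed by digits
            # somewhere after the cluster prefix.
            match = re.search(r"N(\d+)", name[len(prefix):])
            node = int(match.group(1)) if match else None
            return cluster, node
    return None, None


if __name__ == "__main__":
    # Hypothetical server names, for illustration only.
    for sample in ("NAC1N2", "euw-n1", "UK1", "XYZ9"):
        print(sample, "->", describe_server(sample))
```

A server name that matches none of the listed prefixes returns `(None, None)`, so an unexpected value is flagged rather than guessed at.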

Thanks in advance for your understanding,
The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 14:28 CDT
Update
MOVEit Cloud Emergency Maintenance Update:

We will need to start our emergency maintenance in the next hour or two to transition to a new storage environment for Cluster 1.

Expected impacts:

The initial cut-over will only take a few minutes and the service will be restored but we do expect some customers to see invalid folder errors for 3-4 hours after the cut-over, and some file errors for the next few days as data is synced over in the background.

We understand the business impact that this can have and have taken every step to minimize downtime. If you have critical files that you are unable to download, we ask that you please reach back out to the sender and ask them to re-upload the files to avoid an impact to your business over the next few days.

Thanks in advance for your understanding,
The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 08:54 CDT
Investigating
Dear MOVEit Cloud customers,



Progress has identified a storage-related issue on North America Cluster 1. The MOVEit Cloud team will be performing emergency maintenance to address this issue today. We are targeting the end of the day, at 6pm Central time, but we may have to start earlier than planned, so we wanted to give you as much notice as possible to make alternative arrangements for today.

During this emergency maintenance, only the MOVEit Cloud service for North America Cluster 1 customers will be impacted. The MOVEit Cloud service will be available for all other customers during this emergency maintenance period.

To remain updated on the latest information on this issue and our emergency maintenance next steps, please check our status page. As information is available, it will be immediately posted here. Please note that our MOVEit Support team will only have the same updates that are made available to the status page.

 If you are required to contact Support for special instructions we will indicate that on our status page.

Thanks in advance for your understanding,

The Progress MOVEit Cloud Team
Posted Jul 12, 2021 - 07:58 CDT
This incident affected: North America - Cluster 1.