Executing a Disaster Recovery Plan
A disaster recovery (DR) plan can be executed from either the standby or the primary Private Cloud Appliance. However, a failover plan is always executed from the standby system, because the primary rack is down in this scenario.
A switchover can be performed for the purpose of testing the disaster recovery setup, or when extensive maintenance is required on the primary system. To return both appliances to their normal working state after a failover, a postfailover plan is executed on each system when the primary is back online. The switchover plan has postfailover steps built in, so it does not require an additional run of the postfailover plan.
As a result of executing a DR plan, resources are moved between peered systems and the primary system changes. Those resources are not automatically moved back to their original host system. To move resources back to their original environment, you must perform another switchover for the relevant DR configuration(s).
Performing a Switchover
A switchover allows the administrator to move resources away from a system so it can be taken offline, for example in case of planned maintenance. A (second) switchover is also performed to move resources back to their original host system, after they were impacted by a failover or switchover.
- Using the Service CLI
  - Look up the ID of the switchover DR plan you want to execute. Use drGetConfigs to find the DR configuration, and display its associated DR plans using drListPlan.
  - From the primary or standby appliance, execute the switchover DR plan with the drExecutePlan command.
    Note: To run the command in check-only mode, add the parameter checkOnly=True. Only the DR plan steps enabled for check-only mode will be performed.
    PCA-ADMIN> drExecutePlan planId=6e797d8b-7245-4d49-8e68-bf67f2d53041::sw1
    JobId: 92b4acc2-2dff-492c-9ba2-0a2ac058baa5
    Data:
      DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::sw1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name sw1
  - Use the job ID to check the status of the operation you started.
    PCA-ADMIN> show Job id=92b4acc2-2dff-492c-9ba2-0a2ac058baa5
    Data:
      Id = 92b4acc2-2dff-492c-9ba2-0a2ac058baa5
      Type = Job
      Associated Work Request Id = c6cca56c-a1cc-421c-9ded-acf0e7cd9da2
      Done = false
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::sw1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name sw1
      Run State = Active
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:e06881fc-ea57-4835-bb86-e1244d3787c3 type:WorkItem name:
  - Ensure that the job completes successfully.
    PCA-ADMIN> show Job id=92b4acc2-2dff-492c-9ba2-0a2ac058baa5
    Data:
      Id = 92b4acc2-2dff-492c-9ba2-0a2ac058baa5
      Type = Job
      Associated Work Request Id = c6cca56c-a1cc-421c-9ded-acf0e7cd9da2
      Done = true
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::sw1. DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::sw1. drexecuteplan succeeded for config [6e797d8b-7245-4d49-8e68-bf67f2d53041] Operation: [switchover] plan_name: [sw1]. Response: [Successfully completed checks for switchover for DR config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041. Plan Execution Status: [precheck : pass , role_reversal_precheck : pass , stop_primary : norun , role_reversal : norun , start_standby : norun , cleanup_primary : norun , post_config : norun , ]]
      Run State = Succeeded
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:e06881fc-ea57-4835-bb86-e1244d3787c3 type:WorkItem name:
    After successful completion, all instances included in the DR configuration have been recovered and are running on the standby appliance.
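If the completion check is scripted rather than read by eye, the show Job output can be reduced to a single verdict. The following is a minimal Python sketch, not part of the product: it assumes the output has already been captured as text with one "Key = value" field per line, and the function names parse_fields and job_succeeded are hypothetical. The sample text is abridged from the example above.

```python
def parse_fields(output: str) -> dict[str, str]:
    """Collect 'Key = value' lines from captured Service CLI output."""
    fields = {}
    for line in output.splitlines():
        # Split on the first " = " only, so values may contain "=" themselves.
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def job_succeeded(output: str) -> bool:
    """True only when the job is done and its run state is Succeeded."""
    fields = parse_fields(output)
    return fields.get("Done") == "true" and fields.get("Run State") == "Succeeded"


# Abridged sample, assumed to be captured from 'show Job id=<job ID>'.
sample = """\
Data:
  Id = 92b4acc2-2dff-492c-9ba2-0a2ac058baa5
  Type = Job
  Done = true
  Name = OPERATION-EXECUTE_DR_PLAN
  Run State = Succeeded
"""

print(job_succeeded(sample))
```

Checking both fields is deliberate: in the examples, a running job shows Done = false with Run State = Active, so neither field alone is treated as proof of success.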
- Using the Service Web UI
  - Under Disaster Recovery Service, open the DR Configurations page. In the table, click the configuration for which you want to perform a switchover. The DR Configuration detail page appears.
  - In the Resources section, click Plans.
  - In the Actions column, open the quick menu (3 dots) for the switchover plan of your choice, and click Execute Plan.
    Alternatively, click the DR plan name to display its detail page. In the top-right corner, click Execute Plan.
  - When prompted, choose whether to execute the full plan or a subset of the steps in check-only mode.
    Click Confirm. A DR job is started. When it completes successfully, all steps in the switchover DR plan have been performed as expected.
    To track progress, under Disaster Recovery Service, select Jobs. The Jobs table reports the status of each job. Click a record in the table to display the job details.
    After successful completion, all instances included in the DR configuration have been recovered and are running on the standby appliance.
Performing a Failover
The native DR service does not provide automated failover. An administrator must confirm that the primary appliance is down, and execute the failover plan from the standby appliance. A failover is meant to allow continuation of service when the primary system experiences an outage.
When one appliance is down, the peer rack reports a fault with a name containing "peer connect" and the rack serial number. Use the Service CLI to check the fault list (list fault <parameters>) and display the details of the peer connection problem. For example:
PCA-ADMIN> show fault id=57701191-5764-480b-826c-38c4b1970dde
Data:
Cause = 1742XC3024 : network is not in a CONNECTED state: CONNECTING
Action = Please contact customer support for solution
Health Exporter = peerconnect-checker
Diagnosing Source = peer connect health checker
Faulted Component Type = SOFTWARE
Description = 1749XC302P-- 1742XC3024 : network is not in a CONNECTED state: CONNECTING
Name = 1749XC302P--PCA-8000-UY--peerconnect
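To recognize this kind of fault in a script, the Name field can be tested for the peer connect marker. The sketch below is illustrative only: it assumes the show fault output has been captured as text in the "Key = value" layout shown above, and the function name is_peer_connect_fault is hypothetical. Because the prose says "peer connect" while the example name ends in "peerconnect", the comparison ignores case, spaces, and punctuation.

```python
def is_peer_connect_fault(show_fault_output: str) -> bool:
    """Report whether the fault's Name field identifies a peer connect fault."""
    name = ""
    for line in show_fault_output.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key.strip() == "Name":
            name = value.strip()
    # Keep letters and digits only, so "peer connect" and "peerconnect" compare equal.
    normalised = "".join(ch for ch in name.lower() if ch.isalnum())
    return "peerconnect" in normalised


# Sample copied from the fault details above.
sample = """\
Data:
  Cause = 1742XC3024 : network is not in a CONNECTED state: CONNECTING
  Health Exporter = peerconnect-checker
  Faulted Component Type = SOFTWARE
  Name = 1749XC302P--PCA-8000-UY--peerconnect
"""

print(is_peer_connect_fault(sample))
```

A match only shows that the peer connection is reported as broken. The administrator still has to confirm that the primary appliance is actually down before starting a failover.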
- Using the Service CLI
  - Look up the ID of the failover DR plan you need to execute. Use drGetConfigs to find the DR configuration, and display its associated DR plans using drListPlan.
  - From the standby appliance, execute the failover DR plan with the drExecutePlan command.
    Note: To run the command in check-only mode, add the parameter checkOnly=True. Only the DR plan steps enabled for check-only mode will be performed.
    PCA-ADMIN> drExecutePlan planId=6e797d8b-7245-4d49-8e68-bf67f2d53041::fo1
    JobId: 49521287-c148-4791-9626-13190fce3d1d
    Data:
      DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::fo1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name fo1
  - Use the job ID to check the status of the operation you started.
    PCA-ADMIN> show Job id=49521287-c148-4791-9626-13190fce3d1d
    Data:
      Id = 49521287-c148-4791-9626-13190fce3d1d
      Type = Job
      Associated Work Request Id = c8e3b554-a3ef-4e9b-a52c-c9a518f70974
      Done = false
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::fo1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name fo1
      Run State = Active
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:d7a09483-ef2e-4e03-81bb-fed5ee661428 type:WorkItem name:
  - Ensure that the job completes successfully.
    PCA-ADMIN> show Job id=49521287-c148-4791-9626-13190fce3d1d
    Data:
      Id = 49521287-c148-4791-9626-13190fce3d1d
      Type = Job
      Associated Work Request Id = c8e3b554-a3ef-4e9b-a52c-c9a518f70974
      Done = true
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::fo1. DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::fo1. drexecuteplan succeeded for config [6e797d8b-7245-4d49-8e68-bf67f2d53041] Operation: [failover] plan_name: [fo1]. Response: [Successfully completed checks for failover for DR config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041. Plan Execution Status: [precheck : pass , role_reversal_precheck : pass , role_reversal : pass , start_standby : pass , ]]
      Run State = Succeeded
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:d7a09483-ef2e-4e03-81bb-fed5ee661428 type:WorkItem name:
    After successful completion, all instances included in the DR configuration have been recovered and are running on the standby appliance.
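The Progress Message of a finished job ends with a Plan Execution Status list that names each plan step and its result. When reviewing a run, it can help to pull that list apart and flag every step that did not report pass. The following Python sketch assumes the message text has been captured as a string in the format shown in the examples. The function name plan_step_results is hypothetical, and the code does not interpret result values such as norun beyond reporting them.

```python
import re


def plan_step_results(progress_message: str) -> dict[str, str]:
    """Map each step in 'Plan Execution Status: [...]' to its reported result."""
    match = re.search(r"Plan Execution Status: \[([^\]]*)\]", progress_message)
    if match is None:
        return {}
    results = {}
    for entry in match.group(1).split(","):
        step, sep, result = entry.partition(":")
        if sep:  # skips the empty fragment left by the trailing comma
            results[step.strip()] = result.strip()
    return results


# Tail of the Progress Message from the failover example above.
failover_message = (
    "Response: [Successfully completed checks for failover for DR config_id "
    "6e797d8b-7245-4d49-8e68-bf67f2d53041. Plan Execution Status: "
    "[precheck : pass , role_reversal_precheck : pass , "
    "role_reversal : pass , start_standby : pass , ]]"
)

results = plan_step_results(failover_message)
not_passed = [step for step, result in results.items() if result != "pass"]
print(results)
print("steps not reporting pass:", not_passed)
```

Applied to the switchover example earlier in this section, the same function would list stop_primary, role_reversal, start_standby, cleanup_primary, and post_config as not reporting pass, because that output shows them as norun.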
- Using the Service Web UI
  - Under Disaster Recovery Service, open the DR Configurations page. In the table, click the configuration for which you want to perform a failover. The DR Configuration detail page appears.
  - In the Resources section, click Plans.
  - In the Actions column, open the quick menu (3 dots) for the failover plan of your choice, and click Execute Plan.
    Alternatively, click the DR plan name to display its detail page. In the top-right corner, click Execute Plan.
  - When prompted, choose whether to execute the full plan or a subset of the steps in check-only mode.
    Click Confirm. A DR job is started. When it completes successfully, all steps in the failover DR plan have been performed as expected.
    To track progress, under Disaster Recovery Service, select Jobs. The Jobs table reports the status of each job. Click a record in the table to display the job details.
    After successful completion, all instances included in the DR configuration have been recovered and are running on the standby appliance.
Performing Postfailover Operations
A postfailover is performed after a failover, when the system that experienced an outage comes back online. The plan can be executed from either of the peered systems. During postfailover, the DR configuration is cleaned up on the primary system that went down. The original standby system becomes the primary for the resources covered by the DR configuration, using the original primary as the new target for DR data replication.
- Using the Service CLI
  - After a failover, confirm that the primary appliance is back online and in healthy condition.
    Ensure that the peering status is active and replication is enabled. Neither rack should report an active fault with a name containing "peer connect". (Check with Service CLI command list fault.)
  - Look up the ID of the postfailover DR plan you want to execute. Use drGetConfigs to find the DR configuration, and display its associated DR plans using drListPlan.
  - From the primary or standby appliance, execute the postfailover DR plan with the drExecutePlan command.
    Note: For postfailover operations, the check-only mode does not apply.
    PCA-ADMIN> drExecutePlan planId=6e797d8b-7245-4d49-8e68-bf67f2d53041::pfo1
    JobId: 56d040ba-30a6-4bea-b924-78ebabed2626
    Data:
      DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::pfo1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name pfo1
  - Use the job ID to check the status of the operation you started.
    PCA-ADMIN> show Job id=56d040ba-30a6-4bea-b924-78ebabed2626
    Data:
      Id = 56d040ba-30a6-4bea-b924-78ebabed2626
      Type = Job
      Associated Work Request Id = b4ad564b-e385-4688-94ff-11bf5267d72e
      Done = false
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::pfo1. Successfully started job for DR Plan Execute for config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041, plan_name pfo1
      Run State = Active
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:2e4db010-239e-41a1-aa0d-cb97167c64fc type:WorkItem name:
  - Ensure that the job completes successfully.
    PCA-ADMIN> show Job id=56d040ba-30a6-4bea-b924-78ebabed2626
    Data:
      Id = 56d040ba-30a6-4bea-b924-78ebabed2626
      Type = Job
      Associated Work Request Id = b4ad564b-e385-4688-94ff-11bf5267d72e
      Done = true
      Name = OPERATION-EXECUTE_DR_PLAN
      Progress Message = DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::pfo1. DrPlan id: 6e797d8b-7245-4d49-8e68-bf67f2d53041::pfo1. drexecuteplan succeeded for config [6e797d8b-7245-4d49-8e68-bf67f2d53041] Operation: [postfailover] plan_name: [pfo1]. Response: [Successfully completed checks for postfailover for DR config_id 6e797d8b-7245-4d49-8e68-bf67f2d53041. Plan Execution Status: [stop_primary : pass , cleanup_primary : pass , post_config : pass , ]]
      Run State = Succeeded
      Transcript = Created job OPERATION
      Username = admin
      WorkItemIds 1 = id:2e4db010-239e-41a1-aa0d-cb97167c64fc type:WorkItem name:
    After successful completion, all instances impacted by the switchover or failover have been restored and are running on the appliance where they were hosted before.
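The last two steps amount to repeating show Job until the Done field turns true. A polling loop along these lines is one way to automate that. This is a sketch under stated assumptions, not a documented interface: fetch_output is a hypothetical callable that returns the captured text of show Job for the job ID in question, and how that text is obtained from the Service CLI is left out. The function name wait_for_job, the polling interval, and the poll limit are likewise illustrative.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_output: Callable[[], str],
    interval_s: float = 10.0,
    max_polls: int = 60,
) -> str:
    """Poll until the job reports 'Done = true', then return its Run State."""
    for _ in range(max_polls):
        fields = {}
        for line in fetch_output().splitlines():
            key, sep, value = line.partition(" = ")
            if sep:
                fields[key.strip()] = value.strip()
        if fields.get("Done") == "true":
            return fields.get("Run State", "")
        time.sleep(interval_s)
    raise TimeoutError("job did not report Done = true within the polling budget")


# Stand-in for real CLI output: first poll still running, second poll finished.
responses = iter([
    "Done = false\nRun State = Active",
    "Done = true\nRun State = Succeeded",
])
print(wait_for_job(lambda: next(responses), interval_s=0))
```

The function returns the run state instead of a boolean so that the caller can distinguish a succeeded job from one that finished in some other state.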
- Using the Service Web UI
  - After a failover, confirm that the primary appliance is back online and in healthy condition.
    Ensure that the peering status is active and replication is enabled. Neither rack should report an active fault with a name containing "peer connect". (Display active faults in the Service CLI.)
  - Under Disaster Recovery Service, open the DR Configurations page. In the table, click the configuration for which you want to perform postfailover operations. The DR Configuration detail page appears.
  - In the Resources section, click Plans.
  - In the Actions column, open the quick menu (3 dots) for the postfailover plan of your choice, and click Execute Plan.
    Alternatively, click the DR plan name to display its detail page. In the top-right corner, click Execute Plan.
  - When prompted, click Confirm.
    Note: For postfailover operations, the check-only mode does not apply.
    A DR job is started. When it completes successfully, all steps in the postfailover DR plan have been performed as expected.
    To track progress, under Disaster Recovery Service, select Jobs. The Jobs table reports the status of each job. Click a record in the table to display the job details.
    When the job has completed successfully, all instances impacted by the switchover or failover have been restored and are running on the appliance where they were hosted before.