Prechecks Performed by Full Stack Disaster Recovery

Full Stack Disaster Recovery performs prechecks for resources such as DR Protection Groups, DR Plans, and DR Plan Executions.

Prechecks for Compute Instance

Full Stack DR first performs the following storage prechecks for each VM present in the primary DRPG. Full Stack DR verifies that:
  • Volume group replication is configured, or volume group backup is configured with a backup policy and cross-region copy is enabled.
  • A volume group replica or at least one volume group backup exists in the standby region. Multiple backups can also exist as Full Stack DR uses the latest volume group backup.
  • All the boot and block volumes of the VMs of the members in a DRPG are added to the volume group.
  • Volume group contains only the boot and block volumes attached to the VM of the members in a DRPG.
  • No movable compute instances are being added to a standby DR Protection Group. Adding movable compute instances to a standby DRPG is not allowed.
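
The two volume group membership checks above amount to a set comparison between the volumes attached to the member VMs and the volumes in the volume group. The following is a minimal sketch of that idea, not the Full Stack DR implementation. The function name `check_volume_group_membership` and the plain string IDs are hypothetical.

```python
def check_volume_group_membership(attached_volume_ids, volume_group_ids):
    """Compare attached volumes with volume group contents.

    Returns a list of problem descriptions; an empty list means both
    membership checks pass. All names here are illustrative only.
    """
    attached = set(attached_volume_ids)
    grouped = set(volume_group_ids)
    problems = []

    # Every boot and block volume of the member VMs must be in the group.
    missing = sorted(attached - grouped)
    if missing:
        problems.append("not in volume group: " + ", ".join(missing))

    # The group must not contain volumes that are not attached to member VMs.
    extra = sorted(grouped - attached)
    if extra:
        problems.append("not attached to a member VM: " + ", ".join(extra))

    return problems
```

An empty result means the volume group contains exactly the attached boot and block volumes.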

Prechecks for File System

Full Stack DR first performs the following prechecks for the file system:
  • Switchover/Failover/Start Drill:
    • Validates that the source file system is in an Active state and is an exported file system.
    • Validates that the source file system does not have a custom encryption key.
    • Validates that every export in the source file system is mapped to a destination mount target in the file system member property, and that each mapping is unique (no duplicates).
    • Validates that at least one active replication policy is set up on the source file system.
    • Validates that the destination availability domain in the file system member property is in the peer region.
    • Validates that all destination mount targets configured for the source file system exports are in the peer region and in the same availability domain as the destination availability domain in the member property.
    • Validates that the target file system is in an Active state and is an unexported file system.
    • Validates that at least one replication snapshot is present in the target file system.
    • Validates that the destination mount targets have the correct TCP/UDP protocols enabled. See Configuring VCN Security Rules for File Storage.
  • Stop Drill:
    • Validates that the cloned exported file system is present for cleanup.
    • Validates that the exports are present in the cloned exported file system for cleanup.

Prechecks for Mount/Unmount File System on Compute Instance

Full Stack DR first performs the following prechecks for the mount/unmount file system on compute instance:
  • Validations for the mount/unmount detail member property:
    • Mount details:
      • Validates that the mount target in the mount details matches the standby region.
      • Validates that the combination of mount point and export is unique (to avoid multiple mounts on the same mount point).
    • Unmount details:
      • Validates that the mount target in the unmount details matches the primary region.
      • Validates that the export path is present on the mount target in the unmount details.
    • Validates that the movable or non-movable compute instance and the mount target in the mount details have the correct TCP/UDP protocols enabled. See Configuring VCN Security Rules for File Storage.
    • Validates that, for a movable or non-movable compute instance, either the primary DR Protection Group has the mountable file system or the destination mount target has the correct export path to mount.
    • Note

      For a failover, the following checks are performed in the File System – Mount on Compute Instance step on the newly launched compute instance, because the primary region is unavailable.

    • Validates that the compute instance has the Compute Instance Run Command plugin enabled.
    • Validates that the compute instance has root access.
    • For Linux operating systems only, validates that the compute instance has nfs-client installed. For information on how to install nfs-client on a compute instance, see Mounting File Systems From UNIX-Style Instances.
    • For Linux operating systems only, validates that /etc/fstab contains the mount details with the correct mount target IP address or fully qualified domain name (FQDN) and mount point.
    • For both operating systems, validates that the mount point is present on the compute instance.
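
To illustrate the /etc/fstab check above, the following sketch scans fstab text for an entry whose device field names the expected mount target (IP address or FQDN) and whose second field is the expected mount point. It is an assumption about how such a check could be written, not the logic Full Stack DR runs. The function name `fstab_has_mount` is hypothetical, and the sketch assumes the usual `host:/export` device syntax for NFS entries.

```python
def fstab_has_mount(fstab_text, mount_target, mount_point):
    """Return True if fstab_text has an entry for mount_target at mount_point.

    mount_target is the IP address or FQDN expected before the ':' in the
    device field. This helper is illustrative only.
    """
    for raw_line in fstab_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, point = fields[0], fields[1]
        # NFS devices are written as host:/export/path.
        host, sep, _export = device.partition(":")
        if sep and host == mount_target and point == mount_point:
            return True
    return False
```

For example, an fstab line such as `10.0.0.5:/fs1 /mnt/fs1 nfs defaults,nofail 0 0` satisfies `fstab_has_mount(text, "10.0.0.5", "/mnt/fs1")`.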

Prechecks for Volume Groups (Block Storage)

Full Stack DR first performs the following prechecks for all volume groups added to the primary DRPG. Full Stack DR verifies that:
  • The volume group is in an Available state.
  • The volume group has either replication or backups configured in the standby region. If both are configured, Full Stack DR uses replicas and ignores backups.
  • For intra-region DR, any destination (standby) replicas are not in the same availability domain (AD) as the source volume group.
  • The replica in the standby region is in an Available state or, if backups are used, at least one backup exists and is Available.
  • The list of volumes in the source volume group matches the list of volumes in the standby region replica or backup.
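
The replica-versus-backup rule above can be sketched as a small decision function. This is one reading of the text, under the assumption that a configured replica always takes precedence, so a replica that is not Available fails the check even if backups exist. The function name `pick_recovery_source` and the `"AVAILABLE"` state string are hypothetical.

```python
def pick_recovery_source(replica_state, backup_states):
    """Decide whether a replica or a backup would be used for recovery.

    replica_state: lifecycle state of the standby replica, or None if
                   replication is not configured.
    backup_states: lifecycle states of the standby-region backups.
    Raises ValueError when the precheck would fail.
    """
    if replica_state is not None:
        # Replicas take precedence; backups are ignored when both exist.
        if replica_state == "AVAILABLE":
            return "replica"
        raise ValueError("replica exists but is not available")

    # No replica: at least one backup must exist and be available.
    if any(state == "AVAILABLE" for state in backup_states):
        return "backup"
    raise ValueError("no replica and no available backup")
```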

Prechecks for Block volume for Non-Movable compute instances

Full Stack DR first performs the following prechecks for the block volume for non-movable compute instances:

If the compute instance is added as a member to a DRPG with the role Primary, Full Stack DR performs the following validations for each block volume ID provided in the new member property list:
  • The block volume ID should be a valid OCID of a block volume.
  • The block volume should not have duplicates in the member properties of the same compute instance.
  • The block volume should already be attached to the compute instance.
  • The block volume should be a part of some volume group member of the DRPG.
  • If a Volume attachment reference instance ID is provided in the attachment details, then that instance should be a member of the standby DR Protection Group and the block volume ID should be added in its member properties.
  • If the Volume attachment reference instance ID is not provided in the attachment details, then only one compute instance in the standby DRPG should have a member property defined with this block volume ID.
  • The mount points that are defined should be unique.
If the compute instance is added as a member to a DRPG with the role Standby, Full Stack DR performs the following validations for each block volume ID provided in the new member property list:
  • The block volume ID should be a valid OCID of a block volume.
  • The block volume should not have duplicates in the member properties of the same compute instance.
  • The block volume should be from the region of the primary DRPG.
  • The block volume should be a part of some volume group member of the primary DRPG.
  • The volume group's destination/target AD (where the backup or replica will be activated) should match the AD of this standby compute instance.
  • If a Volume attachment reference instance ID is provided in the attachment details, then that peer instance should be a member of primary DRPG and the block volume should be attached to it.
  • If the Volume attachment reference instance ID is not provided in the attachment details, then only one compute instance in the primary DRPG should have the block volume attached to it.
  • The mount points that you define should be unique.
  • No two block volumes should be configured to attach using the same device path.
  • If the attachment uses device paths, then the device paths must not be in use.
  • If a block volume is configured to be attached to more than one compute instance, then the attachment must have shareable access.
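
Several of the validations above are uniqueness checks: block volume IDs, mount points, and device paths must not repeat within the member properties of one compute instance. A minimal sketch of such checks follows. The dictionary keys (`block_volume_id`, `device_path`, `mount_point`) and the function names are hypothetical and do not reflect the actual member property schema.

```python
from collections import Counter


def find_duplicates(values):
    """Return the sorted values that occur more than once, ignoring None."""
    counts = Counter(v for v in values if v is not None)
    return sorted(v for v, n in counts.items() if n > 1)


def check_attachments(attachments):
    """Report duplicate IDs, device paths, and mount points.

    attachments: list of dicts describing block volume attachments for one
    compute instance (illustrative structure only).
    """
    problems = []
    for key in ("block_volume_id", "device_path", "mount_point"):
        for dup in find_duplicates(a.get(key) for a in attachments):
            problems.append(f"duplicate {key}: {dup}")
    return problems
```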

Prechecks for Object storage with Switchover and Failover

Full Stack DR performs the following prechecks for the object storage:

  • Switchover:
    • Object Storage Bucket - Delete Replication (Primary) Precheck
      • Validates that the source bucket is present.
      • Validates that the replication policy is present.
      • Validates that the replication policy is in the peer region.
    • Object Storage Bucket - Setup Reverse Replication (Standby) Precheck
      • Validates that the target bucket is present.
      • Validates that the source bucket is present.
  • Failover:
    • Object Storage Bucket - Delete Replication (Primary) Precheck
      • Validates that the source bucket is in an Active state.
      • Validates that the replication policy is present.
      • Validates that the replication policy is in the peer region.
      • Validates that the target bucket is in an Active state.
    • Object Storage Bucket - Setup Reverse Replication (Standby) Precheck (Continue on Error)
      • Validates that the target bucket is present.
      • Validates that the source bucket is present.

Prechecks for Database (Oracle Base Database Service, Oracle Exadata Database Service on Dedicated Infrastructure, Oracle Exadata Database Service on Exascale Infrastructure, Oracle Exadata Database Service on Cloud@Customer)

Full Stack DR performs the following prechecks if a database member (Oracle Base Database Service, Oracle Exadata Database Service on Dedicated Infrastructure, Oracle Exadata Database Service on Exascale Infrastructure, Oracle Exadata Database Service on Cloud@Customer) is a part of the DRPG. Full Stack DR verifies that:
  • Database member properties are not empty or null and password secret vault location is a part of the database member properties.
  • You are able to access the secret vault in which the database password is stored.
  • Database and peer Database are in an Available state.
  • Database and peer Database have Data Guard enabled and they are Data Guard peers of each other.
  • Database and peer Database have the correct Data Guard roles.
  • Database and peer Database are a part of the two associated DR protection groups that are a part of the configuration. Primary database is a part of the primary DR protection group and standby database is a part of the standby DR protection group.

Prechecks for Oracle Autonomous Database Serverless

Full Stack DR performs the following prechecks if an Autonomous Database Serverless member is part of the DRPG. Full Stack DR verifies that:
  • Autonomous database member properties are not empty or null.
  • The primary Autonomous database does not have an empty standby database list.
  • The standby Autonomous database is not in the same region as the primary database region and is not a local peer.
  • The Autonomous database and the peer Autonomous database are a part of the two associated DR protection groups that are a part of the configuration.
  • Remote Data Guard is configured.
  • Remote peer database belongs to the remote DRPG.
  • The primary database lifecycle state is AVAILABLE.

    For switchover prechecks, Full Stack DR performs an additional validation on the standby database: it verifies that the remote peer standby is in the correct (STANDBY) state.

Prechecks for Oracle Autonomous Container Database

Full Stack DR performs the following prechecks if an Autonomous Container Database member is part of the DRPG. Full Stack DR verifies that:
  • Autonomous Container Database member properties are not empty or null.
  • The primary Autonomous Container Database does not have an empty standby database list.
  • The Autonomous database and the peer Autonomous database are a part of the two associated DR protection groups that are a part of the configuration.
  • Remote Data Guard is configured.
  • Remote peer database belongs to the remote DRPG.
  • The primary and the standby Autonomous Container database lifecycle states are AVAILABLE.

    For switchover prechecks, Full Stack DR performs an additional validation on the standby database: it verifies that the remote peer standby is in the correct (STANDBY) state.

Prechecks for Kubernetes Engine (OKE)

Full Stack DR performs the following prechecks for Kubernetes Engine (OKE).

Primary Cluster
  1. Checks the connection to primary cluster.
  2. Checks if the backup is older than one day.
  3. Downloads and prints backup log.
  4. Validates that the lifecycle state of the primary OKE cluster is Active.
  5. Validates that the namespace and the bucket in the primary member properties exist and are valid.
  6. Validates that the backup schedule in the primary member properties is in the RFC 5545 format.
    • Validates for the unsupported Rule part.
    • Validates the range for each Rule part.
  7. Validates the load balancer mapping in primary member properties for the following constraints.
    • SourceLoadBalancerId and DestinationLoadBalancerId have OCIDs of LOADBALANCER.
    • Load balancer mapping must be unique.
      • A-B, C-B -> Not allowed
      • A-B, A-D -> Not allowed
        Note

        For example, suppose you have two load balancers in the primary region (load_balancer_A and load_balancer_C) and two load balancers in the standby region (load_balancer_B and load_balancer_D). When you add an OKE cluster as a member, you must add the load balancer mappings property. In this map, only the following mappings are allowed:
        load_balancer_A <--> load_balancer_B & load_balancer_C <--> load_balancer_D
        or load_balancer_C <--> load_balancer_B & load_balancer_A <--> load_balancer_D
        However, the following mapping is not allowed because load_balancer_B is set as the destination load balancer in both mappings:
        load_balancer_A <--> load_balancer_B & load_balancer_C <--> load_balancer_B
  8. Validates the network load balancer mapping in primary member properties for the following constraints.
    • SourceNetworkLoadBalancerId and DestinationNetworkLoadBalancerId have OCIDs of NETWORKLOADBALANCER.
    • Network load balancer mapping must be unique.
      • A-B, C-B -> Not allowed
      • A-B, A-D -> Not allowed
        Note

        For example, suppose you have two network load balancers in the primary region (network_load_balancer_A and network_load_balancer_C) and two network load balancers in the standby region (network_load_balancer_B and network_load_balancer_D). When you add an OKE cluster as a member, you must add the network load balancer mappings property. In this map, only the following mappings are allowed:
        network_load_balancer_A <--> network_load_balancer_B & network_load_balancer_C <--> network_load_balancer_D
        or network_load_balancer_C <--> network_load_balancer_B & network_load_balancer_A <--> network_load_balancer_D
  9. Validates the vault mapping in primary member properties for the following constraints.
    • SourceVaultId and DestinationVaultId have OCIDs of VAULT.
    • Vault mapping must be unique.
      • A-B, C-B -> Not allowed.
      • A-B, A-D -> Not allowed.
        Note

        For example, suppose you have two vaults in the primary region (vault_A and vault_C) and two vaults in the standby region (vault_B and vault_D). When you add an OKE cluster as a member, you must add the vault mappings property. In this map, only the following mappings are allowed:
        vault_A <--> vault_B & vault_C <--> vault_D
        or vault_C <--> vault_B & vault_A <--> vault_D
  10. Validates that the peer cluster ID and the backup location in the primary member properties are not empty.
  11. Validates the jump host in primary member properties for the following constraints.
    • The DR Protection Group must contain a compute instance with the same OCID as the jump host.
    • The jump host must be a non-movable compute instance.
    • The jump host must have the lifecycle state RUNNING.
  12. Validates the peer cluster ID in primary member properties for the following constraints.
    • Validates that the peer DR Protection Group has a cluster with the peer cluster ID.
    • Validates that the member cluster itself is not added as the peer cluster.
    • Validates that the peer cluster provided in the member properties is not added as the peer cluster for another OKE cluster in the DR Protection Group.
  13. Validates the node pools in OKE cluster of primary region for the following constraints.
    • Validates that the node count in all the node pools is at least one.
    • Validates that there is at least one active node in all node pools.
    • Example: If there are two node pools and one node pool has no active node while the other has an active node, the precheck results in an exception.
  14. Validates the source load balancer ID in primary member properties for the following constraints.
    • Lifecycle should be Active.
    • Load balancer can have only regional subnet.
  15. Validates the source network load balancer ID in primary member properties for the following constraints.
    • Lifecycle should be Active.
    • Network load balancer can have only regional subnet.
  16. Validates that the source vault ID in primary member properties is Active.
  17. Validates the managed node pool configuration in primary member properties for the following constraints.
    • Member type should be MANAGED.
    • Lifecycle should be ACTIVE.
    • Sum of node count ('maximum' in managed node configuration or the existing node count, whichever is greater) for all the node pools in cluster should not surpass the Limit.
  18. Validates the virtual node pool configuration in primary member properties for the following constraints.
    • Member type should be VIRTUAL.
    • Lifecycle should be ACTIVE.
    • Sum of node count ('maximum' in virtual node configuration or the existing node count, whichever is greater) for all the node pools in cluster should not surpass the Limit.
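
The uniqueness rule for load balancer, network load balancer, and vault mappings in steps 7 through 9 (A-B with C-B is not allowed, and A-B with A-D is not allowed) means that each source and each destination may appear in at most one mapping. The following is a minimal sketch of that rule using placeholder IDs. The function name `validate_mappings` and the tuple representation are hypothetical, not the member property format.

```python
def validate_mappings(mappings):
    """Check that a source-to-destination mapping is one-to-one.

    mappings: list of (source_id, destination_id) tuples.
    Returns a list of problem descriptions; empty means the mapping is valid.
    """
    seen_sources = set()
    seen_destinations = set()
    problems = []
    for source, destination in mappings:
        # A-B, A-D: the same source mapped twice is not allowed.
        if source in seen_sources:
            problems.append(f"source {source} is mapped more than once")
        # A-B, C-B: the same destination mapped twice is not allowed.
        if destination in seen_destinations:
            problems.append(f"destination {destination} is mapped more than once")
        seen_sources.add(source)
        seen_destinations.add(destination)
    return problems
```

With this sketch, `[("A", "B"), ("C", "D")]` passes, while `[("A", "B"), ("C", "B")]` and `[("A", "B"), ("A", "D")]` each report one problem.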

Standby Cluster

  1. Checks if the namespaces are a part of the backup.
  2. Checks if all the block volumes referenced in persistent volumes are a part of the DR Protection Group.
  3. Checks if all the file systems and mount targets referenced in persistent volumes are a part of the DR Protection Group.
  4. Checks if all the load balancers referenced in ingress classes are a part of the DR Protection Group.
  5. Checks if all the vaults referenced in secretproviderclasses are a part of the DR Protection Group.
  6. Checks if the custom resource definitions are compatible with the standby cluster version.
  7. Validates that the lifecycle state of the standby OKE cluster is Active.
  8. Validates that the peer cluster ID and backup location in the standby member properties are not empty.
  9. Validates the jump host in standby member properties for the following constraints.
    • The DR Protection Group must contain a compute instance with the same OCID as the jump host.
    • The jump host must be a non-movable compute instance.
    • The jump host must have the lifecycle state RUNNING.
  10. Validates the peer cluster ID in standby member properties for the following constraints.
    • Validates that the peer DR Protection Group has a cluster with the peer cluster ID.
    • Validates that the member cluster itself is not added as the peer cluster.
    • Validates that the peer cluster provided in the member properties is not added as the peer cluster for another OKE cluster in the DR Protection Group.
  11. Validates the destination load balancer ID in primary member properties for the following constraints.
    • Lifecycle should be Active.
    • Load balancer can have only regional subnet.
  12. Validates the destination network load balancer ID in primary member properties for the following constraints.
    • Lifecycle should be Active.
    • Network load balancer can have only regional subnet.
    • Validates that destination vault ID in primary member properties is Active.
  13. Validates the managed node pool configuration in standby member properties for the following constraints.
    • Member type should be MANAGED.
    • Lifecycle should be Active.
    • Sum of node count ('maximum' in managed node configuration or the existing node count, whichever is greater) for all the node pools in cluster should not surpass the Limit.
  14. Validates the virtual node pool configuration in standby member properties for the following constraints.
    • Member type should be VIRTUAL.
    • Lifecycle should be Active.
    • Sum of node count ('maximum' in virtual node configuration or the existing node count, whichever is greater) for all the node pools in cluster should not surpass the Limit.
  15. Validates the node pools in OKE cluster of the standby region for the following constraints.
    • Validates that the node count in all the node pools is at least one.
    • Validates that there is at least one active node in all node pools.
  16. Validates that the destination cluster node pool has at least one node in each AD where file systems (FSS) or block volumes will be restored.
  17. Validates the securityList and NSG rules defined on the subnet/network security group of the node pools and virtual node pools provided in the cluster by ensuring that the following rules are set.
    • Stateful ingress Rules
      • TCP ports 111, 2048, 2049, and 2050, and
      • UDP ports 111 and 2048
    • Stateful egress Rules
      • TCP source ports 111, 2048, 2049, and 2050, and
      • UDP source port 111.
        Note

        Validation is applicable only if the OKE cluster has a PV or PVC for a file system. Full Stack DR verifies only the following scenarios:
        • Scenario A: Mount target and instance are in different subnets (recommended).
        • Scenario B: Mount target and instance are in the same subnet.
        Full Stack DR does not verify the following scenarios:
        • Scenario C: Mount target and instance use TLS in-transit encryption.
        • Scenario D: Mount target uses LDAP for authorization.
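
The port requirements listed in step 17 can be checked against a set of rules with a simple coverage test. The sketch below is illustrative only. It represents each stateful rule as a hypothetical `(protocol, min_port, max_port)` tuple rather than the real security list or NSG rule structure, and reports which required ports no rule covers.

```python
# Required ports taken from the stateful ingress and egress rules listed above.
REQUIRED_INGRESS = {"TCP": {111, 2048, 2049, 2050}, "UDP": {111, 2048}}
REQUIRED_EGRESS_SOURCE = {"TCP": {111, 2048, 2049, 2050}, "UDP": {111}}


def missing_ports(required, rules):
    """Return required ports that no rule covers, grouped by protocol.

    required: dict mapping protocol name to a set of required ports.
    rules:    list of (protocol, min_port, max_port) tuples (illustrative).
    """
    missing = {}
    for protocol, ports in required.items():
        uncovered = sorted(
            port
            for port in ports
            if not any(
                rule_protocol == protocol and low <= port <= high
                for rule_protocol, low, high in rules
            )
        )
        if uncovered:
            missing[protocol] = uncovered
    return missing
```

An empty dictionary means every required port is covered by at least one rule.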