Learn to identify and resolve Oracle Data Guard issues.
When troubleshooting Oracle Data Guard, you must first determine whether the problem
occurs during the Data Guard setup and initialization or during Data Guard operation,
when lifecycle commands are entered. The steps to identify and resolve the issues are
different, depending on the scenario in which they are used.
There are three lifecycle operations: switchover, failover, and reinstate. The Data Guard
broker is used for all of these commands. The broker command line interface
(dgmgrl) is the main tool used to identify and troubleshoot the
issues. Although you can use logfiles to identify root causes, dgmgrl
is faster and easier to use to check and identify an issue.
Setting up and enabling Data Guard involves multiple steps. Log files are
created for each step. If any of the steps fail, review the relevant log file to
identify and fix the problem.
Validation of the primary cloud VM Cluster and database
Validation of the standby cloud VM Cluster
Recreating and copying files to the standby database (passwordfile and wallets)
Creating Data Guard through Network (RMAN Duplicate command)
Configuring Data Guard broker
Finalizing the setup
Troubleshooting Data Guard using logfiles The tools used to identify the issue and the locations of relevant logfiles are different, depending on the scenario in which they are used.
Troubleshooting the Data Guard Setup Process Review errors that can occur in the different steps of the Data Guard setup process. While some errors are displayed within the Console, most of the root causes can be found in the logfiles
The tools used to identify the issue and the locations of relevant logfiles
are different, depending on the scenario in which they are used.
Use the following procedures to collect relevant log files to investigate issues. If you
are unable to resolve the problem after investigating the log files, contact My Oracle
Support.
Note
When preparing collected files for
Oracle Support, bundle them into a compressed archive, such as a ZIP file.
On each compute node associated with the Data Guard configuration, gather log files
pertaining to the problem you experienced.
Enablement stage log files (such as those documenting the Create Standby
Database operation) and the logs for the corresponding primary or standby
system.
Enablement job ID logfiles. For example: 23.
Locations of enablement log files by enablement stage and Exadata system
(primary or standby).
Database name logfiles (db_name or
db_unique_name, depending on the file path).
Note
Check all nodes of the
corresponding primary and standby Exadata systems. Commands executed on a system may
have been run on any of its nodes.
Data Guard Deployer (DGdeployer) is the process that
performs the configuration. When configuring the primary database, it creates the
/var/opt/oracle/log/<dbname>/dgdeployer/dgdeployer.log
file.
This log should contain the root cause of a failure to configure the primary
database.
The primary log from the dbaasapi command-line
utility is:
/var/opt/oracle/log/dbaasapi/db/dg/<job_ID>.log.
Look for entries that contain dg_api.
One standby log from the dbaasapi command-line
utility is:
/var/opt/oracle/log/dbaasapi/db/dg/<job_ID>.log.
In this log, look for entries that contain dg_api.
The other standby log is:
/var/opt/oracle/log/<dbname>/dgcc/dgcc.log.
This log is the Data Guard configuration log.
The Oracle Cloud Deployment Engine (ODCE) creates the
/var/opt/oracle/log/<dbname>/ocde/ocde.log
file. This log should contain the cause of a failure to create the standby
database.
The dbaasapi command line utility creates the
var/opt/oracle/log/dbaasapi/db/dg/<job_ID>.log
file. Look for entries that contain dg_api.
The Data Guard configuration log file is
/var/opt/oracle/log/<dbname>/dgcc/dgcc.log.
DGdeployer is the process that performs the
configuration. It creates the following
/var/opt/oracle/log/<dbname>/dgdeployer/dgdeployer.log
file. This log should contain the root cause of a failure to configure the
standby database.
The dbaasapi command-line utility creates the
/var/opt/oracle/log/dbaasapi/db/dg/<job_ID>.log
file. Look for entries that contain dg_api.
The Data Guard configuration log is
/var/opt/oracle/log/<dbname>/dgcc/dgcc.log.
DGdeployer is the process that performs the
configuration. While configuring Data Guard, it creates the
/var/opt/oracle/log/<dbname>/dgdeployer/dgdeployer.log
file. This log should contain the root cause of a failure to configure the primary
database.
On each node of the primary and standby sites, gather log files for the
related database name (db_name).
Note
Check all nodes on both primary and standby Exadata systems. A lifecycle
management operation may impact both primary and standby systems.
Review errors that can occur in the different steps of the Data Guard setup
process. While some errors are displayed within the Console, most of the root causes can be
found in the logfiles
The password entered for enabling Data Guard didn't match the primary admin password
for the SYS user. This error occurs during the Validate Primary stage of
enablement.
The database may not be running. This error occurs during the Validate Primary stage
of enablement. Check with srvctl and sql on the
host to verify that the database is up and running on all nodes.
The primary database could not be configured. Invalid Data Guard commands or failed
listener reconfiguration can cause this error.
The TDE wallet could not be created. The Oracle Transparent Database Encryption (TDE)
keystore (wallet) files could not be prepared for transportation to the standby
site. This error occurs during the create TDE Wallet stage of enablement. Either of
the following items can cause failure at this stage:
The TDE wallet files could not be accessed
The enablement commands could not create an archive containing the wallet
files
Troubleshooting procedure:
Ensure that the cluster is accessible. To check the status of a cluster, run the
following command:
crsctl check cluster -all
If the cluster is down, run the following command to restart it:
crsctl start crs -wait
If this error occurs when the cluster is accessible, check the logs for create
TDE Wallet (enablement stage) to determine cause and resolution for the
error.
The archive containing the TDE wallet was likely not transmitted to the standby site.
Retrying usually solves the problem.
The primary and standby sites may not be able to communicate with each other to
configure the standby database. These errors occur during the configure standby
database stage of enablement. In this stage, configurations are performed on the
standby database, including the rman duplicate of the primary database. To
resolve this issue:
Verify the connectivity status for the primary and standby sites.
Ensure that the host can communicate from port 1521 to all ports. Check
the network setup, including Network Security Groups (NSGs), Network
Security Lists, and the remote VCN peering setup (if applicable). The
best way to test communication between the host and other nodes is to
access the databases using SQL*PLUS from the primary to standby and from
the standby to the primary.
The SCAN VIPs or listeners may not be running. Use the test above to help
identify the issue.
Possible causes:
SCAN VIPs or listeners may not be running. You can confirm this issue by using
the following commands on any cluster node.
[grid@exa1-****** ~]$ srvctl status
scan
[grid@exa1-****** ~]$ srvctl status
scan_listener
Databases may not be reachable. You can confirm this issue by attempting to
connect using an existing Oracle Net alias.
Troubleshooting procedure:
As the oracle OS user, check for the existence of an Oracle Net alias for the
container database (CDB). Look for an alias in
$ORACLE_HOME/network/admin/<dbname>/tnsnames.ora.
The following example
shows an entry for a container database named
db12c:
Verify that you can use the alias to connect to the database. For example, as
sysdba, enter the following command:
sqlplus sys@db12c
A possible cause for this error is that the Oracle Database sys or system user
passwords for the database and the TDE wallet may not be the same. To compare the
passwords:
Connect to the database as the sys user and check the TDE status in
V$ENCRYPTION_WALLET
.
Connect to the database as the system user and check the TDE status in
V$ENCRYPTION_WALLET
.
Update the applicable passwords to match. Log on to the system host as opc
and run the following commands:
When the switchover, failover, and reinstate commands are run, multiple error
messages may occur. Refer to the Oracle Database documentation for these error
messages.
Note
Oracle recommends using the Data Guard broker command line interface (dgmgrl) to
validate the configurations.
As the Oracle User, connect to the primary or standby database with
dgmgrl and verify the configuration and the
database:
If you were unable to resolve the problem using the information in this topic, follow the
procedures below to collect relevant database and diagnostic information. After you have
collected this information, contact Oracle Support.
Collecting Cloud Tooling Logs Use the relevant log files that could assist Oracle Support for further investigation and resolution of a given issue.
To collect the relevant Oracle diagnostic information and logs, run the dbaascli
diag collect command.
For more information about the usage of this utility, see DBAAS Tooling: Using dbaascli to Collect Cloud Tooling Logs and Perform a Cloud
Tooling Health Check.