Why did my Data Guard Physical Standby always have an Apply Lag greater than 30 minutes?
Hope you’re doing good.
I had a call with a client some days ago. They have a Data Guard Physical Standby running in the Maximum Performance protection mode.
This protection mode provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by the time required to transmit redo data and receive acknowledgment from a standby database.
This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance.
This is the default protection mode.
You can read more about the Data Guard protection modes in the official documentation.
Well, first, let's check the Apply Lag for this standby database:
DGMGRL> show database 'PE1STD';

Database - PE1STD

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          37 minutes 56 seconds (computed 0 seconds ago)
  Average Apply Rate: (unknown)
  Real Time Query:    OFF
  Instance(s):
    PE1

Database Status:
SUCCESS
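As a cross-check, the same lag figures can usually be read from SQL on the standby through the V$DATAGUARD_STATS view. This is only a minimal sketch and was not part of the original troubleshooting session.

```sql
-- Run on the standby database.
-- VALUE is an interval string such as +00 00:37:56
-- (days hours:minutes:seconds).
SELECT name, value, time_computed
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag');
```

An apply lag that stays high while the transport lag is close to zero suggests the redo is arriving on time and is simply not being applied yet.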
Yes, we can confirm that the Apply Lag is almost 38 minutes, even though the Transport Lag is zero. A few possible explanations come to mind:
- Sessions on the database may be stuck on wait events;
- The disks holding the archivelogs (read operations) and the datafiles (write operations) may be slow, or experiencing slowness;
- Some configuration setting may be causing this behavior.
I checked the first two options: there were no waiting sessions, and the disks were performing well, with no change in throughput.
Well, now we need to check if we have any configuration that can lead to this behavior. So, the question is: is there any configuration that can cause a delay on Apply?
Short answer: Yes, there is!
In the Data Guard Broker, there is a property called DelayMins. If you manage the standby manually, without the Broker, the same setting is the DELAY attribute that you specify in the LOG_ARCHIVE_DEST_n parameter.
This property specifies the number of minutes log apply services will delay applying the archived redo log data on the standby database.
When the DelayMins property is set to the default value of 0 minutes, log apply services apply redo data as soon as possible.
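For a standby managed without the Broker, the delay would be set through the DELAY attribute of the redo transport destination. The statement below is only a sketch: the destination number (2), the service name pe1std, and the other attributes are assumptions for illustration and are not taken from the client's configuration.

```sql
-- Run on the primary. Hypothetical destination: adjust the destination
-- number, SERVICE and DB_UNIQUE_NAME to match your environment.
-- DELAY is expressed in minutes.
ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=pe1std ASYNC DELAY=30 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=PE1STD' SCOPE = BOTH;
```

When the Broker is in use, it manages the LOG_ARCHIVE_DEST_n parameters itself, so the value should be changed through the DelayMins property in DGMGRL rather than with ALTER SYSTEM.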
It is important to note that before you perform a switchover to a standby, you need to reset this property to 0, so that all redo data is applied on the standby.
You can read about this property in the official documentation.
Let’s check the value of this property:
DGMGRL> show database 'PE1STD' 'DelayMins';
  DelayMins = '30'
Yes! DelayMins is set to 30, so we will always have at least 30 minutes of delay before redo is applied on the standby database.
The following message was in the alert.log:
PR00 (PID:27025): Media Recovery Delayed for 30 minute(s) T-1.S-67190
You can also check whether the database has an apply delay configured by querying the DELAY_MINS column of the V$ARCHIVE_DEST view.
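A minimal sketch of that check follows. It assumes you run it where the destination pointing to the standby is defined, which is typically the primary. The filter on STATUS is only there to hide unused destinations.

```sql
-- List the configured archive destinations and their apply delay.
-- DELAY_MINS = 0 means redo is applied as soon as possible.
SELECT dest_id, dest_name, status, destination, delay_mins
  FROM v$archive_dest
 WHERE status <> 'INACTIVE';
```

A destination for the standby showing DELAY_MINS = 30 would match both the DelayMins value reported by the Broker and the "Media Recovery Delayed" message in the alert.log.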
Now, let’s change the value of DelayMins back to 0.
DGMGRL> edit database 'PE1STD' set property 'DelayMins' = '0';
Property "DelayMins" updated
In less than 3 minutes the Apply Lag was back to zero, and the physical standby was ready for a switchover, if needed.
DGMGRL> show database 'PE1STD';

Database - PE1STD

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 58.88 MByte/s
  Real Time Query:    OFF
  Instance(s):
    PE1

Database Status:
SUCCESS
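If the standby is not managed by the Broker, a configured delay can also be overridden on the standby side by restarting Redo Apply with the NODELAY clause. This is only a sketch of that alternative. It was not used in this case, and it should not be mixed with a Broker-managed configuration.

```sql
-- Run on the standby, only when it is NOT managed by the Broker.
-- Stop Redo Apply, then restart it ignoring the configured DELAY.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;
```

NODELAY only affects the current Redo Apply session. The DELAY attribute on the destination stays in place until it is changed there.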
Hope this helps!