Wednesday, November 26, 2014

On ODA, archive logs not being applied to the DR (standby) database


1) Alert log :

ORA-1153 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION...
Wed Nov 26 09:47:38 2014
FAL[client]: Failed to request gap sequence
GAP - thread 2 sequence 2422-2422
DBID 1036100936 branch 851819650
FAL[client]: All defined FAL servers have been attempted.

------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that's sufficiently large enough to maintain adequate log switch information to resolve
archivelog gaps.


Solution :

The message about the CONTROL_FILE_RECORD_KEEP_TIME parameter is often a red herring. The real culprit is the GAP line, for example:

GAP - thread 1 sequence 3126-3224

You are missing some archived log files. Find the log files from thread 1 for those sequences (in the alert log above, that is thread 2, sequence 2422) and manually copy them to your standby database. Then register each one on the standby with the following:

ALTER DATABASE REGISTER LOGFILE '/dir/filename';

Once you have registered all logfiles with the standby, life should resume as normal.

HTH,
Brian
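
When the gap spans many sequences, typing one REGISTER statement per file gets tedious. The following is a minimal Python sketch, not part of the original answer, that reads a GAP line from the alert log and prints one ALTER DATABASE REGISTER LOGFILE statement per missing sequence. The function name register_statements and the file name pattern /dir/arch_{thread}_{sequence}.arc are hypothetical; substitute the real names of the files you copied to the standby.

```python
import re

# Matches alert-log lines such as "GAP - thread 2 sequence 2422-2422".
GAP_RE = re.compile(r"GAP - thread (\d+) sequence (\d+)-(\d+)")


def register_statements(gap_line, path_template):
    """Return one REGISTER LOGFILE statement per sequence in the gap.

    path_template is a hypothetical naming pattern with {thread} and
    {sequence} placeholders; it must match how the copied files are
    actually named on the standby host.
    """
    match = GAP_RE.search(gap_line)
    if match is None:
        raise ValueError("not a GAP line: " + gap_line)
    thread, low, high = (int(group) for group in match.groups())
    statements = []
    for sequence in range(low, high + 1):  # the range in the log is inclusive
        path = path_template.format(thread=thread, sequence=sequence)
        statements.append("ALTER DATABASE REGISTER LOGFILE '{}';".format(path))
    return statements


if __name__ == "__main__":
    # GAP line taken from the alert log at the top of this post.
    for statement in register_statements(
            "GAP - thread 2 sequence 2422-2422",
            "/dir/arch_{thread}_{sequence}.arc"):
        print(statement)
```

The output is meant to be reviewed and then run in SQL*Plus on the standby; the script itself does not connect to the database.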



2) Error on Alert log :

Tue Nov 25 16:40:16 2014
WARN: ARC2: Terminating pid 5775 hung on an I/O operation
krsv_proc_kill: Killing 1 processes (Process by index)
ARC2: Error 16198 due to hung I/O operation to LOG_ARCHIVE_DEST_2
Tue Nov 25 16:50:12 2014

source : LGWR: Error 16198 due to hung I/O operation (Doc ID 1512424.1)

Cause :

This is an I/O problem. LGWR could not write to log_archive_dest_1 as the I/O operation is hung.

As can be seen from the alert.log, this is a hung I/O:

LGWR: Error 16198 due to hung I/O operation to LOG_ARCHIVE_DEST_1
LGWR: Detected ARCH process failure
From the RDA we could see that /rman3 is NFS mounted from the host gpttrtnlxmgmt
Disks Mounts:
gpttrtnlxmgmt:/rman3 4297753856    123008 4251116224   1% /rman3

From the operating system messages log we can see that there was an error connecting to this host:
RPC: error 5 connecting to server gpttrtnlxtest
nfs_statfs: statfs error = 5
...
RPC: error 512 connecting to server gpttrtnlxmgmt


Solution :
    
The NFS mount point is hung and cannot be used as a log archive destination.
As archivelogs are important to the database, the database will hang if there are any issues with the archivelog destination.

So ensure that the NFS mount point is working properly - refer the issue to your system administrator.
Otherwise use a different location for the archivelog destination where there are no known I/O issues.
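
Before handing the issue to the system administrator, it can help to confirm that the mount point really is unresponsive. The sketch below is one possible way to do that from Python, under the assumption that a hung NFS mount makes a file system call block indefinitely: it runs os.statvfs in a background thread and reports failure if the call has not returned within a timeout. The function name mount_responds and the 5 second default are illustrative choices, not something taken from the Oracle note.

```python
import os
import threading


def mount_responds(path, timeout_seconds=5.0):
    """Return True if os.statvfs(path) succeeds within the timeout."""
    result = {}

    def probe():
        try:
            os.statvfs(path)
            result["ok"] = True
        except OSError:
            # Path missing, stale handle, permission problem, and so on.
            result["ok"] = False

    # Daemon thread: if the call never returns, it will not keep the
    # interpreter alive when the script exits.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    # A thread that is still alive means the call is blocked, which is
    # how a hung mount point would typically show up.
    return (not worker.is_alive()) and result.get("ok", False)


if __name__ == "__main__":
    # /rman3 is the NFS mount point named in the note above.
    print("/rman3 responds:", mount_responds("/rman3"))
```

A False result only says that the path did not answer in time or returned an error; the operating system messages log is still the place to find the underlying RPC or NFS error.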



--
3) Error on Alert log :

Tue Nov 25 21:20:27 2014
WARN: ARC2: Terminating pid 3096 hung on an I/O operation
krsv_proc_kill: Killing 1 processes (Process by index)
ARC2: Detected ARCH process failure
ARC2: STARTING ARCH PROCESSES

Solution :

krsv_proc_kill: Killing processes (Process by index) message in the alert log

Sometimes the following warning messages appear in the database alert log, even continuously.

Alert log message:
==============
WARN: ARC2: Terminating ARCH (pid 10111) hung on a disk operation
Mon May 05 22:13:32 2014
krsv_proc_kill: Killing 266287972353 processes (Process by index)
Mon May 05 22:20:32 2014
ARC2: Detected ARCH process failure
ARC2: STARTING ARCH PROCESSES
Mon May 05 22:20:32 2014
ARC1 started with pid=21, OS id=12104
ARC1: Archival started
ARC2: STARTING ARCH PROCESSES COMPLETE
Mon May 05 22:20:41 2014
Deleted Oracle managed file /data/oracle/flash_recovery_area/ORCL/archivelog/2014_05_05/o1_mf_1_29497_9phnw343_.arc          

Do not panic when you see the above messages. As long as there are no other side effects, this warning can be ignored. It occurs when disk I/O and CPU load on the server are high. We normally see these warnings whenever a backup runs, since disk I/O is very high during the backup.

Under high disk I/O and CPU load, the database still tries to archive the file to disk but cannot complete the operation for the reasons above, so it automatically deletes the incomplete archive log from the disk.

source : http://oradba11g.blogspot.com/2014/05/krsvprockill-killing-processes-process.html
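
To check whether these warnings really cluster around backup windows (or, as in my case, around periods of slow connectivity), the alert log can be summarized per hour. The following is a rough Python sketch, assuming the alert log uses timestamp lines in the format shown above (for example "Tue Nov 25 16:40:16 2014") and an English locale. The function name hung_io_events_by_hour and the list of match strings are my own choices for illustration.

```python
from collections import Counter
from datetime import datetime

# Substrings that mark the warnings discussed in this post.
PATTERNS = ("hung on", "krsv_proc_kill", "Error 16198")
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def hung_io_events_by_hour(lines):
    """Count matching alert-log lines, grouped by the hour of the
    most recent timestamp line that precedes them."""
    current = None
    counts = Counter()
    for line in lines:
        text = line.strip()
        try:
            current = datetime.strptime(text, TIMESTAMP_FORMAT)
            continue  # timestamp lines themselves are not events
        except ValueError:
            pass
        # Lines that appear before any timestamp are skipped, because
        # there is no hour to attribute them to.
        if current is not None and any(p in text for p in PATTERNS):
            counts[current.strftime("%Y-%m-%d %H:00")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Tue Nov 25 16:40:16 2014",
        "WARN: ARC2: Terminating pid 5775 hung on an I/O operation",
        "krsv_proc_kill: Killing 1 processes (Process by index)",
        "ARC2: Error 16198 due to hung I/O operation to LOG_ARCHIVE_DEST_2",
        "Tue Nov 25 16:50:12 2014",
    ]
    for hour, count in sorted(hung_io_events_by_hour(sample).items()):
        print(hour, count)
```

For a real alert log, pass an open file object instead of the sample list and compare the busy hours with the backup schedule.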



Note : In my environment, these errors appeared in the alert log while connectivity between the DC and DR sites was slow.
