Many times
Oracle DBA's are faced with a situation where shutting down the instance with
shutdown immediate seems to take infinite time and gives a impression that the
session is hung. In most of the cases this is not a hang. Hang can be defined
as a scenario when few sessions/processes are waiting on some action and in
turn blocking other session/processes to complete their tasks. Now the original
sessions could be waiting because of resource issues or Oracle Bugs.
Shutdown
immediate can take long time to complete (appear to be hung) because of three
reasons:
1. Uncommitted
transactions are being rolled back.
2. SMON is
cleaning temp segments or performing delayed block cleanouts.
3. Processes
still continue to be connected to the database and do not terminate.
1. Uncommitted
transactions are being rolled back:
This is the
case when the message 'Waiting for smon to disable tx recovery' is posted in
the alert log after we issue shutdown immediate.
There are two
reasons for this:
- A large query was running at the time of shutdown immediate.
-A large transaction was running at the time of shutdown immediate.
- A large query was running at the time of shutdown immediate.
-A large transaction was running at the time of shutdown immediate.
For large
queries:
SQL > select
count(*) from v$session_longops where time_remaining>0;
If it returns a
value > 0 then we can do a shutdown abort and then startup restrict and then
again shutdown immediate.
For large
transactions:
SQL > select
sum(used_ublk) from v$transaction;
If it returns a
large value then we have to wait for a long time for shutdowm to get completed.
If the large transaction is aborted and then shutdown is issued then we have to query v$fast_start_transactions and v$fast_start_server, we will not see anything in v$transaction at this time.
If the large transaction is aborted and then shutdown is issued then we have to query v$fast_start_transactions and v$fast_start_server, we will not see anything in v$transaction at this time.
At this
particular moment transaction recovery is going on and the count(*) will keep
on decreasing:
SQL > select
count(*) from v$fast_start_transaction;
Decreasing
count will show that recovery is going on and when the recovery is completed
the database will be shutdown.
But it is not
desirable under some circumstances such as, when we have very short maintance
window and we need to perform a shutdown immediate to do some work, in those
cases we can use the following event and set in the init.ora file TEMPERORARLY
To disable transaction recovery:
event="10513
trace name context forever, level 2"
and bounce the
instance and issue shutdown immediate to get complete without transaction
recovery.SMON will not do a transaction recovery untill this event is set in
the init.ora file so it is necessary to remove this event whenever you get a
chance to shutdown the database again, this time shutdown immediate can even
take 3-5 hours(Just remove this event from pfile).
2. SMON is
cleaning temp segments or performing delayed block cleanouts:
During a
SHUTDOWN IMMEDIATE and SHUTDOWN NORMAL, SMON cleans up extents which are no
longer needed and marking them as freed. It means that count from uet$ will
decrease and count in fet$ will increase.
To verify that
the temporary segments are decreasing have an active session available in SQL
during the SHUTDOWN IMMEDIATE. Run query to ensure the database is not hanging,
but is actually perform extent cleanup:
SQL>
select count(block#) from fet$;
COUNT(BLOCK)
----------
115
SQL>
select count(block#) from uet$;
COUNT(BLOCK)
----------
713
After some
time, issue the query again and check the results:
SQL>
select count(block#) from fet$;
COUNT(BLOCK)
----------
210
SQL
> select count(block#) from uet$;
COUNT(BLOCK)
----------
512
If you do not
have sufficient time to wait for this cleanup then you can set the following
event and bounce the database and reissue shutdown immediate to skip this
cleanup:
event="10061
trace name context forever, level 10"
It allows you
to prevent SMON from cleaning up temporary segments. Again it is not
recommended to set this event event forever. Whenever you have large downtime
remove this event and allow SMON to do its work.
3. Processes
still continue to be connected to the database and do not terminate:
After issuing
shutdown immediate, If we see entries in alert log file as:
Tue
Jan 8 12:00:27 2008
Active
call for process 10071 user 'oracle' program 'oracle@server.domain.abc (J001)'
SHUTDOWN:
waiting for active calls to complete.
Tue
Jan 8 12:00:57 2008
SHUTDOWN:
Active sessions prevent database close operation
It shows that
there are some active calls at program 'oracle@server.domain.abc
(J001)' which pmon is not able to clear up.This message is due to the fact that
database is waiting for pmon to clean up processes, but pmon is unable to clean
them. The client connections to the server are causing the shutdown immediate
or normal to hang. Do the following in this case:
1. Before
shutdown immediate, shutdown the listener:
$ lsnrctl stop
2. Now check if
there are any connection present at the database as:
$ ps -eaf |
grep LOCAL
It will give
you the OSPIDs of the client connected to database.
3 Manually kill
them as:
# Kill -9
<OSPID>
4. Issue
shutdown immediate now.
Do not forget
to bring up the listener after startup
In addition to
this you can set 10046 event in the session used to shutdown the instance. This
will help to tell the event on which session is waiting
SQL>alter
session set events '10046 trace name context forever, level 12'
SQL>Shutdown
immediate;
Look for the trace
file in user_dump_dest location. Also look at the alert.log for any other
messages. They might be helpful in case the shutdown is experiencing hang
situation.Source:Internet
No comments:
Post a Comment