January 20, 2017

Solving a critical performance crisis in a national project caused by Oracle connection slowness

In a live national healthcare IT system, one core gateway system was running very slowly.

The gateway system is used to distribute patients' admission and treatment information. Every patient may generate dozens of system messages when they visit a hospital: when they check in at the front desk, take X-rays, pay the bill, or when doctors write reports for their examinations, and so on.

The gateway system is designed to handle 100 messages per minute, but we found it could only process about 40. As a result, tens of thousands of messages accumulated in the queue, and at peak times patients sometimes had to wait hours before being attended to because their information did not show up in the IT system.
Many teams and engineers were involved in this significant crisis.

After debugging the gateway, I believed the slowness was caused by slow connections from the gateway system to the Oracle database.
The gateway establishes a new database connection for each message it processes in order to perform some updates, so it needs to connect to the database many times per second.
However, establishing each connection appeared to be slow.

Note that for an application using long-lived connections, even a connect time of up to 30 seconds is acceptable, because it only connects once in a long while. This gateway system, however, connects to the database so frequently that the delay in connecting results in terrible performance.


A script was written to repeatedly connect from the gateway server to the Oracle database; below is the time spent for each connection (in seconds):

time elapsed : 0,31
time elapsed : 0,31
time elapsed : 3,94
time elapsed : 0,33
time elapsed : 0,30
time elapsed : 0,33
time elapsed : 0,36
time elapsed : 0,30
time elapsed : 0,34
time elapsed : 0,25
time elapsed : 0,22
time elapsed : 0,34
time elapsed : 0,36
time elapsed : 0,30
time elapsed : 0,31
time elapsed : 0,28
time elapsed : 0,28
time elapsed : 0,27
time elapsed : 0,30
time elapsed : 3,81
time elapsed : 0,28
...........
...........

From the above we can see it takes around 0.3 seconds to establish a connection, which is already poor; worse, some connections intermittently take 3+ seconds to establish.

However, once a connection is made, executing SQL is fast, with no delay. The slowness only occurs while establishing the connection.

This is where the different teams started arguing.
The DBA experts thought it was a network issue.
The Solaris administrators argued that Oracle connection slowness must be either an Oracle database issue or a network issue.
The network experts insisted there was nothing wrong with the network.

So which part of the system was responsible: the Windows OS on the client server, the Solaris OS on the database server, the Oracle database, or the network layer?

OK, let me start the investigation. As a first step, I enabled sqlnet tracing for both the Oracle client on the gateway server and the database listener on the database server.
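For reference, this kind of tracing is switched on through sqlnet.ora on the client and listener.ora on the database server; a sketch of the entries used (trace levels and directories are illustrative, and the default listener name is assumed):

# sqlnet.ora on the gateway (client) server
TRACE_LEVEL_CLIENT = 16
TRACE_DIRECTORY_CLIENT = C:\temp\sqlnet_trace
TRACE_TIMESTAMP_CLIENT = ON

# listener.ora on the database server
TRACE_LEVEL_LISTENER = 16
TRACE_DIRECTORY_LISTENER = /tmp/listener_trace
TRACE_TIMESTAMP_LISTENER = ON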
The client trace reveals the delay; below we can see the client pending on the listener's response, so the slowness has nothing to do with the client server:

[14-APR-2016 05:58:39:329] nsprecv: entry
[14-APR-2016 05:58:39:329] nsprecv: reading from transport...
[14-APR-2016 05:58:39:329] nttrd: entry
[14-APR-2016 05:58:41:201] nttrd: socket 516 had bytes read=8
[14-APR-2016 05:58:41:201] nttrd: exit

For the same connection, the listener's trace reveals that it is waiting for the Solaris kernel to fork the shadow process and return its PID:
[14-APR-2016 05:58:39:566] nsevwait: 4 registered connection(s)
[14-APR-2016 05:58:39:566] nsevwait: 0 pre-posted event(s)
[14-APR-2016 05:58:39:566] nsevwait: waiting for transport event (1 thru 5)...
[14-APR-2016 05:58:41:358] nsevwait: 1 newly-posted event(s)
[14-APR-2016 05:58:41:358] nsevfnt: cxd: 0x222b28 stage 0: NT events set:
        READ
[14-APR-2016 05:58:41:358] nsevfnt: cxd: 0x222b28 stage 0: NS events set:

With the above information, we know the delay is in the Solaris kernel rather than in the Oracle client/listener or the network layer.
I then repeated the debugging while tracing the Solaris system. Although dtrace and guds captured no contention/interrupt/locking during the period, the truss command did capture some useful information.
Below are the high-level events we got from truss:

22626/1: 8.6926 fork1() = 3728 
3728/1: 8.6926 fork1() (returning as child ...) = 22626 
3728/1: 8.7229 fork1() = 3731 
3731/1: 8.7229 fork1() (returning as child ...) = 3728 
3731/1: 12.0249 write(17, " N T P 0 3 7 3 1\n", 10) = 10 
22626/1: 12.0252 read(16, " N T P 0 3 7 3 1\n", 64) = 10 

From the above we can see that process 22626 (the Oracle listener) forked process 3728, process 3728 then forked process 3731 (the Oracle shadow process), and process 3731 finally returned its PID; these steps took more than 3 seconds.

Meanwhile, the Oracle listener was simply looping and waiting for this PID to be returned by the Solaris kernel during the whole period.

So let's look deeper to see what process 3731 (the Oracle process) was doing that took it up to 3 seconds to return the PID to the listener:

3731/1: 9.3187 shmat(16777280, 0x380000000, IPC_ALLOC) = 0x380000000 
22626/1: pollsys(0x100333970, 4, 0x00000000, 0x00000000) (sleeping...) 
22626/2: 10.2679 nanosleep(0xFFFFFFFF785FBDF0, 0xFFFFFFFF785FBDE0) = 0 
22626/2: 10.2686 so_socket(PF_INET, SOCK_STREAM, IPPROTO_TCP, "", SOV_DEFAULT) = 14 
22626/2: 10.2689 fcntl(14, F_SETFD, 0x00000001) = 0 
22626/2: 10.2691 setsockopt(14, SOL_SOCKET, SO_LINGER, 0xFFFFFFFF785FBEDC, 8, SOV_DEFAULT) = 0 
22626/2: 10.2773 connect(14, 0x100341D88, 16, SOV_DEFAULT) Err#146 ECONNREFUSED 
22626/2: 10.2777 shutdown(14, SHUT_RDWR, SOV_DEFAULT) Err#134 ENOTCONN 
22626/2: 10.2780 close(14) = 0 
22626/2: 10.2783 so_socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP, "", SOV_DEFAULT) = 14 
22626/2: 10.2785 fcntl(14, F_SETFD, 0x00000001) = 0 
22626/2: 10.2787 setsockopt(14, SOL_SOCKET, SO_LINGER, 0xFFFFFFFF785FBEDC, 8, SOV_DEFAULT) = 0 
22626/2: 10.2790 connect(14, 0x100341ED8, 32, SOV_DEFAULT) Err#128 ENETUNREACH 
22626/2: 10.2791 shutdown(14, SHUT_RDWR, SOV_DEFAULT) Err#134 ENOTCONN 
22626/2: 10.2793 close(14) = 0 
22626/2: nanosleep(0xFFFFFFFF785FBDF0, 0xFFFFFFFF785FBDE0) (sleeping...) 
3731/1: 11.7921 shmat(16777281, 0x3C0000000, IPC_ALLOC) = 0x3C0000000 
3731/1: 11.7931 mprotect(0x380000000, 8192, PROT_READ) = 0 

From the above we can see the majority of the time was spent by process 3731 attaching to shared memory segment 0x380000000.

And memory segment 0x380000000 belongs to the database instance's SGA:

2767: ora_smon_QPROD 
Address Kbytes RSS Anon Locked Pgsz Mode Mapped File 
0000000380000000 45056 45056 - - 8K rwxs- [ dism shmid=0x100001e ] 
0000000382C00000 32768 - - - - rwxs- [ dism shmid=0x100001e ] 
0000000384C00000 53248 53248 - - 8K rwxs- [ dism shmid=0x100001e ] 
00000003C0000000 9699328 - - - - rwxs- [ dism shmid=0x100001f ] 
0000000610000000 9043968 9043968 - - 8K rwxs- [ dism shmid=0x100001f ] 
0000000840000000 4096 4096 - - 8K rwxs- [ dism shmid=0x1000020 ] 

This is where I noticed that the database is using DISM, so it is very possible we are hitting a bug in the Solaris DISM functionality, as there is a series of known defects around DISM.

This 11gR2 Oracle database is using the memory_target/memory_max_target parameters. With these two parameters in play, the Solaris DISM memory management mechanism is enabled.
To stop Oracle from using DISM, we simply stop using memory_target/memory_max_target and use sga_target/sga_max_size instead. With sga_target/sga_max_size, Oracle uses Solaris ISM instead of DISM, and we may be able to bypass the potential DISM defects.
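For reference, a minimal sketch of the parameter change itself (the sizes below are illustrative, not the values we finally agreed on):

-- Stop using Automatic Memory Management (memory_target), which is what
-- triggers DISM on Solaris; switch to explicit SGA/PGA sizing instead.
ALTER SYSTEM RESET memory_target SCOPE=SPFILE;
ALTER SYSTEM RESET memory_max_target SCOPE=SPFILE;
ALTER SYSTEM SET sga_max_size=14G SCOPE=SPFILE;
ALTER SYSTEM SET sga_target=14G SCOPE=SPFILE;
ALTER SYSTEM SET pga_aggregate_target=4G SCOPE=SPFILE;
-- Restart the instance for the spfile changes to take effect; afterwards,
-- pmap of a background process should show "ism" segments instead of "dism".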

I then sent out the suggestion to test sga_target/sga_max_size instead of memory_target/memory_max_target, as well as reducing the Oracle memory size defined in the spfile from 40G to 18G, and that is when I got a lot of doubt: no one seemed to believe that slowness in the gateway system could be solved by switching from memory_target/memory_max_target to sga_target/sga_max_size and reducing the Oracle memory size.
It sounded irrelevant, even ridiculous.

Well, in fact, in my career I have seen plenty of things more bizarre than this issue; you can find many of them in this blog. This one, for example:
http://dbakevin.blogspot.com.au/2012/05/application-crashed-after-increasing.html

Eventually they implemented the change, and then we re-did the connection test:

time elapsed : 0,13
time elapsed : 0,16
time elapsed : 0,17
time elapsed : 0,14
time elapsed : 0,13
time elapsed : 0,14
time elapsed : 0,12
time elapsed : 0,13
time elapsed : 0,13
time elapsed : 0,13
time elapsed : 0,13
time elapsed : 0,13
time elapsed : 0,11
time elapsed : 0,15
time elapsed : 0,17
time elapsed : 0,13
time elapsed : 0,12
time elapsed : 0,15
time elapsed : 0,15
time elapsed : 0,15

From the above we can see the connection time has dropped from around 0.3 seconds to 0.15 seconds on average, and the annoying intermittent 3-second connections are gone. In all, we have cut the connection time by roughly 60%.
On the gateway itself we can see a noticeable performance improvement, from 40 messages per minute to 60+ messages per minute.

This alleviated the performance problem and got us out of the crisis. Still, it did not satisfy me: 0.15 seconds per connection is still long, and we knew more and more regional hospitals would join this national project, so the load by the end of the year would be at least 50% higher than at that moment. I was therefore asked to further tune the capacity of this gateway system.

The shared server connection method then came to my mind: it should help.

Why?

We now know the bottleneck is the Solaris kernel forking the child process.
With the shared server connection method, Oracle maintains a pool of existing server processes, assigns one from the pool when a call comes in, and returns it to the pool when the connection ends.
So, in theory, we can bypass the fork-a-shadow-process step and its associated slowness.
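On the database side the setup is only a few commands; a minimal sketch, assuming a dedicated service for the gateway (the service name and counts here are illustrative, not the real ones):

-- Pre-spawned shared server processes and dispatchers for the gateway service:
ALTER SYSTEM SET shared_servers = 20 SCOPE=BOTH;
ALTER SYSTEM SET max_shared_servers = 50 SCOPE=BOTH;
ALTER SYSTEM SET dispatchers = '(PROTOCOL=TCP)(SERVICE=GATEWAY_SHARED)(DISPATCHERS=4)' SCOPE=BOTH;

-- On the gateway, the tnsnames.ora entry then asks for a shared server by
-- adding (SERVER=SHARED) to its CONNECT_DATA section.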


I then created a separate shared server service on the database server, dedicated to the gateway, and updated the gateway's tnsnames entry accordingly. Now let's look at the connection speed:

time elapsed : 0,03
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,03
time elapsed : 0,06
time elapsed : 0,04
time elapsed : 0,03
time elapsed : 0,03
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,03
time elapsed : 0,06
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,06
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,05
time elapsed : 0,03
time elapsed : 0,03
time elapsed : 0,03
time elapsed : 0,03
time elapsed : 0,06
time elapsed : 0,05

Now the gateway performance issue is fully solved: the capacity of the gateway is almost 100 messages per minute, which meets the demand.

More......

April 8, 2016

orphan entries in col$ (left behind by a failed online index creation) make the gather stats job fail with ORA-01847

The gather stats job always fails on one of our core tables with the error below:

EXEC dbms_stats.gather_table_stats(ownname=> 'SYSADM', 
tabname=>'gbprestatiegroep', method_opt=>'For all columns size 1',estimate_percent=> 100, 
cascade=>false, degree=>1, BLOCK_SAMPLE=>false, NO_INVALIDATE=>false ); 
Error report: 
ORA-01847: day of month must be between 1 and last day of month 
ORA-06512: at "SYS.DBMS_STATS", line 23829 
ORA-06512: at "SYS.DBMS_STATS", line 23880


With "cascade=>false" in above command, we define to only gather table data stats.
Then, we treid to gather index only stats for the indexes on this table and they all succeeded:

gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'IX_GBP_PATTRACKING',  GRANUL
gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'DPG17',  GRANULARITY=>'ALL',
gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'DPG2',  GRANULARITY=>'ALL',e
gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'DPG14',  GRANULARITY=>'ALL',
gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'CGBP01',  GRANULARITY=>'ALL'
gather_index_stats(OWNNAME=> 'SYSADM', INDNAME=> 'CIND_GBP_PK_S_KEY2',  GRANUL
............


The result above seems to indicate a data issue (corruption/invalid data) in the table while the data in the indexes is fine.
However, the commands below return results correctly without any issue:
select /*+full(table)*/ * from table;
select /*+full(table)*/ to_char(column1,'yyyy-mm-dd hh24:mi:ss'), to_char(column2,'yyyy-mm-dd hh24:mi:ss'), ..... from table;


To investigate, I enabled a level 3 errorstack, and it captured the failing SQL below:

select /*+  no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring no_substrb_pad  */
count(*),
count("SYS_NC00074$"),
count("SYS_NC00075$"),
count("SYS_NC00076$"),
count("SYS_NC00077$"),
count("SYS_NC00078$"),
count("SYS_NC00079$"),
sum(sys_op_opnsize("SLEUTEL")),
sum(sys_op_opnsize("CLIENTAPPLIC")),
sum(sys_op_opnsize("DEMONSTRATION_FU1")),
count("DEMONSTRATION_FU2"),
sum(sys_op_opnsize("DEMONSTRATION_FU2")),
count("CONTACTTEMPLATE"),
sum(sys_op_opnsize("CONTACTTEMPLATE")),
count("MEDICALVALIDATION"),
sum(sys_op_opnsize("AANVRAGER")),
...................
...................
from "SYSADM"."GBPRESTATIEGROEP" t ;


What the hell are these columns? They do not show up when we use the "DESC GBPRESTATIEGROEP" command:
count("SYS_NC00074$"),
count("SYS_NC00075$"),
count("SYS_NC00076$"),
count("SYS_NC00077$"),
count("SYS_NC00078$"),
count("SYS_NC00079$"),


After verifying, we confirmed that the columns below are the ones leading to the issue, and we can reproduce it by executing the following:

SQL> set lines 300
SQL> set pagesize 500
SQL> select
2  count("SYS_NC00074$"),
3  count("SYS_NC00075$"),
4  count("SYS_NC00076$"),
5  count("SYS_NC00077$"),
6  count("SYS_NC00078$"),
7  count("SYS_NC00079$")
8   from "SYSADM"."GBPRESTATIEGROEP" t ;
count("SYS_NC00074$"),
*
ERROR at line 2:
ORA-01847: day of month must be between 1 and last day of month

The SYS_NC0007* columns above do not exist in DBA_INDEXES/DBA_OBJECTS/DBA_TAB_COLUMNS.
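Normally the hidden virtual columns behind a function-based index are at least visible in DBA_TAB_COLS; a sketch of that check (given that the columns are not visible in the views above, the expectation is that nothing comes back for these names either):

select column_name, hidden_column, virtual_column
  from dba_tab_cols
 where owner = 'SYSADM'
   and table_name = 'GBPRESTATIEGROEP'
   and column_name like 'SYS_NC000%$';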

However, in col$ I find the following:

SQL> select OBJ#,COL#,SEGCOL#,SEGCOLLENGTH,NAME,TYPE#,NULL$,DEFLENGTH,DEFAULT$,INTCOL#,PROPERTY  
  2  from col$ where name in ('SYS_NC00074$','SYS_NC00075$','SYS_NC00076$','SYS_NC00077$','SYS_NC00078$','SYS_NC00079$') and obj#=15182;
      OBJ#       COL#    SEGCOL# SEGCOLLENGTH NAME                                TYPE#      NULL$  DEFLENGTH DEFAULT$                                                                            INTCOL#   PROPERTY
---------- ---------- ---------- ------------ ------------------------------ ---------- ---------- ---------- -------------------------------------------------------------------------------- ---------- ----------
     15182          0          0            7 SYS_NC00074$                          180          0         81 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-mm-dd hh24:mi:sssss.ff'         74     327976
     15182          0          0            7 SYS_NC00075$                          180          0         75 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-mm-dd hh24:mi:ss')              75     327976
     15182          0          0            7 SYS_NC00076$                          180          0         78 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-mm-dd hh24:mi:sssss')           76     327976
     15182          0          0            7 SYS_NC00077$                          180          0         75 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-mm-dd hh24:mi:ss')              77     327976
     15182          0          0            7 SYS_NC00078$                          180          0         75 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-mm-dd hh24:mi:ss')              78     327976
     15182          0          0            7 SYS_NC00079$                          180          0         75 TO_TIMESTAMP(TO_CHAR("PLANUITVOERING",'DD-MON-RR'),'yyyy-MM-dd hh24:mi:ss')              79     327976

6 rows selected. 

For the last few years, constant performance tuning has been ongoing at this site.
During that process, many indexes were created for poorly performing SQL, either found manually or suggested by the SQL advisor.

It might be that when one of the function-based indexes above was being created, the CREATE INDEX ... ONLINE command failed and, for some reason, left behind orphan entries in col$.

After testing, I managed to reproduce the bug in 11.2.0.3.10; it no longer exists in 11.2.0.4.



SQL> create table SYSADM.TESTORPHAN as select created from DBA_OBJECTS where 1=2;
Table created.

SQL> insert into SYSADM.TESTORPHAN values (to_date('2000-07-27 14:47:30','YYYY-MM-DD HH24:MI:SS'));
1 row created.

SQL> commit;
Commit complete.

SQL> create index SYSADM.INT_ERROR on SYSADM.TESTORPHAN(TO_TIMESTAMP(TO_CHAR("CREATED",'DD-MON-RR'),'yyyy-MM-dd hh24:mi:ss')) online;
create index SYSADM.INT_ERROR on SYSADM.TESTORPHAN(TO_TIMESTAMP(TO_CHAR("CREATED",'DD-MON-RR'),'yyyy-MM-dd hh24:mi:ss')) online
ERROR at line 1:
ORA-01847: day of month must be between 1 and last day of month

---------here it leaves orphan entries in the col$ table-------------


This is an obvious defect in the online index (re)build cleanup routine.
I checked the tables below and could not find any entries for the above columns in them:
IND$
ICOL$
CON$
SEG$
CCOL$
ICOLDEP$


I believe we can fix the issue by executing the SQL below:

delete from col$ where obj#=15182 and name in ('SYS_NC00074$','SYS_NC00075$','SYS_NC00076$','SYS_NC00077$','SYS_NC00078$','SYS_NC00079$');
update tab$ set INTCOLS=73, AVGRLN=228 where obj#=15182;
update obj$ set SPARE2=1 where obj#=15182;
commit;
shutdown abort
startup
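A sketch of the verification planned for after the restart, reusing the obj# and column names above:

-- The orphan rows should be gone from col$ ...
select count(*)
  from col$
 where obj#=15182
   and name in ('SYS_NC00074$','SYS_NC00075$','SYS_NC00076$',
                'SYS_NC00077$','SYS_NC00078$','SYS_NC00079$');

-- ... and the original stats call should now succeed.
EXEC dbms_stats.gather_table_stats(ownname=>'SYSADM', tabname=>'GBPRESTATIEGROEP', method_opt=>'For all columns size 1', estimate_percent=>100, cascade=>false);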


As this is a very critical environment and a huge site, I waited for an outage window and first enabled flashback before implementing the fix. The fix worked as I expected, and the gather table stats job then completed successfully for this table.
Cheers.

More......

June 10, 2014

long parsing time issue: many SQLs take 10+ seconds to do a hard parse

After one database was migrated to 11.2.0.3, the customer started to complain that performance was much slower than before.
After checking, we saw that for most SQLs the execution speed was fast while the parse time was far too long.

I took one SQL as an example and ran it multiple times to make sure there were no physical reads:

no rows selected
Elapsed: 00:00:00.11
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2761 consistent gets
0 physical reads
0 redo size
16589 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
0 rows processed

Then I added the comment /* hard parse by me */ to force a hard parse; there were still no physical reads:
no rows selected
Elapsed: 00:00:03.73
Statistics
----------------------------------------------------------
22 recursive calls
0 db block gets
2767 consistent gets
0 physical reads
0 redo size
16589 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
0 rows processed

With a soft parse the execution time was 0.11 seconds, but with a hard parse the SQL cost 3.73 seconds.
There was no lock/mutex/latch/pin contention in the database, and the server load was low.
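For readers unfamiliar with the trick used above: any change to the SQL text, even a throwaway comment, produces a different SQL_ID, so no existing cursor can be shared and the statement must be hard parsed. A trivial illustration:

-- Each of these is a distinct SQL text, so each run is a fresh hard parse:
SELECT /* hard parse by me run-001 */ COUNT(*) FROM dual;
SELECT /* hard parse by me run-002 */ COUNT(*) FROM dual;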

The query below showed that for many SQLs a single hard parse cost more than 50 seconds:

SQL> SELECT SQL_ID,
  AVG_HARD_PARSE_TIME_MS
FROM
(SELECT
    ROUND(AVG_HARD_PARSE_TIME / 1000, 2)
      AVG_HARD_PARSE_TIME_MS,
    SQL_ID,
    EXECUTIONS
  FROM
    V$SQLSTATS 
  ORDER BY AVG_HARD_PARSE_TIME DESC )
WHERE
  ROWNUM <= 10; 
SQL_ID        AVG_HARD_PARSE_TIME_MS
------------- ----------------------
63n9pwutt8yzw              223279.79
8pduputnkhukx              174644.33
gg9m2b0hjky1y              161064.91
g0vn88sqftx3c               66367.09
bbvy8xjs7gjgj               64747.76
7bgzzkwsqr6hc               63303.48
bzhxncjcy1k96               61963.71
8y387dwxggn5t               57265.29
1f25j21vvfv10               54766.09
0br8s1y3ma0qr               53578.61

This was a very complicated hospital system. Although bind variables were in use, a large amount of hard parsing was still inevitable.
For instance, to query a patient's report, doctors could use any combination of the filters below:
1. patient ID 
2. patient name 
3. patient walk in date 
4. surgery name 
5. surgery date 
6. doctor's name 
7. doctor's ID 
8. disease type 
9. severity of disease 
10. hospitalized or not 
11. Body Scan type:MRI/XRAY 
....... 
There are numerous combinations, and each one can generate a new SQL text and therefore a hard parse.

Due to the 11g Adaptive Cursor Sharing (ACS) feature, even one and the same SQL can generate multiple different child cursors (each requiring a hard parse) when the bind values differ.
To minimize the number of hard parses, I set the parameters below to disable ACS:
SQL> show parameter cursor
NAME                                   TYPE       VALUE
-------------------------------------- --------- ----------------------
_optimizer_adaptive_cursor_sharing     boolean   FALSE
_optimizer_extended_cursor_sharing     string    NONE
_optimizer_extended_cursor_sharing_rel string    NONE
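The values above were set roughly like this (these are hidden parameters, so treat this as a sketch to be tested first and, ideally, confirmed with Oracle Support):

ALTER SYSTEM SET "_optimizer_adaptive_cursor_sharing"=FALSE SCOPE=BOTH;
ALTER SYSTEM SET "_optimizer_extended_cursor_sharing"='NONE' SCOPE=BOTH;
ALTER SYSTEM SET "_optimizer_extended_cursor_sharing_rel"='NONE' SCOPE=BOTH;
-- If the instance refuses to change them dynamically, use SCOPE=SPFILE and restart.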

After setting the above parameters, the total hard parse count dropped by about 80% and the overall performance was much better.
But the long duration of each individual hard parse was not reduced, so the customer could still encounter slowness now and then whenever a SQL did fall into a hard parse.

Let's analyze the long parse time itself.
Here is a SQL trace summary:
call    count  cpu      elapsed    disk       query      current    rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse   1      0.01     0.00       0          0          0          0
Execute 1      3.30     3.30       0          6          0          0
Fetch   1      0.00     0.00       0          59         0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total   3      3.31     3.31       0          65         0          0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36 
Number of plan statistics captured: 1

And here is 10046 raw trace:
PARSING IN CURSOR #18446744071468120416 len=14352 dep=0 uid=36 oct=3 lid=36 tim=40253210782553 hv=3244419326 ad='7a995e8c8' sqlid='d228z5m0q3u7y'
PARSE #18446744071468120416:c=10000,e=9046,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=40253210782549
WAIT #18446744071468120416: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253213924594
WAIT #18446744071468120416: nam='SQL*Net more data to client' ela= 184 driver id=1413697536 #bytes=8145 p3=0 obj#=49430 tim=40253213934106
WAIT #18446744071468120416: nam='SQL*Net message from client' ela= 38091 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253213972419
EXEC #18446744071468120416:c=3310000,e=3313604,p=0,cr=6,cu=0,mis=1,r=0,dep=0,og=1,plh=1738446058,tim=40253217287241
WAIT #18446744071468120416: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253217289638
WAIT #18446744071468120416: nam='SQL*Net more data to client' ela= 181 driver id=1413697536 #bytes=8145 p3=0 obj#=49430 tim=40253217299453
WAIT #18446744071468120416: nam='SQL*Net message from client' ela= 3096 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253217302788
FETCH #18446744071468120416:c=0,e=3601,p=0,cr=59,cu=0,mis=0,r=0,dep=0,og=1,plh=1738446058,tim=40253217306558
WAIT #18446744071468120416: nam='SQL*Net message to client' ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253217311918
WAIT #18446744071468120416: nam='SQL*Net message from client' ela= 22854 driver id=1413697536 #bytes=1 p3=0 obj#=49430 tim=40253217334883


From the trace above, the major time was consumed not in the fetch but in the execute step. To rule out the influence of bind peeking, I used EXPLAIN PLAN FOR to analyze the SQL with only bind variables:
explain plan for select /*hard47*/ .......
DECODE(c.c1,:"SYS_B_14",:"SYS_B_15",:"SYS_B_16"),DECODE(c.c2,:"SYS_B_17",:"SYS_B_18",:"SYS_B_19"),
..........
FROM
TABLE1,TABLE2,VIEW1,VIEW2,VIEW3.....
WHERE NVL(QPQD_EX_KEY,:"SYS_B_60") > :"SYS_B_61"  AND   a.QPQD_AP_PATKEY = :1    
Elapsed: 00:00:03.66

The SQL still cost over 3 seconds to parse, even without bind peeking.
To make the test clearer, I picked another SQL, one that cost 13 seconds to parse.


We know some CBO parameters can limit the choices the optimizer considers when calculating an execution plan during a hard parse.
Here are the results for the 13-second SQL with different values of OPTIMIZER_FEATURES_ENABLE:

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '9.2.0.8'; --- 0.82 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.1.0'; --- 1.03 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.2.0.1'; --- 4.03 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '11.1.0.7'; --- 12.20 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '11.2.0.3'; --- 12.91 seconds

With OPTIMIZER_FEATURES_ENABLE=11.2.0.3 the parse time was about 13 seconds, while with 9.2.0.8 it was only 0.82 seconds.
Next I tested optimizer_mode and, as I had expected, with OPTIMIZER_FEATURES_ENABLE at 11.2.0.3 but optimizer_mode set to RULE, the parse time dropped to 0.5 seconds.

But in a production environment we cannot simply downgrade OPTIMIZER_FEATURES_ENABLE to 9.2.0.8 or switch optimizer_mode to RULE; these two parameters influence far too much.
I wanted to find a more specific parameter that addresses just this long parse problem.

Then I took a 10053 trace to analyze the parse process.
The 10053 trace file was 44MB, which was expected: if parsing a SQL cost 13 seconds but the 10053 trace were only a few hundred KB, something would be wrong.


Let's look into the structure of the 10053 trace file:

pacsrd1212 $ cat  QPROD_ora_8456_10053.trc|grep -i "permutations tried"|grep -v "permutations tried: 1"
Number of join permutations tried: 2
Number of join permutations tried: 6
Number of join permutations tried: 52
Number of join permutations tried: 2
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 499
Number of join permutations tried: 84
Number of join permutations tried: 2
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 798
Number of join permutations tried: 84
Number of join permutations tried: 80
Number of join permutations tried: 2
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 499
Number of join permutations tried: 84
Number of join permutations tried: 451
Number of join permutations tried: 483
Number of join permutations tried: 84
Number of join permutations tried: 2
Number of join permutations tried: 6
Number of join permutations tried: 52
Number of join permutations tried: 435
Number of join permutations tried: 84

And below is another 10053 trace, taken with OPTIMIZER_FEATURES_ENABLE = '9.2.0.8'; its size was only 2MB and the parse time was within 1 second:
pacsrd1212 $ cat QPROD_ora_7830_10053_2.trc|grep -i "permutations tried"|grep -v "permutations tried: 1"
Number of join permutations tried: 2
Number of join permutations tried: 6
Number of join permutations tried: 20
Number of join permutations tried: 2
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 3
Number of join permutations tried: 95

We can see that with OPTIMIZER_FEATURES_ENABLE at '11.2.0.3' the CBO evaluated far more join orders; no wonder it took so long.
We can set the hidden parameter "_OPTIMIZER_MAX_PERMUTATIONS" to limit the maximum number of join orders the CBO will try.
When I tested with _OPTIMIZER_MAX_PERMUTATIONS=50, the parse time dropped from 13 seconds to 6 seconds.
But this parameter was also not acceptable to me, since it influences too much.
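For completeness, that test was just a session-level setting along these lines:

-- Limit how many join permutations the CBO will evaluate (test only):
alter session set "_optimizer_max_permutations"=50;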

Let's review the join orders in the 10053 trace file; here is one sample:

PPOINTMENT_ITEMS[J]#6  QPQD_LINK[QPQD_LINK]#7  VW_NSO_3[VW_NSO_3]#11  QPQD_MACHINELINKS[K]#12  EXAMS[C]#10  EXAMINATIONS[E]#8  EXAMGROUPS[F]#9  VW_PRIORITY_SEQUENCE_PLACER[G]#13
Join order aborted: cost > best plan cost
Join order[449]:  QPQDVIEW541[A]#1  PATIENTS[B]#0  QPQD_LINK[Z]#2  APPOINTMENTS[D]#3  APPOINTMENT_ITEMS[H]#4  QPQD_ROOMLINKS[I]#5  APPOINTMENT_ITEMS[J]#6  QPQD_LINK[QPQD_LINK]#7  EXAMINATIONS[E]#8  EXAMGROUPS[F]#9  EXAMS[C]#10  VW_NSO_3[VW_NSO_3]#11  QPQD_MACHINELINKS[K]#12  VW_PRIORITY_SEQUENCE_PLACER[G]#13
Join order aborted: cost > best plan cost
Considering multiple instances based initial join order.
Best join order so far: 358
The CBO found the best join order at the 358th permutation, so if we limit _OPTIMIZER_MAX_PERMUTATIONS to a small value, we may miss this join order as our best plan.

We can see there are 14 tables there; in theory, the maximum number of join orders is 14*13*12*11*...*3*2*1 = 87,178,291,200.
Of course the CBO is not so naive as to test all 87 billion join orders.
Based on its rules, the CBO ignores most join orders that are destined to be inefficient.

What we really need to do is limit the CBO's choices by tuning some of its specific parameters:
the fewer useless join orders the CBO tries, the less parse time is needed.
Next, I reviewed the slow-parsing SQLs themselves.
I noticed they share some common traits: they all contain lots of tables and complex views, as well as nested subqueries inside IN/EXISTS.

So I then began to test some of the parameters related to unnesting and merging:
alter session set query_rewrite_enabled = false; 
alter session set"_optimizer_table_expansion"=false;
alter session set "_and_pruning_enabled"=false;
alter session set "_subquery_pruning_enabled"=false;
alter session set "_optimizer_fast_access_pred_analysis" =false;
alter session set "_optimizer_sortmerge_join_enabled"   = false;
alter session set "_optimizer_sortmerge_join_inequality" = false;
............... 

None of the above parameters helped. I reviewed the parameters again and went back to the test I had done earlier:
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '9.2.0.8'; --- 0.82 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.1.0'; --- 1.03 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.2.0.1'; --- 4.03 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '11.1.0.7'; --- 12.20 seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '11.2.0.3'; --- 12.91 seconds

And this caught my attention:
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.1.0.3';     ---  1.40  seconds
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE = '10.1.0.4';     ---  3.59  seconds

OPTIMIZER_FEATURES_ENABLE controls a bunch of CBO-related parameters, and changing it is equivalent to changing a whole bundle of those CBO parameters together.

From previous experience I knew there was a parameter introduced in 10.1.0.4 that has caused a lot of issues ever since:
_optimizer_cost_based_transformation.
I felt this parameter was likely the root cause.

Let's test this parameter. Below is how the SQL that originally took 13 seconds to parse behaved as the parameter changed:

alter session set "_optimizer_cost_based_transformation"=exhaustive;    ---- Elapsed: 00:00:14.11
alter session set "_optimizer_cost_based_transformation"=iterative;     ---- Elapsed: 00:00:13.71
alter session set "_optimizer_cost_based_transformation"=linear;        ---- Elapsed: 00:00:13.97
alter session set "_optimizer_cost_based_transformation"=on;            ---- Elapsed: 00:00:13.97
alter session set "_optimizer_cost_based_transformation"=off;           ---- Elapsed: 00:00:03.05
After setting it to off, the parse time dropped to 3 seconds.

Later I did more tests and tuned a few more parameters, including:
alter session set "_b_tree_bitmap_plans"=false; (because I saw some useless bitmap plan evaluation in the 10053 trace)
alter session set "optimizer_index_cost_adj" = 50;
..................

In the end, the parse time for that SQL dropped from 13 seconds to 1 second, and all the problem SQLs had their parse time reduced by more than 90%.






To conclude, let me make it a bit clearer:
Running a SQL has two major steps:
1. PARSE: in this step Oracle translates the SQL and finds the best execution plan for it.
2. EXECUTE: in this step Oracle runs the execution plan generated in step 1 and returns the data.

In most cases, a SQL's time is mainly consumed in step 2, and that is what we usually tune: we adjust execution plans, add indexes, and so on.

But in our case, because the SQLs are so complex, Oracle spends too much time in step 1 while step 2 is quite fast.

Some of the problem SQLs join around 14 tables (a view can itself be made of multiple tables).
So Oracle evaluates up to 14*13*12*...*3*2*1 possible join orders for those 14 tables.
Finally, after 60 seconds, Oracle says:
"OK, I have done my research and found the best plan; using this plan, the SQL's execution can be done in 1 second."

I must say Oracle is completely right.
But as you can see, our problem is not
"using this plan, the SQL's execution can be done in 1 second."

Our problem is that Oracle spends 60 seconds finding that best plan.


I changed those parameters:
alter session set "_optimizer_cost_based_transformation"=off;
alter session set "_b_tree_bitmap_plans"=false;
alter session set "optimizer_index_cost_adj" = 50;

It does not mean I was dissatisfied with the plans Oracle had been finding and wanted to change them.
For example,
before I set "_b_tree_bitmap_plans"=false, Oracle would consider these execution plans for a SQL:
1. table1 table2: no bitmap
2. table2 table1: no bitmap
3. table1 table2: bitmap
4. table2 table1: bitmap

and oracle finally picks:
1. table1 table2: no bitmap

Then I set "_b_tree_bitmap_plans"=false;
That will help oracle to know that it doesn't need to consider bitmap execution plans, so oracle will consider only below two plans:
1. table1 table2: no bitmap
2. table2 table1: no bitmap

And in the end Oracle still picks this plan:
1. table1 table2: no bitmap

We can see Oracle picks the same plan in the end, but because it has less to consider, the time spent in the PARSE step is reduced.


So the purpose of changing all those parameters was to limit Oracle's choices.

As a result, Oracle may still pick the same execution plan, or a different one, but that is beside the point: the EXECUTE step was not the bottleneck in our case and not our concern.

More......

May 28, 2014

database memory exhaustion caused by vmware ballooning (vmmemctl.sys)

One DB was crashing frequently.
Here is the entry in the alert.log:

Fri Oct 11 04:11:22 2013
Errors in file c:\oracle\product\10.2.0\admin\qprod\bdump\qprod_arc0_2380.trc:
ORA-00202: Message 202 not found; No message file for product=RDBMS, facility=ORA; arguments: [E:\ORACLE\ORADATA\QPROD\CONTROL02.CTL]
ORA-27091: Message 27091 not found; No message file for product=RDBMS, facility=ORA
ORA-27070: Message 27070 not found; No message file for product=RDBMS, facility=ORA
OSD-04006: ReadFile() failure, unable to read from file
O/S-Error: (OS 1453) Insufficient quota to complete the requested service.

The database failed to perform an I/O on its controlfile because of OS error 1453, hence the database terminated.
Below is Microsoft's explanation of OS error 1453:


OS error 1453 is ERROR_WORKING_SET_QUOTA; the working set is the part of a process's memory that is kept in physical RAM.
The error above indicates that Oracle failed to obtain physical memory.


From the trace file we could see there was not enough physical memory left when the DB crashed:
Dump file c:\oracle\product\10.2.0\admin\qprod\bdump\qprod_arc0_2380.trc
Fri Oct 11 04:11:15 2013
ORACLE V10.2.0.4.0 - 64bit Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V5.2 Service Pack 2
CPU                 : 4 - type 8664, 4 Physical Cores
Process Affinity    : 0x0000000000000000
Memory (Avail/Total): Ph:79M/8191M, Ph+PgF:11824M/20032M
Instance name: qprod

In the Windows event manager, we could also see errors indicating a memory issue:


Based on the above, the error the DB hit was due to the server's memory being exhausted during the issue period.
(Note that when memory runs out, different Oracle processes may die with different ORA- errors and symptoms in the alert.log, but the root cause is the same: out of memory.)

We configured a tool to monitor the server's memory usage.
From its log we could see that during the issue period, 2.8GB of free memory was consumed within a few minutes:



We infer that the 2.8GB of free memory was consumed within about two minutes, then the database died, and later the memory was released (since the database had died, more memory was free afterwards).

I checked the activity history inside the database, including the PGA and SGA history, and could see there was no unusual activity: the database's memory usage was stable and did not change during the issue period.
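For reference, a sketch of the kind of AWR query used to confirm the SGA stayed flat across the issue window (snapshot filtering omitted):

select s.snap_id,
       s.begin_interval_time,
       round(sum(g.bytes)/1024/1024) as sga_mb
  from dba_hist_sgastat g
  join dba_hist_snapshot s
    on s.snap_id = g.snap_id
   and s.dbid = g.dbid
   and s.instance_number = g.instance_number
 group by s.snap_id, s.begin_interval_time
 order by s.snap_id;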

From then on we worked on identifying which external process had suddenly consumed all the memory during that period.

The weirdest part was that, according to the system-wide logs, all available memory was consumed within a few minutes, yet when we looked at each individual process's monitoring log, every process's memory usage was stable and unchanged during the issue period.

It was a fact that all the memory was consumed, but in the monitoring logs no process admitted responsibility for it.


Since the issue always happened at around 4:00 AM local time, I logged into the machine at 4:00 AM and monitored the system closely for a few days.
I could see the issue recur every day at 4:00 AM; on some days the database died from memory exhaustion, on other days it hung until the memory was released and survived.

For the past few days I tried different debugging tools to dig into the kernel, until today I noticed that during the issue period one strange thread of the SYSTEM process became heavily active (the top-ranked CPU consumer at 13.48%), while it was not there during normal periods:


We saw that during the issue window the kernel thread running vmmemctl.sys suddenly became heavily active, and very soon physical memory usage rose from 60% to 100%.

Explanation for vmmemctl.sys:
The memory ballooning function is not related to the shares/reservation settings in the VM's properties.
Memory ballooning is handled through a driver (vmmemctl.sys) that is installed as part of the VMware Tools.
This driver is loaded in the guest OS to interact with the VMkernel and is leveraged to reclaim memory pages when ESX memory resources are in demand and available physical pages cannot meet requirements.
When memory demands rise on the ESX host, the VMkernel will instruct the balloon driver to "inflate" and consume memory in the running guest OS,
forcing the guest operating system to leverage its own native memory management techniques to handle changing conditions.
Free pages are typically released first, but the guest OS may decide to page some memory out to its pagefile on the virtual disk.
The reclaimed memory is then used by ESX to satisfy memory demands of other running workloads, but will be relinquished back to the guest OS when memory demands decrease by "deflating" the balloon driver.
Balloon driver activity can be viewed either through VirtualCenter performance monitoring graphs or ESXTOP on the local host.



From the explanation it is now clear that during the issue period memory was exhausted on the ESX host, and hence it grabbed memory back from the guest OS, namely our DB server.



Below are the steps to check the ballooning from ESX host:
a. Run esxtop.
b. Type m for memory
c. Type f for fields
d. Select the letter J for Memory Ballooning Statistics (MCTL)
e. Look at the MCTLSZ value.
MCTLSZ (MB) displays the amount of guest physical memory reclaimed by the balloon driver.

Below are steps to disable memory balloon:
Disabling ballooning via the vSphere Client
To set the maximum balloon size to zero:
1. Using the vSphere Client, connect to the vCenter Server or the ESXi/ESX host where the virtual machine resides.
2. Log into the ESXi/ESX host as a user with administrative rights.
3. Shut down the virtual machine.
4. Right-click the virtual machine listed on the Inventory panel and click Edit Settings.
5. Click the Options tab, then under Advanced, click General.
6. Click Configuration Parameters.
7. Click Add row and add the parameter sched.mem.maxmemctl in the text box.
8. Click on the row next to it and add 0 in the text box.
9. Click OK to save changes.
To re-enable the balloon driver in a virtual machine:
1. Using the vSphere Client, connect to the vCenter Server or the ESXi/ESX host where the virtual machine resides.
2. Shut down the virtual machine if it is powered on.
3. SSH to the ESXi/ESX host. For more information, see Connecting to an ESX host using an SSH client (1019852).
4. Change directory to the datastore where the virtual machine's configuration file resides.
5. Back up the virtual machine's configuration file.
6. Edit the virtual machine's configuration file (virtual_machine_name.vmx) and remove this entry:
sched.mem.maxmemctl = "0"
7. Save and close the file.
8. Power on the virtual machine.
Note: You cannot remove the entry via the Configuration Parameters UI once it has been added. You must edit the configuration file (.vmx) for the virtual machine to remove the entry.
-------------------------------------------------------
Disabling ballooning via the Windows registry
To disable ballooning on the virtual machine:
Note: This procedure modifies the Windows registry. Before making any registry modifications, ensure that you have a current and valid backup of the registry and the virtual machine. For more information on backing up and restoring the registry, see the Microsoft Knowledge Base article 136393.
1. Log into the guest OS.
2. Click Start > Run, type regedit, and press Enter. The Registry Editor window opens.
3. Navigate to:
\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\VMMEMCTL
4. Change the Start key from 2 to 4.
5. Save the setting and restart the guest OS.
-------------------------------------------------------
Disabling ballooning via VMware Tools uninstallation/reinstallation
1. Uninstall VMware Tools from the guest OS.
2. Reinstall VMware Tools using the Custom Settings option, and deselect the Memory Control Drivers.

More......

December 20, 2012

Performance Troubleshooting: How much can a SEQUENCE slow down a SQL?

How much can a SEQUENCE drag down the speed of a SQL?
10%? 20%? The working case below shows a sequence making one SQL six times slower.

While recently troubleshooting a job's performance issue, I noticed this SQL in the job.
It is quite a simple SQL, a two-table join; the only special thing is that it uses the sequence GGGC_SSBA.SEQ_MIM_CCCC_PPPP_SSS.NEXTVAL:

dwxxxx2 > SELECT /*+ USE_HASH(S1 TM)*/    GGGC_SSBA.SEQ_MIM_CCCC_PPPP_SSS.NEXTVAL        ,330404
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2100-12-31', 'YYYY-MM-DD')    ,S1.MATNR AS MATNR      ,S1.VKORG AS
  VKORG   ,S1.VTWEG AS VTWEG      ,S1.KONDM AS KONDM      ,S1.DWERK AS DWERK
  ,S1.VMSTA AS VMSTA      ,S1.VMSTD AS VMSTD      ,S1.LVORM AS LVORM
  ,S1.DT_TM_XTR AS DT_TM_XTR      ,S1.CDHDR_UDATE AS CDHDR_UDATE  
  ,S1.PROVG AS PROVG FROM    GEPC_TEMP_TABLE.KKK_MIM_XXXX_RRRR_SSS TM,
  GEPC_SOURCE.EEEIE_ZZZZZ_MMML_SLS S1 WHERE      TM.SOURCE_ROWID = S1.ROWID;

643418 rows selected.
Elapsed: 00:01:00.21

Execution Plan
----------------------------------------------------------
Plan hash value: 3276368232
--------------------------------------------------------------------------------------------
| Id  | Operation           | Name                  | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|   1 |  SEQUENCE           | SEQ_MIM_CCCC_PPPP_SSS |       |       |            |         |
|*  2 |   HASH JOIN         |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|   3 |    TABLE ACCESS FULL| KKK_MIM_XXXX_RRRR_SSS | 36837 |   359K|    29   (0)| 00:00:01|
|   4 |    TABLE ACCESS FULL| EEEIE_ZZZZZ_MMML_SLS  |   643K|    38M|   724   (3)| 00:00:14|
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("TM"."SOURCE_ROWID"="S1".ROWID)

Statistics
----------------------------------------------------------
     128804  recursive calls
      96793  db block gets
      77309  consistent gets
        661  physical reads
   21141944  redo size
   25998756  bytes sent via SQL*Net to client
     472358  bytes received via SQL*Net from client
      42896  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     643418  rows processed

From the output above we see the SQL's elapsed time is one minute.
That is too long; considering the data volume of the query's base tables, it should not take that much time.

We also notice the large "recursive calls" and "db block gets" statistics.
It is clear that they are caused by the sequence in the SQL.

So let's remove the sequence from the query and check again:
dwxxxx2 > SELECT /*+ USE_HASH(S1 TM)*/    330404
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2100-12-31', 'YYYY-MM-DD')    ,S1.MATNR AS MATNR      ,S1.VKORG AS
  VKORG   ,S1.VTWEG AS VTWEG      ,S1.KONDM AS KONDM      ,S1.DWERK AS DWERK
  ,S1.VMSTA AS VMSTA      ,S1.VMSTD AS VMSTD      ,S1.LVORM AS LVORM
  ,S1.DT_TM_XTR AS DT_TM_XTR      ,S1.CDHDR_UDATE AS CDHDR_UDATE  
  ,S1.PROVG AS PROVG FROM    GEPC_TEMP_TABLE.KKK_MIM_XXXX_RRRR_SSS TM,
  GEPC_SOURCE.EEEIE_ZZZZZ_MMML_SLS S1 WHERE      TM.SOURCE_ROWID = S1.ROWID;

643418 rows selected.
Elapsed: 00:00:10.23

Execution Plan
----------------------------------------------------------
Plan hash value: 2074598172
-------------------------------------------------------------------------------------------
| Id  | Operation          | Name                  | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|*  1 |  HASH JOIN         |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|   2 |   TABLE ACCESS FULL| KKK_MIM_XXXX_RRRR_SSS | 36837 |   359K|    29   (0)| 00:00:01|
|   3 |   TABLE ACCESS FULL| EEEIE_ZZZZZ_MMML_SLS  |   643K|    38M|   724   (3)| 00:00:14|
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("TM"."SOURCE_ROWID"="S1".ROWID)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      44998  consistent gets
          0  physical reads
          0  redo size
   21501260  bytes sent via SQL*Net to client
     472358  bytes received via SQL*Net from client
      42896  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     643418  rows processed

Without the sequence, the query completes in 10 seconds.
In other words, the sequence makes this SQL six times slower.

Let's check the sequence's attributes:
dwxxxx2 >select * from dba_sequences where SEQUENCE_NAME='SEQ_MIM_CCCC_PPPP_SSS';
SEQUENCE_OWNER SEQUENCE_NAME         INCREMENT_BY C O CACHE_SIZE LAST_NUMBER
-------------- --------------------- ------------ - - ---------- -----------
GGGC_SSBA      SEQ_MIM_CCCC_PPPP_SSS            1 N N         20   109594869

The CACHE_SIZE is 20.
That is the default value, and it is not suitable in our case.
I increased the cache to 500 and re-executed the SQL.
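The change itself is a one-liner:

-- With CACHE 500 the data dictionary (SEQ$) only has to be updated once per
-- 500 NEXTVAL calls instead of once per 20, cutting recursive calls and redo:
ALTER SEQUENCE GGGC_SSBA.SEQ_MIM_CCCC_PPPP_SSS CACHE 500;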

This time:
dwxxxx2 > SELECT /*+ USE_HASH(S1 TM)*/    GGGC_SSBA.SEQ_MIM_CCCC_PPPP_SSS.NEXTVAL        ,330404
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2012-12-18 06:45:18', 'YYYY-MM-DD HH24:MI:SS')
  ,TO_DATE('2100-12-31', 'YYYY-MM-DD')    ,S1.MATNR AS MATNR      ,S1.VKORG AS
  VKORG   ,S1.VTWEG AS VTWEG      ,S1.KONDM AS KONDM      ,S1.DWERK AS DWERK
  ,S1.VMSTA AS VMSTA      ,S1.VMSTD AS VMSTD      ,S1.LVORM AS LVORM
  ,S1.DT_TM_XTR AS DT_TM_XTR      ,S1.CDHDR_UDATE AS CDHDR_UDATE  
  ,S1.PROVG AS PROVG FROM    GEPC_TEMP_TABLE.KKK_MIM_XXXX_RRRR_SSS TM,
  GEPC_SOURCE.EEEIE_ZZZZZ_MMML_SLS S1 WHERE      TM.SOURCE_ROWID = S1.ROWID;

643418 rows selected.
Elapsed: 00:00:29.60

Execution Plan
----------------------------------------------------------
Plan hash value: 3276368232
--------------------------------------------------------------------------------------------
| Id  | Operation           | Name                  | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|   1 |  SEQUENCE           | SEQ_MIM_CCCC_PPPP_SSS |       |       |            |         |
|*  2 |   HASH JOIN         |                       | 36837 |  2590K|   759   (3)| 00:00:14|
|   3 |    TABLE ACCESS FULL| KKK_MIM_XXXX_RRRR_SSS | 36837 |   359K|    29   (0)| 00:00:01|
|   4 |    TABLE ACCESS FULL| EEEIE_ZZZZZ_MMML_SLS  |   643K|    38M|   724   (3)| 00:00:14|
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("TM"."SOURCE_ROWID"="S1".ROWID)

Statistics
----------------------------------------------------------
       5153  recursive calls
       3870  db block gets
      46287  consistent gets
          0  physical reads
     844460  redo size
   25998756  bytes sent via SQL*Net to client
     472358  bytes received via SQL*Net from client
      42896  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     643418  rows processed

The SQL now completes in 30 seconds. The "db block gets" and "recursive calls" statistics are also much smaller than in the first scenario, where the cache was 20.

30 seconds means the sequence still slows the SQL down by a factor of three, which is still not reasonable.
The DB has multiple similar queries and I don't have enough free time to dig deeper into this right now; I may raise a ticket with Oracle Support once I get some free time.
More......

December 7, 2012

MMON Slave Process Reports ORA-7445 [kglic0] Error, Plus Database Hangs

A DB suddenly hung under load, and the user came to me.
When I logged into the server, I could see the problem DB consuming 100% of the CPU.

Tried "sqlplus / as sysdba" but also hang.
I generated a systemstate dump and noticed that PMON was blocked:

PROCESS 2: PMON
    O/S info: user: oracle, term: UNKNOWN, ospid: 1193 
    OSD pid info: Unix process pid: 1193, image: oracle@alpcisddb484.corporate.ge.com (PMON)
      waiting for 60108b48 Child shared pool level=7 child#=3 
        Location from where latch is held: kgh.h LINE:6387 ID:kghalo: 
        Context saved from call: 0
        state=busy [holder orapid=89] wlstate=free [value=0]
          waiters [orapid (seconds since: put on list, posted, alive check)]:
possible holder pid = 89 ospid=10478

Let's check which session is holding latch 60108b48, and what that session is waiting for:
PROCESS 89: 
    O/S info: user: oracle, term: UNKNOWN, ospid: 10478 
    OSD pid info: Unix process pid: 10478, image: oracle@alpcisddb484.corporate.ge.com
      holding    (efd=8) 60108b48 Child shared pool level=7 child#=3 
    Current Wait Stack:
     0: waiting for 'library cache: mutex X'
        idn=0xfc57b2b7, value=0xd100000000, where=0x4f
        wait_id=57 seq_num=60 snap_id=1
        wait times: snap=45 min 11 sec, exc=45 min 11 sec, total=45 min 11 sec
        wait times: max=infinite, heur=45 min 11 sec
        wait counts: calls=0 os=0
        in_wait=1 iflags=0x5a2

Let's see which session is holding mutex 0xfc57b2b7, and what that session is waiting for:
PROCESS 42: M000
    SO: 0x19f40dda8, type: 4, owner: 0x19f2de3f8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x19f2de3f8, name=session, file=ksu.h LINE:12459, pg=0
    (session) sid: 209 ser: 521 trans: (nil), creator: 0x19f2de3f8
    Current Wait Stack:
      Not in wait; last wait ended 45 min 10 sec ago 

OK, so far we have found that the root blocker is M000.
It is holding the mutex:
PROCESS 42: M000
    SO: 0x19f40dda8, type: 4, owner: 0x19f2de3f8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x19f2de3f8, name=session, file=ksu.h LINE:12459, pg=0
    (session) sid: 209 ser: 521 trans: (nil), creator: 0x19f2de3f8
    Current Wait Stack:
      Not in wait; last wait ended 45 min 10 sec ago 
      KGL-UOL SO Cache(total=240, free=0)
      KGX Atomic Operation Log 0x17d569fc0
       Mutex 0x19eb2b4c8(209, 0) idn 1fc57b2b7 oper EXCL
       Library Cache uid 209 efd 7 whr 49 slp 0
       oper=0 pt1=(nil) pt2=(nil) pt3=(nil)

But the above also shows that M000 itself has not been in a wait for the last 45 minutes. What is M000 doing, and why hasn't it released that mutex all this time?
Something is not right.

After reviewing the process state of M000, I noticed the process is already dead:
SO: 0x19f2de3f8, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x19f2de3f8, name=process, file=ksu.h LINE:12451, pg=0
  (process) Oracle pid:42, ser:16, calls cur/top: 0x1908c1eb8/0x1906a28f8
            flags : (0x3) DEAD
            flags2: (0x8030),  flags3: (0x0) 
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
    Cleanup details:
      Found dead = 44 min 27 sec ago
      Total Cleanup attempts = 10, Cleanup time = 15 min 49 sec, Cleanup timer = 15 min 45 sec
      Last attempt (full) ended 54 sec ago, Length = 2 min 30 sec, Cleanup timer = 2 min 30 sec

OK, now it is clear that M000's death led to the DB hang.
But why did M000 suddenly die?
Checking the alert.log, I find the following:
Thu Dec 06 22:00:48 2012
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1060] [PC:0x90ED147, kglic0()+1075] [flags: 0x0, count: 1]
Errors in file /x484/dump01/oracle/hrxt/diag/rdbms/hrxdt/hrxdt_m000_10618.trc  (incident=6337):
ORA-07445: exception encountered: core dump [kglic0()+1075] [SIGSEGV] [ADDR:0x1060] [PC:0x90ED147] [Address not mapped to object] []

Now everything is clear.
After searching My Oracle Support for the above symptoms, our scenario exactly matches this bug:
MMON Slave Process Reports ORA-7445 [kglic0] Error, Plus Database Hangs [ID 1487720.1]
More......

December 2, 2012

solving ORA-1658 via errorstack

Sometimes an issue itself is not complicated, but it is hard to solve for other reasons, such as a lack of necessary information.

For example, one day a user came to us and complained that their application job got the error below:
F|Nov 29 20:47:34|UNXSERVER|E|27451|GXXS01|conv|ABORTED-SQLCODE -1658 AT 1223 ErrText = |
F|Nov 29 20:47:34|UNXSERVER|E|27451|GXXS01|conv|ENDING DUE TO SERIOUS SQL ERROR|
F|Nov 29 20:47:34|UNXSERVER|E|27462|GXXS02|conv|ABORTED-SQLCODE -1658 AT 1223 ErrText = |
F|Nov 29 20:47:34|UNXSERVER|E|27462|GXXS02|conv|ENDING DUE TO SERIOUS SQL ERROR|


All they could offer us was the error message above and a DB name.
So how should a DBA look into it?

The error above is not a standard DB error message, which would begin with "ORA-".
But from "SQLCODE -1658" we know it is a DB error; it is not in the standard format because their application catches the error and rethrows it with its own exception wording.

Let's see what error 1658 is:
server003[oracle]_db058> oerr ora 1658
01658, 00000, "unable to create INITIAL extent for segment in tablespace %s"
// *Cause: Failed to find sufficient contiguous space to allocate INITIAL
// extent for segment being created.
// *Action: Use ALTER TABLESPACE ADD DATAFILE to add additional space to the
// tablespace or retry with a smaller value for INITIAL


OK, it seems to be a simple tablespace free-space issue.
But after checking the DB, we found there was no error message in the alert.log.
The database has 30+ similar, very tiny tablespaces.
So which tablespace caused the issue?

Tablespace                    Free MB   Total MB        PCT
------------------------- ----------- ---------- ----------
ICDM_20140801                       5         10         50
ICDM_20130901                       5         10         50
ICDM_20131201                       5         10         50
ICDM_20130501                       5         10         50
ICDM_20130701                       5         10         50
ICDM_20140101                       5         10         50
ICDM_20130801                       5         10         50
ICDM_20131001                       5         10         50
ICDM_20130301                       5         10         50
ICDM_20131101                       5         10         50
ICDM_20140201                       5         10         50
ICDM_20140501                       5         10         50
ICDM_20140601                       5         10         50
ICDM_20130401                       5         10         50
ICDM_20140301                       5         10         50
ICDM_20130601                       5         10         50
ICDM_20140401                       5         10         50
ICDM_20140701                       5         10         50
.............
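The listing above came from an ordinary free-space query; a sketch of that kind of query (not necessarily the exact script used, column formatting omitted):

select ts.tablespace_name,
       round(nvl(fs.free_mb, 0))                     as free_mb,
       round(df.total_mb)                            as total_mb,
       round(nvl(fs.free_mb, 0) / df.total_mb * 100) as pct
  from dba_tablespaces ts
  join (select tablespace_name, sum(bytes)/1024/1024 total_mb
          from dba_data_files group by tablespace_name) df
    on df.tablespace_name = ts.tablespace_name
  left join (select tablespace_name, sum(bytes)/1024/1024 free_mb
               from dba_free_space group by tablespace_name) fs
    on fs.tablespace_name = ts.tablespace_name
 order by pct;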

The user still could not tell us anything more: they got the error when running a job, but they did not know the current SQL, which table hit the error, or which tablespace it was in.

Then I enabled an errorstack:
alter system set events '1658 trace name errorstack level 1';
With the above command, an errorstack trace is generated whenever error ORA-1658 is raised, database-wide.

Then I asked the user to re-run their job.
Of course it failed again with ORA-1658, but with the help of the errorstack, this time we got the details in the alert.log:
Errors in file /server/backup/oracle/dump/getsstm2/diag/rdbms/db058/db058/trace/db058_ora_13006.trc:
ORA-01658: unable to create INITIAL extent for segment in tablespace FILE_20130101


OK, now we have the tablespace name: FILE_20130101.
After adding 4GB to FILE_20130101, the issue was solved.
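The follow-up was along these lines (the datafile path below is hypothetical):

-- Add space to the tablespace reported by the errorstack:
ALTER TABLESPACE FILE_20130101 ADD DATAFILE '/u01/oradata/db058/file_20130101_02.dbf' SIZE 4G;

-- And switch the diagnostic event back off once it is no longer needed:
alter system set events '1658 trace name errorstack off';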

More......

November 27, 2012