Blog Archive

I have consolidated, and will do so in future, all my Experiences for Oracle Community. All these are my own opinions and does not necessarily reflect the views of Oracle. There are many databases with different permutation and combination which I have not seen. Hence, with these limitations, my views might be wrong, before implementing do test these. Please feel free to comment on my views, be it a Criticism, as this will only improve my writing abilities.

Saturday, 25 October 2008

Optimizer Behaviour in 10g R2 - Importance of Fresh Optimizer Statistics

On 19th October I wrote about a discussion with a customer on the importance on Statistics after Merging of Data. The scenario that I mentioned was that the customer planned to merge data from database C to H and to differentiate this data, some of the critical (queried) columns of C will be incremented by 900000000. Since, many of these columns are Unique / Primary keys, there ar possibilities that a column can have a value 1111 on both the databases. Therefore, on C Database, before the merger, this value will be changed to 900001111. The max value for this column in H database would be less than 200000000.


With this change, it was suggested to test the application post-merger and with new set of statistics. An example below demonstrates the change in optimizer behaviour since 10.1.0.3.


We all know that the predicted number of rows, also called as CARDINALITY, plays a crucial role in optimizer's calculation of a Join Order or an Access Path. Any error in the calculation of Cardinality can have an impact on the overall execution plan of a query.


The steps performed are : (to make the numbers and calculation small, I will use a table with less number of rows and with smaller values).


1. I first create a table that depicts H Database.
2. Execute a query. We will check for the Cardinality.
3. Then we merge the data into H from C, with incremented values.
4. Without gathering statistics, we shall execute the same query (as in step 2).
5. Without gathering statistics, we shall execute the same query but with a newer value.


After Step 4 & 5, we can conclude that statistics are very much important, especially, when the values in the frequently queried columns go out of range as it can have an impact in cardinality calculation.


.
.
Step 1 - Create Table and Gather Stats (H Database)
.
create table vivek_header as
with main as
(select * from all_objects),
main1 as
(select rownum ag_id, main.object_name, main.created, main.temporary
from main, main main1
where rownum<=1000000)
select * from main1
order by dbms_random.random;
.
Table created
.
.
Elapsed: 00:00:02.76
.
.
.
SQL> alter table vivek_header add constraint pk_ag_id primary key(ag_id);
.
Table altered.
.
Elapsed: 00:00:02.67
.
select count(*) from vivek_header;
.
COUNT(*)
----------
1000000
.
Elapsed: 00:00:00.09
.
.
select min(ag_id), max(ag_id) from vivek_header;

MIN(AG_ID) MAX(AG_ID)
---------- ----------
1 1000000
.
Elapsed: 00:00:00.12
.
exec dbms_stats.gather_table_stats(user,'VIVEK_HEADER',cascade=>true);
.
select column_name, vivek_number(low_value) "Low", vivek_number(high_value) "High"
from user_tab_columns
where table_name='VIVEK_HEADER'
and column_name='AG_ID';
.
.
COLUMN_NAME Low High
------------------------------ ---------- ----------
AG_ID 78 999594
.
Elapsed: 00:00:00.09
.
Step 2- Execute a Query that is in the Range. Cardinality calculation is more or less accurate.
.
select * from vivek_header
where ag_id>=900000;
.
100001 rows selected.
.
Elapsed: 00:00:00.79
.
Execution Plan
----------------------------------------------------------
Plan hash value: 1918031072
.
----------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
----------------------------------------------------------------------------------
0 SELECT STATEMENT 99240 2229K 895 (5) 00:00:11
* 1 TABLE ACCESS FULL VIVEK_HEADER 99240 2229K 895 (5) 00:00:11
----------------------------------------------------------------------------------
.
Step 3- Merging of C Data into H with New (Incremented) Values

insert into vivek_header
with main as
(select * from all_objects),
main1 as
(select 4000000+rownum ag_id, main.object_name, main.created, main.temporary
from main, main main1
where rownum<=50000)
select * from main1
order by dbms_random.random;
.
SQL> select min(ag_id), max(ag_id) from vivek_header;
.
MIN(AG_ID) MAX(AG_ID)
---------- ----------
1 4050000
.
Step 4 - Execute the query to get data for ag_id>=900000. The Cardinality Calculation is Wrong
.
select * from vivek_header
where ag_id>=900000;
.
150001 rows selected.
.
Elapsed: 00:00:01.07
.
Execution Plan
----------------------------------------------------------
Plan hash value: 1918031072

----------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
----------------------------------------------------------------------------------
0 SELECT STATEMENT 99240 2229K 895 (5) 00:00:11
* 1 TABLE ACCESS FULL VIVEK_HEADER 99240 2229K 895 (5) 00:00:11
----------------------------------------------------------------------------------
.
.
Step 5 - Execute the same Query for a new value ag_id. Cardinality Calculation is Surprisingly very low and this have an impact.
A point to note in the plan is that assuming a single row will be fetched, optimizer uses an Index Scan (PK)
.
.
select * from vivek_header
where ag_id>=4000000;
.
SQL> select * from vivek_header
2 where ag_id>=4000000;

50000 rows selected.
.
Elapsed: 00:00:00.50
.
Execution Plan
----------------------------------------------------------
Plan hash value: 3247409102
.
--------------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
--------------------------------------------------------------------------------------------
0 SELECT STATEMENT 1 23 4 (0) 00:00:01
1 TABLE ACCESS BY INDEX ROWID VIVEK_HEADER 1 23 4 (0) 00:00:01
* 2 INDEX RANGE SCAN PK_AG_ID 1 3 (0) 00:00:01
--------------------------------------------------------------------------------------------
.
.


To sum up, optimizer statistics plays crucial role in cardinality calculation which is important to get thru a optimum execution plan, this includes, join order and proper choice of Indexes. Therefore, regular statistics are required on the objects with columns moving out of range, particularly, timebased, sequencebased columns. This change in 10g can surprise many dba's when they upgrade from 9i and 10g.

1 comments:

Sridhar Pandurangiah said...

Vivek, Nice to see my project on your blog. Haven't met you during my visits to H in Mumbai! Will link to this post from my blog.

Best regards

Sridhar

About Me

My Photo
Vivek Sharma
Mumbai, Maharashtra, India
I am an Oracle Professional, mainly into Application and Database Tuning. I have been associated with Oracle technologies for more than 8 years and, in these 8 years, have seen Oracle Database growing at a fast pace. This is one of the primary reason to be associated with this technology. I initiated my association with Oracle as a FORMS Developer and then transitioned to a DBA. Development experience helped me a lot in performing my DBA responsibilities, especially, when it comes to Database and Application tuning. I am a technical speaker with a passion to share and learn from Oracle Community across the world.
View my complete profile

Number of Hits on this Page (Counter Initiated since 4th Nov 2007)

Visit online slots or online blackjack at casino games site.

Visitor Locations - Across Globe

Some of my Favourite Books

  • Oracle 10g Real Application Clusters Handbook by K. Gopalakrishnan
  • Oracle Wait Interface by K. Gopalakrishnan
  • Effective Oracle by Design, Author : Thomas Kyte