Tuning SQL with dynamic sampling
Another example of what I think of as "the self-tuning database". Setting optimizer_dynamic_sampling=11 can fix many performance problems, without the DBA needing to use his brain at all.
Anyone who tunes SQL should know that developing a good plan is all about the estimated cardinalities. If you don't believe me, read this book review: It's All About The Cardinalities. Anyone who has spent time with me knows that I believe many fixable bad-plan problems can be fixed with extended statistics; I blogged about that here: Extended stats.
If you do not have the time to create extended stats, there is an alternative: optimizer dynamic sampling. I have long believed that all databases should have this set to 4. Level 4 means that the CBO will always sample if a query has a complex predicate: a predicate involving two or more columns of a table, or columns enclosed in functions. I have seen this deliver a spectacular performance boost, and have never seen it cause a problem. However, level 4 is not good enough for all situations. Release 12.x introduces a new level, level 11, and this has been back-ported into release 11.2.0.4.
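For anyone who wants to try this, checking and changing the level is straightforward; a minimal sketch (the parameter can be set at either system or session level, and the default is 2):

```sql
-- Check the current setting (the shipped default is 2).
show parameter optimizer_dynamic_sampling

-- Enable level 4 for the whole instance...
alter system set optimizer_dynamic_sampling=4;

-- ...or just for the current session, which is safer for testing.
alter session set optimizer_dynamic_sampling=4;
```

Setting it at session level first lets you compare plans before and after without affecting anyone else.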
From the docs:
11 Use dynamic statistics automatically when the optimizer deems it necessary. The resulting statistics are persistent in the statistics repository, making them available to other queries.
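The "statistics repository" the docs mention is related to the SQL plan directive infrastructure introduced in 12c. One way to get a feel for what the optimizer has decided to persist (this is my assumption about where to look, not something the docs quote above spells out) is to query the directive views:

```sql
-- Inspect what the optimizer has recorded; at level 11 (AUTO) the results
-- of dynamic sampling can be persisted and reused by other queries.
select directive_id, type, state, reason
from   dba_sql_plan_directives
order  by created desc;
```

If you run this before and after executing a query under level 11, you may see new rows appear.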
That sounds pretty cool: just let Uncle Oracle get on with it. But does it work? Well, here is one situation (taken from a topic on OTN earlier today) that level 4 could not handle, but level 11 does. First, create a table and gather stats:
orclz> create table t1 as select * from all_objects;

Table created.

orclz> exec dbms_stats.gather_table_stats(user,'t1',-
>  estimate_percent=>dbms_stats.auto_sample_size,method_opt=>'for all columns size skewonly')

PL/SQL procedure successfully completed.

orclz> select column_name,num_buckets,histogram from user_tab_col_statistics where table_name='T1';

COLUMN_NAME                    NUM_BUCKETS HISTOGRAM
------------------------------ ----------- ---------------
ORACLE_MAINTAINED                        2 FREQUENCY
EDITIONABLE                              2 FREQUENCY
SHARING                                  3 FREQUENCY
EDITION_NAME                             0 NONE
NAMESPACE                               23 FREQUENCY
SECONDARY                                2 FREQUENCY
GENERATED                                2 FREQUENCY
TEMPORARY                                2 FREQUENCY
STATUS                                   1 FREQUENCY
TIMESTAMP                              254 HYBRID
LAST_DDL_TIME                          254 HYBRID
CREATED                                254 HYBRID
OBJECT_TYPE                             45 FREQUENCY
DATA_OBJECT_ID                         254 HYBRID
OBJECT_ID                                1 NONE
SUBOBJECT_NAME                         254 HYBRID
OBJECT_NAME                            254 HYBRID
OWNER                                   32 FREQUENCY

18 rows selected.

orclz>
Note that I am using auto sample size, which is a requirement for getting the new, and much more accurate, HYBRID histograms. They do not help, though, when I run a query that includes an analytic function (this is the example from the OTN topic). First, with level 4 sampling; the estimate is ludicrous:
orclz> set autot on exp
orclz> alter session set optimizer_dynamic_sampling=4;

Session altered.

orclz> SELECT object_type
  2  ,      object_name
  3  ,      object_rank
  4  FROM   (
  5          SELECT object_type
  6          ,      object_name
  7          ,      ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  8          FROM   t1
  9         )
 10  WHERE  object_name = 'DUAL';

OBJECT_TYPE     OBJECT_NAME                    OBJECT_RANK
--------------- ------------------------------ -----------
SYNONYM         DUAL                                     1
TABLE           DUAL                                    62

Execution Plan
----------------------------------------------------------
Plan hash value: 2273146475

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 90329 |  8115K|       |  1402   (1)| 00:00:01 |
|*  1 |  VIEW               |      | 90329 |  8115K|       |  1402   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |      | 90329 |  3704K|  4616K|  1402   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   | 90329 |  3704K|       |   424   (1)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME"='DUAL')

orclz>
And now try with automatic dynamic sampling:
orclz> alter session set optimizer_dynamic_sampling=11;

Session altered.

orclz> SELECT object_type
  2  ,      object_name
  3  ,      object_rank
  4  FROM   (
  5          SELECT object_type
  6          ,      object_name
  7          ,      ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  8          FROM   t1
  9         )
 10  WHERE  object_name = 'DUAL';

OBJECT_TYPE     OBJECT_NAME                    OBJECT_RANK
--------------- ------------------------------ -----------
SYNONYM         DUAL                                     1
TABLE           DUAL                                    62

Execution Plan
----------------------------------------------------------
Plan hash value: 2273146475

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     2 |   184 |       |  1402   (1)| 00:00:01 |
|*  1 |  VIEW               |      | 90329 |  8115K|       |  1402   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |      | 90329 |  3704K|  4616K|  1402   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   | 90329 |  3704K|       |   424   (1)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME"='DUAL')

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

orclz>
Not bad: in fact, perfect. Clearly, level 11 is doing a much better job than level 4 in this specially devised case, and you will find it works well elsewhere, too. It must now be best practice to use level 11 in all databases.
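If you agree and want to make it the default, a minimal sketch (assuming release 12.x or 11.2.0.4, and that the instance was started with an spfile so the change can persist):

```sql
-- Make automatic dynamic statistics the instance-wide default,
-- both for the running instance and across restarts.
alter system set optimizer_dynamic_sampling=11 scope=both;
```

As always, it is prudent to test this on a non-production system first; session-level testing, as shown above, is the low-risk way in.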
--
John Watson
Oracle Certified Master DBA
Skillbuilders Inc
ps - setting something to 11 is reminiscent of that awful movie, This Is Spinal Tap. Why is it awful? Because it is so true. I won't tell you how I know this.
Comments
You stole my joke
As soon as I saw you suggest setting it to 11, I knew there needed to be a Spinal Tap reference. Sadly for me you got there first, but it is yet another demonstration of the truism that your articles leave no important stone unturned.
Nice demo (again), although I needed to look back and forth between the plans to understand your point. I was expecting the plans (and therefore performance) to be different and initially thought you'd made a mistake when they came out the same. Your point - of course - is that the ESTIMATE is different. If this were embedded in a larger SQL with joins and/or sub-queries then it COULD actually result in a bad plan, which in turn could result in poorer performance.
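To illustrate the commenter's point, here is a hypothetical wrapper around the demo query (this join is my own invention, not from the original example). With the level-4 estimate of 90,329 rows from the inline view, the CBO would likely choose a hash join with a full scan of the joined table; with the correct estimate of 2 rows, a nested loop into an index becomes attractive:

```sql
-- Hypothetical: join the demo inline view back to t1. The join method the
-- CBO picks depends heavily on the cardinality estimate for view v:
-- 90K rows favours a hash join and full scan; 2 rows favours nested loops.
SELECT v.object_type, v.object_rank, t2.owner
FROM   (SELECT object_type
        ,      object_name
        ,      ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
        FROM   t1) v
JOIN   t1 t2 ON t2.object_name = v.object_name
WHERE  v.object_name = 'DUAL';
```

The two plans in the post are identical only because the statement is so simple that there is no alternative; wrap it in a join, and the 45,000-fold cardinality error has real consequences.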