Home » RDBMS Server » Performance Tuning » Help-Histogram
Help-Histogram [message #260405] Sun, 19 August 2007 10:31 Go to next message
orasaket
Messages: 70
Registered: November 2006
Member
Hi,

I am using Oracle 9i R2 on Linux

I have table column with following uneven data distribution

SELECT COUNT(1),OBJECT_ID FROM TB GROUP BY OBJECT_ID

COUNT(1)	OBJECT_ID
19000		30
31838		20
999		40
10000		10




While i am using following query
SELECT * FROM TB WHERE OBJECT_ID=40;

Index tbx is getting used correctly

And in following cases FTS is being performed as expected

SELECT * FROM TB WHERE OBJECT_ID=30;
SELECT * FROM TB WHERE OBJECT_ID=20;
SELECT * FROM TB WHERE OBJECT_ID=10;

I have read we compute histograms while data is skewed.

In above case also, data is skewed, however Index is being used correctly without Histogram.

My question is what would be the situation or data ditstribution for above table, in which we would compute histogram?

Also, why it is recommended not to compute hitogram in case we are using bind variables in the query?

Finally, how we estimate the bucket size for histograms? is it equal to number of distinct values in the table?

Thanks and Regards,
OraSaket

following is procedure with which i created this data

create table tb as select * from all_objects

UPDATE TB SET OBJECT_ID=20

UPDATE TB SET OBJECT_ID=10 WHERE ROWNUM<30000

UPDATE TB SET OBJECT_ID=30 WHERE OBJECT_ID=10  AND ROWNUM<20000

UPDATE TB SET OBJECT_ID=40 WHERE OBJECT_ID=30  AND ROWNUM<1000

COMMIT;

create index tbx on tb(object_id)

exec dbms_stats.gather_table_stats('HSASYS','TB',CASCADE=>TRUE);
Re: Help-Histogram [message #260439 is a reply to message #260405] Sun, 19 August 2007 22:08 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I don't know why Oracle would get the plan right without histograms - I would have thought it would use the same plan for each value. If you were using 10g, I would say that it is probably using dynamic sampling.

If you are getting the right plan, why are you tuning?

Histograms are not useful in 9i when using bind variables because it cannot lookup a bind variable (at parse time) in the histogram - it needs a value. This is partially mitigated in 10g with bind-variable peeking at parse time, but only for the first parse. I read that the situation is improved in 11g, where a new plan can be determined based on the peeked values of bind variables.

Check the doco to see how to determine the optimal bucket size. I always used an advanced heuristic method (some might call it "trial and error") - use more buckets until it gets it right. 10g automatically does this for you.

Ross Leishman
Re: Help-Histogram [message #261258 is a reply to message #260405] Wed, 22 August 2007 04:13 Go to previous messageGo to next message
orasaket
Messages: 70
Registered: November 2006
Member
Hi Ross,

Thanks for your reply

I have retested it on 2 different environments (9i)

create table o as select * from all_objects;
insert into o select * from o;
update o set object_id=5;
update o set object_id=1 where rownum<1001;
update o set object_id=2 where rownum<10001 and object_id<>1;
update o set object_id=3 where rownum<30001 and object_id not in(1,2);
update o set object_id=4 where rownum<5001 and object_id not in(1,2,3);
update o set object_id=6 where rownum<51 and object_id not in(1,2,3,4);

select count(*),object_id from o group by object_id
COUNT(*)	OBJECT_ID
1000	       1
10000	       2
30000	       3
5000	       4
478862	       5
50	       6

create index ox on o(object_id)


After that i tested it on two different environments

One with
Optimizer_Index_Caching = 0
Optimizer_Index_Cost_Adj = 80

and another with
Optimizer_Index_Caching = 0
Optimizer_Index_Cost_Adj = 30

For the first environment i tested it twice
1) After gathering stats without histograms
2) gathering stats with histograms.

Following are the results

First Environment 
********************************************************************************************************************
Optimizer_Index_Caching = 0
Optimizer_Index_Cost_Adj = 80


SQL*Plus: Release 9.0.1.4.0 - Production on Tue Aug 21 19:13:44 2007

DB1>exec dbms_stats.gather_table_stats('HASYS','O',CASCADE=>TRUE);

PL/SQL procedure successfully completed.

DB1>set autotrace traceonly explain
DB1>select * from o where object_id=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




DB1>select * from o where object_id=2;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




DB1>select * from o where object_id=3;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




DB1>select * from o where object_id=4;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




DB1>select * from o where object_id=5;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




DB1>select * from o where object_id=6;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1101 Card=87485 Byte
          s=7961135)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=1101 Card=87485
           Bytes=7961135)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=179 Card=8
          7485)




After Computing Histogram, explicitly
********************************************************************************************************************
DB1>exec dbms_stats.gather_table_stats('HASYS','O',CASCADE=>TRUE, method_opt=>'for columns object_id size 7');

PL/SQL procedure successfully completed.

DB1>select * from o where object_id=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=283 Card=22280 Bytes
          =2027480)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=283 Card=22280
          Bytes=2027480)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=48 Card=22
          280)




DB1>select * from o where object_id=2;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=283 Card=22280 Bytes
          =2027480)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=283 Card=22280
          Bytes=2027480)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=48 Card=22
          280)




DB1>select * from o where object_id=3;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=283 Card=22280 Bytes
          =2027480)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=283 Card=22280
          Bytes=2027480)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=48 Card=22
          280)




DB1>select * from o where object_id=4;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=283 Card=22280 Bytes
          =2027480)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=283 Card=22280
          Bytes=2027480)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=48 Card=22
          280)




DB1>select * from o where object_id=5;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1296 Card=449925 Byt
          es=40943175)

   1    0   TABLE ACCESS (FULL) OF 'O' (Cost=1296 Card=449925 Bytes=40
          943175)




DB1>select * from o where object_id=6;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=283 Card=22280 Bytes
          =2027480)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=283 Card=22280
          Bytes=2027480)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=48 Card=22
          280)




DB1>

****Now another Environment 
Optimizer_Index_Caching = 0
Optimizer_Index_Cost_Adj = 30

SQL> exec dbms_stats.gather_table_stats('HASYS','O',CASCADE=>TRUE);

PL/SQL procedure successfully completed.

SQL> select * from o where object_id=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)




SQL> select * from o where object_id=2;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)




SQL> select * from o where object_id=3;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)




SQL> select * from o where object_id=4;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)


SQL> select * from o where object_id=5;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)


SQL> select * from o where object_id=6;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=402 Card=87483 Bytes
          =7960953)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O' (Cost=402 Card=87483
          Bytes=7960953)

   2    1     INDEX (RANGE SCAN) OF 'OX' (NON-UNIQUE) (Cost=173 Card=8
          7483)



Can we conclude the histogram requirement from the above?

Thanks and Regards,
OraSaket
Re: Help-Histogram [message #261634 is a reply to message #261258] Thu, 23 August 2007 03:14 Go to previous message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Correct me if I'm wrong, but in your test case:
- Optimizer_Index_Cost_Adj did not affect the result
- The only time Oracle altered the plan was when a histogram existed and a skewed value was used

Sounds to me like you proved conclusively that wet dogs smell and that night follows day.

Isn't this what we would expect to see? It doesn't shed any light on your earlier assertion that Oracle making correct index selection on skewed data WITHOUT a histogram (which I still cannot explain).

You have skewed data, you create a histogram, plans improve. Yes, I would say a histogram is indicated here.

Ross Leishman
Previous Topic: How to Ping (keep) table in db_keep_cache
Next Topic: Too much IO
Goto Forum:
  


Current Time: Sat Jun 29 08:17:31 CDT 2024