Basic index compression - identify candidates and estimate prefix length
Goal:
Identify candidates for index compression quickly, using statistics, before running
ANALYZE with VALIDATE STRUCTURE.
This code contains heaps of assumptions and estimates, and has plenty of room for improvement.
The estimates use table, column, and index statistics (distinct_keys, leaf_blocks, num_rows, avg_col_len).
The last column gives the estimated savings in MB; a negative value means the index is expected to grow in size at the new prefix length.
Remember:
- The data stored in the index is not actually compressed; instead, duplicate prefix values are stored only once per leaf block.
- A smaller index is good for memory as well as disk.
- If the leaf blocks contain free space beyond pctfree, the index will shrink even further than estimated when it is rebuilt with compression. Validate structure will report this as well.
- Read Richard Foote's blog for a more detailed explanation of how (and when) index compression works.
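Once the query below has narrowed the list, a single candidate can be confirmed the authoritative way. A sketch, with a placeholder owner and index name:

```sql
-- Confirm a shortlisted candidate (replace scott.my_index with your own).
-- Note: VALIDATE STRUCTURE locks the underlying table while it runs,
-- so avoid this on a busy production system.
analyze index scott.my_index validate structure;

-- INDEX_STATS holds one row, for the most recently validated index.
-- OPT_CMPR_COUNT is Oracle's recommended prefix length and
-- OPT_CMPR_PCTSAVE the expected percentage saving at that length.
select name, height, lf_blks, opt_cmpr_count, opt_cmpr_pctsave
from   index_stats;
```

Comparing opt_cmpr_count and opt_cmpr_pctsave against the estimates below is a good sanity check of the approach.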
select a.owner, index_name, table_name, leaf_blocks, cols,
num_cols index_cols, distinct_keys , idx_rows,
prefix_length new_prefix_length,
prefix_distinct_keys estd_prefix_distinct_keys ,
round((
/* space freed per row */
( prefix_byte_length * idx_rows )
/* cost of prefix entries in leaf blocks */
- ( (prefix_byte_length + 5) * prefix_records ) /* 5ish extra bytes for each prefix entry */
)/1024/1024 ,2) estd_savings_MB
from (
select a.*, round( idx_rows/prefix_distinct_keys ,2) prefix_rows_per_key , avg_leaf_blocks_per_key*prefix_distinct_keys prefix_records
from(
select /*+ ordered */ a.owner, a.index_name, a.table_name , a.leaf_blocks,
a.distinct_keys , a.avg_leaf_blocks_per_key, a.num_rows idx_rows
,pl prefix_length, count(*) num_cols
,listagg( b.column_name,', ') within group ( order by b.column_position ) cols
,listagg( num_distinct,', ') within group ( order by b.column_position ) num_distinct
,case
when pl = bb.num_rows then max( a.distinct_keys )
when pl = 1 then max( num_distinct )
else least( max( num_distinct ), round(exp(sum(ln( decode( num_distinct,0,1,num_distinct) ))),0)*0.5 )
end prefix_distinct_keys
,sum( avg_col_len ) prefix_byte_length
from dba_indexes a , dba_ind_columns b , dba_tab_cols c ,
( select level pl from dual connect by rownum <= 5) d ,
( select count(*) num_rows , bb.index_name, bb.index_owner from dba_ind_columns bb group by bb.index_name, bb.index_owner ) bb
where a.owner = '&SCHEMA.' -- schema to analyze
and a.prefix_length is null --- not already compressed
and a.leaf_blocks > 2 -- big enough to benefit
and index_type = 'NORMAL'
and bb.index_name = a.index_name and a.owner = bb.index_owner
and a.index_name = b.index_name and a.owner = b.index_owner
and c.table_name = a.table_name and a.table_owner = c.owner
and b.column_name = c.column_name
and pl >= b.column_position
and pl <= bb.num_rows -- ( select count(*) from dba_ind_columns bb where bb.index_name = a.index_name and a.owner = bb.index_owner )
group by a.owner, a.index_name, a.table_name , a.leaf_blocks,
a.distinct_keys , a.avg_leaf_blocks_per_key, a.num_rows ,pl, bb.num_rows
) a ) a
order by estd_savings_MB desc ;
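When a candidate and prefix length look worthwhile, the index is rebuilt with prefix compression. A sketch, again with a placeholder index name and prefix length:

```sql
-- Rebuild with a 2-column compressed prefix (placeholder name and length).
-- ONLINE keeps the table available during the rebuild (Enterprise Edition).
alter index scott.my_index rebuild compress 2 online;

-- To undo compression later:
-- alter index scott.my_index rebuild nocompress;
```

After the rebuild, dba_indexes.prefix_length is populated, so the index drops out of this script on the next run.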