Filter by date range based on PK id.

Understanding how data gets into a table means you have opportunities to optimize the way a query is executed beyond what is available to the optimizer.
In this scenario, I have an audit table where the primary key is a sequence, and represents the only index on a table with years of data.

The table also has a date column that represents when a row was inserted.
I receive a request for an extract of changes for a short timeframe ( say, last Tuesday)

If I know for sure that the date column and pk are both ordered the same (more or less), I can use the PK to filter by date allowing a small index range scan to bring back the relevant rows instead of a FTS.


Prerequisites:

  •       I know the sequence is inserted in time order (more or less).
  •       There aren’t many gaps in the table.
Filtering on un-indexed date column will result in a large FTS.

select * from owner.audittab
where aud_date between '22-AUG-2017 09:00:00' and  '22-AUG-2017 13:00:00';

----------------------------------------------------------------------         
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)|         
----------------------------------------------------------------------         
|   0 | SELECT STATEMENT  |             |  3939 |   423K|   184K  (1)|          
|   1 |  TABLE ACCESS FULL| AUDITTAB    |  3939 |   423K|   184K  (1)|         
----------------------------------------------------------------------     

Get the high value for the PK:

select max(id)  maxd  from owner.audittab;

    MAX_ID
----------
47,634,622

Get a distributed list of mapped id to date values (5000) values.

select level l, round ((max_id/num_groups)*(level-1),0) id
from (select max(aud_id) max_id, 1000 num_groups from  owner.audittab) a
connect by level <= num_groups ;

        L         ID
---------- ----------
         1          1
         2      51402
         3     102803
         4     154204
         5     205606
         6     257007
       ...        ...
       ...        ...
       996   51144147
       997   51195548
       998   51246950
       999   51298351
      1000   51349752

Narrow down the last ID before your date range, and the first one after it.

select
max( case when auddate < '22-AUG-2017 09:00:00' then id end  ) start_id ,
min( case when auddate > '22-AUG-2017 13:00:00' then id end  ) end_id
from (
  select a.id,  aud_date auddate
  from(
    select level l , round ((max_id/num_groups)*(level-1),0)+1 id
    from ( select max(aud_id) max_id , 1000 num_groups from owner.audittab     ) a
    connect by level <= num_groups
  ) a ,  owner.audittab    b where a.id  = b.aud_id
);

----------------------------------------------------------------------------------------
| Id  | Operation                        | Name           | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                |     1 |    27 |     5   (0)|
|   1 |  SORT AGGREGATE                  |                |     1 |    27 |            |
|   2 |   NESTED LOOPS                   |                |     1 |    27 |     5   (0)|
|   3 |    NESTED LOOPS                  |                |     1 |    27 |     5   (0)|
|   4 |     VIEW                         |                |     1 |    13 |     3   (0)|
|   5 |      CONNECT BY WITHOUT FILTERING|                |       |       |            |
|   6 |       VIEW                       |                |     1 |    16 |     3   (0)|
|   7 |        SORT AGGREGATE            |                |     1 |     6 |            |
|   8 |         INDEX FULL SCAN (MIN/MAX)| PK_AUDITTAB    |     1 |     6 |     3   (0)|
|   9 |     INDEX UNIQUE SCAN            | PK_AUDITTAB    |     1 |       |     1   (0)|
|  10 |    TABLE ACCESS BY INDEX ROWID   | SECURITYLOG    |     1 |    14 |     2   (0)|
----------------------------------------------------------------------------------------

    START_ID       END_ID
------------ ------------
  51,092,941   51,195,744

If end_id comes back null ,  use the max_id  from above (47,634,622)
Use the selected ID range to filter before refining by the requested date range.

select * from  owner.audittab    
where aud_id between  51092941 and  51195744
and aud_date between '22-AUG-2017 09:00:00' and  '22-AUG-2017 13:00:00';

-----------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |     9 |   990 |  2346   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID| AUDITTAB       |     9 |   990 |  2346   (1)|
|   2 |   INDEX RANGE SCAN          | PK_AUDITTAB    |   102K|       |   397   (1)|
-----------------------------------------------------------------------------------

This method can be used in applications that take a date range as parameters – to allow searching without adding an extra index. 

Comments