Thursday, June 07, 2012

Refreshing multiple summary tables in one DML

In data warehouses summary (or aggregation) tables sometimes are used to speed up some expensive aggregations.

For example there are several summary tables that are computed from one fact table across different dimensions. After the fact table is added by a new portion of data all these summary tables have to be refreshed. Usually it is done one-by-one summary table, using multiple INSERT...SELECT statements. It is OK if the fact table is not big. If the table is huge, it can be too expensive to go through it again and again for each summary table.
In Oracle you can do it in one step!

There are:
- INSERT ALL statement that allows to insert into multiple tables. These tables can be different.
- Couple of multi-aggregation GROUP BY options such as ROLLUP,  GROUPING SET, CUBE to aggregate across multiple or all dimension combinations.

This can be used to refresh all summary tables at once.

Example:

1. Create "fact" table

create table dba_segm as
select owner, tablespace_name, segment_type, segment_name, bytes
from dba_segments;

2. Create empty summary tables for all dimension combinations (though you do not have to have all combinations, you can have just some of them, the method is still applicable. You can also use GROUP BY GROUPING SET in such cases):

create table dba_segm_sum_by_o as
select owner, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_s as
select tablespace_name, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_t as
select segment_type, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_ot as
select owner,segment_type, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_os as
select owner,tablespace_name, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_st as
select tablespace_name,segment_type, bytes from dba_segm where 1=0;

create table dba_segm_sum_by_ost as
select owner,tablespace_name,segment_type, bytes from dba_segm where 1=0;




3. Finally populate all summary tables at once:

INSERT ALL
  when (owner is not null and tablespace_name is not null and segment_type is not null)
   then into dba_segm_sum_by_ost(owner,tablespace_name, segment_type, bytes)
  when (owner is not null and tablespace_name is not null and segment_type is null)
   then into dba_segm_sum_by_os(owner,tablespace_name,bytes) values (owner,tablespace_name,bytes)
  when (owner is not null and tablespace_name is null and segment_type is not null)
   then into dba_segm_sum_by_ot(owner,segment_type, bytes) values (owner,segment_type, bytes)
  when (owner is null and tablespace_name is not null and segment_type is not null)
   then into dba_segm_sum_by_st(tablespace_name, segment_type, bytes) values (tablespace_name, segment_type, bytes)
  when (owner is not null and tablespace_name is null and segment_type is null)
   then into dba_segm_sum_by_o(owner,bytes) values (owner,bytes)
  when (owner is null and tablespace_name is not null and segment_type is null)
   then into dba_segm_sum_by_s(tablespace_name, bytes) values (tablespace_name, bytes)
  when (owner is null and tablespace_name is null and segment_type is not null)
   then into dba_segm_sum_by_t(segment_type, bytes) values (segment_type, bytes)
select owner, tablespace_name, segment_type, sum(bytes) bytes
from dba_segm
GROUP BY CUBE (owner,tablespace_name, segment_type);

Done!

If you look into execution plan of this statement, you will find that table DBA_SEGM has been accessed once.


0 Comments:

Post a Comment

<< Home