DBA Tips Archive for Oracle

  


Sample Table Scans

by Jeff Hunter, Sr. Database Administrator

Introduced in Oracle8i, Sample table scans allow the user to perform random and limited scans of a table to determine sampled information, rather than performing a full table scan. This can be extremely useful with ad hoc SQL statements and Data warehousing/Data mining environments where full table scans can be time consuming, and an accurate sample data value can be obtained from a subset.

To implement this, simply put the SAMPLE function in the FROM clause.

For example:

  SELECT region AVG(amount)
  FROM sales SAMPLE(5)
  GROUP BY region;
This will perform a random sample of five percent of the rows in the SALES table and return the average of the AMOUNTs found.

A variation on this is to specify that rather than sampling rows, we randomly sample whole blocks at a time.

  SELECT region AVG(amount)
  FROM sales SAMPLE(5) BLOCKS
  GROUP BY region;
This will sample five percent of all (formatted) blocks in the table, and use all of the rows found in these blocks for the sample.

Care must be taken however on what information is inferred from the results of a sample scan since the value derived may not scale to the full table. Take for example the following sample table query:

  SELECT COUNT(DISTINCT( sal ) 
  FROM emp SAMPLE(1)
This will sample only one percent of the rows in the emp table to get the number of distinct salaries. If there were a very small number of different salaries in the emp table, a one percent sample might get them all the values, and we would then wrongly infer there were 100 times that number, where in fact we had retrieved all the values.

It should be noted that any query involving SAMPLE will ignore any indexes on the table.

When diagnosing sample table scans in TKPROF trace output, notice that it will show as:

  TABLE ACCESS (SAMPLE) OF 'BIGEMP' (Cost=425 Card=567) 
while in raw trace files, it will show up as:
  STAT #2 id=2 cnt=0 pid=1 pos=1 obj=2880 op='TABLE ACCESS SAMPLE BIGEMP ' 

The following restrictions apply to sample table scans:

  1. Sample table scans are restrictied to the Cost Based Optimizer only.
  2. Sample table scans can only be performed on local tables.
  3. This form of access is limited to single table queries - joins cannot use the SAMPLE syntax and will return a parse error.


Copyright (c) 1998-2017 Jeffrey M. Hunter. All rights reserved.

All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express, prior, written permission. Application to host any of the material elsewhere can be made by contacting me at jhunter@idevelopment.info.

I have made every effort and taken great care in making sure that the material included on my web site is technically accurate, but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.

Last modified on
Friday, 20-Jan-2006 16:50:27 EST
Page Count: 13147