Last Comments

snorkel (Data Warehouse / …): Oracle seems like a huge …
frak (Article Bot is fi…): Well thanks for being the…
random (Article Bot is fi…): Curious Statement from th…
frak (Article Bot Spyin…): For the answer see this u…
Bob (Pivot 1.30 Alpha.…): And this is what a commen…

Archives

01 Nov - 30 Nov 2007
01 Jun - 30 Jun 2007
01 Dec - 31 Dec 2006
01 Nov - 30 Nov 2006
01 Jul - 31 Jul 2006
01 Jun - 30 Jun 2006
01 Apr - 30 Apr 2006
01 Mar - 31 Mar 2006
01 Feb - 28 Feb 2006
01 Jan - 31 Jan 2006
01 Dec - 31 Dec 2005
01 Oct - 31 Oct 2005
01 Sep - 30 Sep 2005
01 Aug - 31 Aug 2005
01 Feb - 28 Feb 2005
01 Jan - 31 Jan 2005

Miscellany

Powered by Pivot - 1.40.5: 'Dreadwind' 
XML: RSS Feed 
XML: Atom Feed 

« Working with moving d… | Home | Working with moving d… »

Working with moving data sets(eg for moving averages) efficiently in PostgreSQL - Pt2

So how did I take a 40 second operation - that was still in the simple stage - and make it efficient?  Before I start, I should give you some info that you may have guessed from the last post.

Stocks table definition:

CREATE TABLE stocks(
  stock varchar(10) NOT NULL,
  day_id int4 NOT NULL,
  open numeric,
  high numeric,
  low numeric,
  closing numeric,
  volume int4,
  stock_index int4,
  CONSTRAINT pk PRIMARY KEY (stock, day_id)
) WITHOUT OIDS;

The stock_index, stock and day_id columns are indexed.  The stock_index column is what I used in my 40 second query I showed you last time.  (withouth this it was REALLY slow!  I had to do a seperate query using LIMIT to get the day_id values required to look up the required rows)

A picture of what I mean when I say "moving data set":

 |------------15 row set (row 15)-----------|
| |------------15 row set (row 16)-----------|
| | |------------15 row set (row 17)-----------|
| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|
|----10 row set (row 10)----| | | | | | | |
|----10 row set (row 11)----| | | | | | |
|----10 row set (row 12)----| | | | | |
|----10 row set (row 13)----| | | | |
|----10 row set (row 14)----| | | |
|----10 row set (row 15)----| | |
|----10 row set (row 16)----| |
|----10 row set (row 17)----|

As you can see more easily from the above diagram, the moving set of 10 is part of the moving set of 15.  (as indeed is any data set that involves the last 15 rows or less).  So how to "roll through" the rows to create a moving data set just the once - instead of linking in another setof rows every time a new period is needed for another moving average?

Instead of linking 15 rows to every row, then linking 10 rows to every row, how about a function that returns the last n (15 above) rows in a way that can be sliced up into smaller chunks (like the last 10 rows).  A function that simply adds the current row values to the beginning of the set and drops the last value off the end?  The latter would mean the data itself is only looked up (stocks table row-count) once and not -

stocks table row-count  *  set1 row-count  *  set2 row-count  * setn row-count...  ( =   a lot of work for the db! )
 - 15 times and then 10 times for every row of raw data in the stocks table.

This post is getting long again, so I'll tell you exactly how tomorrow - with code.

http://frakkle.com



No comments:


No trackbacks:

Trackback link:

Please enable javascript to generate a trackback url


  
Remember personal info?

/ Textile

this is to stop spam bots causing me pain.
 

  (Register your username / Log in)

Notify:
Hide email:

Small print: All html tags except <b> and <i> will be removed from your comment. You can make links by just typing the url or mail-address.