Working with moving data sets(eg for moving averages) efficiently in PostgreSQL - Pt2
So how did I take a 40 second operation - that was still in the simple stage - and make it efficient? Before I start, I should give you some info that you may have guessed from the last post.Stocks table definition:
CREATE TABLE stocks(The stock_index, stock and day_id columns are indexed. The stock_index column is what I used in my 40 second query I showed you last time. (withouth this it was REALLY slow! I had to do a seperate query using LIMIT to get the day_id values required to look up the required rows)
stock varchar(10) NOT NULL,
day_id int4 NOT NULL,
open numeric,
high numeric,
low numeric,
closing numeric,
volume int4,
stock_index int4,
CONSTRAINT pk PRIMARY KEY (stock, day_id)
) WITHOUT OIDS;
A picture of what I mean when I say "moving data set":
|------------15 row set (row 15)-----------|As you can see more easily from the above diagram, the moving set of 10 is part of the moving set of 15. (as indeed is any data set that involves the last 15 rows or less). So how to "roll through" the rows to create a moving data set just the once - instead of linking in another setof rows every time a new period is needed for another moving average?
| |------------15 row set (row 16)-----------|
| | |------------15 row set (row 17)-----------|
| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|
|----10 row set (row 10)----| | | | | | | |
|----10 row set (row 11)----| | | | | | |
|----10 row set (row 12)----| | | | | |
|----10 row set (row 13)----| | | | |
|----10 row set (row 14)----| | | |
|----10 row set (row 15)----| | |
|----10 row set (row 16)----| |
|----10 row set (row 17)----|
Instead of linking 15 rows to every row, then linking 10 rows to every row, how about a function that returns the last n (15 above) rows in a way that can be sliced up into smaller chunks (like the last 10 rows). A function that simply adds the current row values to the beginning of the set and drops the last value off the end? The latter would mean the data itself is only looked up (stocks table row-count) once and not -
stocks table row-count * set1 row-count * set2 row-count * setn row-count... ( = a lot of work for the db! )- 15 times and then 10 times for every row of raw data in the stocks table.
This post is getting long again, so I'll tell you exactly how tomorrow - with code.
No comments:
No trackbacks:
Trackback link: