Last Comments

snorkel (Data Warehouse / …): Oracle seems like a huge …
frak (Article Bot is fi…): Well thanks for being the…
random (Article Bot is fi…): Curious Statement from th…
frak (Article Bot Spyin…): For the answer see this u…
Bob (Pivot 1.30 Alpha.…): And this is what a commen…

Archives

01 Nov - 30 Nov 2007
01 Jun - 30 Jun 2007
01 Dec - 31 Dec 2006
01 Nov - 30 Nov 2006
01 Jul - 31 Jul 2006
01 Jun - 30 Jun 2006
01 Apr - 30 Apr 2006
01 Mar - 31 Mar 2006
01 Feb - 28 Feb 2006
01 Jan - 31 Jan 2006
01 Dec - 31 Dec 2005
01 Oct - 31 Oct 2005
01 Sep - 30 Sep 2005
01 Aug - 31 Aug 2005
01 Feb - 28 Feb 2005
01 Jan - 31 Jan 2005

Miscellany

Powered by Pivot - 1.40.5: 'Dreadwind' 
XML: RSS Feed 
XML: Atom Feed 

« Nerds FC | Home | Working with moving d… »

Working with moving data sets(eg for moving averages) efficiently in PostgreSQL - Pt1

I have been recently working on stock market technical analysis using PostgreSQL.   One of the hassles is that you want to create several moving sets of data - for moving averages, standard deviations (for creating bollinger bands) and the like.   It is common to have a short term moving average and a long term moving average to see where they cross.  Using pure SQL then you will need to link in an instance of the stock data to 10 rows of stock data from the same table, and then again for the 21 rows of data (hence creating a 10 day moving average with bollinger bands, and a 21 day moving average)

The following is an example with those two moving data sets - 21 and 10.   Even though indexes are being used, it takes 40 seconds to run!

SELECT s.stock, s.day_id, s.closing, avg(s10.closing), avg(s10.closing) + 2*stddev(s10.closing) AS bollinger_high,
    avg(s10.closing) - 2*stddev(s10.closing) AS bollinger_low,
    avg(s21.closing) AS ma_21
FROM stocks AS s
    INNER JOIN stocks s10
        ON s10.stock = s.stock
        AND s10.stock_index <= s.stock_index
        AND s10.stock_index > s.stock_index - 10
    INNER JOIN stocks s21
        ON s21.stock = s.stock
        AND s21.stock_index <= s.stock_index
        AND s21.stock_index > s.stock_index - 21
GROUP BY s.stock, s.day_id, s.closing
Remember that this is the beginning.  Adding an extra period would exponentially slow things down even further.  ( a single moving average/dataset was extremely quick.    This is the query plan:
GroupAggregate  (cost=76987947.62..80822192.41 rows=2316 width=44) (actual time=31699.194..38996.966 rows=2316 loops=1)
  ->  Sort  (cost=76987947.62..77371364.00 rows=153366549 width=44) (actual time=31672.742..32523.684 rows=484095 loops=1)
        Sort Key: s.stock, s.day_id, s.closing
        ->  Nested Loop  (cost=0.00..18700393.29 rows=153366549 width=44) (actual time=82.308..16989.105 rows=484095 loops=1)
              Join Filter: (("inner".stock)::text = ("outer".stock)::text)
              ->  Nested Loop  (cost=0.00..194391.82 rows=595984 width=44) (actual time=82.277..13726.831 rows=48426 loops=1)
                    Join Filter: ((("outer".stock)::text = ("inner".stock)::text) AND ("outer".stock_index > ("inner".stock_index - 21)))
                    ->  Index Scan using pk on stocks s21  (cost=0.00..439.98 rows=2316 width=22) (actual time=38.341..2949.632 rows=2316 loops=1)
                    ->  Index Scan using idx_stock_index on stocks s  (cost=0.00..68.30 rows=772 width=26) (actual time=0.034..2.765 rows=1159 loops=2316)
                          Index Cond: ("outer".stock_index <= s.stock_index)
              ->  Index Scan using idx_stock_index on stocks s10  (cost=0.00..25.91 rows=257 width=22) (actual time=0.007..0.033 rows=10 loops=48426)
                    Index Cond: ((s10.stock_index <= "outer".stock_index) AND (s10.stock_index > ("outer".stock_index - 10)))
Total runtime: 39006.705 ms


So then the answer came to me:  Create a PL function that will return a single moving set of data for each row in a way that will not blow the query out of the water in terms of required index scans, joins and sorts.   Why lookup the same set of data every time a new period is added?

Does not a 21 day moving average include the figures required to do a 10 day moving average?

For the details as to how, you will need to read the next installment. ;-)

http://frakkle.com



No comments:


No trackbacks:

Trackback link:

Please enable javascript to generate a trackback url


  
Remember personal info?

/ Textile

this is to stop spam bots causing me pain.
 

  (Register your username / Log in)

Notify:
Hide email:

Small print: All html tags except <b> and <i> will be removed from your comment. You can make links by just typing the url or mail-address.