Calender
Sun Mon Tue Wed Thu Fri Sat
 123456
78910111213
14151617181920
21222324252627
28293031   
<< July 2019 >>
広告
SEARCH

SELECTED ENTRIES
RECENT COMMENTS
RECENT TRACKBACK
CATEGORIES
ARCHIVES
LINKS
PROFILE
OTHERS
SKYPE
PC: skype.jojo.jp
chat
iPad: iphone.jojo.jp
chat call
THANKS



本日:
昨日:
多言語
広告
 ▼▲ 作業日報 ▼△
    What's under the hood?
<< 朝、五明朝あたりで事故 | main | 温度センサーと3軸加速度センサー >>
CE(Constraint Exclusion)利用時のJOINは注意
 PostgreSQLにてCEを有効にし、パーティショニイングしたテーブルにWHERE区で条件を与えると高速になりますが、アプリケーション上からJOINがあるSQLを利用していると逆に性能が劣化する場合がありました。

以下が検証例です:
 2つのパターンで検索をかけてみます。
 1.2つのテーブルt_data(元データ数件の日付とtextを持ったテーブル)とdays(2日分の日付だけを入れたテーブル)を用意しt_dataを結合からレコードを抽出する。
 2.t_dataというテーブルの日付でWHERE条件で'2008-12-1'と'2009-1-1'のレコードを持ってくる

 基本的には同一条件ですが、explainで確認すると、1の場合は分割テーブルは利用されません。

 ・14.2. プランナで使用される統計情報
・WHEREで抽出
"Result  (cost=0.00..73.50 rows=36 width=40) (actual time=0.077..0.116 rows=2 loops=1)"
"  ->  Append  (cost=0.00..73.50 rows=36 width=40) (actual time=0.073..0.108 rows=2 loops=1)"
"        ->  Seq Scan on t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.044..0.044 rows=0 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_08 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.026..0.027 rows=1 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_0904 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.028..0.029 rows=1 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"Total runtime: 0.181 ms"

・JOINで抽出
"Merge Join  (cost=723.60..1851.38 rows=74472 width=40) (actual time=0.102..0.119 rows=2 loops=1)"
"  Merge Cond: (days.dt = public.t_data.dt)"
"  ->  Sort  (cost=149.78..155.13 rows=2140 width=8) (actual time=0.020..0.022 rows=2 loops=1)"
"        Sort Key: days.dt"
"        Sort Method:  quicksort  Memory: 25kB"
"        ->  Seq Scan on days  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.006..0.009 rows=2 loops=1)"
"  ->  Sort  (cost=573.82..591.22 rows=6960 width=40) (actual time=0.074..0.079 rows=6 loops=1)"
"        Sort Key: public.t_data.dt"
"        Sort Method:  quicksort  Memory: 25kB"
"        ->  Append  (cost=0.00..129.60 rows=6960 width=40) (actual time=0.006..0.040 rows=6 loops=1)"
"              ->  Seq Scan on t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.004..0.005 rows=1 loops=1)"
"              ->  Seq Scan on t_data_08 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0901 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.003..0.004 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0902 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0903 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.003..0.004 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0904 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"Total runtime: 0.177 ms"

・CEなしの場合(参考)
explain ANALYZE select * from t_data 
 where dt in ('2009-4-1','2008-12-1');
"Result  (cost=0.00..147.00 rows=72 width=40) (actual time=0.021..0.046 rows=2 loops=1)"
"  ->  Append  (cost=0.00..147.00 rows=72 width=40) (actual time=0.018..0.039 rows=2 loops=1)"
"        ->  Seq Scan on t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.012..0.012 rows=0 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_08 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.003..0.004 rows=1 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_0901 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.003..0.003 rows=0 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_0902 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.003..0.003 rows=0 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_0903 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.003..0.003 rows=0 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"        ->  Seq Scan on t_data_0904 t_data  (cost=0.00..24.50 rows=12 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              Filter: (dt = ANY ('{"2009-04-01 00:00:00","2008-12-01 00:00:00"}'::timestamp without time zone[]))"
"Total runtime: 0.127 ms"

・CEなし(JOINにより結合)
 "Merge Join  (cost=723.60..1851.38 rows=74472 width=40) (actual time=0.099..0.116 rows=2 loops=1)"
"  Merge Cond: (days.dt = public.t_data.dt)"
"  ->  Sort  (cost=149.78..155.13 rows=2140 width=8) (actual time=0.025..0.027 rows=2 loops=1)"
"        Sort Key: days.dt"
"        Sort Method:  quicksort  Memory: 25kB"
"        ->  Seq Scan on days  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.006..0.008 rows=2 loops=1)"
"  ->  Sort  (cost=573.82..591.22 rows=6960 width=40) (actual time=0.065..0.070 rows=6 loops=1)"
"        Sort Key: public.t_data.dt"
"        Sort Method:  quicksort  Memory: 25kB"
"        ->  Append  (cost=0.00..129.60 rows=6960 width=40) (actual time=0.006..0.040 rows=6 loops=1)"
"              ->  Seq Scan on t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.004..0.006 rows=1 loops=1)"
"              ->  Seq Scan on t_data_08 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0901 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0902 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0903 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"              ->  Seq Scan on t_data_0904 t_data  (cost=0.00..21.60 rows=1160 width=40) (actual time=0.002..0.003 rows=1 loops=1)"
"Total runtime: 0.175 ms"
 JOIN先のテーブルでデータがどのようにはいっているかは知るよしがない?ので当たり前といえば当たり前の動きかもしれませんが、肥大化したテーブルに対して下手に結合を利用してしまうと問題が起きそうです。一旦WHERE句でCE利用させるための一時VIEWなどを作成しそれをJOINする等、アプリケーション側の方で対応しなければならなそうです。。

| サーバー関連 | 21:05 | comments(0) | trackbacks(0) |









http://blog.jojo.jp/trackback/1248783