Description
I apologize in advance for this messy bug report.
Executing the following SQL query, an instance of a TPC-DS template, occasionally causes a segmentation fault in PG 10.3 when pg_query_state is called. Here is the query:
select *
from
(select count(*) h8_30_to_9
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 8
and time_dim.t_minute >= 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s1,
(select count(*) h9_to_9_30
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 9
and time_dim.t_minute < 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s2,
(select count(*) h9_30_to_10
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 9
and time_dim.t_minute >= 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s3,
(select count(*) h10_to_10_30
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 10
and time_dim.t_minute < 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s4,
(select count(*) h10_30_to_11
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 10
and time_dim.t_minute >= 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s5,
(select count(*) h11_to_11_30
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 11
and time_dim.t_minute < 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s6,
(select count(*) h11_30_to_12
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 11
and time_dim.t_minute >= 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s7,
(select count(*) h12_to_12_30
from store_sales, household_demographics , time_dim, store
where ss_sold_time_sk = time_dim.t_time_sk
and ss_hdemo_sk = household_demographics.hd_demo_sk
and ss_store_sk = s_store_sk
and time_dim.t_hour = 12
and time_dim.t_minute < 30
and ((household_demographics.hd_dep_count = 0 and household_demographics.hd_vehicle_count<=0+2) or
(household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or
(household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2))
and store.s_store_name = 'ese') s8
;
I apologize for the length... I am not sure which elements of the query cause the failure.
Here are the steps to reproduce:
- Download PostgreSQL 10.3
- Download the PG10 branch of pg_query_state
- Patch and build:
cd postgresql-10.3
patch -p1 < /home/postgres/pg_query_state/custom_signals.patch
patch -p1 < /home/postgres/pg_query_state/runtime_explain.patch
./configure --prefix=/home/postgres/local/
make -j 2
make install
export PATH=$PATH:/home/postgres/local/bin
cd /home/postgres/pg_query_state
make install USE_PGXS=1
- Enable the extension:
# wherever your postgresql.conf is...
echo "shared_preload_libraries = 'pg_query_state'">> /media/data/pg/postgresql.conf
- Start the DB and load TPC-DS data (scale factor of 1GB will work)
- Start a psql session and get its backend PID. Use this session to execute the above query over and over again
- In another psql session, run
select * from pg_query_state($PID);
while the query is running. After 1-5 tries, Postgres segfaults.
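The last step can be scripted so that nobody has to retype the call by hand. This is only a rough sketch: the function name poll_query_state, the PSQL variable, and the default attempt count are made up for illustration, and it assumes that a server crash shows up as a non-zero psql exit status (connection options are omitted). The backend PID of the first session can be read with select pg_backend_pid();.

```shell
#!/bin/sh
# Sketch: call pg_query_state repeatedly against one backend and stop at the
# first psql failure. PSQL, poll_query_state, and the attempt count are
# illustrative assumptions, not part of pg_query_state itself.
PSQL="${PSQL:-psql}"

poll_query_state() {
    pid="$1"            # backend PID of the session running the long query
    attempts="${2:-5}"  # how many times to call pg_query_state
    i=1
    while [ "$i" -le "$attempts" ]; do
        # A non-zero exit status is treated as "the server went away".
        if ! $PSQL -c "select * from pg_query_state($pid);" >/dev/null 2>&1; then
            echo "psql failed on attempt $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure after $attempts attempts"
    return 0
}
```

For example, poll_query_state 12345 10 would poll backend 12345 up to ten times while the query is running in the other session.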
Is there a way I can build PG to provide more useful info about what is going on? Are there log files I can provide?