Skip to content

Partitioning by expression

Dmitry Ivanov edited this page Jun 6, 2017 · 3 revisions

As of release 1.4, pg_pathman supports partitioning by expression (both RANGE and HASH partitioning schemes).

Prerequisites

/* we're going to store something like { "key": 100, "value": "abcdef...." } */
create table test(col jsonb not null);

/* create 10 partitions, each will contain 10.000 unique keys */
select create_range_partitions('test', '(col->>''key'')::int8', 1, 10000, 10);

/* insert some data */
insert into test
select format('{"key": %s, "date": "%s", "value": "%s"}',
              i, current_date, md5(i::text))::jsonb
from generate_series(1, 10000 * 10) as g(i);

Expression's internals

Now that we've created a table, let's look at the contents of pathman_config:

select partrel, expr, parttype, range_interval from pathman_config;
 partrel |              expr               | parttype | range_interval
---------+---------------------------------+----------+----------------
 test    | ((col ->> 'key'::text))::bigint |        2 | 10000
(1 row)

As you can see, the expr column contains a textual representation of our partitioning expression. The expression's AST is also stored in pathman_config, but it's not supposed to be human-readable:

select cooked_expr from pathman_config;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cooked_expr | {COERCEVIAIO :arg {OPEXPR :opno 3477 :opfuncid 3214 :opresulttype 25 :opretset false :opcollid 100 :inputcollid 100 :args ({VAR :varno 1 :varattno 1 :vartype 3802 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 9} {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 15 :constvalue 7 [ 28 0 0 0 107 101 121 ]}) :location 12} :resulttype 20 :resultcollid 0 :coerceformat 1 :location 21}

Partitions

Now let's take a glance at partitions:

table pathman_partition_list ;
 parent | partition | parttype |              expr               | range_min | range_max
--------+-----------+----------+---------------------------------+-----------+-----------
 test   | test_1    |        2 | ((col ->> 'key'::text))::bigint | 1         | 10001
 test   | test_2    |        2 | ((col ->> 'key'::text))::bigint | 10001     | 20001
 test   | test_3    |        2 | ((col ->> 'key'::text))::bigint | 20001     | 30001
 test   | test_4    |        2 | ((col ->> 'key'::text))::bigint | 30001     | 40001
 test   | test_5    |        2 | ((col ->> 'key'::text))::bigint | 40001     | 50001
 test   | test_6    |        2 | ((col ->> 'key'::text))::bigint | 50001     | 60001
 test   | test_7    |        2 | ((col ->> 'key'::text))::bigint | 60001     | 70001
 test   | test_8    |        2 | ((col ->> 'key'::text))::bigint | 70001     | 80001
 test   | test_9    |        2 | ((col ->> 'key'::text))::bigint | 80001     | 90001
 test   | test_10   |        2 | ((col ->> 'key'::text))::bigint | 90001     | 100001
(10 rows)

Each partition has a tricky check constraint that stores partition's configuration (bounds / hash + number):

select conname, pg_get_constraintdef(oid) condef from pg_constraint where conrelid = 'test_1'::regclass;
-[ RECORD 1 ]---------------------------------------------------------------------------------------------------------------
conname | pathman_test_1_check
condef  | CHECK (((((col ->> 'key'::text))::bigint >= '1'::bigint) AND (((col ->> 'key'::text))::bigint < '10001'::bigint)))

Sample queries

It's relatively easy to figure out whether pg_pathman is going to optimize a query... provided that expression consists of a single column:

create table abc(val int not null);
select create_hash_partitions('abc', 'val', 5); /* partition by 'val' */

explain select * from abc where val = 1;
                         QUERY PLAN
-------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4)
   ->  Seq Scan on abc_0  (cost=0.00..41.88 rows=13 width=4)
         Filter: (val = 1)
(3 rows)

Things get much more complicated once you use an intricate composite expression (like the one we've introduced in previous sections). pg_pathman won't be able to optimize a query unless we reference the exact same expression we used to create partitions:

explain select * from test where ((col ->> 'key'::text))::int = 4;
                           QUERY PLAN
-----------------------------------------------------------------
 Append  (cost=0.00..6620.00 rows=500 width=32)
   ->  Seq Scan on test_1  (cost=0.00..662.00 rows=50 width=91)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_2  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_3  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_4  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_5  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_6  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_7  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_8  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_9  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_10  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
(21 rows)

Something is wrong, isn't int? Looks like we used a wrong expression:

select expr from pathman_config where partrel = 'test'::regclass;
              expr
---------------------------------
 ((col ->> 'key'::text))::bigint   /* we used ((col ->> 'key'::text))::integer */
(1 row)

Everything works just as expected once we change the expression:

explain select * from test where ((col ->> 'key'::text))::bigint = 4;
                           QUERY PLAN
----------------------------------------------------------------
 Append  (cost=0.00..662.00 rows=50 width=32)
   ->  Seq Scan on test_1  (cost=0.00..662.00 rows=50 width=91)
         Filter: (((col ->> 'key'::text))::bigint = 4)
(3 rows)

Limitations

There are several limitations:

  • Expression must reference at least one column of partitioned table;
  • All referenced columns must be marked NOT NULL;
  • Expression may not reference system attributes (oid, xmin, xmax etc);
  • Expression may not include sub-queries;
  • All functions used by expression must be marked IMMUTABLE;
Clone this wiki locally
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy