Commit db1fdc9

Fix failure to detoast fields in composite elements of structured types.
If we have an array of records stored on disk, the individual record fields must not contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites.

For example, given a command such as

    UPDATE tab SET arraycol = ARRAY[ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the lines of "missing chunk number 0 for toast value ..." when the value is referenced.

A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not confined to arrays and ranges, but could also affect other places where detoasting is done in that way, for example index_form_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all.
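The failure mode described above can be pictured with a toy model. The following is a minimal, self-contained C sketch and not PostgreSQL code: `ToyToastTable`, `ToyField`, `toy_store_external`, `toy_fetch`, `toy_copy_as_is`, and `toy_delete_source` are hypothetical names invented here, and the "TOAST table" is just a small array of slots. It assumes only what the message states: a field may be a pointer to out-of-line storage, and copying that pointer as-is into a container leaves it dangling once the source row is deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 4
#define TOY_LEN   32

/* Hypothetical stand-in for out-of-line storage: slot i is valid while live[i]. */
typedef struct ToyToastTable {
    bool live[TOY_SLOTS];
    char data[TOY_SLOTS][TOY_LEN];
} ToyToastTable;

/* A field is either stored inline or refers to a slot in the toy table. */
typedef struct ToyField {
    bool external;
    int  slot;
    char inline_data[TOY_LEN];
} ToyField;

/* Store a value out of line; returns the slot used, or -1 if the table is full. */
static int toy_store_external(ToyToastTable *t, const char *value)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (!t->live[i]) {
            strncpy(t->data[i], value, TOY_LEN - 1);
            t->data[i][TOY_LEN - 1] = '\0';
            t->live[i] = true;
            return i;
        }
    }
    return -1;
}

/* Resolve a field to its bytes; NULL models a "missing chunk" error. */
static const char *toy_fetch(const ToyToastTable *t, const ToyField *f)
{
    if (!f->external)
        return f->inline_data;
    assert(f->slot >= 0 && f->slot < TOY_SLOTS);
    return t->live[f->slot] ? t->data[f->slot] : NULL;
}

/* The problematic behaviour: the pointer is copied into the container unchanged. */
static ToyField toy_copy_as_is(const ToyField *src)
{
    return *src;
}

/* Deleting the source row releases its out-of-line value. */
static void toy_delete_source(ToyToastTable *t, int slot)
{
    assert(slot >= 0 && slot < TOY_SLOTS);
    t->live[slot] = false;
}
```

In this model, an element produced by `toy_copy_as_is` can be fetched while the source slot is live, and `toy_fetch` returns NULL for the same element after `toy_delete_source`, which is the toy analogue of the dangling-pointer error.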
This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using.

In the future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions.

This bug has existed since TOAST was invented, so back-patch to all supported branches.
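The rule adopted by the patch (inline out-of-line fields when a composite Datum is constructed, with a cheap fast path when there is nothing to inline) can be sketched in a toy model. This is a self-contained illustration under stated assumptions, not the patch's code: `ToyToastTable`, `ToyField`, `ToyTuple`, `toy_has_external`, and `toy_copy_tuple_as_datum` are hypothetical names, loosely mirroring the roles that HeapTupleHasExternal() and heap_copy_tuple_as_datum() play in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_SLOTS   4
#define TOY_LEN     32
#define TOY_NFIELDS 2

/* Hypothetical stand-in for out-of-line storage. */
typedef struct ToyToastTable {
    bool live[TOY_SLOTS];
    char data[TOY_SLOTS][TOY_LEN];
} ToyToastTable;

/* A field is either stored inline or refers to a slot in the toy table. */
typedef struct ToyField {
    bool external;
    int  slot;
    char inline_data[TOY_LEN];
} ToyField;

typedef struct ToyTuple {
    ToyField fields[TOY_NFIELDS];
} ToyTuple;

/* Cheap check that lets callers skip the flattening work entirely. */
static bool toy_has_external(const ToyTuple *tup)
{
    for (int i = 0; i < TOY_NFIELDS; i++)
        if (tup->fields[i].external)
            return true;
    return false;
}

/*
 * Copy src into dst as a self-contained "datum": every external field is
 * fetched and stored inline. Returns the number of fields that were inlined.
 */
static int toy_copy_tuple_as_datum(const ToyToastTable *t,
                                   const ToyTuple *src, ToyTuple *dst)
{
    int inlined = 0;

    *dst = *src;                /* plain copy; sufficient on the fast path */
    if (!toy_has_external(src))
        return 0;

    for (int i = 0; i < TOY_NFIELDS; i++) {
        ToyField *f = &dst->fields[i];

        if (!f->external)
            continue;
        /* The source must still be readable at construction time. */
        assert(f->slot >= 0 && f->slot < TOY_SLOTS && t->live[f->slot]);
        strncpy(f->inline_data, t->data[f->slot], TOY_LEN - 1);
        f->inline_data[TOY_LEN - 1] = '\0';
        f->external = false;
        f->slot = -1;
        inlined++;
    }
    return inlined;
}
```

Because the copy is made self-contained at construction time, later invalidating the source slot does not affect the datum. The cost trade-off noted in the message also shows up here: the external value is fetched even if the caller never reads that field.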
1 parent 3897ee9 commit db1fdc9

15 files changed: +235 -150 lines changed

src/backend/access/common/heaptuple.c

Lines changed: 43 additions & 37 deletions
@@ -618,6 +618,41 @@ heap_copytuple_with_tuple(HeapTuple src, HeapTuple dest)
 	memcpy((char *) dest->t_data, (char *) src->t_data, src->t_len);
 }

+/* ----------------
+ *		heap_copy_tuple_as_datum
+ *
+ *		copy a tuple as a composite-type Datum
+ * ----------------
+ */
+Datum
+heap_copy_tuple_as_datum(HeapTuple tuple, TupleDesc tupleDesc)
+{
+	HeapTupleHeader td;
+
+	/*
+	 * If the tuple contains any external TOAST pointers, we have to inline
+	 * those fields to meet the conventions for composite-type Datums.
+	 */
+	if (HeapTupleHasExternal(tuple))
+		return toast_flatten_tuple_to_datum(tuple->t_data,
+											tuple->t_len,
+											tupleDesc);
+
+	/*
+	 * Fast path for easy case: just make a palloc'd copy and insert the
+	 * correct composite-Datum header fields (since those may not be set if
+	 * the given tuple came from disk, rather than from heap_form_tuple).
+	 */
+	td = (HeapTupleHeader) palloc(tuple->t_len);
+	memcpy((char *) td, (char *) tuple->t_data, tuple->t_len);
+
+	HeapTupleHeaderSetDatumLength(td, tuple->t_len);
+	HeapTupleHeaderSetTypeId(td, tupleDesc->tdtypeid);
+	HeapTupleHeaderSetTypMod(td, tupleDesc->tdtypmod);
+
+	return PointerGetDatum(td);
+}
+
 /*
  * heap_form_tuple
  *		construct a tuple from the given values[] and isnull[] arrays,
@@ -636,7 +671,6 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 				data_len;
 	int			hoff;
 	bool		hasnull = false;
-	Form_pg_attribute *att = tupleDescriptor->attrs;
 	int			numberOfAttributes = tupleDescriptor->natts;
 	int			i;

@@ -647,28 +681,14 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 						numberOfAttributes, MaxTupleAttributeNumber)));

 	/*
-	 * Check for nulls and embedded tuples; expand any toasted attributes in
-	 * embedded tuples. This preserves the invariant that toasting can only
-	 * go one level deep.
-	 *
-	 * We can skip calling toast_flatten_tuple_attribute() if the attribute
-	 * couldn't possibly be of composite type. All composite datums are
-	 * varlena and have alignment 'd'; furthermore they aren't arrays. Also,
-	 * if an attribute is already toasted, it must have been sent to disk
-	 * already and so cannot contain toasted attributes.
+	 * Check for nulls
 	 */
 	for (i = 0; i < numberOfAttributes; i++)
 	{
 		if (isnull[i])
-			hasnull = true;
-		else if (att[i]->attlen == -1 &&
-				 att[i]->attalign == 'd' &&
-				 att[i]->attndims == 0 &&
-				 !VARATT_IS_EXTENDED(DatumGetPointer(values[i])))
 		{
-			values[i] = toast_flatten_tuple_attribute(values[i],
-													  att[i]->atttypid,
-													  att[i]->atttypmod);
+			hasnull = true;
+			break;
 		}
 	}

@@ -698,7 +718,8 @@ heap_form_tuple(TupleDesc tupleDescriptor,

 	/*
 	 * And fill in the information. Note we fill the Datum fields even though
-	 * this tuple may never become a Datum.
+	 * this tuple may never become a Datum. This lets HeapTupleHeaderGetDatum
+	 * identify the tuple type if needed.
 	 */
 	tuple->t_len = len;
 	ItemPointerSetInvalid(&(tuple->t_self));
@@ -1388,7 +1409,6 @@ heap_form_minimal_tuple(TupleDesc tupleDescriptor,
 				data_len;
 	int			hoff;
 	bool		hasnull = false;
-	Form_pg_attribute *att = tupleDescriptor->attrs;
 	int			numberOfAttributes = tupleDescriptor->natts;
 	int			i;

@@ -1399,28 +1419,14 @@ heap_form_minimal_tuple(TupleDesc tupleDescriptor,
 						numberOfAttributes, MaxTupleAttributeNumber)));

 	/*
-	 * Check for nulls and embedded tuples; expand any toasted attributes in
-	 * embedded tuples. This preserves the invariant that toasting can only
-	 * go one level deep.
-	 *
-	 * We can skip calling toast_flatten_tuple_attribute() if the attribute
-	 * couldn't possibly be of composite type. All composite datums are
-	 * varlena and have alignment 'd'; furthermore they aren't arrays. Also,
-	 * if an attribute is already toasted, it must have been sent to disk
-	 * already and so cannot contain toasted attributes.
+	 * Check for nulls
 	 */
 	for (i = 0; i < numberOfAttributes; i++)
 	{
 		if (isnull[i])
-			hasnull = true;
-		else if (att[i]->attlen == -1 &&
-				 att[i]->attalign == 'd' &&
-				 att[i]->attndims == 0 &&
-				 !VARATT_IS_EXTENDED(values[i]))
 		{
-			values[i] = toast_flatten_tuple_attribute(values[i],
-													  att[i]->atttypid,
-													  att[i]->atttypmod);
+			hasnull = true;
+			break;
 		}
 	}

src/backend/access/common/indextuple.c

Lines changed: 5 additions & 0 deletions
@@ -158,6 +158,11 @@ index_form_tuple(TupleDesc tupleDescriptor,
 	if (tupmask & HEAP_HASVARWIDTH)
 		infomask |= INDEX_VAR_MASK;

+	/* Also assert we got rid of external attributes */
+#ifdef TOAST_INDEX_HACK
+	Assert((tupmask & HEAP_HASEXTERNAL) == 0);
+#endif
+
 	/*
 	 * Here we make sure that the size will fit in the field reserved for it
 	 * in t_info.

src/backend/access/heap/tuptoaster.c

Lines changed: 44 additions & 48 deletions
@@ -944,6 +944,9 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
  *
  *	"Flatten" a tuple to contain no out-of-line toasted fields.
  *	(This does not eliminate compressed or short-header datums.)
+ *
+ *	Note: we expect the caller already checked HeapTupleHasExternal(tup),
+ *	so there is no need for a short-circuit path.
  * ----------
  */
 HeapTuple
@@ -1021,59 +1024,61 @@ toast_flatten_tuple(HeapTuple tup, TupleDesc tupleDesc)


 /* ----------
- * toast_flatten_tuple_attribute -
+ * toast_flatten_tuple_to_datum -
+ *
+ *	"Flatten" a tuple containing out-of-line toasted fields into a Datum.
+ *	The result is always palloc'd in the current memory context.
+ *
+ *	We have a general rule that Datums of container types (rows, arrays,
+ *	ranges, etc) must not contain any external TOAST pointers. Without
+ *	this rule, we'd have to look inside each Datum when preparing a tuple
+ *	for storage, which would be expensive and would fail to extend cleanly
+ *	to new sorts of container types.
+ *
+ *	However, we don't want to say that tuples represented as HeapTuples
+ *	can't contain toasted fields, so instead this routine should be called
+ *	when such a HeapTuple is being converted into a Datum.
  *
- *	If a Datum is of composite type, "flatten" it to contain no toasted fields.
- *	This must be invoked on any potentially-composite field that is to be
- *	inserted into a tuple. Doing this preserves the invariant that toasting
- *	goes only one level deep in a tuple.
+ *	While we're at it, we decompress any compressed fields too. This is not
+ *	necessary for correctness, but reflects an expectation that compression
+ *	will be more effective if applied to the whole tuple not individual
+ *	fields. We are not so concerned about that that we want to deconstruct
+ *	and reconstruct tuples just to get rid of compressed fields, however.
+ *	So callers typically won't call this unless they see that the tuple has
+ *	at least one external field.
  *
- *	Note that flattening does not mean expansion of short-header varlenas,
- *	so in one sense toasting is allowed within composite datums.
+ *	On the other hand, in-line short-header varlena fields are left alone.
+ *	If we "untoasted" them here, they'd just get changed back to short-header
+ *	format anyway within heap_fill_tuple.
  * ----------
  */
 Datum
-toast_flatten_tuple_attribute(Datum value,
-							  Oid typeId, int32 typeMod)
+toast_flatten_tuple_to_datum(HeapTupleHeader tup,
+							 uint32 tup_len,
+							 TupleDesc tupleDesc)
 {
-	TupleDesc	tupleDesc;
-	HeapTupleHeader olddata;
 	HeapTupleHeader new_data;
 	int32		new_header_len;
 	int32		new_data_len;
 	int32		new_tuple_len;
 	HeapTupleData tmptup;
-	Form_pg_attribute *att;
-	int			numAttrs;
+	Form_pg_attribute *att = tupleDesc->attrs;
+	int			numAttrs = tupleDesc->natts;
 	int			i;
-	bool		need_change = false;
 	bool		has_nulls = false;
 	Datum		toast_values[MaxTupleAttributeNumber];
 	bool		toast_isnull[MaxTupleAttributeNumber];
 	bool		toast_free[MaxTupleAttributeNumber];

-	/*
-	 * See if it's a composite type, and get the tupdesc if so.
-	 */
-	tupleDesc = lookup_rowtype_tupdesc_noerror(typeId, typeMod, true);
-	if (tupleDesc == NULL)
-		return value;			/* not a composite type */
-
-	att = tupleDesc->attrs;
-	numAttrs = tupleDesc->natts;
-
-	/*
-	 * Break down the tuple into fields.
-	 */
-	olddata = DatumGetHeapTupleHeader(value);
-	Assert(typeId == HeapTupleHeaderGetTypeId(olddata));
-	Assert(typeMod == HeapTupleHeaderGetTypMod(olddata));
 	/* Build a temporary HeapTuple control structure */
-	tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
+	tmptup.t_len = tup_len;
 	ItemPointerSetInvalid(&(tmptup.t_self));
 	tmptup.t_tableOid = InvalidOid;
-	tmptup.t_data = olddata;
+	tmptup.t_data = tup;

+	/*
+	 * Break down the tuple into fields.
+	 */
 	Assert(numAttrs <= MaxTupleAttributeNumber);
 	heap_deform_tuple(&tmptup, tupleDesc, toast_values, toast_isnull);

@@ -1097,20 +1102,10 @@ toast_flatten_tuple_attribute(Datum value,
 				new_value = heap_tuple_untoast_attr(new_value);
 				toast_values[i] = PointerGetDatum(new_value);
 				toast_free[i] = true;
-				need_change = true;
 			}
 		}
 	}

-	/*
-	 * If nothing to untoast, just return the original tuple.
-	 */
-	if (!need_change)
-	{
-		ReleaseTupleDesc(tupleDesc);
-		return value;
-	}
-
 	/*
 	 * Calculate the new size of the tuple.
 	 *
@@ -1119,7 +1114,7 @@ toast_flatten_tuple_attribute(Datum value,
 	new_header_len = offsetof(HeapTupleHeaderData, t_bits);
 	if (has_nulls)
 		new_header_len += BITMAPLEN(numAttrs);
-	if (olddata->t_infomask & HEAP_HASOID)
+	if (tup->t_infomask & HEAP_HASOID)
 		new_header_len += sizeof(Oid);
 	new_header_len = MAXALIGN(new_header_len);
 	new_data_len = heap_compute_data_size(tupleDesc,
@@ -1131,14 +1126,16 @@ toast_flatten_tuple_attribute(Datum value,
 	/*
 	 * Copy the existing tuple header, but adjust natts and t_hoff.
 	 */
-	memcpy(new_data, olddata, offsetof(HeapTupleHeaderData, t_bits));
+	memcpy(new_data, tup, offsetof(HeapTupleHeaderData, t_bits));
 	HeapTupleHeaderSetNatts(new_data, numAttrs);
 	new_data->t_hoff = new_header_len;
-	if (olddata->t_infomask & HEAP_HASOID)
-		HeapTupleHeaderSetOid(new_data, HeapTupleHeaderGetOid(olddata));
+	if (tup->t_infomask & HEAP_HASOID)
+		HeapTupleHeaderSetOid(new_data, HeapTupleHeaderGetOid(tup));

-	/* Reset the datum length field, too */
+	/* Set the composite-Datum header fields correctly */
 	HeapTupleHeaderSetDatumLength(new_data, new_tuple_len);
+	HeapTupleHeaderSetTypeId(new_data, tupleDesc->tdtypeid);
+	HeapTupleHeaderSetTypMod(new_data, tupleDesc->tdtypmod);

 	/* Copy over the data, and fill the null bitmap if needed */
 	heap_fill_tuple(tupleDesc,
@@ -1155,7 +1152,6 @@ toast_flatten_tuple_attribute(Datum value,
 	for (i = 0; i < numAttrs; i++)
 		if (toast_free[i])
 			pfree(DatumGetPointer(toast_values[i]));
-	ReleaseTupleDesc(tupleDesc);

 	return PointerGetDatum(new_data);
 }

src/backend/executor/execQual.c

Lines changed: 9 additions & 23 deletions
@@ -888,8 +888,6 @@ ExecEvalWholeRowFast(WholeRowVarExprState *wrvstate, ExprContext *econtext,
 {
 	Var		   *variable = (Var *) wrvstate->xprstate.expr;
 	TupleTableSlot *slot;
-	HeapTuple	tuple;
-	TupleDesc	tupleDesc;
 	HeapTupleHeader dtuple;

 	if (isDone)
@@ -917,32 +915,20 @@ ExecEvalWholeRowFast(WholeRowVarExprState *wrvstate, ExprContext *econtext,
 	if (wrvstate->wrv_junkFilter != NULL)
 		slot = ExecFilterJunk(wrvstate->wrv_junkFilter, slot);

-	tuple = ExecFetchSlotTuple(slot);
-	tupleDesc = slot->tts_tupleDescriptor;
-
 	/*
-	 * We have to make a copy of the tuple so we can safely insert the Datum
-	 * overhead fields, which are not set in on-disk tuples.
+	 * Copy the slot tuple and make sure any toasted fields get detoasted.
 	 */
-	dtuple = (HeapTupleHeader) palloc(tuple->t_len);
-	memcpy((char *) dtuple, (char *) tuple->t_data, tuple->t_len);
-
-	HeapTupleHeaderSetDatumLength(dtuple, tuple->t_len);
+	dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

 	/*
-	 * If the Var identifies a named composite type, label the tuple with that
-	 * type; otherwise use what is in the tupleDesc.
+	 * If the Var identifies a named composite type, label the datum with that
+	 * type; otherwise we'll use the slot's info.
 	 */
 	if (variable->vartype != RECORDOID)
 	{
 		HeapTupleHeaderSetTypeId(dtuple, variable->vartype);
 		HeapTupleHeaderSetTypMod(dtuple, variable->vartypmod);
 	}
-	else
-	{
-		HeapTupleHeaderSetTypeId(dtuple, tupleDesc->tdtypeid);
-		HeapTupleHeaderSetTypMod(dtuple, tupleDesc->tdtypmod);
-	}

 	return PointerGetDatum(dtuple);
 }
@@ -1017,13 +1003,13 @@ ExecEvalWholeRowSlow(WholeRowVarExprState *wrvstate, ExprContext *econtext,
 	}

 	/*
-	 * We have to make a copy of the tuple so we can safely insert the Datum
-	 * overhead fields, which are not set in on-disk tuples.
+	 * Copy the slot tuple and make sure any toasted fields get detoasted.
 	 */
-	dtuple = (HeapTupleHeader) palloc(tuple->t_len);
-	memcpy((char *) dtuple, (char *) tuple->t_data, tuple->t_len);
+	dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

-	HeapTupleHeaderSetDatumLength(dtuple, tuple->t_len);
+	/*
+	 * Reset datum's type ID fields to match the Var.
+	 */
 	HeapTupleHeaderSetTypeId(dtuple, variable->vartype);
 	HeapTupleHeaderSetTypMod(dtuple, variable->vartypmod);
