Commit 1a8b9fb

Extend the unknowns-are-same-as-known-inputs type resolution heuristic.
For a very long time, one of the parser's heuristics for resolving ambiguous operator calls has been to assume that unknown-type literals are of the same type as the other input (if it's known). However, this was only used in the first step of quickly checking for an exact-types match, and thus did not help in resolving matches that require coercion, such as matches to polymorphic operators. As we add more polymorphic operators, this becomes more of a problem. This patch adds another use of the same heuristic as a last-ditch check before failing to resolve an ambiguous operator or function call. In particular this will let us define the range inclusion operator in a less limited way (to come in a follow-on patch).
1 parent bf4f96b commit 1a8b9fb
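
To make the limitation described in the commit message concrete, here is a minimal toy sketch in C of an exact-match-only version of the heuristic. It is not PostgreSQL code: `ToyTypeId`, `ToyBinaryOp`, and `toy_exact_match` are hypothetical names invented for illustration, and the type identifiers are not real OIDs. The sketch substitutes the known type for the unknown input and then demands that the declared operand types equal the inputs exactly, so it can never select an operator declared on a polymorphic type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy type identifiers (assumption: not PostgreSQL OIDs). */
typedef enum { TOY_UNKNOWN, TOY_INT4, TOY_INT4_ARRAY, TOY_ANYARRAY } ToyTypeId;

/* Hypothetical binary-operator catalog entry. */
typedef struct
{
    ToyTypeId left;
    ToyTypeId right;
} ToyBinaryOp;

/*
 * Return the index of the operator whose declared operand types equal the
 * inputs exactly, or -1 if there is none. If exactly one input is unknown,
 * it is first assumed to have the type of the other input.
 */
static int
toy_exact_match(ToyTypeId left, ToyTypeId right,
                const ToyBinaryOp *ops, int nops)
{
    int i;

    assert(ops != NULL || nops == 0);

    if (left == TOY_UNKNOWN && right != TOY_UNKNOWN)
        left = right;
    else if (right == TOY_UNKNOWN && left != TOY_UNKNOWN)
        right = left;

    for (i = 0; i < nops; i++)
    {
        /* exact comparison only: no coercion, no polymorphic matching */
        if (ops[i].left == left && ops[i].right == right)
            return i;
    }
    return -1;
}
```

In this toy model an `int4` input paired with an unknown literal finds an `int4`/`int4` operator, while an `int4[]` input paired with an unknown literal finds nothing if the only suitable operator is declared on `anyarray`, because `int4[]` is not identical to `anyarray`.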

2 files changed: +147 -26 lines changed

doc/src/sgml/typeconv.sgml

Lines changed: 48 additions & 8 deletions
@@ -304,13 +304,18 @@ without more clues. Now discard
 candidates that do not accept the selected type category. Furthermore,
 if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>
@@ -376,7 +381,7 @@ be interpreted as type <type>text</type>.
 </para>
 
 <para>
-Here is a concatenation on unspecified types:
+Here is a concatenation of two values of unspecified types:
 <screen>
 SELECT 'abc' || 'def' AS "unspecified";
 
@@ -394,7 +399,7 @@ and finds that there are candidates accepting both string-category and
 bit-string-category inputs. Since string category is preferred when available,
 that category is selected, and then the
 preferred type for strings, <type>text</type>, is used as the specific
-type to resolve the unknown literals as.
+type to resolve the unknown-type literals as.
 </para>
 </example>
 
@@ -450,6 +455,36 @@ SELECT ~ CAST('20' AS int8) AS "negation";
 </para>
 </example>
 
+<example>
+<title>Array Inclusion Operator Type Resolution</title>
+
+<para>
+Here is another example of resolving an operator with one known and one
+unknown input:
+<screen>
+SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
+
+ is subset
+-----------
+ t
+(1 row)
+</screen>
+The <productname>PostgreSQL</productname> operator catalog has several
+entries for the infix operator <literal>&lt;@</>, but the only two that
+could possibly accept an integer array on the left-hand side are
+array inclusion (<type>anyarray</> <literal>&lt;@</> <type>anyarray</>)
+and range inclusion (<type>anyelement</> <literal>&lt;@</> <type>anyrange</>).
+Since none of these polymorphic pseudo-types (see <xref
+linkend="datatype-pseudo">) are considered preferred, the parser cannot
+resolve the ambiguity on that basis. However, the last resolution rule tells
+it to assume that the unknown-type literal is of the same type as the other
+input, that is, integer array. Now only one of the two operators can match,
+so array inclusion is selected. (Had range inclusion been selected, we would
+have gotten an error, because the string does not have the right format to be
+a range literal.)
+</para>
+</example>
+
 </sect1>
 
 <sect1 id="typeconv-func">
<sect1 id="typeconv-func">
@@ -594,13 +629,18 @@ the correct choice cannot be deduced without more clues.
 Now discard candidates that do not accept the selected type category.
 Furthermore, if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>
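
The last resolution rule documented above, applied to the array inclusion example, can be sketched roughly as follows. This is a toy model in C, not the parser's code: `ToyType`, `ToyCandidate`, `toy_can_accept`, and `toy_last_resort` are hypothetical names, the type identifiers are not real OIDs, and `toy_can_accept` is a deliberately crude stand-in for implicit coercion that ignores the consistency checks real polymorphic resolution performs across arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy type identifiers (assumption: not PostgreSQL OIDs). */
typedef enum
{
    TY_UNKNOWN, TY_INT4, TY_INT4_ARRAY, TY_INT4_RANGE,
    TY_ANYELEMENT, TY_ANYARRAY, TY_ANYRANGE
} ToyType;

#define TOY_MAX_ARGS 4

/* Hypothetical candidate: a label plus its declared argument types. */
typedef struct
{
    const char *name;
    ToyType args[TOY_MAX_ARGS];
} ToyCandidate;

/* Crude stand-in for "can this actual type be passed to this declared type?" */
static int
toy_can_accept(ToyType actual, ToyType declared)
{
    switch (declared)
    {
        case TY_ANYELEMENT: return actual != TY_UNKNOWN;
        case TY_ANYARRAY:   return actual == TY_INT4_ARRAY;
        case TY_ANYRANGE:   return actual == TY_INT4_RANGE;
        default:            return actual == declared;
    }
}

/*
 * If the inputs mix unknown and known types and all known types agree,
 * assume every input has that type and return the single candidate that
 * accepts it at all positions. Return NULL if the rule does not apply or
 * the match is not unique.
 */
static const ToyCandidate *
toy_last_resort(int nargs, const ToyType *input,
                const ToyCandidate *cands, int ncands)
{
    ToyType known = TY_UNKNOWN;
    const ToyCandidate *match = NULL;
    int nunknowns = 0;
    int i, c;

    assert(nargs > 0 && nargs <= TOY_MAX_ARGS);

    for (i = 0; i < nargs; i++)
    {
        if (input[i] == TY_UNKNOWN)
            nunknowns++;
        else if (known == TY_UNKNOWN)
            known = input[i];           /* first known argument */
        else if (known != input[i])
            return NULL;                /* known types disagree */
    }
    if (nunknowns == 0 || known == TY_UNKNOWN)
        return NULL;                    /* need both known and unknown inputs */

    for (c = 0; c < ncands; c++)
    {
        int ok = 1;

        for (i = 0; i < nargs && ok; i++)
            ok = toy_can_accept(known, cands[c].args[i]);
        if (!ok)
            continue;
        if (match != NULL)
            return NULL;                /* not unique, give up */
        match = &cands[c];
    }
    return match;
}
```

With inputs `{TY_INT4_ARRAY, TY_UNKNOWN}` and the two candidates `anyarray <@ anyarray` and `anyelement <@ anyrange`, only the first candidate accepts an integer array on both sides, so it is the one returned.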

src/backend/parser/parse_func.c

Lines changed: 99 additions & 18 deletions
@@ -618,14 +618,16 @@ func_select_candidate(int nargs,
                       Oid *input_typeids,
                       FuncCandidateList candidates)
 {
-    FuncCandidateList current_candidate;
-    FuncCandidateList last_candidate;
+    FuncCandidateList current_candidate,
+                first_candidate,
+                last_candidate;
     Oid *current_typeids;
     Oid current_type;
     int i;
     int ncandidates;
     int nbestMatch,
-                nmatch;
+                nmatch,
+                nunknowns;
     Oid input_base_typeids[FUNC_MAX_ARGS];
     TYPCATEGORY slot_category[FUNC_MAX_ARGS],
                 current_category;
@@ -651,9 +653,22 @@ func_select_candidate(int nargs,
      * take a domain as an input datatype. Such a function will be selected
      * over the base-type function only if it is an exact match at all
      * argument positions, and so was already chosen by our caller.
+     *
+     * While we're at it, count the number of unknown-type arguments for use
+     * later.
      */
+    nunknowns = 0;
     for (i = 0; i < nargs; i++)
-        input_base_typeids[i] = getBaseType(input_typeids[i]);
+    {
+        if (input_typeids[i] != UNKNOWNOID)
+            input_base_typeids[i] = getBaseType(input_typeids[i]);
+        else
+        {
+            /* no need to call getBaseType on UNKNOWNOID */
+            input_base_typeids[i] = UNKNOWNOID;
+            nunknowns++;
+        }
+    }
 
     /*
      * Run through all candidates and keep those with the most matches on
@@ -749,14 +764,16 @@ func_select_candidate(int nargs,
         return candidates;
 
     /*
-     * Still too many candidates? Try assigning types for the unknown columns.
-     *
-     * NOTE: for a binary operator with one unknown and one non-unknown input,
-     * we already tried the heuristic of looking for a candidate with the
-     * known input type on both sides (see binary_oper_exact()). That's
-     * essentially a special case of the general algorithm we try next.
+     * Still too many candidates? Try assigning types for the unknown inputs.
      *
-     * We do this by examining each unknown argument position to see if we can
+     * If there are no unknown inputs, we have no more heuristics that apply,
+     * and must fail.
+     */
+    if (nunknowns == 0)
+        return NULL; /* failed to select a best candidate */
+
+    /*
+     * The next step examines each unknown argument position to see if we can
      * determine a "type category" for it. If any candidate has an input
      * datatype of STRING category, use STRING category (this bias towards
      * STRING is appropriate since unknown-type literals look like strings).
@@ -770,9 +787,9 @@ func_select_candidate(int nargs,
      * Having completed this examination, remove candidates that accept the
      * wrong category at any unknown position. Also, if at least one
      * candidate accepted a preferred type at a position, remove candidates
-     * that accept non-preferred types.
-     *
-     * If we are down to one candidate at the end, we win.
+     * that accept non-preferred types. If just one candidate remains,
+     * return that one. However, if this rule turns out to reject all
+     * candidates, keep them all instead.
      */
     resolved_unknowns = false;
     for (i = 0; i < nargs; i++)
@@ -835,6 +852,7 @@ func_select_candidate(int nargs,
     {
         /* Strip non-matching candidates */
         ncandidates = 0;
+        first_candidate = candidates;
         last_candidate = NULL;
         for (current_candidate = candidates;
              current_candidate != NULL;
@@ -874,15 +892,78 @@ func_select_candidate(int nargs,
                 if (last_candidate)
                     last_candidate->next = current_candidate->next;
                 else
-                    candidates = current_candidate->next;
+                    first_candidate = current_candidate->next;
             }
         }
-        if (last_candidate) /* terminate rebuilt list */
+
+        /* if we found any matches, restrict our attention to those */
+        if (last_candidate)
+        {
+            candidates = first_candidate;
+            /* terminate rebuilt list */
             last_candidate->next = NULL;
+        }
+
+        if (ncandidates == 1)
+            return candidates;
     }
 
-    if (ncandidates == 1)
-        return candidates;
+    /*
+     * Last gasp: if there are both known- and unknown-type inputs, and all
+     * the known types are the same, assume the unknown inputs are also that
+     * type, and see if that gives us a unique match. If so, use that match.
+     *
+     * NOTE: for a binary operator with one unknown and one non-unknown input,
+     * we already tried this heuristic in binary_oper_exact(). However, that
+     * code only finds exact matches, whereas here we will handle matches that
+     * involve coercion, polymorphic type resolution, etc.
+     */
+    if (nunknowns < nargs)
+    {
+        Oid known_type = UNKNOWNOID;
+
+        for (i = 0; i < nargs; i++)
+        {
+            if (input_base_typeids[i] == UNKNOWNOID)
+                continue;
+            if (known_type == UNKNOWNOID) /* first known arg? */
+                known_type = input_base_typeids[i];
+            else if (known_type != input_base_typeids[i])
+            {
+                /* oops, not all match */
+                known_type = UNKNOWNOID;
+                break;
+            }
+        }
+
+        if (known_type != UNKNOWNOID)
+        {
+            /* okay, just one known type, apply the heuristic */
+            for (i = 0; i < nargs; i++)
+                input_base_typeids[i] = known_type;
+            ncandidates = 0;
+            last_candidate = NULL;
+            for (current_candidate = candidates;
+                 current_candidate != NULL;
+                 current_candidate = current_candidate->next)
+            {
+                current_typeids = current_candidate->args;
+                if (can_coerce_type(nargs, input_base_typeids, current_typeids,
+                                    COERCION_IMPLICIT))
+                {
+                    if (++ncandidates > 1)
+                        break; /* not unique, give up */
+                    last_candidate = current_candidate;
+                }
+            }
+            if (ncandidates == 1)
+            {
+                /* successfully identified a unique match */
+                last_candidate->next = NULL;
+                return last_candidate;
+            }
+        }
+    }
 
     return NULL; /* failed to select a best candidate */
 } /* func_select_candidate() */
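
The `first_candidate` change in the diff above makes the category filter tentative: the list head is only replaced if at least one candidate survives. A standalone sketch of that pattern, using a hypothetical `ToyNode` list and a hypothetical `toy_filter_or_keep_all` helper rather than `FuncCandidateList`, might look like this. It relies on the same observation as the patch: rejected nodes are unlinked only through the last kept node, so if nothing is kept, no link is ever modified and the original list can be returned as-is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked candidate node (assumption: illustrative only). */
typedef struct ToyNode
{
    int category;               /* stand-in for whatever the filter tests */
    struct ToyNode *next;
} ToyNode;

/*
 * Keep only nodes whose category equals "wanted". If no node qualifies,
 * return the original list unchanged. *nkept receives the number of
 * qualifying nodes (0 when the original list is kept whole).
 */
static ToyNode *
toy_filter_or_keep_all(ToyNode *candidates, int wanted, int *nkept)
{
    ToyNode *first = candidates;    /* tentative head of the filtered list */
    ToyNode *last = NULL;           /* last node kept so far */
    ToyNode *cur;

    assert(nkept != NULL);
    *nkept = 0;

    for (cur = candidates; cur != NULL; cur = cur->next)
    {
        if (cur->category == wanted)
        {
            last = cur;             /* keep this node */
            (*nkept)++;
        }
        else if (last != NULL)
            last->next = cur->next; /* unlink it after the last kept node */
        else
            first = cur->next;      /* only move the tentative head */
    }

    if (last != NULL)
    {
        last->next = NULL;          /* terminate rebuilt list */
        return first;
    }
    return candidates;              /* nothing survived: keep them all */
}
```

Updating the real head directly, as the pre-patch code did with `candidates`, would leave it `NULL` when every candidate is rejected, which is why the tentative head is tracked separately.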
