next up previous contents index
Next: 6. Analysis of Discriminants Up: II. Second Part: Semantic Previous: 4. Scopes and Visibility   Contents   Index

Subsections


5. Overload Resolution

Ada supports the overloading of subprograms and operators [AAR95, Section 8.3]. This means that an occurrence of an identifier or operator (a designator) may denote several entities that are simultaneously visible at that point. Overloaded subprograms must differ in at least one of the following respects: (1) whether the subprogram is a procedure or function, (2) if the subprogram is a function, its result type, (3) the number of parameters, (4) the type of each parameter. These properties are collectively called the subprogram's parameter-and-result-type profile. For example, the following operations can be visible at the same point:

\begin{figure}\begin{lstlisting}{}
function Op (x : Integer) return Integer; -- ...
...loat; -- (Op4)
procedure Op (x : Integer); -- (Op5)
\end{lstlisting}\end{figure}

Note that operations that only differ in the names of the formal parameters, but not their types, cannot be visible at the same point: either one hides the other through of scope and visibility rules, or else the declarations are illegal.

This notion is familiar from most other programming languages, where overloading is most commonly applied to operators. Overloading resolution is the process of selecting among the visible entities at a point (the candidate interpretations of the designator) the unique one that is compatible with the context in which it appears. In the simplest case, operators are resolved from the type of their operands: (A+B) denotes an integer addition or a floating-point addition, depending on the types of A and B. This simple rule applies to subprograms as well, where a candidate interpretation is retained if the types of the actuals in the call are compatible with the formals of the subprogram.

Ada, unlike many other programming languages, also uses the context of the call to resolve an overloaded name [AAR95, Section 8.6]. For example, a program may declare several functions whose signature differs only in their return type (cases (1) and (4) above). A consequence of the use of context is that type checking of expressions and the identification of operators cannot be performed in a single bottom-up traversal of the expression, as is the case for most other imperative languages. Instead, a two-pass algorithm is required (Algol68 had similar resolution rules, and also required a multi-pass resolution algorithm).

In the rest of this chapter we present the two-pass resolution algorithm implemented in GNAT. We first give a general description of the algorithm. Later we complete this description with additional details, and the data structures used to store the interpretations of an overloaded identifier.


5.1 Resolution Algorithm

The resolution algorithm has two main steps:

  1. In a first step, we attach to each identifier the set of its candidate interpretations, and proceed to select from this set those interpretations that are compatible with the types of their arguments. If such a set has a single element, the designator is unambiguous.

    This upward pass is performed over each complete context, defined as a language construct over which overload resolution must be performed without examining a larger portion of the program. Each of the following constructs is a complete context: a context item, a declarative_item or declaration, a statement, a pragma_argument_association, and the expression of a case_statement [AAR95, Section 8.6(4-9)]. Procedure calls and assignment statements are thus examples of complete contexts.

  2. In the second pass, type information is imposed on the complete context to select a single interpretation for each component of the construct. For example, the right-hand side of an assignment may be an overloaded function call, but if the type of the left-hand side is unique, it will identify uniquely the function being called. Once the function is known, the types of its formal parameters are used to resolve the possibly overloaded actuals in the call, and so on.

The problem of overload resolution is best explained by a simple example (a detailed description of this problem as well as a formal specification of the algorithm can be found in [vK87, Section 4.5]). Let us assume the following functions:

\begin{figure}\begin{lstlisting}{}
declare
function F (A, B : T1) return T is ....
... Var2, Res : T2;
begin
Res := F (Var1, Var2);
end;
\end{lstlisting}\end{figure}

The type of the expression F(Var1,Var2) cannot be deduced from the function call itself; the context in which it appears must be taken into account. In the first scan, the bottom-up analysis restricts at each node in the expression subtree the set of operators by discarding those operators whose formal-parameter types are inconsistent with the types of the actual-parameter expressions. In this example, the second and third version of F are valid candidate, so the processing attaches the set {F2, F3} to the occurrence of F in the call. In the second pass, the top-down analysis restricts the set by discarding any operator or function whose result type is inconsistent with the context. In our example, the context requires the result type T2. Taking this into account, the top-down scan reduces the set of applicable functions to the F from the third definition.

5.1.1 Additional Details for the Bottom-Up Pass

The first pass can also select candidate interpretations on the basis of named parameter associations. Consider the two declarations:

\begin{figure}\begin{lstlisting}{}
function F (A: Integer; B: Integer := 0) retu...
...F1)
function F (C: Integer) return Integer; -- (F2)
\end{lstlisting}\end{figure}

These two functions have different parameter profiles, but the call F(5) is ambiguous, regardless of the context: it can mean F(5,0) or F(5). However, it is possible to write F(C=$>$5) to resolve the ambiguity because the named notation indicates that only F2 is a candidate interpretation.

The details of overload resolution are complicated by construct-specific rules that apply to different complete contexts and also to constituents of a context, and finally by the existence of resolution rules that specify not a single type, but a class of types. For example:

  1. The bounds of an integer type declaration can be of any integer type.
  2. The condition in an if-statement can be of any boolean type.
  3. The operands of an equality operator must be non-limited.

Note that in the course of overload resolution, expressions themselves can be seen as overloaded, in the sense that they have more than one candidate type. For example, a selected component can be overloaded if its prefix is overloaded.

The bottom-up pass labels the subtree as follows:

In GNAT, all predefined operators are declared only once, in the internal representation of package Standard. If the operands of an operator are overloaded, different interpretations correspond to different types but the name of the operator itself is unique. The absence of explicit operators for each type complicates somewhat type-checking, but represents a substantial saving in space and performance. Operators have the following characteristics:

  1. The types of both operands must be the same.
  2. The context does not determine the expected type of the operands, given that it only specifies that the result must be of type Boolean.

Consider the expression f(x)=g(y), where both f and g are overloaded functions, and there are no user-defined equality operations, that is to say only the predefined equality is visible. Bottom-up analysis determines the possible interpretations of each call. We then look for types that appear in both sets of interpretations, that is to say we take the intersection of the candidate types of both operands. If this intersection contains more than one type, the construct is ambiguous regardless of context. If it contains a single type, the equality operation has a single interpretation with a boolean type.

In this section we have mentioned some typical resolution rules for expressions. Additional details can be found in the front-end packages Sem_Ch4 and Sem_Type.


5.1.2 Data Structures

The semantic analyzer stores the interpretations of an identifier in a global overloads table. An interpretation consists of a pair (identifier, type). Expressions can also be overloaded, but do not have a name, so the corresponding interpretations are pairs that just repeat the type. Standard iterator primitives: Get_First_Interp and Get_Next_Interp, are used systematically in the analysis and resolution of overloaded constructs. A common idiom in type processing consists in checking whether a given overloaded node can be used in a given context. The routine Has_Compatible_Type (T, N) iterates over the interpretations of N and yields True if one of those has type T. Details can be found in the front-end package Sem_Type.


5.1.3 Additional Details for the Top-Down Pass

The second pass of type resolution traverses the AST from root to leaves, and propagates the type information imposed by the context to each subcomponent of the context. For example, given the following:

\begin{figure}\begin{lstlisting}{}
declare
procedure P (X : Float; Y : Integer)...
...urn Float; -- F2
...
begin
P (F (5.0), 2.2);
end;
\end{lstlisting}\end{figure}

The bottom-up pass determines that the call to F has the set of possible interpretations: {F1,F2}. Similarly, the identifier P is either P1 or P2. The analysis of the call itself selects P2, which in turns determines that the type of the first actual must be Float. This type information is then applied to the call F(5.0) to select F2.

Each language construct has a corresponding subprogram in the front-end package Sem_Res, that applies the context information to the constituents of the construct. For example, consider an overloaded indexed expression:



F(G(x))(H(y))


...where F, G and H are overloaded function calls. Once the context type CT is known, we iterate over the interpretations of F and retain the one that returns an array type whose component type is CT. Once a unique interpretation of F is known, its formal parameter provides the unique type to resolve G(x), and the component type of its return type is used to resolve H(y).

The language displays a syntactic ambiguity which requires special processing. Consider the following declarations:

\begin{figure}\begin{lstlisting}{}
type Vector is array (Integer range <>) of I...
...er;
function F (X : Integer := 10) return Vector;
\end{lstlisting}\end{figure}

The expression F(5) has two possible interpretations: a function call with parameter 5 that returns a vector, i.e. F(X=>5), or the indexing of a parameterless function call that uses the default value, i.e. F(x=>10)(5). These two interpretations have different AST's, and the proper interpretation is obtained from context: the expected type of the call is either a Vector or an Integer. However, it is awkward to carry two different trees for this construct. Rather, the parser builds the first interpretation, and the resolution of function calls looks for this particular case (see details in Analyze_Call and Resolve_Call.

5.2 Summary

Resolution of overloaded subprograms and operators requires a two-pass algorithm. The first pass attaches to each identifier the set of its candidate interpretations with interpretations are compatible with the types of their arguments. If such a set has a single element, the designator is unambiguous. In the second pass, type information is imposed on the complete context to select a single interpretation for each component of the construct. This analysis is performed over each complete context, defined in Ada as a language construct over which overload resolution must be performed without examining a larger portion of the program.


next up previous contents index
Next: 6. Analysis of Discriminants Up: II. Second Part: Semantic Previous: 4. Scopes and Visibility   Contents   Index
(C) Javier Miranda and Edmond Schonberg, 2004