Why notation matters?
Communication between the members of the various working groups on aspects of the new algorithmic language was important because the ALGOL effort was an international effort consisting of different people and groups. A clear and unambiguous understanding of the language in all its facets by all members participating in the effort was necessary to be able to develop and discuss the ALGOL language. Using English or any other natural language to describe a programming language was, and is, insufficient. Natural languages are too ambiguous to define formally or even describe a programming language completely.
A description of a programming language consists of a description of two related aspects of the language: the syntax and the semantics. The syntax of a language is about how to form a string in the language. The semantics is about the meaning of syntactically correct strings in the language. The most obvious way to define formally the semantics is an implementation of the language on some machine. This implementation then fixes the meaning of the language.
Such an implementation is not a workable description of a language. A typical implementation is too large to be understood easily. Often it is written in some low level programming language or even assembly language. In addition, a compiler is written for a particular machine for which the details should be known to understand the details of the implementation.
Fortunately, describing the syntax formally appeared to be an easier task than describing the semantics of a language. During the development of ALGOL a formal notation was invented to describe the syntax of ALGOL. This notation, or metalanguage, was not complete or perfect but it fulfilled its task to prevent the occurrence of many ambiguities in the discussions about ALGOL. Although the new notation was primarily intended for describing the syntax, the semantics were not completely ignored. Actually, the description of the semantics was mixed in with the description of the syntax.
Usually a special notation to describe the syntax also influences the language concepts to be described. Some language concepts are more easily described in one notation than in an other one. A formal notation will result in more coherent and simple concepts: the notation forces the concepts to be described along the rules of the notation.
In this chapter, both the development of the notation used to describe the syntax and semantics of ALGOL and the development of the language concepts of ALGOL 60 are discussed. To that end, the notation used for describing FORTAN and IAL is explained first. After that the focus is on the first stage of the development of the notation: Backus’s notation. The third topic is the notation of the ALGOL 60 report. Finally, the development of some problematic programming language concepts in ALGOL, especially that of the procedure concept, are treated in more detail.
Notations used to describe early programming languages
The notations used to describe the early programming languages were, like the languages they described, primitive. During the ALGOL effort this notation was developing into an important aspect of the field of programming languages. Actually, one of the results of this effort was a way to define programming languages: do it as it was done in the ALGOL 60 report.
To describe this development of notation, the notation used before the ALGOL effort is explained and compared to the notation used to describe IAL. This comparison answers the question if this development of notation started during or before the ALGOL effort. In other words: Was the nature of the ALGOL effort responsible for this development of notation?
The notation of IAL and FORTRAN compared
In 1954, the first document describing FORTRAN was published: Preliminary Report—Specifications for the IBM Mathematical FORmula TRANslating System FORTRAN.1 Two years later, at 15 October 1956, a more finalised version of the language was published.2 The notation used in this later document was the same as in the preliminary report; the notation to describe a programming language had not changed fundamentally during these two years.
As was the case with FORTRAN, the first publication of IAL was also a preliminary report: Preliminary Report: International Algebraic Language.4 Actually, both preliminary reports do resemble each other in the sense that both documents were set up in a similar way. To compare the notations used in both reports, the descriptions for some language elements are given and compared.
Real numbers in IAL are described using a pattern reflecting the form of a real number. Compared to the description of real numbers in FORTRAN , where only natural language is used, the form of a number is more obvious.
A more complicated language construct was the expression. Again, in IAL patterns are used to clarify the form of an expression whereas in FORTRAN natural language is used. The difference between the two notations is much smaller than before: a pattern is used in both descriptions. In IAL’s description this pattern is more explicit, however.
The notations used for statements were also alike. For example, the
if statement is described in both preliminary reports with a pattern (compare figures). Although this version of FORTAN did not have that much declarations, the description of declarations was more or less similar with those in the IAL report. In a later version of FORTRAN more declarations were added to the language.
In short, the notation used to describe FORTRAN was similar to the notation used to describe IAL. In the description of FORTRAN natural language was used more often than in the description of IAL. The general form of a language element was described with a pattern in both descriptions. These patterns were the basis of the description of IAL. This was probably the influence of the European part of the designing committee. As we have seen before, the Europeans tended more to logic and mathematics than the Americans did.
The quality of IAL’s notation
The notations used to describe both FORTRAN and IAL were similar. This does not say anything about the quality of the notation, however. The question is: Was this notation sufficient to describe a programming language formally and completely? To answer this question, a closer look at IAL and its notation is taken.
Even for simple language elements, like algebraic expressions, IAL’s notation was not good enough. For example, an algebraic expression has simply one of the forms occurring on the right hand side of the “~” symbols. Although it is assumed that the operators in this definition do have the ‘conventional meaning,’10 this definition is ambiguous and it does not say anything about operator precedence, nor about associativity.
When defining an element with an unknown number of subelements or when there are constraints on the occurrence of some subelements this notation was problematic. Take, for example, the compound statement (above Figure). The pattern of this compound statement is a number of statements between the
end keywords. At first glance, this seems to define this compound statement completely.
Unfortunately it is not clear if it was possible to have a compound statement containing no statements at all, only one statement, or even two statements. After all, there are three “Σ” symbols in the pattern describing the compound statement. The use of “…” in itself may be clear and completely understandable for most of the readers of the report; those readers were almost all used to read mathematical texts in which these shortscripts occurred often. It does, however, not describe formally and fully what should occur on those “….”
This compound statement consists of a number of unconstrained elements only and the use of “…” would be acceptable if the only meaning was zero or more times the repeated element. Unacceptable for a formal notation are constraints on the elements in a pattern added outside the pattern using natural language.
As an example, part of the definition of the procedure declaration is given in the above Figure. Here the “…” are used not only to denote zero or more occurrences of one element denoted by a single symbol, it has three different meanings in the same pattern:
One or more occurrences of a procedure signature. The procedure signature itself consisted of more than one symbol and a part of unknown length, namely the parameter part of the procedure signature. Actually the part
Zero or more occurrences of a declaration about the input and output parameters defined in the procedure signatures.
Zero or more occurrences of two different elements: statements Σ and ∆ declarations . These different elements can be mixed in any order. For every procedure signature, however, there should be a label identical to the procedure identifier preceding minimally one statement. In addition, each procedure must have a return statement and all output parameters should have an assignment statement assigning the value to output the result.
This early notation to describe programming languages was not suitable to define a language like IAL fully and formally. To be able to do so was necessary because IAL was intended to be machine independent: IAL was to be implemented on various machines by different people. According to Backus (1959): ‘there must exist a precise description of those sequences of symbols which constitute legal IAL programs.’13 But, ‘heretofore there has existed no formal description of a machine-independent language.’14 For this reason, Backus started to work on such a formal description.
At the UNESCO International Conference on Information Processing, held at Paris from 15 till 20 June 1959, J.W. Backus presented The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM Conference15 about a formal description of the syntax of IAL. To be able to describe the syntax formally he invented a new metalanguage based on Emile Post’s production system.16 This notation became known as the Backus Normal Form and later as the Backus Naur Form,17 it is, however, best known by its abbreviation BNF.
Using this notation, the syntax of a language could be described by “production rules.” Each rule was of the shape
<metalinguistic variable> :≡ pattern. A pattern was built up from metalinguistic variables and symbols of the language. All possible patterns for a metalinguistic variable were connected with the
or symbol, denoting a choice between the different patterns for the metalinguistic variable.
A simple example of the application of Backus’s notation is the description of integers:18
<digit> :≡ 0 or 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 <integer> :≡ <digit> or <integer><digit>
A digit is clearly a number, or better, a symbol representing a number, between
9 . An integer is now built up from these simple digits: it is either one simple digit, or it is a integer followed by a simple digit, like
Another simple and clear example is the description of arithmetic expressions:19
<factor> :≡ <number> or <function> or <variable> or <subscr var> or ( <ar exp> ) or <factor>↑<ar exp>↓ <term> :≡ <factor> or <term> × <factor> or <term> / <factor> <ar exp> :≡ <term> or + <term> or - <term> or <ar exp> + <term> or <ar exp> - term <Ar exp A> :≡ <ar exp> <relation> :≡ < or > or ≤ or ≥ or ≠ < rel exp> :≡ ( <ar exp> <relation> <ar exp A>)
Comparing this with the previous incarnation, it is immediately clear that the former is a less ambiguous description than the former description. Using his notation Backus was able to denote the operator precedence by splitting up the description of expressions into different parts: factors, terms and expressions.
Some other problems with the older notation were also solved by this new notation: the use of “…” to denote the occurrence of an element a (unknown) number of times was not needed any more. In Backus’s notation, it was possible to use recursion and, hence, to specify that an element could occur a number of times. The use of this recursivity is made clear by describing a parameter list:20
<param list> :≡ <param> or <param list>, <param>
A parameter list is either just one parameter, or it is a parameter list followed by a comma and one parameter.
Choice could be denoted by using the connective: write two patterns, one with the element of choice and one without it. Both recursion and choice were used to describe procedure statements:21
<oe> :≡ <left element> <out list> :≡ <oe> or <outlist>, <oe> <suc> :≡ <label> or <id>[<exp>] <succr list> :≡ <suc> or <succr list>, <suc> <A> :≡ =:(<out list>) or <blank> <B> :≡ :(<succr list>) or <blank> <proc stmt> :≡ <function> <A> <B> or <id> =:(<outlist>) <B> or <id>:(<succr list>) ppol> :≡ <blank> or <ppol> <oe> <pol> :≡ <ppol> or <pol>, or <pol>, <oe> <A'> :≡ =:(<pol>) <ppsl> :≡ <blank> or <ppsl><suc> <psl> :≡ <ppsl> or <psl>, or <psl>, <suc> <B'> :≡ :(<psl>) <F*> :≡ <function> or <pure funtion> or <id> A*> :≡ <A> or <A'> <B*> :≡ <B> or <B'> <pure function> :≡ <pure function> <A*> <B*> or <F*> <A'> <B'> or <F*> <A*> <B'> [a pure procedure may have any of the forms of a procedure statement but at least one position of one existing list must be empty; at least one input parameter position or one output position or one successor position].
Unfortunately, Backus was not able to write down a formal, clear and understandable description of the procedure statement. Even using his new notation, Backus had to write down some remarks about the statement using natural language to complete the formal description; the procedure statement was too difficult to describe formally. In addition, the description of the declaration of procedures was not included at all.
The new notation was an huge improvement over the one used earlier. Nonetheless, it was improved further by Peter Naur. He replaced by and by . With this, and with the use of complete words for metalinguistic variables instead of using abbreviations of the same words as did Backus, Naur improved the readability of the description.22 The most important contribution of Peter Naur to Backus’s notation, however, was the fact that he used it in the ALGOL 60 report.23 Only after the publication of that report the BNF became more widely known. Before the publication it appeared to Peter Naur that Backus’s description of IAL ‘was received with a silence that made it seem that precise syntax description was an idea whose time had not yet come.’24 Naur ‘thus proved the usefulness of the idea in a widely read paper and it was accepted.’25
Developing ALGOL 60
During the eighteen months between the meeting in Zürich and the next joint meeting on the international algebraic language, held in Paris, January 1960, the name of the language had changed from the ‘“unspeakable” and pompous acronym, IAL’26 to ALGOL. Furthermore, the language was discussed among interested people from America and various countries of Europe. In these months the ALGOL effort became a truly international effort, but it was still a separated effort. The main development took place at different meetings and in correspondence between members of the various subcommittees. The official communications channels of the development, however, were the Communications of the ACM in the USA and the ALGOL Bulletin in Europe.
Most of the American proposals were related to practical aspects of the language. They wanted to improve the language by extending it, by adding more types, and by adding input and output facilities. Another suggestion to improve the language was to tidy up the syntax a bit.27 This practical attitude to the ALGOL effort was a result of the state of programming in the USA: programming was becoming a professional field and the experience gained with existing programming languages provided a good feedback to the ALGOL effort.
The European proposals were often focused on the procedure concept and the scopes of variables.28 The Europeans aimed to improve the language fundamentally and the main target was the difficult procedure concept in IAL. Both in the preliminary report on IAL and in Backus’s description of IAL this procedure concept could not be described fully and formally in the notation used. Aside from this notational problem, other problems with IAL’s procedure concept were also noted and discussed.
The discussions on the procedure concept focused mainly on parameters. E.T. Irons and F.S. Acton (1959) sum up some problems with IAL’s parameters in A proposed interpretation in ALGOL.29 Parameters could occur in the procedure body at the left hand side of an assignment statement. When the procedure was called these parameters in the body were replaced with the argument supplied for that parameter. If an argument was not an assignable element (i.e. not a variable) it would result in undefined behaviour. Another problem mentioned was the use of one argument as both an input and an output parameter. Actually, one parameter could also be used as both an input and an output parameter.30
These problems with the procedure concept were resolved by various subcommittees at the final meeting in Paris. First, the distinction between input and output parameters was removed.31 This solved a number of problems but not all of them. Eventually, under great time pressure, the distinction between call-by-name (enabling the so-called Jensen’s device) and call-by-value was invented.32 Every occurrence of a call-by-name parameter in the body of a procedure was being substituted by the name of the argument supplied to the procedure for that parameter. If a parameter was called by value, however, then the value of the argument was assigned to all occurrences of the corresponding parameter in the procedure body. This call-by-name parameter concept was one of the most controversial features of ALGOL 60.
Another issue at the Paris meeting was recursion. Recursive procedures were new in 1960 and the usefulness of it was not widely recognised. The proposal to add the keyword to the language to denote a recursive procedure was rejected.33 When ALGOL 60 was published, however, it appeared that there was no restriction on the occurrence of calls to a procedure in its own procedure body. Without anyone knowing recursion was added to the language. According to Bauer (1978), this was the result of ‘the Amsterdam plot in introducing recursivity.’34
At the end of the meeting the procedure concept had become a clear concept. The formal description of the declaration of a procedure is given in the above Figure. As said earlier, Backus (1959) did not define the procedure declarations formally, but he did describe procedure statements. The description of procedure statements in the ALGOL 60 report consists of just six subparts. Backus, on the other hand, needed seventeen subparts for the same task. Using the BNF to define ALGOL 60 had a beneficial effect on the procedure concept and ALGOL 60 as a whole.
Besides the types Boolean, integer, real, and array of those types, the string type was added to ALGOL 60. It could only be used as an actual parameter in procedures, however. Proposals to add extra types like complex numbers and lists, were rejected. Furthermore, IAL’s assignment statement was extended into a multiple assignment statement: assignments like were now also allowed.37
The compound statement introduced in IAL was extended and it became a special case of the block concept. In a block, variables, functions, and even procedures can be declared local to the scope of the block. These declarations are only known in the block they are declared in and in all subblocks. They are not known, however, in the encapsulating block. A compound statement was now a block without any declarations. Blocks could be nested to any particular level.38
The case-like statement from IAL was removed and replaced by the
else clause in the
if statement and conditional arithmetical and Boolean expressions.40 In the above Figure the description of the arithmetic expression is given, including this conditional expression. An example of the use of a conditional expression is:
a := if (b < 23) then b + 23 else b - 21. To the variable the value
b + 23 is assigned if
b < 23, else the value
b - 21 is assigned.
In Figure below one of the example programs in the ALGOL 60 report is given exemplifying many of the programming language concepts discussed above.
The meeting in Paris was attended by Bauer, Naur, Rutishauser, Samelson, Vauquois, van Wijngaarden, and Woodger from Europe and by Backus, Green, Katz, McCarty, Perlis, and Wegstein from the USA. The seventh American member, Turanski, had died even before the meeting and was not present. According to Perlis (1978), ‘The meetings were exhausting, interminable, and exhilarating. (…) diligence persisted during the entire period, The chemistry of the 13 was excellent. (…) Progress was steady and the output, Algol60, was more racehorse than camel.’42
The difference between IAL and ALGOL 60 was huge. Instead of ‘just adding a few corrections to ALGOL 58, it was necessary to redesign the language from the bottom up.’43 In addition, the editor of the ALGOL 60 report, Peter Naur, used a changed version of Backus’s notation. The use of the BNF was beneficial for the clean structure of the report. Peter Naur became the editor because he had prepared a draft of the language for the Paris meeting. According to Bauer (1978), the participants of the conference were surprised by Naur’s work. ‘It therefor sounds poetic if he has written that his draft Report was ‘chosen’ as the basis of the discussion; the Committee was simply forced to do so’.44
The result of the meeting, the ALGOL 60 report, ‘was a fitting display for the language. Nicely organised, tantalisingly incomplete, slightly ambiguous, difficult to read, consistent in format, and brief, it was a perfect canvas for a language that possessed those same properties. Like the Bible it was meant, not merely to be read, but to be interpreted.’45 And although the ALGOL 60 report and the formal notation used in it were frightening many people at first, the ALGOL 60 report would become the standard for defining programming languages.
A formal notation to define a programming language is important to allow everyone to read and interpret the definition in the same way. During the development of ALGOL 60, the notation used to describe the ALGOL languages changed fundamentally.
The notation to describe early programming languages like FORTRAN and IAL was natural language combined with some patterns denoting the form of the various language elements. The disadvantage of this notation was that it resulted in ambiguous descriptions. Even for simple language elements, like numbers, expressions, and simple control structures, this was problematic. For definition of complex structures, like the procedure statement and declarations, the notation was totally insufficient. The description of IAL’s procedure concepts was long and incomplete because of the use of the “…” symbol.
To give a more formal and complete description of the syntax of IAL, Backus invented a new notation: the Backus Normal Form. Using this simple notation, complex structures in the language could be described formally. Unfortunately, the procedure concept of IAL was too complex to describe in this notation. So we can conclude that either the notation was insufficient or the procedure concept in itself was wrong.
During the period between the definition of IAL and ALGOL 60, many proposals to improve ALGOL were made. One of the main topics, especially in Europe, was the complex procedure concept. In the ALGOL 60 report the procedure concept was simplified: input and output parameters were removed, call-by-name and call-by-value parameters introduced. Another important aspect of the new language was the notion of a block with its own scope. This block was an extension of the compound statement from IAL. Recursion was a new feature added without anybody knowing it; the proposal to add recursion explicitly was rejected.
The ALGOL 60 report was edited by Peter Naur. He wrote the draft version and used a slightly modified version of Backus’ notation to describe the language. This draft was used as the basis for the ALGOL meeting. The final report would become the standard method of defining programming languages and Backus’ notation became the standard method to describe the syntax of programming languages.
In this short period, the ALGOL effort became a major contributor to the field of programming languages. Examples of contributions include the BNF, a method to define a programming language, the block structure, recursive procedures, call by name, call by value, the block, the scope, etc.
Programming Research Group IBM, Preliminary Report – Specifications for the IBM Mathematical FORmula TRANslating System FORTRAN (New York: IBM, 1954), http://community.computerhistory.org/scc/projects/FORTRAN/BackusEtAl-Preliminary Report-1954.pdf.↩︎
J. W. Backus et al., The FORTRAN Automatic Coding System for the IBM 704 EDPM: Programmer’s Reference Manual (Applied Science Division; Programming Research Department, International Business Machines Corporation, 1956), 1, http://community.computerhistory.org/scc/projects/FORTRAN/704_FortranProgRefMan_Oct56.pdf.↩︎
IBM, Preliminary Report FORTRAN, 3.↩︎
A. J. Perlis and K. Samelson, “Preliminary Report: International Algebraic Language,” Commun. ACM 1, no. 12 (1958): 8–22, doi:10.1145/377924.594925.↩︎
IBM, Preliminary Report FORTRAN, 6.↩︎
Perlis and Samelson, “Preliminary Report: IAL,” 14.↩︎
IBM, Preliminary Report FORTRAN, 14.↩︎
Perlis and Samelson, “Preliminary Report: IAL,” 12.↩︎
John W. Backus, “The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference.” in IFIP Congress, 1959, 129.↩︎
Backus, “The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference.”↩︎
John Backus, “Programming in America in the 1950s – Some Personal Impressions,” in A History of Computing in the Twentieth Century, ed. N. Metropolis, J. Howlett, and Gian-Carlo Rota (Academic Press, 1980), 132.↩︎
Donald E. Knuth, “Backus Normal Form Vs. Backus Naur Form,” Commun. ACM 7, no. 12 (1964): 736, doi:10.1145/355588.365140.↩︎
Backus, “The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference.” 129.↩︎
Backus, “Programming in America in the 1950s – Some Personal Impressions,” 133.↩︎
Knuth, “Backus Normal Form Vs. Backus Naur Form,” 736.↩︎
Backus, “Programming in America in the 1950s – Some Personal Impressions,” 133.↩︎
Alan J. Perlis, “The American Side of the Development of Algol,” in HOPL-1: The First ACM SIGPLAN Conference on History of Programming Languages (New York, NY, USA: ACM Press, 1978), 6, doi:10.1145/800025.808369.↩︎
Alan J. Perlis, “Transcripts of Presentations,” in HOPL-1: The First ACM SIGPLAN Conference on History of Programming Languages (New York, NY, USA: ACM Press, 1978), 144.↩︎
Perlis, “The American Side of the Development of Algol,” 9.↩︎
E. T. Irons and F. S. Acton, “A Proposed Interpretation in ALGOL,” Commun. ACM 2, no. 12 (1959): 14–15, doi:10.1145/368518.368546.↩︎
Peter Naur, “Transcripts of Presentations,” in HOPL-1: The First ACM SIGPLAN Conference on History of Programming Languages (New York, NY, USA: ACM Press, 1978), 147.↩︎
Perlis, “Transcripts of Presentations,” 159.↩︎
J. W. Backus et al., “Report on the Algorithmic Language ALGOL 60,” ed. Peter Naur, Numerische Mathematik 2, no. 1 (1960): 128–29.↩︎
Perlis, “The American Side of the Development of Algol,” 10.↩︎
Jean E. Sammet, Programming Languages: History and Fundamentals, Series in Automatic Computation (Englewood Cliffs, N. J.: Prentice-Hall, 1969), 193.↩︎
Backus et al., “Report on ALGOL 60,” 114.↩︎
Peter Naur, “The European Side of the Last Phase of the Development of ALGOL 60,” in HOPL-1: The First ACM SIGPLAN Conference on History of Programming Languages (New York, NY, USA: ACM Press, 1978), 23, doi:10.1145/800025.808370.↩︎
Backus et al., “Report on ALGOL 60,” 130–31.↩︎
Perlis, “The American Side of the Development of Algol,” 11–12.↩︎
Heinz Rutishauser, Description of ALGOL 60, ed. F. L. Bauer et al., vol. 1, Handbook for Automatic Computation (Berlin: Springer-Verlag, 1967), 7.↩︎
Naur, “European Side of Development of ALGOL 60,” 41.↩︎
Perlis, “The American Side of the Development of Algol,” 12.↩︎