"Bloody Stupid Decision": The History of C



Believe it or not, C was not born as a tattered paperback manual.



In one form or another, the C language has influenced virtually every programming language developed since the 1980s. Some languages, like C++, C#, and Objective-C, were meant to be direct descendants of C, while others simply adopted and adapted its syntax. A programmer who has switched from Java, PHP, Ruby, Python, or Perl will have almost no difficulty understanding simple C programs, and in this sense C can be thought of as the lingua franca of programmers.



However, C did not emerge out of nowhere as a single programming monolith. The history of C begins in England, with a colleague of Alan Turing and a checkers program.



God Save the King



Christopher Strachey has been called “the person who wrote perfect programs,” according to a long profile of him in the journal Annals of the History of Computing. He earned this reputation at the University of Manchester Computing Center in 1951. Strachey ended up there, working on the university's Ferranti Mark I computer, through his acquaintance with Alan Turing from King's College, Cambridge.



Strachey was born in 1916 to a well-connected British family - his uncle, Lytton Strachey, was one of the founders of the Bloomsbury Group, and his father, Oliver Strachey, was instrumental in the Allies' code-breaking operations in both world wars.



Strachey's emergence as a recognized expert in programming and computer science would have come as a surprise to his private school teachers and Cambridge professors. Strachey always showed a talent for the sciences, but rarely applied himself.



If he had any hopes for a career in scientific research, he was dealt a severe blow by the unremarkable results of his final exams. So instead, during World War II, Strachey worked for a British electronics firm and later became a school teacher, eventually ending up at one of London's most prestigious private schools, Harrow.



In 1951, Strachey first got the chance to work with computers when he was introduced to Mike Woodger of the UK's National Physical Laboratory. After spending one day of his Christmas vacation exploring the lab's Pilot ACE computer, he used his spare time at Harrow to figure out how to teach a computer to play checkers. According to Martin Campbell-Kelly, later a colleague of Strachey, "anyone with more experience and less confidence would have been happy with just a table of squares."



The first attempt did not bear fruit: the Pilot ACE simply didn’t have enough memory to play checkers. But it did reveal one of Strachey’s interests that would prove critical to the development of the languages that eventually led to C. At a time when computers were prized primarily for their ability to solve equations quickly, Strachey was more interested in their ability to perform logical tasks (as he later said in 1952 at an Association for Computing Machinery meeting).



That spring, he learned that a Ferranti Mark I computer had been installed at the University of Manchester. Alan Turing was the assistant director of the university's computer lab. Turing had written the programmer's handbook for the machine, and Strachey knew him well enough from their time at Cambridge to ask for a copy.



In July 1951, Strachey visited Manchester and discussed his checkers program with Turing in person. Impressed, Turing suggested that, as a first step, he write a program that would let the Ferranti Mark I simulate itself. Such a simulator would allow programmers to observe, step by step, how the computer executed a program. A "tracing" program of this kind would reveal the parts of a program that created bottlenecks or ran inefficiently. At that time, both memory and processor cycles cost a fortune, so this was an important aspect of programming.
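To make the idea of a tracing simulator concrete, here is a minimal sketch in C. It is not a reconstruction of Strachey's program: the four-instruction machine, the names trace_run, struct insn and the OP_ constants are all invented for illustration and have nothing to do with the real Ferranti Mark I instruction set. The function interprets a toy program one instruction at a time and prints a line for every step, which is what lets a programmer see where the time goes.

```c
#include <stdio.h>

/* Hypothetical toy instruction set, invented for this sketch.
   It is NOT the Ferranti Mark I instruction set. */
enum op { OP_LOAD, OP_ADD, OP_JNZ, OP_HALT };

struct insn {
    enum op op;
    int arg;
};

/* Interprets prog, printing one trace line per executed instruction.
   Stops at OP_HALT, when pc leaves the program, or after max_steps.
   Returns the number of instructions executed and stores the final
   accumulator value in *acc_out. */
int trace_run(const struct insn *prog, int len, int max_steps, int *acc_out)
{
    static const char *names[] = { "LOAD", "ADD", "JNZ", "HALT" };
    int pc = 0, acc = 0, steps = 0;

    while (pc >= 0 && pc < len && steps < max_steps) {
        struct insn in = prog[pc];

        /* The trace: state of the machine before the instruction runs. */
        printf("step %d: pc=%d %s %d acc=%d\n",
               steps, pc, names[in.op], in.arg, acc);
        steps++;

        if (in.op == OP_HALT)
            break;

        switch (in.op) {
        case OP_LOAD: acc = in.arg;  pc++; break;   /* acc := constant      */
        case OP_ADD:  acc += in.arg; pc++; break;   /* acc := acc + constant */
        case OP_JNZ:  pc = (acc != 0) ? in.arg : pc + 1; break; /* loop   */
        default: break;
        }
    }

    *acc_out = acc;
    return steps;
}
```

Running it on a four-instruction countdown loop (LOAD 3, ADD -1, JNZ 1, HALT) prints eight trace lines, six of them inside the loop. Even on this toy scale, the trace shows at a glance which part of the program consumes most of the machine's cycles.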



Strachey's tracing program contained more than a thousand instructions - at that time it was the longest program ever written for the Ferranti Mark I. After working all night, Strachey got it running, and upon completion the program, according to Campbell-Kelly, played the national anthem, "God Save the King," through the computer's speaker.



This amateur accomplishment caught the attention of Lord Halsbury, Managing Director of the National Research and Development Corporation, who soon hired Strachey to manage the government's efforts to promote practical applications of the computer science work that was growing rapidly at British universities.



It was in this position that he learned about a Cambridge project run by a trio of programmers named David.



David and Goliath Titan



The University of Cambridge Computing Center had a strong focus on serving scientists. The first computers in the Mathematical Laboratory, EDSAC and EDSAC 2, were made available to researchers throughout the university. They wrote programs that were punched onto paper tape and fed into the machine.



In the computer center, these punched tapes were clipped to a clothesline and run one after another during working hours. This line of waiting programs came to be known as the “job queue,” a term still used today to describe much more complex ways of organizing computational tasks.





At 6:55, you can see the EDSAC "job queue".



Just two years after EDSAC 2 went into operation, the university realized that a much more powerful machine would soon be needed, and that it would have to buy a commercial mainframe to get one. The university considered the IBM 7090 and the Ferranti Atlas, but could not afford either machine. In 1961, Ferranti department manager Peter Hall proposed developing a stripped-down version of the Atlas jointly with the University of Cambridge. Cambridge would get a prototype called Titan, and Ferranti would be able to sell the new computer to customers for whom the Atlas was too expensive.



To provide computing services to the rest of the university, this computer would need both an operating system and at least one high-level programming language.



Almost immediately, the idea of extending the language developed for EDSAC 2 was abandoned. “In the early 1960s, it was generally thought that 'we are building a new computer, so we need a new programming language,'” recalled David Hartley in a 2017 podcast. Together with David Wheeler and David Barron, Hartley took part in the early stages of developing a new programming language for this computer.



“Creating a new operating system was inevitable,” Hartley said, but not a new programming language. “We thought we had an opportunity to experiment with a new language. In retrospect, it seems like it was a damn stupid decision.”



Maurice Wilkes, who led the Titan project, believed that there was no need for a new programming language. Titan's main intended purpose was to provide computing services to the University of Cambridge, and the best way to do that was to get the machine up and running as soon as possible, equipped with a language its users already knew.



Before approving a proposal to develop a new language, Wilkes requested an analysis of existing programming languages. “We chose them very carefully,” recalls Hartley, “so that he would decide that none of them was appropriate.” Interestingly, when the working group evaluated Fortran IV, it did not even consult Fortran users at Cambridge who could have described the additional features available in other Fortran flavors. Because of this, as Hartley recalls, the group was convinced that "we can design and develop something much better," but after a few years the venture ended in failure.



The trio had prepared an article by June 1962 arguing for the need for a new language, and, as Hartley says, "we got away with it."



This new programming language was called CPL (Cambridge Programming Language), and by 1963 work on it was well underway. The Cambridge programmers were joined by John Buxton and Eric Nixon of the University of London, and the CPL acronym now stood for Combined Programming Language. The project grew, and Wilkes decided to bring in Christopher Strachey to lead it. Soon, according to Campbell-Kelly, people associated with the project began to read CPL as "Christopher's Programming Language."



The group of language researchers met in Cambridge or in London, sometimes at the University of London and sometimes in the studio of the Kensington townhouse where Strachey lived with his sister. The room at the back of the house was furnished with Victorian chairs, cushions lay on the floor, and the walls were adorned with portraits of members of the Bloomsbury Group painted by one of Strachey's relatives. It was there that Strachey "held court", sometimes wearing a dressing gown, and, as David Barron recalled a few years later, "we argued about how to change the world for the better, and in the evening we went home."



By that time, David Wheeler had moved on to other projects, leaving five of his colleagues: Hartley, Barron, Buxton, Nixon and Strachey.



Hartley loved working on the CPL: “It was actually pretty interesting work,” he recalls. The meetings were rather informal. "We discussed various topics very heatedly and over time even started throwing paper airplanes at each other."



The group started with the ALGOL 60 specification, with the goal of writing an “ideal” language that would be practical for different kinds of users, but at the same time aesthetically pleasing and efficient.



Difficulties with prioritization began almost immediately. As David Barron said of Strachey, "it was typical for him to defend his point of view on minor differences with the same force as on important points." One such minor controversy was Strachey's objection to the grammar of the IF ... THEN ... ELSE construct. “I cannot allow my name to be associated with a recommendation to use ungrammatical English,” was his position, according to Hartley's memoir in the Annals of the History of Computing. Strachey preferred "OR", which contradicted the way "OR" was used in almost every other language in existence. However, his preference won out, and in the CPL reference manual OR is used where users would expect ELSE.





The CPL manual, which can of course be found online.



Valuable time was also spent devising a way to avoid using an asterisk to denote multiplication. Here, aesthetic considerations led to complications that slowed down the implementation of a practical programming language, because complex rules had to be developed to distinguish "3a" meaning "3 * a" from "3a" as a variable name.



Meanwhile, Cambridge users were increasingly irritated by the lack of a practical programming language for the new Atlas computers. The language specification was mostly finished, but a compiler did not yet exist. The working group had made CPL so complex that early attempts at writing a compiler in machine code were incredibly inefficient.



Bootstrapping



In 1965, Strachey left for the Massachusetts Institute of Technology (MIT) for several months, and upon his return became director of the Programming Research Group at Oxford. In the meantime, Martin Richards joined the CPL project in Cambridge. He set about developing a stripped-down version of CPL that users could actually work with. This language, BCPL (the "B" stands for "Basic"), needed an efficient compiler.



While at MIT, Strachey helped Martin Richards get a two-year sabbatical at the institute, and in 1966 Richards took BCPL with him to Massachusetts where he was able to work on a compiler.



BCPL is a "bootstrapped" language, because its compiler is capable of compiling itself. In essence, a small part of the BCPL compiler is written in assembly or machine code, and the rest of the compiler is written in a corresponding subset of BCPL. The part of the compiler written in BCPL is fed to the part written in assembly, and the resulting compiler can then be used to compile any program written in BCPL.



Bootstrapped compilers greatly simplify the process of porting a language from one computer or operating system to another. To get the compiler running on a different computer, it is enough to rewrite the one relatively small part of it that is written in machine-specific code.



While Richards worked at MIT on the BCPL compiler, the institute began participating in the Multics project with Bell Labs and GE. To support the project, a network was created connecting MIT and Bell Labs.



While the network connection between the two organizations was nominally meant to make Multics easier to work with, Ken Thompson told us that it was socially acceptable to "roam" the MIT mainframes looking for other projects, and that is how he found the BCPL code and documentation. He downloaded them to the Bell Labs mainframe and started working on them. “I was looking for a language to do the things I wanted,” Thompson recalls. "Programming in this era of proprietary operating systems on huge machines was very difficult."



Around the time of Thompson's experiments with BCPL, Bell Labs withdrew from the Multics consortium, and Thompson's computer science department temporarily ran out of computers.



When the department finally got a computer, it was a used PDP-7, not particularly powerful even by the standards of that era. Nevertheless, Thompson was able to create and run the first version of Unix on this machine.



The PDP-7 had 8,192 "words" of memory (a word in this case was 18 bits - the industry had not yet standardized on the 8-bit "byte"), which comes to about 18 KB in modern terms. Unix took up roughly the first 4,000 words, leaving roughly 4,000 for running programs.



Thompson squeezed his copy of BCPL (which was itself a stripped-down CPL) to fit into the 4,000 words of free memory on the PDP-7. While paring it down, he borrowed from a language he had encountered as a student at the University of California, Berkeley. That language, SMALGOL, was a subset of ALGOL 60 designed to run on less powerful computers.



The language Thompson ended up using on the PDP-7 was, by his own description, “BCPL semantics with much of the SMALGOL syntax,” meaning it looked like SMALGOL and worked like BCPL. Since this new language consisted only of those aspects of BCPL that Thompson found most useful and fit within the PDP-7's limitations, he decided to shorten the name of BCPL to just "B".



Thompson had previously written Unix for the PDP-7 in assembler, but even in 1969 it was not an ideal way of building an operating system. On the PDP-7, it worked because the computer was pretty basic and Thompson wrote the operating system mostly for fun.



Any operating system that needed to be maintained and updated by multiple programmers, or run on more sophisticated hardware at the time, needed to be written in a high-level programming language.



So when Bell Labs acquired a PDP-11 for the department in 1971, Thompson decided it was time to rewrite Unix in a high-level programming language. He talked about this with Brian Kernighan on stage at the Vintage Computer Festival (VCF) in 2019.



At the same time, Dennis Ritchie took up B and adapted it to run on more powerful computers. One of the first features he added back to B was the ability to "type" variables. Since the PDP-7's memory consisted of 18-bit words, B could be kept simple by treating each variable as either a single memory word or a sequence of words referenced by its location in memory. The language had no fixed-point or floating-point numbers, no integers, and no strings. As Thompson said, "It was all just words."



This approach worked well on a simple machine with very little memory and a small user base, but on more complex systems with complex programs and many users it led to ad hoc ways of determining whether a variable held a string or a number, as well as to inefficient memory use.
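The "it was all just words" model can be imitated in modern C, as a rough sketch. This is C, not B, and the names word, word_add and word_index are made up for the illustration; intptr_t stands in for an untyped machine word. The same value is treated as a number by one function and as the address of a sequence of words by another, and nothing in the declarations says which use is correct.

```c
#include <stdint.h>

/* Stand-in for one untyped machine word (an assumption of this
   sketch; B on the PDP-7 used 18-bit words). */
typedef intptr_t word;

/* Treats both words as numbers. */
word word_add(word a, word b)
{
    return a + b;
}

/* Treats w as the address of a sequence of words and fetches
   element i. Nothing stops a caller from passing a plain number
   here; the language cannot tell the difference. */
word word_index(word w, word i)
{
    return ((word *)w)[i];
}
```

A caller can store the address of an array in a word and index through it, or add two words as numbers, with no help from the compiler in telling the two uses apart. Typed declarations such as int and char * in C move that bookkeeping from the programmer's head into the language.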



According to the second edition of the History of Programming Languages by Thomas Bergin and Richard Gibson, Ritchie named this modified language NB ("New B"). It was installed on the mainframe of the Murray Hill Computing Center, which made it available to users throughout Bell Labs.



Naturally, when Thompson decided to rewrite Unix in a high-level language, he started with NB. The first three attempts ended in failure, and "because of my selfishness, I blamed the language for it," Thompson recalled with a grin at VCF.



After each failure, Ritchie added features to NB that he and Ken needed. Once he had added structures - variables that store many individual values in a coherent, or "structured," way - Thompson was able to write Unix in the new language. Ritchie and Thompson considered the addition of structures, which did not exist in B, SMALGOL, BCPL, or CPL, a significant enough change to rename the language, and B became C.
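As a small illustration of what structures provide, here is a sketch in present-day C. The record layout and the names file_entry and make_entry are hypothetical, chosen only for the example; they are not taken from any Unix source. The point is that several related values of different types travel together under one name instead of being scattered across separate untyped words.

```c
#include <string.h>

/* Hypothetical record, for illustration only; not from Unix source. */
struct file_entry {
    char name[14];   /* text  */
    int  size;       /* number */
    int  owner;      /* number */
};

/* Builds one entry, copying at most 13 characters of the name and
   always leaving it NUL-terminated. */
struct file_entry make_entry(const char *name, int size, int owner)
{
    struct file_entry e;

    strncpy(e.name, name, sizeof e.name - 1);
    e.name[sizeof e.name - 1] = '\0';
    e.size = size;
    e.owner = owner;
    return e;
}
```

A call such as make_entry("notes", 120, 7) returns a single value whose fields are read as e.name, e.size and e.owner, and the compiler knows the type and position of each one.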



Life under the C sign





Christopher Strachey in 1955 with Ferranti Mark 1 (also known as Manchester Electronic Computer).



C escaped from Bell Labs in much the same way as Unix did. It helped that the PDP-11 quickly became one of the most successful minicomputers on the market, but what made Unix most attractive was its price tag. Any computing center with a PDP-11 could install Unix on it for the cost of the media and postage, and C came with Unix. C's presence on so many campuses was, according to Thompson, the main reason for its success. In a letter, he wrote that "they graduated a lot of guys who knew C".



A further boost to C's success was the publication in 1978 of The C Programming Language by Brian Kernighan and Dennis Ritchie. The small book of 228 pages was, even by the standards of that time, a remarkably simple and accessible work.



Widespread knowledge of C syntax influenced the development of many later languages, including ones that otherwise bear little or no resemblance to C. Scripting languages like PHP and JavaScript contain bits of programming notation that Thompson originally devised to fit B into the small memory of the PDP-7. Examples include the "++" increment and "--" decrement operators. When you only have 4,000 words to work with, reducing "x = x + 1" to "x++" saves a fair amount of space.
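For readers who have not met these operators, here is a short sketch in C showing them in the compact style they made possible. The function names count_chars and countdown_sum are invented for the example.

```c
/* Counts the characters in a NUL-terminated string.
   *s++ reads the current character, then advances the pointer. */
int count_chars(const char *s)
{
    int n = 0;

    while (*s++)
        n++;            /* same effect as n = n + 1 */
    return n;
}

/* Adds x + (x - 1) + ... + 1 by counting x down to zero. */
int countdown_sum(int x)
{
    int sum = 0;

    while (x > 0) {
        sum += x;
        x--;            /* same effect as x = x - 1 */
    }
    return sum;
}
```

Here count_chars("Unix") yields 4 and countdown_sum(4) yields 10. The same loops written with "n = n + 1" and "x = x - 1" behave identically; the operators are purely a shorter notation.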



Thompson never thought C would become so widespread. “There were many completely different, very interesting and useful languages around,” said Thompson. “My aesthetic sense told me that one language cannot cover all the needs in the universe.”



Like Unix, C was a success born of failure. In both cases, programmers took the best parts of projects that were doomed because too much was asked of them. Multics, which gave birth to Unix, ran on only about 80 computers at its peak, and CPL, which eventually led to the creation of C, was never completed; the Cambridge researchers abandoned it in 1967.



When Christopher Strachey started the programming research group at Oxford, he said, "dividing work into practical and theoretical is artificial and harmful."

Although the CPL was supposed to combine practical and theoretical sides, it turned out to be too theoretical in implementation. The CPL working group did not attempt to program in the CPL.



However, even if Strachey could not achieve a synthesis between theory and practice in the CPL, his attitude was definitely correct. "C was written to write Unix," Thompson recalls. And "Unix was written so that we could all write programs."





