Time of Troubles. The history of processors based on the Intel NetBurst architecture. Part 1



On November 20, 2000, a long-awaited event took place: Intel officially presented the new generation of Pentium processors, the Pentium 4 on the Willamette core. The first mention of this toponym (by tradition, Intel gives its products "geographic" code names) dates back to 1996, and some specifics appeared two years later, in the fall of 1998, when the final announcement was supposed to take place according to the initial plans.



The processor was conceived as a further development of the P6 microarchitecture, as even the internal designation of this version, P68, suggested. It was supposed to become a transitional link to the Merced processors based on the new EPIC instruction set (explicitly parallel instruction computing). These plans were not destined to come true. Processors of this generation were certainly commercially successful, but opinions about them remain sharply divided, and disputes among historians of technology and retro-computing enthusiasts have not subsided to this day.



This article continues a series on the history of processors and their platforms; we have already covered the period from the appearance of the Pentium to the last versions of the Pentium III. If that period can be considered the "Golden Age", then the "Time of Troubles" now begins. Competition is escalating, and Intel is making the wrong bet. Time machine: ready, steady... GO!



Lumpy first pancake?



Shortly before the announcement of the Pentium 4, Intel lost the "gigahertz race" to its main competitor AMD, coming in 2 days late (on paper). Intel also failed to get much beyond a gigahertz: the Pentium III 1133 MHz processors (Coppermine, not to be confused with the later Tualatin) were withdrawn from the market due to unstable operation. The 1100 MHz model (with a 100 MHz bus) did go into production, but only in minimal quantities, and its 25% lower bus bandwidth meant that in some tasks it even lagged behind the 1000 MHz version with a 133 MHz bus. Athlon, meanwhile, reached 1200 MHz by the end of October.



Intel, in turn, had demonstrated prototypes back in February, running at a mind-boggling 1.5 GHz! Expectations were extremely high, and as far as frequency is concerned, they were fully met: the first models to launch ran at 1400 and 1500 MHz. But, as usual, there were many nuances. First, the new systems were extremely expensive, which is generally not surprising. The problem was that, in addition to the processor itself ($644 and $819, respectively), two other key components were also costly.



A motherboard based on the Intel 850 chipset, the only option at the time, supported dual-channel RDRAM memory and, because of the complexity of the layout, required an expensive eight-layer design, while the memory itself still cost many times more than ordinary SDRAM. As a result, the boxed versions of the processor shipped with two 64 MB modules in the kit.



Second, from the very beginning, the demonstrated level of performance was not as impressive as expected. The new microarchitecture received a pipeline twice as long, at 20 stages, but the branch prediction unit was still underdeveloped and often mispredicted, leaving the execution units idle. On the other hand, the ALUs (arithmetic logic units) ran at twice the core frequency, and a new instruction set, SSE2, appeared.



The first-level cache was implemented in an interesting way: in place of the part responsible for caching instructions, there was a cache of already decoded micro-operations. Moreover, unlike its predecessors, the new Pentium 4 no longer supported multiprocessor operation. As a result, the new processors performed very well in multimedia tasks, especially those related to data encoding, and relatively modestly in most others.



Rambus Strikes Back



On the technical side, the strong point of the new platform was the chipset, the Intel 850 "Tehama", which in its first iteration worked in tandem with the ICH2 south bridge (a hub, in Intel's official terminology), familiar from the Intel 815. It did not receive any fundamentally new functionality apart from support for the Pentium 4, but everything related to this key function was done properly!






First, instead of the usual GTL / AGTL bus, the new processors used the new QPB (Quad Pumped Bus), clocked at "only" 100 MHz but with an effective data transfer rate equivalent to AGTL at 400 MHz. From then on, up to the last Core 2 models, Intel quoted the effective bus frequency in its processor designations.



The bus was the bottleneck of the Pentium III: all data exchange with memory went through it, which is why there was not much point in using faster memory in desktop systems based on it. Only heavy I/O could fully load the two-channel, let alone four-channel, memory controllers of the Intel 840, Intel Profusion and ServerSet III. For this reason, even workstations often used the simpler Intel 820 and VIA 694D chipsets.



That is why Rambus memory was almost useless for the Pentium III. With the Pentium 4, RDRAM got a second chance. The Intel 850 chipset supported up to 2 GB of dual-channel memory (as in the i840) with an effective frequency of 800 MHz. Its bandwidth was 3200 MB/s, exactly matching the capabilities of the bus.



The combination was perfect but expensive. As mentioned above, because of memory prices Intel had to bundle two modules of the minimum size with the boxed Pentium 4. Quite quickly, by mid-2001, a more affordable solution arrived: the Intel 845 "Brookdale", which instead of RDRAM supported... PC133 SDRAM. Much of the early Pentium 4's bad reputation is owed to it.






The chipset itself was not so bad: it was reliable and stable. But the mismatch between memory bandwidth and the processor's requirements crippled performance from the start, with a drop averaging about 20 percent. The insufficient memory speed was aggravated by the small size of the Willamette cache, the same 256 KB as in the old Coppermine.



A compromise could have been DDR memory, in the DDR266 variant already used in VIA chipsets for the AMD K7 platform. Its bandwidth would be one third lower than that of the system bus, but that is better than three times lower. However, Intel at that time was bound by an agreement with Rambus, under which it pledged not to release chipsets for other types of memory, with the exception of PC133 SDRAM and slower.
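The bandwidth figures in this section follow from simple arithmetic, sketched below in Python. The bus and channel widths (64 bits for the processor bus and for SDRAM / DDR modules, 16 bits per RDRAM channel) are standard values assumed for illustration rather than figures quoted in the article, the function name is just a label for this sketch, and the results are theoretical peaks, not measured throughput.

```python
# Peak theoretical bandwidth = width in bytes * transfers per second * channels.
# All figures are nominal peak values, not measured throughput.

def peak_bandwidth_mb_s(width_bits: int, mega_transfers: float, channels: int = 1) -> float:
    """Return peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return width_bits / 8 * mega_transfers * channels

# Pentium 4 system bus: 64 bits wide (assumed), 100 MHz clock, four transfers per clock.
qpb = peak_bandwidth_mb_s(64, 100 * 4)

# Memory options discussed in the text; the widths are assumed standard values.
memory = {
    "dual-channel PC800 RDRAM": peak_bandwidth_mb_s(16, 800, channels=2),
    "DDR266": peak_bandwidth_mb_s(64, 266.67),
    "PC133 SDRAM": peak_bandwidth_mb_s(64, 133.33),
}

for name, bw in memory.items():
    print(f"{name:26s} {bw:7.0f} MB/s  ({bw / qpb:.0%} of the bus)")
```

Under these assumptions, dual-channel PC800 RDRAM matches the bus exactly, DDR266 covers about two thirds of it, and PC133 SDRAM only about one third.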



But VIA got ahead of the game: the P4X266 chipset came out almost simultaneously with the Intel 845. There was only one problem: VIA had no license from Intel to produce chipsets with QPB support. And it had none not because it was unwilling to pay, but because Intel refused to grant one. A serious scandal broke out, which greatly slowed the spread of motherboards based on the alternative chipset. In the end, of course, VIA received a license, but only after Intel had released DDR chipsets of its own.





From the Digital Vintage collection: SERVERGHOST Catalina P7/SE on the Intel D850GB «Garibaldi» motherboard (Socket 423, RIMM memory). Configuration:



  • Intel Pentium 4 1400 MHz (Socket 423)
  • Intel D850GB «Garibaldi»
  • 1 — PC800 RIMM ECC
  • Nvidia GeForce 2 GTS 32 MB
  • 40 GB IDE hard drive
  • 50x CD-ROM
  • InWin S500


Operating system: Windows Millennium Edition.





In 2001, to try out the 130 nm process technology on a less complex product, Intel released updated Pentium IIIs based on the Tualatin core. Mobile and server versions got 512 KB of cache, while desktop versions had it cut in half, since the full version turned out to be too powerful. The Celeron (nicknamed "Tualeron" by enthusiasts) became a hit: with 256 KB of cache and a 100 MHz bus, it barely lagged behind the desktop Pentium version and overclocked superbly, and thanks to the low nominal bus frequency, overclocking demanded nothing special of the motherboard. Run on a 133 MHz bus, the Celeron 1200 reached 1600 MHz, and its performance matched, and sometimes even exceeded, that of the Pentium 4 1.8 GHz (except for tasks that benefit from NetBurst features, of course). The Pentium 4's reputation was tarnished.
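The 1600 MHz figure is just the bus clock times the multiplier. Here is a minimal Python sketch, assuming the multiplier stays fixed at the value implied by the nominal clock (1200 MHz on a 100 MHz bus) and that the "133 MHz" bus is really 133.33 MHz; the function name is only illustrative.

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock is the bus clock times the multiplier."""
    return bus_mhz * multiplier

# Celeron 1200: 1200 MHz on a 100 MHz bus implies a multiplier of 12
# (assumed to stay fixed when the bus is overclocked).
multiplier = 1200 / 100

stock = core_clock_mhz(100, multiplier)
overclocked = core_clock_mhz(400 / 3, multiplier)  # "133 MHz" bus = 133.33 MHz

print(f"stock: {stock:.0f} MHz, overclocked: {overclocked:.0f} MHz")
print(f"gain: {overclocked / stock - 1:.0%}")
```

Raising only the bus clock therefore lifts the core clock by a third, along with the bus bandwidth.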



Let's make a caveat right away: as a rule, the comparison involved a Pentium 4 paired with PC133 memory, and this, of course, significantly influenced the results. But even with the Intel 850, the Willamette would not have fully redeemed itself: the Celeron would have matched a slightly lower-clocked processor and still would not have lost outright. So the fact remains that a cheap "last-generation Celeron" was being compared with the newest flagship.



Despite the hype, this story did not greatly affect Pentium 4 sales, but it sealed the fate of the Tualatin. Despite the large frequency headroom of the new process technology, the development of Tualatin stopped at around 1400 MHz, although most early-stepping processors reached 1600-1700 MHz when overclocked, and some later ones went as high as 2000 MHz. However, certain conclusions were drawn, and their name was Pentium M.



Socket change



Although this time the processor and the chipset finally appeared at the same time, Intel could not resist its second "favorite" trick. As with the first Pentium, the original Socket 423 was planned to have a very short life, and this was known already at the time of the processor's release. Only Pentium 4 processors on the Willamette core were produced for it.



The last model for the socket was the 2 GHz part announced on August 27, 2001, which was presented in both the Socket 423 and the new Socket 478 versions simultaneously. This time, Intel got ahead of AMD, which was soon forced to introduce performance ratings to label its Athlon XP. The heat dissipation of the top Pentium 4 model reached 100 W.



The dimensions of Socket 478 exactly match those of the BGA chip that sat on the carrier of the "old" Socket 423 processor: Intel simply removed the "extra" link, the intermediate board to which the chip was soldered. The new socket was also called uPGA because of the reduced pitch and diameter of the pins, which became more fragile.



The new socket also brought a greater variety of processors. First, the earlier models, starting from 1.5 GHz, were released for it, and already in January 2002 new Pentium 4s on the Northwood core appeared, manufactured on the 130 nm process and with the cache doubled to 512 KB. In May 2002 Willamette "returned" in the form of a pair of Celeron models at 1.7 and 1.8 GHz with 128 KB of cache. The return was short-lived: in September the Celeron moved to the Northwood core, although it did not get a larger cache.



From then on the Celeron was systematically cut down relative to the main line, which is most noticeable these days: in effect, the Celeron can hardly be called a full-fledged processor, as it was in the first years of its existence. Yet already in the days of Socket 478 it was mockingly called a "socket plug": its lag even behind a Pentium 4 of the same frequency was visible to the naked eye, especially on inexpensive chipsets with slow memory.



Block diagram of i845 series chipsets with DDR support



The new socket was installed on essentially the same motherboards. Only by the end of 2001 did the Intel 845D with DDR266 support appear, and the SiS 645, announced back in August with support for the then-advanced DDR333, became available. The ordinary 845 went on to aggravate the Celeron's lag: the combination of 128 KB of cache and PC133 memory often turned a computer into an instrument of torture, especially given the growing popularity of Windows XP, which was much more demanding on resources than Windows 98. It began to run well from 256 MB of RAM, but cheap computers often came with only 128 MB.



Rapid development



With the advent of Northwood, the frequency race did not stop; on the contrary, it got a second wind. AMD processors delivered similar performance at noticeably lower frequencies, but Intel insisted on the importance of frequency. In May, models with a 533 MHz bus appeared. Overall, 2002 was marked by the crossing of the 3 GHz mark: it was taken by the 3.06 GHz model with a 533 MHz bus, released on November 14.



It became the first desktop processor with support for Hyper-Threading technology, a proprietary implementation of SMT (Simultaneous Multithreading): the system saw one processor as two logical ones, and when two tasks used different execution units of the processor, performance increased noticeably. Real-world gains ranged from 5 to 10 percent, but the increase in the complexity and cost of the processor was negligible.



At the same time, new chipsets began to appear. At the top was the Intel 850E, which supported the 533 MHz bus and received the new ICH4 south bridge with support for six USB 2.0 ports. This was Intel's last solution for RDRAM. Rambus never managed to reduce the cost of its memory and lost the market. The 845D was replaced by the 845E with 533 MHz bus support, and soon by the 845PE, which could work with DDR333 memory. Integrated solutions based on it also appeared: the 845GE, the 845GV (without support for AGP video cards) and the 845GL (additionally limited to a 400 MHz bus).



VIA, which finally received a license, released a whole family of chipsets for processors with a 533 MHz bus (P4X266A, P4X333 and even P4X400) featuring support for new memory standards, DDR333 and DDR400. Integrated versions were also released: P4M266, P4M333 and P4M400. Unfortunately, because of the release delays, VIA missed out on a significant portion of the market and was unable to regain its former popularity.



VIA chipsets for the Athlon, however, remained the most popular and among the best for a long time. It was on the VIA P4X266A chipset that the last known Baby AT motherboard, the Commate P4XB, was built. In component layout and size it is very similar to a mATX board, but, as expected of Baby AT, it has no rear I/O port panel; most of the ports are brought out on brackets.



SiS chipsets turned out to be successful: thanks to their low cost, decent reliability and adequate performance, manufacturers of budget motherboards and pre-built computers took a liking to them. And while the discrete SiS 645 and 648 were merely popular, their integrated sibling, the SiS 650, became a real hit. It was used not only in desktop computers but also in a significant share of laptops: thanks to its low heat output and advanced power-saving technologies, it was well suited to mobile computers.





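From the Digital Vintage collection: the 845 family is represented by the SERVERGHOST Rotoscope P7 on the Intel D845GEBV2 «Brownsville 2» motherboard (845GE chipset), with a Pentium 4 2.8 GHz on a 533 MHz bus (Northwood core), two 80 GB IDE drives and a Radeon 9200.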





The arrival of the Pentium 4 in notebooks was significantly delayed; until 2002 the field was ruled by the Tualatin with 512 KB of cache, the Pentium III-m. Willamette was too hot for mobile use, and only the release of Northwood changed that. The thermal envelope of the mobile version was kept within 35 W, much more than the Pentium III but half that of the desktop versions. Even so, notebooks based on the Pentium 4-m, as the mobile version was named, were heavy and had short battery life.



The processors started from 1.4 GHz, but those versions are extremely rare; they became widespread from 1.6 GHz (March 2002) up to 2.4 GHz (January 2003). The maximum frequency was 2.6 GHz; that processor was released in April 2003, after the announcement of the Pentium M. The mobile Pentium 4 stayed at the top for only one year: in the spring of 2003 it was replaced by a heavily modified Pentium III. This was the beginning of the end for both the NetBurst microarchitecture and the frequency race...



Intel released a single chipset for the Pentium 4-m: the discrete Intel 845MP with support for a 400 MHz bus and DDR266 memory. Part of the market was taken by the integrated ATi Radeon IGP 330M chipset, created in cooperation with ALi. It made it possible to significantly reduce the cost of the finished system and at the same time cut power consumption, while providing adequate performance of the integrated graphics core and of the system as a whole.



However, even after the Pentium M was released, mobile Pentium 4s continued to develop: they were intended for large multimedia notebooks focused on working with video content, where NetBurst performs best. In fact, these were adapted desktop models; they differed from the main line primarily in staying on the 533 MHz bus even after desktop processors had switched to a faster one, and in supporting the Enhanced Intel SpeedStep (EIST) power-saving technology. But even with it, the thermal envelope reached 88 W in the top models!



Ordinary desktop processors were also widely used in laptops, and even giants such as Toshiba and IBM did not shy away from this. At that time, notebooks of the "desktop replacement" class were very popular, with a powerful processor, a large screen and a powerful video card. They often had very weak batteries or even did without them altogether (the so-called "desknotes"). Such machines were often based on SiS chipsets, the 645 and 648, and less often on the Intel 845MP.



Cheaper models were based mainly on two chipsets: the Intel 852GME (a simplified version of the 855GM / GME, which in turn is a highly energy-efficient version of the 845GE) and the SiS 650, much loved by budget laptop manufacturers. Chipsets from ATI also turned up occasionally.



At that time, locally assembled laptops were popular on the Russian market (in fact, OEM machines from Chinese manufacturers such as Clevo, Mitac and others). Most of them were based on SiS chipsets. In cheap laptops the SiS 650 was quite understandable, and in the middle segment the SiS 648 did not look entirely out of place either. But a huge 17" laptop for $3400 (the same money bought an IBM ThinkPad T40p!) with a processor faster than 3 GHz and a powerful video card, yet with the same SiS inside, a plastic case and a terrible keyboard, simply provoked rejection in any user with the slightest understanding.



The Digital Vintage collection began in 2008 as a laptop-only collection, so it contains many interesting laptops of the period. Here are some of them as examples:





IBM ThinkPad A31p: Intel Pentium 4-m 1700 MHz, ATI Mobility FireGL 7800 graphics, 15-inch IPS display at 1600x1200.





IBM ThinkPad T30: 14-inch display, Pentium 4-m 1900 MHz, TrackPoint.





IBM ThinkPad R40e: 14-inch display; Mobile Celeron, Pentium 4-m 2200 MHz; ATI Radeon IGP 330M chipset (ATI with ALi).





RoverBook Explorer E570 WH: Pentium 4 2.8 GHz, SiS 650 chipset, ATI Mobility Radeon 9000 graphics.





Let's get back to serious hardware and forget the RoverBooks like a bad dream. In the world of big computing, reliability and performance are what count; other solutions are used there, and this time a small "revolution" came from that direction. But more on that a little later; for now, let's return to Rambus.



The first Xeons (now simply Xeon, without the Pentium prefix) based on NetBurst were released in May 2001 under the code name Foster. In fact, these were the same Willamette with 256 KB of cache and frequencies from 1.4 to 1.7 GHz (a 2 GHz model was added later), but with support for dual-processor configurations and made for Socket 603.



In February 2002, they were replaced by processors based on the Prestonia core, an analogue of Northwood. In addition to the doubled cache, these processors received support for Hyper-Threading technology, which would reach desktop processors only at the end of the year. The first models ran at 1.8 to 2.2 GHz (400 MHz bus); later the frequencies reached 3.0 GHz (400 MHz bus) and 3.06 GHz (533 MHz bus), and Xeon LV processors with reduced power consumption were released at 1.6 to 2.4 GHz. Processors with the 533 MHz bus received a "new" Socket 604; "old" processors could be installed in it, but not vice versa.



But these processors replaced only the Pentium III and the Pentium III Xeon with 256 KB of cache (for dual-processor solutions), not the full-fledged Cascades with 2 MB of cache (the last of which entered the market as late as 2001). Their successors appeared only in March 2002: the Xeon MP (Foster MP) processors, which supported up to 4 processors in one system and had 512 or 1024 KB of on-die L3 cache. Since Intel processors use an inclusive cache architecture (each level also holds the contents of the previous one), the effective cache size is not the sum of the caches but the size of the largest of them. The Foster MP processors also supported Hyper-Threading. Only three models were released: 1.4, 1.5 and 1.6 GHz.
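The point about inclusive caching can be illustrated with a deliberately simplified model in Python: in an inclusive hierarchy the smaller levels only duplicate data held by the largest one, while in an exclusive hierarchy the capacities add up. The 256 KB L2 used for Foster MP is an assumption carried over from the description of Foster above, and the function name is purely illustrative.

```python
def effective_cache_kb(levels_kb: list[int], inclusive: bool) -> int:
    """Unique data capacity of a cache hierarchy under a simplified model."""
    if inclusive:
        # Lower levels duplicate data already held by the largest level.
        return max(levels_kb)
    # Exclusive levels hold disjoint data, so the capacities add up.
    return sum(levels_kb)

# Foster MP with the larger L3: L2 = 256 KB (assumed), L3 = 1024 KB.
foster_mp = [256, 1024]

print("inclusive:", effective_cache_kb(foster_mp, inclusive=True), "KB")
print("exclusive:", effective_cache_kb(foster_mp, inclusive=False), "KB")
```

Real hierarchies are not always strictly inclusive or exclusive, so this is only a rough way to read cache sizes off a spec sheet.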



At the end of 2002, Gallatin processors replaced the Foster MP. They also used three levels of cache, with cache sizes ranging from 1 MB to 4 MB and frequencies from 1.5 to 3.2 GHz. Most of these processors were sold as Xeon MP (with a 400 MHz bus), but there were also models with a 533 MHz bus for dual-processor systems (Xeon DP).



Foster and early Prestonia ran on motherboards based on the Intel 860 "Colusa" chipset, essentially an analogue of the desktop 850 but with support for dual-processor systems and the option of adding MRH-R chips, which double the number of memory banks on each channel; the chipset thus supports up to 8 slots and up to 4 GB of RAM. A P64H bridge can also be installed, adding a PCI64 bus or two additional PCI32 buses. The chipset works only with the 400 MHz bus and uses the ICH3 south bridge, which differs from the ICH2 in supporting up to 6 USB 1.1 ports. No version for the 533 MHz bus was ever offered.



With DDR chipsets for workstations and servers, however, Intel clearly tried to make up for its lateness in the desktop market! The variety is simply amazing:



  • E7500 «Plumas» (2-4 processors, 400 MHz bus, DDR200)
  • E7501 «Plumas» (2-4 processors, 533 MHz bus, DDR266)
  • E7505 «Placer» (2 processors, 533 MHz bus, DDR266, AGP)
  • E7205 «Granite Bay» (1 processor, 533 MHz bus, DDR266, AGP 8x)


Intel E7505 Chipset Block Diagram



Note that all of these chipsets use a dual-channel memory controller running synchronously with the processor bus. As a result, latency is minimal, and memory bandwidth is ideally matched to the needs of the processor bus. The E7500 / E7501 / E7505 also support the 64-bit version of the PCI bus through an optional bridge, which technically can be attached to the E7205 and even to desktop chipsets, as time would show. And here is a spoiler for the second part: Granite Bay, replacing the Intel 850E, would give rise to new chipsets for the Pentium 4, after whose release not a trace remained of the unfortunate reputation of the early processors.
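Why synchronous dual-channel DDR fits the quad-pumped bus so neatly can be checked with a short Python sketch: at the same base clock, the bus makes four transfers per cycle, while two DDR channels make two transfers each. The 64-bit widths of the bus and of each memory channel are assumed standard values rather than figures from the list above, and the function names are only illustrative.

```python
BUS_BYTES = 8      # 64-bit processor bus (assumed standard width)
CHANNEL_BYTES = 8  # 64-bit DDR channel (assumed standard width)

def bus_bandwidth_mb_s(base_clock_mhz: float) -> float:
    """Quad-pumped bus: four transfers per base clock."""
    return base_clock_mhz * 4 * BUS_BYTES

def memory_bandwidth_mb_s(base_clock_mhz: float, channels: int = 2) -> float:
    """DDR: two transfers per clock on each channel."""
    return base_clock_mhz * 2 * CHANNEL_BYTES * channels

# Base clock shared by the bus and the memory in synchronous mode.
base_clocks_mhz = {
    "400 MHz bus + DDR200": 100.0,
    "533 MHz bus + DDR266": 400 / 3,
}

for label, clock in base_clocks_mhz.items():
    bus = bus_bandwidth_mb_s(clock)
    mem = memory_bandwidth_mb_s(clock)
    print(f"{label}: bus {bus:.0f} MB/s, memory {mem:.0f} MB/s")
```

In both pairings the two peak figures coincide, which is the "ideal match" described above; a single DDR channel would supply only half of what the bus can carry.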



Block diagram of the Grand Champion HE chipset using the example of the HP ProLiant ML530 server



Intel also competed with the chipsets of ServerWorks, whose current series was called Grand Champion. The chipset was especially popular in the version for four-processor systems, although there were versions for simpler one- and two-processor machines. All of them were based on essentially the same set of chips, which could be extended or simplified to obtain a system of the required class. As before, even Intel itself produced boards and server platforms based on these chipsets. Unfortunately, Grand Champion became the last ServerWorks chipset: the company was soon bought by the giant Broadcom and for some unknown reason left the market.



There were also proprietary solutions, for example XA-32 and EXA from IBM, but they were not used outside that manufacturer's servers. These were a class above the Intel and ServerWorks offerings: up to 8 processors in the standard configuration, and up to 16 using NUMA. The chipset also provided an L4 cache.



In the NetBurst era, RAS technologies (reliability, availability and serviceability) developed rapidly: RAID arrays with hot-swap disks went from being a mark of high-end servers to being ubiquitous, and technologies appeared for hot-swapping, and sometimes hot-adding, memory (Chipkill), not to mention replacing expansion cards. At the same time, the decline of RISC reached its peak: old architectures departed one after another, while new ones hovered between life and death. Only IBM Power and the still inconspicuous but already ubiquitous ARM were doing well.



In the Digital Vintage collection, this period is represented by two interesting self-built systems:





SERVERGHOST Constellation X7/TE: Xeon 2.0 GHz (Prestonia) on a Tyan Thunder i860 EATX motherboard with 8 RIMM slots and MRH-R chips, Ultra160 SCSI, a 36 GB 10000 rpm SCSI drive and a Matrox Millennium G450 Dual Head video card.









SERVERGHOST Spectre X7/TE: a 1U Gigabyte GS-SR125E platform with Xeon 3.0 GHz (Prestonia), 36 GB SCSI drives in RAID, and the Intel E7501 chipset with a P64H bridge for the 64-bit PCI bus.



AMD



The history of NetBurst is a history of rivalry with AMD. The companies went neck and neck in the race for the first gigahertz, but luck was on AMD's side. Intel could not allow that a second time, and it took the second gigahertz. AMD could no longer keep up at the third, but that does not mean it gave up the fight. Its performance rating was a measure that at first provoked laughter.



One of the most popular Athlon XPs, the 2500+ model on the Barton core, actually ran at 1833 MHz. But the jokes ended when it became clear that this processor was on par with a Pentium 4 at 2400-2600 MHz. The last model, the Athlon XP 3200+, ran a whole gigahertz below its rating, yet did not lag behind its declared competitor!



But competing as an equal does not mean winning. Although AMD held up to 30% of the PC processor market at the time, a much more serious response was needed. In other segments AMD did not look convincing either: its processors were rarely used in notebooks, and the server Athlon MP had extremely limited popularity despite its advantages.



The answer came in April 2003, and it was a loud one. K8 is a 64-bit processor with an integrated dual-channel memory controller and a frequency of up to 2.4 GHz, supporting operation in eight-processor systems. The server version, which received the marketing name Opteron, came out first. A little later, in the fall, the desktop K8, the Athlon 64, was released. Even at frequencies below 2 GHz, they beat the 3 GHz Pentium 4 by a margin...



To be continued ...



Intel, knowing about the upcoming announcement, prepared as well: shortly before the release of Opteron it launched updated Pentium 4s with an 800 MHz bus and announced further updates. The next two years brought many revolutionary changes, many of which we still use today.



Stay tuned. In the second part the story continues with:



  • From servers to the desktop
  • Slow down to speed up
  • Cutting off the processor's pins
  • A new bus for the ages
  • Napoleonic plans
  • Two in one!
  • Changing course




