Personal computers have become much more reliable over the last 10 years or so, mostly due to the introduction of advanced operating systems with memory protection and hardware abstraction. The hardware itself has gotten better too; uncorrectable random errors are rare in PCs and extraordinarily rare in server-class systems.
These and other improvements have largely eliminated machine crashes. Blue-screen errors on Windows and kernel panics in Linux and Mac OS X still occur, but much more rarely.
Error-reporting services have become common, helping software developers figure out what went wrong. Most large developers now issue regular patches to fix newly discovered bugs, making systems more reliable between major releases.
All this progress is wonderful, of course, but our PCs still aren't reliable in the way that other consumer products are reliable. Machine crashes are still possible, and any bug can bring down an individual application.
Automobiles, for example, can fail in many ways, but they are still much more reliable than PCs. The risks associated with vehicle failures have been greatly reduced by decades of design refinements. Would you feel safe if PC technology controlled the steering and brakes in your car? Conversely, wouldn't you be more confident in your PC if you knew it was as reliable as your vehicle?
Can you rely on your system to display this 370-megapixel image?
(Credit: European Southern Observatory (ESO))
PCs are also fragile in response to change. I know I'm always a little nervous the first time I install a new device driver or run a new application. Even without software changes, opening an unusually large image can induce some trepidation. Consider this 370-megapixel image of the Lagoon Nebula available from the European Southern Observatory Web site; how confident are you that all of your image-viewing programs would survive the attempt to open it?
And worst of all, PCs are fragile in response to attack. The kinds of problems that are sometimes created accidentally by software bugs are relatively easy to create on purpose.
Minimizing the frequency and consequences of these problems would require tremendous effort from everyone in the industry. Almost every bit of PC hardware and software would have to change. One part of the solution is an extension of the same techniques that make today's PCs more reliable than older models: more hardware-based isolation of one function from another.
The minimal isolation of today's systems is very convenient for software developers, making it easier to write code and achieve high levels of performance. More isolation means more complexity and more overhead, but it improves reliability.
Developers are taking the first steps in this direction already, for example, with the process isolation features of the Microsoft Internet Explorer 8 and Google Chrome browsers. But there's much more that can be done.
Another way to improve reliability is to verify that data and addresses are consistent in range and format with the original intent of the software developer before they are used by the program. Making these checks in software can help; the incidence of failures related to accidental and deliberate buffer-overflow conditions has been dramatically reduced in this way. There's plenty of room for new hardware to help in this process too.
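To make the idea of range checking concrete, here is a minimal sketch in Python. The function name `checked_write` and the 16-byte buffer size are assumptions chosen only for this illustration. Python is already memory-safe, so this shows the shape of the check, not how a C program or a hardware mechanism would enforce it.

```python
# Hypothetical illustration: validate an offset and a length against the
# size of a fixed buffer before writing to it.
BUFFER_SIZE = 16  # assumed size, chosen only for this example


def checked_write(buffer: bytearray, offset: int, data: bytes) -> None:
    """Copy data into buffer at offset, refusing anything out of range."""
    if offset < 0 or offset > len(buffer):
        raise ValueError(f"offset {offset} is outside the buffer")
    if len(data) > len(buffer) - offset:
        raise ValueError(f"{len(data)} bytes will not fit at offset {offset}")
    # Both checks passed, so the slice is replaced by data of equal length
    # and the buffer keeps its original size.
    buffer[offset:offset + len(data)] = data


buf = bytearray(BUFFER_SIZE)
checked_write(buf, 4, b"hello")  # 5 bytes at offset 4: accepted

try:
    checked_write(buf, 12, b"too long")  # 8 bytes at offset 12: rejected
    overflow_rejected = False
except ValueError:
    overflow_rejected = True

print(bytes(buf[4:9]), overflow_rejected)
```

The check costs a couple of comparisons per write, which is the kind of overhead the preceding paragraphs describe trading for reliability.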
There's also work to be done in making it easier to recover from failures, since true hardware failures are inevitable. This is another area where some high-end systems are way ahead of the PC. Fault-tolerant machine architectures have been around for a long time in the aerospace industry, for example.
Historically, fault tolerance has never been practical on the PC because PCs always had only one of each critical subsystem: one processor, one bank of memory, one display channel. Today, PC processors and graphics chips have multiple cores and multiple memory interfaces, creating the potential for redundant operation where it's most needed.
Recoverability also implies backups--not just of the contents of disk drives, but even of the live data in memory through checkpointing. And disk backups can be improved too, by making the backup process an integral part of all disk I/O. Modern file systems use journaling to increase reliability; this technique can be extended to allow recovering from errors long after they occur.
There will be a heavy price to be paid in complexity and performance for all of these techniques, but the currency for this payment is transistors, and Moore's Law gives us more of those in every new process generation. We need to consider how we want to allocate these transistors. Over time, I believe reliability should account for an increasing portion of them.
After 19 months of consulting--in Silicon Valley, we prefer that term to "unemployment"--I've accepted a job.
Once I start, I'll have to stop blogging. But while I'm still independent, I'd like to wrap up here by offering a short series of articles addressing several key topics in the area of personal computing.
Today, the topic is energy efficiency.
Energy efficiency has become a major selling point of today's personal computers, especially laptops, because power consumption determines battery life.
Unfortunately, laptops are being optimized for energy efficiency in a way that isn't fully consistent with the needs of laptop users.
Advances in process technology and CPU design have greatly improved the power efficiency of modern microprocessors when they're running. This improvement is most visible at the highest performance levels.
Over the last few years, dual-core laptop processors have gone from maximum speeds of roughly 2.4GHz to 3.0GHz without consuming any more power. The newest quad-core chips provide much more aggregate performance in a similar power envelope.
This improvement in operating efficiency is great for gaming, mobile video editing, and a few other applications. But it's not very meaningful for most consumers.
What the rest of us need is non-operating efficiency: the ability of the laptop to consume very little power when it isn't doing much, because that's what our laptops are usually doing.
We need laptops that can do nothing--more efficiently.
I've been looking at the newest crop of ultra low-power laptops. Based on published benchmark data, they consume an average of 8W to 10W of power when doing essentially nothing (what we call "idle power"). Even the best of them consumes about 6W of power at all times, getting 10 hours of battery life from a 60Wh battery. Maybe 2W of that is spent keeping the display on. The other 4W to 8W is just wasted by the CPU and other motherboard circuitry.
When your laptop isn't doing much--for example, when you're typing in your word processor--it's using only slightly more CPU performance than your cell phone is when you're texting. Your cell phone consumes very little power to do this meager amount of work, usually no more than 0.25W or so for the CPU and its support chips. The corresponding elements of your laptop, however, may consume 50 times as much power under similar conditions.
Some of this difference is inevitable; your laptop has wider data buses, more and faster RAM, and so on. Nevertheless, your laptop motherboard could be designed to idle along on 1W or so.
That would give you a total system-level power consumption of around 3W--half the power of today's most energy-efficient laptops and about one-quarter the power of an average machine. Because there's a relationship between peak CPU speed and idle power, today's fastest laptops consume 20W or more at idle. With more energy-aware designs, these systems could see even greater proportional reductions.
In other words, adopting more aggressive methods for reducing idle power could easily double battery life across the board, and some systems would see much bigger improvements.
This is not merely a quantitative improvement. Consider what happens when your laptop can comfortably operate for 20 hours with the display on, or 60 hours with the display off.
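The battery-life figures in this post all come from the same division: capacity in watt-hours over average power in watts. Here is a small sketch of that arithmetic using the numbers quoted above (a 60-watt-hour battery, about 2W for the display, about 6W for today's best systems, and a hypothetical 1W motherboard). The function and constant names are my own, and the model assumes constant power draw and ignores conversion losses.

```python
BATTERY_WH = 60.0             # battery capacity cited in the text
DISPLAY_W = 2.0               # rough display power cited in the text
TODAY_BEST_W = 6.0            # best current idle power cited in the text
PROPOSED_MOTHERBOARD_W = 1.0  # the 1W idle target proposed in the text


def battery_hours(system_watts: float, battery_wh: float = BATTERY_WH) -> float:
    """Hours of runtime at a constant power draw (idealized model)."""
    return battery_wh / system_watts


today_best = battery_hours(TODAY_BEST_W)
proposed_display_on = battery_hours(PROPOSED_MOTHERBOARD_W + DISPLAY_W)
proposed_display_off = battery_hours(PROPOSED_MOTHERBOARD_W)

print(today_best, proposed_display_on, proposed_display_off)
# prints: 10.0 20.0 60.0
```

Under these assumptions, cutting the motherboard to 1W doubles runtime with the display on, and the display becomes the dominant consumer.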
For one thing, it never has to go to sleep. Your cell phone never really goes to sleep, and that's a great part of its value. Your laptop can have this same cell phone operating model.
Closing the lid should turn off the display, but the machine should keep running. It can stay connected to the Internet over Wi-Fi or 3G, periodically download your new e-mail messages, watch that eBay auction, and do whatever else you need it to do...all the time. Just plug it in to recharge while you're asleep. (If the laptop is in your briefcase, it'll have to slow down a lot to keep from consuming too much power, but that's easily managed.)
When you're ready to start using the machine actively again, turning the display back on shouldn't take any longer than physically opening the lid. Think "always on," not "instant on."
All of this is possible with today's technology, but nobody's doing it. I think one of the reasons we don't see this usage model is that laptop buyers don't know to ask for it. Incremental improvements produce adequate sales figures with each new laptop generation, and everyone figures that's good enough.
But mark my words: the first full-function laptop that works like a cell phone--always running, always connected, always ready--is going to hit the market like a sledgehammer. Everything else is going to seem obsolete overnight.
While we're all familiar with the steady increase in the number of cores in mainstream PC and server processors, the corresponding progress in the embedded-processor market has been anything but steady.
With mainstream PC microprocessors standardizing on four-core designs such as Intel's Core i7 and leading-edge server chips ranging from 8 to 16 cores, single-core chips are no longer competitive. For embedded systems, however, one core may still be the right answer; if more are needed, the choices range up into the hundreds.
The Tilera Tile-Gx100 combines 100 independent 64-bit integer processor cores and cryptographic accelerators with memory, network, and PCI Express interfaces.
(Credit: Tilera Corporation)
The latest announcement in the many-core embedded processor market is Tilera's Tile-Gx family, which combines 16 to 100 64-bit integer processor cores with cryptographic accelerators and off-chip interfaces for memory, networking, and PCI Express. I met with Tilera before last week's announcement to discuss the technical and business issues related to the Tile-Gx.
The technical details
San Jose, Calif.-based Tilera is eager to set itself apart from the many other chip companies competing in its target markets. Unlike most embedded processors with high core counts, for example, Tilera's design allows its cores to operate truly independently, even to the extent of running different operating systems if needed. More commonly, groups of tiles will be combined to run a single task that is part of a larger workload. In this way, one chip can operate like a cluster of multiprocessor systems.
Between this distinction and the fact that cores in the Tile-Gx family are a full 64 bits wide, Tilera claims the Tile-Gx100 is the "world's first 100-core processor." I think that's just a little too broad a claim, personally, since companies such as Clearspeed and Xelerated have previously made similar claims for their chips. Even more significantly, the Tile-Gx100 doesn't exist yet. It won't be a real product until early 2011, according to Tilera's current schedule.
Tile-Gx processors aren't something most CNET readers will ever knowingly use, though these chips will likely, eventually, help carry traffic over the public Internet and through larger corporate networks. But they do provide an excellent example of the issues facing PC processor vendors as core counts continue to grow.
Consider the Tile-Gx100 block diagram shown above. It's easy to imagine that this chip can get a lot of work done. Every core can run up to three instructions per cycle at up to 1.5GHz. It has dedicated hardware accelerators for cryptography and network packet processing. The network interfaces can implement up to eight 10Gb Ethernet ports. The chip also has four DDR3 memory interfaces; to reduce DRAM accesses, every core has 320KB of local cache memory. (The total amount of cache memory in the Tile-Gx100, about 32MB, matches that of IBM's Power7 processor, which has only eight cores.)
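As a quick check on those headline figures, the sketch below multiplies out the per-core numbers quoted above. The variable names are mine, and the results are theoretical peaks: they assume every core issues three instructions on every cycle and every Ethernet port runs at full rate, which real workloads will not sustain.

```python
CORES = 100
INSTRUCTIONS_PER_CYCLE = 3  # per core, as quoted above
CLOCK_GHZ = 1.5             # maximum clock, as quoted above
CACHE_PER_CORE_KB = 320
ETHERNET_PORTS = 8
PORT_GBPS = 10

# Peak instruction rate, in billions of instructions per second.
peak_giga_instructions = CORES * INSTRUCTIONS_PER_CYCLE * CLOCK_GHZ

# Total on-chip cache, in KB and in binary megabytes (1 MB = 1024 KB here).
total_cache_kb = CORES * CACHE_PER_CORE_KB
total_cache_mb = total_cache_kb / 1024

# Aggregate Ethernet bandwidth, in Gb/s.
peak_network_gbps = ETHERNET_PORTS * PORT_GBPS

print(peak_giga_instructions, total_cache_kb, total_cache_mb, peak_network_gbps)
# prints: 450.0 32000 31.25 80
```

The cache total works out to 32,000KB, or about 31MB if a megabyte is counted as 1,024KB, which is where the "about 32MB" figure comes from.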
The need for balance
It's not so easy to keep all these resources busy, however. The more complicated a chip gets, generally speaking, the more difficult it becomes to make full use of its resources. This is what we often call the balance between hardware and software.
Tilera will offer four products in the Tile-Gx family with 16, 36, 64, and 100 cores and corresponding differences in memory and networking support. This range of products helps meet the needs of different applications, but each product still needs a particular balance of application requirements for maximum efficiency.
So here lies Tilera's great challenge--finding software applications that need a large amount of CPU performance and that also:
1. Are highly parallel, so they can keep many cores busy.
2. Don't need much (if any) floating-point math, since the Tile-Gx doesn't do that.
3. Can benefit from cryptographic acceleration.
4. Consume large amounts of network bandwidth.
Tilera wants customers to think of its chips as "general-purpose" processors, but as this list shows, they're better for some purposes than for others. As PC processors reach higher core counts and integrate more functionality, they too will become more sensitive to application requirements. Integration eventually reaches a point where additional complexity adds no practical value. And the closer PC processor vendors approach that limit, the more difficult it will become to sell their latest, greatest, most complicated chips.
Network processing is the most natural fit for Tilera's capabilities, particularly high-level services like virus scanning as I discussed in September (see "Insatiable demand for mobile data challenges industry"). Internet service providers rarely provide such services for PC users, since PCs can do their own scanning--but mobile phones and other Internet appliances often can't, so these services are seeing increasing demand.
The networking market, unfortunately, is not large enough to support a company like Tilera. Although there is a lot of networking equipment sold each year, each company in the business has its own ideas about how this processing should be done. A single chip design could never capture the majority of this potential demand.
Further, the larger equipment vendors often have policies in place against relying too heavily on individual suppliers, especially small start-ups. They will commonly design different products using different chip-level technology so that the failure of a single supplier--or the purchase of a supplier by a competing equipment vendor--will have only a limited effect on their bottom line.
New business opportunities
Tilera is working to develop new markets for its current TilePro and future Tile-Gx parts. The most significant of these new markets is cloud computing, which may favor solutions like Tilera's that offer higher performance per watt.
That's the metric Tilera touts most heavily for the Tile-Gx, promising 10 times the performance per watt of Intel's Westmere-EP, a six-core 32nm processor aimed at high-efficiency servers that Intel will begin shipping in 2010. (Incidentally, I commend Tilera for making this comparison; Westmere-EP is exactly what they'll be competing against. Too often, chip companies will try to make themselves look better by comparing next year's products with last year's competition.)
Although 10x is a critical multiplier in this business (see my post "The factor factor"), such an advantage doesn't necessarily guarantee success. Tilera has done everything it can to minimize the difficulties associated with software development by adopting industry-standard development tools such as GCC and Eclipse, but its Tile chips still can't run Windows, and the company just can't match the developer relationships that companies like Advanced Micro Devices and Intel have established.
Fortunately, Tilera is small and relatively efficient for a chip company. Last month, Tilera announced that Quanta Computer invested $10 million in the company based on Quanta's interest in cloud computing. Tilera said it has enough funding to reach cash-flow breakeven in 2011, assuming the Tile-Gx reaches market and achieves the kind of success Tilera predicts.
I remain skeptical, but hopeful. I think there's no question that in the long run, there will be plenty of demand for complex, many-core processors like Tilera's. But will Tilera still be around by that time? And in the long run, once this demand develops, larger companies such as Intel will have their own offerings.
Can Tilera carve out a market niche that it can defend against such strong competition? I just don't know, but I'm always glad to see people trying new ideas.
It's been years since the concept of digital convergence was seriously debated. Today, it's rare to see a single-function electronic device.
Digital still cameras can record video, and camcorders can take still photos. Even cheap cell phones include cameras. There are Web browsers in cell phones, cameras, televisions, and digital picture frames. In fact, it seems like it's only a matter of time before everything with a battery or power cord will be connected to the Internet.
So it's a little startling to see a new gizmo that does nothing but display text, especially when that text comes from a preprogrammed memory card...and it's extraordinary when the text came from the Internet in the first place.
Openmoko's WikiReader is a standalone Wikipedia browser with a touch screen and the complete text of Wikipedia on a memory card.
(Credit: Peter N. Glaskowsky)
I was initially incredulous when I heard about WikiReader, a $99 device from Openmoko designed solely for the purpose of reading Wikipedia articles. How useful could such a thing really be, I wondered.
The device, which was released about two weeks ago, displays text only. The user interface couldn't be much simpler. Pushing the power button boots the device in less than two seconds. There's a search button for looking up individual articles, a history button for recalling previously viewed articles, and a button to open a random article from the collection. A parental-control feature allows blocking mature content (imperfectly, as I quickly learned).
And that's about it. It doesn't display images, references, discussion pages, or links to outside Web sites. (The last point is reasonable enough because the device can't access the Internet anyway.) In fact, all 3 million Wikipedia articles viewable on WikiReader ship on a memory card in the device.
The content on the card is just a snapshot of the active Wikipedia database, complete with whatever errors or vandalism may have been present at the moment each article was copied. But overall, it's still an impressive amount of useful information. (Openmoko will offer quarterly updates that can be downloaded for free, or delivered on new memory cards twice per year for an annual cost of $29.)
Not long ago, distributing Wikipedia this way would have been impractical. Even today, an 8GB Micro SD card is a sub-$15 item in wholesale channels, which is a big chunk of the $99 retail price. Saving money here, however, would have compromised the usefulness of the device. (On the unit I tested, 4.18GB out of 7.4GB was actually used; perhaps some foreign-language versions of Wikipedia could fit on smaller, cheaper cards.)
The other elements of WikiReader show similar trade-offs. In an e-mail exchange, Openmoko President Sean Moss-Pultz told me that the WikiReader design began with the chips commonly used for electronic dictionaries--for example, Epson's S1C33E07 microcontroller. But whereas such devices usually have small screens and physical keyboards, allowing them to hit very low price points (e.g., a $21 device from Royal), Openmoko chose to go with a larger screen that displays about 13 lines of proportionally spaced text--roughly 40 characters per line, 80 words at a time.
Further, WikiReader has a capacitive touch screen, enabling the use of a virtual on-screen keyboard rather than a separate physical keyboard. The touchscreen--equipped with a tempered glass face that resists scratches better than plastic--also handles touch-drag scrolling and selecting links to other Wikipedia pages. As a result, WikiReader is smaller than most electronic dictionaries, but has a larger screen and is easier to use.
WikiReader is also more expensive than most electronic dictionaries, but again, the higher price was essential if WikiReader was to accomplish its mission. That mission is simple to express: make Wikipedia accessible to anyone, anywhere, any time. At $99, this device may not be affordable by everyone in the world. On the other hand, it's a lot more affordable than even the least expensive laptops, including the original "$100 laptop" from the One Laptop Per Child Foundation, which is still priced at $199 two years after it first went on sale.
Although the comparison is hardly fair, it's still relevant since the number of parents and schools in the world that can afford a $99 WikiReader is a lot larger than the number that can afford a laptop plus the necessary supporting infrastructure such as an Internet connection and power source. (By comparison, Openmoko says that two AAA alkaline batteries--cheap and widely available--will run the WikiReader for up to a year, and that's the only recurring cost to keep the unit operating.)
I expect the cost of manufacturing WikiReader will come down slowly over time, and the product itself may become more valuable as third-party developers begin to work with the WikiReader's open-source software. Openmoko began as an open-source cell phone project, and while WikiReader has nothing in common with that earlier work, the company still has some visibility in the open-source developer community.
WikiReader isn't quite easy enough for a cat to use.
(Credit: Peter N. Glaskowsky)
The WikiReader software load is very simple. There's no OS, not even Linux; just one application to run the Wikipedia browser, for example. All of the software, along with the compressed Wikipedia database, is provided on the Micro SD card. There are some diagnostic programs, and there's even a simple four-function calculator "Easter egg" that comes up in response to a History-Power button combination.
The lack of a full OS is a matter of necessity, but this is the kind of necessity from which virtue is created. The near-instant boot time and ultra-low power consumption couldn't be matched with any flavor of Linux. Software development isn't as easy as it would be for a Linux PC application, but then, the device is simple, so it wouldn't be too difficult to develop new functionality for the WikiReader hardware. I'd like to see the usual combination of dictionary, thesaurus, and language translation found in many other devices, along with a more-advanced calculator.
In the meantime, WikiReader does the one thing it was meant to do, and I think that's good enough.
(My thanks to Pat Meier-Johnson of Pat Meier Associates for bringing WikiReader to my attention. Also, thanks to Openmoko for providing a review unit and answering my questions.)
I'm very impressed by the Nook, Barnes & Noble's new e-book reader. It's clear B&N has studied Sony's Reader and Amazon's Kindle very carefully.
The Nook has almost all of the major features of both product lines, plus a few more, with few competitive disadvantages. B&N has also followed Amazon's lead on support services. The Nook has a very good online e-book store as well as applications to support e-book reading on Macs, Windows machines, and smartphones.
(Credit: Barnes & Noble)
The Nook doesn't ship until the end of November, but here's what I found most significant from the announcement and the pages at nook.com:
Industrial design
I think the Nook is attractive and well-designed. It looks better than the Kindle 2, but not as good as Sony's Reader Touch Edition, which offers a larger screen in a smaller form factor. Also, Sony's forthcoming Reader Daily Edition is only slightly larger than the Nook, but offers a much larger screen.
Secondary color display
This feature surprised me. It seems expensive and insufficiently functional for what must be a significant added cost. The low resolution of this display (480 x 144, according to a CNET blog post) means it won't be useful for much beyond the basic user-interface features B&N has already described: book covers, menus, and a keyboard for note-taking. (Although I should note for the record that while B&N says "Its full-color touchscreen encourages you to bookmark, add notes, and highlight passages," I haven't found a photo on the company Web site depicting the virtual keyboard shown in some of the pre-release images. Perhaps that's one of the features still under development.)
By comparison, the secondary color screen built into the Alex e-book reader from Spring Design, shown in another recent CNET story, is large enough to be useful. Unfortunately, it's also large enough to be very much in the way, leading to an awkward device. Spring Design and B&N need to make up their minds--are they making e-book readers or something else?
I've been thinking about buying a new gizmo, and it turns out I'm not the only one in the family having these thoughts.
My sister sent me an e-mail over the weekend:
I need a 3G card for my laptop and I'm going to get it from Verizon. What should I ask for? I just don't want them to try to sell me more or less than I need.
Coincidentally, I've been looking into the latest options for mobile broadband access for a couple of months now, ever since the two-year contract ran out on the Option GT Max 3.6 Express I bought in 2007.
Here's an expanded version of my reply e-mail:
There are four basic kinds of 3G wireless modems: USB dongles, PC Card and ExpressCard devices, portable 3G/Wi-Fi access points, and cell phones with wireless "tethering."
USB modems are the most popular type and usually the least expensive. They plug in like a thumb drive, and they're easy to deal with. But I don't like them because they can stick out pretty far, which makes them awkward and a bit fragile. The larger ones don't work at all with USB jacks that are too close to other ports. Also, the cheapest ones can have relatively poor reception.
If your laptop has a plug-in card slot, it's either for PC Cards or the more recent ExpressCard type. Your user manual will tell you. Verizon offers one of each. They don't stick out so far, which makes them a little more rugged while in use, though you should still remove them before putting away the laptop. I find them more convenient than the USB type.
The Novatel MiFi 2372 connects up to five Wi-Fi devices to 3G mobile broadband networks.
(Credit: Novatel Wireless)
A portable access point is worth considering if you have more than one gizmo to connect to the Internet while you're traveling. For most North American users there's only one such device available, the Novatel MiFi.
Sprint and Verizon offer the MiFi 2200, which provides typical download speeds from 400Kbps to 1.4Mbps (Verizon's estimate; actual speeds vary widely).
Novatel also makes the MiFi 2372, which works on AT&T, T-Mobile, and pretty much any international phone network. This is the one I want, but as far as I can tell AT&T and T-Mobile don't offer discounted pricing on this gizmo yet. If purchased directly from a mail-order supplier, it's very expensive--well over $300.
Whichever version you get, the MiFi is a standalone gadget a little smaller than an iPhone. It has its own battery and recharges with a small wall adapter or by connecting it to your laptop (which makes it work like a USB wireless modem). It connects to the cellular data network and creates its own little Wi-Fi hot spot that can be used by up to five systems at once--like your laptop and an iPod Touch.
I don't have one of these myself, but friends do, and it looks like the most convenient way to get online while traveling.
As an aside, I should mention that one of the earliest mobile broadband/Wi-Fi gizmos was developed by a friend of mine, Tor Amundson. He called it the Stompbox, and wrote about it for Make magazine. More information is available on one of his sites, Stompboxnetworks.com.
Earlier this year, Tor told me about an interesting alternative to the MiFi. Cradlepoint makes gizmos that are functionally equivalent to the MiFi, except they work with a user-provided USB or ExpressCard modem. While this approach is noteworthy, I think the MiFi is generally a better solution for most users.
The last option is to get a 3G-compatible cell phone that supports "tethering"--that is, using the cell phone itself as a modem. This can work pretty well, though I had a lot of trouble tethering the Cingular 8525 phone I had before I got the Option card.
The major downside of tethering is that you may not be able to talk on the phone while using the Internet. Apparently AT&T and T-Mobile 3G phones are more likely to support simultaneous operation than those on Verizon or Sprint. I regard this limitation as unacceptable, though you might feel differently. The upsides are that tethering can be somewhat cheaper than getting a separate 3G modem because there's only one contract, and there's nothing else to carry around.
(The iPhone still doesn't allow tethering.)
The most important thing to keep in mind, no matter how you get online, is that mobile Internet usage is quite strictly limited by all carriers. Verizon's $40/month service provides only 250MB/month of data transfer, and that can run out very quickly. Even the $60 service's 5GB limit can be exceeded in mere days if you spend too much time on YouTube or some other video streaming service.
If you go over your plan limit, per-megabyte charges are really painful. According to Verizon, the 5GB overage rate is 5 cents/MB and the 250MB overage rate is 10 cents/MB. In other words, a single HD video on YouTube could easily cost you a few dollars to watch once you're over the limit.
For comparison purposes, AT&T's overage fees are $10/100MB for its $40/month plan and 49 cents/MB for the $60/month plan. The latter rate is the cell phone equivalent of the death penalty, since hardly anyone is going to go only a few megabytes over the 5GB allotment. A careless user could easily incur hundreds of dollars in overage fees in a single month.
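To see how quickly these rates add up, here is a rough sketch that prices the same overage under each of the four plans mentioned above. The 500MB overage is a hypothetical figure chosen for illustration, and I'm assuming AT&T bills its $10/100MB rate in whole 100MB blocks; check the actual plan terms before relying on either. Amounts are kept in whole cents to avoid rounding surprises.

```python
def per_mb_overage_cents(over_mb: int, cents_per_mb: int) -> int:
    """Overage charge for a plan billed per megabyte."""
    return over_mb * cents_per_mb


def block_overage_cents(over_mb: int, block_mb: int, cents_per_block: int) -> int:
    """Overage charge for a plan billed per block.

    Assumes any partial block is billed as a full block.
    """
    blocks = (over_mb + block_mb - 1) // block_mb  # integer ceiling
    return blocks * cents_per_block


OVER_MB = 500  # hypothetical overage, chosen only for this example

charges_cents = {
    "Verizon 250MB plan (10 cents/MB)": per_mb_overage_cents(OVER_MB, 10),
    "Verizon 5GB plan (5 cents/MB)": per_mb_overage_cents(OVER_MB, 5),
    "AT&T $40 plan ($10/100MB)": block_overage_cents(OVER_MB, 100, 1000),
    "AT&T $60 plan (49 cents/MB)": per_mb_overage_cents(OVER_MB, 49),
}

for plan, cents in charges_cents.items():
    print(f"{plan}: ${cents / 100:.2f}")
```

With these assumptions, 500MB over the limit costs $25 to $50 on three of the plans and $245 on AT&T's $60 plan.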
So whatever you buy, be careful how you use it. And if you share your connection (using a MiFi, or via Internet Connection Sharing in Windows), make sure your friends stay away from Hulu.
Another thing to consider is whether you need international access. If you intend to travel a lot, you can get a wireless modem that will work in most foreign countries. Be sure to ask about the countries that matter to you; Japan and South Korea, in particular, have very specific requirements. What Verizon calls "Global Ready" modems are somewhat more expensive to buy, but again, be warned: international roaming can be *very* expensive. (In the U.S., the charges are the same as for any other 3G modem.)
In my opinion, the best way to get Internet access while traveling internationally is to find cheap or free Wi-Fi hot spots and skip the mobile broadband. This approach is less convenient, but there's no risk of coming home to a very expensive bill from your cell phone company.
In part 1 and part 2 of this series, I claimed that there is apparently a secret rule in the microprocessor industry that determines the success--or failure--of new chip designs.
The failures included RISC processors, media processors, and intelligent RAM chips, which all sank in spite of clearly demonstrable advantages over alternative solutions. The great success is the programmable graphics processing unit (GPU), which has succeeded in spite of the sometimes wrenching shifts in programming methods and PC system architecture that have been required to support it.
So what's the secret? Simply this: a factor-of-two advantage, even if it's an inherent, persistent advantage, isn't enough to unseat an incumbent solution in the face of even the mildest competitive disadvantage. Without a factor of 10--a full order of magnitude--a new product won't even get a foot in the door.
That's why I call this rule the "factor factor." It isn't enough to be a few times faster than the existing alternatives. Given the performance consequences of Moore's Law, it's easier for your potential customers to wait a few years rather than spend a few years adapting to your "issues." You need to be much faster than the products you're trying to replace. The target factor is 10--no less.
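The "wait a few years" argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that the incumbent's performance doubles every two years; that doubling period is an illustrative assumption, not a figure from the text, and the function name is hypothetical.

```python
import math


def years_to_catch_up(advantage, doubling_years=2.0):
    """Years until an incumbent matches a newcomer's speed advantage.

    Assumption: incumbent performance doubles every `doubling_years`.
    `advantage` is the newcomer's speedup factor today (2 means 2x).
    """
    if advantage <= 1:
        return 0.0
    return doubling_years * math.log2(advantage)


if __name__ == "__main__":
    for factor in (2, 10, 100):
        print(f"{factor}x advantage: about {years_to_catch_up(factor):.1f} years")
```

Under that assumption, a twofold lead disappears in about two years, while a tenfold lead takes more than six years to erode, long enough that waiting stops being the easy option.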
Sometimes, even a tenfold advantage isn't enough. One order of magnitude is enough to overcome one disadvantage, such as a change of programming methods. Add another simultaneous disadvantage, however, like the serious constraint in local memory capacity imposed by the IRAM concept, and the new technology may need a factor of 100 in performance to win a place in the market.
Overall, a new product must deliver net benefits amounting to as much as a full order of magnitude in cost, performance, or productivity to compensate for each significant disadvantage. That's just what it takes to motivate customers to deal with the problems rather than waiting for Moore's Law to speed up the solutions that are already familiar to them.
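As a rule of thumb, this amounts to one order of magnitude per significant disadvantage. A minimal sketch of that test follows; the function names and the sample numbers are hypothetical, and counting "significant disadvantages" remains a judgment call that code can't make.

```python
def required_advantage(disadvantages):
    """Net benefit factor needed: one order of magnitude per disadvantage."""
    if disadvantages < 0:
        raise ValueError("disadvantages must be non-negative")
    return 10 ** disadvantages


def clears_the_bar(claimed_advantage, disadvantages):
    """True if the claimed benefit meets the factor-factor threshold."""
    return claimed_advantage >= required_advantage(disadvantages)


if __name__ == "__main__":
    # Hypothetical proposals: (claimed speedup, number of significant drawbacks)
    proposals = {
        "2x faster, new programming model": (2, 1),
        "10x faster, new programming model": (10, 1),
        "10x faster, new programming model and limited memory": (10, 2),
    }
    for name, (speedup, drawbacks) in proposals.items():
        verdict = "clears the bar" if clears_the_bar(speedup, drawbacks) else "falls short"
        print(f"{name}: needs {required_advantage(drawbacks)}x, {verdict}")
```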
The introduction of the AMD64 instruction set by Advanced Micro Devices (also known as EM64T or "Intel 64" on Intel processors, or generically as x86-64) represents the ultimate success case for the factor factor.
AMD's Athlon 64 debuted the AMD64 instruction-set architecture. (Credit: Advanced Micro Devices)
This isn't immediately clear, I suppose. Adopting the AMD64 standard required a lot of work by operating system vendors and software developers, and the performance benefit was relatively mild in most cases. But still, AMD64 was an immediate success because the performance benefit in certain applications--those that simply wouldn't fit into a 32-bit address space--was practically infinite.
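For a sense of scale, the ceiling imposed by 32-bit addressing is easy to compute. The sketch below uses hypothetical function names and a hypothetical 16GiB working set; it treats the address space as the only limit, although in practice an operating system reserves part of that range, so the usable amount is smaller.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def address_space_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits


def fits(working_set_bytes, pointer_bits):
    """True if the working set can be addressed directly at this pointer width."""
    return working_set_bytes <= address_space_bytes(pointer_bits)


if __name__ == "__main__":
    working_set = 16 * GIB  # hypothetical data set
    print("32-bit address space:", address_space_bytes(32) // GIB, "GiB")
    print("Fits with 32-bit pointers:", fits(working_set, 32))
    print("Fits with 64-bit pointers:", fits(working_set, 64))
```

An application whose data exceeds that limit doesn't merely run slowly on a 32-bit system; it can't address its data directly at all, which is why the benefit is hard to express as a finite speedup.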
Although the factor factor seems obvious--or at least it should--it's still at the heart of many failed products and hundreds of millions of dollars of wasted investments every year.
In Silicon Valley, as in other chip-design centers around the world, projects rarely fail because of poor execution. In most projects, the engineers are good at their jobs, the managers are good at coordinating their work, and the investment is sufficient to get the work done.
Most projects fail at the conceptual level, before the detail design work even begins. The factor factor is only one of many reasons for these failures, of course, but it's the one that disturbs me the most because it's the easiest to anticipate.
This rule doesn't apply to all products. When a new chip for an existing market is architecturally compatible with previous products, a factor-of-two performance improvement is plenty. Even smaller benefits can justify the costs of developing a new product if there are few, if any, disadvantages associated with it.
Multicore CPUs are one of these products, at least for now. Process technology makes it pretty easy to double core counts. Dual-core CPUs were almost a drop-in replacement for single-core chips and caused no serious problems. Quad-core chips were the same thing again. Eight-core CPUs may be a lesson in diminishing returns, but I'm sure they'll be commercially successful.
Beyond that, we'll have to see how it goes. The critical advantage of the CPU over the GPU is high performance on inherently serial processing tasks (what we sometimes call "single-threaded applications"). On a typical PC, there are rarely more than a few of these tasks running at any given moment. It's always useful to have a few extra cores available for parallel tasks, but at some point (I'm thinking somewhere around the 16-core level), PC buyers are likely to stop paying extra for more cores.
Even mighty Intel could find itself on the wrong side of the factor factor. Given that quad-core chips became a mainstream product just this year, we can expect to see 16-core processors for ordinary desktop PCs in 2013 and laptops in 2015 or so. By that time, the GPU could be the incumbent solution for high-performance parallel processing, and multicore CPUs could be the technology looking for compelling performance advantages.
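The dates above follow from simple doubling arithmetic. The sketch below reproduces them under stated assumptions: the starting year of 2009 for mainstream quad-core chips, the two-year doubling cadence, and the two-year lag for laptops are all inferred for illustration and are not given explicitly in the text. The function name is hypothetical.

```python
def year_reaching(target_cores, start_cores=4, start_year=2009,
                  years_per_doubling=2):
    """Year in which mainstream core counts reach `target_cores`.

    Assumptions: counts double every `years_per_doubling` years,
    starting from `start_cores` in `start_year`.
    """
    cores, year = start_cores, start_year
    while cores < target_cores:
        cores *= 2
        year += years_per_doubling
    return year


if __name__ == "__main__":
    LAPTOP_LAG_YEARS = 2  # assumed lag between desktop and laptop parts
    desktop_year = year_reaching(16)
    print("16-core desktops:", desktop_year)
    print("16-core laptops:", desktop_year + LAPTOP_LAG_YEARS)
```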
So...now you know the supposed secret. When you hear about a radical new microprocessor architecture, you can do what I do: imagine the numeral "1" followed by a "0" for each drawback you see in the proposal. Compare that figure with the claimed benefits and you'll know which way to bet.
By the way, kudos to CNET users divisionbyzero and TrinityTrident, who proved my point that this rule isn't really a secret by explaining it in their comments on the previous posts in this three-part series.
Now if someone could only explain why so many companies don't seem to know this rule!
In the first part of this series, I claimed that a great secret in the microprocessor industry largely determines whether new products succeed or fail.
I noted that this secret shouldn't be a secret at all because many people (including myself) have talked about it over the years, but clearly a lot of people are in the dark because they continually disregard it and develop products that are doomed.
I gave several examples of products that failed because their creators didn't know the great secret. Those products included RISC processors, media processors, and intelligent RAM chips, in which processor cores were integrated with memory to eliminate one of the great bottlenecks in computer performance.
During my eight years at Microprocessor Report, I covered the markets for media processors, 3D-graphics chips, network processors, and what I dubbed "extreme processors"--chips with large numbers of simple cores running in parallel. Many of these chips were cheaper, easier to design, and twice as fast as competing products--and still failed.
However, some did succeed. The critical factor that made the difference in most of these cases is the essence of the so-called secret.
One of those successes is the graphics processing unit, or GPU.
I was reminded again of the secret at Nvidia's recent GPU Technology Conference, where many of the talks dealt with GPU computing.
(Disclosure: I recently wrote a technical white paper for Nvidia.)
Although the GPU field dates back only five or six years, GPUs have already earned a place alongside CPUs. Each is clearly superior for certain kinds of applications.
This is true in spite of the fact that GPUs aren't nearly as easy to program as CPUs. Like other forms of parallel programming, GPU programming requires new hardware (the GPU itself), significant new extensions for programming languages, and a different mindset for programmers--one that simply wasn't part of the standard computer-science curriculum for most of the last 50 years.
Listen carefully. I am about to reveal one of the great apparent secrets of the microprocessor industry. This secret largely determines whether new products succeed or fail.
I don't know why it seems to be a secret. It's simple enough. I figured it out early, in my first job in the industry, and I've seen it demonstrated over and over since then. I'm hardly the only one who knows this secret; I've seen dozens of talks that allude to it, and a few that mentioned it specifically. I've talked about it myself in articles I wrote for Microprocessor Report and other publications.
Unfortunately, I've also seen hundreds of products brought to market in apparent ignorance of this simple rule, and they've all failed, wasting the billions of dollars invested in their development. Assuming the developers weren't throwing away their money on purpose, I conclude they must not have known the one basic fact that doomed their projects, which means it must be a secret.
The secret is...
Nvidia and Advanced Micro Devices' ATI division are taking different approaches to graphics processing in the next generations of their products. Both strategies have strengths and weaknesses, and I think it's too soon to pick the eventual winner in this long-running fight.
Before I get into my analysis, I should say that Nvidia paid me to write a white paper on the implications of its new GPU architecture (code-named Fermi) for high-performance computing applications. The white paper was released as part of the Fermi launch event at Nvidia's GPU Technology Conference last week.
Nvidia also paid for white papers from two other well-known microprocessor analysts, Nathan Brookwood of Insight64 and my friend and former colleague Tom Halfhill of Microprocessor Report. UC Berkeley professor David Patterson wrote a fourth white paper, and Nvidia wrote one of its own. Each of these works takes a different approach to the subject; all are worth reading if you need to understand what Fermi is all about.
In short, I think the Fermi architecture has been more thoroughly white-papered than any graphics chip design in history. All five of these documents are available on the Fermi home page on Nvidia's Web site, and just in case that page is moved or changed, you're welcome to take advantage of my own mirror of my white paper.
I've spent much of the last several days reading these documents plus David Kanter's excellent article on Fermi over on his Real World Technologies site. David managed to get some details on Fermi that Nvidia didn't give to the rest of us.
I've also had time to go through the coverage of ATI's recent launch of the RV870, which is what Nvidia's Fermi-based chips will be competing against. The first of Nvidia's chips bears the internal code name of GF100, and it's huge. Here's a life-size photo: