Last month Nvidia disclosed that, due to a manufacturing flaw, some of its laptop graphics processors and chipsets are overheating and failing. This is a brief summary of the story for those who missed it.
Not all of the flawed processors and chipsets are failing, but the frequency of failure is unclear. Nvidia put it this way:
"Certain notebook configurations with GPUs and MCPs manufactured with a certain die/packaging material set are failing in the field at higher than normal rates. To date, abnormal failure rates with systems other than certain notebook systems have not been seen."
The day after the announcement, Humphrey Cheung at tgdaily noted that "significant quantities" of Nvidia chips are overheating and failing.
Two ways that failures manifest themselves are not being able to start the computer and, of course, a blank screen. Dell said that failure symptoms include multiple images, random characters on the screen, and lines on the screen. HP lists not detecting wireless networks as a sign of failure, along with the wireless adapter not appearing in the Windows Device Manager. They also note that if the "battery charge indicator light does not turn on when the battery is installed and the AC adapter is connected" it may be due to this Nvidia problem.
The problem has existed for a while. CNET blogger Brooke Crothers says he wrote about this problem back in April of 2007. Last month, Charlie Demerjian at The INQUIRER offered a fascinating explanation of what's going on in his article Nvidia plays the meltdown blame game. In it he says "...this problem hasn't cropped up in desktop parts yet, but it most assuredly will."
Today, the Wall Street Journal ran a story, Chip Problems Haunt Nvidia, PC Makers, about dissatisfaction with the way Nvidia has dealt with this issue. The article notes that "Nvidia hasn't recalled the affected chips or identified which models have problems." Nvidia's failure to publicly identify the problematic hardware strikes me as inexcusable. According to The INQUIRER, All Nvidia G84 and G86s are bad.
The only laptop vendors to step up to the plate so far have been Dell and HP.
Owners of 24 HP laptop computer models need to be concerned. See HP Pavilion dv2000/dv6000/dv9000 and Compaq Presario v3000/v6000 Series Notebook PCs - HP Limited Warranty Service Enhancement and HP Limited Warranty Service Enhancement. I can't tell which of these two items is the more recent since HP doesn't date stamp them.
Owners of 15 Dell laptop computers are affected, including models in the Inspiron, Latitude, Precision, Vostro, and XPS lines. Dell owners should read NVIDIA GPU Update: Dell to Offer Limited Warranty Enhancement to All Affected Customers Worldwide.
The solutions offered by both HP and Dell boil down to running the fan all the time to prevent the Nvidia hardware from getting too hot.
Both companies offer a BIOS update. HP seems to have an updated BIOS for all affected machines; Dell has one for 10 of their 15 affected models.
HP describes the BIOS update thusly:
"HP has identified a hardware issue with certain HP Pavilion dv2000/dv6000/dv9000 and Compaq Presario V3000/V6000 series notebook PCs, and has also released a new BIOS for these notebook PCs... The new BIOS release for your notebook PC is preventative in nature to reduce the likelihood of future system issues. The BIOS updates the fan control algorithm of the system, and turns the fan on at low volume while your notebook PC is operational."
A very different perspective on the BIOS update is offered by Charlie Demerjian in The INQUIRER:
"If you look at the HP page, the prophylactic fix they offer is to more or less run the fan all the time. Once again, for the non-engineers out there, fan running eats a lot of power, so this destroys the battery life of notebooks. Basically, people bought a machine with a battery life of X, and now it is Y to prevent meltdown from a bum part. It doesn't fix anything, it just makes the failures take longer, hopefully past the warranty period, at a huge battery life cost. Fire up your class actions people, you got shafted."
Both Dell and HP have extended the warranty on affected machines by one year.
If you own a laptop computer with Nvidia chips and you haven't registered it with the hardware vendor, I suggest doing so. This way they can contact you if need be, and it can only help grease the wheels should you need warranty repair.
Some motherboards have thermometers for measuring and reporting the temperature. Try to contact the hardware vendor to see if they offer software that you can use to watch the internal temperature. I use the free HD Tune to watch the temperature in hard disks but the hard disk might be nowhere near the Nvidia chips. The System Information for Windows program can also display some temperatures. Still, the best monitoring is probably with software from the motherboard or computer manufacturer, if they offer it.
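For readers comfortable with a little scripting, the idea of watching temperatures can be sketched in a few lines of Python. This is only an illustration, and it rests on assumptions that come from neither Nvidia nor the laptop vendors. It reads the temperature sensors that the Linux kernel exposes under /sys/class/thermal, so it will not work on Windows as written. The 85 C warning threshold and the function names are placeholders I made up, not a published specification. The sensors it finds may also not be the ones nearest the Nvidia chips.

```python
from pathlib import Path

# Hypothetical warning threshold in degrees Celsius; not a vendor figure.
WARN_AT_C = 85.0


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: temperature_c} from Linux sysfs thermal zones.

    Returns an empty dict if the directory does not exist (e.g. on Windows).
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    readings = {}
    for zone in sorted(root.glob("thermal_zone*")):
        try:
            # The kernel reports temperatures in thousandths of a degree C.
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that are missing or unreadable
        readings[zone.name] = millideg / 1000.0
    return readings


def hot_zones(readings, limit_c=WARN_AT_C):
    """Return the sorted names of sensors at or above limit_c."""
    return sorted(name for name, temp in readings.items() if temp >= limit_c)


if __name__ == "__main__":
    readings = read_thermal_zones()
    if not readings:
        print("No temperature sensors found.")
    flagged = set(hot_zones(readings))
    for name, temp in sorted(readings.items()):
        marker = "  <-- hot" if name in flagged else ""
        print(f"{name}: {temp:.1f} C{marker}")
```

Running it from time to time, or before and after a BIOS update, gives a rough sense of whether the machine is running hotter than it used to. It is no substitute for the vendor's own monitoring software.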
Be aware of where the vents are and make sure they aren't blocked. Also, check for dust on the fan and remove any that's there. Go to the Power options in the Control Panel and make sure that all the available power management facilities are being used. They include powering down the hard disk after a period of inactivity as well as CPU power management. The ThinkPad T42 that I'm writing this on also offers PCI Bus power management.
And, of course, the most important advice of all: back up your important files to some place outside your computer. Local backups on an external hard disk or a USB flash drive are a great starting point.
Update August 20, 2008: A reader with a ThinkPad T61 laptop computer wrote to tell me that the fan runs all the time. I haven't seen anything about Lenovo in terms of this Nvidia problem but the computer in question has an NVIDIA Quadro NVS 140M.
Update September 10, 2008: A lawsuit broke out. See.