GeForce 9800 GX2 - Only one GPU reported under Linux


Craig Harrison
October 24th 11, 11:38 PM
Hi folks,

My dual-GPU 9800 GX2 has developed a problem! Under Linux only one GPU is
ever reported. This did not use to be the case, and has only become
apparent since installing Kubuntu 11.10.

However, I have tested by re-installing Mint 11 (where it used to
correctly report both GPUs and the SLI configuration), and this too only
reports a single GPU now.

Is this a hardware issue, or has the latest Linux GeForce driver got an
issue? Below is the output of lspci and nvidia-smi.
Regards

Craig.

[email protected]:~$ lspci
00:00.0 RAM memory: nVidia Corporation MCP78S [GeForce 8200] Memory Controller (rev a2)
00:01.0 ISA bridge: nVidia Corporation MCP78S [GeForce 8200] LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP78S [GeForce 8200] SMBus (rev a1)
00:01.2 RAM memory: nVidia Corporation MCP78S [GeForce 8200] Memory Controller (rev a1)
00:01.3 Co-processor: nVidia Corporation MCP78S [GeForce 8200] Co-Processor (rev a2)
00:01.4 RAM memory: nVidia Corporation MCP78S [GeForce 8200] Memory Controller (rev a1)
00:02.0 USB Controller: nVidia Corporation MCP78S [GeForce 8200] OHCI USB 1.1 Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP78S [GeForce 8200] EHCI USB 2.0 Controller (rev a1)
00:04.0 USB Controller: nVidia Corporation MCP78S [GeForce 8200] OHCI USB 1.1 Controller (rev a1)
00:04.1 USB Controller: nVidia Corporation MCP78S [GeForce 8200] EHCI USB 2.0 Controller (rev a1)
00:06.0 IDE interface: nVidia Corporation MCP78S [GeForce 8200] IDE (rev a1)
00:07.0 Audio device: nVidia Corporation MCP72XE/MCP72P/MCP78U/MCP78S High Definition Audio (rev a1)
00:08.0 PCI bridge: nVidia Corporation MCP78S [GeForce 8200] PCI Bridge (rev a1)
00:09.0 IDE interface: nVidia Corporation MCP78S [GeForce 8200] SATA Controller (non-AHCI mode) (rev a2)
00:0a.0 Ethernet controller: nVidia Corporation MCP77 Ethernet (rev a2)
00:10.0 PCI bridge: nVidia Corporation MCP78S [GeForce 8200] PCI Express Bridge (rev a1)
00:12.0 PCI bridge: nVidia Corporation MCP78S [GeForce 8200] PCI Express Bridge (rev a1)
00:13.0 PCI bridge: nVidia Corporation MCP78S [GeForce 8200] PCI Bridge (rev a1)
00:14.0 PCI bridge: nVidia Corporation MCP78S [GeForce 8200] PCI Bridge (rev a1)
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:09.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
01:09.1 Input device controller: Creative Labs SB Live! Game Port (rev 07)
01:0a.0 FireWire (IEEE 1394): Agere Systems FW322/323 (rev 70)
02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a2)
03:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a2)
03:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a2)
05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
[email protected]:~$


[email protected]:~$ nvidia-smi
Gpus found in probe:
Found Gpuid 0x5000
Attaching all probed Gpus...OK
Getting unit information...OK
Getting all static information..
[email protected]:~$

Paul
October 25th 11, 02:21 AM
Craig Harrison wrote:
> Hi folks,
>
> My dual-GPU 9800 GX2 has developed a problem! Under Linux only one GPU is
> ever reported. This did not use to be the case, and has only become
> apparent since installing Kubuntu 11.10.
>
> However, I have tested by re-installing Mint 11 (where it used to
> correctly report both GPUs and the SLI configuration), and this too only
> reports a single GPU now.
>
> Is this a hardware issue, or has the latest Linux GeForce driver got an
> issue? Below is the output of lspci and nvidia-smi.
> Regards
>
> Craig.
>
> [email protected]:~$ lspci

> 02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
> 03:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
> 03:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
> 05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)

<resend - problem with E-S server... >

The poster in the thread below has both a 9800GX2 and a 9400GT card. In their
lspci output, the first line is the 9400GT, and the rest belong to the 9800GX2.

http://www.nvnews.net/vbulletin/showthread.php?t=154384

01:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)
07:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:02.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
09:00.0 3D controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
0a:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)

In my non-expert opinion, yes, one of your GPUs is missing. It looks like the
PCI switch chip is still there, and one GPU.
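
A rough way to make that comparison less eyeball-driven is to count the GPU entries in the lspci text. This is only a sketch: the function name is made up for illustration, it assumes each device sits on a single line (as lspci prints it, before any mail wrapping), and the sample lines are copied from the two listings above. Going by the other poster's listing, a fully detected card shows two G92 entries.

```python
import re

# Matches an lspci line for a GPU function: a bus address, then the class
# "VGA compatible controller" or "3D controller", then a description that
# names the 9800 GX2. Bridge/switch lines do not match.
GPU_LINE = re.compile(
    r"^(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?:VGA compatible controller|3D controller): .*9800 GX2",
    re.IGNORECASE,
)

def find_gx2_gpus(lspci_text):
    """Return the bus addresses of every 9800 GX2 GPU entry in lspci output."""
    found = []
    for line in lspci_text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            found.append(match.group("addr"))
    return found

# Lines taken from the original poster's listing (one GPU visible).
CRAIG = """\
02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
"""

# Lines taken from the nvnews thread listing (9400 GT plus both GX2 GPUs).
OTHER = """\
01:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)
09:00.0 3D controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
0a:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
"""

print(find_gx2_gpus(CRAIG))
print(find_gx2_gpus(OTHER))
```

The 9400 GT line is skipped on purpose, since only the GX2 entries matter for this check.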

Have you ever taken your 9800GX2 apart? Maybe it's a bad connection between
the two PCBs.

The other puzzling thing is the identifier for the switch chip.

*******

If I look at this one, and selectively snip bits of it:

http://mushkingames.com/phpbb2/viewtopic.php?f=3&t=13090&start=15

[ AFAIK B=Bus D=Device F=Function ]

B01 D00 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B02 D00 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B02 D02 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B03 D00 F00: nVIDIA GeForce 9800 GX2 Video Adapter
B04 D00 F00: nVIDIA GeForce 9800 GX2 Video Adapter
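
For anyone lining up the B/D/F columns against lspci addresses such as 03:02.0, here is a small sketch of the conversion. The helper and type names are hypothetical; the only assumption is the usual lspci address form bus:device.function in hexadecimal, optionally prefixed with a domain.

```python
from typing import NamedTuple

class PciAddress(NamedTuple):
    bus: int
    device: int
    function: int

def parse_bdf(addr):
    """Split an lspci-style address like '03:02.0' into bus, device, function.

    All three fields are hexadecimal. A leading domain ('0000:03:02.0')
    is accepted and ignored.
    """
    bus_dev, function = addr.split(".")
    bus, device = bus_dev.split(":")[-2:]
    return PciAddress(int(bus, 16), int(device, 16), int(function, 16))

# The three switch entries and the one GPU from the original listing.
for address in ("02:00.0", "03:00.0", "03:02.0", "05:00.0"):
    bdf = parse_bdf(address)
    print(f"B{bdf.bus:02X} D{bdf.device:02X} F{bdf.function:02X}")
```

Printed this way, the original poster's entries can be read in the same B/D/F layout as the snippet above.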

The BR04 may be

Vendor ID 0x10DE
Device ID 0x05BE

when I compare it to the list here (NF200 entries).

http://pciids.sourceforge.net/pci.ids

So I'm guessing right now that it isn't a problem with a mis-identified
NF200. More likely, it's something with the GPU, or the wiring between the
GPU and the switch chip.
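
To check the numeric IDs rather than the names, `lspci -nn` prints a [vendor:device] pair on each line. A minimal sketch for pulling that pair out follows. The function name is invented, and the sample line is a made-up illustration of the `-nn` format using the ID guessed above, not output from the original poster's machine.

```python
import re

# `lspci -nn` appends "[vvvv:dddd]" (hex vendor and device IDs) to each line.
# The class code also appears in brackets (e.g. "[0604]") but has no colon,
# so it is not matched by this pattern.
ID_PAIR = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", re.IGNORECASE)

def pci_ids(lspci_nn_line):
    """Return (vendor_id, device_id) from one `lspci -nn` line, or None."""
    match = ID_PAIR.search(lspci_nn_line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)

# Hypothetical sample line, assuming the switch reports the guessed ID.
SAMPLE = ("02:00.0 PCI bridge [0604]: nVidia Corporation NF200 PCIe 2.0 "
          "switch [10de:05be] (rev a2)")

NVIDIA_VENDOR = 0x10DE
GUESSED_BR04_DEVICE = 0x05BE  # the guess from the pci.ids comparison above

print(pci_ids(SAMPLE) == (NVIDIA_VENDOR, GUESSED_BR04_DEVICE))
```

Running this over the real `lspci -nn` output would show whether the switch actually carries the guessed ID.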

Those lspci entries should originate from simple bus probes to config space.
I don't know if something like missing VESA ROM info would make an entry
disappear, or whether what you're seeing is a failure to get an ACK of any
sort back from one of the GPUs. It could be as simple as one of the
surface mount coupling caps having snapped off the capacitively coupled bus
lanes. If the wrong lane is affected, that can prevent detection. While PCI
Express links can dynamically resize (thus avoiding some lane failures), I
don't think they can work around any arbitrary lane failing. Some lanes are
more important than others (like, say, lane zero).
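
To illustrate what such a probe boils down to: the first four bytes of a device's config space hold the vendor ID and device ID, both little-endian 16-bit values, and a vendor ID read of 0xFFFF conventionally means nothing answered at that address. The sketch below only decodes bytes. The function name is hypothetical, and the sample bytes are constructed by hand from the IDs guessed above rather than read from real hardware.

```python
import struct

NO_DEVICE = 0xFFFF  # vendor ID value seen when no device responds

def decode_ids(config_bytes):
    """Decode vendor and device ID from the start of PCI config space.

    Returns (vendor_id, device_id), or None when the vendor ID reads as
    0xFFFF, meaning no device answered the probe.
    """
    vendor, device = struct.unpack_from("<HH", config_bytes, 0)
    if vendor == NO_DEVICE:
        return None
    return vendor, device

# Hand-built examples: a responding device, and an empty slot (all ones).
# On Linux, real bytes for a detected device could be read from the
# /sys/bus/pci/devices/<address>/config file; a device that never answers
# gets no entry there at all, which matches the missing lspci line.
responding = bytes.fromhex("de10be05")
silent = b"\xff\xff\xff\xff"

print(decode_ids(responding))
print(decode_ids(silent))
```

This cannot show why a GPU stays silent, only what a silent probe looks like from the software side.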

I suppose it could also be something like a power converter failure
next to the GPU, denying power to the core of the GPU, and preventing
it from answering probes.

Just a non-expert guess,

Paul

Rene
October 26th 11, 02:08 PM
Hi Paul,

What an extensive answer. I notice that you often invest a lot of time in
helping people in this ng, which is great.

I have a question and also something to say. Were you the guy I had a
discussion with, several years ago by now I think, about the cooling of
an AGP->PCI-E bridge chip? If so, I'd like to say that you were right,
even more so when it came to the things that had nothing to do with that
subject but were about me ;-). (If you think "maybe I was, but I
have a hard time recalling it", then don't worry, it will not have
been you. I am quite sure the person I have in mind would remember
immediately, and judging by your style of writing you could very well
have been the one.)

Apologies and Thanks!

If you weren't that guy, the "Thanks" part remains for your efforts to
help people with video card troubles.

Yours sincerely,
Rene

Paul
October 26th 11, 02:52 PM
I'm more like a USENET Cheerleader than anything else :-)

Paul

Rene
October 26th 11, 03:50 PM
On 10/26/2011 03:52 PM, Paul wrote:
> I'm more like a USENET Cheerleader than anything else :-)
>
> Paul


Cheerleaders can bring light into the densest darkness. And wasn't it
Paul himself who wrote "all things that are in the dark, will be
brought into the light"? ;-)

Thanks again! I hope you are doing fine and wish you the best of luck, though
I am sure you already meet a lot of luck in your daily life.
Nevertheless, I am also sure that my wishing it will make it even more
prominent.

Sincerely,
Rene

P.S. Sorry to other subscribers for my OT talking and to OP for
hijacking the thread, I'll stop now.