biodoc
Link to Kernel Dump:
I'm not sure if it's a driver issue or a hardware issue. Hopefully @StefanR5R can help diagnose.
The log first shows:
NVRM: GPU at 0000:02:00.0 has fallen off the bus.
Then there are issues with the other GPU, and finally:
NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
Code:
[ 932.512994] NVRM: Xid (PCI:0000:02:00): 79, GPU has fallen off the bus.
[ 932.512995] NVRM: GPU at 0000:02:00.0 has fallen off the bus.
[ 932.512996] NVRM: GPU is on Board .
[ 932.512998] pcieport 0000:00:03.0: device [8086:2f08] error status/mask=00004020/00000000
[ 932.513001] pcieport 0000:00:03.0: [ 5] Surprise Down Error
[ 932.513003] pcieport 0000:00:03.0: [14] Completion Timeout (First)
[ 932.513005] pcieport 0000:00:03.0: broadcast error_detected message
[ 932.513007] pcieport 0000:00:03.0: AER: Device recovery failed
[ 932.513008] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: id=0018
[ 932.513012] pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0018(Requester ID)
[ 932.513015] pcieport 0000:00:03.0: device [8086:2f08] error status/mask=00004020/00000000
[ 932.513017] pcieport 0000:00:03.0: [ 5] Surprise Down Error
[ 932.513019] pcieport 0000:00:03.0: [14] Completion Timeout (First)
[ 932.513023] pcieport 0000:00:03.0: broadcast error_detected message
[ 932.513024] pcieport 0000:00:03.0: AER: Device recovery failed
[ 932.513025] pcieport 0000:00:03.0: AER: Multiple Uncorrected (Fatal) error received: id=0018
[ 932.513030] pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0018(Requester ID)
[ 932.513032] pcieport 0000:00:03.0: device [8086:2f08] error status/mask=00004020/00000000
[ 932.513034] pcieport 0000:00:03.0: [ 5] Surprise Down Error
[ 932.513036] pcieport 0000:00:03.0: [14] Completion Timeout (First)
[ 932.513038] pcieport 0000:00:03.0: broadcast error_detected message
[ 932.513040] nvidia 0000:02:00.0: device has no AER-aware driver
[ 932.513041] snd_hda_intel 0000:02:00.1: device has no AER-aware driver
[ 932.513206] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
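In case it helps anyone reading the dump, here is a rough Python sketch that pulls the Xid events and decodes the AER `error status/mask` value from lines like the ones above. The bit names are my transcription of the PCIe AER Uncorrectable Error Status register layout (the log itself only confirms bits 5 and 14), and the helper names `decode_aer_status` and `summarize` are made up for this post, not part of any NVIDIA or kernel tool.

```python
import re

# Bit positions in the PCIe AER Uncorrectable Error Status register.
# Assumption: transcribed from the PCIe spec; only bits 5 and 14 are
# confirmed by the log excerpt in this thread.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}

STATUS_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+)")


def decode_aer_status(status_hex, mask_hex="00000000"):
    """Return (bit, name) pairs for every set, unmasked status bit."""
    status = int(status_hex, 16)
    mask = int(mask_hex, 16)
    return [
        (bit, AER_UNCORRECTABLE_BITS.get(bit, "unknown/reserved"))
        for bit in range(32)
        if status & (1 << bit) and not mask & (1 << bit)
    ]


def summarize(log_text):
    """Collect Xid events and the distinct AER status/mask pairs in a log."""
    xids = [(dev, int(code)) for dev, code in XID_RE.findall(log_text)]
    statuses = sorted(set(STATUS_RE.findall(log_text)))
    return xids, statuses


# Two lines copied from the excerpt above.
sample = (
    "[ 932.512994] NVRM: Xid (PCI:0000:02:00): 79, GPU has fallen off the bus.\n"
    "[ 932.512998] pcieport 0000:00:03.0: device [8086:2f08] "
    "error status/mask=00004020/00000000\n"
)

xids, statuses = summarize(sample)
for dev, code in xids:
    print(f"Xid {code} on {dev}")
for status, mask in statuses:
    for bit, name in decode_aer_status(status, mask):
        print(f"status {status}: bit {bit} = {name}")
```

For the value in this log, `00004020` has bits 5 and 14 set, which lines up with the "Surprise Down Error" and "Completion Timeout" lines the kernel printed for the root port at 0000:00:03.0. That only confirms the link to the GPU dropped unexpectedly; by itself it does not say whether the driver or the hardware is to blame.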