ATI GPU errors on LINUX

Rudy Toody

Diamond Member
Sep 30, 2006
4,267
421
126
I get errors on 3 projects. Any ideas?

PrimeGrid: pps_sr2sieve_51209917
Code:
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 106 (0x6a, -150)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: userid
Skipping: 31263
Skipping: /userid
Unrecognized XML in parse_init_data_file: teamid
Skipping: 132
Skipping: /teamid
Unrecognized XML in parse_init_data_file: hostid
Skipping: 286540
Skipping: /hostid
Unrecognized XML in parse_init_data_file: result_name
Skipping: pps_sr2sieve_51209917_1
Skipping: /result_name
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 0.000000
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: using_sandbox
Skipping: 0
Skipping: /using_sandbox
Unrecognized XML in parse_init_data_file: gpu_type
Skipping: ATI
Skipping: /gpu_type
Unrecognized XML in parse_init_data_file: gpu_device_num
Skipping: 0
Skipping: /gpu_device_num
Unrecognized XML in parse_init_data_file: gpu_opencl_dev_index
Skipping: 0
Skipping: /gpu_opencl_dev_index
Unrecognized XML in parse_init_data_file: ncpus
Skipping: 0.067801
Skipping: /ncpus
Sieve started: 192027270000000000 <= p < 192027279000000000
Thread 0 starting
No protocol specified
Error: Creating Context. (clCreateContextFromType): Device not found.
called boinc_finish

</stderr_txt>
]]>

MilkyWay@Home: ps_separation_09_2s_sample_2_1341007502_45354731
Code:
Stderr output

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.02 Linux x86_64 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 10
Skipping: /max_gfx_cpu_pct
Unrecognized XML in project preferences: nbody_graphics_poll_period
Skipping: 30
Skipping: /nbody_graphics_poll_period
Unrecognized XML in project preferences: nbody_graphics_float_speed
Skipping: 5
Skipping: /nbody_graphics_float_speed
Unrecognized XML in project preferences: nbody_graphics_textured_point_size
Skipping: 250
Skipping: /nbody_graphics_textured_point_size
Unrecognized XML in project preferences: nbody_graphics_point_point_size
Skipping: 40
Skipping: /nbody_graphics_point_point_size
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Error reading astronomy parameters from file 'astronomy_parameters.txt'
  Trying old parameters file
Using SSE3 path
No protocol specified
Found 1 platform
Platform 0 information:
  Name:       AMD Accelerated Parallel Processing
  Version:    OpenCL 1.2 AMD-APP (923.1)
  Vendor:     Advanced Micro Devices, Inc.
  Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
  Profile:    FULL_PROFILE
Using device 0 on platform 0
Failed to find number of devices (-1): CL_DEVICE_NOT_FOUND
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
<background_integral> nan </background_integral>
<stream_integral>  nan  nan </stream_integral>
<background_likelihood> nan </background_likelihood>
<stream_only_likelihood>  nan  nan </stream_only_likelihood>
<search_likelihood> nan </search_likelihood>
21:10:39 (4136): called boinc_finish

</stderr_txt>
]]>
POEM@Home: poempp_gpucrystal_1347996218_1665838544_0
Code:
Stderr output

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified

</stderr_txt>
]]>
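
To see past the noise in logs like the three above, a small filter can help. This is only a sketch: `summarize_stderr` is a hypothetical helper name, and it assumes the stderr text has been saved to a file first. It drops the `Unrecognized XML` and `Skipping:` lines, then counts each remaining distinct line so that long runs of one message collapse into a single entry.

```shell
#!/bin/sh
# summarize_stderr FILE  (hypothetical helper, not part of BOINC)
# Drops the "Unrecognized XML" / "Skipping:" lines and blank lines,
# then prints each remaining distinct line with its count, most frequent first.
summarize_stderr() {
    grep -v -e '^Unrecognized XML' -e '^Skipping:' -e '^[[:space:]]*$' "$1" |
        awk '{ n[$0]++ } END { for (l in n) printf "%d\t%s\n", n[l], l }' |
        sort -rn
}

# Example (the file name is an assumption):
#   summarize_stderr poem_stderr.txt
```

Applied to the POEM log, the whole run of `No protocol specified` would show up as one counted line, which makes it easier to compare the three projects side by side.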
 

biodoc

Diamond Member
Dec 29, 2005
6,350
2,243
136
Are you using a cc_config.xml file?

Boinc version 7.0.27 should be ok.

Did the driver installation file come from AMD or from a repository?

Latest driver: http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx

Release notes (instructions): http://www2.ati.com/relnotes/Catalyst_11.10_Linux_Installer.pdf

I'm just shooting in the dark here, obviously, but since I saw an odd reference to ncpus, I figured it's worth taking a look at your cc_config file. Since you're failing on all three projects, it seems to me it could be a problem with the driver source or installation.

In the release notes, there is a list of libraries required for installation.


Sorry I can't be of more help.
 

Rudy Toody

I'm thinking it's a missing link to the libraries.

The first clue is the No Protocol message.
The second is from Mathematica, where one command fails for Linux. It has to do with the path to the library. I have tried to finesse it by setting the environment variable to the proper value, but it gets replaced with an invalid path. I may have to resort to a debugger to find out where that happens.
Code:
Environment["ATISTREAMSDKROOT"]

$Failed
The driver and the APP SDK (formerly known as Stream) both came from AMD.

Code:
<cc_config>
<options>

<report_results_immediately>1</report_results_immediately>
<use_all_gpus>1</use_all_gpus>

</options>
</cc_config>
 

biodoc

Another option would be to create a file in /etc/ld.so.conf.d called boinc_gpu.conf or something like that. Put the path(s) to the libraries in that file, one per line, and then run sudo ldconfig.
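
A minimal sketch of that step, with some assumptions spelled out: `add_lib_path` is a hypothetical helper name, and the library directory in the usage comment is only an example, so substitute wherever the driver or SDK actually put its .so files. On Debian, /etc/ld.so.conf normally includes only /etc/ld.so.conf.d/*.conf, so the helper appends the .conf suffix.

```shell
#!/bin/sh
# add_lib_path CONF_DIR NAME LIB_DIR  (hypothetical helper)
# Writes LIB_DIR into CONF_DIR/NAME.conf so the dynamic linker can pick it up.
add_lib_path() {
    conf_dir=$1
    name=$2
    lib_dir=$3
    printf '%s\n' "$lib_dir" > "$conf_dir/$name.conf"
}

# Real use needs root. The library path below is an assumption:
#   add_lib_path /etc/ld.so.conf.d boinc_gpu /opt/AMDAPP/lib/x86_64
#   ldconfig                          # rebuild the linker cache
#   ldconfig -p | grep -i opencl      # check that libOpenCL is now listed
```

Taking the target directory as a parameter keeps the helper easy to try in a scratch directory before touching /etc.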

Good luck, RT.
 

Rudy Toody

Ken_g6: I got your example from PrimeGrid to run the first time.

However, I was in root when I did it. When I tried as fredk, it ran but did not display anything.

So, your observation that it was a permissions issue is correct. My GPU didn't show up on any BOINC projects until I set boinc user="root".

The apps from PrimeGrid have errors caused by bad parsing of the .xml.

What else can I change for boinc-client to get the right permissions? Should I add boinc to the root group?

The other thing I can try is to re-install Catalyst, making sure I am logged in as root.

Code:
root@blueheron2:/home/fredk/Downloads# ./ppsieve-cl-boinc-x86_64-linux -p42070e9 -P42070010e6 -k 1201 -K 9999 -N 2000000 -c 60
ppsieve version cl-0.2.3e (testing) Compiled Feb 26 2011 with GCC 4.3.3
nstart=76, nstep=32
ppsieve initialized: 1201 <= k <= 9999, 76 <= n < 2000000
CL setup complete. cthread_count = 18432
42070000070587 | 9475*2^197534+1
42070000198537 | 3373*2^1046686+1
42070003101727 | 4207*2^1054290+1
42070003511309 | 6057*2^1043547+1
42070006307657 | 1513*2^1771812+1
42070006388603 | 2059*2^1816098+1
42070007177519 | 5437*2^1121592+1
42070007396759 | 7339*2^1803518+1
42070008823897 | 4639*2^952018+1
42070008858187 | 2893*2^317690+1
Found 10 factors
root@blueheron2:/home/fredk/Downloads#

I tried the Mathematica test that was erroring out by starting Mathematica as root and it didn't generate any errors. The fractal didn't display, though. So, next is the Catalyst re-install.
 

biodoc

I've never run boinc as a service so I can't help you out with permissions, etc. All I do is download the shell script into my home directory for the most recent linux-64 version here: http://boinc.berkeley.edu/download_all.php

I run the shell script and this creates a BOINC directory in my home directory. I cd into that directory and type ./run_manager.

The advantage of the above is that all boinc related files are in one directory. The disadvantage is you have to start boinc manually.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,836
4,815
75
I've only ever installed BOINC from the Linux repositories - at least in the past five years or so. When I want to upgrade I simply overwrite the boinc executables. The advantage of this is that boinc gets its own user (boinc) who runs stuff in /var/lib/boinc-client. I've never had GPU problems doing things that way - but I've never used an AMD card.
 

Rudy Toody

I am using the Debian testing version (named "Wheezy"), which has BOINC 7.0.27 and behaves as you describe above. However, the ATI GPU is not recognized by the BOINC projects until we change the BOINC user from "boinc" to "root".

POEM, MilkyWay, and PrimeGrid will now send work units. However, each project has errors because some XML is not parsed properly. This causes device and platform info to be dropped.

I was able to run the PrimeGrid test sample (stand-alone) as root, but not as a non-root user. I can do the same with the Mathematica examples.

I just re-installed Wheezy to start fresh. I'm going to experiment with using a common group for boinc, Catalyst, and AMDAPP (formerly Stream SDK). Before I do that, I'm going to ask a few questions on StackExchange.
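
A rough sketch of how the common-group experiment could be checked from a shell. `in_group` is a hypothetical helper name, and the group name `video` in the usage comment is an assumption, not something confirmed in this thread; use whichever group actually owns the GPU device nodes on the system.

```shell
#!/bin/sh
# in_group USER GROUP  (hypothetical helper)
# Returns 0 if USER belongs to GROUP, 1 if not, 2 if USER cannot be looked up.
in_group() {
    groups_of=$(id -nG "$1" 2>/dev/null) || return 2
    case " $groups_of " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (the group name "video" is an assumption):
#   in_group boinc video || sudo usermod -a -G video boinc
# A running boinc-client would need a restart to pick up the new group.
```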
 

biodoc

It seems like it's a permissions issue but you can test to see if any libraries are missing by running ldd /var/lib/boinc/project/projectname/binary name for each of the GPU project apps.
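
As a sketch of that check: `missing_libs` is a hypothetical helper name that reads `ldd` output on standard input and prints only the libraries the linker could not resolve. The binary path in the usage comment is a placeholder, not a real location.

```shell
#!/bin/sh
# missing_libs  (hypothetical helper)
# Reads `ldd` output on stdin and prints the name of every library
# that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example (replace the placeholder with a real GPU app binary):
#   ldd /path/to/project/app_binary | missing_libs
```

Empty output means ldd resolved every library for the user who ran it, so running it once as root and once as the boinc user may show whether the two see different library paths.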
 

Rudy Toody

It's running as a stand-alone in a root terminal. It doesn't work in a regular terminal.

I have accumulated over 50K on MilkyWay in the first 6 hours!

POEM reserved one core and Proth Prime Sieve slowed everything down too much. However, they all work under root.
 

biodoc

That's great RT! :cool::thumbsup:
 

zzuupp

Lifer
Jul 6, 2008
14,866
2,319
126
nice! at that rate I'm toast in 600 hours