I have been trying to run qunex_container (via Singularity) on Ubuntu 20.04 with CUDA 10.2. All the HCP preprocessing steps ran fine, but the script failed when I ran bedpostx_gpu.
I suspect it is calling the bedpostx_gpu build that was compiled with CUDA 9.1. My reason is that when I call the CUDA 10.2 build of bedpostx_gpu directly from the terminal (/usr/local/fsl/pkgs/fsl-fdt-cuda-10.2-2202.5-h2bc3f7f_1/src/fsl-fdt-cuda-10.2/CUDA/bedpostx_gpu), it runs fine.
So my question is: is there any way to change which version of bedpostx_gpu QuNex calls?
Thanks so much for the swift response, as always! Unfortunately, it still failed after I installed CUDA 10.1 and ran bedpostx_gpu with the following command:
--- Full QuNex call for command: dwi_bedpostx_gpu
. /opt/qunex/bash/qx_utilities/dwi_bedpostx_gpu.sh --sessionsfolder='/home/ehui/qunex/cimt/sessions' --session='V3' --fibers='' --weight='' --burnin='' --jumps='' --sample='' --model='' --rician='' --gradnonlin='' --overwrite='yes' --species=''
--------------------------------------------------------------
--------------------------------------------------------------
Running dwi_bedpostx_gpu locally on ehuicompute
Command log: /home/ehui/qunex/cimt/processing/logs/runlogs/Log-dwi_bedpostx_gpu_2022-12-22_17.16.42.044673.log
Command output: /home/ehui/qunex/cimt/processing/logs/comlogs/tmp_dwi_bedpostx_gpu_V3_2022-12-22_17.16.42.044673.log
--------------------------------------------------------------
------------------------- Start of work --------------------------------
Note: The fibers parameter is not set, using default [3]
Note: The weight parameter is not set, using default [1]
Note: The burnin parameter is not set, using default [1000]
Note: The jumps parameter is not set, using default [1250]
Note: The sample parameter is not set, using default [25]
Note: The model parameter is not set, using default [2]
Note: The rician parameter is not set, using default [yes]
--> Executing qunex.sh dwi_bedpostx_gpu:
Study folder: /home/ehui/qunex/cimt
Sessions Folder: /home/ehui/qunex/cimt/sessions
Session: V3
Number of fibers: 3
ARD weights: 1
Burnin period: 1000
Number of jumps: 1250
Sample every: 25
Model type: 2
Rician flag: yes
Overwrite prior run: yes
--> Removing existing bedpostx run for V3...
--> Checking if bedpostx was completed on V3...
--> Prior bedpostx run not found or incomplete for V3. Setting up new run...
--> Generating log folder
--> Not using gradient nonlinearities flag -g
--> Running FSL command:
/opt/qunex/bash/qx_utilities/diffusion_tractography_dense/fsl_gpu/bedpostx_gpu /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion/. /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/. -n 3 -w 1 -b 1000 -j 1250 -s 25 -model 2 --rician
---------------------------------------------
------------ BedpostX GPU Version -----------
---------------------------------------------
subjectdir is /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion
bedpostxdir is /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX
-- Making bedpostx directory structure
-- Copying files to bedpostx directory
-- Pre-processing stage
-- Queuing parallel processing stage
----- Bedpostx Monitor -----
...................Allocated GPU 0...................
Log directory is: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts/data_part_0000
Number of Voxels to compute in this part: 381214
Number of Directions: 199
Rician noise model requested. Non-linear parameter initialization will be performed, overriding other initialization options!
SubPart 1 of 30: processing 12800 voxels
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/commands.txt: line 1: 64723 Aborted (core dumped) /opt/qunex/qx_library/etc/fsl_gpu_binaries/bedpostx_gpu_cuda_10.1/bedpostx_gpu/bin/xfibres_gpu --data=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/data_0 --mask=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/nodif_brain_mask -b /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/bvals -r /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/bvecs --forcedir --logdir=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts/data_part_0000 --nf=3 --fudge=1 --bi=1000 --nj=1250 --se=25 --model=2 --cnonlinear --rician /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion 0 1 381214
-- Queuing post processing stage
-- Merging parts
Log directory is: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts
/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/nodif_brain_mask
-- Removing intermediate files
-- Creating identity xfm
-- Finished bedpostx_gpu
--> Checking outputs...
--> 9 merged samples for V3 found.
--> bedpostx outputs missing or incomplete for V3
----------------------------------------------------
--> bedpostx run not found or incomplete for V3. Something went wrong.
Check output: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX
ERROR: bedpostx run did not complete successfully
All parts processed
===> ERROR during dwi_bedpostx_gpu. Check final QuNex error log output:
/home/ehui/qunex/cimt/processing/logs/comlogs/error_dwi_bedpostx_gpu_V3_2022-12-22_17.16.42.044673.log
QuNex FAILED!
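
In case it helps with diagnosis, here is a minimal Python sketch that pulls the CUDA error name and the path of the crashing binary out of a command output log like the one above. It is my own helper, not part of QuNex or FSL: the function name `find_cuda_failure` and both regular expressions are assumptions based only on the log format shown in this post, and the sample text is abridged from that log. As far as I understand, `cudaErrorNoKernelImageForDevice` means the binary contains no code built for the GPU's compute capability, so knowing exactly which binary was invoked seems like the useful part.

```python
import re


def find_cuda_failure(log_text):
    """Return (cuda_error_name, crashing_binary_path) found in a log.

    Either element is None when the corresponding pattern is absent.
    The patterns are guesses based on the log in this post.
    """
    # CUDA runtime errors are reported by name, e.g. cudaErrorNoKernelImageForDevice
    error = re.search(r"\b(cudaError\w+)", log_text)
    # The shell reports the crash as "... Aborted (core dumped) <binary> <args...>"
    binary = re.search(r"Aborted\s+\(core dumped\)\s+(\S+)", log_text)
    return (
        error.group(1) if error else None,
        binary.group(1) if binary else None,
    )


# Two lines abridged from the log above (trailing arguments of the second line dropped).
SAMPLE_LOG = (
    "what(): parallel_for failed: cudaErrorNoKernelImageForDevice: "
    "no kernel image is available for execution on the device\n"
    "/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/commands.txt: "
    "line 1: 64723 Aborted (core dumped) "
    "/opt/qunex/qx_library/etc/fsl_gpu_binaries/bedpostx_gpu_cuda_10.1/bedpostx_gpu/bin/xfibres_gpu "
    "--data=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/data_0\n"
)

cuda_error, crashing_binary = find_cuda_failure(SAMPLE_LOG)
print("CUDA error:", cuda_error)
print("Binary:", crashing_binary)
```

On the log above this reports the xfibres_gpu binary under bedpostx_gpu_cuda_10.1, so the run did pick up the CUDA 10.1 build and not the 10.2 build that works for me from the terminal.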