[RESOLVED] Running qunex on ubuntu 20.04 LTS with CUDA 10.2

Dear Jure and QuNex experts:

I have been trying to run qunex_container (via Singularity) on Ubuntu 20.04 with CUDA 10.2. All the HCP preprocessing steps ran fine, but when I ran bedpostx_gpu, the script failed.

I suspect QuNex is calling a bedpostx_gpu build that was compiled with CUDA 9.1. The reason is that when I call the bedpostx_gpu build compiled with CUDA 10.2 directly in a terminal (via /usr/local/fsl/pkgs/fsl-fdt-cuda-10.2-2202.5-h2bc3f7f_1/src/fsl-fdt-cuda-10.2/CUDA/bedpostx_gpu), it runs fine.

So my question is: is there any way to change which version of bedpostx_gpu QuNex calls?

Many thanks as always!


Hi Ed,

Another of our users experienced something similar. Before I dig into it, can you check whether any of the solutions provided in [RESOLVED] CUDA error with dwi_bedpost_gpu - #5 by snason-tomaszewski help in your case?



Hi Jure:

Thanks so much for the swift response as always! Unfortunately, it didn't work after I installed CUDA 10.1 and tried to run bedpostx_gpu with the following command:

qunex_container dwi_bedpostx_gpu \
  --sessionsfolder="${WORK_DIR}/${STUDY_NAME}/sessions" \
  --sessions="${SESSIONS}" \
  --overwrite="yes" \
  --container="${QUNEX_CONTAINER}" \
  --bash_pre="module load CUDA/10.1" \
  --bash_post="export DEFAULT_CUDA_VERSION=10.1" \
  --bind="/usr/local/cuda-10.1/:/usr/local/cuda/"

Below is the error I got:

--- Full QuNex call for command: dwi_bedpostx_gpu

. /opt/qunex/bash/qx_utilities/dwi_bedpostx_gpu.sh     --sessionsfolder='/home/ehui/qunex/cimt/sessions'     --session='V3'     --fibers=''     --weight=''     --burnin=''     --jumps=''     --sample=''     --model=''     --rician=''     --gradnonlin=''     --overwrite='yes'     --species=''



   Running dwi_bedpostx_gpu locally on ehuicompute
   Command log:     /home/ehui/qunex/cimt/processing/logs/runlogs/Log-dwi_bedpostx_gpu_2022-12-22_17.16.42.044673.log
   Command output: /home/ehui/qunex/cimt/processing/logs/comlogs/tmp_dwi_bedpostx_gpu_V3_2022-12-22_17.16.42.044673.log


 ------------------------- Start of work --------------------------------
 Note: The fibers parameter is not set, using default [3]
 Note: The weight parameter is not set, using default [1]
 Note: The burnin parameter is not set, using default [1000]
 Note: The jumps parameter is not set, using default [1250]
 Note: The sample parameter is not set, using default [25]
 Note: The model parameter is not set, using default [2]
 Note: The rician parameter is not set, using default [yes]

 --> Executing qunex.sh dwi_bedpostx_gpu:
     Study folder: /home/ehui/qunex/cimt
     Sessions Folder: /home/ehui/qunex/cimt/sessions
     Session: V3
     Number of fibers: 3
     ARD weights: 1
     Burnin period: 1000
     Number of jumps: 1250
     Sample every: 25
     Model type: 2
     Rician flag: yes
     Overwrite prior run: yes

 --> Removing existing bedpostx run for V3...

 --> Checking if bedpostx was completed on V3...

 --> Prior bedpostx run not found or incomplete for V3. Setting up new run...

 --> Generating log folder

 --> Not using gradient nonlinearities flag -g

 --> Running FSL command:
    /opt/qunex/bash/qx_utilities/diffusion_tractography_dense/fsl_gpu/bedpostx_gpu /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion/. /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/. -n 3 -w 1 -b 1000 -j 1250 -s 25 -model 2 --rician
------------ BedpostX GPU Version -----------

subjectdir is /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion

bedpostxdir is /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX

-- Making bedpostx directory structure

-- Copying files to bedpostx directory

-- Pre-processing stage

-- Queuing parallel processing stage

----- Bedpostx Monitor -----

...................Allocated GPU 0...................
Log directory is: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts/data_part_0000
Number of Voxels to compute in this part: 381214
Number of Directions: 199
Rician noise model requested. Non-linear parameter initialization will be performed, overriding other initialization options!

SubPart 1 of 30: processing 12800 voxels
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/commands.txt: line 1: 64723 Aborted                 (core dumped) /opt/qunex/qx_library/etc/fsl_gpu_binaries/bedpostx_gpu_cuda_10.1/bedpostx_gpu/bin/xfibres_gpu --data=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/data_0 --mask=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/nodif_brain_mask -b /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/bvals -r /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/bvecs --forcedir --logdir=/home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts/data_part_0000 --nf=3 --fudge=1 --bi=1000 --nj=1250 --se=25 --model=2 --cnonlinear --rician /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion 0 1 381214

-- Queuing post processing stage

-- Merging parts

Log directory is: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX/diff_parts

-- Removing intermediate files

-- Creating identity xfm

-- Finished bedpostx_gpu

 --> Checking outputs...

 --> 9 merged samples for V3 found.

 --> bedpostx outputs missing or incomplete for V3


 --> bedpostx run not found or incomplete for V3. Something went wrong.
     Check output: /home/ehui/qunex/cimt/sessions/V3/hcp/V3/T1w/Diffusion.bedpostX

 ERROR: bedpostx run did not complete successfully

All parts processed

 ===> ERROR during dwi_bedpostx_gpu. Check final QuNex error log output:



Please kindly advise.

Thanks so much!
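
A note on reading the log above: the aborted command on the xfibres_gpu line names the binary that was actually executed, and its folder name carries the CUDA version it was built for (here bedpostx_gpu_cuda_10.1). The sketch below pulls that version out of such a path. The helper name cuda_build_from_path is made up for illustration, and the folder naming is assumed from this one log rather than taken from the QuNex documentation.

```shell
# Hypothetical helper: print the CUDA version tag embedded in a QuNex
# bedpostx_gpu binary path. Assumes the layout seen in the log above:
#   .../fsl_gpu_binaries/bedpostx_gpu_cuda_<version>/...
# Prints nothing if the path does not follow that layout.
cuda_build_from_path() {
  printf '%s\n' "$1" | sed -n 's|.*bedpostx_gpu_cuda_\([0-9][0-9.]*\)/.*|\1|p'
}

# Path copied from the "Aborted (core dumped)" line of the log.
cuda_build_from_path "/opt/qunex/qx_library/etc/fsl_gpu_binaries/bedpostx_gpu_cuda_10.1/bedpostx_gpu/bin/xfibres_gpu"
```

Run on the path from this log, it prints 10.1, which confirms that the CUDA 10.1 build was the one that crashed rather than a 9.1 build.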

Hi Jure:

I just realized there is actually a version for CUDA 11, and it worked!

Thanks so much;)

Great. Glad it worked!
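
For later readers: cudaErrorNoKernelImageForDevice generally means the binary contains no compiled kernel for the GPU's compute capability. CUDA 10.x toolkits cannot produce native code for compute capability 8.0 and above, so on such cards only a build made with CUDA 11 or later can run, which would be consistent with the fix above. The GPU used in this thread was never stated, so treat that as an assumption. The helper name needs_cuda11_build below is hypothetical, and the example value 8.6 is only an illustration.

```shell
# Hypothetical helper: succeeds when the given compute capability (e.g. "8.6")
# is 8.0 or higher, which CUDA 10.x toolkits cannot compile native code for.
needs_cuda11_build() {
  major="${1%%.*}"          # keep only the part before the first dot
  [ "$major" -ge 8 ]
}

# On a machine with an NVIDIA driver, the capability can usually be read with:
#   nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
# (the compute_cap field is only available in reasonably recent drivers).
if needs_cuda11_build "8.6"; then
  echo "use a CUDA 11 build"
else
  echo "a CUDA 10.x build may work"
fi
```

This only covers the toolkit side of the mismatch. The host driver still has to support whichever CUDA version the chosen build targets.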