Description:
Dear QuNex team,
When I run the dwi_legacy_gpu command on a GPU-enabled AWS instance, eddy_cuda9.1 fails with an exception thrown in EDDY::EddyCudaHelperFunctions::InitGpu(bool). Could you advise how to debug this further? Thanks in advance.
Call:
Docker command used to start the QuNex container in interactive mode:
docker run --runtime=nvidia --gpus all -v "/test-data":"/data/" -v "/output":"/data/output" -it "gitlab.qunex.yale.edu:5002/qunex/qunexcontainer:0.96.2"
The dwi_legacy_gpu command and parameters are as follows; the DWI data has no field map.
qunex dwi_legacy_gpu \
--sessionsfolder='/data/output/sessions' \
--sessions='10171' \
--diffdatasuffix='DWI_dir64_PA' \
--usefieldmap='no' \
--pedir=2 \
--echospacing='0.69' \
--unwarpdir='-y' \
--scanner='siemens' \
--overwrite='yes' \
--nv
Logs:
nvidia-smi output from the host machine:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   31C    P8    30W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
QuNex logs:
→ unsetting the following environment variables: PATH MATLABPATH PYTHONPATH QUNEXVer TOOLS QUNEXREPO QUNEXPATH QUNEXLIBRARY QUNEXLIBRARYETC TemplateFolder FSL_FIXDIR FREESURFERDIR FREESURFER_HOME FREESURFER_SCHEDULER FreeSurferSchedulerDIR WORKBENCHDIR DCMNIIDIR DICMNIIDIR MATLABDIR MATLABBINDIR OCTAVEDIR OCTAVEPKGDIR OCTAVEBINDIR RDIR HCPWBDIR AFNIDIR ANTSDIR PYLIBDIR FSLDIR FSLGPUDIR PALMDIR QUNEXMCOMMAND HCPPIPEDIR CARET7DIR GRADUNWARPDIR HCPPIPEDIR_Templates HCPPIPEDIR_Bin HCPPIPEDIR_Config HCPPIPEDIR_PreFS HCPPIPEDIR_FS HCPPIPEDIR_PostFS HCPPIPEDIR_fMRISurf HCPPIPEDIR_fMRIVol HCPPIPEDIR_tfMRI HCPPIPEDIR_dMRI HCPPIPEDIR_dMRITract HCPPIPEDIR_Global HCPPIPEDIR_tfMRIAnalysis HCPCIFTIRWDIR MSMBin HCPPIPEDIR_dMRITractFull HCPPIPEDIR_dMRILegacy AutoPtxFolder FSL_GPU_SCRIPTS FSLGPUBinary EDDYCUDADIR USEOCTAVE QUNEXENV CONDADIR MSMBINDIR MSMCONFIGDIR R_LIBS FSL_FIX_CIFTIRW FSFAST_HOME SUBJECTS_DIR MINC_BIN_DIR MNI_DIR MINC_LIB_DIR MNI_DATAPATH FSF_OUTPUT_FORMAT
Generated by QuNex
Version: 0.96.2
User: root
System: d677e2409d35
OS: RedHat Linux #1 SMP Sun Nov 27 06:09:45 UTC 2022
██████\ ║ ██\ ██\
██ __██\ ║ ███\ ██ |
██ / ██ |██\ ██\ ║ ████\ ██ | ██████\ ██\ ██\
██ | ██ |██ | ██ | ║ ██ ██\██ |██ __██\\██\ ██ |
██ | ██ |██ | ██ | ║ ██ \████ |████████ |\████ /
██ ██\██ |██ | ██ | ║ ██ |\███ |██ ____|██ ██\
\██████ / \██████ | ║ ██ | \██ |\███████\██ /\██\
\___███\ \______/ ║ \__| \__| \_______\__/ \__|
\___| ║
DEVELOPED & MAINTAINED BY:
Anticevic Lab, Yale University
Mind & Brain Lab, University of Ljubljana
Murray Lab, Yale University
COPYRIGHT & LICENSE NOTICE:
Use of this software is subject to the terms and conditions defined in
‘LICENSES’ which is a part of the QuNex Suite source code package:
—> Setting up Octave
(/opt/env/qunex) [QuNex qunex]$
(/opt/env/qunex) [QuNex qunex]$ qunex dwi_legacy_gpu \
--sessionsfolder='/data//sessions' \
--sessions='10171' \
--diffdatasuffix='DWI_dir64_PA' \
--usefieldmap='no' \
--pedir=2 \
--echospacing='0.69' \
--unwarpdir='-y' \
--scanner='siemens' \
--overwrite='yes' \
--nv
… Running QuNex v0.96.2 …
NOTE: Processing without FieldMap (TE option not needed)
Running dwi_legacy_gpu with the following parameters:
Study Folder: /data/
Sessions Folder: /data//sessions
Sessions: 10171
Study Log Folder:
Using FieldMap: no
Echo Spacing: 0.69
Phase Encoding Direction: 2
TE value for Fieldmap:
EPI Unwarp Direction: -y
Diffusion Data Suffix Name: DWI_dir64_PA
Overwrite prior run: yes
WARNING: QuNex study folder specification .qunexstudy in /data/ not found.
Check that /data/ is a valid QuNex folder.
Consider re-generating QuNex hierarchy…
— Full QuNex call for command: dwi_legacy_gpu
/opt/qunex/bash/qx_utilities/dwi_legacy_gpu.sh --sessionsfolder=/data//sessions --session=10171 --usefieldmap=no --pedir=2 --echospacing=0.69 --te= --unwarpdir=-y --diffdatasuffix=DWI_dir64_PA --overwrite=yes
Running dwi_legacy_gpu locally on d677e2409d35
Command log: /data//processing/logs/runlogs/Log-dwi_legacy_gpu_2023-01-24_07.22.50.362874.log
Command output: /data//processing/logs/comlogs/tmp_dwi_legacy_gpu_10171_2023-01-24_07.22.50.362874.log
– dwi_legacy_gpu.sh: Specified Command-Line Options - Start –
Sessionsfolder: /data//sessions
Session: 10171
Using fieldmap: no
Diffusion data sufix: DWI_dir64_PA
Overwrite: yes
– dwi_legacy_gpu.sh: Specified Command-Line Options - End –
------------------------- Start of work --------------------------------
— Establishing paths for all input and output folders:
T1w folder: /data/10171/hcp/10171/T1w
Diffusion folder: /data/10171/hcp/10171/Diffusion
T1w diffusion folder: /data/10171/hcp/10171/T1w/Diffusion
— Deleting prior runs for 10171_DWI_dir64_PA …
— Copying unprocesed data into the Diffusion folder
Copying /data/10171/hcp/10171/unprocessed/Diffusion/10171_DWI_dir64_PA.bval
Copying /data/10171/hcp/10171/unprocessed/Diffusion/10171_DWI_dir64_PA.bvec
Copying /data/10171/hcp/10171/unprocessed/Diffusion/10171_DWI_dir64_PA.nii.gz
— Setting up acquisition parameters:
Check acquisition parameter files:
acqparams.txt
index.txt
— Omitting FieldMap step…
Getting the first volume of each DWI image…
Run BET on the B0 EPI image to create masks…
IN=/data/10171/hcp/10171/Diffusion/rawdata/10171_DWI_dir64_PA_nodif
OUT=/data/10171/hcp/10171/Diffusion/rawdata/10171_DWI_dir64_PA_nodif_brain
bet2opts= -m -f 0.35 -v
verbose=1
debug=0
variation=0
min 0 thresh2 0 thresh 74.9385 thresh98 749.385 max 4095
c-of-g 101.297 92.7163 46.3714 mm
radius 73.38 mm
median within-brain intensity 300
self-intersection total 326.99 (threshold=4000.0)
— Checking if PreFreeSurfer was completed to obtain inputs for epi_reg…
PreFreeSurfer data found:
/data/10171/hcp/10171/T1w/T1w_acpc_dc_restore_brain.nii.gz
FAST already completed.
Setting inputs for epi_reg:
→ T1w Data: /data/10171/hcp/10171/T1w/T1w_acpc_dc_restore
→ T1w BET+FAST Data: /data/10171/hcp/10171/T1w/T1w_acpc_dc_restore_brain
→ WM Segment FAST Data: /data/10171/hcp/10171/T1w/T1w_acpc_dc_restore_brain_pve_2
→ T1w Brain Mask Data: /data/10171/hcp/10171/T1w/T1w_acpc_brain_mask
— Running eddy_cuda…
Using the following eddy_cuda binary: /opt/fsl/fsl/bin/eddy_cuda9.1
Running command:
/opt/fsl/fsl/bin/eddy_cuda9.1 --imain=/data/10171/hcp/10171/Diffusion/10171_DWI_dir64_PA --mask=/data/10171/hcp/10171/Diffusion/rawdata/10171_DWI_dir64_PA_nodif_brain_mask --acqp=/data/10171/hcp/10171/Diffusion/acqparams/10171_DWI_dir64_PA/acqparams.txt --index=/data/10171/hcp/10171/Diffusion/acqparams/10171_DWI_dir64_PA/index.txt --bvecs=/data/10171/hcp/10171/Diffusion/10171_DWI_dir64_PA.bvec --bvals=/data/10171/hcp/10171/Diffusion/10171_DWI_dir64_PA.bval --fwhm=10,0,0,0,0 --ff=10 --nvoxhp=2000 --flm=quadratic --out=/data/10171/hcp/10171/Diffusion/eddy/10171_DWI_dir64_PA_eddy_corrected --data_is_shelled --repol -v
Reading images
Performing volume-to-volume registration
Running Register
EDDY::: cuda/EddyCudaHelperFunctions.cu::: static void EDDY::EddyCudaHelperFunctions::InitGpu(bool): Exception thrown
EDDY::: cuda/EddyGpuUtils.cu::: static std::shared_ptr<EDDY::DWIPredictionMaker> EDDY::EddyGpuUtils::LoadPredictionMaker(const EDDY::EddyCommandLineOptions&, EDDY::ScanType, const EDDY::ECScanManager&, unsigned int, float, NEWIMAGE::volume<float>&, bool): Exception thrown
EDDY::: eddy.cpp::: EDDY::ReplacementManager EDDY::Register(const EDDY::EddyCommandLineOptions&, EDDY::ScanType, unsigned int, const std::vector<float, std::allocator<float> >&, EDDY::SecondLevelECModel, bool, EDDY::ECScanManager&, EDDY::ReplacementManager, NEWMAT::Matrix&, NEWMAT::Matrix&): Exception thrown
EDDY::: Eddy failed with message EDDY::: eddy.cpp::: EDDY::ReplacementManager EDDY::DoVolumeToVolumeRegistration(const EDDY::EddyCommandLineOptions&, EDDY::ECScanManager&): Exception thrown
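Since the exception is thrown in InitGpu, one check that might narrow this down is whether the GPU is visible from inside the container at all, and how the driver's CUDA version compares with the one the eddy binary was built for. The sketch below is only an illustration: the function names (driver_cuda_version, binary_cuda_version, check_gpu) are hypothetical helpers, not part of QuNex or FSL, and it assumes nvidia-smi and ldd are available in the container. The "CUDA Version" field in the nvidia-smi banner is the highest CUDA version the installed driver supports.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers (not part of QuNex or FSL).
# Intended to be sourced and run inside the container.

# Read an nvidia-smi banner on stdin and print its "CUDA Version" field.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Print the CUDA version encoded in an eddy_cuda binary name,
# e.g. /opt/fsl/fsl/bin/eddy_cuda9.1 -> 9.1 (prints nothing for other names).
binary_cuda_version() {
    basename "$1" | sed -n 's/^eddy_cuda\([0-9][0-9.]*\)$/\1/p'
}

# Report GPU visibility and the CUDA libraries the given binary links against.
check_gpu() {
    local eddy_bin="$1"
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; the GPU may not be exposed to this environment"
        return 1
    fi
    echo "Driver supports CUDA up to: $(nvidia-smi | driver_cuda_version)"
    echo "Binary built for CUDA:      $(binary_cuda_version "$eddy_bin")"
    # "not found" entries in this listing would point to a missing CUDA runtime.
    ldd "$eddy_bin" | grep -i -E 'cuda|cublas' || true
}
```

Inside the container this could be called as `check_gpu /opt/fsl/fsl/bin/eddy_cuda9.1`, using the binary path reported in the log above. It only gathers information and does not by itself identify the cause of the exception.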