[RESOLVED] hcp_diffusion error with QuNex 0.97.2

Description:

Hi, I updated the qunex container to 0.97.2 but I’m still seeing errors when running the hcp_diffusion command, including the error mentioned in this HCP-Users thread: https://groups.google.com/a/humanconnectome.org/g/hcp-users/c/V5tgm9AT-nw

I also noticed that the CUDA version I specified in the command (9.1) was not used. On a side note, the container status report still shows version 0.97.1 even though I am running the 0.97.2 image.
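
One way to confirm this kind of mismatch is to compare the version in the image file name with the version printed in the status report. The sketch below is only an illustration: the helper names `version_from_filename` and `version_from_status` are hypothetical (they are not part of QuNex), and it assumes the report contains a `Version: X.Y.Z` line like the one in the log further down.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of QuNex.

# Extract the version embedded in an image file name,
# e.g. qunex_suite-0.97.2.sif -> 0.97.2
version_from_filename() {
  basename "$1" .sif | sed 's/^qunex_suite-//'
}

# Extract the version from the first "Version: X" line of a status
# report read from stdin.
version_from_status() {
  sed -n 's/^Version:[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}
```

A possible use is to pipe the output of `qunex_container --container=<image> --env_status` into `version_from_status` and compare the result with `version_from_filename <image>`.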

Please advise. Thank you.

Estephan

Call:

msi_resources_time=12:00:00; msi_resources_nodes=1; msi_resources_ntaskspernode=24; msi_resources_mem=64000; msi_queue=a100-8; msi_resources_gpu=gpu:a100:1; msi_resources_jobname=HCPDiff; \
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02; \
qunex_container hcp_diffusion \
--batchfile=${study_sharedfolder}/processing/batch_K99Aim2.txt --sessionsfolder=${study_sharedfolder}/sessions --parsessions=1 --parelements=24 --overwrite=yes \
--hcp_dwi_cudaversion=9.1 \
--nv \
--bash="module load cuda/9.1" \
--scheduler=SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},gres=${msi_resources_gpu},jobname=${msi_resources_jobname} \
--bind=${study_sharedfolder}:${study_sharedfolder} --container=${HOME}/qunex/qunex_suite-0.97.2.sif

Logs:

cn2107:~ moana004$ qunex_container --container=qunex/qunex_suite-0.97.2.sif --env_status
cn2107:~ moana004$ INFO:    Environment variable SINGULARITY_DOCKER_USERNAME is set, but APPTAINER_DOCKER_USERNAME is preferred
--> unsetting the following environment variables: PATH MATLABPATH PYTHONPATH QUNEXVer TOOLS QUNEXREPO QUNEXPATH QUNEXEXTENSIONS QUNEXLIBRARY QUNEXLIBRARYETC TemplateFolder FSL_FIXDIR FREESURFERDIR FREESURFER_HOME FREESURFER_SCHEDULER FreeSurferSchedulerDIR WORKBENCHDIR DCMNIIDIR DICMNIIDIR MATLABDIR MATLABBINDIR OCTAVEDIR OCTAVEPKGDIR OCTAVEBINDIR RDIR HCPWBDIR AFNIDIR PYLIBDIR FSLDIR FSLGPUDIR PALMDIR QUNEXMCOMMAND HCPPIPEDIR CARET7DIR GRADUNWARPDIR HCPPIPEDIR_Templates HCPPIPEDIR_Bin HCPPIPEDIR_Config HCPPIPEDIR_PreFS HCPPIPEDIR_FS HCPPIPEDIR_PostFS HCPPIPEDIR_fMRISurf HCPPIPEDIR_fMRIVol HCPPIPEDIR_tfMRI HCPPIPEDIR_dMRI HCPPIPEDIR_dMRITract HCPPIPEDIR_Global HCPPIPEDIR_tfMRIAnalysis HCPCIFTIRWDIR MSMBin HCPPIPEDIR_dMRITractFull HCPPIPEDIR_dMRILegacy AutoPtxFolder FSL_GPU_SCRIPTS FSLGPUBinary EDDYCUDADIR USEOCTAVE QUNEXENV CONDADIR MSMBINDIR MSMCONFIGDIR R_LIBS FSL_FIX_CIFTIRW FSFAST_HOME SUBJECTS_DIR MINC_BIN_DIR MNI_DIR MINC_LIB_DIR MNI_DATAPATH FSF_OUTPUT_FORMAT
 
Generated by QuNex 
------------------------------------------------------------------------ 
Version: 0.97.1 
User: moana004 
System: cn2107 
OS: RedHat Linux #1 SMP Wed Jan 25 16:41:43 UTC 2023 
------------------------------------------------------------------------ 
 
        ██████\                  ║      ██\   ██\
       ██  __██\                 ║      ███\  ██ |
       ██ /  ██ |██\   ██\       ║      ████\ ██ | ██████\ ██\   ██\
       ██ |  ██ |██ |  ██ |      ║      ██ ██\██ |██  __██\\██\ ██  |
       ██ |  ██ |██ |  ██ |      ║      ██ \████ |████████ |\████  /
       ██ ██\██ |██ |  ██ |      ║      ██ |\███ |██   ____|██  ██\
       \██████ / \██████  |      ║      ██ | \██ |\███████\██  /\██\
        \___███\  \______/       ║      \__|  \__| \_______\__/  \__|
            \___|                ║
 
 
                       DEVELOPED & MAINTAINED BY: 
 
                    Anticevic Lab, Yale University 
               Mind & Brain Lab, University of Ljubljana 
                     Murray Lab, Yale University 
 
                      COPYRIGHT & LICENSE NOTICE: 
 
Use of this software is subject to the terms and conditions defined in 
'LICENSES' which is a part of the QuNex Suite source code package: 
https://gitlab.qunex.yale.edu/qunex/qunex/-/tree/master/LICENSES 
 
 ---> Setting up Octave  


-------------------------------------------------------------- 
 QuNex Environment Status Report 
-------------------------------------------------------------- 



   OS Version 
---------------------------------------------- 

               NAME="CentOS Linux"
               VERSION="7 (Core)"
               ID="centos"
               ID_LIKE="rhel fedora"
               VERSION_ID="7"
               PRETTY_NAME="CentOS Linux 7 (Core)"
               ANSI_COLOR="0;31"
               CPE_NAME="cpe:/o:centos:centos:7"
               HOME_URL="https://www.centos.org/"
               BUG_REPORT_URL="https://bugs.centos.org/"
               
               CENTOS_MANTISBT_PROJECT="CentOS-7"
               CENTOS_MANTISBT_PROJECT_VERSION="7"
               REDHAT_SUPPORT_PRODUCT="centos"
               REDHAT_SUPPORT_PRODUCT_VERSION="7"

   QuNex General Environment Variables 
---------------------------------------------- 

                 QuNexVer : 0.97.1
                    TOOLS : /opt
                QUNEXREPO : qunex
                QUNEXPATH : /opt/qunex
                 QUNEXENV : /opt/env/qunex
           TemplateFolder : /opt/qunex/qx_library/data/
            QUNEXMCOMMAND : octave -q --no-init-file --eval

   Core Dependencies Environment Variables 
---------------------------------------------- 

                 CONDADIR : /opt/miniconda
                   FSLDIR : /opt/fsl/fsl
               FSLCONFDIR : /panfs/roc/msisoft/fsl/6.0.4/config
                FSLGPUDIR : /opt/fsl/fsl/bin
          FSL_GPU_SCRIPTS : /opt/qunex/bash/qx_utilities/diffusion_tractography_dense/fsl_gpu
             FSLGPUBinary : /opt/qunex/qx_library/etc/fsl_gpu_binaries
               FSL_FIXDIR : /opt/fsl/fix
          FREESURFER_HOME : /opt/freesurfer/freesurfer-6.0
     FREESURFER_SCHEDULER : /opt/freesurfer/FreeSurferScheduler
             WORKBENCHDIR : /opt/workbench/workbench/bin_rh_linux64
                CARET7DIR : /opt/workbench/workbench/bin_rh_linux64
                  AFNIDIR : /opt/AFNI/AFNI
                  ANTSDIR : /opt/ANTs/ANTs/bin
                DCMNIIDIR : /opt/dcm2niix/dcm2niix
               DICMNIIDIR : /opt/dicm2nii/dicm2nii
                OCTAVEDIR : /opt/octave/octave
             OCTAVEPKGDIR : /opt/octave/octavepkg
             OCTAVEBINDIR : /opt/octave/octave/bin
                     RDIR : /opt/R/R
                  PALMDIR : /opt/palm/palm-o

   HCP Pipelines 
---------------------------------------------- 

               HCPPIPEDIR : /opt/HCP/HCPpipelines
            GRADUNWARPDIR : /opt/gradunwarp/gradunwarp
     HCPPIPEDIR_Templates : /opt/HCP/HCPpipelines/global/templates
           HCPPIPEDIR_Bin : /opt/HCP/HCPpipelines/global/binaries
        HCPPIPEDIR_Config : /opt/HCP/HCPpipelines/global/config
         HCPPIPEDIR_PreFS : /opt/HCP/HCPpipelines/PreFreeSurfer/scripts
            HCPPIPEDIR_FS : /opt/HCP/HCPpipelines/FreeSurfer/scripts
        HCPPIPEDIR_PostFS : /opt/HCP/HCPpipelines/PostFreeSurfer/scripts
      HCPPIPEDIR_fMRISurf : /opt/HCP/HCPpipelines/fMRISurface/scripts
       HCPPIPEDIR_fMRIVol : /opt/HCP/HCPpipelines/fMRIVolume/scripts
         HCPPIPEDIR_tfMRI : /opt/HCP/HCPpipelines/tfMRI/scripts
          HCPPIPEDIR_dMRI : /opt/HCP/HCPpipelines/DiffusionPreprocessing/scripts
     HCPPIPEDIR_dMRITract : /opt/qunex/bash/qx_utilities/diffusion_tractography/scripts
        HCPPIPEDIR_Global : /opt/HCP/HCPpipelines/global/scripts
 HCPPIPEDIR_tfMRIAnalysis : /opt/HCP/HCPpipelines/TaskfMRIAnalysis/scripts
                MSMBINDIR : /opt/MSM_HOCR_v3
 HCPPIPEDIR_dMRITractFull : /opt/qunex/bash/qx_utilities/diffusion_tractography_dense
    HCPPIPEDIR_dMRILegacy : /opt/qunex/bash/qx_utilities
            AutoPtxFolder : /opt/qunex/bash/qx_utilities/diffusion_tractography_dense/autoptx_hcp_extended
              EDDYCUDADIR : /opt/fsl/fsl/bin/eddy_cuda10.1
                   ASLDIR : /opt/HCP/HCPpipelines/hcp-asl


   Binary / Executable Locations and Versions 
---------------------------------------------- 

    HCPpipelines TAG : Post-v4.7.0-053d1fc
 HCPpipelines commit : 053d1fc335db8233c5a9c3c6f945ab13de7cbf6f

         FSL Binary  : /opt/fsl/fsl/share/fsl/bin/fsl
         FSL Version : 6.0.6.2

  FreeSurfer Binary  : /opt/freesurfer/freesurfer-6.0/bin/freesurfer
  FreeSurfer Version :   freesurfer-Linux-centos6_x86_64-stable-pub-v6.0.0-2beb96c

        AFNI Binary  : /opt/AFNI/AFNI/afni
        AFNI Version : Precompiled binary linux_centos_7_64: Jan  6 2023 (Version AFNI_23.0.00 'Commodus')

        ANTs Binary  : /opt/ANTs/ANTs/bin/antsJointFusion
        ANTs Version : ANTs Version: 2.3.5.dev1-g6f137

    dcm2niix Binary  : /opt/dcm2niix/dcm2niix/dcm2niix
    dcm2niix Version : Chris Rorden's dcm2niiX version v1.0.20220720  (JP2:OpenJPEG) GCC4.8.5 x86-64 (64-bit Linux)

      Octave Binary  : /opt/octave/octave/bin/octave
      Octave Version : 4.4.1

           R Binary  : /usr/local/bin/R
           R Version : R version 3.6.1 (2019-07-05) -- "Action of the Toes"

 R required packages : ggplot2
           R Package : ggplot2  ‘3.4.0’

      python binary  : /opt/env/qunex/bin/python
     python Version : Python 3.7.15

        PALM Binary  : /opt/palm/palm-o/palm.m
        PALM Version : Jun/2021 (github)

  wb_command Binary  : /opt/workbench/workbench/bin_rh_linux64/wb_command
  wb_command Version : Version: 1.5.0

  Full Environment Paths 
---------------------------------------------- 

  PATH : /opt/env/qunex/bin:/opt/miniconda/bin:/opt/fsl/fsl-6.0.6.2/condabin:/usr/local/cuda/bin:/opt/HCP/HCPpipelines/global/matlab:/opt/fsl/fix:/opt/qunex/qx_library/etc/fsl_gpu_binaries:/opt/qunex/bash/qx_utilities/diffusion_tractography_dense/fsl_gpu:/opt/qunex/bash/qx_utilities/diffusion_tractography_dense/autoptx_hcp_extended:/opt/qunex/bash/qx_utilities:/opt/qunex/bash/qx_utilities/diffusion_tractography_dense:/opt/qunex/bash/qx_utilities/diffusion_tractography/scripts:/opt/HCP/HCPpipelines/TaskfMRIAnalysis/scripts:/opt/HCP/HCPpipelines/global/scripts:/opt/HCP/HCPpipelines/DiffusionPreprocessing/scripts:/opt/HCP/HCPpipelines/tfMRI/scripts:/opt/HCP/HCPpipelines/fMRIVolume/scripts:/opt/HCP/HCPpipelines/fMRISurface/scripts:/opt/HCP/HCPpipelines/PostFreeSurfer/scripts:/opt/HCP/HCPpipelines/FreeSurfer/scripts:/opt/HCP/HCPpipelines/PreFreeSurfer/scripts:/opt/HCP/HCPpipelines/global/config:/opt/HCP/HCPpipelines/global/binaries:/opt/HCP/HCPpipelines/global/templates:/opt/gradunwarp/gradunwarp/gradunwarp/core:/opt/workbench/workbench/bin_rh_linux64:/opt/HCP/HCPpipelines:/opt/HCP/HCPpipelines/MSMConfig:/opt/qunex/python/qx_utilities/templates:/opt/qunex/qx_library/data:/opt/qunex/qx_library/data/atlases/hcp:/opt/R/R:/opt/matlab/bin:/opt/octave/octave/bin:/opt/dcm2niix/dcm2niix:/opt/dcm2niix/dcm2niix/build/bin:/opt/ANTs/ANTs/bin:/opt/AFNI/AFNI:/opt/palm/palm-o:/opt/workbench/workbench/bin_rh_linux64:/opt/freesurfer/FreeSurferScheduler:/opt/fsl/fsl/share/fsl/bin:/opt/fsl/fsl/bin:/opt/freesurfer/freesurfer-6.0/bin:/opt/freesurfer/freesurfer-6.0/fsfast/bin:/opt/freesurfer/freesurfer-6.0/tktools:/opt/fsl/fsl/bin:/opt/fsl/fsl/share/fsl/bin:/panfs/roc/msisoft/freesurfer/6.0.0/mni/bin:/opt/freesurfer/freesurfer-6.0:/opt/fsl/fsl/bin:/opt/qunex/lib:/opt/qunex/bin:/opt/bin:/opt/lib/bin:/opt/lib/lib:/opt/olib:/opt/qunex/python/qx_utilities:/opt:/usr/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/bin

  MATLABPATH : /opt/qunex/matlab/qx_mice:/opt/qunex/matlab/qx_utilities/general:/opt/qunex/matlab/qx_mri/stats:/opt/qunex/matlab/qx_mri/img:/opt/qunex/matlab/qx_mri/general:/opt/qunex/matlab/qx_mri/fc:/opt/HCP/HCPpipelines/global/matlab:/opt/HCP/HCPpipelines/global/matlab/cifti-matlab:/opt/HCP/HCPpipelines/global/matlab/icaDim:/opt/HCP/HCPpipelines/global/matlab/nets_spectra:/opt/HCP/HCPpipelines/global/matlab:/opt/fsl/fix:/opt/qunex/qx_library/data/:/opt/qunex/qx_library/data/atlases/hcp:/opt/dicm2nii/dicm2nii:/opt/palm/palm-o:/opt/workbench/workbench/bin_rh_linux64:/opt/fsl/fsl:/opt/fsl/fsl/bin:


=================== QuNex environment set successfully! ==================== 


[qunex_container_2023-03-22_19.10.24.671713.txt|attachment](upload://yZpQZQR2RaAG4WpKoceDrSBAgny.txt) (9.6 KB)



Hi Estephan,

The default CUDA for FSL 6.0.6+ is CUDA 10.2, while CUDA 9.1 was deprecated. I am finishing a patch that fixes the issues caused by this as well as some ICAFix issues that the latest container has. Expect a 0.97.3 release tomorrow or on Monday.

For the time being, maybe best to use 0.96.2a, that one is much more tested and vetted. You can also try running diffusion on 0.97.2 with:

--nv \
--bash="module load cuda/10.2" \

hcp_dwi_cudaversion can be omitted as 10.2 is now the default. You also need to check whether module load cuda/10.2 is the correct module call on your system. If you do not have CUDA 10.2 you can load a newer version instead, e.g. CUDA 11, and it will work. The next bigger QuNex update will add support for CUDA 10.2+ and no-GPU processing to all Diffusion commands, not just hcp_diffusion.
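
The check for a suitable module can be scripted. The following is a minimal sketch, assuming module names follow the `cuda/X.Y` pattern; `pick_cuda_module` is a hypothetical helper, not a QuNex or module-system command. It reads a module listing on stdin and prints the lowest `cuda/X.Y` entry with version 10.2 or newer, or nothing if there is none.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read module names on stdin and print the lowest
# cuda/X.Y module whose version is >= 10.2 (empty output if none).
pick_cuda_module() {
  tr -s ' \t' '\n' \
    | sed -n 's#^cuda/\([0-9][0-9]*\.[0-9][0-9]*\).*#\1#p' \
    | awk -F. '($1 > 10) || ($1 == 10 && $2 >= 2)' \
    | sort -t. -k1,1n -k2,2n \
    | head -n 1 \
    | sed 's#^#cuda/#'
}
```

On many systems `module avail` writes to stderr, so a typical call would be `module avail cuda 2>&1 | pick_cuda_module`. Entries such as `cuda-sdk/...` are ignored on purpose, since only the `cuda/` prefix is matched.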

Jure

Thanks for the quick response Jure. Good to know about ICAFix as well, as I just ran it using 0.97.2 - I guess I’ll re-run it with the newer version once it is out. I just checked and my local computing cluster does not have CUDA 10.2 but it does have 11.2. Is it possible to use the CUDA version included in the container itself, or do I have to load a module through the “--bash=” parameter? If the container's CUDA can be used, how would I do that? Thank you.

agc01:~ moana004$ module avail cuda
----------------------- /panfs/roc/soft/modulefiles.hpc ------------------------
cuda-sdk/6.5  cuda-sdk/9.1            cuda/7.0  cuda/10.0(default)  
cuda-sdk/7.0  cuda-sdk/10.0(default)  cuda/7.5  cuda/10.1           
cuda-sdk/7.5  cuda-sdk/10.1           cuda/8.0  cuda/11.2           
cuda-sdk/8.0  cuda-sdk/11.2           cuda/9.0  
cuda-sdk/9.0  cuda/6.5                cuda/9.1  

Based on your module printout you should specify:

...
--nv \
--bash="module load cuda/11.2" \
...

hcp_dwi_cudaversion can be omitted as the default value will automatically pick the correct version of CUDA. FSL is now designed to support CUDA 10.2+ out of the box. With Singularity containers the CUDA installed in the container cannot be used, that one can be used by Docker containers only.
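
Because the container relies on the host GPU stack when started with `--nv`, it can help to check that the GPU is actually visible from inside the image before launching a long job. The sketch below is an assumption-laden illustration rather than a QuNex feature: `check_gpu_in_container` and the `CONTAINER_RUNTIME` override are hypothetical names, and the check only makes sense when run on a GPU node.

```shell
#!/usr/bin/env bash
# Run nvidia-smi inside the image with --nv to see whether the host GPU
# and driver are visible from within the container.
# CONTAINER_RUNTIME is a hypothetical override for illustration; by
# default the first of apptainer/singularity found on PATH is used.
check_gpu_in_container() {
  local sif="$1" runtime="${CONTAINER_RUNTIME:-}" candidate
  if [ -z "$runtime" ]; then
    for candidate in apptainer singularity; do
      if command -v "$candidate" >/dev/null 2>&1; then
        runtime="$candidate"
        break
      fi
    done
  fi
  if [ -z "$runtime" ]; then
    echo "no container runtime found" >&2
    return 2
  fi
  # A non-zero exit status here suggests the GPU is not reachable.
  "$runtime" exec --nv "$sif" nvidia-smi
}
```

For example, `check_gpu_in_container "${HOME}/qunex/qunex_suite-0.97.3.sif"` on a GPU node should print the usual `nvidia-smi` table if the driver is exposed correctly.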

Jure

There was another bug in MSMAll/DrDriftAndResample that we are resolving, so the 0.97.3 container will most likely be released tomorrow.

Jure

I tried to run hcp_diffusion using qunex 0.97.3 but still found errors. Please see the error log attached. Please advise.

Command:

msi_resources_time=12:00:00; msi_resources_nodes=1; msi_resources_ntaskspernode=24; msi_resources_mem=64000; msi_queue=a100-4; msi_resources_gpu=gpu:a100:1; msi_resources_jobname=HCPDiff; \
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02; \
qunex_container hcp_diffusion \
--batchfile=${study_sharedfolder}/processing/batch_K99Aim2.txt --sessionsfolder=${study_sharedfolder}/sessions --parsessions=1 --parelements=24 --overwrite=yes \
--nv \
--bash_pre="module load cuda/11.2" \
--scheduler=SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},gres=${msi_resources_gpu},jobname=${msi_resources_jobname} \
--bind=${study_sharedfolder}:${study_sharedfolder} --container=${HOME}/qunex/qunex_suite-0.97.3.sif

I’m having a hard time attaching the error log file, so I have pasted below the sections that seem to trigger the initial errors.

# Generated by QuNex 0.97.3 on 2023-03-30_16.40.41.603061
#
------------------------------------------------------------
Running external command via QuNex:

/opt/HCP/HCPpipelines/DiffusionPreprocessing/DiffPreprocPipeline.sh                 --path="/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp"                 --subject="10001"                 --PEdir=2                 --posData="/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_PA.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_PA.nii.gz"                 --negData="/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_AP.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_AP.nii.gz"                 --echospacing="0.000689998"                 --gdcoeffs="/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/info/3T_mri_scanner_info/coeff_Prisma3T_20160203.grad"                 --dof="6"                 --b0maxbval="50"                 --combine-data-flag="1"                 --printcom=""                --topup-config-file=/opt/HCP/HCPpipelines/global/config/b02b0.cnf                --cuda-version=10.2

Test file: 
/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/T1w/Diffusion/data.nii.gz
------------------------------------------------------------

========================================
  DIRECTORY: /opt/HCP/HCPpipelines
    PRODUCT: HCP Pipeline Scripts
    VERSION: Post-v4.7.0-79f0293
     COMMIT: 79f0293bb195ad846ea7974b6e6414aef5e48fc3
   MODIFIED: no
========================================
Thu Mar 30 16:40:48 CDT 2023:DiffPreprocPipeline.sh: HCPPIPEDIR: /opt/HCP/HCPpipelines
Thu Mar 30 16:40:48 CDT 2023:DiffPreprocPipeline.sh: FSLDIR: /opt/fsl/fsl
-- DiffPreprocPipeline.sh: Specified Command-Line Parameters - Start --
   StudyFolder: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp
   Subject: 10001
   PEdir: 2
   PosInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_PA.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_PA.nii.gz
   NegInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_AP.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_AP.nii.gz
   echospacing: 0.000689998
   GdCoeffs: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/info/3T_mri_scanner_info/coeff_Prisma3T_20160203.grad
   DWIName: Diffusion
   DegreesOfFreedom: 6
   b0maxbval: 50
   runcmd: 
   CombineDataFlag: 1
   TopupConfig: /opt/HCP/HCPpipelines/global/config/b02b0.cnf
   SelectBestB0: false
   EnsureEvenSlices: false
   extra_eddy_args: 
   no_gpu: false
   cuda-version: 10.2
-- DiffPreprocPipeline.sh: Specified Command-Line Parameters - End --
Thu Mar 30 16:40:48 CDT 2023:DiffPreprocPipeline.sh: Invoking Pre-Eddy Steps
Thu Mar 30 16:40:48 CDT 2023:DiffPreprocPipeline.sh: pre_eddy_cmd: /opt/HCP/HCPpipelines/DiffusionPreprocessing/DiffPreprocPipeline_PreEddy.sh  --path=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp  --subject=10001  --dwiname=Diffusion  --PEdir=2  --posData=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_PA.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_PA.nii.gz  --negData=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_AP.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_AP.nii.gz  --echospacing=0.000689998  --b0maxbval=50  --topup-config-file=/opt/HCP/HCPpipelines/global/config/b02b0.cnf  --printcom= 
========================================
  DIRECTORY: /opt/HCP/HCPpipelines
    PRODUCT: HCP Pipeline Scripts
    VERSION: Post-v4.7.0-79f0293
     COMMIT: 79f0293bb195ad846ea7974b6e6414aef5e48fc3
   MODIFIED: no
========================================
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: HCPPIPEDIR: /opt/HCP/HCPpipelines
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: FSLDIR: /opt/fsl/fsl
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: HCPPIPEDIR_Config: /opt/HCP/HCPpipelines/global/config
-- DiffPreprocPipeline_PreEddy.sh: Specified Command-Line Parameters - Start --
   StudyFolder: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp
   Subject: 10001
   PEdir: 2
   PosInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_PA.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_PA.nii.gz
   NegInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_AP.nii.gz@/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_AP.nii.gz
   echospacing: 0.000689998
   DWIName: Diffusion
   b0maxbval: 50
   runcmd: 
   SelectBestB0: false
-- DiffPreprocPipeline_PreEddy.sh: Specified Command-Line Parameters - End --
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: outdir: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: basePos: Pos
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: baseNeg: Neg
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Copying positive raw data to working directory
Thu Mar 30 16:40:49 CDT 2023:DiffPreprocPipeline_PreEddy.sh: PosInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_PA.nii.gz /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_PA.nii.gz
Thu Mar 30 16:40:57 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Copying negative raw data to working directory
Thu Mar 30 16:40:57 CDT 2023:DiffPreprocPipeline_PreEddy.sh: NegInputImages: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir98_AP.nii.gz /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/unprocessed/Diffusion/10001_DWI_dir99_AP.nii.gz
Thu Mar 30 16:41:02 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Total readout time is .000095 secs
Image Exception : #63 :: No image files match: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/topup/Pos_b0
No image files match: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/topup/Pos_b0
/opt/HCP/HCPpipelines/DiffusionPreprocessing/DiffPreprocPipeline_PreEddy.sh: line 166: % 2: syntax error: operand expected (error token is "% 2")
/opt/HCP/HCPpipelines/DiffusionPreprocessing/DiffPreprocPipeline_PreEddy.sh: line 480: [: -eq: unary operator expected
Thu Mar 30 16:41:02 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Create two files for each phase encoding direction
Thu Mar 30 16:41:02 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Running Intensity Normalisation

 START: basic_preproc_norm_intensity.sh
basic_preproc_norm_intensity.sh: Input Parameter: workingdir: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion
basic_preproc_norm_intensity.sh: Input Parameter: b0maxbval: 50
basic_preproc_norm_intensity.sh: Rescaling series to ensure consistency across baseline intensities
basic_preproc_norm_intensity.sh: Processing /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1
basic_preproc_norm_intensity.sh: About to fslmaths /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1.nii.gz -Xmean -Ymean -Zmean /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1_mean
basic_preproc_norm_intensity.sh: Getting Posbvals from /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1.bval
basic_preproc_norm_intensity.sh: Posbvals: 5 5 1490 2995 1495 3005 1500 2990 1495 3005 1490 3000 1490 2985 1495 2985 1500 5 3005 1490 2995 1500 3000 1495 3005 1490 2985 1505 2990 1500 3000 1490 3005 5 1505 3000 1495 2980 1500 2995 1505 3000 1500 2985 1495 2995 1495 3000 1485 5 3005 1500 2990 1505 2995 1495 2985 1490 2995 1500 3000 1500 2985 1495 2985 5 1505 3005 1495 3000 1505 3010 1495 3005 1490 2995 1490 3000 1505 3000 1495 5 2985 1500 2985 1490 2995 1495 2995 1490 2980 1500 3000 1490 3005 1495 2990 1500 3005
basic_preproc_norm_intensity.sh: Posbvals i: 5
basic_preproc_norm_intensity.sh: cnt: 0000
basic_preproc_norm_intensity.sh: About to fslroi /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1_mean /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1_b0_0000 0 1
basic_preproc_norm_intensity.sh: Posbvals i: 5
basic_preproc_norm_intensity.sh: cnt: 0001
basic_preproc_norm_intensity.sh: About to fslroi /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1_mean /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/rawdata/Pos_1_b0_0001 1 1
Thu Mar 30 17:33:07 CDT 2023:DiffPreprocPipeline_PreEddy.sh: Completed!
Thu Mar 30 17:33:07 CDT 2023:DiffPreprocPipeline.sh: Invoking Eddy Step
Thu Mar 30 17:33:07 CDT 2023:DiffPreprocPipeline.sh: eddy_cmd: /opt/HCP/HCPpipelines/DiffusionPreprocessing/DiffPreprocPipeline_Eddy.sh  --path=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp  --subject=10001  --dwiname=Diffusion  --printcom=  --cuda-version=10.2
========================================
  DIRECTORY: /opt/HCP/HCPpipelines
    PRODUCT: HCP Pipeline Scripts
    VERSION: Post-v4.7.0-79f0293
     COMMIT: 79f0293bb195ad846ea7974b6e6414aef5e48fc3
   MODIFIED: no
========================================
Thu Mar 30 17:33:08 CDT 2023:DiffPreprocPipeline_Eddy.sh: HCPPIPEDIR: /opt/HCP/HCPpipelines
Thu Mar 30 17:33:08 CDT 2023:DiffPreprocPipeline_Eddy.sh: FSLDIR: /opt/fsl/fsl
-- DiffPreprocPipeline_Eddy.sh: Specified Command-Line Parameters - Start --
   StudyFolder: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp
   Subject: 10001
   DWIName: Diffusion
   DetailedOutlierStats: False
   ReplaceOutliers: False
   runcmd: 
   nvoxhp: 
   sep_offs_move: False
   rms: False
   ff_val: 
   dont_peas: 
   fwhm_value: 0
   resamp_value: 
   ol_nstd_value: 
   extra_eddy_args: 
   no_gpu: False
   cuda-version: 10.2
-- DiffPreprocPipeline_Eddy.sh: Specified Command-Line Parameters - End --
Thu Mar 30 17:33:08 CDT 2023:DiffPreprocPipeline_Eddy.sh: Running Eddy
Thu Mar 30 17:33:08 CDT 2023:DiffPreprocPipeline_Eddy.sh: About to issue the following command to invoke the run_eddy.sh script
Thu Mar 30 17:33:08 CDT 2023:DiffPreprocPipeline_Eddy.sh:  /opt/HCP/HCPpipelines/DiffusionPreprocessing/scripts/run_eddy.sh                -g  --cuda-version=10.2 -w /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy  --fwhm=0 
========================================
  DIRECTORY: /opt/HCP/HCPpipelines
    PRODUCT: HCP Pipeline Scripts
    VERSION: Post-v4.7.0-79f0293
     COMMIT: 79f0293bb195ad846ea7974b6e6414aef5e48fc3
   MODIFIED: no
========================================
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: HCPPIPEDIR: /opt/HCP/HCPpipelines
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: FSLDIR: /opt/fsl/fsl
-- run_eddy.sh: Specified Command-Line Options - Start --
   workingdir: /home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy
   useGpuVersion: True
   produceDetailedOutlierStats: False
   replaceOutliers: False
   nvoxhp: 
   sep_offs_move: False
   rms: False
   ff_val: 
   dont_peas: 
   fwhm_value: 0
   resamp_value: 
   ol_nstd_val: 
   extra_eddy_args: 
   g_cuda_version: 10.2
-- run_eddy.sh: Specified Command-Line Options - End --
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: INFO: Determined that the FSL version in use is 6.0.6.2
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: User requested GPU-enabled version of eddy
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: GPU-enabled version of eddy found: /opt/fsl/fsl/bin/eddy_cuda10.2
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: eddy executable command to use: /opt/fsl/fsl/bin/eddy_cuda10.2
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: outlier statistics option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: replace outliers option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: nvoxhp option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: sep_offs_move option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: rms option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: ff option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: ol_nstd_option: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: About to issue the following eddy command: 
Thu Mar 30 17:33:09 CDT 2023:run_eddy.sh: /opt/fsl/fsl/bin/eddy_cuda10.2       --cnr_maps --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/nodif_brain_mask --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/index.txt --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/acqparams.txt --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg.bvecs --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg.bvals --fwhm=0 --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/topup/topup_Pos_Neg_b0 --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/eddy_unwarped_images 


EDDY::: Eddy failed with message žç®D
Thu Mar 30 17:33:26 CDT 2023:run_eddy.sh: Completed with return value: 1
Thu Mar 30 17:33:26 CDT 2023:DiffPreprocPipeline_Eddy.sh: Completed!
Thu Mar 30 17:33:26 CDT 2023:DiffPreprocPipeline.sh: Invoking Post-Eddy Steps

Hi Estephan,

The first exception/error happens in the HCP code block that is there for assurance of even slices. Often, you can just ignore this as it is just a safety mechanism. I am in contact with HCP developers to properly resolve this.
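
For context, the two shell messages in the log (`operand expected` and `unary operator expected`) are what bash prints when an empty variable reaches an arithmetic expansion or a `[ ... -eq ... ]` test, which would fit with the missing `Pos_b0` image reported just before them. The sketch below illustrates that class of failure and one way to guard against it. It is not the HCP Pipelines code, and `is_even` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# is_even: return 0 if $1 is an even non-negative integer, 1 if it is
# odd, and 2 if $1 is empty or not an integer. The explicit check avoids
# the "operand expected" error that $(( $n % 2 )) raises for an empty $n.
is_even() {
  local n="${1:-}"
  case "$n" in
    ''|*[!0-9]*) echo "not an integer: '$n'" >&2; return 2 ;;
  esac
  # 10# forces base 10 so values with leading zeros are not read as octal.
  [ $((10#$n % 2)) -eq 0 ]
}
```

With a guard like this, an empty slice count is reported explicitly (return code 2) instead of falling through to later steps with an undefined value.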

The second one is more problematic: EDDY::: Eddy failed with message žç®D. I think this happens when eddy is unable to access your GPU. Is the exact command working with container 0.96.2a and CUDA 9.1?

Jure

Hi Jure, I just tried what you suggested. It also raised errors, please see the error log. Any thoughts?

Command:

msi_resources_time=12:00:00; msi_resources_nodes=1; msi_resources_ntaskspernode=24; msi_resources_mem=64000; msi_queue=v100; msi_resources_gpu=gpu:v100:1; msi_resources_jobname=HCPDiff; \
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02; \
qunex_container hcp_diffusion \
--batchfile=${study_sharedfolder}/processing/batch_K99Aim2.txt --sessionsfolder=${study_sharedfolder}/sessions --parsessions=1 --parelements=24 --overwrite=yes \
--nv \
--bash_pre="module load cuda/9.1" \
--scheduler=SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},gres=${msi_resources_gpu},jobname=${msi_resources_jobname} \
--bind=${study_sharedfolder}:${study_sharedfolder} --container=${HOME}/qunex/qunex_suite-0.96.2a.sif

tmp_hcp_diffusion_10021_2023-04-03_10.45.01.918828.log (79.0 KB)

This looks like an issue between eddy and CUDA. Unfortunately, I am unable to reproduce this behavior: I ran 2 different container versions (0.96.2a and 0.97.3) on 2 different systems and diffusion completed successfully in both cases. It seems that the software inside the container has trouble accessing the CUDA drivers. Was there a version with which you were able to run hcp_diffusion on the system you are using?

Jure

I also managed to run hcp_diffusion on a brand new system that runs CUDA 12. Give this a go:

qunex_container hcp_diffusion \
  --batchfile="${study_sharedfolder}/processing/batch_K99Aim2.txt" \
  --sessionsfolder="${study_sharedfolder}/sessions" \
  --overwrite=yes \
  --nv \
  --bash_pre="module load cuda/11.2" \
  --scheduler="SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},gres=${msi_resources_gpu},jobname=${msi_resources_jobname}" \
  --bind="${study_sharedfolder}:${study_sharedfolder},/usr/local/cuda:/usr/local/cuda" \
  --container="${HOME}/qunex/qunex_suite-0.97.3.sif"

If this does not work, we can then enter the GPU node and the container interactively and inspect what is going on. Also maybe check with your IT guys if module load cuda/11.2 is the correct command.

Jure

Thanks for the tips Jure. I tried what you suggested, and also contacted our local computing cluster helpdesk who provided the correct “bind” path for where CUDA files are located. I still saw the same error. I posted the last command I tried and one of the error logs below. I think your suggestion of entering the container to try to understand what’s driving the error would probably be the best next step.

Command:

msi_resources_time=12:00:00; msi_resources_nodes=1; msi_resources_ntaskspernode=24; msi_resources_mem=64000; msi_queue=a100-4; msi_resources_gpu=gpu:a100:1; msi_resources_jobname=HCPDiff; \
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02; \
qunex_container hcp_diffusion \
--batchfile=${study_sharedfolder}/processing/batch_K99Aim2.txt --sessionsfolder=${study_sharedfolder}/sessions --parsessions=1 --parelements=24 --overwrite=yes \
--nv \
--bash_pre="module load cuda/11.2" \
--scheduler=SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},gres=${msi_resources_gpu},jobname=${msi_resources_jobname} \
--bind=${study_sharedfolder}:${study_sharedfolder},/panfs/roc/msisoft/cuda/11.2:/usr/local/cuda --container=${HOME}/qunex/qunex_suite-0.97.3.sif

error_hcp_diffusion_10001_2023-04-07_09.05.02.192307.log (87.6 KB)

First, I would recommend re-running the command from your latest call with version 0.97.3 and CUDA 11.2.

If that does not work, I am attaching the full procedure of working interactively. This procedure tries to run FSL’s eddy_cuda10.2 without QuNex - directly in the container.

The procedure is like this:

# 1. Start an interactive session on the GPU node (it might take some time to enter this)
srun --nodes=1 --cpus-per-task=8 --partition=a100-4 --time=12:00:00 --gres=gpu:a100:1 --mem=64G --pty bash -i

# ----- You are now on the compute node of your system -----

# 2. Load the CUDA module
module load cuda/11.2

# 3. Prepare variables
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02

# 4. Enter the singularity container
singularity shell -B ${study_sharedfolder}:${study_sharedfolder},/panfs/roc/msisoft/cuda/11.2:/usr/local/cuda --nv ${HOME}/qunex/qunex_suite-0.97.3.sif

# ----- You are now in the QuNex container -----

# 5. Source QuNex environment
source /opt/qunex/env/qunex_environment.sh

# 6. Execute FSL's eddy cuda
/opt/fsl/fsl/bin/eddy_cuda10.2 \
  --cnr_maps \
  --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg \
  --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/nodif_brain_mask \
  --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/index.txt \
  --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/acqparams.txt \
  --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvecs \
  --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvals \
  --fwhm=0 \
  --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/topup/topup_Pos_Neg_b0 \
  --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/eddy_unwarped_images

In one of our latest studies we ran into similar issues because of the DWI data. We found that adding --data_is_shelled to the eddy_cuda10.2 call fixed them. We are working with HCP to add this functionality to hcp_diffusion. You may be hitting the same problem, so if the above does not work, try calling the final (eddy_cuda10.2) command with the --data_is_shelled parameter.

Also, hcp_diffusion supports processing without a GPU: run it like a normal command, without any of the CUDA settings, and add the --hcp_dwi_nogpu flag. It will take much longer and is something we usually do not want to do at scale, but for testing it is perfectly fine.

Cheers, Jure

Thank you for the detailed instructions Jure. The obscure eddy error keeps coming up in every attempt I made inside the qunex container: 1. Running eddy_cuda10.2 as you suggested; 2. Adding --data_is_shelled; 3. Running the non-GPU version of eddy with and without --data_is_shelled. The error comes up very quickly. I wonder if I should download the container again; could that be the cause? See the terminal output of my attempts below. Thank you.

Estephan

cn2107:~ moana004$ module load singularity/current python3
cn2107:~ moana004$ module load cuda/11.2
cn2107:~ moana004$ study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02
cn2107:~ moana004$ singularity shell -B ${study_sharedfolder}:${study_sharedfolder},/panfs/roc/msisoft/cuda/11.2:/usr/local/cuda --nv ${HOME}/qunex/qunex_suite-0.97.3.sif
Apptainer> source /opt/qunex/env/qunex_environment.sh
--> unsetting the following environment variables: PATH MATLABPATH PYTHONPATH QUNEXVer TOOLS QUNEXREPO QUNEXPATH QUNEXEXTENSIONS QUNEXLIBRARY QUNEXLIBRARYETC TemplateFolder FSL_FIXDIR FREESURFERDIR FREESURFER_HOME FREESURFER_SCHEDULER FreeSurferSchedulerDIR WORKBENCHDIR DCMNIIDIR DICMNIIDIR MATLABDIR MATLABBINDIR OCTAVEDIR OCTAVEPKGDIR OCTAVEBINDIR RDIR HCPWBDIR AFNIDIR PYLIBDIR FSLDIR FSLGPUDIR PALMDIR QUNEXMCOMMAND HCPPIPEDIR CARET7DIR GRADUNWARPDIR HCPPIPEDIR_Templates HCPPIPEDIR_Bin HCPPIPEDIR_Config HCPPIPEDIR_PreFS HCPPIPEDIR_FS HCPPIPEDIR_PostFS HCPPIPEDIR_fMRISurf HCPPIPEDIR_fMRIVol HCPPIPEDIR_tfMRI HCPPIPEDIR_dMRI HCPPIPEDIR_dMRITract HCPPIPEDIR_Global HCPPIPEDIR_tfMRIAnalysis HCPCIFTIRWDIR MSMBin HCPPIPEDIR_dMRITractFull HCPPIPEDIR_dMRILegacy AutoPtxFolder FSL_GPU_SCRIPTS FSLGPUBinary EDDYCUDADIR USEOCTAVE QUNEXENV CONDADIR MSMBINDIR MSMCONFIGDIR R_LIBS FSL_FIX_CIFTIRW FSFAST_HOME SUBJECTS_DIR MINC_BIN_DIR MNI_DIR MINC_LIB_DIR MNI_DATAPATH FSF_OUTPUT_FORMAT
 
Generated by QuNex 
------------------------------------------------------------------------ 
Version: 0.97.3 
User: moana004 
System: cn2107 
OS: RedHat Linux #1 SMP Tue Mar 7 15:41:52 UTC 2023 
------------------------------------------------------------------------ 
 
        ██████\                  ║      ██\   ██\
       ██  __██\                 ║      ███\  ██ |
       ██ /  ██ |██\   ██\       ║      ████\ ██ | ██████\ ██\   ██\
       ██ |  ██ |██ |  ██ |      ║      ██ ██\██ |██  __██\\██\ ██  |
       ██ |  ██ |██ |  ██ |      ║      ██ \████ |████████ |\████  /
       ██ ██\██ |██ |  ██ |      ║      ██ |\███ |██   ____|██  ██\
       \██████ / \██████  |      ║      ██ | \██ |\███████\██  /\██\
        \___███\  \______/       ║      \__|  \__| \_______\__/  \__|
            \___|                ║
 
 
                       DEVELOPED & MAINTAINED BY: 
 
                    Anticevic Lab, Yale University 
               Mind & Brain Lab, University of Ljubljana 
                     Murray Lab, Yale University 
 
                      COPYRIGHT & LICENSE NOTICE: 
 
Use of this software is subject to the terms and conditions defined in 
'LICENSES' which is a part of the QuNex Suite source code package: 
https://gitlab.qunex.yale.edu/qunex/qunex/-/tree/master/LICENSES 
 
 ---> Setting up Octave  

(/opt/env/qunex) [QuNex ~]$ /opt/fsl/fsl/bin/eddy_cuda10.2 \
>   --cnr_maps \
>   --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg \
>   --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/nodif_brain_mask \
>   --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/index.txt \
>   --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/acqparams.txt \
>   --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvecs \
>   --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvals \
>   --fwhm=0 \
>   --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/topup/topup_Pos_Neg_b0 \
>   --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/eddy_unwarped_images


EDDY::: Eddy failed with message \ufffdG\ufffd]S
(/opt/env/qunex) [QuNex ~]$ /opt/fsl/fsl/bin/eddy_cuda10.2 \
>   --cnr_maps \
>   --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg \
>   --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/nodif_brain_mask \
>   --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/index.txt \
>   --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/acqparams.txt \
>   --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvecs \
>   --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvals \
>   --fwhm=0 \
>   --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/topup/topup_Pos_Neg_b0 \
>   --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/eddy_unwarped_images \
>   --data_is_shelled


EDDY::: Eddy failed with message \ufffd\ufffdV'\ufffd
(/opt/env/qunex) [QuNex ~]$ /opt/fsl/fsl/bin/eddy_cuda10.2 \
>   --cnr_maps \
>   --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg \
>   --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/nodif_brain_mask \
>   --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/index.txt \
>   --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/acqparams.txt \
>   --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvecs \
>   --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvals \
>   --fwhm=0 \
>   --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/topup/topup_Pos_Neg_b0 \
>   --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/eddy_unwarped_images \
>   --data_is_shelled


EDDY::: Eddy failed with message \ufffd\ufffd\ufffd\ufffd
(/opt/env/qunex) [QuNex ~]$ date; /opt/fsl/fsl/bin/eddy \
>   --cnr_maps \
>   --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg \
>   --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/nodif_brain_mask \
>   --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/index.txt \
>   --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/acqparams.txt \
>   --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvecs \
>   --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/Pos_Neg.bvals \
>   --fwhm=0 \
>   --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/topup/topup_Pos_Neg_b0 \
>   --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10021/hcp/10021/Diffusion/eddy/eddy_unwarped_images \
>   --data_is_shelled; \
> date
Mon Apr 10 10:54:04 CDT 2023


EDDY::: Eddy failed with message \ufffd\ufffd\ufffd{\
Mon Apr 10 10:54:28 CDT 2023
(/opt/env/qunex) [QuNex ~]$ nvidia-smi
Mon Apr 10 12:34:10 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:5C:00.0 Off |                    0 |
| N/A   47C    P0    37W / 250W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   46C    P0    38W / 250W |      0MiB / 32510MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
(/opt/env/qunex) [QuNex ~]$ 

Hi Estephan,

Thanks for trying all this out. Unfortunately, the undescriptive eddy error gives us nothing to work with. Do you maybe have FSL installed on your system (outside of QuNex)? This way, you could test the eddy_cuda command without the container.
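
To see whether a host-side FSL offers any eddy executables to test with, something like the following could be used. This is a minimal sketch: the helper name list_eddy_binaries is hypothetical, and it assumes the usual FSL layout where executables live under $FSLDIR/bin.

```shell
# Hypothetical helper: list eddy executables in an FSL installation on the host.
# Argument 1 (optional): FSL root directory; defaults to $FSLDIR (assumed layout: <root>/bin).
list_eddy_binaries() {
    fsl_dir="${1:-${FSLDIR:-}}"
    if [ -z "$fsl_dir" ] || [ ! -d "$fsl_dir/bin" ]; then
        echo "No FSL installation found (FSLDIR is unset or has no bin directory)" >&2
        return 1
    fi
    found=1
    for exe in "$fsl_dir"/bin/eddy*; do
        # The glob stays literal when nothing matches, so check that the file exists.
        [ -e "$exe" ] || continue
        echo "$exe"
        found=0
    done
    # Returns 0 when at least one eddy executable was listed, 1 otherwise.
    return "$found"
}
```

If it lists a CUDA build of eddy, the same arguments as in the interactive procedure can be passed to that binary outside the container.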

Another thing to try is to process our quickstart data (see "QuNex quick start using a Docker container" in the QuNex documentation). You would first have to run it all the way to hcp_fmri_surface and then continue with hcp_diffusion. This will at least tell us whether this is a data issue or a QuNex/system one.

Jure

Estephan,

Could you please run hcp_diffusion on a non-GPU node using QuNex 0.97.3 and the hcp_dwi_nogpu flag? One of my colleagues just said that he was also getting gibberish errors from eddy_cuda10.2, but when the non-GPU eddy was used he got a readable error.

Jure

I did this, and I got an error related to not being able to find “eddy_openmp” - an unrelated error to the above, right? See command and error log below.

Command:

msi_resources_time=12:00:00; msi_resources_nodes=1; msi_resources_ntaskspernode=24; msi_resources_mem=64000; msi_queue=agsmall; msi_resources_jobname=HCPDiff; \
study_sharedfolder=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02; \
qunex_container hcp_diffusion \
--batchfile=${study_sharedfolder}/processing/batch_K99Aim2.txt --sessionsfolder=${study_sharedfolder}/sessions --sessions=10021 --parsessions=1 --parelements=24 --overwrite=yes \
--hcp_dwi_nogpu \
--scheduler=SLURM,time=${msi_resources_time},nodes=${msi_resources_nodes},cpus-per-task=${msi_resources_ntaskspernode},mem=${msi_resources_mem},partition=${msi_queue},jobname=${msi_resources_jobname} \
--bind=${study_sharedfolder}:${study_sharedfolder} --container=${HOME}/qunex/qunex_suite-0.97.3.sif

error_hcp_diffusion_10021_2023-04-11_12.23.53.976341.log (86.0 KB)

Yes, this is unrelated. I need to check this in the container. Could you repeat the interactive process above and try using /opt/fsl/fsl/bin/eddy instead of /opt/fsl/fsl/bin/eddy_cuda10.2? I will take a look at why eddy_openmp cannot be found over the next week or two.

Not that helpful, only got a “Killed” message. I’ll try to run locally or using the example data you mentioned to see if something else happens. I’ll update you once I get it done.

(/opt/env/qunex) [QuNex ~]$ date; /opt/fsl/fsl/bin/eddy \
>   --cnr_maps \
>   --imain=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg \
>   --mask=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/nodif_brain_mask \
>   --index=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/index.txt \
>   --acqp=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/acqparams.txt \
>   --bvecs=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg.bvecs \
>   --bvals=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/Pos_Neg.bvals \
>   --fwhm=0 \
>   --topup=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/topup/topup_Pos_Neg_b0 \
>   --out=/home/moanae/shared/project_K99_ChrTMDHCP_qunex02/sessions/10001/hcp/10001/Diffusion/eddy/eddy_unwarped_images; \
> date
Tue Apr 11 14:21:43 CDT 2023
Killed
Tue Apr 11 14:22:18 CDT 2023
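
A bare "Killed" line is what the shell prints when a process receives SIGKILL. On a cluster this is often a memory limit being hit, although nothing in the output above confirms the cause. The hypothetical helper below (describe_exit_status is not an FSL or QuNex tool) only decodes the exit status, using the shell convention that a signal death is reported as 128 plus the signal number.

```shell
# Hypothetical helper: translate a shell exit status into a short description.
# Argument 1: the exit status, e.g. the value of $? right after a command.
describe_exit_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        # Shells report "128 + signal number" when a signal ended the process.
        echo "terminated by signal $((status - 128))"
    else
        echo "exited with error code $status"
    fi
}

# Usage (illustrative): run the eddy command, then immediately call
#   describe_exit_status $?
```

For a SIGKILL the status is 137, so the helper reports signal 9.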

Hi Jure, I finally found what the issue was for the hcp_diffusion in my case, after a long time looking into this. It was the echo spacing parameter that was erroneously provided in seconds. The batch file I used was generated from the information extracted from the json files, which had the echo spacing in seconds. As you know, for BOLD and spin-echo images the echo spacing parameter is needed in seconds, but for the DWI images it should be in milliseconds.
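
For anyone generating batch files from JSON sidecars, the unit fix amounts to multiplying the DWI value by 1000. A minimal sketch follows; the function name echo_spacing_s_to_ms is hypothetical, and how the value is read from the JSON file is left out because the thread does not name the field.

```shell
# Hypothetical helper: convert an echo spacing from seconds (as stored in the
# JSON files) to milliseconds (as expected for DWI in the batch file).
# Argument 1: echo spacing in seconds, e.g. 0.00069.
echo_spacing_s_to_ms() {
    # LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
    LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.6g\n", s * 1000 }'
}
```

For example, echo_spacing_s_to_ms 0.00069 prints 0.69. The BOLD and spin-echo values stay in seconds and need no conversion.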

I wonder if it would be beneficial to run a check for the echo spacing during processing with qunex to prevent this from happening again. Thank you for your help, the example data was very helpful to figure out the issue. Cheers.
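
Until such a check exists, a user-side plausibility test could look like the sketch below. It is not QuNex behaviour: the function name check_dwi_echo_spacing_ms and the default threshold of 0.05 are illustrative assumptions, based only on the observation that a value given in seconds is 1000 times smaller than the same value in milliseconds.

```shell
# Hypothetical check: flag a DWI echo spacing that looks like seconds instead of ms.
# Argument 1: the value written to the batch file.
# Argument 2 (optional): threshold below which the value is treated as suspicious
#                        (default 0.05, an illustrative assumption, not a QuNex rule).
check_dwi_echo_spacing_ms() {
    LC_ALL=C awk -v v="$1" -v t="${2:-0.05}" 'BEGIN {
        if (v + 0 <= 0)    { print "invalid"; exit 2 }
        if (v + 0 < t + 0) { print "suspicious: looks like seconds"; exit 1 }
        print "ok"
    }'
}
```

The function returns a non-zero status for suspicious or invalid values, so it can gate a job submission, for example with check_dwi_echo_spacing_ms 0.00069 || echo "check the DWI echo spacing units".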

Estephan