Description:
Not sure if it’s better to get help on this here or on the HCP Pipelines page, but I thought I’d start here. I have 40 sets of scans that I’m running through the longitudinal HCP pipeline in QuNex, and 33 have completed successfully. They all have the same data structure, and I’m running them with the same script. Of the remaining 7, I’m getting the same error in 6 of them and a segmentation fault in the 7th (with the same result after re-running it).
Call:
Here is my qunex call:
qdir=/work/long/qunex/
projectname=DBIS_longComb_21133
con=$qdir/qunex_suite-1.1.0.sif
qunex_container run_recipe \
--container="${con}" \
--bind="${qdir}" \
--recipe_file="$qdir/$projectname/sessions/specs/recipe.yaml" \
--recipe="hcp_longitudinal" \
--scheduler="SLURM,jobname=hcp_long,cpus-per-task=6,time=96:00:00,mem-per-cpu=16000,partition=scavenger"
and my recipe:
global_parameters:
    sessionsfolder : /work/long/qunex/DBIS_longComb_21133/sessions
    sessions : p45,p52mprage,p52CSmprage1,p52CSmprage2,p52CSmprage3,p52CSmprage4
    overwrite : "yes"
    batchfile : /work/long/qunex/DBIS_longComb_21133/processing/batch.txt
    parsessions : 6
recipes:
    hcp_longitudinal:
        parsessions: 6
        commands:
            - create_session_info:
                mapping: /work/long/qunex/DBIS_longComb_21133/sessions/specs/hcp_mapping.txt
            - setup_hcp
            - create_batch:
                targetfile: /work/long/qunex/DBIS_longComb_21133/processing/batch.txt
                paramfile : /work/long/qunex/DBIS_longComb_21133/sessions/specs/parameters.txt
            - hcp_pre_freesurfer
            - hcp_freesurfer
            - hcp_post_freesurfer
            - hcp_long_freesurfer
            - hcp_long_post_freesurfer
Logs:
For the error that repeats in 6 of the sets, here’s the relevant output:
$ tail -20 /work/long/qunex/DBIS_longComb_21222/processing/logs/comlogs/error_hcp_long_post_freesurfer_21222_2025-03-21_03.08.42.941591.log
Info: Time to read /work/long/qunex/DBIS_longComb_21222/subjects/21222/p45.long.base/T1w/wmparc.nii.gz was 0.922471 seconds.
some jobs had errors, please check /work/long/qunex/DBIS_longComb_21222/processing/logs/comlogs/extra_logs_hcp_long_post_freesurfer_21222/PostFreeSurferPipelineLongLauncher.sh.errjobs308835.15.log
Fri Mar 21 06:02:46 EDT 2025:PostFreeSurferPipelineLongLauncher.sh: While running '/opt/HCP/HCPpipelines/PostFreeSurfer/PostFreeSurferPipelineLongLauncher.sh --study-folder=/work/long/qunex/DBIS_longComb_21222/subjects/21222 --subject=21222 --sessions=p45@p52CSmprage1@p52CSmprage2@p52CSmprage3@p52CSmprage4@p52mprage --longitudinal-template=base --t1template=/opt/HCP/HCPpipelines/global/templates/MNI152_T1_1mm.nii.gz --t1templatebrain=/opt/HCP/HCPpipelines/global/templates/MNI152_T1_1mm_brain.nii.gz --t1template2mm=/opt/HCP/HCPpipelines/global/templates/MNI152_T1_2mm.nii.gz --t2template=/opt/HCP/HCPpipelines/global/templates/MNI152_T2_1mm.nii.gz --t2templatebrain=/opt/HCP/HCPpipelines/global/templates/MNI152_T2_1mm_brain.nii.gz --t2template2mm=/opt/HCP/HCPpipelines/global/templates/MNI152_T2_2mm.nii.gz --templatemask=/opt/HCP/HCPpipelines/global/templates/MNI152_T1_1mm_brain_mask.nii.gz --template2mmmask=/opt/HCP/HCPpipelines/global/templates/MNI152_T1_2mm_brain_mask_dil.nii.gz --fnirtconfig=/opt/HCP/HCPpipelines/global/config/T1_2_MNI152_2mm.cnf --freesurferlabels=/opt/HCP/HCPpipelines/global/config/FreeSurferAllLut.txt --surfatlasdir=/opt/HCP/HCPpipelines/global/templates/standard_mesh_atlases --grayordinatesres=2 --grayordinatesdir=/opt/HCP/HCPpipelines/global/templates/91282_Greyordinates --hiresmesh=164 --lowresmesh=32 --subcortgraylabels=/opt/HCP/HCPpipelines/global/config/FreeSurferSubcorticalLabelTableLut.txt --refmyelinmaps=/opt/HCP/HCPpipelines/global/templates/standard_mesh_atlases/Conte69.MyelinMap_BC.164k_fs_LR.dscalar.nii --regname=MSMSulc --parallel-mode=BUILTIN --logdir=/work/long/qunex/DBIS_longComb_21222/processing/logs/comlogs/extra_logs_hcp_long_post_freesurfer_21222':
Fri Mar 21 06:02:46 EDT 2025:PostFreeSurferPipelineLongLauncher.sh: ERROR: 'return' command failed with return code: 1
===> ERROR: Command returned with nonzero exit code
---------------------------------------------------
script: PostFreeSurferPipelineLongLauncher.sh
stopped at line: 263
call: return 1
expanded call: return 1
hostname: dcc-adrc-01
exit code: 1
---------------------------------------------------
===> Aborting execution!
and
$ grep -A8 ERROR /work/long/qunex/DBIS_longComb_21222/processing/logs/comlogs/extra_logs_hcp_long_post_freesurfer_*/PostFreeSurferPipelineLongLauncher.sh.*.15.e.log
ERRORS loading scene, output image may be incorrect.
NAME OF FILE: p45.long.base.StrainJ_MSMAll.164k_fs_LR.dscalar.nii
PATH TO FILE: /work/long/qunex/DBIS_longComb_21222/subjects/21222/p45.long.base/MNINonLinear
File was not found
NAME OF FILE: p45.long.base.StrainR_MSMAll.164k_fs_LR.dscalar.nii
PATH TO FILE: /work/long/qunex/DBIS_longComb_21222/subjects/21222/p45.long.base/MNINonLinear
File was not found
Info: Time to read /work/long/qunex/DBIS_longComb_21222/subjects/21222/p45.long.base/MNINonLinear/p45.long.base.L.midthickness.164k_fs_LR.surf.gii was 0.183195 seconds.
FWIW, I noticed that the p45.long.base.StrainJ_MSMAll.164k_fs_LR.dscalar.nii file does not exist for ANY of the jobs I’m running (including successful completions), but that p45.long.base.StrainJ_MSMSulc.164k_fs_LR.dscalar.nii exists for ALL of the jobs I’m running (including failures).
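Since the missing MSMAll files also show up in the successful runs, the scene-loading message may not be what actually makes the job fail. One way to look for other messages in the per-job stderr logs is sketched below. The function name `other_errors` and the keyword list are made up for illustration, and the `*.e.log` naming is assumed from the grep command above; it is not a QuNex or HCP Pipelines tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print error-looking lines from the per-job stderr logs,
# skipping the scene-loading message, which (per the observation above) also
# appears for jobs that complete successfully.
# Usage: other_errors <extra_logs directory>
other_errors() {
    local dir="$1" f
    for f in "$dir"/*.e.log; do
        # Skip empty files and the literal pattern left by an unmatched glob.
        [ -s "$f" ] || continue
        # The keyword list is an assumption; extend it as needed.
        grep -H -n -i -E 'error|segmentation fault|killed|cannot|no such file' "$f" \
            | grep -v 'ERRORS loading scene'
    done
    # Not finding anything is not a failure of the helper itself.
    return 0
}
```

For the example above, this could be called as `other_errors /work/long/qunex/DBIS_longComb_21222/processing/logs/comlogs/extra_logs_hcp_long_post_freesurfer_21222`. Empty output would only mean that none of the assumed keywords matched, not that the logs are clean.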
For the seg fault, here’s what I’m getting:
$ tail -30 /work/long/qunex/DBIS_longComb_21021/processing/logs/comlogs/error_hcp_freesurfer_p52CSmprage4_2025-03-21_11.56.50.328700.log
029: dt: 0.0500, sse=76080.5, rms=0.428 (0.051%)
030: dt: 0.0500, sse=76114.1, rms=0.428 (0.025%)
positioning took 1.6 minutes
tol=1.0e-04, sigma=2.0, host=dcc-c, nav=16, nbrs=2, l_surf_repulse=5.000, l_tspring=0.100, l_nspring=0.050, l_location=0.250, l_curv=0.100
mom=0.00, dt=0.50
Segmentation fault
Linux dcc-courses-15 5.14.0-503.14.1.el9_5.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 19 21:25:22 EST 2024 x86_64 GNU/Linux
recon-all -s p52CSmprage4 exited with ERRORS at Sat Mar 22 03:21:03 EDT 2025
For more details, see the log file /work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w/p52CSmprage4/scripts/recon-all.log
To report a problem, see http://surfer.nmr.mgh.harvard.edu/fswiki/BugReporting
Sat Mar 22 03:21:03 EDT 2025:FreeSurferPipeline.sh: While running '/opt/HCP/HCPpipelines/FreeSurfer/FreeSurferPipeline.sh --session-dir=/work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w --session=p52CSmprage4 --processing-mode=HCPStyleData --t1=/work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w/T1w_acpc_dc_restore.nii.gz --t1brain=/work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w/T1w_acpc_dc_restore_brain.nii.gz --t2=/work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w/T2w_acpc_dc_restore.nii.gz':
Sat Mar 22 03:21:03 EDT 2025:FreeSurferPipeline.sh: ERROR: '"${recon_all_cmd[@]}"' command failed with return code: 1
===> ERROR: Command returned with nonzero exit code
---------------------------------------------------
script: FreeSurferPipeline.sh
stopped at line: 573
call: "${recon_all_cmd[@]}"
expanded call: "${recon_all_cmd[@]}"
hostname: dcc-courses-15
exit code: 1
---------------------------------------------------
===> Aborting execution!
(It popped up in exactly the same place when I re-ran it.)
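To pin down which recon-all step was running when the segmentation fault occurred, the step markers in recon-all.log can be listed. This is a minimal sketch that assumes the log contains the `#@#` step-marker lines recon-all normally writes; the function name `last_recon_steps` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show the last few "#@#" step markers in a recon-all.log
# together with their line numbers, so the step that was running when the
# segmentation fault hit can be identified.
# Usage: last_recon_steps <recon-all.log> [how many markers, default 3]
last_recon_steps() {
    local log="$1" n="${2:-3}"
    grep -n '^#@#' "$log" | tail -n "$n"
}
```

For the failing session, this could be run on the log file named in the output above, for example `last_recon_steps /work/long/qunex/DBIS_longComb_21021/sessions/p52CSmprage4/hcp/p52CSmprage4/T1w/p52CSmprage4/scripts/recon-all.log`. The last marker printed should be the step during which the crash happened.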
Any suggestions on how to troubleshoot these would be appreciated! Happy to provide additional logs/info as needed. Thanks!