[RESOLVED] Comlogs are being duplicated when using qunex_container (hcp_freesurfer)

Description:

Comlogs are being duplicated when using qunex_container, and the log status does not appear to match the actual hcp_freesurfer status. I am using qunex and qunex_container on Yale’s Farnam cluster, following the conventions from the training document on Grace/n3. For example, session pb8072 seems to have run correctly when I check the session folder (/gpfs/ysm/scratch60/kelmendi/rgg27/ExTx/session/pb8072/hcp/pb8072/T1w/pb8072/scripts); however, there is no "done_" comlog, just several error comlogs. The dozens of error_ comlogs per session make this difficult, if not impossible, to debug. The same problem (duplicated error/done comlogs) happened when I ran hcp_pre_freesurfer. Thanks in advance for any help with fixing this.

Call:

#ran from: /gpfs/ysm/scratch60/kelmendi/rgg27/ExTx

DIR1="/gpfs/ysm/scratch60/kelmendi/rgg27"
DIR2="/gpfs/ysm/project/kelmendi"
my_study_folder="/gpfs/ysm/scratch60/kelmendi/rgg27/ExTx"
batch_file=$my_study_folder/processing/batch.txt
qunex_container=/gpfs/ysm/project/kelmendi/software/qunex/qunex_suite-0.90.1.sif

/gpfs/ysm/project/pittenger/zt84/qunex_container hcp_freesurfer
–sessionsfolder="$my_study_folder/sessions"
–sessions="$batch_file"
–container="$qunex_container"
–scheduler=“SLURM,time=4-04:00:00,ntasks=1,cpus-per-task=4,mem-per-cpu=8000,partition=general”
–overwrite=yes

[Note that this command did appear to successfully submit 21 sessions to 21 parallel jobs]

Path:

You can see the duplicated error logs here (I have opened permissions and this should be accessible from Grace even though it’s on Farnam):

/gpfs/ysm/scratch60/kelmendi/rgg27/ExTx/processing/logs/comlogs

Hi Rachael,

Unfortunately, I do not have access to the /gpfs/ysm/scratch60/kelmendi/ folder.

I see you are using version 0.90.1. This is not the latest version, and it includes some bugs regarding parallel execution over multiple sessions. Please use both the latest released version of the container (/gpfs/project/fas/n3/software/Singularity/qunex_suite-latest.sif or /gpfs/project/fas/n3/software/Singularity/qunex_suite-0.90.6.sif on Grace) and the latest version of the qunex_container script (/gpfs/project/fas/n3/software/qunex/bin/qunex_container).

Let me know how it goes with the latest version of QuNex.

Cheers, Jure

Thanks so much for the quick reply! I accessed those directories from my Grace login and used the latest qunex_container script and container image, but the exact same problem occurs (many duplicated error files):

DIR1="/gpfs/ysm/scratch60/kelmendi/rgg27"
my_study_folder="/gpfs/ysm/scratch60/kelmendi/rgg27/ExTx"
batch_file=$my_study_folder/processing/batch.txt
qunex_container=/gpfs/project/fas/n3/software/Singularity/qunex_suite-0.90.6.sif

/gpfs/project/fas/n3/software/qunex/bin/qunex_container hcp_freesurfer
–sessionsfolder="$my_study_folder/sessions"
–sessions="$batch_file"
–container="$qunex_container"
–scheduler=“SLURM,time=2-00:00:00,ntasks=1,cpus-per-task=2,mem-per-cpu=8000,partition=pi_anticevic_z,account=anticevic_z”
–overwrite=yes

Hi,

Could you please post some of these duplicated logfiles here (attach them to your message)? I am unable to access your study dir, so I cannot really say what is going on. "Duplicate" error logs can be expected behavior: if you run a command multiple times, each run will create an error or a done log with a different timestamp.
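To get a quick overview of how many error comlogs each session has accumulated, something along these lines should work (plain bash, nothing QuNex-specific, assuming the comlogs location you posted above):

cd /gpfs/ysm/scratch60/kelmendi/rgg27/ExTx/processing/logs/comlogs
# extract the session id (e.g. pb8393) from each error comlog name and count per session
ls error_hcp_freesurfer_*.log | sed 's/error_hcp_freesurfer_\(pb[0-9]*\)_.*/\1/' | sort | uniq -c

If each session shows only as many error logs as the number of times you launched the command, this would be the expected behavior; many more than that would point to something else.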

Cheers, Jure

Hi, I’ve attached a handful of the error output logs for one session. As you can see from the timestamps, these error logs are being created within seconds of each other, per session, from a single run of hcp_freesurfer. I should also note that I had completely deleted all FreeSurfer directories before I ran this command from qunex_container, and I only ran it once.

-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:12 error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352310.log
-rw-rw-r-- 1 rgg27 kelmendi 27732 Jun 10 15:12 error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352311.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352313.log
-rw-rw-r-- 1 rgg27 kelmendi 22959 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.12.1623352367.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352399.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352400.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352401.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352402.log
-rw-rw-r-- 1 rgg27 kelmendi 68544 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352403.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352404.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352405.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352406.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8826_2021-06-10_15.11.1623352310.log
-rw-rw-r-- 1 rgg27 kelmendi 25878 Jun 10 15:12 error_hcp_freesurfer_pb8826_2021-06-10_15.11.1623352311.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8826_2021-06-10_15.11.1623352313.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8826_2021-06-10_15.11.1623352315.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352401.log
-rw-rw-r-- 1 rgg27 kelmendi 60439 Jun 10 15:14 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352402.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352403.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352404.log
-rw-rw-r-- 1 rgg27 kelmendi 51408 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352405.log
-rw-rw-r-- 1 rgg27 kelmendi 51408 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352406.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8826_2021-06-10_15.13.1623352407.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8852_2021-06-10_15.11.1623352311.log
-rw-rw-r-- 1 rgg27 kelmendi 26202 Jun 10 15:12 error_hcp_freesurfer_pb8852_2021-06-10_15.11.1623352312.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8852_2021-06-10_15.11.1623352313.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8852_2021-06-10_15.11.1623352315.log
-rw-rw-r-- 1 rgg27 kelmendi 60446 Jun 10 15:14 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352403.log
-rw-rw-r-- 1 rgg27 kelmendi 51408 Jun 10 15:13 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352404.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352405.log
-rw-rw-r-- 1 rgg27 kelmendi 51408 Jun 10 15:13 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352406.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352407.log
-rw-rw-r-- 1 rgg27 kelmendi 68544 Jun 10 15:13 error_hcp_freesurfer_pb8852_2021-06-10_15.13.1623352408.log
-rw-rw-r-- 1 rgg27 kelmendi 27605 Jun 10 15:12 error_hcp_freesurfer_pb8993_2021-06-10_15.11.1623352312.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:12 error_hcp_freesurfer_pb8993_2021-06-10_15.11.1623352313.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:12 error_hcp_freesurfer_pb8993_2021-06-10_15.11.1623352315.log
-rw-rw-r-- 1 rgg27 kelmendi 22792 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.12.1623352367.log
-rw-rw-r-- 1 rgg27 kelmendi 60527 Jun 10 15:14 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352404.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352405.log
-rw-rw-r-- 1 rgg27 kelmendi 51408 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352406.log
-rw-rw-r-- 1 rgg27 kelmendi 34272 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352408.log
-rw-rw-r-- 1 rgg27 kelmendi 68544 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352409.log
-rw-rw-r-- 1 rgg27 kelmendi 17136 Jun 10 15:13 error_hcp_freesurfer_pb8993_2021-06-10_15.13.1623352410.log

error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352310.log (33.5 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352311.log (27.1 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.11.1623352313.log (16.7 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.12.1623352367.log (22.4 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352399.log (16.7 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352400.log (33.5 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352401.log (33.5 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352402.log (33.5 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352403.log (66.9 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352404.log (16.7 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352405.log (16.7 KB)
error_hcp_freesurfer_pb8393_2021-06-10_15.13.1623352406.log (16.7 KB)

The outgoing calls look OK; I just have to figure out why there are duplicate calls for each session. The easiest way would be if I got access to the study’s folder. If that is not possible, please upload the relevant runlogs as well.

Hi Rachael,

I think I found the culprit. Your command calls use some unusual typographic characters (–, “, ”); these characters cause all kinds of issues in bash. You should use plain ASCII characters instead: two regular hyphens (--) rather than the en-dash (–) before option names, and straight quotes (") rather than “ and ”. When I amended the command accordingly (see below), my test run worked. It was not executed on your study, since I do not have access to it; I tested it on one of my own studies.
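To illustrate why this matters, here is a minimal sketch (plain bash, nothing QuNex-specific): an en-dash is a different byte sequence than two hyphens, so the option is not recognized as a flag, and a string wrapped in curly quotes keeps the quote characters as part of the value instead of being treated as quoted.

# two ASCII hyphens: starts with bytes 2d 2d, recognized as a long option
printf '%s' '--overwrite=yes' | od -An -tx1 | head -1
# en-dash: starts with the UTF-8 bytes e2 80 93, passed through as a plain argument
printf '%s' '–overwrite=yes' | od -An -tx1 | head -1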

my_study_folder="/gpfs/ysm/scratch60/kelmendi/rgg27/ExTx"
batch_file="${my_study_folder}/processing/batch.txt"
qunex_container="/gpfs/project/fas/n3/software/Singularity/qunex_suite-0.90.6.sif"

/gpfs/project/fas/n3/software/qunex/bin/qunex_container hcp_freesurfer \
    --sessionsfolder="${my_study_folder}/sessions" \
    --sessions="${batch_file}" \
    --container="${qunex_container}" \
    --scheduler="SLURM,time=2-00:00:00,ntasks=1,cpus-per-task=2,mem-per-cpu=8000,partition=pi_anticevic_z,account=anticevic_z" \
    --overwrite="yes"
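As a sanity check after re-running with the fixed command, a single run should leave one comlog per session, e.g. a done_hcp_freesurfer_<session>_<timestamp>.log in processing/logs/comlogs once a session completes successfully (assuming the same naming convention as the error comlogs you posted):

ls /gpfs/ysm/scratch60/kelmendi/rgg27/ExTx/processing/logs/comlogs/done_hcp_freesurfer_*.log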