HTCondor & CrystFEL Tutorial

See also

For scaling and merging indexamajig output streams with CrystFEL’s partialator, refer to Partialator on HTCondor.

Indexamajig_condorjob example

1. Data analysis preparation

1.1. Preparing Analysis Script

Copy the indexamajig_condorjob script or the GitHub code to either the account home directory (/pal/home/{account}/{your_dir}) or the group folder (/pal/data/{group_dir}/{your_dir}).

case 1 : Copy indexamajig_htcondor directory

# check sample files location
[USERID@pal-ui-el7 indexamajig_htcondor]$ pwd
/pal/htcondor/

# Copy the tar file to the specified paths
[USERID@pal-ui-el7 ~]$ cp -rf /pal/htcondor/htcondor_sample_ori.tar /pal/{home, data}/{your_path}

# Change to /pal/{home, data}/{your_path} directory
[USERID@pal-ui-el7 ~]$ cd /pal/{home, data}/{your_path}

# Extract the tar file
[USERID@pal-ui-el7 your_path]$ tar xvf htcondor_sample_ori.tar

# Update script (1_exec_file_list_script.sh) from GitHub
[USERID@pal-ui-el7 your_path]$ rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/1_exec_file_list_script.sh
[USERID@pal-ui-el7 your_path]$ wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/1_exec_file_list_script.sh
[USERID@pal-ui-el7 your_path]$ chmod ug+x 1_exec_file_list_script.sh   # Grant execution permission

# Update script (2_submit_condor_indexing.sh) from GitHub
[USERID@pal-ui-el7 your_path]$ rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/2_submit_condor_indexing.sh
[USERID@pal-ui-el7 your_path]$ wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/2_submit_condor_indexing.sh
[USERID@pal-ui-el7 your_path]$ chmod ug+x 2_submit_condor_indexing.sh  # Grant execution permission

# Update script (3_exec_indexing.sh) from GitHub
[USERID@pal-ui-el7 your_path]$ rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/3_exec_indexing.sh
[USERID@pal-ui-el7 your_path]$ wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/3_exec_indexing.sh
[USERID@pal-ui-el7 your_path]$ chmod ug+x 3_exec_indexing.sh           # Grant execution permission

Note: Be sure to download the updated versions of the following scripts from GitHub:
  • 1_exec_file_list_script.sh

  • 2_submit_condor_indexing.sh

  • 3_exec_indexing.sh
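
The three per-script updates above could also be done in one loop. The following is a minimal sketch, not part of the distributed scripts: the function name `refresh_scripts` and the `DRY_RUN` switch are hypothetical, and it assumes you run it from inside the extracted indexamajig_htcondor directory. `wget -O` overwrites the existing file, so no separate `rm` step is needed.

```shell
#!/usr/bin/env bash
# Sketch: refresh the three helper scripts from GitHub in one loop.
# Assumption: the current directory is the extracted indexamajig_htcondor directory.
base_url="https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main"
scripts="1_exec_file_list_script.sh 2_submit_condor_indexing.sh 3_exec_indexing.sh"

# DRY_RUN=1 (the default in this sketch) only prints the commands.
# Set DRY_RUN=0 to actually download and set permissions.
DRY_RUN="${DRY_RUN:-1}"

refresh_scripts() {
    for s in $scripts; do
        if [ "$DRY_RUN" -eq 1 ]; then
            echo "wget -O $s $base_url/$s && chmod ug+x $s"
        else
            # Overwrite the local copy, then grant execution permission.
            wget -O "$s" "$base_url/$s" && chmod ug+x "$s"
        fi
    done
}

refresh_scripts
```

Review the printed commands first, then rerun with `DRY_RUN=0` once the paths look right.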

case 2 : Github clone

[USERID@pal-ui-el7 ~]$ cd /pal/{home, data}/{your_path}
# Change to /pal/{home, data}/{your_path} directory.
# If needed, create the {your_path} directory first with the "mkdir {your_path}" command.

[USERID@pal-ui-el7 your_path]$ git clone https://github.com/philiosi/indexamajig_htcondor.git
Cloning into 'indexamajig_htcondor'...
remote: Enumerating objects: 80, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 80 (delta 39), reused 47 (delta 16), pack-reused 0
Unpacking objects: 100% (80/80), done.

1.2. Preparing analysis data

Directory structure:

[hdf5]
├── [0000079-pal40]                     # data sample
│   ├── cheetah.ini
│   ├── cheetah.out
│   ├── cleaned.txt
│   ├── frames.txt
│   ├── h5files.txt
│   ├── log.txt
│   ├── original.ini
│   ├── peaks.txt
│   ├── r0079-detector0-class0-sum.h5
│   ├── r0079-detector0-class1-sum.h5
│   ├── r0079-detector0-class2-sum.h5
│   ├── r0079-detector0-class3-sum.h5
│   ├── status.txt
│   ├── ue_191027_SFX-r0079-c00.cxi
│   └── ue_191027_SFX-r0079-c00.h5
├── [0000080-pal40]
├── [0000081-pal40]
└── [indexamajig_htcondor]              # code base directory
    ├── 1_exec_file_list_script.sh      # [script] create lst list
    ├── 2_submit_condor_indexing.sh     # [script] submit indexamajig condor job
    ├── 3_exec_indexing.sh              # [script] to be executed by the condor job
    ├── file_list                       # [Directory] Files ('lst' files) to be processed by indexamajig
    ├── geom_file1.geom                 # [file] Example geom file 1
    ├── geom_file2.geom                 # [file] Example geom file 2
    ├── geom_files                      # [Directory] geom files
    ├── lib                             # [Directory] lib
    ├── mosflm.lp                       # [file] example mosflm file
    ├── pdb_file1.pdb                   # [file] example pdb file
    ├── r009400.lst                     # [file] example lst file
    ├── README.md
    └── SASE_1.stream                   # [file] example stream file

2. CXI File Lists Creation

2.1 Preparing files for analysis

Important: To use the lst file list generation script (1_exec_file_list_script.sh), the name of each data directory must end with a specific keyword.

  • (Ex) directories ending with ‘pal40’: 0000079-pal40, 0000080-pal40, …
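
Before generating the lst files, it can help to confirm which directories actually match the keyword. The helper below is a small sketch under that assumption; `list_run_dirs` is a hypothetical name and is not one of the provided scripts.

```shell
# list_run_dirs HDF5_DIR KEYWORD
# Print the name of each sub-directory of HDF5_DIR whose name ends with KEYWORD.
list_run_dirs() {
    # The trailing slash restricts the glob to directories.
    for d in "$1"/*"$2"/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        basename "$d"
    done
}

# Example (path is a placeholder):
#   list_run_dirs /pal/{your_path}/{your_directory}/hdf5 pal40
```

Directories such as indexamajig_htcondor are not printed because their names do not end with the keyword.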

CASE 1 : indexamajig_htcondor directory

Use sample files in the “htcondor_sample_ori”
  • please check location of example files below:

/pal/{your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/
[USERID@pal-ui-el7 hdf5]$ ll /pal/{your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/
total 104
drwxr-x---. 2 pal pal_users  4096 Sep  6 11:20 0000079-pal40
drwxr-x---. 2 pal pal_users  4096 Sep  6 11:20 0000080-pal40
drwxr-x---. 2 pal pal_users  4096 Sep  6 11:21 0000081-pal40
drwxrwx---. 6 pal pal_users  4096 Sep 22 15:28 indexamajig_htcondor

CASE 2 : GitHub clone

Copy the sample files in “/pal/htcondor/hdf5_sample”

(Ex) Copy data sets
[USERID@pal-ui-el7 condor]$ pwd
/pal/htcondor/hdf5
[USERID@pal-ui-el7 condor]$ cp -rf /pal/htcondor/hdf5 /pal/{your_path}/{your_directory}/
[USERID@pal-ui-el7 hdf5]$ ll
total 64
drwxrwx---. 2 pal pal_users 4096 Jun  3 13:19 0000079-pal40
drwxrwx---. 2 pal pal_users 4096 Jun  3 13:19 0000080-pal40
drwxrwx---. 2 pal pal_users 4096 Jun  3 13:19 0000081-pal40

CASE 3 : Use your own file

Step 1. Copy your own data sets to the location below:

  • File location : /pal/{your_path}/{your_directory}/hdf5

Note : Please refer to the directory structure in the section “1.2. Preparing analysis data”.

Step 2. Create your own lst file(s) wherever you want.

Example of cxi file in a single lst file
# relative path
../0000091-pal40/ue_191027_SFX-r0091-c00.cxi
# absolute path
/{your_path}/htcondor_sample/ue_191027_SFX/proc/cheetah/hdf5/0000091-pal40/ue_191027_SFX-r0091-c00.cxi

Warning

When executing ./2_submit_condor_indexing.sh, make sure to clearly specify the path (absolute or relative) of the lst file with the -f option.

2.2 Generating CXI file list

Execute the ‘1_exec_file_list_script.sh’ script

Step 1 : Set the ‘target’ value to the directory in which the lst files should be created (Default : ../{your_path}/{your_directory}/hdf5/indexamajig_htcondor/file_list)

1_exec_file_list_script.sh
# target directory will be created.
# Please change directory name what you want
target="file_list"

Step 2 : Run

  • “-d” : keyword used to select directories within the hdf5 directory; only directories whose names contain the keyword are processed (default: pal).

Usage: ./1_exec_file_list_script.sh -d pal40 (default:pal)
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./1_exec_file_list_script.sh
Usage: ./1_exec_file_list_script.sh -d pal40 (default:pal)
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./1_exec_file_list_script.sh -d pal40
../0000079-pal40/ue_191027_SFX-r0079-c00.cxi r0079c00
../0000080-pal40/ue_191027_SFX-r0080-c00.cxi r0080c00
../0000081-pal40/ue_191027_SFX-r0081-c00.cxi r0081c00
../0000081-pal40/ue_191027_SFX-r0081-c01.cxi r0081c01

Result

created lst file list
[USERID@pal-ui-el7 indexamajig_htcondor]$ ll ./file_list/
total 209
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0079c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0080c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0081c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0081c01.lst
[USERID@pal-ui-el7 indexamajig_htcondor]$ cat ./file_list/r0079c00.lst
../0000079-pal40/ue_191027_SFX-r0079-c00.cxi
  • 1_exec_file_list_script.sh generates one lst file per cxi file; each lst file contains the relative path to that cxi file.

  • You can generate lst files manually. Both absolute and relative paths for cxi files are allowed.

Example of a cxi file in a single lst file
# relative path
../0000091-pal40/ue_191027_SFX-r0091-c00.cxi

# absolute path
/{your_path}/htcondor_sample/ue_191027_SFX/proc/cheetah/hdf5/0000091-pal40/ue_191027_SFX-r0091-c00.cxi
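
For the manual route, the following sketch writes one lst file per cxi file, mirroring the output shown in this section. It is an illustration only and not the implementation of 1_exec_file_list_script.sh; the function name `make_lst_files` is hypothetical, and it assumes file names of the form `...-rNNNN-cNN.cxi` and that the relative paths are resolved from the indexamajig_htcondor directory.

```shell
# make_lst_files HDF5_DIR KEYWORD TARGET_DIR
# For every *.cxi file in a directory of HDF5_DIR ending with KEYWORD,
# write TARGET_DIR/<run><chunk>.lst containing the relative path to that file.
make_lst_files() {
    hdf5_dir="$1"; keyword="$2"; target="$3"
    mkdir -p "$target"
    for cxi in "$hdf5_dir"/*"$keyword"/*.cxi; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$cxi" ] || continue
        # ue_191027_SFX-r0079-c00.cxi -> r0079c00
        name=$(basename "$cxi" .cxi | sed -E 's/.*-(r[0-9]+)-(c[0-9]+)$/\1\2/')
        run_dir=$(basename "$(dirname "$cxi")")
        printf '../%s/%s\n' "$run_dir" "$(basename "$cxi")" > "$target/$name.lst"
    done
}

# Example, run from inside indexamajig_htcondor:
#   make_lst_files .. pal40 file_list
```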

3 Submit indexamajig condor jobs

3.1 HTCondor job submission overview

Submitting jobs to HTCondor based on indexamajig inputs

  • Sequentially submit jobs for each input geom file(s) and lst file(s)

2_submit_condor_indexing.sh job submission example
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g pal1_new12.geom -i xgandalf -j 72 -f file_list -o SASE_1.stream -p 1vds_sase_temp3.pdb -e "--int-radius=3,4,5 --threshold=600 --min-snr=4 --min-gradient=100000"
  • “-g” : a single geometry file, or a directory containing multiple geom files

  • “-i” : indexing method - mosflm, xds, asdf, dirax, xgandalf

  • “-j” : number of CPUs

  • “-f” : a single lst file (.lst), or a directory containing multiple lst files

  • “-o” : stream file

  • “-p” : pdb file

  • “-e” : additional indexamajig parameters such as -p, --no-check-peaks, --multi, --int-radius, --threshold, --min-snr, --min-gradient, etc.
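
To make the role of each option concrete, the sketch below shows how the values could map onto an indexamajig command line for a single lst file. This is an assumption for illustration: the actual command is assembled inside 3_exec_indexing.sh, whose contents are not reproduced here, and `build_indexamajig_cmd` is a hypothetical helper that only prints the command.

```shell
# build_indexamajig_cmd GEOM METHOD NCPU LST STREAM PDB [EXTRA]
# Print (do not run) an indexamajig command built from the wrapper options:
#   -g geometry file, -i indexing method, -j CPUs, -f lst file,
#   -o stream file, -p pdb file, -e extra parameters.
build_indexamajig_cmd() {
    # ${7:+ $7} appends the extra parameters only when they are given.
    echo "indexamajig -g $1 --indexing=$2 -j $3 -i $4 -o $5 -p $6${7:+ $7}"
}

build_indexamajig_cmd pal1_new12.geom xgandalf 72 file_list/r0079c00.lst \
    SASE_1.stream 1vds_sase_temp3.pdb "--int-radius=3,4,5 --threshold=600"
```

Note that the wrapper's “-f” corresponds to indexamajig's input list option (-i), while the wrapper's “-i” selects the indexing method.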

3.2 Output Setting

Change the values of ‘stream_dir’ and ‘log’ if needed. Each directory will be created.

2_submit_condor_indexing.sh, lines 16 to 42
# debug print option
# ex) if [ $DEBUG -eq 1 ]; then echo "[debug] -f option is directory : mf"; fi
DEBUG=1

# Input
# The directory location is determined based on the input parameter.
geom_dir="" # Do not assign a value. -g option parameter
lst_dir="" # Do not assign a value. -f option parameter

# Output
# 'stream_dir' and 'log' directories are required. Change the directory names as needed.
# Default directories are 'file_stream' and 'log'
stream_dir="file_stream"
log="log"

# create folder for output and log
PROCDIR="$( cd "$( dirname "$0" )" && pwd -P )"

# fourc input type
# - 1010 : 10 multi lst, multi geom
# - 1001 : 9  multi lst, single geom
# - 0110 : 6  single lst, multi geom
# - 0101 : 5  single lst, single geom
in_type=0

# assign memory
MEM=360
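
The four `in_type` codes in the excerpt can be read as a 4-bit flag: multi lst (8), single lst (4), multi geom (2), single geom (1). The sketch below derives the same decimal values from whether the -f and -g arguments are directories. It is an illustration of the encoding only; `classify_input` is a hypothetical name, and the real script may compute the value differently.

```shell
# classify_input LST_PATH GEOM_PATH
# Print the in_type value for the given -f and -g arguments:
#   10 (1010) multi lst, multi geom     9 (1001) multi lst, single geom
#    6 (0110) single lst, multi geom    5 (0101) single lst, single geom
classify_input() {
    in_type=0
    # A directory means "multiple files"; anything else is treated as a single file.
    if [ -d "$1" ]; then in_type=$((in_type + 8)); else in_type=$((in_type + 4)); fi
    if [ -d "$2" ]; then in_type=$((in_type + 2)); else in_type=$((in_type + 1)); fi
    echo "$in_type"
}

# Example:
#   classify_input file_list geom_files/geom_file1.geom
```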

3.3 Job Submission

  • geom_files : directory containing multiple geom files

  • file_list : directory containing multiple lst files

multiple geoms and multiple lsts
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files -i xgandalf -j 72 -f file_list -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-snr=4 --min-gradient=100000"
multiple geoms and single lst
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files -i xgandalf -j 72 -f file_list/r009100.lst -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-snr=4 --min-gradient=100000"
single geom and multiple lsts
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files/geom_file1.geom -i xgandalf -j 72 -f file_list -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-snr=4 --min-gradient=100000"
single geom and single lst
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files/geom_file1.geom -i xgandalf -j 72 -f file_list/r009100.lst -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-snr=4 --min-gradient=100000"

Warning

Make sure the paths (absolute or relative) of the files given to each option (-g, -f, -o, -p) are correct.

4 HTCondor job management

Condor manual : HTCondor Version 9.8.1 Manual.

4.1. Checking the Condor Queue after Running 2_submit_condor_indexing.sh

Verify the Condor queue status (condor_q) after executing 2_submit_condor_indexing.sh.

Initially, jobs will be in the IDLE state before resource allocation, then transition to the RUN state according to HTCondor scheduling policies.

Check job status and errors: Analyzing Jobs in HTCondor
  • condor_q -analyze {JOB_IDS}: Shows the scheduling status or error information for the jobs.

  • condor_q -better-analyze {JOB_IDS}: more detailed analysis compared to -analyze

  • condor_q -l {JOB_IDS}: Provides detailed information about the jobs.

Note : If other users have already submitted jobs, resource allocation may be delayed according to the scheduling policies. Please refer to the HTCondor References chapter for information on the job queue and priority.

4.2. HTCondor Resource Status

You can check the status of Condor resources:
  • Verify the allocation (Claimed) status of jobs on each Worker Node.

Example:

[USERID@pal-ui-el7 indexamajig_htcondor]$ condor_status
Name                         OpSys      Arch   State     Activity LoadAv Mem     ActvtyTime
slot1@pal-wn1001.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+00:33:44
slot1_1@pal-wn1001.sdfarm.kr LINUX      X86_64 Claimed   Busy     75.940 368640  0+00:28:54
slot1@pal-wn1002.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:26:17
slot1_1@pal-wn1002.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.570 368640  0+00:29:42
slot1@pal-wn1003.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:27:53
slot1_1@pal-wn1003.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.530 368640  0+00:29:41
slot1@pal-wn1004.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:25:42
slot1_1@pal-wn1004.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.550 368640  0+00:29:42
slot1@pal-wn1005.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:25:41
slot1_1@pal-wn1005.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.630 368640  0+00:29:42
slot1@pal-wn1006.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+20:32:27
slot1_1@pal-wn1006.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.580 368640  0+00:29:36
slot1@pal-wn1007.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:25:22
slot1_1@pal-wn1007.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.520 368640  0+00:29:35
slot1@pal-wn1008.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:24:48
slot1_1@pal-wn1008.sdfarm.kr LINUX      X86_64 Claimed   Busy     71.580 368640  0+00:29:02
slot1@pal-wn1009.sdfarm.kr   LINUX      X86_64 Unclaimed Idle      0.000  18030  0+14:24:31
slot1_1@pal-wn1009.sdfarm.kr LINUX      X86_64 Claimed   Busy     72.000 368640  0+00:29:39

               Machines Owner Claimed Unclaimed Matched Preempting  Drain
  X86_64/LINUX       18     0       9         9       0          0      0
         Total       18     0       9         9       0          0      0
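
When the slot list is long, a per-state count is easier to read than the full table. The sketch below tallies the State column of `condor_status` output read from standard input; `count_slot_states` is a hypothetical helper, and it assumes the default column layout shown in the example (State is the fourth column, slot names start with "slot").

```shell
# count_slot_states
# Read condor_status output on stdin and print "<State> <count>" per slot state.
count_slot_states() {
    # Only lines whose first field starts with "slot" are slot rows;
    # the header and summary lines are ignored.
    awk '$1 ~ /^slot/ { n[$4]++ } END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Example:
#   condor_status | count_slot_states
```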

4.3. Execution Results

The indexing process logs are generated in the ../indexamajig_htcondor/log/ directory:
  • *.error: Indexing log, elapsed time

  • *.log: condor_submit information

  • *.out: Output log

Example:

[USERID@pal-ui-el7 log]$ cd log
[USERID@pal-ui-el7 log]$ ll
total 8242
-rw-r--r--. 1 USERID USERID  795612 Aug 29 12:00 geom_file1_xgandalf_r0079c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID    1838 Aug 29 12:00 geom_file1_xgandalf_r0079c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID       0 Aug 29 11:30 geom_file1_xgandalf_r0079c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID 1038891 Aug 29 12:06 geom_file1_xgandalf_r0080c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID    1837 Aug 29 12:06 geom_file1_xgandalf_r0080c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID       0 Aug 29 11:30 geom_file1_xgandalf_r0080c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID 1127187 Aug 29 12:08 geom_file1_xgandalf_r0081c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID    1162 Aug 29 12:06 geom_file1_xgandalf_r0081c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID       0 Aug 29 11:30 geom_file1_xgandalf_r0081c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID    1706 Aug 29 11:31 geom_file1_xgandalf_r0081c01_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID    1220 Aug 29 11:31 geom_file1_xgandalf_r0081c01_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID       0 Aug 29 11:30 geom_file1_xgandalf_r0081c01_SASE_1_condor.out

Note

The naming convention for the log and stream files is as follows:

output = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.out
error = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.error
log = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.log

stream = file_stream/{geom_file_name}_{indexing method}_{runnum}_{streamname}.stream
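
As a worked example of the naming convention above, the sketch below composes the four file names for one job. The function name `job_file_names` is hypothetical; it assumes the default `log` and `file_stream` directories and that the geom and stream names are used without their `.geom` and `.stream` extensions, as in the log listing shown earlier.

```shell
# job_file_names GEOM METHOD RUNNUM STREAM
# Print the output, error, log and stream paths for one job.
job_file_names() {
    geom=$(basename "$1" .geom)       # geom_files/geom_file1.geom -> geom_file1
    stream=$(basename "$4" .stream)   # SASE_1.stream -> SASE_1
    prefix="${geom}_$2_$3_${stream}"
    echo "log/${prefix}_condor.out"
    echo "log/${prefix}_condor.error"
    echo "log/${prefix}_condor.log"
    echo "file_stream/${prefix}.stream"
}

# Example:
#   job_file_names geom_files/geom_file1.geom xgandalf r0079c00 SASE_1.stream
```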

4.4. Job History

View the history of HTCondor jobs completed to date (condor_history).

Example:

[USERID@pal-ui-el7 ~]$ condor_history | more
ID        OWNER      SUBMITTED   RUN_TIME     ST    COMPLETED  CMD
56235.0   userid     6/3 22:28   0+00:10:11   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
56237.0   userid     6/3 22:28   0+00:09:11   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
56234.0   userid     6/3 22:28   0+00:10:12   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
56233.0   userid     6/3 22:28   0+00:10:11   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
56232.0   userid     6/3 22:28   0+00:10:11   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
56231.0   userid     6/3 22:28   0+00:10:11   C     6/4  15:04 ../path/3_exec_indexing.sh ... omitted ...
... ... omitted ... ...
  • ID : The cluster/process id of the job.

  • OWNER : The owner of the job.

  • SUBMITTED : The month, day, hour, and minute the job was submitted to the queue.

  • RUN_TIME : Remote wall clock time accumulated by the job to date in days, hours, minutes, and seconds, given as the job ClassAd attribute RemoteWallClockTime.

  • ST : Completion status of the job (C = completed and X = removed).

  • COMPLETED : The time the job was completed.

  • CMD : The name of the executable.