HTCondor & CrystFEL Tutorial
See also
For scaling and merging indexamajig output streams with CrystFEL’s
partialator, refer to Partialator on HTCondor.
- Indexamajig_condorjob example
KISTI Storage - /pal/htcondor/htcondor_sample_ori.tar
1. Data analysis preparation
1.1. Preparing Analysis Script
Copy the indexamajig_condorjob script or the GitHub code to either the account home directory (/pal/home/{account}/{your_dir}) or the group folder (/pal/data/{group_dir}/{your_dir}).
case 1 : Copy indexamajig_htcondor directory
# check sample files location
[USERID@pal-ui-el7 indexamajig_htcondor]$ pwd
/pal/htcondor/
# Copy the tar file to the specified paths
[USERID@pal-ui-el7 ~]$ cp -rf /pal/htcondor/htcondor_sample_ori.tar /pal/{home, data}/{your_path}
# Change to /pal/{home, data}/{your_path} directory
[USERID@pal-ui-el7 ~]$ cd /pal/{home, data}/{your_path}
# Extract the tar file
[USERID@pal-ui-el7 your_path] tar xvf indexamajig_htcondor_ori.tar
# Script(1_exec_file_list_script.sh) update from github
[USERID@pal-ui-el7 your_path] rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/1_exec_file_list_script.sh
[USERID@pal-ui-el7 your_path] wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/1_exec_file_list_script.sh
[USERID@pal-ui-el7 your_path] chmod ug+x 1_exec_file_list_script.sh # Granting Execution Permission
# Script(2_submit_condor_indexing.sh) update from github
[USERID@pal-ui-el7 your_path] rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/2_submit_condor_indexing.sh
[USERID@pal-ui-el7 your_path] wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/2_submit_condor_indexing.sh
[USERID@pal-ui-el7 your_path] chmod ug+x 2_submit_condor_indexing.sh # Granting Execution Permission
# Script(3_exec_indexing.sh) update from github
[USERID@pal-ui-el7 your_path] rm -rf {your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/indexamajig_htcondor/3_exec_indexing.sh
[USERID@pal-ui-el7 your_path] wget https://raw.githubusercontent.com/philiosi/indexamajig_htcondor/main/3_exec_indexing.sh
[USERID@pal-ui-el7 your_path] chmod ug+x 3_exec_indexing.sh # Granting Execution Permission
- NotePlease do not miss download updated script from github.
1_exec_file_list_script.sh
2_submit_condor_indexing.sh
3_exec_indexing.sh
case 2 : Github clone
[USERID@pal-ui-el7 ~]$ cd /pal/{home, data}/{your_path}
# Change to /pal/{home, data}/{your_path} directory.
# If you need, then create {your_path} directory using "mkdir {your_path}" commnad.
[USERID@pal-ui-el7 your_path]$ git clone https://github.com/philiosi/indexamajig_htcondor.git
Cloning into 'indexamajig_htcondor'...
remote: Enumerating objects: 80, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 80 (delta 39), reused 47 (delta 16), pack-reused 0
Unpacking objects: 100% (80/80), done.
1.2. Preparing analysis data
Directory structure:
[hdf5]
├── [0000079-pal40] # data sample
│ ├── cheetah.ini
│ ├── cheetah.out
│ ├── cleaned.txt
│ ├── frames.txt
│ ├── h5files.txt
│ ├── log.txt
│ ├── original.ini
│ ├── peaks.txt
│ ├── r0079-detector0-class0-sum.h5
│ ├── r0079-detector0-class1-sum.h5
│ ├── r0079-detector0-class2-sum.h5
│ ├── r0079-detector0-class3-sum.h5
│ ├── status.txt
│ ├── ue_191027_SFX-r0079-c00.cxi
│ └── ue_191027_SFX-r0079-c00.h5
├── [0000080-pal40]
├── [0000081-pal40]
└── [indexamajig_htcondor] # code base directory
├── 1_exec_file_list_script.sh # [script] create lst list
├── 2_submit_condor_indexing.sh # [script] submit indexamajig condor job
├── 3_exec_indexing.sh # [script] to be executed by the condor job
├── file_list # [Directory] Files ('lst' files) to be processed by indexamajig
├── geom_file1.geom # [file] Example geom file 1
├── geom_file2.geom # [file] Example geom file 2
├── geom_files # [Directory] geom files
├── lib # [Directory] lib
├── mosflm.lp # [file] example mosflm file
├── pdb_file1.pdb # [file] example pdb file
├── r009400.lst # [file] example lst file
├── README.md
└── SASE_1.stream # [file] example stream file
2. CXI File Lists Creation
2.1 Preparing files for analysis
[!important] To use the script for generating lst file list (1_exec_file_list_script.sh), each file directory must end with a specific keyword.
(Ex) directories ending with ‘pal40’: 0000079-pal40, 0000080-pal40, …
CASE 1 : indexamajig_htcondor directory
- Use sample files in the “htcondor_sample_ori”
please check location of example files below:
[USERID@pal-ui-el7 hdf5]$ ll /pal/{your_path}/htcondor_sample_ori/ue_191027_SFX/proc/cheetah/hdf5/
total 104
drwxr-x---. 2 pal pal_users 4096 Sep 6 11:20 0000079-pal40
drwxr-x---. 2 pal pal_users 4096 Sep 6 11:20 0000080-pal40
drwxr-x---. 2 pal pal_users 4096 Sep 6 11:21 0000081-pal40
drwxrwx---. 6 pal pal_users 4096 Sep 22 15:28 indexamajig_htcondor
CASE 2 : Github clone Copy sample files in the “/pal/htcondor/hdf5_sample”
[USERID@pal-ui-el7 condor]$ pw
/pal/htcondor/hdf5
[USERID@pal-ui-el7 condor]$ cp -rf /pal/htcondor/hdf5/pal/{your_path}/{your_directory}/
[USERID@pal-ui-el7 hdf5]# ll
total 64
drwxrwx---. 2 pal pal_users 4096 Jun 3 13:19 0000079-pal40
drwxrwx---. 2 pal pal_users 4096 Jun 3 13:19 0000080-pal40
drwxrwx---. 2 pal pal_users 4096 Jun 3 13:19 0000081-pal40
CASE 3 : Use your own file
Step 1. Copy your own data sets to the location below:
copyFile location : /pal/{your_path}/{your_directory}/hdf5
Note : Please refer to the directory structure in the section “1.2. Preparing analysis data”.
Step 2. Create your own lst file(s) wherever you want.
# relative path
../0000091-pal40/ue_191027_SFX-r0091-c00.cxi
# absolute path
/{your_path}/htcondor_sample/ue_191027_SFX/proc/cheetah/hdf5/0000091-pal40/ue_191027_SFX-r0091-c00.cxi
Warning
When executing ./2_submit_condor_indexing.sh, make sure to clearly specify the path (absolute or relative) of the lst file with the -f option.
2.2 Generating CXI file list
Excute ‘1_exec_file_list_script.sh’ script
Step 1 : Please change the ‘target’ value to whatever you want (Default : ../{your_path}/{your_directory}/hdf5/indexamajig_htcondor/file_list)
# target directory will be created.
# Please change directory name what you want
target="file_list"
Step 2 : Run
“-d” : applies to directories within the hdf5 directory that contain the keyword(default:pal).
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./1_exec_file_list_script.sh
Usage: ./1_exec_file_list_script.sh -d pal40 (default:pal)
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./1_exec_file_list_script.sh -d pal40
../0000079-pal40/ue_191027_SFX-r0079-c00.cxi r0079c00
../0000080-pal40/ue_191027_SFX-r0080-c00.cxi r0080c00
../0000081-pal40/ue_191027_SFX-r0081-c00.cxi r0081c00
../0000081-pal40/ue_191027_SFX-r0081-c01.cxi r0081c01
Result
[USERID@pal-ui-el7 indexamajig_htcondor]$ ll ./file_list/
total 209
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0079c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0080c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0081c00.lst
-rwxr-x---. 1 USERID USERID 45 Sep 25 13:30 r0081c01.lst
[USERID@pal-ui-el7 indexamajig_htcondor]$ cat ./file_list/r0079c00.lst
../0000079-pal40/ue_191027_SFX-r0079-c00.cxi
1_exec_file_list_script.sh generates each lst file containing the relative path to one cxi file.
You can generate lst files manually. Both absolute and relative paths for cxi files are allowed.
# relative path
../0000091-pal40/ue_191027_SFX-r0091-c00.cxi
# absolute path
/{your_path}/htcondor_sample/ue_191027_SFX/proc/cheetah/hdf5/0000091-pal40/ue_191027_SFX-r0091-c00.cxi
3 Submit indexamajig condor jobs
3.1 HTcondor job submit overview
Submitting jobs to HTCondor based on indexamajig inputs
Sequentially submit jobs for each input geom file(s) and lst file(s)
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g pal1_new12.geom -i xgandalf -j 72 -f file_list -o SASE_1.stream -p 1vds_sase_temp3.pdb -e "--int-radius=3,4,5 --threshold=600 --min-srn=4 --min-gradient=100000"
“-g” : specific geometry file or directory(multiful geom files)
“-i” : indexing method - mosflm, xds, asdf, dirax, xgandalf
“-j” : Numbers of CPU[1]_
“-f” : specific lst file(.lst) or directory(multiful lst files)
“-o” : stream file
“-p” : pdb file
“-e” : another parameters such as -p, -no-check-peaks, –multi, –int-radius, –threshold, –min-srn, –min-fradient, etc.
3.2 Output Setting
Please change the target of ‘stream_dir’과 ‘log’ if you want. Each directory will be created
# debug print option
# ex) if [ $DEBUG -eq 1 ]; then echo "[debug] -f option is directory : mf"; fi
DEBUG=1
# Input
# The directory location is determined based on the input parameter.
geom_dir="" # Do not assign a value. -g option parameter
lst_dir="" # Do not assign a value. -f option parameter
# Output
# 'stream_dir' and 'log' directories are required. Please change directories what you want.
# Default directory are 'file_stream' and 'log'
stream_dir="file_stream"
log="log"
# create folder for output and log
PROCDIR="$( cd "$( dirname "$0" )" && pwd -P )"
# fourc input type
# - 1010 : 10 multi lst, multi geom
# - 1001 : 9 multi lst, single geom
# - 0110 : 6 single lst, multi geom
# - 0101 : 5 single lst, single geom
in_type=0
# asign memory
MEM=360
3.3 Job Submition
geom_files : directory for multiful geom files
file_list : directory for multiful lst files
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files -i xgandalf -j 72 -f file_list -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-srn=4 --min-gradient=100000"
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files -i xgandalf -j 72 -f file_list/r009100.lst -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-srn=4 --min-gradient=100000"
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files/geom_file1.geom -i xgandalf -j 72 -f file_list -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-srn=4 --min-gradient=100000"
[USERID@pal-ui-el7 indexamajig_htcondor]$ ./2_submit_condor_indexing.sh -g geom_files/geom_file1.geom -i xgandalf -j 72 -f file_list/r009100.lst -o SASE_1.stream -p pdb_file1.pdb -e "--int-radius=3,4,5 --threshold=600 --min-srn=4 --min-gradient=100000"
Warning
Make sure to check the paths (absolute/relative) of the files for each option(-g, -f, -o, -p) are correct.
4 HTCondor job managing
Condor_manual : HTCondor Version 9.8.1 Manual — HTCondor Manual 9.8.1 documentation.
4.1. Checking the Condor Queue after Running 2_exec_condor_indexing.sh
Verify the Condor queue status (condor_q) after executing 2_exec_condor_indexing.sh.
Initially, jobs will be in the IDLE state before resource allocation, then transition to the RUN state according to HTCondor scheduling policies.
- Check job status and errors: Analyzing Jobs in HTCondor
condor_q -analyze {JOB_IDS}: Shows the scheduling status or error information for the jobs.
condor_q -better-analyze {JOB_IDS}: more detailed analysis compared to -analyze
condor_q -l {JOB_IDS}: Provides detailed information about the jobs.
Note : If there are existing jobs submitted by other users, resource allocation might be delayed according to scheduling policies. Please Refer to the HTCondor References chapter for information on job queue and priority.
4.2. HTCondor Resource Status
- You can check the status of Condor resources:
Verify the allocation (Claimed) status of jobs on each Worker Node.
Example:
[USERID@pal-ui-el7 indexamajig_htcondor]$ condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
slot1@pal-wn1001.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+00:33:44
slot1_1@pal-wn1001.sdfarm.kr LINUX X86_64 Claimed Busy 75.940 368640 0+00:28:54
slot1@pal-wn1002.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:26:17
slot1_1@pal-wn1002.sdfarm.kr LINUX X86_64 Claimed Busy 71.570 368640 0+00:29:42
slot1@pal-wn1003.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:27:53
slot1_1@pal-wn1003.sdfarm.kr LINUX X86_64 Claimed Busy 71.530 368640 0+00:29:41
slot1@pal-wn1004.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:25:42
slot1_1@pal-wn1004.sdfarm.kr LINUX X86_64 Claimed Busy 71.550 368640 0+00:29:42
slot1@pal-wn1005.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:25:41
slot1_1@pal-wn1005.sdfarm.kr LINUX X86_64 Claimed Busy 71.630 368640 0+00:29:42
slot1@pal-wn1006.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+20:32:27
slot1_1@pal-wn1006.sdfarm.kr LINUX X86_64 Claimed Busy 71.580 368640 0+00:29:36
slot1@pal-wn1007.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:25:22
slot1_1@pal-wn1007.sdfarm.kr LINUX X86_64 Claimed Busy 71.520 368640 0+00:29:35
slot1@pal-wn1008.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:24:48
slot1_1@pal-wn1008.sdfarm.kr LINUX X86_64 Claimed Busy 71.580 368640 0+00:29:02
slot1@pal-wn1009.sdfarm.kr LINUX X86_64 Unclaimed Idle 0.000 18030 0+14:24:31
slot1_1@pal-wn1009.sdfarm.kr LINUX X86_64 Claimed Busy 72.000 368640 0+00:29:39
Machines Owner Claimed Unclaimed Matched Preempting Drain
X86_64/LINUX 18 0 9 9 0 0 0
Total 18 0 9 9 0 0 0
4.3. Execution Results
- The indexing process logs are generated in the ../indexamajig_htcondor/log/ directory:
*.error: Indexing log, elapsed time
*.log: condor_submit information
*.out: Output log
Example:
[USERID@pal-ui-el7 log]$ cd log
[USERID@pal-ui-el7 log]$ ll
total 8242
-rw-r--r--. 1 USERID USERID 795612 Aug 29 12:00 geom_file1_xgandalf_r0079c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID 1838 Aug 29 12:00 geom_file1_xgandalf_r0079c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID 0 Aug 29 11:30 geom_file1_xgandalf_r0079c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID 1038891 Aug 29 12:06 geom_file1_xgandalf_r0080c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID 1837 Aug 29 12:06 geom_file1_xgandalf_r0080c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID 0 Aug 29 11:30 geom_file1_xgandalf_r0080c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID 1127187 Aug 29 12:08 geom_file1_xgandalf_r0081c00_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID 1162 Aug 29 12:06 geom_file1_xgandalf_r0081c00_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID 0 Aug 29 11:30 geom_file1_xgandalf_r0081c00_SASE_1_condor.out
-rw-r--r--. 1 USERID USERID 1706 Aug 29 11:31 geom_file1_xgandalf_r0081c01_SASE_1_condor.error
-rw-r--r--. 1 USERID USERID 1220 Aug 29 11:31 geom_file1_xgandalf_r0081c01_SASE_1_condor.log
-rw-r--r--. 1 USERID USERID 0 Aug 29 11:30 geom_file1_xgandalf_r0081c01_SASE_1_condor.out
Note
The naming convention for the log and stream files is as follows:
output = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.out error = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.error log = log/{geom_file_name}_{indexing method}_{runnum}_{streamname}_condor.log
stream = file_stream/{geom_file_name}_{indexing method}_{runnum}_{streamname}.stream
4.4. Job History
View log of HTCondor jobs completed to date(condor_history)
Example:
[USERID@pal-ui-el7 ~]$ condor_history | more
ID OWNER SUBMITTED RUN_TIME ST COMPLETED CMD
56235.0 userid 6/3 22:28 0+00:10:11 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
56237.0 userid 6/3 22:28 0+00:09:11 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
56234.0 userid 6/3 22:28 0+00:10:12 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
56233.0 userid 6/3 22:28 0+00:10:11 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
56232.0 userid 6/3 22:28 0+00:10:11 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
56231.0 userid 6/3 22:28 0+00:10:11 C 6/4 15:04 ../path/3_exec_indexing.sh ... ommited ...
... ... ommited ... ...
ID : The cluster/process id of the job.
OWNER : The owner of the job.
SUBMITTED : The month, day, hour, and minute the job was submitted to the queue.
RUN_TIME : Remote wall clock time accumulated by the job to date in days, hours, minutes, and seconds, given as the job ClassAd attribute RemoteWallClockTime.
ST : Completion status of the job (C = completed and X = removed).
COMPLETED : The time the job was completed.
CMD : The name of the executable.