HTCondor References
Condor_manual
Condor Commands
You can check the commands by typing condor_ and pressing the tab key.
[USERID@pal-ui02-el7 ~]$ condor_
condor_advertise condor_had condor_release condor_submit_dag
condor_aklog condor_history condor_replication condor_suspend
condor_annex condor_hold condor_reschedule condor_tail
condor_c-gahp condor_job_router_info condor_restart condor_test_match
condor_c-gahp_worker_thread condor_master condor_rm condor_testwritelog
condor_check_userlogs condor_negotiator condor_router_history condor_transferd
condor_cod condor_now condor_router_q condor_transfer_data
condor_collector condor_nsenter condor_router_rm condor_transform_ads
condor_config_val condor_off condor_run condor_update_machine_ad
condor_continue condor_on condor_schedd condor_updates_stats
condor_convert_history condor_ping condor_set_shutdown condor_userlog
condor_credd condor_pool_job_report condor_shadow condor_userlog_job_counter
condor_dagman condor_power condor_sos condor_userprio
condor_docker_enter condor_preen condor_ssh_to_job condor_vacate
condor_drain condor_prio condor_startd condor_vacate_job
condor_fetchlog condor_procd condor_starter condor_version
condor_findhost condor_q condor_stats condor_vm-gahp-vmware
condor_gather_info condor_qedit condor_status condor_vm_vmware
condor_gridmanager condor_qsub condor_store_cred condor_wait
condor_gridshell condor_reconfig condor_submit condor_who
Checking Condor Job Queue
condor_q:
Shows jobs submitted by the user (yourself).
Refer to condor_q -help or the official documentation for options.
condor_q -alluser
Shows jobs submitted by other users.
Only jobs on the single UI node (pal-ui-el7.sdfarm.kr, pal-ui02-el7.sdfarm.kr) can be checked.
condor_q -alluser -global:
Shows jobs submitted by other users across all UI nodes (pal-ui-el7.sdfarm.kr, pal-ui02-el7.sdfarm.kr).
Analyzing Jobs in HTCondor
To analyze jobs that are in the IDLE/HOLD state in HTCondor, you can use the condor_q command with the -analyze or -better-analyze options. These options provide detailed information about why a job is in the IDLE state and suggest potential reasons for scheduling issues.
Use condor_q -analyze: - This command provides a basic analysis of the job’s status and reasons why it might not be running.
condor_q -analyze {JOB_ID}Example:
condor_q -analyze 123.0Use condor_q -better-analyze: - This command offers a more detailed analysis compared to -analyze, giving deeper insights into the job’s scheduling constraints and resource availability.
condor_q -better-analyze {JOB_ID}Example:
condor_q -better-analyze 123.0
By using these commands, you can identify the reasons why a job remains in the IDLE state and take appropriate actions to resolve any issues.
Removing Submitted Condor Jobs
- condor_rm ${JOB_IDS}
JOB_IDS can be found from the condor_q result.
Example: condor_rm 9803.0
- If there are many jobs, you can use braces to specify a range of job IDs.
Example: {9800..9827}: The job IDs from the start number to the end number.
[USERID@pal-ui-el7 file_stream]$ condor_rm {25865..25880}
All jobs matching constraint (ClusterId == 25865 || ClusterId == 25866 || ClusterId == 25867 || ClusterId == 25868 || ClusterId == 25869 || ClusterId == 25870 || ClusterId == 25871 || ClusterId == 25872 || ClusterId == 25873 || ClusterId == 25874 || ClusterId == 25875 || ClusterId == 25876 || ClusterId == 25877 || ClusterId == 25878 || ClusterId == 25879 || ClusterId == 25880) have been marked for removal
Condor Job Prioritization
Jobs are scheduled according to Condor’s scheduling policy. * For example, if a user submits a large number of jobs and another user submits new jobs, the priority might shift, causing delays in resource allocation for the waiting jobs.
[USERID@pal-ui02-el7 ~]$ condor_userprio -all
Last Priority Update: 6/14 13:43
Effective Real Priority Res Total Usage Usage Last Time Since
User Name Priority Priority Factor In Use (wghted-hrs) Start Time Usage Time Last Usage
------------------ ------------ -------- --------- ------ ------------ ---------------- ---------------- ----------
OTHERUSERID@sdfarm.kr 34436.42 34.44 1000.00 0 3974.01 5/25/2022 12:16 6/14/2022 12:59 0+00:43
USERID@sdfarm.kr 343972.75 343.97 1000.00 720 61202.98 5/23/2022 14:29 6/14/2022 13:43 <now>
------------------ ------------ -------- --------- ------ ------------ ---------------- ---------------- ----------
Number of users: 2 720 65176.99 6/13/2022 13:43 0+23:59
Effective Priority - Numerical value indicating the level of resource allocation. - Lower values represent higher priority.