Scheduler

The scheduler module provides a flexible framework for generating batch job submission scripts for different High Performance Computing (HPC) schedulers. It supports both PBS and Slurm schedulers through a factory pattern design.

Overview

The scheduler module consists of three main components:

  • Scheduler: Base class that provides common functionality and factory pattern support

  • PBS: Implementation for PBS Pro scheduler systems

  • Slurm: Implementation for Slurm scheduler systems

Quick Start

Basic usage example:

from wxflow import Scheduler

# Configuration for a PBS job
pbs_config = {
    'scheduler': 'PBS',
    'jobname': 'my_job',
    'account': 'my_account',
    'queue': 'batch',
    'nodes': 2,
    'tasks_per_node': 4,
    'memory': '8G',
    'walltime': '02:00:00',
    'stdout': 'job.out',
    'stderr': 'job.err'
}

# Create scheduler instance
scheduler = Scheduler(pbs_config)
job_card = scheduler.scheduler_factory.create(pbs_config['scheduler'], pbs_config)

# Generate batch card
print(job_card.get_batch_card)

# Save to file
job_card.dump('submit_job.pbs')

Configuration Options

The scheduler configuration accepts the following keys:

Required Keys

  • scheduler: Scheduler type (‘PBS’ or ‘Slurm’)

  • jobname: Name of the job

Optional Keys for Both Schedulers

  • account: Account to charge for resources

  • queue: Queue or partition name (maps to ‘qos’ in Slurm)

  • partition: Partition name (Slurm only)

  • nodes: Number of nodes to request

  • tasks: Total number of tasks

  • tasks_per_node: Tasks per node

  • tasks_per_core: Tasks per core

  • memory: Memory requirement (e.g., ‘8G’, ‘2048M’)

  • walltime: Time limit (e.g., ‘02:00:00’)

  • stdout: Standard output file path

  • stderr: Standard error file path

  • env: Environment variables to export

  • native: Additional scheduler-specific options

  • exclusive: Request exclusive node access

  • debug: Enable debug mode

PBS-Specific Options

  • shell: Shell to use (e.g., ‘/bin/bash’)

  • join: Join stdout and stderr

  • ppn: Processors per node

  • threads: Number of threads (OpenMP)

  • chunk: Placement strategy (‘free’, ‘pack’, ‘scatter’, ‘vscatter’)

Classes and Methods

Scheduler

class wxflow.Scheduler(config: Dict[str, Any], *args: Any, **kwargs: Any)

Bases: object

Initializes the scheduler with the provided configuration.

Parameters:

config (Dict) –

Configuration dictionary for the scheduler. The expected keys and their usage are:
  • ‘scheduler_type’ (str, required): Specifies the type of scheduler to use.

    Accepted values are ‘slurm’ or ‘pbs’.

  • ‘job_name’ (str, required): Name of the job. Used in both Slurm and PBS.

  • ‘partition’ (str, optional): Partition or queue to submit the job to.

    Used in Slurm as ‘partition’, in PBS as ‘queue’.

  • ‘nodes’ (int, optional): Number of nodes to request. Used in both Slurm and PBS.

  • ‘ntasks’ (int, optional): Number of tasks. Used in Slurm.

  • ‘ppn’ (int, optional): Processors per node. Used in PBS.

  • ‘time’ (str, optional): Walltime limit for the job (e.g., ‘01:00:00’). Used in both.

  • ‘output’ (str, optional): Path for standard output file. Used in both.

  • ‘error’ (str, optional): Path for standard error file. Used in both.

  • ‘account’ (str, optional): Account to charge for resources. Used in both.

  • ‘mail_user’ (str, optional): Email address for notifications. Used in both.

  • ‘mail_type’ (str, optional): Type of email notifications. Used in both.

  • ‘native’ (str, optional): Any additional scheduler-specific options can be included as needed.

Notes

  • Required keys: ‘scheduler_type’, ‘job_name’

  • Optional keys: All others listed above.

  • The dictionary may contain other scheduler-specific options as needed.

*args (Any) – Additional positional arguments.

**kwargs (Any) – Additional keyword arguments.

__init__(config: Dict[str, Any], *args: Any, **kwargs: Any) -> None

Initializes the scheduler with the provided configuration.


Properties:

get_batch_card -> str

Returns the complete batch card as a string with newlines.

get_native -> List[str]

Returns native scheduler directives.

Instance Methods:

dump(filename: str | None = None) -> None

Dumps the batch card to a file or prints it to stdout.

If a filename is provided, writes the contents of self.batch_card to the specified file. Otherwise, prints the batch card to standard output.

Parameters:

filename (Optional[str], default=None) – The path to the file where the batch card should be written. If None, the batch card is printed to stdout.

Return type:

None

Raises:

Exception – If an error occurs while writing to the specified file.

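The file-or-stdout behaviour described for dump can be pictured with a minimal sketch. This is not wxflow's implementation: dump_card and its batch_card argument are hypothetical names chosen for illustration, and the trailing newline written to the file is an assumption.

```python
from typing import List, Optional


def dump_card(batch_card: List[str], filename: Optional[str] = None) -> None:
    """Write the batch card to `filename`, or print it when no filename is given.

    Hypothetical stand-in for the behaviour described for `dump`.
    """
    text = "\n".join(batch_card)
    if filename is None:
        # No target file: send the card to standard output.
        print(text)
        return
    with open(filename, "w") as fh:
        fh.write(text + "\n")


card = ["#PBS -N my_job", "#PBS -q batch"]
dump_card(card)  # prints the two directive lines
```

Passing a path instead, for example dump_card(card, 'submit_job.pbs'), writes the same lines to that file.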

property get_accounting: None

Generate the accounting-specific items of the job card, e.g. job name, queue, partition, and account.

property get_batch_card: str

property get_env: None

Export environment variables.

property get_native: List[str]

property get_resources: None

Generate the resource-specific items of the job card, e.g. nodes, memory, walltime, and exclusive access.

static memory_in_bytes(memory: str) -> float

Converts a memory string given in bytes, K, M, G, or T (case-insensitive) to a number of bytes. The default unit of the input memory string is bytes. Scaling uses powers of 1024 (kilobytes, megabytes, etc.): 1024 bytes = 1 KB.

Parameters:

memory (str) – Memory string to convert, e.g., ‘1024’, ‘1K’, ‘512M’, ‘2G’, ‘1T’

Returns:

memory in bytes

Return type:

int, per the docstring (the signature is annotated float)

static memory_in_megabytes(memory: str) -> int

Converts the input memory string to megabytes. 1 MB = 1048576 bytes.
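The two conversions above can be illustrated with a small standalone sketch. It follows the stated rules (bytes by default, case-insensitive K/M/G/T suffixes, powers of 1024), but the names to_bytes and to_megabytes, the accepted input pattern, and the rounding down to whole megabytes are assumptions rather than wxflow's actual code.

```python
import re

# Multipliers are powers of 1024, as described above (1024 bytes = 1 KB).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(memory: str) -> int:
    """Convert strings such as '1024', '1K', '512M', '2G' or '1T' to bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?)\s*", memory.upper())
    if match is None:
        raise ValueError(f"Unrecognized memory string: {memory!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def to_megabytes(memory: str) -> int:
    """Convert a memory string to whole megabytes (1 MB = 1048576 bytes)."""
    return to_bytes(memory) // 1024**2


print(to_bytes("8G"))      # 8589934592
print(to_megabytes("8G"))  # 8192
```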

scheduler_factory = <wxflow.factory.Factory object>

static walltime_in_string(walltime: str | timedelta) -> str
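As a rough illustration of what a str-or-timedelta to string conversion involves, the sketch below formats a timedelta as HH:MM:SS. The function name walltime_to_string is hypothetical, and both the pass-through of string input and the choice not to wrap hours at 24 are assumptions, not confirmed wxflow behaviour.

```python
from datetime import timedelta
from typing import Union


def walltime_to_string(walltime: Union[str, timedelta]) -> str:
    """Return 'HH:MM:SS'; strings are passed through unchanged (an assumption)."""
    if isinstance(walltime, str):
        return walltime
    total_seconds = int(walltime.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Hours are not wrapped at 24, so 30 hours renders as '30:00:00'.
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(walltime_to_string(timedelta(hours=2, minutes=30)))  # 02:30:00
```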

PBS

class wxflow.PBS(config: dict, *args, **kwargs)

Bases: Scheduler

Class to construct PBS job cards. Of the several PBS implementations, this class targets PBS Pro, as documented in https://www.altair.com/pdfs/pbsworks/PBSUserGuide2021.1.pdf

Parameters:
  • config

  • args

  • kwargs

Return type:

object

__init__(config: dict, *args, **kwargs)

Properties:

get_accounting -> List[str]

Generate the accounting-specific PBS directives of the job card (job name, account, queue, partition, etc.).

get_resources -> List[str]

Generate the resource-specific PBS directives of the job card (nodes, memory, walltime, exclusive access, select, place, etc.).

get_env -> List[str]

Generate the directives that export environment variables.

get_select -> str

Construct the “select” resource request string for the job, used as #PBS -l select=<get_select>.

get_place -> str

Construct the “place” placement request string for the job, used as #PBS -l place=<get_place>.

get_native -> List[str]

Generate the PBS-specific native directives verbatim from the user input.

DIRECTIVE = '#PBS'
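To make the “select” and “place” strings concrete, here is a minimal sketch that assembles values of the shape shown in the PBS example in this document (select=4:mpiprocs=8:mem=16384M and place=pack:excl). The helper names build_select and build_place and their parameters are hypothetical; the real properties read their inputs from the configuration and may handle more options.

```python
from typing import List, Optional


def build_select(nodes: int, tasks_per_node: Optional[int] = None,
                 memory_mb: Optional[int] = None) -> str:
    """Assemble a PBS 'select' value such as '4:mpiprocs=8:mem=16384M'."""
    parts: List[str] = [str(nodes)]
    if tasks_per_node is not None:
        parts.append(f"mpiprocs={tasks_per_node}")
    if memory_mb is not None:
        parts.append(f"mem={memory_mb}M")
    return ":".join(parts)


def build_place(chunk: Optional[str] = None, exclusive: bool = False) -> str:
    """Assemble a PBS 'place' value such as 'pack:excl'."""
    parts: List[str] = []
    if chunk is not None:
        parts.append(chunk)
    if exclusive:
        parts.append("excl")
    return ":".join(parts)


print(f"#PBS -l select={build_select(4, 8, 16384)}")
print(f"#PBS -l place={build_place('pack', exclusive=True)}")
```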

Slurm

class wxflow.Slurm(config: dict, *args, **kwargs)

Bases: Scheduler

Class to construct Slurm job cards. For SBATCH reference see: https://slurm.schedmd.com/sbatch.html

Constructor for Slurm scheduler.

Parameters:

config (dict)

__init__(config: dict, *args, **kwargs)

Constructor for Slurm scheduler.


Properties:

get_accounting -> List[str]

Generate the accounting-specific Slurm directives of the job card (job name, account, queue, partition, etc.).

get_resources -> List[str]

Generate the resource-specific Slurm directives of the job card (nodes, tasks, memory, walltime, exclusive access, etc.).

get_env -> List[str]

Generate the directives that export environment variables.

get_native -> List[str]

Generate the Slurm-specific native directives verbatim from the user input.

DIRECTIVE = '#SBATCH'
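The DIRECTIVE attribute suggests that every generated line starts with the scheduler's marker. The sketch below shows one way such lines could be rendered from a mapping of long option names to values. render_directives is a hypothetical helper, not part of wxflow, and treating a None value as a bare flag is an assumption made for the example.

```python
from typing import Dict, List, Optional


def render_directives(prefix: str, options: Dict[str, Optional[str]]) -> List[str]:
    """Prefix each long option with the scheduler directive marker.

    A value of None produces a bare flag (for example '--exclusive').
    """
    lines = []
    for name, value in options.items():
        if value is None:
            lines.append(f"{prefix} --{name}")
        else:
            lines.append(f"{prefix} --{name}={value}")
    return lines


card = render_directives("#SBATCH", {
    "job-name": "weather_model",
    "nodes": "4",
    "ntasks-per-node": "8",
    "exclusive": None,
})
print("\n".join(card))
```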

Examples

PBS Example

from wxflow import PBS

config = {
    'jobname': 'weather_model',
    'account': 'weather_proj',
    'queue': 'batch',
    'nodes': 4,
    'tasks_per_node': 8,
    'memory': '16G',
    'walltime': '04:00:00',
    'stdout': 'model.out',
    'stderr': 'model.err',
    'shell': '/bin/bash',
    'env': ['ALL'],
    'chunk': 'pack',
    'exclusive': True
}

pbs_job = PBS(config)
print(pbs_job.get_batch_card)

This generates:

#PBS -S /bin/bash
#PBS -N weather_model
#PBS -A weather_proj
#PBS -q batch
#PBS -o model.out
#PBS -e model.err
#PBS -l walltime=04:00:00
#PBS -l select=4:mpiprocs=8:mem=16384M
#PBS -l place=pack:excl
#PBS -V

Slurm Example

from wxflow import Slurm

config = {
    'jobname': 'weather_model',
    'account': 'weather_proj',
    'partition': 'compute',
    'nodes': 4,
    'tasks_per_node': 8,
    'memory': '16G',
    'walltime': '04:00:00',
    'stdout': 'model.out',
    'stderr': 'model.err',
    'env': ['ALL'],
    'exclusive': True
}

slurm_job = Slurm(config)
print(slurm_job.get_batch_card)

This generates:

#SBATCH --job-name=weather_model
#SBATCH --account=weather_proj
#SBATCH --partition=compute
#SBATCH --output=model.out
#SBATCH --error=model.err
#SBATCH --time=04:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --mem=16384M
#SBATCH --exclusive
#SBATCH --export=ALL

Memory and Walltime Utilities

The scheduler provides utility methods for converting memory and walltime formats:

from datetime import timedelta

from wxflow import Scheduler

# Memory conversions
bytes_val = Scheduler.memory_in_bytes('8G')        # 8589934592
mb_val = Scheduler.memory_in_megabytes('8G')       # 8192

# Walltime conversion
td = timedelta(hours=2, minutes=30)
time_str = Scheduler.walltime_in_string(td)        # '02:30:00'

Factory Pattern

The scheduler uses a factory pattern to create appropriate scheduler instances:

from wxflow import Scheduler

# Factory automatically selects the right class
config = {'scheduler': 'PBS', 'jobname': 'test'}
scheduler = Scheduler(config)
pbs_instance = scheduler.scheduler_factory.create('PBS', config)

config = {'scheduler': 'Slurm', 'jobname': 'test'}
slurm_instance = scheduler.scheduler_factory.create('Slurm', config)
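For readers unfamiliar with the pattern, a name-to-class registry of this kind can be sketched in a few lines. SimpleFactory, DemoPBS, and DemoSlurm are hypothetical stand-ins; this is a generic illustration of the factory pattern, not the wxflow.factory.Factory implementation.

```python
from typing import Any, Callable, Dict


class SimpleFactory:
    """Toy registry mapping a name to a builder (not wxflow's Factory class)."""

    def __init__(self) -> None:
        self._builders: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, builder: Callable[..., Any]) -> None:
        self._builders[name] = builder

    def create(self, name: str, *args: Any, **kwargs: Any) -> Any:
        try:
            builder = self._builders[name]
        except KeyError:
            raise KeyError(f"No builder registered for {name!r}") from None
        return builder(*args, **kwargs)


class DemoPBS:
    DIRECTIVE = "#PBS"

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config


class DemoSlurm:
    DIRECTIVE = "#SBATCH"

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config


factory = SimpleFactory()
factory.register("PBS", DemoPBS)
factory.register("Slurm", DemoSlurm)

# The name selects the class; the remaining arguments go to its constructor.
job = factory.create("Slurm", {"jobname": "test"})
print(type(job).__name__, job.DIRECTIVE)
```

The caller only needs the scheduler name from the configuration, so adding support for another scheduler means registering one more class rather than changing call sites.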

Error Handling

The scheduler validates configuration and raises appropriate errors:

  • ValueError: Invalid memory format, walltime format, or missing required fields

  • KeyError: Missing required configuration keys

from wxflow import Scheduler

try:
    config = {'jobname': 'test'}  # Missing the required 'scheduler' key
    scheduler = Scheduler(config)
except KeyError as e:
    print(f"Missing configuration: {e}")
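Callers who want to fail early, before constructing a scheduler object, could check the required keys themselves. The sketch below is one way to do that. check_required and REQUIRED_KEYS are hypothetical names, and the key list simply mirrors the required keys given under Configuration Options.

```python
from typing import Any, Dict, Iterable

# Required keys as listed in this document; adjust if your version differs.
REQUIRED_KEYS = ("scheduler", "jobname")


def check_required(config: Dict[str, Any],
                   required: Iterable[str] = REQUIRED_KEYS) -> None:
    """Raise KeyError naming every required key absent from `config`."""
    missing = [key for key in required if key not in config]
    if missing:
        raise KeyError(f"Missing required configuration keys: {', '.join(missing)}")


try:
    check_required({"jobname": "test"})
except KeyError as err:
    print(err)
```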

Best Practices

  1. Always specify required fields: Ensure ‘scheduler’ and ‘jobname’ are provided

  2. Use appropriate memory units: Prefer ‘G’ for gigabytes, ‘M’ for megabytes

  3. Format walltime correctly: Use ‘HH:MM:SS’ format or timedelta objects

  4. Test batch cards: Always review generated batch cards before submission

  5. Use native options sparingly: Prefer standard configuration options over native directives
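For the walltime item above, a quick format check before building the configuration can catch mistakes early. This sketch assumes a strict ‘HH:MM:SS’ pattern with at least two hour digits; is_valid_walltime is a hypothetical helper, and individual schedulers may accept other forms that it rejects.

```python
import re

# Two or more hour digits, then minutes and seconds in the range 00-59.
_WALLTIME = re.compile(r"\d{2,}:[0-5]\d:[0-5]\d")


def is_valid_walltime(walltime: str) -> bool:
    """Return True when `walltime` looks like 'HH:MM:SS'."""
    return _WALLTIME.fullmatch(walltime) is not None


for candidate in ("02:00:00", "2h", "02:60:00"):
    print(candidate, is_valid_walltime(candidate))
```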

Note

Different HPC systems may have varying requirements. Always consult your system’s documentation for specific scheduler configurations and resource limits.