Batch systems


Many distributed computing environments offer batch processing capabilities. OpenMOLE supports most common batch systems. A batch system generally exposes an entry point on which the user can log in and submit jobs; OpenMOLE accesses this entry point through SSH. Different environments can be assigned to delegate the workload resulting from different tasks or groups of tasks. However, not all clusters expose the same features, so the available options may vary from one environment to another.

To use your batch system, you must first provide your authentication information to OpenMOLE.

OpenMOLE offers several ways to authenticate to a remote machine through SSH: login/password and private key. The following instructions explain how to set up SSH authentication.

If you are using the script editor, you should configure authentications directly in the authentication panel.

In console mode, you can define a login / password authentication with the following command:
SSHAuthentication += LoginPassword("login", encrypted, "machine-name")

Or to authenticate with a private key:
SSHAuthentication += PrivateKey("/path/to/the/private/key", "login", encrypted, "machine-name")

Both calls use the encrypted function. It prompts for the password, or the passphrase of the private key, right after the call to the builder of the Environment using this SSHAuthentication.

The last part of the SSHAuthentication, "machine-name", should exactly match the address of the machine in your execution environment. OpenMOLE searches for matching SSH keys using an exact match on login and machine-name between the environment and the stored keys.
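For instance, in console mode, the machine name given to the authentication must be exactly the one used when declaring the environment. The following sketch assumes a hypothetical login, key path and machine name, and uses SLURMEnvironment as just one of the environments described below:

SSHAuthentication += PrivateKey("/home/user/.ssh/id_rsa", "login", encrypted, "machine.domain")

val env =
  SLURMEnvironment(
    "login",          // same login as the stored authentication
    "machine.domain"  // must match the machine-name of the stored authentication exactly
  )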

If you encounter trouble setting up an SSH connection in OpenMOLE, check the corresponding troubleshooting section.

PBS / Torque


PBS is a venerable batch system for clusters. You may use a PBS computing environment as follows:
val env =
  PBSEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...), as shown in the example after this list:
  • port: the port used by the SSH server, 22 by default,
  • sharedDirectory: the directory OpenMOLE uses to communicate from the head of the cluster to the worker nodes, for instance sharedDirectory = "/home/user/openmole",
  • workDirectory: the directory in which OpenMOLE will run on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs should be submitted, for instance queue = "longjobs",
  • wallTime: the maximum duration of the job (wall-clock time), for instance wallTime = 1 hour,
  • memory: the memory in megabytes for the job, for instance memory = 2000,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node; if you run external tasks, you can reduce it to 256MB to leave more memory for your program on the execution node, for instance openMOLEMemory = 256,
  • nodes: the number of nodes requested,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • coreByNodes: an alternative to specifying the number of threads; it takes the value of threads when not specified, or 1 if neither is specified,
  • name: the name the environment will take in the logs.
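For instance, a PBS environment combining several of these options might look like the following sketch (the queue name and resource values are purely illustrative):

val env =
  PBSEnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",     // illustrative queue name
    wallTime = 1 hour,      // maximum duration per job
    memory = 2000,          // memory in megabytes per job
    openMOLEMemory = 256    // memory reserved for the OpenMOLE runtime
  )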

SGE


To delegate some computation load to an SGE-based cluster, you can use the SGEEnvironment as follows:
val env =
  SGEEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...), as shown in the example after this list:
  • port: the port used by the SSH server, 22 by default,
  • sharedDirectory: the directory OpenMOLE uses to communicate from the head of the cluster to the worker nodes, for instance sharedDirectory = "/home/user/openmole",
  • workDirectory: the directory in which OpenMOLE will run on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs should be submitted, for instance queue = "longjobs",
  • wallTime: the maximum duration of the job (wall-clock time), for instance wallTime = 1 hour,
  • memory: the memory in megabytes for the job, for instance memory = 2000,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node; if you run external tasks, you can reduce it to 256MB to leave more memory for your program on the execution node, for instance openMOLEMemory = 256,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • name: the name the environment will take in the logs.
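For instance, an SGE environment combining several of these options might look like the following sketch (the queue name and resource values are purely illustrative):

val env =
  SGEEnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",   // illustrative queue name
    wallTime = 1 hour,    // maximum duration per job
    memory = 2000,        // memory in megabytes per job
    threads = 4           // concurrent task executions on the worker node
  )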

Slurm


To delegate the workload to a Slurm-based cluster, you can use the SLURMEnvironment as follows:
val env =
  SLURMEnvironment(
    "login",
    "machine.domain",
    // optional parameters
    gres = List( Gres("resource", 1) ),
    constraints = List("constraint1", "constraint2")
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...), as shown in the example after this list:
  • port: the port used by the SSH server, 22 by default,
  • sharedDirectory: the directory OpenMOLE uses to communicate from the head of the cluster to the worker nodes, for instance sharedDirectory = "/home/user/openmole",
  • workDirectory: the directory in which OpenMOLE will run on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs should be submitted, for instance queue = "longjobs",
  • wallTime: the maximum duration of the job (wall-clock time), for instance wallTime = 1 hour,
  • memory: the memory in megabytes for the job, for instance memory = 2000,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node; if you run external tasks, you can reduce it to 256MB to leave more memory for your program on the execution node, for instance openMOLEMemory = 256,
  • nodes: the number of nodes requested,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • coresByNodes: an alternative to specifying the number of threads; it takes the value of threads when not specified, or 1 if neither is specified,
  • qos: the Quality of Service (QOS) as defined in the Slurm database,
  • gres: a list of requested Generic Resources (GRES); a Gres is a pair defined by the name of the resource and the number of resources requested (scalar),
  • constraints: a list of Slurm-defined constraints which selected nodes must match,
  • name: the name the environment will take in the logs.
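For instance, a Slurm environment combining several of these options might look like the following sketch (the queue, walltime and QOS values are purely illustrative):

val env =
  SLURMEnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",   // illustrative partition name
    wallTime = 1 hour,    // maximum duration per job
    memory = 2000,        // memory in megabytes per job
    qos = "normal"        // illustrative QOS defined in the Slurm database
  )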

Condor


Condor clusters can be leveraged using the following syntax:
val env =
  CondorEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...), as shown in the example after this list:
  • port: the port used by the SSH server, 22 by default,
  • sharedDirectory: the directory OpenMOLE uses to communicate from the head of the cluster to the worker nodes, for instance sharedDirectory = "/home/user/openmole",
  • workDirectory: the directory in which OpenMOLE will run on the remote server, for instance workDirectory = "${TMP}",
  • memory: the memory in megabytes for the job, for instance memory = 2000,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node; if you run external tasks, you can reduce it to 256MB to leave more memory for your program on the execution node, for instance openMOLEMemory = 256,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • name: the name the environment will take in the logs.
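For instance, a Condor environment combining several of these options might look like the following sketch (the memory and thread values are purely illustrative):

val env =
  CondorEnvironment(
    "login",
    "machine.domain",
    memory = 2000,          // memory in megabytes per job
    openMOLEMemory = 256,   // memory reserved for the OpenMOLE runtime
    threads = 4             // concurrent task executions on the worker node
  )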

OAR


Similarly, OAR clusters are reached as follows:
val env =
  OAREnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...), as shown in the example after this list:
  • port: the port used by the SSH server, 22 by default,
  • sharedDirectory: the directory OpenMOLE uses to communicate from the head of the cluster to the worker nodes, for instance sharedDirectory = "/home/user/openmole",
  • workDirectory: the directory in which OpenMOLE will run on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs should be submitted, for instance queue = "longjobs",
  • wallTime: the maximum duration of the job (wall-clock time), for instance wallTime = 1 hour,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node; if you run external tasks, you can reduce it to 256MB to leave more memory for your program on the execution node, for instance openMOLEMemory = 256,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • core: the number of cores allocated for each job,
  • cpu: the number of CPUs allocated for each job,
  • bestEffort: a boolean for setting the best effort mode (true by default),
  • name: the name the environment will take in the logs.
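For instance, an OAR environment combining several of these options might look like the following sketch (the queue, walltime and resource values are purely illustrative):

val env =
  OAREnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",   // illustrative queue name
    wallTime = 1 hour,    // maximum duration per job
    core = 4,             // cores allocated for each job
    bestEffort = false    // disable the default best effort mode
  )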