Scale on Clusters

Many distributed computing environments offer batch processing capabilities. OpenMOLE supports most of the common batch systems. A batch system generally works by exposing an entry point on which the user can log in and submit jobs. OpenMOLE accesses this entry point using SSH. Different environments can be assigned to handle the workload resulting from different tasks or groups of tasks. However, not all clusters expose the same features, so the available options may vary from one environment to another.

To use your batch system, you must first provide your SSH authentication information to OpenMOLE.
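If you run OpenMOLE in console mode, the authentication can also be declared in a script. The following is a minimal sketch, assuming an SSH private key and the same login and host used in the examples below; the exact call may differ between OpenMOLE versions, so check the authentication documentation of your release:
// Sketch only: register an SSH private key for the cluster's head node (console mode).
// "login" and "machine.domain" are placeholders; encrypted prompts for the key's passphrase.
SSHAuthentication += PrivateKey("/home/user/.ssh/id_rsa", "login", encrypted, "machine.domain")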

PBS and Torque


PBS is a venerable batch system for clusters. You can use a PBS computing environment as follows:
val env =
  PBSEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...):
  • port: the port number used by the ssh server, by default it is set to 22,
  • sharedDirectory: OpenMOLE uses this directory to communicate from the head node of the cluster to the worker nodes (defaults to sharedDirectory = "/home/user/.openmole/.tmp/ssh"),
  • storageSharedLocally: when set to true, OpenMOLE will use symbolic links instead of physically copying files to the remote environment. This assumes that the OpenMOLE instance has access to the same storage space as the remote environment (think same NFS filesystem on desktop machine and cluster). Defaults to false and shouldn't be used unless you're 100% sure of what you're doing!
  • workDirectory: the directory in which OpenMOLE will execute on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs will be submitted, for instance queue = "longjobs",
  • wallTime: the maximum time a job is permitted to run before being killed, for instance wallTime = 1 hour,
  • memory: the memory for the job, for instance memory = 2 gigabytes,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory for the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
  • nodes: the number of nodes requested,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • coreByNodes: an alternative to specifying the number of threads. coreByNodes takes the value of threads when not specified, or 1 if neither is specified,
  • name: the name the environment will take in the logs.
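For instance, a PBS environment combining some of the options above, and a task delegated to it with the on keyword, could look like the sketch below (pbs and myTask are placeholder names for an environment and a task defined in your script):
val pbs =
  PBSEnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",
    wallTime = 1 hour,
    memory = 2 gigabytes
  )

// Delegate the execution of the (hypothetical) task myTask to this environment.
myTask on pbs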

SGE


To delegate some of the computation load to an SGE-based cluster, you can use the SGEEnvironment as follows:
val env =
  SGEEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...):
  • port: the port number used by the ssh server, by default it is set to 22,
  • sharedDirectory: OpenMOLE uses this directory to communicate from the head node of the cluster to the worker nodes (defaults to sharedDirectory = "/home/user/.openmole/.tmp/ssh"),
  • storageSharedLocally: when set to true, OpenMOLE will use symbolic links instead of physically copying files to the remote environment. This assumes that the OpenMOLE instance has access to the same storage space as the remote environment (think same NFS filesystem on desktop machine and cluster). Defaults to false and shouldn't be used unless you're 100% sure of what you're doing!
  • workDirectory: the directory in which OpenMOLE will execute on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs will be submitted, for instance queue = "longjobs",
  • wallTime: the maximum time a job is permitted to run before being killed, for instance wallTime = 1 hour,
  • memory: the memory for the job, for instance memory = 2 gigabytes,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory for the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • name: the name the environment will take in the logs.
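As an illustration, the sketch below uses some of these options to leave more memory to an external program while running several tasks concurrently on each worker node (the values are purely indicative):
val sge =
  SGEEnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",
    wallTime = 1 hour,
    // shrink the OpenMOLE runtime so the external program gets more memory
    openMOLEMemory = 256 megabytes,
    // run 4 tasks concurrently on each worker node
    threads = 4
  )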

Slurm


To delegate the workload to a Slurm-based cluster, you can use the SLURMEnvironment as follows:
val env =
  SLURMEnvironment(
    "login",
    "machine.domain",
    // optional parameters
    gres = List( Gres("resource", 1) ),
    constraints = List("constraint1", "constraint2")
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...):
  • port: the port number used by the ssh server, by default it is set to 22,
  • sharedDirectory: OpenMOLE uses this directory to communicate from the head node of the cluster to the worker nodes (defaults to sharedDirectory = "/home/user/.openmole/.tmp/ssh"),
  • storageSharedLocally: when set to true, OpenMOLE will use symbolic links instead of physically copying files to the remote environment. This assumes that the OpenMOLE instance has access to the same storage space as the remote environment (think same NFS filesystem on desktop machine and cluster). Defaults to false and shouldn't be used unless you're 100% sure of what you're doing!
  • workDirectory: the directory in which OpenMOLE will execute on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs will be submitted, for instance queue = "longjobs",
  • wallTime: the maximum time a job is permitted to run before being killed, for instance wallTime = 1 hour,
  • memory: the memory for the job, for instance memory = 2 gigabytes,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory for the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
  • nodes: the number of nodes requested,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • coresByNodes: an alternative to specifying the number of threads. coresByNodes takes the value of threads when not specified, or 1 if neither is specified,
  • qos: the Quality of Service (QOS), as defined in the Slurm database,
  • gres: a list of Generic Resources (GRES) requested. A Gres is a pair defined by the name of the resource and the number of resources requested (scalar), for instance gres = List( Gres("resource", 1) ),
  • constraints: a list of Slurm-defined constraints which the selected nodes must match,
  • name: the name the environment will take in the logs.
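For example, a Slurm environment requesting a specific QOS together with generic resources and constraints might be declared as in this sketch ("normal" is a placeholder for a QOS defined in your Slurm database):
val slurm =
  SLURMEnvironment(
    "login",
    "machine.domain",
    wallTime = 1 hour,
    memory = 2 gigabytes,
    // "normal" is a placeholder: use a QOS defined in your Slurm database
    qos = "normal",
    gres = List( Gres("resource", 1) ),
    constraints = List("constraint1", "constraint2")
  )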

Condor


Condor clusters can be leveraged using the following syntax:
val env =
  CondorEnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...):
  • port: the port number used by the ssh server, by default it is set to 22,
  • sharedDirectory: OpenMOLE uses this directory to communicate from the head node of the cluster to the worker nodes (defaults to sharedDirectory = "/home/user/.openmole/.tmp/ssh"),
  • storageSharedLocally: when set to true, OpenMOLE will use symbolic links instead of physically copying files to the remote environment. This assumes that the OpenMOLE instance has access to the same storage space as the remote environment (think same NFS filesystem on desktop machine and cluster). Defaults to false and shouldn't be used unless you're 100% sure of what you're doing!
  • workDirectory: the directory in which OpenMOLE will execute on the remote server, for instance workDirectory = "${TMP}",
  • memory: the memory for the job, for instance memory = 2 gigabytes,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory for the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • name: the name the environment will take in the logs.
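For instance, a Condor environment pointing OpenMOLE to a specific communication directory and work directory could be sketched as follows (the paths are placeholders to adapt to your cluster):
val condor =
  CondorEnvironment(
    "login",
    "machine.domain",
    // placeholder paths: adapt them to your cluster layout
    sharedDirectory = "/home/user/.openmole/.tmp/ssh",
    workDirectory = "${TMP}",
    memory = 2 gigabytes
  )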

OAR


Similarly, OAR clusters are reached as follows:
val env =
  OAREnvironment(
    "login",
    "machine.domain"
  )

You can also set options by providing additional parameters to the environment (..., option = value, ...):
  • port: the port number used by the ssh server, by default it is set to 22,
  • sharedDirectory: OpenMOLE uses this directory to communicate from the head node of the cluster to the worker nodes (defaults to sharedDirectory = "/home/user/.openmole/.tmp/ssh"),
  • storageSharedLocally: when set to true, OpenMOLE will use symbolic links instead of physically copying files to the remote environment. This assumes that the OpenMOLE instance has access to the same storage space as the remote environment (think same NFS filesystem on desktop machine and cluster). Defaults to false and shouldn't be used unless you're 100% sure of what you're doing!
  • workDirectory: the directory in which OpenMOLE will execute on the remote server, for instance workDirectory = "${TMP}",
  • queue: the name of the queue on which jobs will be submitted, for instance queue = "longjobs",
  • wallTime: the maximum time a job is permitted to run before being killed, for instance wallTime = 1 hour,
  • openMOLEMemory: the memory attributed to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory for the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
  • threads: the number of threads for concurrent execution of tasks on the worker node, for instance threads = 4,
  • core: the number of cores allocated for each job,
  • cpu: the number of CPUs allocated for each job,
  • bestEffort: a boolean for setting the best effort mode (true by default),
  • name: the name the environment will take in the logs.
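Finally, an OAR environment requesting several cores per job and disabling the best effort mode could be written along these lines (the values are indicative):
val oar =
  OAREnvironment(
    "login",
    "machine.domain",
    queue = "longjobs",
    wallTime = 1 hour,
    // request 2 cores per job and submit regular (non best effort) jobs
    core = 2,
    bestEffort = false
  )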