gc3libs.backends

Interface to different resource management systems for the GC3Libs.

class gc3libs.backends.LRMS(name, architecture, max_cores, max_cores_per_job, max_memory_per_core, max_walltime, auth=None, **extra_args)

Base class for interfacing with a computing resource.

The following construction parameters are also set as instance attributes. All of them are mandatory, except auth.

Attribute name       Expected type           Meaning
name                 string                  A unique identifier for this resource, used for generating error messages.
architecture         set of Run.Arch values  Should contain one entry for each supported architecture. Valid architecture values are constants in the gc3libs.Run.Arch class.
auth                 string                  A gc3libs.authentication.Auth instance that will be used to access the computational resource associated with this backend. The default value None means that no authentication credentials are needed (e.g., access to the resource has been pre-authenticated) or that authentication is managed outside of GC3Pie.
max_cores            int                     Maximum number of CPU cores that GC3Pie can allocate on this resource.
max_cores_per_job    int                     Maximum number of CPU cores that GC3Pie can allocate on this resource for a single job.
max_memory_per_core  Memory                  Maximum memory that GC3Pie can allocate to jobs on this resource. The value is per core, so the actual amount allocated to a single job is the value of this entry multiplied by the number of cores requested by the job.
max_walltime         Duration                Maximum wall-clock time that can be allotted to a single job running on this resource.

The above should be considered immutable attributes: they are specified at construction time and never changed afterwards.
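
The per-core memory rule in the table above can be illustrated with a little arithmetic. The following is only a sketch: it uses plain integers in megabytes instead of GC3Pie's Memory quantities, and the function name job_memory_limit is hypothetical, not part of gc3libs.

```python
def job_memory_limit(max_memory_per_core_mb, requested_cores, max_cores_per_job):
    """Return the memory (in MB) that a single job may be allotted.

    All names here are illustrative stand-ins, not part of gc3libs.
    """
    if requested_cores < 1:
        raise ValueError("a job must request at least one core")
    if requested_cores > max_cores_per_job:
        raise ValueError("job requests more cores than max_cores_per_job allows")
    # The limit is expressed per core, so it scales with the core count.
    return max_memory_per_core_mb * requested_cores


# A resource allowing 2000 MB per core and up to 8 cores per job:
print(job_memory_limit(2000, 4, 8))  # 4 cores * 2000 MB = 8000
```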

The following attributes are instead dynamically provided (i.e., defined by the get_resource_status() method or similar), thus can change over the lifetime of the object:

Attribute name  Type
free_slots      int
user_run        int
user_queued     int
queued          int

static authenticated(fn)

Decorator: mark a function as requiring authentication.

Each invocation of the decorated function causes a call to the get method of the authentication object (configured with the auth parameter to the class constructor).
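
A decorator with this behavior could look roughly like the sketch below. It is not the GC3Pie implementation: FakeAuth and DemoBackend are hypothetical stand-ins, and both the assumption that the authentication object is reachable as self.auth and the choice to skip the call when auth is None are made only for illustration.

```python
import functools


class FakeAuth:
    """Hypothetical stand-in for the authentication object."""

    def __init__(self):
        self.calls = 0

    def get(self):
        # A real object would check or renew credentials here.
        self.calls += 1


def authenticated(fn):
    """Call `self.auth.get()` before every invocation of `fn`."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        if self.auth is not None:  # assumed: None means no credentials needed
            self.auth.get()
        return fn(self, *args, **kwargs)
    return wrapper


class DemoBackend:
    """Hypothetical backend; not the real LRMS class."""

    def __init__(self, auth=None):
        self.auth = auth

    @authenticated
    def cancel_job(self, app):
        return "cancelled %s" % app


backend = DemoBackend(auth=FakeAuth())
backend.cancel_job("job-1")
backend.cancel_job("job-2")
print(backend.auth.calls)  # one get() call per decorated invocation: 2
```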

cancel_job(app)

Cancel a running job. If app is associated with a queued or running remote job, tell the execution middleware to cancel it.

close()

Gracefully close any LRMS-dependent resources, e.g., the transport.

free(app)

Free up any remote resources used for the execution of app. In particular, this should delete any remote directories and files.

Calling this method when app.execution.state is anything other than TERMINATED results in undefined behavior and will likely cause errors later on. Be cautious.
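
One way for calling code to respect this precondition is to guard the call, as in the sketch below. The helper free_if_terminated and the RecordingBackend class are hypothetical, and the string "TERMINATED" stands in for the corresponding Run.State value.

```python
from types import SimpleNamespace

TERMINATED = "TERMINATED"  # stand-in for the corresponding Run.State value


def free_if_terminated(backend, app):
    """Call backend.free(app) only when the job has reached TERMINATED.

    Return True if free() was called, False if the call was skipped.
    """
    if app.execution.state != TERMINATED:
        return False
    backend.free(app)
    return True


class RecordingBackend:
    """Hypothetical backend that only records which apps were freed."""

    def __init__(self):
        self.freed = []

    def free(self, app):
        self.freed.append(app)


running = SimpleNamespace(execution=SimpleNamespace(state="RUNNING"))
done = SimpleNamespace(execution=SimpleNamespace(state=TERMINATED))
backend = RecordingBackend()
print(free_if_terminated(backend, running))  # False
print(free_if_terminated(backend, done))     # True
```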

get_resource_status()

Update the status of the resource associated with this LRMS instance in-place. Return the updated Resource object.

get_results(job, download_dir, overwrite=False, changed_only=True)

Retrieve job output files into local directory download_dir.

Directory download_dir must already exist.

If optional 3rd argument overwrite is False (default), then existing files within download_dir (or subdirectories thereof) will not be altered in any way.

If overwrite is instead True, then the (optional) 4th argument changed_only determines what files are overwritten:

  • if changed_only is True (default), then only files for which the source has a different size or has been modified more recently than the destination are copied;
  • if changed_only is False, then all files in source will be copied into destination, unconditionally.

Output files that do not exist in download_dir will be copied, independently of the overwrite and changed_only settings.
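
The overwrite rules above amount to a small decision function. The sketch below is an illustration only: should_copy is a hypothetical helper, and it compares two local paths with os.stat, whereas a real backend would obtain the size and modification time of the source from the remote resource.

```python
import os
import tempfile


def should_copy(src, dst, overwrite=False, changed_only=True):
    """Decide whether `src` should be copied onto `dst`."""
    if not os.path.exists(dst):
        return True            # missing files are always copied
    if not overwrite:
        return False           # existing files are left untouched
    if not changed_only:
        return True            # overwrite unconditionally
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    # Copy only if the source differs in size or is more recent.
    return (src_stat.st_size != dst_stat.st_size
            or src_stat.st_mtime > dst_stat.st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "remote_output.txt")
    dst = os.path.join(tmp, "local_output.txt")
    missing = os.path.join(tmp, "not_yet_downloaded.txt")
    with open(src, "w") as f:
        f.write("new, longer contents")
    with open(dst, "w") as f:
        f.write("old")
    results = {
        "missing destination": should_copy(src, missing),
        "default (no overwrite)": should_copy(src, dst),
        "overwrite, changed only": should_copy(src, dst, overwrite=True),
        "overwrite, unconditional": should_copy(
            src, dst, overwrite=True, changed_only=False),
    }

for case, decision in results.items():
    print(case, "->", decision)
```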

Parameters:
  • job (Task) – the Task instance whose output should be retrieved
  • download_dir (str) – path to download files into
  • overwrite (bool) – if False, do not download files that already exist
  • changed_only (bool) – if both this and overwrite are True, only overwrite those files such that the source is newer or different in size than the destination.

peek(app, remote_filename, local_file, offset=0, size=None)

Download size bytes (starting at offset bytes from the start) from remote file remote_filename and write them into local_file. If size is None (default), then read the contents of the remote file from offset to the end.

Argument remote_filename is the path to a file relative to the remote job “sandbox”.

Argument local_file is either a local path name (string), or a file-like object supporting a .write() method. If local_file is a path name, the file is created if it does not exist, and overwritten otherwise. In any case, upon exit from this procedure, the stream will be positioned just after the written bytes.

Fourth optional argument offset is the offset from the start of the file. If offset is negative, it is interpreted as an offset from the end of the remote file.
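
The offset and size semantics described above can be tried out on in-memory streams. In the sketch below, peek_into is a hypothetical helper and io.BytesIO objects stand in for both the remote file and the local file-like object; no remote access is involved.

```python
import io
import os


def peek_into(source, local_file, offset=0, size=None):
    """Copy a slice of the seekable binary stream `source` into `local_file`."""
    if offset < 0:
        # Negative offsets count back from the end of the file.
        source.seek(offset, os.SEEK_END)
    else:
        source.seek(offset, os.SEEK_SET)
    # With size=None, read everything from the offset to the end.
    data = source.read() if size is None else source.read(size)
    local_file.write(data)
    return len(data)


remote = io.BytesIO(b"0123456789")

local = io.BytesIO()
peek_into(remote, local, offset=2, size=3)
print(local.getvalue())  # b'234'

tail = io.BytesIO()
peek_into(remote, tail, offset=-4)
print(tail.getvalue())   # b'6789'
```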

Any exception raised by operations will be re-raised to the caller.

submit_job(application, job)

Submit an Application instance to the configured computational resource; return a gc3libs.Job instance for controlling the submitted job.

This method only returns if the job is successfully submitted; upon any failure, an exception is raised.

Note:

  1. job.state is not altered; it is the caller’s responsibility to update it.
  2. the job object may be updated with any information that is necessary for this LRMS to perform further operations on it.
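
The division of responsibilities in the notes above might be sketched as follows. Everything in this example is hypothetical (SubmissionError, DemoJob, DemoBackend, the lrms_jobid attribute, and the state strings); it only illustrates that the backend raises on failure and leaves job.state alone, so the caller updates it after a successful return.

```python
class SubmissionError(Exception):
    """Hypothetical error raised when submission fails."""


class DemoJob:
    """Hypothetical job record."""

    def __init__(self):
        self.state = "NEW"
        self.lrms_jobid = None


class DemoBackend:
    """Hypothetical backend; not the real LRMS class."""

    def __init__(self, accept=True):
        self.accept = accept

    def submit_job(self, application, job):
        if not self.accept:
            raise SubmissionError("resource rejected %r" % (application,))
        # The backend may record what it needs for later operations...
        job.lrms_jobid = "demo-42"
        # ...but it does not touch job.state.
        return job


def submit_and_mark(backend, application, job):
    """Submit, then update the state on the caller's side."""
    backend.submit_job(application, job)  # raises on any failure
    job.state = "SUBMITTED"               # caller's responsibility
    return job


ok_job = submit_and_mark(DemoBackend(), "my-app", DemoJob())
print(ok_job.state, ok_job.lrms_jobid)

failed_job = DemoJob()
try:
    submit_and_mark(DemoBackend(accept=False), "my-app", failed_job)
except SubmissionError as err:
    print("submission failed:", err)
print(failed_job.state)  # unchanged, because submit_job raised
```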

update_job_state(app)

Query the state of the remote job associated with app and update app.execution.state accordingly. Return the corresponding Run.State; see Run.State for more details.

validate_data(data_file_list=None)

Return True if the list of files is expressed in one of the file transfer protocols the LRMS supports.

Return False otherwise.
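
A check of this kind could be sketched as below. This is an assumption-laden illustration, not the GC3Pie code: it treats the files as URL strings, parses them with urllib.parse.urlsplit, uses a made-up set of supported schemes, treats bare paths as local files, and accepts an empty or missing list.

```python
from urllib.parse import urlsplit

# Assumed for illustration; each real backend defines what it supports.
SUPPORTED_SCHEMES = frozenset({"file", "http", "https"})


def validate_data(data_file_list=None, supported=SUPPORTED_SCHEMES):
    """Return True if every URL uses one of the supported schemes."""
    for url in data_file_list or []:
        # A bare path has no scheme; treat it as a local file (assumption).
        scheme = urlsplit(url).scheme or "file"
        if scheme not in supported:
            return False
    return True


print(validate_data(["file:///tmp/in.dat", "/home/user/in.dat"]))  # True
print(validate_data(["gsiftp://host/in.dat"]))                     # False
```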