gc3libs.backends.shellcmd

Run applications as processes starting them from the shell.

class gc3libs.backends.shellcmd.ShellcmdLrms(name, architecture, max_cores, max_cores_per_job, max_memory_per_core, max_walltime, auth=None, frontend='localhost', transport='local', time_cmd=None, override='False', spooldir='$HOME/.gc3pie_jobs', resourcedir=None, ssh_config=None, keyfile=None, ignore_ssh_host_keys=False, ssh_timeout=None, large_file_threshold=None, large_file_chunk_size=None, **extra_args)

Execute an Application instance through the shell.

Construction of an instance of ShellcmdLrms takes the following optional parameters (in addition to any parameters taken by the base class LRMS):

Parameters:
  • time_cmd (str) –

    Path to the GNU time command. Default is /usr/bin/time which is correct on all known Linux distributions.

    This backend uses many of the extended features of GNU time, so the shell-builtins or the BSD time will not work.

  • spooldir (str) – Path to a filesystem location in which temporary working directories are created for processes executed through this backend. The default value None means to use $TMPDIR or /var/tmp (see tempfile.mkdtemp for details).
  • resourcedir (str) – Path to a filesystem location where to create a temporary directory that will contain information on the jobs running on the machine. The default value None means to use $HOME/.gc3/shellcmd.d.
  • transport (str) – Transport to use to connect to the resource. Valid values are 'ssh' or 'local'.
  • frontend (str) – If transport is 'ssh', then frontend is the hostname of the remote machine where the jobs will be executed.
  • ignore_ssh_host_keys (bool) – When connecting to a remote resource using the 'ssh' transport, the server’s SSH public key is usually checked against a database of known hosts; if the key is found but does not match the one saved in the database, the connection fails. Setting ignore_ssh_host_keys to True disables this check, which introduces a potential security issue but allows connecting even though the database contains old or invalid keys. (The main use case is connecting to VMs on an IaaS cloud, where IP addresses are usually reused and the SSH key is therefore recreated.)
  • override (bool) – ShellcmdLrms by default will try to gather information on the machine the resource is running on, including the number of cores and the available memory. These values may be different from the values stored in the configuration file. If override is True, then the values automatically discovered will be used instead of the ones in the configuration file. If override is False, instead, the values in the configuration file will be used.
  • ssh_timeout (int) – If transport is 'ssh', this value will be used as timeout (in seconds) for connecting to the SSH TCP socket.
  • large_file_threshold (gc3libs.quantity.Memory) – Copy files below this size in one single SFTP GET operation; see SshTransport.get() for more information. Only used if transport is 'ssh'.
  • large_file_chunk_size (gc3libs.quantity.Memory) – Copy files that are over the above-mentioned threshold by sequentially transferring chunks of this size; see SshTransport.get() for more information. Only used if transport is 'ssh'.
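
The effect of the override parameter can be illustrated with a minimal sketch. It does not use gc3libs: the function effective_limits and the dictionary keys are hypothetical names chosen for illustration, and the sketch only mirrors the rule stated above (discovered values win when override is True, configured values win otherwise).

```python
def effective_limits(configured, discovered, override=False):
    """Pick which set of resource limits to use (illustrative only)."""
    # override=True: trust what was auto-discovered on the machine;
    # override=False: trust the configuration file.
    chosen = discovered if override else configured
    return dict(chosen)  # return a copy so the inputs are not mutated


# Hypothetical values, standing in for configuration-file and
# auto-discovered settings.
configured = {'max_cores': 4, 'max_memory_per_core_gb': 2}
discovered = {'max_cores': 8, 'max_memory_per_core_gb': 4}

from_config = effective_limits(configured, discovered)
from_machine = effective_limits(configured, discovered, override=True)
```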
MOVER_SCRIPT = 'mover.py'

Name of the data uploader/downloader script (within PRIVATE_DIR).

PRIVATE_DIR = '.gc3pie_shellcmd'

Subdirectory of a task’s execution directory reserved for storing ShellcmdLrms files.

RESOURCE_DIR = '$HOME/.gc3/shellcmd.d'

Path to the directory where bookkeeping files are stored. (This is on the target machine where ShellcmdLrms executes commands.)

It may contain environment variable references, which are expanded through the (remote) shell.

TIMEFMT = 'WallTime=%es\nKernelTime=%Ss\nUserTime=%Us\nCPUUsage=%P\nMaxResidentMemory=%MkB\nAverageResidentMemory=%tkB\nAverageTotalMemory=%KkB\nAverageUnsharedMemory=%DkB\nAverageUnsharedStack=%pkB\nAverageSharedMemory=%XkB\nPageSize=%ZB\nMajorPageFaults=%F\nMinorPageFaults=%R\nSwaps=%W\nForcedSwitches=%c\nWaitSwitches=%w\nInputs=%I\nOutputs=%O\nSocketReceived=%r\nSocketSent=%s\nSignals=%k\nReturnCode=%x'

Format string for running commands with /usr/bin/time. It is used by GC3Pie to capture resource usage data for commands executed through the shell.

The value used here lists all the resource usage values that GNU time can capture, with the same names used by the ARC Resource Manager (for historical reasons).
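
Since TIMEFMT emits one Key=value pair per line, the resulting file can be read with very little code. The following is a minimal sketch of such a reader, not the parser GC3Pie actually uses; the function name parse_time_output and the sample text are assumptions made for illustration.

```python
def parse_time_output(text):
    """Split `Key=value` lines into a dict of raw strings."""
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or '=' not in line:
            continue  # skip blank or malformed lines
        key, value = line.split('=', 1)
        usage[key] = value
    return usage


# Hypothetical sample, shaped like a subset of what TIMEFMT would produce.
sample = (
    "WallTime=1.52s\n"
    "KernelTime=0.03s\n"
    "UserTime=1.40s\n"
    "CPUUsage=94%\n"
    "MaxResidentMemory=20480kB\n"
    "ReturnCode=0\n"
)
usage = parse_time_output(sample)
```

The values are kept as strings here; turning them into typed Python values is a separate step.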

TIMEFMT_CONV = {'AverageResidentMemory': ('shellcmd_average_resident_memory', <class 'gc3libs.quantity.Memory'>), 'AverageSharedMemory': ('shellcmd_average_shared_memory', <class 'gc3libs.quantity.Memory'>), 'AverageTotalMemory': ('shellcmd_average_total_memory', <class 'gc3libs.quantity.Memory'>), 'AverageUnsharedMemory': ('shellcmd_average_unshared_memory', <class 'gc3libs.quantity.Memory'>), 'AverageUnsharedStack': ('shellcmd_average_unshared_stack', <class 'gc3libs.quantity.Memory'>), 'CPUUsage': ('shellcmd_cpu_usage', <function _parse_percentage>), 'ForcedSwitches': ('shellcmd_involuntary_context_switches', <class 'int'>), 'Inputs': ('shellcmd_filesystem_inputs', <class 'int'>), 'KernelTime': ('shellcmd_kernel_time', <class 'gc3libs.quantity.Duration'>), 'MajorPageFaults': ('shellcmd_major_page_faults', <class 'int'>), 'MaxResidentMemory': ('max_used_memory', <class 'gc3libs.quantity.Memory'>), 'MinorPageFaults': ('shellcmd_minor_page_faults', <class 'int'>), 'Outputs': ('shellcmd_filesystem_outputs', <class 'int'>), 'PageSize': ('shellcmd_page_size', <class 'gc3libs.quantity.Memory'>), 'ReturnCode': ('returncode', <function _parse_returncode_string>), 'Signals': ('shellcmd_signals_delivered', <class 'int'>), 'SocketReceived': ('shellcmd_socket_received', <class 'int'>), 'SocketSent': ('shellcmd_socket_sent', <class 'int'>), 'Swaps': ('shellcmd_swapped', <class 'int'>), 'UserTime': ('shellcmd_user_time', <class 'gc3libs.quantity.Duration'>), 'WaitSwitches': ('shellcmd_voluntary_context_switches', <class 'int'>), 'WallTime': ('duration', <function _parse_time_duration>)}

How to translate GNU time output into values stored in the .execution attribute.

The dictionary maps key names (as used in the TIMEFMT string) to a pair (attribute name, converter function) consisting of the name of an attribute that will be set on a task’s .execution object, and a function to convert the (string) value gotten from GNU time output into the actual Python value written.
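
How such a (attribute name, converter function) table might be applied is sketched below. This is an illustration under stated assumptions, not the library's code: the converters parse_seconds and parse_percentage, the CONV table, and the Execution class are simplified stand-ins (the real table uses gc3libs.quantity types and private helper functions).

```python
def parse_seconds(value):
    # '1.52s' -> 1.52 (stand-in for a real duration type)
    return float(value.rstrip('s'))


def parse_percentage(value):
    # '94%' -> 94.0
    return float(value.rstrip('%'))


# Reduced, hypothetical version of a conversion table:
# key in the time output -> (attribute name, converter function)
CONV = {
    'WallTime': ('duration', parse_seconds),
    'CPUUsage': ('shellcmd_cpu_usage', parse_percentage),
    'MajorPageFaults': ('shellcmd_major_page_faults', int),
}


class Execution:
    """Bare attribute holder, standing in for a task's .execution object."""


def apply_conversions(raw, conv, target):
    """Set converted values as attributes on `target`; ignore unknown keys."""
    for key, value in raw.items():
        if key not in conv:
            continue
        attr, converter = conv[key]
        setattr(target, attr, converter(value))
    return target


raw = {'WallTime': '1.52s', 'CPUUsage': '94%',
       'MajorPageFaults': '3', 'Unknown': 'x'}
execution = apply_conversions(raw, CONV, Execution())
```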

WRAPPER_OUTPUT_FILENAME = 'resource_usage.txt'

Name of the file where resource usage is written to.

(Relative to PRIVATE_DIR.)

WRAPPER_PID = 'wrapper.pid'

Name of the file where the wrapper script’s PID is stored.

(Relative to PRIVATE_DIR).

WRAPPER_SCRIPT = 'wrapper_script.sh'

Name of the task launcher script (within PRIVATE_DIR).

ShellcmdLrms writes here the commands that wrap an application’s payload script, in order to collect resource usage, download or upload result files, etc.
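
As a rough illustration of how these file-name constants relate to one another, the sketch below joins them into paths under a task's execution directory. The helper private_path and the example directory are hypothetical; only the constant values are taken from this page.

```python
import posixpath

# Values as documented on this page.
PRIVATE_DIR = '.gc3pie_shellcmd'
WRAPPER_OUTPUT_FILENAME = 'resource_usage.txt'
WRAPPER_PID = 'wrapper.pid'
WRAPPER_SCRIPT = 'wrapper_script.sh'


def private_path(execdir, filename):
    """Return the path of a bookkeeping file inside PRIVATE_DIR."""
    # posixpath is used because the target machine is assumed to be POSIX.
    return posixpath.join(execdir, PRIVATE_DIR, filename)


# Hypothetical execution directory of a task.
execdir = '/tmp/job.example'
pid_file = private_path(execdir, WRAPPER_PID)
usage_file = private_path(execdir, WRAPPER_OUTPUT_FILENAME)
script_file = private_path(execdir, WRAPPER_SCRIPT)
```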

cancel_job(app)

Kill all children processes of the given task app.

The PID of the wrapper script (which is the root of the process tree to which a “TERM” signal will be sent) must have been stored (by submit_job()) as app.execution.lrms_jobid.

close()

Gracefully close LRMS-dependent resources, e.g., the transport.

count_running_tasks()

Return the number of currently running tasks.

Note

  1. The count of running tasks also includes tasks that may have been started by another GC3Pie process, so this count can be positive when the resource has just been opened.
  2. The count is updated every time the resource is updated, so the returned number can be stale if ShellcmdLrms.get_resource_status() has not been called for a while.
count_used_cores()

Return the total number of cores used by running tasks.

Similar caveats as in ShellcmdLrms.count_running_tasks() apply here.

count_used_memory()

Return total amount of memory used by running tasks.

Similar caveats as in ShellcmdLrms.count_running_tasks() apply here.

free(app)

Delete the temporary directory where a child process has run. The temporary directory is removed with all its content, recursively.

If deletion is successful, the lrms_execdir attribute in app.execution is reset to None; subsequent invocations of this method on the same application do nothing.

get_resource_status()

Update the status of the resource associated with this LRMS instance in-place. Return updated Resource object.

get_results(app, download_dir, overwrite=False, changed_only=True)

Retrieve job output files into local directory download_dir.

Directory download_dir must already exist.

If optional 3rd argument overwrite is False (default), then existing files within download_dir (or subdirectories thereof) will not be altered in any way.

If overwrite is instead True, then the (optional) 4th argument changed_only determines what files are overwritten:

  • if changed_only is True (default), then only files for which the source has a different size or has been modified more recently than the destination are copied;
  • if changed_only is False, then all files in source will be copied into destination, unconditionally.

Output files that do not exist in download_dir will be copied, independently of the overwrite and changed_only settings.

Parameters:
  • app (Task) – the Task instance whose output should be retrieved
  • download_dir (str) – path to download files into
  • overwrite (bool) – if False, do not download files that already exist
  • changed_only (bool) – if both this and overwrite are True, only overwrite those files such that the source is newer or different in size than the destination.
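
The interplay of overwrite and changed_only described above can be summarized as a small decision function. This is a sketch of the documented rules only, not the backend's implementation; the name should_copy and the (size, mtime) tuple representation are assumptions made for illustration.

```python
def should_copy(src, dst, overwrite=False, changed_only=True):
    """Decide whether one output file should be copied.

    `src` and `dst` are (size_in_bytes, mtime) tuples; `dst` is None
    when the destination file does not exist yet.
    """
    if dst is None:
        return True            # missing files are always copied
    if not overwrite:
        return False           # existing files are left untouched
    if not changed_only:
        return True            # overwrite everything, unconditionally
    src_size, src_mtime = src
    dst_size, dst_mtime = dst
    # overwrite only if the source differs in size or is newer
    return src_size != dst_size or src_mtime > dst_mtime


missing = should_copy((10, 100.0), None)
kept = should_copy((10, 100.0), (10, 50.0))
newer = should_copy((10, 100.0), (10, 50.0), overwrite=True)
older_same_size = should_copy((10, 100.0), (10, 200.0), overwrite=True)
forced = should_copy((10, 100.0), (10, 200.0), overwrite=True,
                     changed_only=False)
```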
has_running_tasks()

Return True if tasks are running on the resource.

See ShellcmdLrms.count_running_tasks() for caveats about the count of “running jobs” upon which this boolean check is based.

peek(app, remote_filename, local_file, offset=0, size=None)

Download size bytes (at offset offset from the start) from remote file remote_filename and write them into local_file. If size is None (default), then read the contents of the remote file from offset to the end.

Argument remote_filename is the path to a file relative to the remote job “sandbox”.

Argument local_file is either a local path name (string), or a file-like object supporting a .write() method. If local_file is a path name, the file is created if it does not exist, and overwritten otherwise. In any case, upon exit from this procedure, the stream will be positioned just after the written bytes.

Fourth optional argument offset is the offset from the start of the file. If offset is negative, it is interpreted as an offset from the end of the remote file.

Any exception raised by operations will be re-raised to the caller.
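
The offset and size semantics can be illustrated on in-memory streams. The sketch below is not the backend's code and performs no remote access: peek_stream is a hypothetical helper that reads from any seekable file-like object and handles only the case where local_file is a file-like object.

```python
import io


def peek_stream(source, local_file, offset=0, size=None):
    """Copy `size` bytes of `source`, starting at `offset`, to `local_file`."""
    if offset < 0:
        # negative offsets count back from the end of the file
        source.seek(offset, io.SEEK_END)
    else:
        source.seek(offset, io.SEEK_SET)
    # size=None means "everything from the offset to the end"
    data = source.read() if size is None else source.read(size)
    local_file.write(data)
    return len(data)


# Stand-in for a remote file.
remote = io.BytesIO(b"0123456789")

middle = io.BytesIO()
peek_stream(remote, middle, offset=2, size=3)

tail = io.BytesIO()
peek_stream(remote, tail, offset=-4)
```

After the calls, each output stream is positioned just after the bytes written to it, matching the behaviour described above.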

submit_job(app)

Run an Application instance as a shell process.

See: LRMS.submit_job
update_job_state(app)

Query the running status of the local process whose PID is stored in app.execution.lrms_jobid, and map the POSIX process status to a GC3Libs Run.State value.

validate_data(data_file_list=[])

Return False if any of the URLs in data_file_list cannot be handled by this backend.

The shellcmd backend can handle the following URL schemes:

  • file (natively, read/write);
  • swift/swifts/swt/swts (with Python-based remote helper, read/write);
  • http/https (with Python-based remote helper, read-only).
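
A scheme check along these lines could look like the sketch below. It is an illustration, not the method's implementation: it works on plain URL strings via urllib.parse, the names SUPPORTED_SCHEMES and can_handle_all are hypothetical, and treating a scheme-less path as a file URL is an assumption of the sketch.

```python
from urllib.parse import urlparse

# Schemes listed above as supported by the shellcmd backend.
SUPPORTED_SCHEMES = {'file', 'swift', 'swifts', 'swt', 'swts',
                     'http', 'https'}


def can_handle_all(urls):
    """Return False if any URL uses a scheme outside SUPPORTED_SCHEMES."""
    for url in urls:
        # Assumption: a bare path without a scheme is a local file.
        scheme = urlparse(url).scheme or 'file'
        if scheme not in SUPPORTED_SCHEMES:
            return False
    return True


ok = can_handle_all(['file:///tmp/input.dat',
                     'https://example.org/data.csv',
                     '/tmp/plain/path.txt'])
not_ok = can_handle_all(['file:///tmp/input.dat',
                         'ftp://example.org/data.csv'])
```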