gc3libs.cmdline

Prototype classes for GC3Libs-based scripts.

Classes implemented in this file provide common and recurring functionality for GC3Libs command-line utilities and scripts. User applications should implement their specific behavior by subclassing and overriding a few customization methods.

There are currently two public classes provided here:

GC3UtilsScript
    Base class for all the GC3Utils commands. Implements a few methods useful for writing command-line scripts that operate on jobs by ID.
SessionBasedScript
    Base class for the grosetta/ggamess/gcodeml scripts. Implements a long-running script to submit and manage a large number of jobs grouped into a “session”.

class gc3libs.cmdline.GC3UtilsScript(**extra_args)

Base class for GC3Utils scripts.

The default command line implemented is the following:

script [options] JOBID [JOBID ...]

By default, only the standard options -h/--help and -V/--version are considered; to add more, override setup_options(). To change the default positional argument parsing, override setup_args().

pre_run()

Perform parsing of standard command-line options and call into parse_args() to do non-optional argument processing.

setup()

Setup standard command-line parsing.

GC3Utils scripts should probably override setup_args() and setup_options() to modify command-line parsing.

setup_args()

Set up command-line argument parsing.

The default command-line parsing considers every argument as a job ID; actual processing of the IDs is done in parse_args().
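
The default command line described above can be approximated with the standard argparse module. This is only a sketch: the function name build_parser, the "script" program name, the "jobids" destination and the version string are placeholders chosen for illustration, and nothing here is taken from the way GC3UtilsScript builds its parser internally.

```python
import argparse


def build_parser():
    """Build a parser for: script [options] JOBID [JOBID ...]"""
    parser = argparse.ArgumentParser(prog="script")
    # -h/--help is added by argparse automatically; -V/--version is explicit.
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s 0.0 (placeholder)")
    # Every positional argument is treated as a job ID.
    parser.add_argument("jobids", nargs="+", metavar="JOBID")
    return parser


params = build_parser().parse_args(["job.1", "job.2"])
print(params.jobids)
```

In a real GC3Utils script the equivalent customization would go into setup_options() and setup_args() rather than into a free-standing function.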

class gc3libs.cmdline.SessionBasedScript(**extra_args)

Base class for grosetta/ggamess/gcodeml and like scripts. Implements a long-running script to submit and manage a large number of jobs grouped into a “session”.

The generic script implements a command line like the following:

PROG [options] INPUT [INPUT ...]

First, the script builds a list of input files by recursively scanning each of the given INPUT arguments for files matching the self.input_filename_pattern glob string (you can set it via a keyword argument to the constructor). To perform a different treatment of the command-line arguments, override the process_args() method.

Then, new jobs are added to the session, based on the results of the process_args() method above. For each tuple of items returned by process_args(), an instance of class self.application (which you can set by a keyword argument to the constructor) is created, passing it the tuple as init args, and added to the session.

The script finally proceeds to updating the status of all jobs in the session, submitting new ones and retrieving output as needed. When all jobs are done, the method done() is called, and its return value is used as the script’s exit code.

The script’s exitcode tracks job status, in the following way. The exitcode is a bitfield; only the 4 least-significant bits are used, with the following meaning:

Bit  Meaning
0    Set if a fatal error occurred: the script could not complete
1    Set if there are jobs in FAILED state
2    Set if there are jobs in RUNNING or SUBMITTED state
3    Set if there are jobs in NEW state

This boils down to the following rules:
  • exitcode == 0: all jobs terminated successfully, no further action
  • exitcode == 1: an error interrupted script execution
  • exitcode == 2: all jobs terminated, not all of them successfully
  • exitcode > 3: run the script again to progress jobs
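
The bitfield above can be decoded with plain bit masks. The following is a minimal sketch for a wrapper that inspects the exit code of such a script; the constant and function names are made up for this example and are not part of gc3libs.

```python
# Bit masks for the exit code, following the table above.
FATAL_ERROR = 1 << 0       # bit 0
JOBS_FAILED = 1 << 1       # bit 1
JOBS_IN_PROGRESS = 1 << 2  # bit 2: RUNNING or SUBMITTED
JOBS_NEW = 1 << 3          # bit 3


def describe_exitcode(code):
    """Return a list of human-readable conditions encoded in `code`."""
    messages = []
    if code & FATAL_ERROR:
        messages.append("fatal error")
    if code & JOBS_FAILED:
        messages.append("some jobs failed")
    if code & JOBS_IN_PROGRESS:
        messages.append("jobs still running or submitted")
    if code & JOBS_NEW:
        messages.append("jobs not yet submitted")
    return messages


def should_rerun(code):
    """True if bit 2 or bit 3 is set, i.e. the exit code is greater than 3."""
    return bool(code & (JOBS_IN_PROGRESS | JOBS_NEW))
```

An exit code of 6, for instance, has bits 1 and 2 set: some jobs failed and others are still in progress, so the script should be run again.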
after_main_loop()

Hook executed after exit from the main loop.

This is called after the main loop has exited (for whatever reason), but before the session is finally saved and other connections are finalized.

Override in subclasses to plug any behavior here; the default implementation does nothing.

before_main_loop()

Hook executed before entering the script’s main loop.

This is the last chance to alter the script state as it will be seen by the main loop.

Override in subclasses to plug any behavior here; the default implementation does nothing.

every_main_loop()

Hook executed during each round of the main loop.

This is called from within the main loop, after progressing all tasks.

Override in subclasses to plug any behavior here; the default implementation does nothing.
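
The order in which the three hooks fire can be pictured with a toy stand-in class. MiniScript and its run() method are hypothetical and exist only for this illustration; the real main loop lives inside SessionBasedScript and does considerably more.

```python
class MiniScript:
    """Toy stand-in showing when each hook runs; not the real class."""

    def __init__(self):
        self.calls = []

    def before_main_loop(self):
        self.calls.append("before")

    def every_main_loop(self):
        self.calls.append("every")

    def after_main_loop(self):
        self.calls.append("after")

    def run(self, rounds):
        self.before_main_loop()
        try:
            for _ in range(rounds):
                # ... all tasks would be progressed here ...
                self.every_main_loop()
        finally:
            # Runs however the loop exits, before any final save.
            self.after_main_loop()
        return self.calls


print(MiniScript().run(2))
```

A subclass would override only the hooks it needs and leave the others as no-ops.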

input_filename_pattern = None

Glob pattern used to select input files; set it in the SessionBasedScript constructor.

make_directory_path(pathspec, jobname)

Return a path to a directory, suitable for storing the output of a job (named after jobname). It is not required that the returned path points to an existing directory.

This is called by the default process_args() using self.params.output (i.e., the argument to the -o/--output option) as pathspec, and jobname and args exactly as returned by new_tasks().

The default implementation substitutes the following strings within pathspec:

  • SESSION is replaced with the name of the current session (as specified by the -s/--session command-line option) with a suffix .out appended;
  • NAME is replaced with jobname;
  • DATE is replaced with the current date, in YYYY-MM-DD format;
  • TIME is replaced with the current time, in HH:MM format.
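
The substitutions listed above could be implemented along the following lines. This is a sketch, not the gc3libs code: expand_pathspec and its session_name and now parameters are assumptions made for the example, and whether the real method rescans already-substituted text is not stated here.

```python
import re
from datetime import datetime


def expand_pathspec(pathspec, session_name, jobname, now=None):
    """Replace SESSION, NAME, DATE and TIME inside `pathspec`."""
    now = now or datetime.now()
    substitutions = {
        "SESSION": session_name + ".out",
        "NAME": jobname,
        "DATE": now.strftime("%Y-%m-%d"),
        "TIME": now.strftime("%H:%M"),
    }
    # One regex pass, so text inserted for one keyword is never rescanned.
    return re.sub("SESSION|NAME|DATE|TIME",
                  lambda match: substitutions[match.group(0)], pathspec)


print(expand_pathspec("/data/SESSION/NAME", "run1", "job_a"))
```

Passing an explicit now value makes the result reproducible, which is convenient in tests.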
make_task_controller()

Return a ‘Controller’ object to be used for progressing tasks and getting statistics. In detail, a good ‘Controller’ object has to implement progress and stats methods with the same interface as gc3libs.core.Engine.

By the time this method is called (from _main()), the following instance attributes are already defined:

  • self._core: a gc3libs.core.Core instance;
  • self.session: the gc3libs.session.Session instance that should be used to save/load jobs

In addition, any other attribute created during initialization and command-line parsing is of course available.
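
As a rough picture of the duck-typed interface, the sketch below shows an object with progress and stats methods. Everything in it is hypothetical: SimpleController, FakeTask and the step() method are invented for the example, and the exact signatures of progress and stats should be checked against gc3libs.core.Engine before writing a real controller.

```python
from collections import Counter


class FakeTask:
    """Hypothetical task that goes NEW -> RUNNING -> TERMINATED."""

    _next = {"NEW": "RUNNING", "RUNNING": "TERMINATED",
             "TERMINATED": "TERMINATED"}

    def __init__(self):
        self.state = "NEW"

    def step(self):
        self.state = self._next[self.state]


class SimpleController:
    """Toy controller exposing progress() and stats(); not gc3libs code."""

    def __init__(self, tasks):
        self.tasks = list(tasks)

    def progress(self):
        # Advance every task by one step; `step` is a made-up method.
        for task in self.tasks:
            task.step()

    def stats(self):
        # Map each state name to the number of tasks currently in it.
        return dict(Counter(task.state for task in self.tasks))
```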

new_tasks(extra)

Iterate over jobs that should be added to the current session. Each item yielded must have the form (jobname, cls, args, kwargs), where:

  • jobname is a string uniquely identifying the job in the session; if a job with the same name already exists, this item will be ignored.
  • cls is a callable that returns an instance of gc3libs.Application when called as cls(*args, **kwargs).
  • args is a tuple of arguments for calling cls.
  • kwargs is a dictionary used to provide keyword arguments when calling cls.

This method is called by the default process_args(), passing self.extra as the extra parameter.

The default implementation of this method scans the arguments on the command-line for files matching the glob pattern self.input_filename_pattern, and for each matching file returns a job name formed by the base name of the file (sans extension), the class given by self.application, and the full path to the input file as sole argument.

If self.instances_per_file and self.instances_per_job are set to a value other than 1, for each matching file N jobs are generated, where N is the quotient of self.instances_per_file by self.instances_per_job.

See also: process_args()
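
The default behavior described above (scan for matching files, derive a job name from the base name, pass the full path as sole argument) might look roughly like this stand-alone generator. iter_new_tasks and its parameters are hypothetical names; the sketch ignores instances_per_file and instances_per_job, and any callable can play the role of self.application.

```python
import fnmatch
import os


def iter_new_tasks(inputs, pattern, application):
    """Yield (jobname, cls, args, kwargs) for each file matching `pattern`."""
    for path in inputs:
        if os.path.isdir(path):
            # Recursively collect every file below the directory.
            candidates = [os.path.join(dirpath, name)
                          for dirpath, _, names in os.walk(path)
                          for name in sorted(names)]
        else:
            candidates = [path]
        for filename in candidates:
            basename = os.path.basename(filename)
            if fnmatch.fnmatch(basename, pattern):
                # Job name: base name of the file, without extension.
                jobname = os.path.splitext(basename)[0]
                yield (jobname, application,
                       (os.path.abspath(filename),), {})
```

Each yielded item can then be turned into an application instance with cls(*args, **kwargs).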

pre_run()

Perform parsing of standard command-line options and call into parse_args() to do non-optional argument processing.

print_summary_table(output, stats)

Print a text summary of the session status to output. This is used to provide the “normal” output of the script; when the -l option is given, the output of the print_tasks_table function is appended.

Override this in subclasses to customize the report that you provide to users. By default, this prints a table with the count of tasks for each possible state.

The output argument is a file-like object, only the write method of which is used. The stats argument is a dictionary, mapping each possible Run.State to the count of tasks in that state; see Engine.stats for a detailed description.
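
A custom report only needs to call write on the output object. The function below is a sketch of such a report, written as a plain function rather than a method; it assumes that stats holds nothing but per-state counts, which may not match everything Engine.stats returns.

```python
import io


def write_summary(output, stats):
    """Write one 'STATE count' line per state, using only output.write()."""
    for state in sorted(stats):
        output.write("%-12s %d\n" % (state, stats[state]))


buffer = io.StringIO()
write_summary(buffer, {"RUNNING": 2, "TERMINATED": 3})
print(buffer.getvalue(), end="")
```

Because only write is used, the same function works with sys.stdout, an open file, or an in-memory buffer.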

print_tasks_table(output=<open file '<stdout>', mode 'w'>, states=Enum(['TERMINATED', 'UNKNOWN', 'SUBMITTED', 'RUNNING', 'TERMINATING', 'STOPPED', 'NEW']), only=<type 'object'>)

Output a text table to stream output, giving details about tasks in the given states.

Optional second argument states restricts the listing to tasks that are in one of the specified states. By default, all task states are allowed. The states argument should be a list or a set of Run.State values.

Optional third argument only further restricts the listing to tasks that are instances of a subclass of only. By default, there is no restriction and all tasks are listed. The only argument can be a Python class or a tuple: anything, in fact, that you can pass as second argument to the isinstance function.

Parameters:
  • output – An output stream (file-like object)
  • states – List of states (Run.State items) to consider.
  • only – Root class (or tuple of root classes) of tasks to consider.
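
The combined effect of the states and only filters can be sketched as a simple list comprehension. select_tasks, Task and SpecialTask are illustrative names, and states are represented by plain strings here instead of Run.State values.

```python
def select_tasks(tasks, states=None, only=object):
    """Return tasks whose state is in `states` and that match `only`."""
    return [task for task in tasks
            if (states is None or task.state in states)
            and isinstance(task, only)]


class Task:
    """Hypothetical task with a `state` attribute."""

    def __init__(self, state):
        self.state = state


class SpecialTask(Task):
    """Hypothetical subclass used to demonstrate the `only` filter."""
```

With only left at object, every task passes the isinstance test, which matches the documented default of no restriction.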
process_args()

Process command-line positional arguments and set up the session accordingly. In particular, new jobs should be added to the session during the execution of this method: additions are not contemplated elsewhere.

This method is called by the standard _main() after loading or creating a session into self.session. New jobs should be appended to self.session and it is also permitted to remove existing ones.

The default implementation calls new_tasks() and adds to the session all jobs whose name does not clash with the jobname of an already existing task.

See also: new_tasks()
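
The name-clash rule of the default implementation can be illustrated as follows. add_new_jobs is a hypothetical helper that works on a set of existing job names instead of a real session object.

```python
def add_new_jobs(existing_names, new_items):
    """Instantiate jobs whose name is not already taken.

    `new_items` yields (jobname, cls, args, kwargs) tuples, in the form
    produced by new_tasks(). Returns a list of (jobname, instance) pairs.
    """
    added = []
    seen = set(existing_names)
    for jobname, cls, args, kwargs in new_items:
        if jobname in seen:
            continue  # name clash: the item is ignored
        seen.add(jobname)
        added.append((jobname, cls(*args, **kwargs)))
    return added
```

Here dict or any other callable can stand in for an application class when trying the helper out.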

setup()

Setup standard command-line parsing.

GC3Libs scripts should probably override setup_args() to modify command-line parsing.

setup_args()

Set up command-line argument parsing.

The default command-line parsing considers every argument as an (input) path name; processing of the given path names is done in parse_args().

gc3libs.cmdline.nonnegative_int(num)

This function raises an ArgumentTypeError if num is a negative integer (<0), and returns int(num) otherwise. num can be any object which can be converted to an int.

>>> nonnegative_int('1')
1
>>> nonnegative_int(1)
1
>>> nonnegative_int('-1') 
Traceback (most recent call last):
    ...
ArgumentTypeError: '-1' is not a non-negative integer number.
>>> nonnegative_int(-1) 
Traceback (most recent call last):
    ...
ArgumentTypeError: '-1' is not a non-negative integer number.

Please note that 0 and ‘-0’ are ok:

>>> nonnegative_int(0)
0
>>> nonnegative_int(-0)
0
>>> nonnegative_int('0')
0
>>> nonnegative_int('-0')
0

Floats are ok too:

>>> nonnegative_int(3.14)
3
>>> nonnegative_int(0.1)
0

Any string which cannot be converted to an integer will fail:

>>> nonnegative_int('ThisWillRaiseAnException')
Traceback (most recent call last):
    ...
ArgumentTypeError: 'ThisWillRaiseAnException' is not a non-negative ...
gc3libs.cmdline.positive_int(num)

This function raises an ArgumentTypeError if num is not a strictly positive integer (>0), and returns int(num) otherwise. num can be any object which can be converted to an int.

>>> positive_int('1')
1
>>> positive_int(1)
1
>>> positive_int('-1') 
Traceback (most recent call last):
...
ArgumentTypeError: '-1' is not a positive integer number.
>>> positive_int(-1) 
Traceback (most recent call last):
...
ArgumentTypeError: '-1' is not a positive integer number.
>>> positive_int(0) 
Traceback (most recent call last):
...
ArgumentTypeError: '0' is not a positive integer number.

Floats are ok too:

>>> positive_int(3.14)
3

but please take care that a float greater than 0 and less than 1 will fail, because it is truncated to 0:

>>> positive_int(0.1)
Traceback (most recent call last):
...
ArgumentTypeError: '0.1' is not a positive integer number.

Please note that 0 is NOT ok:

>>> positive_int(-0) 
Traceback (most recent call last):
...
ArgumentTypeError: '0' is not a positive integer number.
>>> positive_int('0') 
Traceback (most recent call last):
...
ArgumentTypeError: '0' is not a positive integer number.
>>> positive_int('-0') 
Traceback (most recent call last):
...
ArgumentTypeError: '-0' is not a positive integer number.

Any string which cannot be converted to an integer will fail:

>>> positive_int('ThisWillRaiseAnException') 
Traceback (most recent call last):
    ...
ArgumentTypeError: 'ThisWillRaiseAnException' is not a positive integer ...
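
Validators of this kind are typically passed as the type argument of an argparse option. The sketch below re-implements both checks in a way that is consistent with the doctests above; it is not the gc3libs source. It assumes that the ArgumentTypeError mentioned above is argparse.ArgumentTypeError, and the --jobs and --retries options are made up for the demonstration.

```python
import argparse


def _to_int(num):
    """Convert `num` to int, or return None if that is impossible."""
    try:
        return int(num)
    except (TypeError, ValueError, OverflowError):
        return None


def nonnegative_int(num):
    """Stand-in validator: accept integers >= 0 (after truncation)."""
    value = _to_int(num)
    if value is None or value < 0:
        raise argparse.ArgumentTypeError(
            "'%s' is not a non-negative integer number." % (num,))
    return value


def positive_int(num):
    """Stand-in validator: accept integers > 0 (after truncation)."""
    value = _to_int(num)
    if value is None or value <= 0:
        raise argparse.ArgumentTypeError(
            "'%s' is not a positive integer number." % (num,))
    return value


parser = argparse.ArgumentParser(prog="demo")
# Both options are hypothetical and used only for this illustration.
parser.add_argument("--jobs", type=positive_int, default=1)
parser.add_argument("--retries", type=nonnegative_int, default=0)
params = parser.parse_args(["--jobs", "4", "--retries", "0"])
print(params.jobs, params.retries)
```

When a validator raises ArgumentTypeError during parsing, argparse reports the message as a usage error and exits, so the script never sees the invalid value.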