Welcome to gc3pie’s documentation!¶
Introduction¶
GC3Libs is a Python package for controlling the life-cycle of a Grid or batch computational job.
GC3Libs provides services for submitting computational jobs to Grids and batch systems, controlling their execution, persisting job information, and retrieving the final output.
GC3Libs takes an application-oriented approach to batch computing. A generic Application class provides the basic operations for controlling remote computations, but different Application subclasses can expose adapted interfaces, focusing on the most relevant aspects of the application being represented.
This document is the technical reference for the GC3Libs programming model, aimed at programmers who want to use GC3Libs to implement computational workflows in Python.
Outline of the contents¶
The Programming overview section presents the main concepts behind GC3Libs programming.
The GC3Libs modules section is a comprehensive list of all the modules, classes and functions comprising GC3Libs; its content is automatically generated from docstrings in the source code.
Installation of GC3Utils¶
Author: | Riccardo Murri <riccardo.murri@gmail.com> |
---|---|
Date: | 2010-10-06 |
Revision: | $Revision$ |
Installation¶
These instructions show how to install GC3Pie from the GC3 source repository into a separate Python environment (called a virtualenv). Installation into a virtualenv has two distinct advantages:
- All code is confined in a single directory, and can thus be easily replaced/removed.
- Better dependency handling: additional Python packages that GC3Pie depends upon can be installed even if they conflict with system-level packages.
Install software prerequisites:
- On Debian/Ubuntu, install the packages subversion, python-dev, python-profiler and the C/C++ compiler:
apt-get install subversion python-dev python-profiler gcc g++
- On CentOS5, install the packages subversion and python-devel and the C/C++ compiler:
yum install subversion python-devel gcc gcc-c++
- On other Linux distributions, you will need to install:
  - the svn command (from the SubVersion VCS)
  - Python development headers and libraries (for installing extension libraries written in C/C++)
  - the Python package pstats (it’s part of the Python standard library, but sometimes it needs separate installation)
  - a C/C++ compiler (this is usually installed by default).
In order to use the ARC backend (required for SMSCG), you need the NorduGrid/ARC binaries and a working slcs-init command installed on the same machine where GC3Pie is installed. You can find instructions for installing it at:
Additional OS-specific installation details can be found at:
Choose a directory where the GC3Pie software will be installed; any directory that’s writable by your Linux account will be ok.
- If you are installing system-wide as root, we suggest you install GC3Pie into /opt/gc3pie.
- If you are installing as a normal user, we suggest you install GC3Pie into $HOME/gc3pie.
If it’s not already installed, get the virtualenv Python package and install it:
wget http://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.5.1.tar.gz
tar -xzf virtualenv-1.5.1.tar.gz && rm virtualenv-1.5.1.tar.gz
cd virtualenv-1.5.1/
If you are installing as root, the following command is all you need:
python setup.py install
If instead you are installing as a normal, unprivileged user, things get more complicated:
export PYTHONPATH=$HOME/lib64/python:$HOME/lib/python:$PYTHONPATH
export PATH=$PATH:$HOME/bin
mkdir -p $HOME/lib/python
python setup.py install --home $HOME
(You will also need to add the two export lines above to the $HOME/.bashrc file, if using the bash shell, or to the $HOME/.cshrc file, if using the tcsh shell.)
In any case, once virtualenv has been installed, you can exit its directory and remove it:
cd ..
rm -rf virtualenv-1.5.1
Create a virtualenv to host the gc3pie installation in the directory you chose above:
virtualenv $HOME/gc3pie # use '/opt/gc3pie' if installing as root
cd $HOME/gc3pie/
source bin/activate
Check out the gc3pie files into a src/ directory:
svn co http://gc3pie.googlecode.com/svn/branches/1.0/gc3pie src
Install gc3pie in “develop” mode, so any modification pulled from subversion is immediately reflected in the running environment:
cd src/
env CC=gcc ./setup.py develop
cd .. # back into the `gc3pie` directory
This will place all the GC3Pie commands into the gc3pie/bin/ directory.
GC3Pie comes with driver scripts to run and manage large families of jobs from a few selected applications. These scripts are not installed by default because not everyone needs them.
Run the following commands to install the driver scripts for the applications you need:
# if you are interested in GAMESS, do the following
ln -s '../src/gc3apps/gamess/ggamess.py' bin/ggamess
# if you are interested in Rosetta, do the following
ln -s '../src/gc3apps/rosetta/gdocking.py' bin/gdocking
ln -s '../src/gc3apps/rosetta/grosetta.py' bin/grosetta
# if you are interested in Codeml, do the following
ln -s '../src/gc3apps/codeml/gcodeml.py' bin/gcodeml
Before you can actually run GC3Pie, you need to have a working configuration file; the ConfigurationFile Wiki page <http://code.google.com/p/gc3pie/wiki/ConfigurationFile> provides an explanation of the syntax to use in configuration files.
An example configuration file, enabling access to the SMSCG infrastructure, can be found at:
Before you can actually use this file, you will need to insert into it three values, for which we can provide no default: aai_username, idp, and vo.
- aai_username: This is the “username” you are asked for when accessing any SWITCHaai/Shibboleth web page, e.g., https://gc3-aai01.uzh.ch/secure/
- idp: Find this out with the command “slcs-info”: it prints a list of IdPs (Identity Provider IDs), each followed by the human-readable name of the associated institution. Pick the one that corresponds to your University. It is always the last two components of the University’s Internet domain name (e.g., “uzh.ch” or “ethz.ch”).
- vo: In order to use SMSCG, you must sign up to a VO (Virtual Organisation). One of the words “life”, “earth”, “atlas” or “crypto” should be here. Find out more at: http://www.smscg.ch/www/user/
Upgrade¶
These instructions show how to upgrade the GC3Pie scripts to the latest version found in the GC3 svn repository.
cd to the directory containing the GC3Pie virtualenv; assuming it’s named gc3pie as in the above installation instructions, you can issue the command:
cd $HOME/gc3pie # use '/opt/gc3pie' if root
Activate the virtualenv
source bin/activate
Upgrade the gc3pie source and run the setup.py script again:
cd src
svn up
env CC=gcc ./setup.py develop
Note: A major restructuring of the SVN repository took place in r1124 to r1126 (Feb. 15, 2011); if your sources are older than SVN r1124, these upgrade instructions will not work, and you must reinstall completely. You can check which revision your SVN sources are at by running the svn info command in the src directory: look for the Revision: line.
HTML Documentation¶
HTML documentation for the GC3Libs programming interface can be read online at:
You can also generate a local copy from the sources:
cd $HOME/gc3pie # use '/opt/gc3pie' if root
cd src/docs
make html
Note that you need the Python package Sphinx in order to build the documentation locally.
Programming overview¶
GC3Libs takes an application-oriented approach to asynchronous
computing. A generic Application
class provides the basic
operations for controlling remote computations and fetching a result;
client code should derive specialized sub-classes to deal with a
particular application, and to perform any application-specific
pre- and post-processing.
The generic procedure for performing computations with GC3Libs is the following:
- Client code creates an instance of an Application sub-class.
- Asynchronous computation is started by submitting the application object; this associates the application with an actual (possibly remote) computational job.
- Client code can monitor the state of the computational job; state handlers are called on the application object as the state changes.
- When the job is done, the final output is retrieved and a post-processing method is invoked on the application object.
At this point, results of the computation are available and can be used by the calling program.
The Application class (and its sub-classes) allow client code to control the above process by:
1. Specifying the characteristics (computer program to run, input/output files, memory/CPU/duration requirements, etc.) of the corresponding computational job. This is done by passing suitable values to the Application constructor. See the Application constructor documentation for more info.
2. Providing methods to control the “life-cycle” of the associated computational job: start, check execution state, stop, retrieve a snapshot of the output files. There are actually two different interfaces for this, detailed below:
   - A passive interface: a Core or an Engine object is used to start/stop/monitor jobs associated with the given application. For instance:
a = GamessApplication(...)
# create a `Core` object; only one instance is needed
g = Core(...)
# start the remote computation
g.submit(a)
# periodically monitor job execution
g.update_job_state(a)
# retrieve output when the job is done
g.fetch_output(a)
     The passive interface gives client code full control over the lifecycle of the job, but cannot support some use cases (e.g., automatic application re-start).
     As you can see from the above example, the passive interface is implemented by methods in the Core and Engine classes (they implement the same interface). See the documentation of those classes for more details.
   - An active interface: this requires that the Application object be attached to a Core or Engine instance:
a = GamessApplication(...)
# create a `Core` object; only one instance is needed
g = Core(...)
# tell application to use the active interface
a.attach(g)
# start the remote computation
a.submit()
# periodically monitor job execution
a.update_job_state()
# retrieve output when the job is done
a.fetch_output()
     With the active interface, application objects can support automated restart and similar use-cases.
     When an Engine object is used instead of a Core one, the job life-cycle is automatically managed, providing a fully asynchronous way of executing computations.
     The active interface is implemented by the Task class and all its descendants (including Application).
3. Providing “state transition methods” that are called when a change in the job execution state is detected; those methods can implement application-specific behavior, like restarting the computational job with changed input if the allotted duration has expired but the computation has not finished. In particular, a postprocess method is called when the final output of an application is available locally for processing.
   The “state transition methods” currently implemented by the Application class are: new(), submitted(), running(), stopped(), terminated() and postprocess(). Each method is called when the execution state of an application object changes to the corresponding state; see each method’s documentation for exact information.
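As a rough illustration of how state transition methods can be used, the sketch below defines a tiny stand-in base class (`SketchTask`, hypothetical and not part of GC3Libs) that calls the method named after the new state whenever the state changes, and a subclass that overrides some of the hooks. Only the hook names and state names are taken from the text above; the dispatch mechanism and the `change_state` helper are assumptions made for illustration, not the GC3Libs implementation.

```python
class SketchTask:
    """Hypothetical stand-in for an Application-like object (NOT the GC3Libs class)."""

    def __init__(self):
        self.state = "NEW"

    def change_state(self, new_state):
        """Record the new state and call the hook named after it, if the state changed."""
        if new_state == self.state:
            return
        self.state = new_state
        getattr(self, new_state.lower())()

    # Default hooks do nothing; subclasses override the ones they care about.
    def new(self):
        pass

    def submitted(self):
        pass

    def running(self):
        pass

    def stopped(self):
        pass

    def terminated(self):
        pass

    def postprocess(self):
        pass


class LoggingTask(SketchTask):
    """Example subclass that records which hooks were invoked."""

    def __init__(self):
        super().__init__()
        self.log = []

    def running(self):
        self.log.append("running")

    def terminated(self):
        self.log.append("terminated")
        # Assumption of this sketch: post-processing is triggered right after
        # termination, standing in for "output is available locally".
        self.postprocess()

    def postprocess(self):
        self.log.append("postprocess")


task = LoggingTask()
for state in ["SUBMITTED", "RUNNING", "RUNNING", "TERMINATED"]:
    task.change_state(state)
```

The repeated RUNNING update does not call the hook a second time, because hooks fire only when the state actually changes.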
In addition, GC3Libs provides collection classes, that expose
interfaces 2. and 3. above, allowing one to control a set of
applications as a single whole. Collections can be nested (i.e., a
collection can hold a mix of Application
and
TaskCollection
objects), so that workflows can be implemented
by composing collection objects.
Note that the term computational job (or just job, for short) is used here in a quite general sense, to mean any kind of computation that can happen independently of the main thread of the calling program. GC3Libs currently provides means to execute a job as a separate process on the same computer, or as a batch job on a remote computational cluster.
Execution model of GC3Libs applications¶
An Application can be regarded as an abstraction of an independent asynchronous computation, i.e., a GC3Libs’ Application behaves much like an independent UNIX process (but it can actually run on a separate remote computer). Indeed, GC3Libs’ Application objects mimic the POSIX process model: Application objects are started by a parent process, run independently of it, and need to have their final exit code and output reaped by the calling process.
The following table makes the correspondence between POSIX processes and GC3Libs’ Application objects explicit.
os module function | Core function | purpose |
---|---|---|
exec | Core.submit | start new job |
kill(..., SIGTERM) | Core.kill | terminate executing job |
wait(..., WNOHANG) | Core.update_job_state | get job status |
(none) | Core.fetch_output | retrieve output |
Note
- With GC3Libs, it is not possible to send an arbitrary signal to a running job: jobs can only be started and stopped (killed).
- Since POSIX processes are always executed on the local machine, there is no equivalent of the GC3Libs fetch_output.
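For comparison, the POSIX side of the table can be sketched with the standard os module alone. The snippet below is Unix-only and does not use GC3Libs: it forks and execs a child (a sleeping Python interpreter), polls it with os.waitpid(..., os.WNOHANG), and terminates it with SIGTERM. The helper names `start_child` and `poll` are made up for this illustration.

```python
import os
import signal
import sys
import time


def start_child():
    """Fork and exec a sleeping child; return its PID (the 'exec' row of the table)."""
    pid = os.fork()
    if pid == 0:
        # Child process: replace ourselves with a new program.
        try:
            os.execv(sys.executable,
                     [sys.executable, "-c", "import time; time.sleep(30)"])
        finally:
            os._exit(127)  # only reached if exec itself failed
    return pid


def poll(pid):
    """Non-blocking status check (the 'wait(..., WNOHANG)' row of the table).

    Returns None while the child is still running, else its raw wait status.
    """
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    return None if reaped_pid == 0 else status


pid = start_child()
still_running = poll(pid) is None   # the child has only just started

os.kill(pid, signal.SIGTERM)        # the 'kill(..., SIGTERM)' row of the table

# Keep polling until the child has been reaped.
while True:
    status = poll(pid)
    if status is not None:
        break
    time.sleep(0.01)
```

In the GC3Libs model, the same three steps correspond to Core.submit, Core.update_job_state and Core.kill, with Core.fetch_output as the extra step that has no local counterpart.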
Application exit codes¶
POSIX encodes process termination information in the “return code”, which can be parsed through os.WEXITSTATUS, os.WIFSIGNALED, os.WTERMSIG and related library calls.
Likewise, GC3Libs provides each Application
object with an
execution.returncode attribute, which is a valid POSIX “return
code”. Client code can therefore use os.WEXITSTATUS and relatives
to inspect it; convenience attributes execution.signal and
execution.exitcode are available for direct access to the parts of
the return code. See Run.returncode()
for more information.
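The decoding described above can be sketched with plain integers and the os module (Unix-only). The helper name `describe_returncode` and the two sample values are assumptions made for this illustration; the samples follow the usual Linux encoding of a wait status, with the exit code in the high byte and the terminating signal in the low 7 bits. With GC3Libs, the value passed in would be an application's execution.returncode.

```python
import os


def describe_returncode(returncode):
    """Split a POSIX-style return code into a (kind, value) pair."""
    if os.WIFSIGNALED(returncode):
        return ("signal", os.WTERMSIG(returncode))
    if os.WIFEXITED(returncode):
        return ("exit", os.WEXITSTATUS(returncode))
    return ("other", returncode)


# Sample values (assumed Linux encoding, see the lead-in):
normal_exit = 3 << 8   # process called exit(3), no signal involved
killed = 15            # process was terminated by signal 15 (SIGTERM)

exit_info = describe_returncode(normal_exit)
kill_info = describe_returncode(killed)
```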
However, GC3Libs has to deal with error conditions that are not catered for by the POSIX process model: for instance, execution of an application may fail because of an error connecting to the remote execution cluster.
For this purpose, GC3Libs encodes information about abnormal job
termination using a set of pseudo-signal codes in a job’s
execution.returncode attribute: i.e., if termination of a job is due
to some grid/batch system/middleware error, the job’s
os.WIFSIGNALED(app.execution.returncode) will be True and the
signal code (as obtained from os.WTERMSIG(app.execution.returncode))
will be one of those listed in the Run.Signals
documentation.
Application execution states¶
At any given moment, a GC3Libs job is in any one of a set of
pre-defined states, listed in the table below. The job state is
always available in the .execution.state instance property of any
Application or Task object; see Run.state()
for detailed
information.
GC3Libs’ Job state | purpose | can change to |
---|---|---|
NEW | Job has not yet been submitted/started (i.e., gsub not called) | SUBMITTED (by gsub) |
SUBMITTED | Job has been sent to execution resource | RUNNING, STOPPED |
STOPPED | Trap state: job needs manual intervention (either user- or sysadmin-level) to resume normal execution | TERMINATED (by gkill), SUBMITTED (by miracle) |
RUNNING | Job is executing on remote resource | TERMINATED |
TERMINATED | Job execution is finished (correctly or not) and will not be resumed | None: final state |
When an Application object is first created, its
.execution.state attribute is assigned the state NEW. After a
successful start (via Core.submit() or similar), it is transitioned
to state SUBMITTED. Further transitions to the RUNNING, STOPPED or
TERMINATED states happen completely independently of the creator
program: the Core.update_job_state() call provides updates on the
status of a job. (Somewhat like the POSIX wait(..., WNOHANG) system
call, except that GC3Libs provides explicit RUNNING and STOPPED states,
instead of encoding them into the return value.)
The STOPPED state is a kind of generic “run time error” state: a job can get into the STOPPED state if its execution is stopped (e.g., a SIGSTOP is sent to the remote process) or delayed indefinitely (e.g., the remote batch system puts the job “on hold”). There is no way a job can get out of the STOPPED state automatically: all transitions from the STOPPED state require manual intervention, either by the submitting user (e.g., cancel the job), or by the remote systems administrator (e.g., by releasing the hold).
The TERMINATED state is the final state of a job: once a job reaches it, it cannot get back to any other state. Jobs reach TERMINATED state regardless of their exit code, or even if a system failure occurred during remote execution; actually, jobs can reach the TERMINATED status even if they didn’t run at all!
A job that is not in the NEW or TERMINATED state is said to be a “live” job.
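The state table and the definition of a “live” job can be written down as a small lookup structure. This is only a sketch transcribed from the table in this section; the names `TRANSITIONS`, `can_transition` and `is_live` are hypothetical and not part of GC3Libs.

```python
# Allowed transitions, transcribed from the "can change to" column of the table.
TRANSITIONS = {
    "NEW": {"SUBMITTED"},
    "SUBMITTED": {"RUNNING", "STOPPED"},
    "STOPPED": {"TERMINATED", "SUBMITTED"},
    "RUNNING": {"TERMINATED"},
    "TERMINATED": set(),  # final state: no way out
}


def can_transition(current, target):
    """Return True if the table lists `target` as reachable from `current`."""
    return target in TRANSITIONS[current]


def is_live(state):
    """A job is "live" when it is neither NEW nor TERMINATED."""
    return state not in ("NEW", "TERMINATED")


live_states = sorted(s for s in TRANSITIONS if is_live(s))
```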
Computational job specification¶
One of the purposes of GC3Libs is to provide an abstraction layer that frees client code from dealing with the details of job execution on a possibly remote cluster. For this to work, it is necessary to specify job characteristics and requirements, so that the GC3Libs scheduler can select an appropriate computational resource for executing the job.
GC3Libs Application objects provide a way to describe computational job characteristics (program to run, input and output files, memory/duration requirements, etc.) loosely patterned after ARC’s xRSL language.
The description of the computational job is done through keyword
parameters to the Application constructor; see the constructor
documentation for details. Changes in the job characteristics after an
Application object has been constructed are not currently
supported.
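The idea of a job description that is fixed at construction time can be illustrated with a frozen dataclass. Everything in this sketch is hypothetical: `JobSpec` and its field names are invented for illustration and are not the keyword parameters of the real Application constructor, which are listed in its own documentation.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical stand-in for a job description (NOT the GC3Libs Application)."""
    executable: str
    arguments: tuple = ()
    inputs: tuple = ()
    outputs: tuple = ()
    memory_gb: int = 1
    max_hours: int = 1


# All characteristics are given as keyword parameters at construction time.
spec = JobSpec(
    executable="/bin/echo",
    arguments=("hello",),
    outputs=("stdout.txt",),
)

# Later changes are rejected, mirroring "changes ... are not currently supported".
try:
    spec.max_hours = 8
    changed = True
except FrozenInstanceError:
    changed = False
```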
GC3Libs modules¶
gc3libs¶
gc3libs.core¶
gc3libs.Default¶
Warning
This module is deprecated and will be removed in a future release. Do not depend on it.
gc3libs.Exceptions¶
gc3libs.persistence¶
gc3libs.application¶
gc3libs.application.gamess¶
gc3libs.application.rosetta¶
gc3libs.authentication¶
gc3libs.authentication.ssh¶
gc3libs.authentication.grid¶
gc3libs.backends¶
gc3libs.backends.arc¶
gc3libs.backends.sge¶
gc3libs.backends.transport¶
gc3libs.utils¶
gc3libs.Resource¶
Warning
This module is deprecated and will be removed in a future release. Do not depend on it.
gc3libs.scheduler¶
gc3libs.InformationContainer¶
Warning
This module is deprecated and will be removed in a future release. Do not depend on it.