GC3Pie programming tutorials¶
Implementing scientific workflows with GC3Pie¶
This is the course material prepared for the “GC3Pie for Programmers” training, held at the University of Zurich for the first time on July 11-14, 2016. (The slides presented here are revised at each course re-run.)
The course aims to show how to implement patterns commonly seen in scientific computational workflows using Python and GC3Pie, and to give users enough knowledge of the tools available in GC3Pie to extend and adapt the examples provided.
A presentation of the training material and outline of the course. Probably not very useful unless you’re actually sitting in class.
A quick overview of the kind of computational use cases that GC3Pie can easily solve.
The basics needed to write simple GC3Pie scripts: the minimal session-based script scaffolding, and the properties and features of the Application object.
Recall a few GC3Pie utilities that are especially useful when debugging code.
Customizing command-line processing
How to set up command-line argument and option processing in GC3Pie’s SessionBasedScript.
How to specify running requirements for Application tasks, e.g., how much memory is needed to run.
Application control and post-processing
How to check and react to the termination status of a GC3Pie Task/Application.
A worked-out example of a many-step workflow.
How to run tasks in sequence: basic usage of SequentialTaskCollection and StagedTaskCollection.
How to run independent tasks in parallel: the ParallelTaskCollection.
Automated construction of task dependency graphs
How to use the DependentTaskCollection for automated arrangement of tasks given their dependencies.
Dynamic and Unbounded Sequences of Tasks
How to construct SequentialTaskCollection classes that change the sequence of tasks while being run.
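To make the idea of arranging tasks from their dependencies concrete, here is a minimal sketch in plain Python. It does not use GC3Pie and does not reproduce the DependentTaskCollection API: the task names and the `run_order` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for the scheduling logic.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task, each value is the set of
# tasks that must finish before it can start.
dependencies = {
    "grayscale": set(),
    "colorize": {"grayscale"},
    "tile": {"grayscale"},
    "merge": {"colorize", "tile"},
}


def run_order(deps):
    """Return batches of tasks; tasks within one batch are independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all prerequisites are done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unlocking successors
    return batches


batches = run_order(dependencies)
for step, batch in enumerate(batches, start=1):
    print(step, batch)
```

Each batch contains tasks whose prerequisites have all completed, so the tasks in a batch could run in parallel while the batches themselves run in sequence.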
A bottom-up introduction to programming with GC3Pie¶
This is the course material made for the GC3Pie 2012 Training event held at the University of Zurich on October 1-2, 2012.
The presentation starts with low-level concepts (e.g., the Application and how to do manual task submission) and then gradually introduces more sophisticated tools (e.g., the SessionBasedScript and workflows).
This order of introducing concepts will likely appeal most to those already familiar with batch computing and grid computing, as it provides an immediate map from job submission and monitoring commands to their GC3Pie equivalents.
Introduction to the software: what is GC3Pie, what is it for, and an overview of its features for writing high-throughput computing scripts.
The Application class, the smallest building block of GC3Pie. Introduction to the concept of a job, the states of an application, and the Core class.
How to define extra requirements for an application, such as the minimum amount of memory it will use, the number of cores needed or the architecture of the CPUs.
Managing applications: the SessionBasedScript class
Introduction to the highest-level interface to build applications with GC3Pie, the SessionBasedScript. Information on how to create simple scripts that take care of the execution of your applications, from submission to getting back the final results.
Low-level tools to aid debugging the scripts.
Introduction to Workflows with GC3Pie
Using a practical example (the “Warholize” Workflow Tutorial) we show how workflows are implemented with GC3Pie. The following slides cover in more detail the individual steps needed to produce a complex workflow.
Description of the ParallelTaskCollection class, used to run tasks in parallel.
Description of the StagedTaskCollection class, used to run a sequence of a fixed number of jobs.
Description of the SequentialTaskCollection class, used to run a sequence of jobs that can be altered during runtime.
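The notion of a sequence of jobs that can be altered at runtime can be illustrated with a small toy model in plain Python. This is only a conceptual sketch: the `DynamicSequence` class, its `next_task` hook, and the halving rule are hypothetical and are not the SequentialTaskCollection API.

```python
class DynamicSequence:
    """Toy stand-in for a sequence of jobs that can change while it runs."""

    def __init__(self, first_task):
        self.tasks = [first_task]
        self.results = []

    def next_task(self, last_result):
        """Decide, after each step, whether another task is needed.

        Hypothetical rule: keep halving until the value drops below 1.
        Returning None ends the sequence.
        """
        if last_result < 1:
            return None
        return lambda: last_result / 2

    def run(self):
        index = 0
        while index < len(self.tasks):
            result = self.tasks[index]()  # "execute" the current task
            self.results.append(result)
            follow_up = self.next_task(result)
            if follow_up is not None:
                self.tasks.append(follow_up)  # the sequence grows at runtime
            index += 1
        return self.results


seq = DynamicSequence(lambda: 8.0)
results = seq.run()
print(results)
```

The number of steps is not known when the sequence is created; it depends on the results produced along the way, which is the property that distinguishes this kind of collection from one with a fixed number of stages.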
The “Warholize” Workflow Tutorial¶
In this tutorial we show how to use the GC3Pie libraries to build a command-line script that runs a complex workflow combining tasks that execute in parallel with tasks that execute in sequence.
The tutorial itself contains the complete source code of the application (see Literate Programming on Wikipedia), so you can test or modify it and produce a working warholize.py script by downloading the pylit.py script from the PyLit Homepage and running the following command on the docs/programmers/tutorials/warholize/warholize.rst file, from within the source tree of GC3Pie:
$ ./pylit warholize.rst warholize.py
Example scripts¶
A collection of small example scripts highlighting different features of GC3Pie is available in the source distribution, in the examples/ folder:
The simplest script you can create. It uses only the Application and Engine classes to create an application, submit it, check its status and retrieve its output.
A SessionBasedScript that executes its argument as a command. It can also run the command multiple times by wrapping it in a ParallelTaskCollection or a SequentialTaskCollection, depending on a command-line option. Useful for testing a configured resource.
A simple SessionBasedScript that sums two values by customizing a SequentialTaskCollection.
An enhanced version of the warholize script proposed in the “Warholize” Workflow Tutorial.