Generic Python programming utility functions.

This module collects general utility functions, not specifically related to GC3Libs. A good rule of thumb for determining if a function or class belongs in here is the following: place a function or class in this module if you could copy its code into the sources of a different project and it would not stop working.

class gc3libs.utils.Enum

A generic enumeration class. Inspired by: http://goo.gl/1AL5N0 with some more syntactic sugar added.

An Enum class must be instantiated with a list of strings, which become the enumeration “labels”:

>>> Animal = Enum('CAT', 'DOG')

Each label is available as an instance attribute, evaluating to itself:

>>> Animal.DOG
'DOG'
>>> Animal.CAT == 'CAT'
True

As a consequence, you can test for presence of an enumeration label by string value:

>>> 'DOG' in Animal
True

Finally, enumeration labels can also be iterated upon:

>>> for a in sorted(Animal): print a
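One way such a class could be built is as a frozenset subclass whose attribute lookup falls back to set membership. The following is a minimal sketch under that assumption, not the GC3Libs implementation; the name LabelEnum is hypothetical:

```python
class LabelEnum(frozenset):
    """Sketch of a string enumeration (hypothetical, not gc3libs.utils.Enum)."""

    def __new__(cls, *labels):
        # Store the labels as the members of an immutable set.
        return frozenset.__new__(cls, labels)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails:
        # a known label evaluates to itself, anything else is an error.
        if name in self:
            return name
        raise AttributeError("No '%s' in enumeration" % name)


Animal = LabelEnum('CAT', 'DOG')
print(Animal.DOG)       # DOG
print('DOG' in Animal)  # True
print(sorted(Animal))   # ['CAT', 'DOG']
```

Because the labels live in a set, membership tests and iteration come for free; only attribute access needs custom code.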
class gc3libs.utils.ExponentialBackoff(slot_duration=0.05, max_retries=5)

Generate waiting times with the exponential backoff algorithm.

Returned times are in seconds (or fractions thereof); they are integral multiples of the basic time slot, which is set with the slot_duration constructor parameter.

After max_retries have been attempted, any call to this iterator will raise a StopIteration exception.

The ExponentialBackoff class implements the iterator protocol, so you can just retrieve waiting times with the .next() method, or by looping over it:

>>> random.seed(314) # not-so-random for testing purposes...
>>> for wt in ExponentialBackoff():
...   print wt,
0.0 0.0 0.0 0.25 0.15 0.3
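The iterator behaviour described above can be sketched as follows. This is an illustration only: the class name BackoffSketch is hypothetical, and the rule that the k-th waiting time is drawn uniformly from 0 to 2**k - 1 slots is the textbook binary exponential backoff, assumed here rather than taken from the GC3Libs sources.

```python
import random


class BackoffSketch(object):
    """Sketch of an exponential backoff iterator (hypothetical name)."""

    def __init__(self, slot_duration=0.05, max_retries=5):
        self.slot_duration = slot_duration
        self.max_retries = max_retries
        self.attempt = -1

    def __iter__(self):
        return self

    def __next__(self):
        self.attempt += 1
        if self.attempt > self.max_retries:
            raise StopIteration
        # Assumption: wait a random whole number of slots in [0, 2**k - 1].
        slots = random.randint(0, 2 ** self.attempt - 1)
        return self.slot_duration * slots

    next = __next__  # Python 2 spelling of the iterator protocol


for wait in BackoffSketch():
    print(wait)
```

With the default max_retries=5 the loop produces six values (attempts 0 through 5), the first of which is always 0.0.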

Return next waiting time.


Wait for another while.

class gc3libs.utils.History

A list of messages with timestamps and (optional) tags.

The append method should be used to add a message to the History:

>>> L = History()
>>> L.append('first message')
>>> L.append('second one')

The last method returns the text of the last message appended, with its timestamp:

>>> L.last().startswith('second one at')
True

Iterating over a History instance returns message texts in the temporal order they were added to the list, with their timestamp:

>>> for msg in L: print(msg)
first message ...

append(message, *tags)

Append a message to this History.

The message is timestamped with the time at the moment of the call.

The optional tags argument is a sequence of strings. Tags are recorded together with the message and may be used to filter log messages given a set of labels. (This feature is not yet implemented.)


Return a formatted message, appending to the message its timestamp in human readable format.


Return text of last message appended. If log is empty, return empty string.
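The behaviour described for History can be approximated with a list of (message, timestamp, tags) tuples. The sketch below is an assumption-laden illustration, not the GC3Libs code: the class name HistorySketch and the exact timestamp format are hypothetical; only the "message at timestamp" shape is suggested by the examples above.

```python
import time


class HistorySketch(object):
    """Sketch of a timestamped message list (hypothetical name)."""

    def __init__(self):
        self._messages = []

    def append(self, message, *tags):
        # Record the message together with the current time and any tags.
        self._messages.append((message, time.time(), tags))

    def format_message(self, entry):
        # Append a human-readable timestamp (format is an assumption).
        message, timestamp, _tags = entry
        when = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(timestamp))
        return '%s at %s' % (message, when)

    def last(self):
        # An empty history yields the empty string.
        if not self._messages:
            return ''
        return self.format_message(self._messages[-1])

    def __iter__(self):
        # Messages come back in the order they were appended.
        return iter([self.format_message(e) for e in self._messages])


log = HistorySketch()
log.append('first message')
log.append('second one', 'debug')
print(log.last())
```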

class gc3libs.utils.MinusInfinity

An object that is less-than any other object.

>>> x = MinusInfinity()
>>> x < 1
True
>>> 1 > x
True
>>> x < -1245632479102509834570124871023487235987634518745
True
>>> x < -sys.maxint
True
>>> x > -sys.maxint
False
>>> -sys.maxint > x
True

MinusInfinity objects are actually smaller than any given Python object:

>>> x < 'azz'
True
>>> x < object()
True

Note that MinusInfinity is a singleton, therefore you always get the same instance when calling the class constructor:

>>> x = MinusInfinity()
>>> y = MinusInfinity()
>>> x is y
True

Relational operators try to return the correct value when comparing MinusInfinity to itself:

>>> x < y
False
>>> x <= y
True
>>> x == y
True
>>> x >= y
True
>>> x > y
False

class gc3libs.utils.PlusInfinity

An object that is greater-than any other object.

>>> x = PlusInfinity()
>>> x > 1
True
>>> 1 < x
True
>>> 1245632479102509834570124871023487235987634518745 < x
True
>>> x > sys.maxint
True
>>> x < sys.maxint
False
>>> sys.maxint < x
True

PlusInfinity objects are actually larger than any given Python object:

>>> x > 'azz'
True
>>> x > object()
True

Note that PlusInfinity is a singleton, therefore you always get the same instance when calling the class constructor:

>>> x = PlusInfinity()
>>> y = PlusInfinity()
>>> x is y
True

Relational operators try to return the correct value when comparing PlusInfinity to itself:

>>> x < y
False
>>> x <= y
True
>>> x == y
True
>>> x >= y
True
>>> x > y
False

class gc3libs.utils.Singleton

Derived classes of Singleton can have only one instance in the running Python interpreter.

>>> x = Singleton()
>>> y = Singleton()
>>> x is y
True

class gc3libs.utils.Struct(initializer=None, **extra_args)

A dict-like object, whose keys can be accessed with the usual ‘[...]’ lookup syntax, or with the ‘.’ get attribute syntax.


>>> a = Struct()
>>> a['x'] = 1
>>> a.x
1
>>> a.y = 2
>>> a['y']
2

Values can also be initially set by specifying them as keyword arguments to the constructor:

>>> a = Struct(z=3)
>>> a['z']
3
>>> a.z
3

Like dict instances, Struct instances have a copy method to get a shallow copy of the instance:

>>> b = a.copy()
>>> b.z
3

Return a (shallow) copy of this Struct instance.
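A dict-like object with attribute access can be sketched by routing attribute lookups to item lookups. This is an illustrative sketch under that assumption (a plain dict subclass with the hypothetical name StructSketch); the GC3Libs class may be implemented differently.

```python
class StructSketch(dict):
    """Sketch of a dict whose keys are also attributes (hypothetical name)."""

    def __init__(self, initializer=None, **extra_args):
        dict.__init__(self)
        if initializer is not None:
            self.update(initializer)
        self.update(extra_args)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Attribute assignment stores a key in the mapping.
        self[name] = value

    def copy(self):
        # Shallow copy: new container, same value objects.
        return StructSketch(self)


a = StructSketch(z=3)
a.y = 2
print(a['y'], a.z)  # 2 3
```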

class gc3libs.utils.YieldAtNext(generator)

Provide an alternate protocol for generators.

Wrap a Python generator object, and buffer the return values from send and throw calls, returning None instead. Return the yielded value –or raise the StopIteration exception– upon the subsequent call to the next method.
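The buffering protocol described above can be sketched as a thin wrapper around a generator. The class name YieldAtNextSketch and its internals are hypothetical; the code only illustrates the described behaviour and is not the GC3Libs implementation.

```python
class YieldAtNextSketch(object):
    """Sketch: send/throw return None, the value arrives at the next next()."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, generator):
        self._generator = generator
        self._buffered = self._EMPTY
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._exhausted:
            raise StopIteration
        if self._buffered is not self._EMPTY:
            # Hand out the value produced by the previous send/throw.
            value, self._buffered = self._buffered, self._EMPTY
            return value
        return next(self._generator)

    next = __next__  # Python 2 spelling

    def _call_and_buffer(self, method, *args):
        try:
            self._buffered = method(*args)
        except StopIteration:
            # Remember exhaustion; report it at the next call to next().
            self._exhausted = True
        return None

    def send(self, value):
        return self._call_and_buffer(self._generator.send, value)

    def throw(self, *exc_info):
        return self._call_and_buffer(self._generator.throw, *exc_info)


def echo():
    received = yield 'ready'
    while True:
        received = yield 'got %s' % received


g = YieldAtNextSketch(echo())
print(next(g))      # ready
print(g.send('x'))  # None
print(next(g))      # got x
```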


Rename the filesystem entry at path by appending a unique numerical suffix; return new name.

For example,

  1. create a test file:

>>> import tempfile
>>> path = tempfile.mkstemp()[1]

  2. then make a backup of it; the backup will end in .~1~:

>>> path1 = backup(path)
>>> os.path.exists(path + '.~1~')
True

  3. re-create the file, and make a second backup: this time the file will be renamed with a .~2~ extension:

>>> open(path, 'w').close()
>>> path2 = backup(path)
>>> os.path.exists(path + '.~2~')
True

Finally, clean up the test files:

>>> os.remove(path+'.~1~')
>>> os.remove(path+'.~2~')
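The numbered-suffix renaming shown above could be sketched as below. The function name backup_sketch is hypothetical, and probing suffixes one by one is an assumption made for brevity; the actual GC3Libs function may pick the suffix differently.

```python
import os


def backup_sketch(path):
    """Rename path to path.~N~ with the first unused N; return the new name."""
    suffix = 1
    # Probe .~1~, .~2~, ... until an unused name is found.
    while os.path.exists('%s.~%d~' % (path, suffix)):
        suffix += 1
    new_path = '%s.~%d~' % (path, suffix)
    os.rename(path, new_path)
    return new_path
```

Note that this simple probe is not safe against two processes backing up the same path at the same time.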

Return base name without the extension.

This behaves exactly like os.path.basename() except that the last few characters, up to the rightmost dot, are removed as well:

>>> basename_sans('/tmp/foo.txt')

>>> basename_sans('bar.txt')

If there is no dot in the file name, no “extension” is chopped off:

>>> basename_sans('baz')

If there are several dots in the file name, only the last one and trailing characters are removed:

>>> basename_sans('foo.bar.baz')

Leading directory components are chopped off in any case:

>>> basename_sans('/tmp/foo.bar.baz')

>>> basename_sans('/tmp/foo')
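The rules illustrated above (drop leading directories, then drop everything from the rightmost dot) fit in a few lines. This is a sketch with the hypothetical name basename_sans_sketch, not the library source:

```python
import os


def basename_sans_sketch(path):
    """Return the base name of path without its last dot-extension."""
    name = os.path.basename(path)
    # rpartition splits at the rightmost dot; `dot` is empty if there is none.
    stem, dot, _extension = name.rpartition('.')
    return stem if dot else name


print(basename_sans_sketch('/tmp/foo.txt'))      # foo
print(basename_sans_sketch('/tmp/foo.bar.baz'))  # foo.bar
print(basename_sans_sketch('baz'))               # baz
```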

Cache the result of a (nullary) method invocation for a given amount of time. Use as a decorator on object methods whose results are to be cached.

Store the result of the first invocation of the decorated method; if another invocation happens before lapse seconds have passed, return the cached value instead of calling the real function again. If a new call happens after the grace period has expired, call the real function and store the result in the cache.

Note: Do not use with methods that take keyword arguments, as they will be discarded! In addition, arguments are compared to elements in the cache by identity, so that invoking the same method with equal but distinct objects will result in two separate copies of the result being computed and stored in the cache.

Cache results and timestamps are stored into the objects’ _cache_value and _cache_last_updated attributes, so the caches are destroyed with the object when it goes out of scope.

The working of the cached method can be demonstrated by the following simple code:

>>> class X(object):
...     def __init__(self):
...         self.times = 0
...     @cache_for(2)
...     def foo(self):
...             self.times += 1
...             return ("times effectively run: %d" % self.times)
>>> x = X()
>>> x.foo()
'times effectively run: 1'
>>> x.foo()
'times effectively run: 1'
>>> time.sleep(3)
>>> x.foo()
'times effectively run: 2'
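A time-limited cache of this kind can be sketched as a decorator factory that stores the value and its timestamp on the instance. The name cache_for_sketch is hypothetical, and keying the per-object caches by method name is an assumption of this sketch; only the attribute names _cache_value and _cache_last_updated come from the description above.

```python
import functools
import time


def cache_for_sketch(lapse):
    """Cache the result of a no-argument method for `lapse` seconds."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self):
            now = time.time()
            # Per-object storage, so caches die with the object.
            values = self.__dict__.setdefault('_cache_value', {})
            updated = self.__dict__.setdefault('_cache_last_updated', {})
            key = fn.__name__
            if key not in values or now - updated[key] > lapse:
                values[key] = fn(self)
                updated[key] = now
            return values[key]
        return wrapper
    return decorator


class Counter(object):
    def __init__(self):
        self.times = 0

    @cache_for_sketch(0.2)
    def foo(self):
        self.times += 1
        return self.times
```

Calling foo() twice in quick succession runs the body once; after the lapse has expired the body runs again.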
gc3libs.utils.cat(*args, **extra_args)

Concatenate the contents of all args into output. Both output and each of the args can be a file-like object or a string (indicating the path of a file to open).

If append is True, then output is opened in append-only mode; otherwise it is overwritten.

gc3libs.utils.copy_recursively(src, dst, overwrite=False, changed_only=True)

Copy src to dst, descending it recursively if necessary.

The overwrite and changed_only optional arguments have the same effect as in copytree() (which see).

gc3libs.utils.copyfile(src, dst, overwrite=False, changed_only=True, link=False)

Copy a file from src to dst; return True if the copy was actually made.

If overwrite is False (default), an existing destination entry is left unchanged and False is returned.

If overwrite is True, then changed_only determines if the destination file is overwritten:

  • if changed_only is True (default), then destination is overwritten if and only if it has a different size or has been modified less recently than the source;
  • if changed_only is False, then the destination is overwritten unconditionally.

If link is True, an attempt at hard-linking is done first; failing that, we copy the source file onto the destination one. Permission bits and modification times are copied as well.

If dst is a directory, a file with the same basename as src is created (or overwritten) in the directory specified.

Return True or False, depending on whether the source file was actually copied (or linked) to the destination.
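The interplay of overwrite and changed_only can be condensed into a single predicate. The helper below is hypothetical (needs_copy is not part of the documented API) and only restates the rules above; it does no copying itself.

```python
import os


def needs_copy(src, dst, overwrite=False, changed_only=True):
    """Return True if, under the rules above, src should be copied onto dst."""
    if not os.path.exists(dst):
        return True            # nothing to overwrite
    if not overwrite:
        return False           # existing destination is left unchanged
    if not changed_only:
        return True            # overwrite unconditionally
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    # Overwrite only if sizes differ or the source is more recent.
    return (src_stat.st_size != dst_stat.st_size
            or src_stat.st_mtime > dst_stat.st_mtime)
```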

gc3libs.utils.copytree(src, dst, overwrite=False, changed_only=True)

Recursively copy an entire directory tree rooted at src.

If overwrite is False (default), entries that already exist in the destination tree are left unchanged and not overwritten.

If overwrite is True, then changed_only determines which files are overwritten:

  • if changed_only is True (default), then only files for which the source has a different size or has been modified more recently than the destination are copied;
  • if changed_only is False, then all files in source will be copied into destination, unconditionally.

Destination directory dst is created if it does not exist.

See also: shutil.copytree.

gc3libs.utils.count(seq, predicate)

Return number of items in seq that match predicate. Argument predicate should be a callable that accepts one argument and returns a boolean.


Decorator to define properties with a simplified syntax in Python 2.4. See http://goo.gl/IoOZ8m for details and examples.

gc3libs.utils.deploy_configuration_file(filename, template_filename=None)

Ensure that configuration file filename exists; possibly copying it from the specified template_filename.

Return True if a file with the specified name exists in the configuration directory. If not, try to copy the template file over and then return False; in case the copy operation fails, a NoConfigurationFile exception is raised.

The template_filename is always resolved relative to GC3Libs’ ‘package resource’ directory (i.e., the etc/ directory in the sources). If template_filename is None, then it is assumed to be the base name of filename.


Same as os.path.dirname but return . in case of path names with no directory component.

gc3libs.utils.fgrep(literal, filename)

Iterate over all lines in a file that contain the literal string.


Return the first element of sequence or iterator seq. Raise TypeError if the argument does not implement either of the two interfaces.


>>> s = [0, 1, 2]
>>> first(s)
0

>>> s = {'a':1, 'b':2, 'c':3}
>>> first(sorted(s.keys()))
'a'

gc3libs.utils.from_template(template, **extra_args)

Return the contents of template, substituting all occurrences of Python formatting directives ‘%(key)s’ with the corresponding values taken from dictionary extra_args.

If template is an object providing a read() method, that is used to gather the template contents; else, if a file named template exists, the template contents are read from it; otherwise, template is treated like a string providing the template contents itself.

gc3libs.utils.getattr_nested(obj, name)

Like Python’s getattr, but perform a recursive lookup if name contains any dots.
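A dotted lookup of this kind amounts to applying getattr once per name component. A minimal sketch, with the hypothetical name getattr_nested_sketch and made-up example classes:

```python
import functools


def getattr_nested_sketch(obj, name):
    """Resolve 'a.b.c' as getattr(getattr(getattr(obj, 'a'), 'b'), 'c')."""
    return functools.reduce(getattr, name.split('.'), obj)


class Engine(object):
    power = 90


class Car(object):
    engine = Engine()


print(getattr_nested_sketch(Car(), 'engine.power'))  # 90
```

As with plain getattr, a missing component raises AttributeError.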

gc3libs.utils.grep(pattern, filename)

Iterate over all lines in a file that match the pattern regular expression.
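The difference between grep and fgrep is only in how a line is tested: a regular-expression search versus a literal substring test. The generators below are a sketch with hypothetical names (grep_sketch, fgrep_sketch), assuming both yield the matching lines themselves:

```python
import re


def grep_sketch(pattern, filename):
    """Yield every line of the file where the regular expression matches."""
    regexp = re.compile(pattern)
    with open(filename) as lines:
        for line in lines:
            if regexp.search(line):  # unanchored search
                yield line


def fgrep_sketch(literal, filename):
    """Yield every line of the file that contains the literal string."""
    with open(filename) as lines:
        for line in lines:
            if literal in line:
                yield line
```

Since both are generators, a test such as `any(True for _ in grep_sketch(pattern, filename))` stops reading at the first matching line.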

gc3libs.utils.ifelse(test, if_true, if_false)

Return if_true if argument test evaluates to True; return if_false otherwise.

This is just a workaround for Python 2.4's lack of a conditional expression:

>>> a = 1
>>> b = ifelse(a, "yes", "no"); print b
yes
>>> b = ifelse(not a, 'yay', 'nope'); print b
nope

gc3libs.utils.irange(start, stop, step=1)

Iterate over all values greater than or equal to start and less than stop. (Or the reverse, if step < 0.)


>>> list(irange(1, 5))
[1, 2, 3, 4]
>>> list(irange(0, 8, 3))
[0, 3, 6]
>>> list(irange(8, 0, -2))
[8, 6, 4, 2]

Unlike the built-in range function, irange also accepts floating-point values:

>>> list(irange(0.0, 1.0, 0.5))
[0.0, 0.5]

Also unlike the built-in range, both start and stop have to be specified:

>>> irange(42)
Traceback (most recent call last):
TypeError: irange() takes at least 2 arguments (1 given)

Of course, a null step is not allowed:

>>> list(irange(1, 2, 0))
Traceback (most recent call last):
AssertionError: Null step in irange.
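The behaviour shown in these examples can be reproduced with a small generator. This is a sketch under the hypothetical name irange_sketch; accumulating the step by repeated addition is an assumption and, with floating-point steps, is subject to the usual rounding effects.

```python
def irange_sketch(start, stop, step=1):
    """Yield start, start+step, ... while strictly before stop."""
    # In a generator this check runs when iteration begins.
    assert step != 0, "Null step in irange."
    value = start
    if step > 0:
        while value < stop:
            yield value
            value += step
    else:
        while value > stop:
            yield value
            value += step


print(list(irange_sketch(1, 5)))           # [1, 2, 3, 4]
print(list(irange_sketch(8, 0, -2)))       # [8, 6, 4, 2]
print(list(irange_sketch(0.0, 1.0, 0.5)))  # [0.0, 0.5]
```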
gc3libs.utils.lock(path, timeout, create=True)

Lock the file at path. Raise a LockTimeout error if the lock cannot be acquired within timeout seconds.

Return a lock object that should be passed unchanged to the gc3libs.utils.unlock function.

If path points to a non-existent location, an empty file is created before attempting to lock (unless create is False). An attempt is made to remove the file in case an error happens.

See also: gc3libs.utils.unlock()

gc3libs.utils.mkdir(path, mode=511)

Like os.makedirs, but does not raise an exception if path already exists.

gc3libs.utils.mkdir_with_backup(path, mode=511)

Like os.makedirs, but if path already exists and is not empty, rename the existing one to a backup name (see the backup function).

Unlike os.makedirs, no exception is thrown if the directory already exists and is empty, but the target directory permissions are not altered to reflect mode.

gc3libs.utils.move_recursively(src, dst, overwrite=False, changed_only=True)

Move src to dst, descending it recursively if necessary.

The overwrite and changed_only optional arguments have the same effect as in copytree() (which see).

gc3libs.utils.movefile(src, dst, overwrite=False, changed_only=True, link=False)

Move a file from src to dst; return True if the move was actually made.

The overwrite and changed_only optional arguments have the same effect as in copyfile() (which see).

If dst is a directory, a file with the same basename as src is created (or overwritten) in the directory specified.

Return True or False, depending on whether the source file was actually moved to the destination.

See also: copyfile()

gc3libs.utils.movetree(src, dst, overwrite=False, changed_only=True)

Recursively move an entire directory tree rooted at src.

The overwrite and changed_only optional arguments have the same effect as in copytree() (which see).

See also: copytree().

gc3libs.utils.occurs(pattern, filename, match=<function grep>)

Return True if a line in filename matches pattern.

The match argument selects how exactly pattern is searched for in the contents of filename:

  • when match=grep (default), then pattern is a regular expression that is searched for (unanchored) in every line;
  • when match=fgrep, then pattern is a string that is searched for literally in every line;
  • more generally, the match function should return an iterator over matches of pattern within the contents of filename: if at least one match is found, occurs will return True.

Parameters:
  • pattern (str) – Pattern to search for
  • filename (str) – Path name of the file to search into
  • match – Function returning iterator over matches

Return minimum, maximum, and stepping value for a range.

Argument spec must be a string of the form LOW:HIGH:STEP, where LOW, HIGH and STEP are (integer or floating-point) numbers. Example:

>>> parse_range('1:10:2')
(1, 10, 2)

>>> parse_range('1.0:3.5:0.5')
(1.0, 3.5, 0.5)

Note that, as soon as any one of LOW, HIGH, STEP is not an integer, all of them are parsed as Python floats:

>>> parse_range('1:3:0.5')
(1.0, 3.0, 0.5)

>>> parse_range('1.0:3:1')
(1.0, 3.0, 1.0)

>>> parse_range('1:3.0:1')
(1.0, 3.0, 1.0)

The final part :STEP can be omitted if the step is 1:

>>> parse_range('2:5')
(2, 5, 1)

>>> parse_range('1.0:3.0')
(1.0, 3.0, 1.0)

Finally, note that parse_range does not perform any kind of check on the validity of the resulting range; so it is possible to parse a string into an empty range or range specification with stepping 0:

>>> parse_range('1:-5:10')
(1, -5, 10)

>>> parse_range('1:2:0')
(1, 2, 0)

As a special case to simplify user interfaces, a single number is accepted as a degenerate range: it will be parsed as a range whose minimum and maximum are equal to the given number:

>>> parse_range('42')
(42, 42, 1)
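The parsing rules illustrated above (default step of 1, a single number as a degenerate range, and promotion to floats as soon as one component is not an integer) can be sketched as follows. The name parse_range_sketch is hypothetical and the error handling for malformed input is an assumption.

```python
def parse_range_sketch(spec):
    """Parse 'LOW:HIGH:STEP', 'LOW:HIGH' or 'N' into a (low, high, step) tuple."""
    parts = spec.split(':')
    if len(parts) == 1:
        parts = [parts[0], parts[0], '1']  # single number: degenerate range
    elif len(parts) == 2:
        parts.append('1')                  # omitted step defaults to 1
    elif len(parts) != 3:
        raise ValueError("Cannot parse range specification '%s'" % spec)
    try:
        return tuple(int(part) for part in parts)
    except ValueError:
        # At least one component is not an integer: use floats throughout.
        return tuple(float(part) for part in parts)


print(parse_range_sketch('1:10:2'))  # (1, 10, 2)
print(parse_range_sketch('1:3:0.5')) # (1.0, 3.0, 0.5)
print(parse_range_sketch('42'))      # (42, 42, 1)
```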
gc3libs.utils.prettyprint(D, indent=0, width=0, maxdepth=None, step=4, only_keys=None, output=<open file '<stdout>', mode 'w'>, _key_prefix='', _exclude=None)

Print dictionary instance D in a YAML-like format. Each output line consists of:

  • indent spaces,
  • the key name,
  • a colon character :,
  • the associated value.

If the total line length exceeds width, the value is printed on the next line, indented by further step spaces; a value of 0 for width disables this line wrapping.

Optional argument only_keys can be a callable that must return True when called with keys that should be printed, or a list of key names to print.

Dictionary instances appearing as values are processed recursively (up to maxdepth nesting). Each nested instance is printed indented step spaces from the enclosing dictionary.

gc3libs.utils.progressive_number(qty=None, id_filename=None)

Return a positive integer, whose value is guaranteed to be monotonically increasing across different invocations of this function, and also across separate instances of the calling program.

This is accomplished by using a system-wide file which holds the “next available” ID. The location of this file can be set using the GC3PIE_ID_FILE environment variable, or programmatically using the id_filename argument. By default, the “next ID” file is located at ~/.gc3/next_id.txt:


>>> # create "next ID" file in a temporary location
>>> import tempfile, os
>>> (fd, tmp) = tempfile.mkstemp()

>>> n = progressive_number(id_filename=tmp)
>>> m = progressive_number(id_filename=tmp)
>>> m > n
True

If you specify a positive integer as argument, then a list of monotonically increasing numbers is returned. For example:

>>> ls = progressive_number(5, id_filename=tmp)
>>> len(ls)
5

(clean up test environment)

>>> os.remove(tmp)

In other words, progressive_number(N) is equivalent to:

nums = [ progressive_number() for n in range(N) ]

only more efficient, because it has to obtain and release the lock only once.

After every invocation of this function, the last returned number is stored into the file passed as argument id_filename. If the file does not exist, an attempt to create it is made before allocating an id; the method can raise an IOError or OSError if id_filename cannot be opened for writing.

Note: as file-level locking is used to serialize access to the counter file, this function may block (default timeout: 30 seconds) while trying to acquire the lock, or raise a LockTimeout exception if this fails.

Raise: LockTimeout, IOError, OSError
Returns: A positive integer number, monotonically increasing with every call. A list of such numbers if argument qty is a positive integer.

Return the whole contents of the file at path as a single string.


>>> read_contents('/dev/null')
''

>>> import tempfile
>>> (fd, tmpfile) = tempfile.mkstemp()
>>> w = open(tmpfile, 'w')
>>> w.write('hey')
>>> w.close()
>>> read_contents(tmpfile)
'hey'

(If you run this test, remember to do cleanup afterwards)

>>> os.remove(tmpfile)

Return a string describing Python object obj.

Avoids calling any Python magic methods, so should be safe to use as a ‘last resort’ in implementation of __str__ and __repr__.


Function decorator: sets the docstring of the following function to the one of referenced_fn.

Intended usage is for setting docstrings on methods redefined in derived classes, so that they inherit the docstring from the corresponding abstract method in the base class.

gc3libs.utils.samefile(path1, path2)

Like os.path.samefile but return False if either one of the paths does not exist.


Escape a string for safely passing as argument to a shell command.

Return a single-quoted string that expands to the exact literal contents of text when used as an argument to a shell command. Examples (note that backslashes are doubled because of Python’s string read syntax):

>>> print(sh_quote_safe("arg"))
>>> print(sh_quote_safe("'arg'"))
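A common way to obtain this kind of safe quoting is to wrap the text in single quotes and replace each embedded single quote with the sequence '\''. The sketch below uses that standard technique under the hypothetical name sh_quote_sketch; it is not claimed to produce byte-for-byte the same output as the GC3Libs function.

```python
import shlex


def sh_quote_sketch(text):
    """Single-quote text so a POSIX shell reads it back literally."""
    # Close the quote, emit an escaped quote, reopen the quote.
    return "'" + text.replace("'", "'\\''") + "'"


quoted = sh_quote_sketch("it's $HOME")
print(quoted)
# shlex.split follows POSIX shell quoting rules, so it can check the result:
print(shlex.split(quoted) == ["it's $HOME"])  # True
```

Inside single quotes the shell performs no expansion at all, which is why variables such as $HOME survive untouched.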

Single-quote a list of strings for passing to the shell as a command. Return the list of quoted arguments concatenated and separated by spaces.


>>> sh_quote_safe_cmdline(['sh', '-c', 'echo c(1,2,3)'])
"'sh' '-c' 'echo c(1,2,3)'"

Double-quote a string for passing as argument to a shell command.

Return a double-quoted string that expands to the contents of text but still allows variable expansion and \-escapes processing by the UNIX shell. Examples (note that backslashes are doubled because of Python’s string read syntax):

>>> print(sh_quote_unsafe("arg"))
>>> print(sh_quote_unsafe('"arg"'))
>>> print(sh_quote_unsafe(r'"\"arg\""'))

Double-quote a list of strings for passing to the shell as a command. Return the list of quoted arguments concatenated and separated by spaces.


>>> sh_quote_unsafe_cmdline(['sh', '-c', 'echo $HOME'])
'"sh" "-c" "echo $HOME"'

Convert word to a Python boolean value and return it. The strings true, yes, on, 1 (with any capitalization and any amount of leading and trailing spaces) are recognized as meaning Python True:

>>> string_to_boolean('yes')
>>> string_to_boolean('Yes')
>>> string_to_boolean('YES')
>>> string_to_boolean(' 1 ')
>>> string_to_boolean('True')
>>> string_to_boolean('on')

Any other word is considered as boolean False:

>>> string_to_boolean('no')
>>> string_to_boolean('No')
>>> string_to_boolean('Nay!')
>>> string_to_boolean('woo-hoo')

This also includes the empty string and whitespace-only strings:

>>> string_to_boolean('')
>>> string_to_boolean('  ')
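The rule described above reduces to a normalised membership test. A minimal sketch, with the hypothetical name string_to_boolean_sketch:

```python
def string_to_boolean_sketch(word):
    """Return True only for 'true', 'yes', 'on' or '1' (any case, any padding)."""
    return word.strip().lower() in ('true', 'yes', 'on', '1')


print(string_to_boolean_sketch(' YES '))  # True
print(string_to_boolean_sketch('Nay!'))   # False
print(string_to_boolean_sketch('  '))     # False
```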

Iterate over lines in iterable and return each of them stripped of leading and trailing blanks.

gc3libs.utils.tempdir(*args, **kwds)

A context manager for creating and then deleting a temporary directory.

All arguments are passed unchanged to the tempfile.mkdtemp standard library function.

(Original source and credits: http://stackoverflow.com/a/10965572/459543)
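A context manager of this kind is typically a thin wrapper around tempfile.mkdtemp and shutil.rmtree. The following sketch (hypothetical name tempdir_sketch) shows the general pattern; ignoring errors during removal is an assumption of this sketch.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def tempdir_sketch(*args, **kwds):
    """Create a temporary directory and delete it when the block exits."""
    path = tempfile.mkdtemp(*args, **kwds)
    try:
        yield path
    finally:
        # Runs on normal exit and on exceptions alike.
        shutil.rmtree(path, ignore_errors=True)


with tempdir_sketch(prefix='demo-') as workdir:
    open(os.path.join(workdir, 'scratch.txt'), 'w').close()
print(os.path.exists(workdir))  # False
```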

gc3libs.utils.test_file(path, mode, exception=<type 'exceptions.RuntimeError'>, isdir=False)

Test for access to a path; if access is not granted, raise an instance of exception with an appropriate error message. This is a frontend to os.access(), which see for exact semantics and the meaning of path and mode.

  • path – Filesystem path to test.
  • mode – See os.access()
  • exception – Class of exception to raise if test fails.
  • isdir – If True then also test that path points to a directory.

If the test succeeds, True is returned:

>>> test_file('/bin/cat', os.F_OK)
True
>>> test_file('/bin/cat', os.R_OK)
True
>>> test_file('/bin/cat', os.X_OK)
True
>>> test_file('/tmp', os.X_OK)
True

However, if the test fails, then an exception is raised:

>>> test_file('/bin/cat', os.W_OK)
Traceback (most recent call last):
RuntimeError: Cannot write to file '/bin/cat'.

If the optional argument isdir is True, then additionally test that path points to a directory inode:

>>> test_file('/tmp', os.F_OK, isdir=True)
True

>>> test_file('/bin/cat', os.F_OK, isdir=True)
Traceback (most recent call last):
RuntimeError: Expected '/bin/cat' to be a directory, but it's not.

Convert string s to an integer number of bytes. Suffixes like ‘KB’, ‘MB’, ‘GB’ (up to ‘YB’), with or without the trailing ‘B’, are allowed and properly accounted for. Case is ignored in suffixes.


>>> to_bytes('12')
>>> to_bytes('12B')
>>> to_bytes('12KB')
>>> to_bytes('1G')

Binary units ‘KiB’, ‘MiB’ etc. are also accepted:

>>> to_bytes('1KiB')
>>> to_bytes('1MiB')
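A suffix parser along these lines can be sketched with one regular expression. The name to_bytes_sketch is hypothetical, and the sketch assumes that decimal suffixes (KB, MB, ...) are powers of 1000 while binary ones (KiB, MiB, ...) are powers of 1024; check the library itself before relying on that convention.

```python
import re

_PREFIXES = 'kmgtpezy'  # kilo, mega, giga, tera, peta, exa, zetta, yotta


def to_bytes_sketch(s):
    """Convert a string such as '12KB' or '1MiB' to a number of bytes."""
    match = re.match(r'^\s*(\d+)\s*([kmgtpezy]?)(i?)b?\s*$', s.lower())
    # Reject unparsable input and a binary marker without a prefix ('12iB').
    if match is None or (match.group(3) and not match.group(2)):
        raise ValueError("Cannot parse '%s' as a byte size" % s)
    number, prefix, binary = match.groups()
    base = 1024 if binary else 1000  # assumption: KB = 1000, KiB = 1024
    exponent = _PREFIXES.index(prefix) + 1 if prefix else 0
    return int(number) * base ** exponent


print(to_bytes_sketch('12KB'))  # 12000
print(to_bytes_sketch('1MiB'))  # 1048576
```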

Ensure a regular file exists at path.

If the file already exists, its access and modification times are updated.

(This is a very limited and stripped down version of the touch POSIX utility.)


Iterate over all unique elements in sequence seq.

Distinct values are returned in a sorted fashion.


>>> for value in uniq([4,1,1,2,3,1,2]): print value
>>> for value in uniq([1,2,3,4]): print value
>>> for value in uniq([1,1,1,1]): print value
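Given that distinct values are returned in sorted order, the simplest sketch is to sort a set built from the sequence. The name uniq_sketch is hypothetical, and this version assumes the items are hashable and mutually comparable:

```python
def uniq_sketch(seq):
    """Yield each distinct element of seq once, in sorted order."""
    for value in sorted(set(seq)):
        yield value


print(list(uniq_sketch([4, 1, 1, 2, 3, 1, 2])))  # [1, 2, 3, 4]
print(list(uniq_sketch([1, 1, 1, 1])))           # [1]
```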

Release a previously-acquired lock.

Argument lock should be the return value of a previous gc3libs.utils.lock call.

See also: gc3libs.utils.lock()

gc3libs.utils.update_parameter_in_file(path, var_in, new_val, regex_in)

Updates a parameter value in a parameter file using predefined regular expressions in _loop_regexps.

  • path – Full path to the parameter file.
  • var_in – The variable to modify.
  • new_val – The updated parameter value.
  • regex_in – Name of the regular expression that describes the format of the parameter file.

gc3libs.utils.write_contents(path, data)

Overwrite the contents of the file at path with the given data. If the file does not exist, it is created.


>>> import tempfile
>>> (fd, tmpfile) = tempfile.mkstemp()
>>> write_contents(tmpfile, 'big data here')
>>> read_contents(tmpfile)
'big data here'

(If you run this test, remember to clean up afterwards)

>>> os.remove(tmpfile)