gc3libs.url

Utility classes and methods for dealing with URLs.

class gc3libs.url.Url

Represent a URL as a named-tuple object. Instances are immutable: their fields cannot be changed after creation.

The following read-only attributes are defined on objects of class Url.

Attribute  Index  Value                                Value if not present
scheme     0      URL scheme specifier                 empty string
netloc     1      Network location part                empty string
path       2      Hierarchical path                    empty string
query      3      Query component                      empty string
hostname   4      Host name (lower case)               None
port       5      Port number as integer (if present)  None
username   6      User name                            None
password   7      Password                             None
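
The attribute names in this table also exist on the result of urllib.parse.urlsplit in the Python 3 standard library, so their meaning can be explored without gc3libs installed. The sketch below is an analogy only, not the Url implementation: urlsplit does not default the scheme to file, for instance. The helper describe and the sample URL are hypothetical and exist only for illustration.

```python
from urllib.parse import urlsplit

# Attribute names taken from the table above.
FIELDS = ('scheme', 'netloc', 'path', 'query',
          'hostname', 'port', 'username', 'password')


def describe(urlstring):
    """Return a dict mapping each attribute name to its parsed value."""
    parts = urlsplit(urlstring)
    return {name: getattr(parts, name) for name in FIELDS}


# A made-up URL in which every part is present...
full = describe('http://alice:secret@WWW.Example.org:8080/data?foo=bar')
# ...and a bare path in which most parts are missing.
bare = describe('/tmp/foo')

# Print present and missing values side by side.
for name in FIELDS:
    print('%-8s  %-40r  %r' % (name, full[name], bare[name]))
```

In the bare-path column the string-valued parts come back as empty strings while hostname, port, username and password come back as None, matching the last column of the table.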

There are two ways of constructing Url objects:

  • By passing a string urlstring:

    >>> u = Url('http://www.example.org/data')
    
    >>> u.scheme
    'http'
    >>> u.netloc
    'www.example.org'
    >>> u.path
    '/data'
    

    The default URL scheme is file:

    >>> u = Url('/tmp/foo')
    >>> u.scheme
    'file'
    >>> u.path
    '/tmp/foo'
    

    Note that a leading double slash ‘//’ is interpreted as the beginning of a network location; with three leading slashes the network location is empty:

    >>> u = Url('//foo/bar')
    >>> u.path
    '/bar'
    >>> u.netloc
    'foo'
    >>> Url('///foo/bar').path
    '/foo/bar'
    

    See RFC 3986 (http://tools.ietf.org/html/rfc3986) for details.

    If force_abs is True (default), then the path attribute is made absolute, by calling os.path.abspath if necessary:

    >>> u = Url('foo/bar', force_abs=True)
    >>> os.path.isabs(u.path)
    True
    

    Otherwise, if force_abs is False, then the path attribute stores the passed string unchanged:

    >>> u = Url('foo', force_abs=False)
    >>> os.path.isabs(u.path)
    False
    >>> u.path
    'foo'
    

    Other keyword arguments can specify defaults for missing parts of the URL:

    >>> u = Url('/tmp/foo', scheme='file', netloc='localhost')
    >>> u.scheme
    'file'
    >>> u.netloc
    'localhost'
    >>> u.path
    '/tmp/foo'
    

    Query attributes are also supported:

    >>> u = Url('http://www.example.org?foo=bar')
    >>> u.query
    'foo=bar'
    
  • By passing keyword arguments only, to construct a Url object with exactly those values for the named fields:

    >>> u = Url(scheme='http', netloc='www.example.org', path='/data')
    

    In this form, the force_abs parameter is ignored.
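
The string-construction rules above (default scheme, network location after a double slash, force_abs, keyword defaults) can be approximated with the standard library. This is a minimal sketch under stated assumptions, not the code used by gc3libs: parse_like_url is a hypothetical helper, it returns a plain tuple rather than a Url, and the choice to make only file paths absolute is an assumption of the sketch.

```python
import os.path
from urllib.parse import urlsplit


def parse_like_url(urlstring, force_abs=True, **defaults):
    """Hypothetical helper mimicking the documented construction rules.

    Returns a (scheme, netloc, path, query) tuple.
    """
    parts = urlsplit(urlstring)
    # Missing parts fall back to keyword defaults; the scheme falls
    # back to 'file' when nothing else is given.
    scheme = parts.scheme or defaults.get('scheme', 'file')
    netloc = parts.netloc or defaults.get('netloc', '')
    path = parts.path
    # Assumption of this sketch: only local file paths are made absolute.
    if force_abs and scheme == 'file' and not os.path.isabs(path):
        path = os.path.abspath(path)
    return scheme, netloc, path, parts.query


print(parse_like_url('/tmp/foo'))
print(parse_like_url('//foo/bar', force_abs=False))
print(parse_like_url('///foo/bar', force_abs=False))
print(parse_like_url('foo', force_abs=False))
print(parse_like_url('http://www.example.org?foo=bar'))
```

The second and third calls show why the number of leading slashes matters: urlsplit treats whatever follows ‘//’ up to the next ‘/’ as the network location, so ‘//foo/bar’ has network location foo while ‘///foo/bar’ has an empty one.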

See also: http://goo.gl/9WcRvR

adjoin(relpath)

Return a new Url, constructed by appending relpath to the path section of this URL.

Example:

>>> u0 = Url('http://www.example.org')
>>> u1 = u0.adjoin('data')
>>> str(u1)
'http://www.example.org/data'

>>> u2 = u1.adjoin('moredata')
>>> str(u2)
'http://www.example.org/data/moredata'

Even if relpath starts with /, it is still appended to the path in the base URL:

>>> u3 = u2.adjoin('/evenmore')
>>> str(u3)
'http://www.example.org/data/moredata/evenmore'

An optional query component is left untouched:

>>> u4 = Url('http://www.example.org?bar')
>>> u5 = u4.adjoin('foo')
>>> str(u5)
'http://www.example.org/foo?bar'
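
The appending behaviour differs from standard relative-URL resolution, where a reference starting with / replaces the whole path. The sketch below contrasts the two using only the standard library; adjoin_sketch is a hypothetical stand-in that operates on strings, written to reproduce the examples above, and is not the gc3libs implementation.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def adjoin_sketch(urlstring, relpath):
    """Hypothetical stand-in: append relpath to the path of urlstring."""
    parts = urlsplit(urlstring)
    # Join with exactly one slash, whether or not relpath starts with '/'.
    new_path = parts.path.rstrip('/') + '/' + relpath.lstrip('/')
    # Only the path changes; the query and other parts are kept as-is.
    return urlunsplit(parts._replace(path=new_path))


base = 'http://www.example.org/data/moredata'
appended = adjoin_sketch(base, '/evenmore')  # path is extended
resolved = urljoin(base, '/evenmore')        # path is replaced
print(appended)
print(resolved)
```

Here appended keeps the existing /data/moredata prefix, whereas urljoin resolves /evenmore against the server root and discards it.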

class gc3libs.url.UrlKeyDict(iter_or_dict=None, force_abs=False)

A dictionary class enforcing that all keys are URLs.

Strings and/or objects returned by urlparse can be used as keys. Setting a string key automatically translates it to a URL:

>>> d = UrlKeyDict()
>>> d['/tmp/foo'] = 1
>>> for k in d.keys(): print (type(k), k.path) 
(<class '....Url'>, '/tmp/foo')

Retrieving the value associated with a key works with either the string or the Url form of the key:

>>> d['/tmp/foo']
1
>>> d[Url('/tmp/foo')]
1

Membership tests likewise accept either the string or the Url form of the key:

>>> '/tmp/foo' in d
True
>>> Url('/tmp/foo') in d
True
>>> 'file:///tmp/foo' in d
True
>>> 'http://example.org' in d
False
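
One way to obtain this behaviour is to normalize every key before it reaches the underlying dict, so that different spellings of the same URL hash and compare equal. The following is a minimal sketch of that idea, assuming keys are normalized to urllib.parse split results with a default scheme of file; NormalizedKeyDict and normalize are hypothetical names, and the actual UrlKeyDict may be implemented differently.

```python
from urllib.parse import urlsplit


def normalize(key):
    """Turn a string (or an already-split URL) into a canonical key."""
    parts = urlsplit(key) if isinstance(key, str) else key
    # Assumption: a missing scheme means a local file.
    return parts if parts.scheme else parts._replace(scheme='file')


class NormalizedKeyDict(dict):
    """Hypothetical dict that normalizes keys on every access."""

    def __setitem__(self, key, value):
        super().__setitem__(normalize(key), value)

    def __getitem__(self, key):
        return super().__getitem__(normalize(key))

    def __contains__(self, key):
        return super().__contains__(normalize(key))


d = NormalizedKeyDict()
d['/tmp/foo'] = 1
print(d['file:///tmp/foo'])        # same key, different spelling
print('http://example.org' in d)   # a different URL is not a member
```

Because the split result is a named tuple, it is hashable and compares field by field, which is what lets '/tmp/foo' and 'file:///tmp/foo' resolve to the same entry.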

Class UrlKeyDict supports initialization by copying items from another dict instance or from an iterable of (key, value) pairs:

>>> d1 = UrlKeyDict({ '/tmp/foo':'foo', '/tmp/bar':'bar' })
>>> d2 = UrlKeyDict([ ('/tmp/foo', 'foo'), ('/tmp/bar', 'bar') ])
>>> d1 == d2
True

Unlike dict, initialization from keyword arguments alone is not supported:

>>> d3 = UrlKeyDict(foo='foo') 
Traceback (most recent call last):
    ...
TypeError: __init__() got an unexpected keyword argument 'foo'

An empty UrlKeyDict instance is returned by the constructor when called with no parameters:

>>> d0 = UrlKeyDict()
>>> len(d0)
0

If force_abs is True, then all paths are converted to absolute ones in the dictionary keys.

>>> d = UrlKeyDict(force_abs=True)
>>> d['foo'] = 1
>>> for k in d.keys(): print(os.path.isabs(k.path))
True
>>> d = UrlKeyDict(force_abs=False)
>>> d['foo'] = 2
>>> for k in d.keys(): print(os.path.isabs(k.path))
False

class gc3libs.url.UrlValueDict(iter_or_dict=None, force_abs=False, **extra_args)

A dictionary class enforcing that all values are URLs.

Strings and/or objects returned by urlparse can be used as values. Setting a string value automatically translates it to a URL:

>>> d = UrlValueDict()
>>> d[1] = '/tmp/foo'
>>> d[2] = Url('file:///tmp/bar')
>>> for v in d.values(): print (type(v), v.path) 
(<class '....Url'>, '/tmp/foo')
(<class '....Url'>, '/tmp/bar')

Retrieving the value associated with a key always returns the URL-type value, regardless of how it was set:

>>> repr(d[1]) == "Url(scheme='file', netloc='', path='/tmp/foo', " "hostname=None, port=None, query='', username=None, password=None)"
True

Class UrlValueDict supports initialization by any of the methods that work with a plain dict instance:

>>> d1 = UrlValueDict({ 'foo':'/tmp/foo', 'bar':'/tmp/bar' })
>>> d2 = UrlValueDict([ ('foo', '/tmp/foo'), ('bar', '/tmp/bar') ])
>>> d3 = UrlValueDict(foo='/tmp/foo', bar='/tmp/bar')

>>> d1 == d2
True
>>> d2 == d3
True
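
Supporting all three dict-style initializers comes down to routing every item through a single conversion point. The sketch below shows one way to do that with the standard library; NormalizedValueDict and to_url are hypothetical names, values are converted to urllib.parse split results rather than Url objects, and the real UrlValueDict may work differently.

```python
from urllib.parse import urlsplit


def to_url(value):
    """Turn a string (or an already-split URL) into a split result."""
    parts = urlsplit(value) if isinstance(value, str) else value
    # Assumption: a missing scheme means a local file.
    return parts if parts.scheme else parts._replace(scheme='file')


class NormalizedValueDict(dict):
    """Hypothetical dict that converts every stored value with to_url."""

    def __init__(self, iter_or_dict=None, **extra_args):
        super().__init__()
        if iter_or_dict is not None:
            self.update(iter_or_dict)
        self.update(extra_args)

    def __setitem__(self, key, value):
        super().__setitem__(key, to_url(value))

    def update(self, other=(), **kwargs):
        # Funnel every item through __setitem__ so it gets converted.
        for key, value in dict(other, **kwargs).items():
            self[key] = value


d1 = NormalizedValueDict({'foo': '/tmp/foo', 'bar': '/tmp/bar'})
d2 = NormalizedValueDict([('foo', '/tmp/foo'), ('bar', '/tmp/bar')])
d3 = NormalizedValueDict(foo='/tmp/foo', bar='/tmp/bar')
print(d1 == d2 == d3)
print(d1['foo'].scheme, d1['foo'].path)
```

Overriding update as well as __setitem__ matters here because the built-in dict constructor and dict.update do not call an overridden __setitem__.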

In particular, an empty UrlValueDict instance is returned by the constructor when called with no parameters:

>>> d0 = UrlValueDict()
>>> len(d0)
0

If force_abs is True, then all paths are converted to absolute ones in the dictionary values.

>>> d = UrlValueDict(force_abs=True)
>>> d[1] = 'foo'
>>> for v in d.values(): print(os.path.isabs(v.path))
True
>>> d = UrlValueDict(force_abs=False)
>>> d[2] = 'foo'
>>> for v in d.values(): print(os.path.isabs(v.path))
False