Warholize is a GC3Pie demo application that produces, from a generic picture, a new picture in the style of Warhol’s famous work Marilyn. The script uses the powerful ImageMagick set of tools (at least version 6.3.5-7). This tutorial assumes that both ImageMagick and GC3Pie are already installed and configured.
In order to produce a similar image we have to apply a series of transformations to the picture:
1. Convert the original image to grayscale.
2. Colorize the grayscale image using three different colors each time, based on the gray levels. We may, for instance, paint all pixels with luminosity between 0% and 33% in red, pixels between 34% and 66% in yellow, and pixels between 67% and 100% in green. To do that, we first have to:
   - create a Color Lookup Table (LUT) using a combination of three randomly chosen colors
   - apply the LUT to the grayscale image
3. Finally, merge together all the colorized images and produce our warholized image.
Clearly, step 2) depends on step 1), and step 3) depends on step 2), so we basically have a sequence of tasks; but since step 2) needs to create N different independent images, we can parallelize this step.
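The shape of this workflow (one step, then N independent steps, then a final merge) can be sketched with the standard library alone. This is only an illustration of the dependency structure, not how GC3Pie schedules jobs: the three functions are hypothetical placeholders that manipulate file names instead of images, and a thread pool stands in for the parallel execution.

```python
from concurrent.futures import ThreadPoolExecutor


def to_grayscale(image):
    # Step 1 (placeholder): pretend to convert the image to grayscale
    return "grayscaled_%s" % image


def colorize(gray_image, index):
    # Step 2 (placeholder): pretend to produce the index-th colorized copy
    return "%s.%d" % (gray_image, index)


def merge(colorized_images):
    # Step 3 (placeholder): pretend to merge all copies into one picture
    return "warhol(%s)" % ", ".join(colorized_images)


def warholize(image, copies=4):
    gray = to_grayscale(image)              # must finish before step 2
    with ThreadPoolExecutor() as pool:      # the N copies are independent
        colorized = list(pool.map(lambda i: colorize(gray, i), range(copies)))
    return merge(colorized)                 # needs all the copies


result = warholize("lena.tiff", copies=4)
```

The rest of this tutorial builds the same structure out of GC3Pie task collections.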
The SessionBasedScript class in the gc3libs.cmdline module is used to create a generic script. It already has everything needed to read GC3Pie configuration files, manage resources, schedule jobs, etc. The only missing thing is, well, your application!
Let’s start by creating a new empty file and importing some basic modules:
import os

import gc3libs
from gc3libs.cmdline import SessionBasedScript
We then create a class which inherits from SessionBasedScript (in GC3Pie, most customizations are done by inheriting from a more generic class and overriding the __init__ method and possibly others):
class WarholizeScript(SessionBasedScript):
    """
    Demo script to create a `Warholized` version of an image.
    """
    version = '1.0'
Please note that you must either write a small docstring or add a description attribute. These values are used when the script is called with the options --help or --version, which are automatically added by GC3Pie.
The way we want to use our script is straightforward:
$ warholize.py inputfile [inputfiles ...]
and this will create a directory Warholized.<inputfile> in which there will be a file called warhol_<inputfile> containing the desired warholized image (and a lot of temporary files, at least for now).
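As a small illustration of this naming scheme, the sketch below derives both names from an input path. The helper output_names is hypothetical (it is not part of the script); it only mirrors the string formatting used later in this tutorial, where the directory and file names are built from the base name of the input file.

```python
import os


def output_names(input_file):
    """Return (output directory, output image name) for `input_file`."""
    name = os.path.basename(input_file)   # drop any leading directories
    return 'Warholized.%s' % name, 'warhol_%s' % name


output_dir, output_image = output_names('/home/user/pictures/lena.tiff')
```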
But we may want to add some additional options to the script, in order to decide how many colorized pictures the warholized image will be made of, or whether we want to resize the image. SessionBasedScript uses the PyCLI module which is, in turn, a wrapper around the standard argparse module (or optparse, for older Python versions). To customize the script you may define a setup_options method and put in there some calls to SessionBasedScript.add_param(), which is inherited from cli.app.CommandLineApp:
    def setup_options(self):
        self.add_param('--copies', default=4, type=int,
                       help="Number of copies (Default:4). It has to be a perfect square!")
In this example we will accept a --copies option to define how many colorized copies the final picture will be made of. Please refer to the documentation of the PyCLI module for details on the syntax of the add_param method.
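Since the text says PyCLI wraps argparse, a stand-alone sketch with plain argparse may help to see what such an option does. This is not the PyCLI or GC3Pie API: the parser below is a hypothetical stand-in, and the perfect_square helper is an assumption that moves the "perfect square" check to parsing time (the tutorial itself performs that check later, inside the workflow). It needs Python 3.8 or later for math.isqrt.

```python
import argparse
import math


def perfect_square(text):
    """Parse `text` as a positive integer that is a perfect square."""
    value = int(text)
    root = math.isqrt(value) if value > 0 else 0
    if value <= 0 or root * root != value:
        raise argparse.ArgumentTypeError(
            "%r is not a positive perfect square" % text)
    return value


# Hypothetical stand-alone parser, only for illustration
parser = argparse.ArgumentParser(description="Warholize option demo")
parser.add_argument('--copies', default=4, type=perfect_square,
                    help="Number of copies (default: 4). "
                         "It has to be a perfect square!")
parser.add_argument('args', nargs='+', help="Input image files")

params = parser.parse_args(['--copies', '9', 'lena.tiff'])
```

Here params.copies holds the parsed integer and params.args the list of input files, which is the same shape of data the script reads from self.params.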
The heart of the script is, however, the new_tasks method, which will be called to create the initial tasks of the script. In our case it will be something like:
    def new_tasks(self, extra):
        gc3libs.log.info("Creating main sequential task")
        for (i, input_file) in enumerate(self.params.args):
            extra_args = extra.copy()
            extra_args['output_dir'] = 'Warholized.%s' % os.path.basename(input_file)
            yield WarholizeWorkflow(input_file,
                                    self.params.copies,
                                    **extra_args)
new_tasks is used as a generator (but it could return a list as well). Each yielded object is a task. In GC3Pie, a task is either a single application or a complex workflow, and represents an execution unit. In our case we create a WarholizeWorkflow task, which is the workflow described before.
We yield a different WarholizeWorkflow task for each input file. These tasks will run in parallel.
Please note that we are using the gc3libs.log module to log information about the execution. This module works like the logging module and has methods like error, warning, info or debug, but its logging level is automatically configured by SessionBasedScript’s constructor. This way you can increase the verbosity of your script by simply adding -v options from the command line.
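For readers unfamiliar with this convention, here is a minimal sketch of how repeated -v flags are commonly turned into a logging level with the standard logging and argparse modules. The exact mapping used by SessionBasedScript is not described here, so the level_for function below is an assumption made purely for illustration.

```python
import argparse
import logging

parser = argparse.ArgumentParser()
# Each occurrence of -v increments the counter: -v is 1, -vv is 2, ...
parser.add_argument('-v', '--verbose', action='count', default=0)


def level_for(verbosity):
    # Start at WARNING and lower the threshold by one level per -v,
    # never going below DEBUG (assumed mapping, not GC3Pie's own).
    return max(logging.DEBUG, logging.WARNING - 10 * verbosity)


params = parser.parse_args(['-vv'])
log = logging.getLogger('warholize-demo')
log.setLevel(level_for(params.verbose))
```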
Main sequential workflow
The module gc3libs.workflow contains two main objects: SequentialTaskCollection and ParallelTaskCollection. They execute tasks in serial and in parallel, respectively. We will use both of them to create our workflow; the first one, WarholizeWorkflow, is a sequential task, therefore we have to inherit from SequentialTaskCollection and customize its __init__ method:
from gc3libs.workflow import SequentialTaskCollection, ParallelTaskCollection
import math
from gc3libs import Run

class WarholizeWorkflow(SequentialTaskCollection):
    """
    Main workflow.
    """

    def __init__(self, input_image, copies, **extra_args):
        self.input_image = input_image
        self.output_image = "warhol_%s" % os.path.basename(input_image)

        gc3libs.log.info(
            "Producing a warholized version of input file %s "
            "and store it in %s" % (input_image, self.output_image))

        self.output_dir = os.path.relpath(extra_args.get('output_dir'))

        self.copies = copies

        # Check that copies is a perfect square
        if math.sqrt(self.copies) != int(math.sqrt(self.copies)):
            raise gc3libs.exceptions.InvalidArgument(
                "`copies` argument must be a perfect square.")

        self.jobname = extra_args.get('jobname', 'WarholizedWorkflow')

        self.grayscaled_image = "grayscaled_%s" % os.path.basename(self.input_image)
Up to now we just parsed the arguments. The following lines, instead, create the first task that we want to execute. For now, we can create only the first one, GrayScaleConvertApplication, which will produce a grayscale image from the input file:
        self.tasks = [
            GrayScaleConvertApplication(
                self.input_image, self.grayscaled_image, self.output_dir,
                self.output_dir),
            ]
Finally, we call the parent’s constructor:
        SequentialTaskCollection.__init__(self, self.tasks)
This will create the initial task list, but we also have to run steps 2 and 3, and this is done by creating a next method. This method will be called after all the tasks in self.tasks are finished. We cannot create all the jobs at once because we don’t have all the needed input files yet. Please note that by creating the tasks in the next method you can decide at runtime which tasks to run next and what arguments to give them.
In our case, however, the next method is quite simple:
    def next(self, iteration):
        last = self.tasks[-1]

        if iteration == 0:
            # first time we got called.  We have the grayscaled image,
            # we have to run the Tricolorize task.
            self.add(TricolorizeMultipleImages(
                os.path.join(self.output_dir, self.grayscaled_image),
                self.copies, self.output_dir))
            return Run.State.RUNNING
        elif iteration == 1:
            # second time, we already have the colorized images, we
            # have to merge them together.
            self.add(MergeImagesApplication(
                os.path.join(self.output_dir, self.grayscaled_image),
                last.warhol_dir,
                self.output_image))
            return Run.State.RUNNING
        else:
            self.execution.returncode = last.execution.returncode
            return Run.State.TERMINATED
At each iteration, we call self.add() to add an instance of a task-like class (gc3libs.Application, gc3libs.workflow.ParallelTaskCollection or gc3libs.workflow.SequentialTaskCollection, in our case) to complete the next step, and we return the current state, which will be gc3libs.Run.State.RUNNING unless we have finished the computation.
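The control flow of next can be tried out without GC3Pie installed. The toy below is a hypothetical stand-in, not the real SequentialTaskCollection: tasks are plain strings, states are plain strings, and the run function plays the role of the engine that executes the current task and then asks the collection what to do next.

```python
class ToySequence:
    """Hypothetical stand-in for a sequential collection (not GC3Pie code)."""

    def __init__(self, first_task):
        self.tasks = [first_task]

    def add(self, task):
        self.tasks.append(task)

    def next(self, iteration):
        # Called after task number `iteration` finished: decide what follows
        if iteration == 0:
            self.add("tricolorize")
            return "RUNNING"
        elif iteration == 1:
            self.add("merge")
            return "RUNNING"
        return "TERMINATED"


def run(sequence):
    executed = []
    iteration = 0
    state = "RUNNING"
    while state == "RUNNING":
        executed.append(sequence.tasks[iteration])  # "run" the current task
        state = sequence.next(iteration)             # ask what comes next
        iteration += 1
    return executed


executed = run(ToySequence("grayscale"))
```

The point of the pattern is that the second and third tasks do not exist until the previous one has finished, so their arguments can depend on its results.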
Step one: convert to grayscale
GrayScaleConvertApplication is the application responsible for converting the input image to grayscale. The command we want to execute is:
$ convert -colorspace gray <input_image> grayscaled_<input_image>
To create a generic application we create a class which inherits from gc3libs.Application, and we usually only need to customize the __init__ method:
# A useful function to copy files
from shutil import copyfile

class GrayScaleConvertApplication(gc3libs.Application):
    def __init__(self, input_image, grayscaled_image, output_dir, warhol_dir):
        self.warhol_dir = warhol_dir
        self.grayscaled_image = grayscaled_image

        arguments = [
            'convert',
            os.path.basename(input_image),
            '-colorspace',
            'gray',
            ]

        gc3libs.log.info(
            "Creating GrayScale convert application from file %s "
            "to file %s" % (input_image, grayscaled_image))

        gc3libs.Application.__init__(
            self,
            arguments = arguments + [grayscaled_image],
            inputs = [input_image],
            outputs = [grayscaled_image, 'stderr.txt', 'stdout.txt'],
            output_dir = output_dir,
            stdout = 'stdout.txt',
            stderr = 'stderr.txt',
            )
Creating a gc3libs.Application is straightforward: you just call the constructor with the executable, the arguments, and the input/output files you will need.
If you don’t specify the output_dir directory, gc3pie libraries will create one starting from the class name. If the output directory exists already, the old one will be renamed.
To do any kind of post-processing you can define a terminated method for your application. It will be called after your application has terminated. In our case we want to copy the grayscale version of the image to the warhol_dir, so that it will be easily reachable by all other applications:
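    def terminated(self):
        """Copy grayscale image to the main output dir"""
        src = os.path.join(self.output_dir, self.grayscaled_image)
        dst = os.path.join(self.warhol_dir, self.grayscaled_image)
        # copyfile needs the full destination file name and refuses to
        # copy a file onto itself, so skip the copy in that case
        if os.path.abspath(src) != os.path.abspath(dst):
            copyfile(src, dst)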
Step two: parallel workflow to create colorized images
TricolorizeMultipleImages is responsible for creating multiple versions of the grayscale image, each colorized with a different set of colors chosen randomly from a list of available colors. It does so by running multiple instances of TricolorizeImage with different arguments. Since we want to run the various colorizations in parallel, it inherits from the gc3libs.workflow.ParallelTaskCollection class. As we did for GrayScaleConvertApplication, we only need to customize the constructor __init__, creating the various subtasks we want to run:
import itertools
import random

class TricolorizeMultipleImages(ParallelTaskCollection):
    colors = ['yellow', 'blue', 'red', 'pink', 'orchid',
              'indigo', 'navy', 'turquoise1', 'SeaGreen', 'gold',
              'orange', 'magenta']

    def __init__(self, grayscaled_image, copies, output_dir):
        gc3libs.log.info(
            "TricolorizeMultipleImages for %d copies run" % copies)
        self.jobname = "Warholizer_Parallel"
        ncolors = 3
        ### XXX Why I have to use basename???
        self.output_dir = os.path.join(
            os.path.basename(output_dir), 'tricolorize')
        self.warhol_dir = output_dir

        # Compute a unique sequence of random combinations of colors.
        # With N = len(colors) there are N! / (3! * (N-3)!) distinct
        # combinations, so `copies` cannot exceed that number.
        combinations = list(itertools.combinations(self.colors, ncolors))
        assert copies <= len(combinations)
        combinations = random.sample(combinations, copies)

        # Create all the single tasks
        self.tasks = []
        for i, colors in enumerate(combinations):
            self.tasks.append(TricolorizeImage(
                os.path.relpath(grayscaled_image),
                "%s.%d" % (self.output_dir, i),
                "%s.%d" % (grayscaled_image, i),
                colors,
                self.warhol_dir))

        ParallelTaskCollection.__init__(self, self.tasks)
The main loop will fill the self.tasks list with various TricolorizeImage tasks, each one with a unique combination of three colors to use to generate the colorized image. The GC3Pie framework will then run these tasks in parallel, on any available resource.
The TricolorizeImage class is in turn a SequentialTaskCollection, since it has to generate the LUT first, and then apply it to the grayscale image. We already saw how to create a SequentialTaskCollection: we modify the constructor in order to add the first job (CreateLutApplication), and the next method will take care of running the ApplyLutApplication application on the output of the first job:
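The selection of unique color triples can be checked in isolation with the standard library. The sketch below uses a shortened color list and hypothetical variable names; math.comb (Python 3.8 or later) gives the number of distinct unordered triples, which is the real upper bound on the number of copies.

```python
import itertools
import math
import random

colors = ['yellow', 'blue', 'red', 'pink', 'orchid', 'indigo']
ncolors = 3
copies = 4

# Every unordered triple of distinct colors, each appearing exactly once
all_triples = list(itertools.combinations(colors, ncolors))
assert len(all_triples) == math.comb(len(colors), ncolors)

# random.sample never picks the same element twice, so the chosen
# triples are guaranteed to differ from one another
chosen = random.sample(all_triples, copies)
```

random.sample raises ValueError when asked for more items than the population holds, so requesting too many copies fails loudly instead of silently repeating a combination.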
class TricolorizeImage(SequentialTaskCollection):
    """
    Sequential workflow to produce a `tricolorized` version of a
    grayscale image
    """
    def __init__(self, grayscaled_image, output_dir, output_file,
                 colors, warhol_dir):
        self.grayscaled_image = grayscaled_image
        self.output_dir = output_dir
        self.warhol_dir = warhol_dir
        self.jobname = 'TricolorizeImage'
        self.output_file = output_file

        if not os.path.isdir(output_dir):
            os.mkdir(output_dir)

        gc3libs.log.info(
            "Tricolorize image %s to %s" % (
                self.grayscaled_image, self.output_file))

        self.tasks = [
            CreateLutApplication(
                self.grayscaled_image,
                "%s.miff" % self.grayscaled_image,
                self.output_dir,
                colors, self.warhol_dir),
            ]

        SequentialTaskCollection.__init__(self, self.tasks)

    def next(self, iteration):
        last = self.tasks[-1]
        if iteration == 0:
            # First time we got called. The LUT has been created, we
            # have to apply it to the grayscale image
            self.add(ApplyLutApplication(
                self.grayscaled_image,
                os.path.join(last.output_dir, last.lutfile),
                os.path.basename(self.output_file),
                self.output_dir, self.warhol_dir))
            return Run.State.RUNNING
        else:
            self.execution.returncode = last.execution.returncode
            return Run.State.TERMINATED
The CreateLutApplication is again an application which inherits from gc3libs.Application. The command we want to execute is something like:
$ convert -size 1x1 xc:<color1> xc:<color2> xc:<color3> +append -resize 256x1! <output_file.miff>
This will basically create an image 256x1 pixels big, made of a gradient using all the listed colors. The code will look like:
class CreateLutApplication(gc3libs.Application):
    """Create the LUT for the image using 3 colors picked randomly
    from CreateLutApplication.colors"""

    def __init__(self, input_image, output_file, output_dir, colors, working_dir):
        self.lutfile = os.path.basename(output_file)
        self.working_dir = working_dir
        gc3libs.log.info("Creating lut file %s from %s using "
                         "colors: %s" % (
                self.lutfile, input_image, str.join(", ", colors)))
        gc3libs.Application.__init__(
            self,
            arguments = [
                'convert',
                '-size',
                '1x1'] + [
                "xc:%s" % color for color in colors] + [
                '+append',
                '-resize',
                '256x1!',
                self.lutfile,
                ],
            inputs = [input_image],
            outputs = [self.lutfile, 'stdout.txt', 'stderr.txt'],
            output_dir = output_dir + '.createlut',
            stdout = 'stdout.txt',
            stderr = 'stderr.txt',
            )
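To get a feeling for what a 256-entry gradient LUT contains, here is a pure-Python sketch that blends linearly through a list of RGB colors. It is only an approximation of the idea: ImageMagick's -resize uses its own resampling filter, so the actual pixel values in the .miff file will differ. The build_lut function and the RGB values chosen for the colors are assumptions for illustration.

```python
def build_lut(colors, size=256):
    """Return `size` RGB tuples blending linearly through `colors`."""
    lut = []
    segments = len(colors) - 1
    for n in range(size):
        # Position of entry n along the whole gradient, in [0, segments]
        pos = n * segments / (size - 1)
        i = min(int(pos), segments - 1)   # index of the left-hand color
        t = pos - i                       # blend factor within the segment
        left, right = colors[i], colors[i + 1]
        lut.append(tuple(round(a + (b - a) * t)
                         for a, b in zip(left, right)))
    return lut


red, yellow, green = (255, 0, 0), (255, 255, 0), (0, 128, 0)
lut = build_lut([red, yellow, green])
```

The first entry is the first color, the last entry is the last color, and the entries in between move gradually from one color to the next.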
Similarly, the ApplyLutApplication application will run the following command:
$ convert grayscaled_<input_image> <lutfile.N.miff> -clut grayscaled_<input_image>.<N>
This command will apply the LUT to the grayscaled image: it will modify the grayscaled image by coloring a generic pixel with a luminosity value of n (which will be an integer value from 0 to 255, of course) with the color at position n in the LUT image (actually, n+1). Each ApplyLutApplication will save the resulting image to a file named grayscaled_<input_image>.<N>.
The class will look like:
class ApplyLutApplication(gc3libs.Application):
    """Apply the LUT computed by `CreateLutApplication` to
    `image_file`"""

    def __init__(self, input_image, lutfile, output_file, output_dir, working_dir):
        gc3libs.log.info("Applying lut file %s to %s" % (lutfile, input_image))
        self.working_dir = working_dir
        self.output_file = output_file

        gc3libs.Application.__init__(
            self,
            arguments = [
                'convert',
                os.path.basename(input_image),
                os.path.basename(lutfile),
                '-clut',
                output_file,
                ],
            inputs = [input_image, lutfile],
            outputs = [output_file, 'stdout.txt', 'stderr.txt'],
            output_dir = output_dir + '.applylut',
            stdout = 'stdout.txt',
            stderr = 'stderr.txt',
            )
The terminated method:
    def terminated(self):
        """Copy colorized image to the output dir"""
        # copyfile needs the full destination file name, not a directory
        copyfile(
            os.path.join(self.output_dir, self.output_file),
            os.path.join(self.working_dir, self.output_file))
will copy the colorized image file to the top-level directory, so that it will be easier for the last application to find all the needed files.
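What applying a LUT means can be shown on a tiny image held in nested lists. This is a conceptual sketch only, not what ImageMagick does internally: apply_lut is a hypothetical helper, and the three-band LUT below is a made-up example following the red, yellow and green split mentioned at the start of this tutorial.

```python
def apply_lut(gray_pixels, lut):
    """Replace each gray level n (0..255) with the color lut[n]."""
    return [[lut[n] for n in row] for row in gray_pixels]


# A made-up 256-entry LUT: dark pixels red, mid-tones yellow, bright green
lut = [(255, 0, 0)] * 86 + [(255, 255, 0)] * 85 + [(0, 128, 0)] * 85

gray = [[0, 100],
        [200, 255]]
colored = apply_lut(gray, lut)
```

Every pixel is looked up independently, which is why the same grayscale image can be recolored many times just by swapping the LUT.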
Step three: merge them all together
At this point we will have in the main output directory a bunch of files named grayscaled_<input_image>.N, with N a sequential integer and <input_image> the name of the original image. The last application, MergeImagesApplication, will produce a warhol_<input_image> image by merging all of them using the command:
$ montage grayscaled_<input_image>.* -tile 3x3 -geometry +5+5 -background white warhol_<input_image>
Now it should be easy to write such an application:
import re

class MergeImagesApplication(gc3libs.Application):
    def __init__(self, grayscaled_image, input_dir, output_file):
        ifile_regexp = re.compile(
            "%s.[0-9]+" % os.path.basename(grayscaled_image))
        input_files = [
            os.path.join(input_dir, fname) for fname in os.listdir(input_dir)
            if ifile_regexp.match(fname)]
        input_filenames = [os.path.basename(i) for i in input_files]
        gc3libs.log.info("MergeImages initialized")
        self.input_dir = input_dir
        self.output_file = output_file

        tile = math.sqrt(len(input_files))
        if tile != int(tile):
            gc3libs.log.error(
                "We would expect to have a perfect square "
                "of images to merge, but we have %d instead" % len(input_files))
            raise gc3libs.exceptions.InvalidArgument(
                "We would expect to have a perfect square of images to merge, but we have %d instead" % len(input_files))

        gc3libs.Application.__init__(
            self,
            arguments = ['montage'] + input_filenames + [
                '-tile',
                '%dx%d' % (tile, tile),
                '-geometry',
                '+5+5',
                '-background',
                'white',
                output_file,
                ],
            inputs = input_files,
            outputs = [output_file, 'stderr.txt', 'stdout.txt'],
            output_dir = os.path.join(input_dir, 'output'),
            stdout = 'stdout.txt',
            stderr = 'stderr.txt',
            )
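The file selection and tile computation can be exercised on their own, without touching the file system. The montage_arguments helper below is hypothetical and departs from the class above in two ways, both assumptions of this sketch: it escapes the file name with re.escape and anchors the pattern, so that the dots match literally and names with extra suffixes are not picked up, and it uses the integer square root math.isqrt (Python 3.8 or later) instead of a floating point one.

```python
import math
import re


def montage_arguments(grayscaled_image, filenames, output_file):
    """Build the `montage` command line for the matching `filenames`."""
    # re.escape makes the dots in the file name match literally
    pattern = re.compile(r"%s\.[0-9]+$" % re.escape(grayscaled_image))
    inputs = sorted(f for f in filenames if pattern.match(f))
    tile = math.isqrt(len(inputs))
    if tile * tile != len(inputs):
        raise ValueError(
            "expected a perfect square of images, got %d" % len(inputs))
    return (['montage'] + inputs +
            ['-tile', '%dx%d' % (tile, tile),
             '-geometry', '+5+5', '-background', 'white', output_file])


files = ['grayscaled_lena.tiff.%d' % i for i in range(4)] + ['stdout.txt']
args = montage_arguments('grayscaled_lena.tiff', files, 'warhol_lena.tiff')
```

With four matching files the tile option becomes 2x2, and unrelated files such as stdout.txt are left out of the command line.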
Making the script executable
Finally, in order to make the script executable, we add the following lines to the end of the file. The WarholizeScript().run() call will be executed only when the file is run as a script, and will do all the magic related to argument parsing, creating the session etc…:
if __name__ == '__main__':
    import warholize
    warholize.WarholizeScript().run()
Please note that the import warholize statement is important to address issue 95 and make the gc3pie scripts work with your current session (gstat, ginfo…).
To test this script I suggest using the famous Lena picture, which can be found in the miscellaneous section of the Signal and Image Processing Institute page. Download the image, rename it as lena.tiff and run the following command:
$ ./warholize.py -C 1 lena.tiff --copies 9
(Add -r localhost if your gc3pie.conf file supports it and you want to test it locally.)
After completion a file will be created.