gc3libs.optimizer.drivers

Drivers to perform global optimization.

Global optimizations can be performed sequentially on a local machine using SequentialDriver. To make use of parallelization, ParallelDriver allows submission of jobs to GC3Pie resources.

Drivers use an algorithm instance that conforms to optimizer.EvolutionaryAlgorithm to generate new populations.

class gc3libs.optimizer.drivers.ComputeTargetVals(pop, jobname, iteration, path_to_stage_dir, cur_pop_file, task_constructor, **extra_args)
A gc3libs.workflow.ParallelTaskCollection that evaluates the current population pop using the user-supplied task_constructor().
Parameters:
  • pop – Population to evaluate. Must be a NumPy “array-like” value.
  • jobname (str) – Name of the ParallelDriver instance driving the optimization.
  • iteration (int) – Current iteration number.
  • path_to_stage_dir (str) – Path to directory in which optimization takes place.
  • cur_pop_file (str) – Filename under which the population is stored in the current iteration dir. The population is discarded if no file is specified.
  • task_constructor – Takes a list of x vectors and the path to the current iteration directory. Returns Application instances that can be executed on the grid.
class gc3libs.optimizer.drivers.ParallelDriver(jobname='', path_to_stage_dir='', opt_algorithm=None, task_constructor=None, extract_value_fn=<function ParallelDriver.<lambda>>, cur_pop_file='', **extra_args)

Drives an optimization using opt_algorithm on the grid.

At each iteration, an instance of ComputeTargetVals uses task_constructor() to generate gc3libs.Application instances to be executed in parallel. When all jobs are complete, the output of each one is analyzed with the user-supplied function extract_value_fn(), which returns the function value computed for the corresponding input vector.

Parameters:
  • jobname (str) – string that labels this optimization case.
  • path_to_stage_dir – directory in which to perform the optimization.
  • opt_algorithm – Evolutionary algorithm instance that conforms to optimizer.EvolutionaryAlgorithm.
  • task_constructor – A function that takes a list of x vectors and the path to the current iteration directory, and returns Application instances that can be executed on the grid.
  • extract_value_fn – Takes an Application instance and returns the function value computed in that task. The default implementation just looks for a .value attribute on the application instance.
  • cur_pop_file – Filename under which the population is stored in the current iteration dir. The population is discarded if no file is specified.
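
The roles of task_constructor and extract_value_fn can be sketched without GC3Pie itself. In the minimal example below, FakeApplication, its run() method, the toy objective, and the directory path are all hypothetical stand-ins (none of them is part of gc3libs); only the "read the .value attribute" behavior follows the description of the default extract_value_fn above.

```python
class FakeApplication:
    """Hypothetical stand-in for a gc3libs.Application (not the real class)."""

    def __init__(self, x):
        self.x = x          # input vector this task evaluates
        self.value = None   # filled in once the task has "run"

    def run(self):
        # Toy objective, assumed for illustration: sum of the coordinates.
        self.value = sum(self.x)


def task_constructor(x_vals, iteration_directory):
    """Build one task per input vector (iteration_directory is unused here)."""
    return [FakeApplication(x) for x in x_vals]


def extract_value_fn(app):
    """Mirror the documented default: read the task's .value attribute."""
    return app.value


# Hypothetical iteration directory; nothing is written to disk in this sketch.
tasks = task_constructor([[1.0, 2.0], [0.5, 0.25]], "/tmp/iteration_0")
for task in tasks:
    task.run()   # in GC3Pie the tasks would execute on remote resources
values = [extract_value_fn(task) for task in tasks]
```

A real task_constructor would return Application instances configured with whatever input files and executables the computation needs, and extract_value_fn would typically parse each task's output.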

Optimization drivers use GC3Pie in the following way: A SequentialTaskCollection represents the main loop of the optimization algorithm, checking for convergence at each iteration. This allows for resuming paused or crashed optimizations. Each iteration, the optimization algorithm provides a new set of points to be evaluated. These points are each represented by an Application and bundled into a ParallelTaskCollection that manages each single Application until completion. The structure of GC3Libs objects employed can be summarized as follows:

SequentialTaskCollection
           |
           v
 ParallelTaskCollection
           |
           v
      Application
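
The nesting shown above can be imitated in plain Python to make the control flow concrete. This is only a sketch under stated assumptions: it does not use gc3libs, a thread pool stands in for the parallel collection, and the search rule is a toy one-dimensional pattern search rather than an evolutionary algorithm. All function names and the objective are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_point(x):
    """Stand-in for one Application: evaluate a single point."""
    return (x - 3.0) ** 2


def evaluate_population(points):
    """Stand-in for the ParallelTaskCollection: one task per point."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(evaluate_point, points))


def optimize(start=0.0, step=4.0, tol=1e-6, itermax=100):
    """Stand-in for the SequentialTaskCollection: the main loop."""
    best = start
    iterations = 0
    for _ in range(itermax):
        iterations += 1
        # The "algorithm" proposes a new set of points to evaluate.
        points = [best - step, best, best + step]
        values = evaluate_population(points)
        best = points[values.index(min(values))]
        step /= 2.0
        if step < tol:   # convergence check once per iteration
            break
    return best, iterations


best, iterations = optimize()
```

Because the convergence check sits in the outer loop, the state needed to resume is just the current iteration's population, which is the property the text above attributes to the SequentialTaskCollection design.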
next(done)

Return collection state or task to run after step number done is terminated.

This method is called when a task is finished; the done argument contains the index number of the just-finished task in the self.tasks list. In other words, the task that just completed is available as self.tasks[done].

The return value from next can be either a task state (i.e., an instance of Run.State), or a valid index number for self.tasks. In the first case:

  • if the return value is Run.State.TERMINATED, then no other jobs will be run;
  • otherwise, the return value is assigned to execution.state and the next job in the self.tasks list is executed.

If instead the return value is a (nonnegative) number, then tasks in the sequence will be re-run starting from that index.

The default implementation runs tasks in the order they were given to the constructor, and sets the state to TERMINATED when all tasks have been run. This method can (and should) be overridden in derived classes to implement policies for serial job execution.
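
The return-value contract of next described above can be illustrated with a small self-contained model. The sketch below does not use gc3libs: ToySequence, RetryOnce, the run() loop, and the string constants standing in for Run.State values are all hypothetical, and tasks are plain callables. It only mirrors the documented rules: a state ends or continues the sequence, and a nonnegative integer restarts execution from that index.

```python
TERMINATED = "TERMINATED"   # stand-in for Run.State.TERMINATED
RUNNING = "RUNNING"         # stand-in for any other state


class ToySequence:
    """Hypothetical model of a sequential collection of callables."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.state = RUNNING
        self.log = []

    def next(self, done):
        # Default policy: run tasks in order, terminate after the last one.
        if done == len(self.tasks) - 1:
            return TERMINATED
        return RUNNING

    def run(self):
        index = 0
        while True:
            self.log.append(self.tasks[index]())
            result = self.next(index)
            if isinstance(result, int):
                index = result            # re-run starting from that index
            elif result == TERMINATED:
                self.state = TERMINATED   # no other tasks are run
                return self.log
            else:
                self.state = result       # record state, move to next task
                index += 1


class RetryOnce(ToySequence):
    """Overrides next() to restart from task 0 after task 1 finishes once."""

    def __init__(self, tasks):
        super().__init__(tasks)
        self.retried = False

    def next(self, done):
        if done == 1 and not self.retried:
            self.retried = True
            return 0
        return super().next(done)


steps = [lambda: "a", lambda: "b", lambda: "c"]
default_log = ToySequence(steps).run()
retry_log = RetryOnce(steps).run()
```

An optimization driver would override next in a similar spirit, deciding after each iteration whether the algorithm has converged or another population needs to be evaluated.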

class gc3libs.optimizer.drivers.SequentialDriver(opt_algorithm, target_fn, path_to_stage_dir='/home/docs/checkouts/readthedocs.org/user_builds/gc3pie/checkouts/v2.6.8/docs', cur_pop_file=None, logger=None, fmt=None)

Drives an optimization using opt_algorithm on the local machine.

The user-supplied target_fn() computes target values for the populations generated by opt_algorithm.

Parameters:
  • opt_algorithm – Evolutionary algorithm instance that conforms to optimizer.EvolutionaryAlgorithm.
  • target_fn – Function to evaluate a population and return the corresponding values.
  • path_to_stage_dir – Directory in which to perform the optimization.
  • cur_pop_file – Filename under which the population is stored in the current iteration dir. The population is discarded if no file is specified.
  • logger – Configured logger to use.
  • fmt (str) – %-format string to use (e.g., %12.8f) to print values at each step of the algorithm. If None (default), this verbose report is not generated, as it might be time-consuming for large population sizes.
de_opt()

Drives optimization until convergence or itermax is reached.
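
As an illustration of the target_fn and fmt parameters of SequentialDriver, the sketch below shows one possible shape for a target function and how a %-format string such as %12.8f renders values. The sphere objective and the report() helper are assumptions made for this example; report() is not part of gc3libs and only imitates the "no report when fmt is None" behavior described above.

```python
def target_fn(population):
    """Evaluate a whole population; return one value per member.

    Toy objective assumed for illustration: the sphere function.
    """
    return [sum(x * x for x in member) for member in population]


def report(values, fmt=None):
    """Hypothetical helper: format values with a %-format string, if given."""
    if fmt is None:
        return None   # verbose report skipped
    return " ".join(fmt % v for v in values)


population = [[1.0, 2.0], [0.0, 0.5]]   # nested lists are array-like
values = target_fn(population)
line = report(values, "%12.8f")
```

Evaluating the whole population in one call is what distinguishes target_fn from the per-task extract_value_fn used by ParallelDriver.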