Python: module ccp4i

ccp4i (version 1.87)

/home/pjx/BIOXHIT/Bioxhit_db/dbccp4i/dbccp4i/ccp4i.py

# Copyright (C) 2006-2008
#
# This code is distributed under the terms and conditions of the
# CCP4 Program Suite Licence Agreement as a CCP4 Library.
# A copy of the CCP4 licence can be obtained by writing to the
# CCP4 Secretary, Daresbury Laboratory, Warrington WA4 4AD, UK.
#
#====================================================================
#
# CCP4 Interface - ccp4i.py
#
# Python module providing a class library for emulating the CCP4i
# def file based database. It also provides classes for dealing with
# the CCP4i "directories.def" file and general classes for .def files
# and lockfiles.
#
# Peter Briggs
#
########################################################################

Modules

copy
os
re
stat
sys
time

Classes



array
database
    projectDB
    subjobDB
directories
history
    extendedhistory
lockfile
    dblockfile
class array

    CCP4i array class.

    The array class provides functionality for dealing with CCP4i arrays, which are typically loaded from and saved to .def files. It includes methods for reading and writing CCP4i .def files, and allows the data to be manipulated in between.

    An array consists of 'parameters', which have associated values and may optionally also have associated types. Parameters can be either 'scalar' (in which case the parameter name is a string consisting of alphanumeric characters and the underscore) or 'indexed' (in which case the parameter name consists of a string followed by a comma and then an index component). Conventionally parameter names are uppercased.

    Examples of scalar parameters include:
        NRUNS  HKLIN1  REFINE_MODE
    Examples of indexed parameters include:
        RUN,0  DERIV,4  ATNAM,1_5
    The index component consists of either a single integer (single index), or a pair of integers separated by an underscore (double index).

    In addition to parameters, the array object also keeps track of 'parameter prototypes' (also just referred to as 'prototypes'). For scalar parameters, the prototype has the same name as the parameter and there is essentially no difference between the two. For indexed parameters, the prototype is the zeroth-indexed equivalent. For example: for the parameter DERIV,4 the parameter prototype would be DERIV,0 while for ATNAM,1_5 it would be ATNAM,0 (NB not ATNAM,0_0). RUN,0 is already a parameter prototype.

    Parameter prototypes are significant for indexed parameters for two reasons:
        1. The type of an indexed parameter is attached to the prototype.
        2. The value attached to the prototype is usually used as a default value for new indexed parameters based on that prototype.
    Prototypes are significant for all parameters because only parameters that have prototypes are written to the output .def file.

    A set of built-in methods is implemented that allows Python dictionary-like syntax to be used to get, set and delete array elements, specifically:
        value = array[key]     sets value to the value of the element 'key'
        array[key] = value     sets the value of the element 'key'
        array.has_key(key)     tests whether the element 'key' exists
        del array[key]         removes element 'key' from the array
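The naming rules above (scalar versus indexed names, and how a prototype name is derived) can be illustrated with a small stand-alone sketch. The helper names `split_parameter` and `prototype_of` are hypothetical and are not part of ccp4i.py; the sketch only mirrors the rules stated in the class description and makes no claim about how the module implements them.

```python
import re

# Hypothetical helpers, not part of ccp4i.py. An indexed name is ROOT,N or
# ROOT,N_M where N and M are integers; anything else is treated as scalar.
_INDEXED = re.compile(r"^(\w+),(\d+)(?:_(\d+))?$")


def split_parameter(name):
    """Return (root, index); index is '' for a scalar parameter."""
    match = _INDEXED.match(name)
    if match is None:
        return name, ""
    root, first, second = match.groups()
    index = first if second is None else first + "_" + second
    return root, index


def prototype_of(name):
    """Return the prototype name: the name itself for scalars, 'ROOT,0' otherwise."""
    root, index = split_parameter(name)
    return root if index == "" else root + ",0"


print(prototype_of("DERIV,4"))    # DERIV,0
print(prototype_of("ATNAM,1_5"))  # ATNAM,0
print(prototype_of("NRUNS"))      # NRUNS
```

Note that a double-indexed name such as ATNAM,1_5 maps to the single-indexed prototype ATNAM,0, as the description specifies.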

Methods defined here:

__contains__(self, key)
Internal: implements 'array.has_key(key)' mechanism.

__delitem__(self, key)
Internal: implements 'del array[key]' to delete an item.

__getitem__(self, key)
Internal: implements 'array[key]' to return value.

__init__(self, initfile='')
Create a new array class instance. If a .def format file name is supplied via the optional 'initfile' argument then the contents of the array object will be initialised from that file.

__setitem__(self, key, value)
Internal: implements 'array[key] = value' to set value.

addparameter(self, parameter, index='', type='', value='')
Add a new parameter to the array object. This method is used to add parameters to the array object. If the parameter doesn't exist then a new entry will be created to store the specified 'value'. If 'parameter' is a scalar parameter, or if it is a zero indexed parameter (either explicitly, or when 'index' is set to zero), then it will also be added to the list of parameter prototypes (if the prototype doesn't already exist). The 'type' will be associated with the new prototype definition. NB: indexed parameters that are not already parameter prototypes will NOT have prototypes added for them - this may be significant if the array object is saved to file, since by default parameters with no prototype are not written out.
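As a rough illustration of why prototypes matter when saving, the toy model below tracks values and prototypes separately and lists which names would be written under the default settings described for the write method. `MiniArray` and `default_output` are hypothetical names, the 'index' argument is omitted for brevity, and leaving an existing value untouched is an assumption rather than documented behaviour.

```python
class MiniArray:
    """Toy model of the parameter/prototype bookkeeping (not the real class)."""

    def __init__(self):
        self.values = {}      # parameter name -> value
        self.prototypes = {}  # prototype name -> type

    def addparameter(self, name, type="", value=""):
        # Assumption: if the parameter already exists its value is kept.
        self.values.setdefault(name, value)
        root, _, index = name.partition(",")
        # Scalars and zero-indexed names double as prototypes.
        if index in ("", "0"):
            self.prototypes.setdefault(name, type)

    def default_output(self):
        """Names written by default: prototyped, excluding 'ROOT,0' entries."""
        names = []
        for name in sorted(self.values):
            root, _, index = name.partition(",")
            proto = name if index == "" else root + ",0"
            if proto in self.prototypes and index != "0":
                names.append(name)
        return names


arr = MiniArray()
arr.addparameter("NRUNS", value="2")
arr.addparameter("RUN,0", type="text")   # creates the RUN prototype
arr.addparameter("RUN,1", value="first")
arr.addparameter("ORPHAN,3", value="x")  # no ORPHAN,0 prototype exists
print(arr.default_output())              # ['NRUNS', 'RUN,1']
```

ORPHAN,3 is stored but never appears in the default output, which is the situation the NB above warns about.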

delparameter(self, parameter, *indices)
Delete a parameter from the array object. 'parameter' supplies the name of a scalar parameter, or the root name of an indexed parameter (in which case one or two indices must be specified as separate arguments), or a complete indexed parameter name. For example:
    delparameter('PARAM')       - scalar
    delparameter('PARAM',1)     - indexed, one index
    delparameter('PARAM,1')     - indexed, index in the name (equivalent to the previous call)
    delparameter('PARAM',1,2)   - indexed, two indices
    delparameter('PARAM,1_2')   - indexed, indices in the name (equivalent to the previous call)
Note however that delparameter('PARAM,1',2) is not valid syntax for this method. If successful, removes the parameter from the array. Parameter prototypes are not deleted.

getequivalent(self, parameter1, parameter2)
Return an equivalently indexed parameter name. Given an indexed parameter 'parameter1', this method returns the name of a parameter with the same index but based on 'parameter2'. For example: if parameter1 = 'PARAM,3' and parameter2 = 'EQUIV' then 'EQUIV,3' will be returned (likewise if parameter2 is 'EQUIV,0' etc).

getheaderdata(self, pattern)
# Methods for extracting data from the header

getvalue(self, parameter, *indices)
Get the value of a parameter stored in the array. 'parameter' supplies the name of a scalar parameter, or the root name of an indexed parameter (in which case one or two indices must be specified as separate arguments), or a complete indexed parameter name. For example:
    getvalue('PARAM')       - scalar
    getvalue('PARAM',1)     - indexed, one index
    getvalue('PARAM,1')     - indexed, index in the name (equivalent to the previous call)
    getvalue('PARAM',1,2)   - indexed, two indices
    getvalue('PARAM,1_2')   - indexed, indices in the name (equivalent to the previous call)
Note however that getvalue('PARAM,1',2) is not valid syntax for this method.
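The calling convention shared by delparameter, getvalue and has_parameter (a name plus zero, one or two separate indices) amounts to assembling a complete parameter name. A minimal sketch follows; `full_name` is a hypothetical helper, and raising ValueError for the invalid mixed form is an assumption, since the documentation only says that form is not valid.

```python
def full_name(parameter, *indices):
    """Build a complete parameter name from a name plus 0, 1 or 2 indices."""
    if not indices:
        # Scalar, or an indexed name that already carries its index.
        return parameter
    if "," in parameter:
        # Mixed form such as ('PARAM,1', 2) is documented as invalid.
        raise ValueError("index given both in the name and as an argument")
    if len(indices) > 2:
        raise ValueError("at most two indices are allowed")
    return parameter + "," + "_".join(str(i) for i in indices)


print(full_name("PARAM"))        # PARAM
print(full_name("PARAM", 1))     # PARAM,1
print(full_name("PARAM", 1, 2))  # PARAM,1_2
```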

has_parameter(self, parameter, *indices)
Test for the existence of a parameter in the array. 'parameter' supplies the name of a scalar parameter, or the root name of an indexed parameter (in which case one or two indices must be specified as separate arguments), or a complete indexed parameter name. For example:
    has_parameter('PARAM')       - scalar
    has_parameter('PARAM',1)     - indexed, one index
    has_parameter('PARAM,1')     - indexed, index in the name (equivalent to the previous call)
    has_parameter('PARAM',1,2)   - indexed, two indices
    has_parameter('PARAM,1_2')   - indexed, indices in the name (equivalent to the previous call)
Note however that has_parameter('PARAM,1',2) is not valid syntax for this method. Returns True if the parameter is defined in the array, and False if not.

incr(self, parameter, increment=1)
Increment or decrement the stored integer value for a parameter. The value stored in 'parameter' must be an integer value - the value will be updated by adding the value of the 'increment' (which can be either a positive or negative integer) to its initial value. If no value is specified for 'increment' then the value is increased by one.

isparameter(self, parameter)
Check whether the parameter has a prototype in the array. Returns True if the parameter is also represented by an appropriate parameter prototype in the array. For scalar parameters this is the same as the parameter name, while for indexed parameters this is the zeroth-indexed equivalent. The 'parameter' must include any indices associated with the parameter name being checked. If no prototype exists then returns False.

listindexed(self, parameter, include_prototype=False)
Return a list of parameters that share the same prototype. Given the name of parameter, this method returns a list of all the parameters which share the same prototype. The 'parameter' argument must include any indices (for example 'PARAM,1' or 'PARAM,1_2'). By default the list that is returned won't include the actual parameter prototype - set the 'include_prototype' argument to True to include the prototype parameter in the list.

parameter_index(self, parameter)
Return the index part of a parameter name. If 'parameter' is an indexed parameter then return the index part, otherwise return an empty string. Wraps the internal __parameter_index method.

parameter_isindexed(self, parameter)
Determine if a parameter is indexed. Returns True if 'parameter' is an indexed parameter, False if not. Wraps the internal __parameter_isindexed method.

parameter_root(self, parameter)
Return parameter name stripped of any index. If 'parameter' is indexed then return the leading part of the parameter name with the index removed; otherwise return the parameter name as supplied. Wraps the internal __parameter_root method.

parameters(self)
Return a list of the parameter prototypes in the array.

read(self, filename)
Reads the contents of a def-format file into the array. The file format can be either key-value pairs, or key-type-value triplets. If a value in the file is of the form '$name' then 'name' is assumed to be an environment variable, which is substituted via a call to the GetEnvVar function.

replacevalue(self, parameter, *indices)
Copy the value of an indexed parameter to another location. This method replaces the value stored at one position in an indexed parameter with the value stored at another position of the same parameter, then deletes the parameter that was copied. 'parameter' is the root name of the parameter in question, and the remaining arguments are either 2 or 4 numbers indicating the target and source indices. For example:
    replacevalue('PARAM',n,m)       copies the value of PARAM,m to PARAM,n and deletes PARAM,m
    replacevalue('PARAM',x,y,p,q)   copies the value of PARAM,p_q to PARAM,x_y and deletes PARAM,p_q
This is useful when deleting a value from a list and copying the last item to a new position to replace the deleted value.
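The "move the last item into the gap" use case can be sketched with a plain dictionary. `replace_value` is a hypothetical stand-in for the single-index form only, and the FILE parameter and its values are made up for illustration.

```python
def replace_value(store, root, target, source):
    """Copy store['ROOT,source'] over store['ROOT,target'], then drop the source."""
    src = "%s,%d" % (root, source)
    dst = "%s,%d" % (root, target)
    store[dst] = store[src]
    del store[src]


# Remove item 2 from a four-item list by moving the last item into its place.
params = {"FILE,1": "a.mtz", "FILE,2": "b.mtz", "FILE,3": "c.mtz", "FILE,4": "d.mtz"}
replace_value(params, "FILE", 2, 4)
print(sorted(params.items()))
# [('FILE,1', 'a.mtz'), ('FILE,2', 'd.mtz'), ('FILE,3', 'c.mtz')]
```

The list stays contiguous (indices 1 to 3) without renumbering every later entry, at the cost of not preserving the original order.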

setvalue(self, parameter, *indices)
Assign a value to a parameter stored in the array. 'parameter' supplies the name of a scalar parameter, or the root name of an indexed parameter (in which case one or two indices must be specified as separate arguments), or a complete indexed parameter name. The last argument in 'indices' must always be the value to be assigned. For an indexed parameter, if this is 'None' then the default value assigned to the parameter prototype will be assigned. For example:
    setvalue('PARAM','My value')       - set scalar to 'My value'
    setvalue('PARAM',1,'My value')     - set single-indexed parameter to 'My value'
    setvalue('PARAM,1','My value')     - equivalent to above
    setvalue('PARAM',1,2,'My value')   - set double-indexed parameter
    setvalue('PARAM,1_2','My value')   - equivalent to above
    setvalue('PARAM,1',None)           - set parameter to the same value as its prototype
Note however that setvalue('PARAM,1',2,'My value') is not valid syntax for this method. If the parameter did not already exist then setting its value using this method automatically creates it.

typeof(self, parameter)
Return the type for a particular parameter. The 'parameter' must include any indices associated with the parameter name being checked. If there is no type associated with the prototype for the specified 'parameter' argument then an empty string is returned.

write(self, filename, application='', version='', defname='', user='', project='', include_types=False, include_prototypes=False, include_nonprototyped=False)
Write the contents of the array object to a def file. Optionally a number of values can be provided which will be written into the header lines of the def file, specifically:
    'application' and 'version' specify a name and version number for the application writing the file, i.e.:
        #CCP4I VERSION <application> <version>
    The application name defaults to 'CCP4iPy' if none is supplied.
    'defname' specifies the name of the file, i.e.:
        #CCP4I SCRIPT DEF <defname>
    If defname is not supplied then it defaults to the basename of the file being written out.
    'user' is the user name, i.e.:
        #CCP4I USER <user>
    If this is not supplied then it defaults to the value returned by the GetUserId function.
    'project' is the name of a CCP4 project, i.e.:
        #CCP4I PROJECT <project>
    If this is not supplied then this header line will not appear.
Additionally, writing the actual parameters to the file is affected by a number of optional 'include_...' arguments:
    include_types:         if True then the type information will also be written to the file; by default type information is not written.
    include_prototypes:    if True then the parameter prototypes for indexed parameters (i.e. the 0th-indexed parameters) will also be written; by default the prototypes for indexed parameters are not written.
    include_nonprototyped: if True then all stored parameters will be written; by default only parameters with prototypes are written out.
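The header defaults described for write can be summarised in a short sketch. `header_lines` is a hypothetical function covering only the four header lines named in the description; the real file may contain other header lines, and `getpass.getuser()` is used here purely as a stand-in for the module's GetUserId function.

```python
import getpass
import os


def header_lines(filename, application="", version="", defname="", user="", project=""):
    """Build the documented header lines for a def file (sketch only)."""
    app = application or "CCP4iPy"
    name = defname or os.path.basename(filename)
    # Assumption: getpass.getuser() stands in for the module's GetUserId.
    who = user or getpass.getuser()
    lines = [
        ("#CCP4I VERSION %s %s" % (app, version)).rstrip(),
        "#CCP4I SCRIPT DEF %s" % name,
        "#CCP4I USER %s" % who,
    ]
    if project:
        # The PROJECT line is omitted entirely when no project is given.
        lines.append("#CCP4I PROJECT %s" % project)
    return lines


for line in header_lines("/tmp/work/refmac5.def", user="pjx", project="myproject"):
    print(line)
```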

writeparameter(self, parameter, f, include_types=False)
Write an array parameter to a def file. The calling function must supply the complete parameter name (i.e. including index, if appropriate) and an initialised file object f. If include_types is True then the parameter type is also written.

class database

    CCP4i def file project database class. Creates an instance of a CCP4i project database object associated with a particular project name and a project directory. If the project doesn't exist then the create method must be used to make a new project database.def file; use the open method to load data from an existing database. To ensure that the data in the database object reflects the actual state of the persistent storage, checks are made before read and write operations to see whether the database object still holds the lock on the resource and that it has not been modified by some other process since the last save. Since these checks involve file operations which may be expensive (in terms of timing), the 'interval' parameter can be used to specify the length of time in seconds for which the results of a check are held to be valid. An interval of zero means that the checks are made every time an operation is performed; a longer interval reduces the number of checks that are made, which should improve efficiency at the expense of leaving a longer window of opportunity for the resource to come out of synch with the database object.
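The trade-off controlled by 'interval' can be shown with a small stand-alone sketch of a cached staleness check. `StaleCheck` is a hypothetical class written in Python 3; it compares file modification times in the way the description suggests, but it is not taken from ccp4i.py.

```python
import os
import time


class StaleCheck:
    """Cache the result of an mtime comparison for `interval` seconds."""

    def __init__(self, path, interval=0):
        self.path = path
        self.interval = interval
        self.last_save = os.stat(path).st_mtime  # time of our last known save
        self._checked_at = None
        self._stale = False

    def isstale(self):
        now = time.time()
        # With interval=0 this condition always holds, so the file system
        # is consulted on every call.
        if self._checked_at is None or now - self._checked_at >= self.interval:
            self._stale = os.stat(self.path).st_mtime > self.last_save
            self._checked_at = now
        return self._stale
```

With a long interval, a change made by another process goes unnoticed until the cached result expires, which is the "window of opportunity" the description refers to.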

Methods defined here:

__del__(self)
del(project) Saves data, releases lock and deletes the object.

__init__(self, project, directory, dbdir='', template='', interval=0)
Internal: initialise a new project database object. The name of the project and the path to the corresponding project directory must be supplied. Optionally a database directory can be specified using the 'dbdir' argument - this specifies where the object will look for or write the persistent storage. By default the data items in the file are taken from the 'template' def file in $CCP4/ccp4i/etc; this can be over-ridden by specifying a template def file via the optional 'template' argument. No data is loaded until either the open or create methods are invoked.

__len__(self)
len(project) Returns the number of jobs stored in the project, or zero if the project is not loaded.

__nonzero__(self)
Returns True if the project is loaded, False otherwise.

__repr__(self)
Returns the project name.

addfile(self, job, ftype, dirs, filename, alias)
Add a file to the list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file should be associated as an 'input' or 'output' file. 'filename' specifies the name of the file to be added to the list and can be either a full path or a relative path, and 'alias' specifies a project alias to be associated with the file. The alias is only used to resolve relative paths in the filename, and is ignored if a full path is supplied for the file being added. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem. Note that if the file name already appears in the list of files then this is not an error.

addinputfile(self, job, dirs, filename, alias)
Add a file to the list of input files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

addoutputfile(self, job, dirs, filename, alias)
Add a file to the list of output files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

auto_refresh(self)
Return the value of the auto_refresh flag. The 'auto_refresh' feature means that read attempts on a 'stale' database automatically cause the data to be refreshed from file in read-only mode. See the set_auto_refresh method for more information.

close(self)
Close an open project. This saves any unsaved data to file, releases the lock on the database.def file, and clears the data from the project object, returning True on success. The project must have first been loaded via an open or create operation, otherwise False is returned.

create(self)
Make a new project. Creating a new project includes creating the project directory (if it doesn't already exist), and creating the database subdirectory and database.def file. The project is then opened and the project object is returned. The create operation will fail (returning False) if the project already exists.

deletejob(self, job)
Remove an existing job record from the project. The job with the specified id number will be removed from the project, also removing all associated data, returning True on success. The operation will fail if the database is not loaded, if the object has lost the lock on the database.def file, or if a job with the specified id number is not found.

describe(self, joblist, itemlist, formatlist)
Return a list of formatted strings populated with job data. For each job id number in the joblist, the describe method retrieves the values of each of the data items in the itemlist and creates a description string by placing those values into fields with character widths supplied in the formatlist. It returns a list of these strings, with one list item per job. Allowed data items can be any stored in the project database plus JOB_ID (the id number of the job), which is an implicit data item.

describejob(self, job, itemlist, formatlist)
Return a formatted string populated with job data. For the specified job id number, the describejob method retrieves the values of each of the data items in the itemlist and creates a description string by placing those values into fields with character widths supplied in the formatlist. Allowed data items can be any stored in the project database plus JOB_ID (the id number of the job), which is an implicit data item.
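The field-width formatting used by describe and describejob might look roughly like the sketch below. `describe_job` is a hypothetical function; left-justifying each value, truncating it to the field width and joining fields without a separator are all assumptions, as the documentation only says that values are placed into fields of the given character widths.

```python
def describe_job(job_id, record, itemlist, formatlist):
    """Place each item's value into a fixed-width field and join the fields."""
    fields = []
    for item, width in zip(itemlist, formatlist):
        # JOB_ID is implicit: it is the job's id, not a stored data item.
        value = job_id if item == "JOB_ID" else record.get(item, "")
        fields.append(str(value).ljust(width)[:width])
    return "".join(fields)


record = {"TASKNAME": "refmac5", "STATUS": "FINISHED"}
line = describe_job(12, record, ["JOB_ID", "TASKNAME", "STATUS"], [4, 10, 10])
print(repr(line))  # '12  refmac5   FINISHED  '
```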

exists(self)
Test if the project exists. Returns True if the project directory and database.def file both exist, and False otherwise.

getDbFile(self)
Return the database file for the project.

getDbItems(self)
Return a list of the data items stored for all jobs.

getDbdir(self)
Return the database directory of the project.

getProjectname(self)
Return the name of the project.

getdata(self, job, item)
Retrieve the value of a data item stored for a job. The job is specified by an id number. If the job doesn't exist, or if the item is otherwise inaccessible as indicated by a call to the itemexists method, then the operation raises an IndexError exception. Otherwise, the specific value of the data item for the specified job id is returned (or the 'generic' value, if a specific value was not found). If the database is not readable then 'None' is returned.

getmessage(self)
Retrieve last internal error message. If an operation fails then the client application can access the last error message using this method. Invoking getmessage clears the error message.

hasjob(self, job)
Checks if a job with the specified id exists in the database. Given a job id, returns True if the database contains a job with that id and False if not.

haslock(self)
Test whether the project is locked by this process. Returns True if the project database.def file has a lockfile owned by this process, and False otherwise. The check is only performed once the specified time interval has elapsed; in between, the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.

isloaded(self)
Test whether the project data has been loaded. Returns True if the database object has been populated from the file on disk.

islocked(self)
Test whether the project is locked. Returns True if the project database.def file has a lockfile from any process, and False otherwise.

isreadable(self)
Test if it is possible to get data from the object. Returns True if the database object has been loaded, and has data that is at least as recent as the persistent storage on disk.

isstale(self)
Check if the data in the object is older than on disk. Returns True if the modification time of the resource file is more recent than the last save. If so then this suggests that the data in this object may not accurately mirror the data in the persistent storage. The check is only performed once the specified time interval has elapsed; in between, the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.

iswriteable(self)
Test if it is possible to modify and save the data. Returns True if the database object has been loaded, owns the lock on the persistent storage on disk, and has data that is at least as recent as the persistent storage.

itemexists(self, job, item)
Check for the existence of a data item for a particular job. Returns True if the data item is found for the specified job id number. If the item is not found for the specified job id, but was defined in the template database.def file, then this method also returns True. Otherwise, itemexists returns False, indicating that the item is not accessible for the specified job. itemexists will also return False in the event that the project is not loaded.
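The lookup rules described for itemexists and getdata (a job-specific value first, then the 'generic' value from the template) can be sketched with two dictionaries. `item_exists` and `get_data` are hypothetical functions, and the dictionary layout is an assumption made for illustration only.

```python
def item_exists(jobs, template, job, item):
    """True if the item is stored for the job or defined in the template."""
    return job in jobs and (item in jobs[job] or item in template)


def get_data(jobs, template, job, item):
    """Return the job-specific value, falling back to the template's generic value."""
    if not item_exists(jobs, template, job, item):
        raise IndexError("no item %r for job %r" % (item, job))
    if item in jobs[job]:
        return jobs[job][item]
    return template[item]


template = {"STATUS": "", "TITLE": "untitled"}
jobs = {1: {"STATUS": "FINISHED"}}
print(get_data(jobs, template, 1, "STATUS"))  # FINISHED (specific value)
print(get_data(jobs, template, 1, "TITLE"))   # untitled (generic value)
```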

listfiles(self, job, ftype, dirs)
Return a list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file list should be the associated 'input' or 'output' files. 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listinputfiles(self, job, dirs)
Return a list of input files for a job. The job is specified by an id number; 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listjobs(self)
Return the list of all jobs in the project. The list is an unsorted list of job id numbers, or an empty list if the project is not currently loaded.

listoutputfiles(self, job, dirs)
Return a list of output files for a job. The job is specified by an id number; 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

newjob(self, taskname='', status='STARTING', title='')
Create a new job record in the project. Creates a new entry in the current project with the current date and returns the id number for the new job. Optionally the taskname (TASKNAME data item), the status (STATUS item) and/or title (TITLE item) may also be set when the job is created. If no taskname is specified then it is set to 'none'; if no status is specified then it is set to 'STARTING'. (Note that blank values of the taskname can crash CCP4i and cause corruption of the database.) The operation will fail and return -1 if the project is not loaded, or if the current object has lost the lock on the database.def file.

njobs(self)
Return the value of the NJOBS data item. For historical reasons, the NJOBS data item records the current highest job id number used in the project, and is not generally the actual number of jobs in the project. To get the actual number of jobs in a project object, use len(project). Returns -1 if the project has not been correctly loaded.

open(self, grablock=False, readonly=False, strict=False)
Open a project. Read the data from the project database and return the project object. The open operation will fail (return False) if the project is locked, if the project is already loaded, or if the project doesn't exist. Set the 'grablock' argument to True to override any existing lock and force loading of the database. Specifying 'readonly' as True populates the database object from the file on disk without attempting to lock the resource. However it will not clear an existing lock, so the process may still own the lock even if it is in 'read-only' mode. If the 'strict' flag is True then header information about the project name and database directory is checked against the internal values, and warnings are issued if there is a discrepancy.

refresh(self, grablock=False, readonly=True)
Reload the data from the file into a database object. Use the 'refresh' method to re-read the data from the file on disk into an existing (loaded) database object. This is necessary for example if the file is updated by an external process. Essentially the database object is closed and then reopened. By default the data is reloaded in 'readonly' mode. Set the 'grablock' argument to True to override any existing lock and force loading of the database in write mode - see the 'open' method.

removefile(self, job, ftype, dirs, filename)
Remove a file from the list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file is removed from the list of 'input' or 'output' files. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

removeinputfile(self, job, dirs, filename)
Remove a file from the list of input files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

removeoutputfile(self, job, dirs, filename)
Remove a file from the list of output files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

save(self)
Save the project database contents to file. The save method writes the data in the object to persistent storage (i.e. the database.def file for the project). On successful completion True is returned. The save operation will fail if the project is not loaded, or if the object has lost the lock on the database.def file; in both cases False is returned.

selectjobs(self, item, pattern)
Retrieve a list of jobs based on some selection criterion. Returns a list of jobs for which the value of the given data item matches the supplied regular expression pattern.
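A regular-expression selection of this kind can be sketched over a plain dictionary of job records. `select_jobs` is a hypothetical function; whether the real method uses `re.search` or `re.match` semantics is not stated, so `re.search` here is an assumption.

```python
import re


def select_jobs(jobs, item, pattern):
    """Return ids of jobs whose value for `item` matches the regex `pattern`."""
    regex = re.compile(pattern)
    return [job_id for job_id, record in jobs.items()
            if item in record and regex.search(str(record[item]))]


jobs = {
    1: {"TASKNAME": "refmac5", "STATUS": "FINISHED"},
    2: {"TASKNAME": "scala", "STATUS": "FAILED"},
    3: {"TASKNAME": "refmac5", "STATUS": "RUNNING"},
}
print(sorted(select_jobs(jobs, "TASKNAME", "^refmac")))  # [1, 3]
```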

set_auto_refresh(self, auto_refresh)
Set the auto_refresh flag. 'auto_refresh' takes a boolean value. Setting it to 'True' means that a read attempt on a 'stale' database (i.e. when the source database file is newer than the data in the object) will automatically invoke the 'refresh' method to load the data in read-only mode. Setting it to 'False' means that the 'refresh' method must be explicitly invoked.

setdata(self, job, item, value)
Set the value of a data item for the specified job. The job is specified by an id number. The operation returns True if the value was successfully updated and False if the database is not writeable, or if the data item is valid but could not be updated. An IndexError exception will be raised if the data item is invalid, either because the job doesn't exist or because the data item doesn't exist for the job, as indicated by a call to the itemexists method. Note that the validity of the new value is not checked, except for the TASKNAME data item, which must not be blank.

updatetime(self, job)
Update the time of the job to be the current time. This wraps the setdata method to automatically set the DATE attribute of the job specified by the id number, and returns the result of the setdata operation.

class dblockfile(lockfile)

    CCP4i database lock file class. This class provides methods for creating and manipulating lock files for CCP4i def file based databases. The calling subprogram should specify the name of the project (argument 'project') and the path to the database directory i.e. the location of the database.def file (argument 'dbdir').

Methods defined here:

__init__(self, project, dbdir)
Instantiate a new dblockfile object.

Methods inherited from lockfile:

filename(self)
Return the lockfile name (including the path).

haslock(self)
Check whether the resource is locked by this lockfile object. This method returns True provided that the resource is locked (i.e. a physical lock file exists) and that the lock file was created by this instance of the lockfile class. If either condition is not satisfied then the method returns False.

islocked(self)
Check whether the resource is locked. This method returns True if the resource file that the lockfile object is associated with is 'locked' (that is, a physical lock file exists), and False if it is not locked. Note that it is possible for the resource to be locked but for the lock to be owned by a different lockfile instance - in this case, 'islocked' will still indicate that the resource is locked. Use the 'haslock' method to check if the resource is actually locked by this instance of the lockfile class.

lock(self)
Lock the file resource. Attempts to create the physical lock file associated with the resource. This will fail if the resource is already locked (by any lockfile object, not necessarily only this one). Returns True if the lock is successfully created, False if there is a failure.

resource(self)
Return the name of the resource (i.e. file) being locked.

unlock(self, force=False)
Unlock a file resource. This method attempts to unlock a file resource that was previously locked, by removing the physical lock file. The operation will fail if the resource is not currently locked, or if the lock on the resource belongs to a different lockfile object; in this second case, setting 'force' to True forces the removal of the lock file regardless of ownership. This method returns True on successful removal of the lock and False otherwise.
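The lock, unlock, haslock and islocked semantics described for the lockfile class can be sketched as follows. `SimpleLock` is a hypothetical Python 3 class: the '.lock' suffix, the use of `os.O_EXCL` for atomic creation and the ownership token written into the lock file are all assumptions chosen for the sketch, not details taken from ccp4i.py.

```python
import os
import uuid


class SimpleLock:
    """Lock a file resource by creating a companion lock file (sketch only)."""

    def __init__(self, resource):
        self._resource = resource
        self._lockname = resource + ".lock"  # assumed naming scheme
        self._token = uuid.uuid4().hex       # identifies this instance

    def _owner(self):
        try:
            with open(self._lockname) as f:
                return f.read()
        except OSError:
            return None

    def islocked(self):
        # Locked by anyone: the physical lock file exists.
        return os.path.exists(self._lockname)

    def haslock(self):
        # Locked by this instance: the lock file carries our token.
        return self._owner() == self._token

    def lock(self):
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(self._lockname, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as f:
            f.write(self._token)
        return True

    def unlock(self, force=False):
        if not self.islocked():
            return False
        if not (force or self.haslock()):
            return False
        os.remove(self._lockname)
        return True
```

A second instance pointed at the same resource sees islocked() as True but haslock() as False, and can only remove the lock with unlock(force=True).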

class directories

    CCP4i project directories class. This class provides methods for reading and writing the project directory information in directories.def for CCP4i-based applications. The principal data stored in the directories object and the underlying file are 'aliases' associated with directories on the file system. These are divided into two types: 'project directories' and 'default directories' (also referred to as 'data directories'). The two types are kept separate within the directories class, and have separate methods to access and manipulate them. (In addition there are some other data items that are maintained for backwards compatibility with the Tcl CCP4i code.) By default, when the directories object is instantiated it tries to take the user name from the current environment, and assumes that the 'source' data file (the persistent storage) is the 'directories.def' file in the user's .CCP4 directory or folder. These are determined automatically unless overridden by one or both of the 'user' and 'source' arguments on instantiation. To ensure that the data in the directories object reflects the actual state of the persistent storage, checks are made before read and write operations to see whether the directories object still holds the lock on the resource and that it has not been modified by some other process since the last save. Since these checks involve file operations which may be expensive (in terms of timing), the 'interval' parameter can be used to specify the length of time in seconds for which the results of a check are held to be valid. An interval of zero means that the checks are made every time an operation is performed; a longer interval reduces the number of checks that are made, which should improve efficiency at the expense of leaving a longer window of opportunity for the resource to come out of synch with the directories object.
The directories class also supports an 'auto-refresh' mechanism, which is turned off by default but which can be toggled on or off using the 'set_auto_refresh' method. When auto-refresh is active, data in the object is automatically reloaded from the file if the source data file is modified by another process.
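
The effect of the 'interval' parameter described above can be illustrated with a small sketch. The CachedCheck class, its 'clock' argument and the demonstration names are hypothetical and are not part of the ccp4i module; the sketch only shows one way in which the result of an expensive check could be held valid for a number of seconds, with an interval of zero meaning that the check is made every time.

```python
import time


class CachedCheck(object):
    """Cache the result of an expensive check for 'interval' seconds.

    Hypothetical helper for illustration; not part of the ccp4i module.
    """

    def __init__(self, check, interval=0, clock=time.time):
        self._check = check        # callable performing the real (expensive) test
        self._interval = interval  # seconds for which a result stays valid
        self._clock = clock        # injectable clock, convenient for testing
        self._last_time = None     # when the check was last really performed
        self._last_result = None

    def __call__(self):
        now = self._clock()
        expired = (self._last_time is None
                   or self._interval <= 0
                   or now - self._last_time >= self._interval)
        if expired:
            self._last_result = self._check()
            self._last_time = now
        return self._last_result


# Demonstration with a fake clock and a counter standing in for a file check
calls = []
fake_now = [0.0]


def expensive_check():
    calls.append(1)
    return True


cached = CachedCheck(expensive_check, interval=5, clock=lambda: fake_now[0])
cached()             # first call: the real check is performed
fake_now[0] = 3.0
cached()             # within the interval: the cached value is returned
fake_now[0] = 6.0
cached()             # interval elapsed: the check is performed again
uncached = CachedCheck(expensive_check, interval=0, clock=lambda: fake_now[0])
```

With interval=5 the real check runs twice in three calls; the 'uncached' instance (interval=0) would run it on every call.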

Methods defined here:

__del__(self)
Perform cleanup when the object is deleted. This implementation ensures that the directories data is cleanly released on deletion.

__init__(self, user='', source='', interval=0)
Instantiate a new directories object. Optionally, specify a username 'user' (otherwise this defaults to the current user), a 'source' file (which defaults to the platform-specific directories.def file in the user's .CCP4 area), and an 'interval' (in seconds) specifying the maximum time that filesystem checks are taken to be current.

__nonzero__(self)
Returns True if data is loaded, False if not. This is a wrapper for the __loaded method of this class.

__repr__(self)
Return a string representation of the object, suitable for display to the programmer.

adddefdirref(self, def_dir, path)
Add a new default directory reference. Associate a default dir name 'def_dir' with the directory specified by 'path'. Returns True if the reference is successfully added and False on failure, for example if the data is not write accessible or if the default dir name is already being used.

addprojectref(self, alias, path, db_dir='')
Add a new project reference. Associate a project name 'alias' with the directory specified by 'path'. Optionally also specify the path to the database directory via the 'db_dir' argument; however, it is recommended not to set this as it is normally assigned automatically. Returns True if the project reference is successfully added and False on failure, for example if the data is not write accessible or if the project alias is already being used.

array(self)
Diagnostic method: return the underlying array object. This method returns the underlying CCP4i 'array' object that holds the actual data. It should not be necessary to access the array object directly under normal circumstances and the use of this method is not recommended.

auto_refresh(self)
Return the value of the auto_refresh flag. The 'auto_refresh' feature means that read attempts on a 'stale' database automatically cause the data to be refreshed from file in read-only mode. See the set_auto_refresh method for more information.

create(self)
Make a new directories file. This creates a new source file if one does not already exist. The object is automatically loaded once the file has been created.

defdir(self, def_dir)
Return the directory corresponding to a def dir name. Returns the full path of the directory or folder corresponding to the specified default directory name, or an empty string if the default dir name is not one of those currently stored in the object. Raises an AttributeError exception if the data does not have read access.

deletedefdirref(self, def_dir)
Remove a default directory reference. This removes the default directory name 'def_dir', plus all associated data, returning True on success and False on failure. The operation can fail if the data is not write accessible or if the named default dir is not found in the directories object.

deleteprojectref(self, alias)
Remove a project reference. This removes the project name 'alias' plus all associated data, returning True on success and False on failure. The operation can fail if the data is not write accessible or if the named project is not found in the directories object.

filenamecomponents(self, filename)
Break a file/directory name into a filename and alias. Given a file or directory name, this method returns the appropriate file and project/defdir names as a tuple with two elements, i.e. (filen,project). 'project' will be the alias of a project or default directory (or 'FULL_PATH' if there is no match). 'filen' will be the filename relative to the project or default directory, '' if the supplied file is actually a directory, or the input filename if the alias is returned as 'FULL_PATH'. This result is also returned if the data is not currently loaded.

fullfilename(self, filen, project)
Construct a full path from filename and project or def dir name. Given a filename 'filen' and an alias 'project' (which can be either a project alias, a default dir alias, the string 'FULL_PATH', or a blank string), this method attempts to construct a full filename based on the information stored in the directories object. The rules are: If the alias is FULL_PATH or blank, or if it is neither a project nor a default directory alias, then return the input filename as is. Otherwise append the filename to the directory path corresponding to the alias with the appropriate path separator. If the data has not been loaded into the object then an empty string is returned regardless of the inputs.
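
The rules for 'fullfilename' and 'filenamecomponents' can be sketched as follows. This is a minimal illustration, not the module's implementation: the function names and the plain 'aliases' dictionary (standing in for the project and default directory tables) are assumptions, and the 'not loaded' cases are left out.

```python
import os


def full_filename(filen, alias, aliases):
    """Join 'filen' onto the directory registered for 'alias'.

    'aliases' is a plain dict of alias -> directory (an assumption
    made for this sketch).
    """
    if alias in ("", "FULL_PATH") or alias not in aliases:
        return filen                     # returned as is
    return os.path.join(aliases[alias], filen)


def filename_components(path, aliases):
    """Return (filen, alias), or (path, 'FULL_PATH') if no alias matches."""
    for alias, directory in aliases.items():
        if path == directory:
            return ("", alias)           # the path is the directory itself
        prefix = directory + os.sep
        if path.startswith(prefix):
            return (path[len(prefix):], alias)
    return (path, "FULL_PATH")


# Hypothetical alias table used for the demonstration
aliases = {"MYPROJECT": os.path.join(os.sep, "data", "myproject")}
full = full_filename("native.mtz", "MYPROJECT", aliases)
parts = filename_components(full, aliases)
```

The two functions are inverses of each other for files that live under a known alias; nested aliases are not handled by this sketch.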

getmessage(self)
Access the last internal error message. Provides a way for the calling application to access the last error message set internally. Once the message has been accessed it is deleted.

haslock(self)
Check if the directories object owns the lock. This returns True if the process owns the lock file associated with the source file, and False if not (or if no lock exists). The check is only performed explicitly after a specified time interval - in between, the outcome of the check is cached and it is this cached value that is returned. The time interval is specified when the object is first created, and should be set to 0 to ensure that the value is never cached.

isdefdir(self, def_dir)
Determine if a name is a default directory name. Returns True if the specified name is the name of a default directory stored in the directories object, and False if it is not (or if the data is not read accessible).

isdefdirdir(self, dir)
Determine if a directory path is a default directory. If the specified directory corresponds to a default directory then returns the name of the default dir; otherwise returns an empty string. Raises an AttributeError exception if the data does not have read access.

isloaded(self)
Test if the data has been loaded from the file. The object is loaded with data from the source file by successfully executing the 'load' method.

islocked(self)
Test if the directories file is currently locked. This returns True if there is a lock file created by a lockfile object associated with the source file, and False otherwise. This method only reports the existence of the lock file; it doesn't check the ownership of the lock - for that, use the 'haslock' method.

islockedbyccp4i(self)
Check if the resource file is locked by a CCP4i process. This method returns True if the directories resource file is also locked by CCP4i, i.e. a CCP4i-style lock file is present in the same directory or folder as the resource file. If no such file is found then False is returned. CCP4i's lock files are called <filen>.LOCK. CCP4i itself does not recognise lock files created on behalf of this class, and may alter the data in the resource file at will. It is worth noting that CCP4i's lock on the directories.def file seems a little unpredictable in CCP4i version 1.4.4.1. The CCP4i lock is only created when CCP4i is performing certain operations. So the return value of False from this method doesn't necessarily indicate that CCP4i is not still running.

isproject(self, project)
Determine if a name is a project alias. Returns True if the specified name is the name of a project stored in the directories object, and False if it is not.

isprojectdir(self, dir)
Determine if a directory path is a project directory. If the specified directory corresponds to a project directory then returns the name of the project; otherwise returns an empty string. Raises an AttributeError exception if the data does not have read access.

isreadable(self)
Test if it is possible to get data from the object. The directories object is considered to be 'readable' provided that it is loaded with up-to-date data from the source file. If the source file is newer than the last time the data was loaded (and the 'auto refresh' mechanism is not enabled) then the data is not considered readable.

isstale(self)
Check if the resource file is newer than the stored data. This method returns True if the modification time of the resource file is more recent than the last save (or load) time. In this case, the data stored in the object may not be an accurate copy of the data in persistent storage. Otherwise it returns False (the data in the object is not 'stale'). The check is only performed explicitly after a specified time interval - in between, the outcome of the check is cached and it is this cached value that is returned. The time interval is specified when the object is first created, and should be set to 0 to ensure that the value is never cached.
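
The staleness test amounts to comparing the modification time of the resource file with the time of the last load or save. The sketch below assumes that comparison is made with os.path.getmtime; the 'is_stale' function and the timestamps are illustrative only and the interval caching is omitted.

```python
import os
import tempfile


def is_stale(resource, last_sync_time):
    """True if 'resource' was modified after 'last_sync_time' (epoch seconds)."""
    return os.path.getmtime(resource) > last_sync_time


# Demonstration on a temporary file with explicitly set modification times
fd, path = tempfile.mkstemp(suffix=".def")
os.close(fd)
os.utime(path, (1000, 1000))     # pretend the file was last written at t=1000
fresh = is_stale(path, 1000)     # data loaded at t=1000: not stale
os.utime(path, (2000, 2000))     # another process rewrites the file at t=2000
stale = is_stale(path, 1000)     # the file is now newer than the loaded data
os.remove(path)
```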

iswriteable(self)
Test if it is possible to modify data and save to file. Returns True if the data in the object can be modified (e.g. adding or deleting project directory references) and if the data can be saved to persistent storage (i.e. to the source data file). False is returned if this is not possible. Provided that the data has been loaded in the first place, and that this object owns the lock on the source file, modify-and-save should be possible. However if the source file is locked by CCP4i (which doesn't recognise the ccp4i module's lock files) or if the source file has been modified by some external process more recently than the last save from this object, then modify-and-save is not allowed.

listdefdirs(self)
Return a list of the default directory names. Returns a list of the default directory names i.e. the aliases currently loaded into the directories object, or raises an AttributeError exception if the data does not have read access.

listprojects(self)
Return a list of project names. Returns a list of the project names i.e. the aliases currently loaded into the directories object, or raises an AttributeError exception if the data does not have read access.

load(self, force=False, readonly=False)
Load the data from the resource file into the object. Invoking the 'load' method causes the data to be read from the resource file into the directories object, so that it can be accessed and manipulated using the methods in this class. This method returns True on successful load, and False on failure. Situations where the load operation will fail include: the resource file doesn't exist; the data has already been loaded successfully; the resource file is locked by another directories object. In the event that there is a lock on the resource file, setting the 'force' argument to True over-rides the existing lock essentially grabbing it for the current directories instance. Alternatively, setting the 'readonly' argument to True loads the data even if there is a lock from another process, but does not grab the lock - so the data can be read but not modified.
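
One reading of the 'load' rules is sketched below as a pure decision function. The function name and the outcome strings are hypothetical. Two points are assumptions that the text above does not settle: that 'force' takes precedence when both 'force' and 'readonly' are True, and that a read-only load of an unlocked file does not take the lock.

```python
def load_outcome(exists, loaded, locked_by_other, force=False, readonly=False):
    """Return 'fail', 'read-write' or 'read-only' for a load attempt.

    Illustrative only; precedence of 'force' over 'readonly' is assumed.
    """
    if not exists or loaded:
        return "fail"                # no file, or data already loaded
    if locked_by_other:
        if force:
            return "read-write"      # existing lock is overridden and taken over
        if readonly:
            return "read-only"       # data is loaded, lock is left alone
        return "fail"
    return "read-only" if readonly else "read-write"


outcomes = [
    load_outcome(exists=True, loaded=False, locked_by_other=False),
    load_outcome(exists=True, loaded=False, locked_by_other=True),
    load_outcome(exists=True, loaded=False, locked_by_other=True, force=True),
    load_outcome(exists=True, loaded=False, locked_by_other=True, readonly=True),
    load_outcome(exists=False, loaded=False, locked_by_other=False),
]
```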

logdir(self)
Return the value of the LOG_DIRECTORY data item. Raises an AttributeError exception if the data has not yet been loaded into the directories object.

n_defdirs(self)
Return the number of default directories. Returns the number of default directories, stored in the N_DEF_DIRS data item, or raises an AttributeError exception if the data does not have read access.

n_projects(self)
Return the number of projects. Returns the number of projects, stored in the N_PROJECTS data item, or raises an AttributeError exception if the data does not have read access.

project_menu(self)
Return the value of the PROJECT_MENU data item. Raises an AttributeError exception if the data has not yet been loaded into the directories object.

projectdb(self, project)
Return the database directory for a specific project. The database directory is a subdirectory of the project directory where the database.def file is stored. Typically this directory is called CCP4_DATABASE. An empty string is returned if the project name is not one of those currently stored in the object, and an AttributeError exception is raised if the data does not have read access.

projectdir(self, project)
Return the project directory for a specific project. Returns the full path of the directory or folder corresponding to the specified project name, or an empty string if the project name is not one of those currently stored in the object. Raises an AttributeError exception if the data does not have read access.

refresh(self, readonly=True)
Reload the data from the source directories.def file. Invoking this method forces the current data in the object to be overwritten by the contents of the source directories file on disk.

release(self)
Close the directories object and release the source file. Invoking this method saves any data to the persistent storage (the source file) and releases the lock on the source file. After doing this the object is no longer loaded and cannot operate on the data without the 'load' method being re-invoked. The object must have write access to the source file in order for the save to take effect, and must own the lock in order for it to surrender the lock. 'release' always works and always returns True, regardless of whether the save or unlocking operations were successful.

save(self, filen='')
Save the data in the directories object to file. Invoking this method causes the data in the object to be written out to the associated directories.def file. Returns True on success and False if the operation fails. Reasons for failure include: the directories data is not currently loaded; the data hasn't been updated since the last save; the directories object doesn't own the lock on the source file. Optionally, the data can be written out to a different file than the associated source file, by specifying a non-blank filename as the 'filen' argument.

set_auto_refresh(self, auto_refresh)
Set the auto_refresh flag. 'auto_refresh' takes a boolean value. Setting it to 'True' means that a read attempt on a 'stale' database (i.e. when the source database file is newer than the data in the object) will automatically invoke the 'refresh' method to load the data in read-only mode. Setting it to 'False' means that the 'refresh' method must be explicitly invoked.

setlogdir(self, log_dir)
Set the value of the LOG_DIRECTORY data item. The directories object must have write access. Warning: this method fails silently.

source(self)
Return the source filename associated with the object. This is the full path and name of the .def file holding the directories data that the directories object is associated with.

user(self)
Return the username associated with the object.

class extendedhistory(history)

    Extended history class for use with projectDB class. The extendedhistory class is intended for use with the projectDB class and adds support for querying the history for subjobs as well as 'top-level' jobs. The same methods are available as for the base 'history' class however they can be used for subjobs by specifying the subjob id using the x.y notation, e.g. history.parentsof(9) will return top-level jobs that are 'parents' of job 9, while history.parentsof(9.5) will return subjobs of job 9 that are 'parents' of job 9.5. Note that top-level jobs will not be returned as being linked to subjobs (or vice versa); neither will subjobs that belong to different top-level jobs.

Methods defined here:

__init__(self, database, directories)
Override initialisation of base class. This additionally sets up a dictionary for the histories of each of the subjob databases.

arelinked(self, job_id1, job_id2)
Override the test to see if two jobs are linked.

childrenof(self, job_id)
Override the listing of child ids for the current job.

construct(self)
Override the construction of the history. This additionally constructs histories for each of the subjob databases.

parentsof(self, job_id)
Override the listing of parent ids for the current job.

Methods inherited from history:

addlink(self, parent_job, child_job)
Record a link between two jobs. This method is used when populating the history object; it creates a parent-child link between the supplied job ids 'parent_job' and 'child_job'.

allchildrenof(self, job_id, result_ids=[])
Return a list of all child jobs descended from a specific job. Given a job number 'job_id' this method returns a list of all 'descendent' jobs, i.e. jobs that are immediate children of the specified job, plus jobs that are children of those children and so on. Essentially this returns the set of jobs descended from the specified job. The 'result_ids' argument is used internally when the method calls itself recursively - it is used to pass along and accumulate the results of earlier calls.

allparentsof(self, job_id, result_ids=[])
Return a list of all parent jobs that a specific job is descended from. Given a job number 'job_id' this method returns a list of all 'antecedent' jobs, i.e. jobs that are immediate parents of the specified job, plus jobs that are parents of those parents and so on. Essentially this returns the set of jobs that anteceded the specified job. The 'result_ids' argument is used internally when the method calls itself recursively - it is used to pass along and accumulate the results of earlier calls.

infiles(self, job_id)
Return a list of the input files associated with a job. Supplied with a job number 'job_id', this method returns a list of the full path names for each of the files.

outfiles(self, job_id)
Return a list of the output files associated with a job. Supplied with a job number 'job_id', this method returns a list of the full path names for each of the files.

update(self)
Update the relationship data between jobs in the database. This method updates descriptions of the relationships between jobs in the database. It differs from the construct method in that it does not clear any existing data before updating. It is possible therefore that 'phantom' links or jobs may persist between updates using this method.

updatelinks(self, job_id)
Not implemented. This method was intended to (re)generate the links associated with the specified job 'job_id', but has not been implemented.

class history

    CCP4i database history object. This class infers links between jobs in a project database based on the names of input and output files, thereby generating a 'history' for the project based on job linkage. Supplied with a loaded database object and a loaded directories object, use the construct method to populate the history object. Use the parentsof and childrenof methods to return a list of parent/child job id numbers for each job. Use the update method to find and add new relationships for an existing history. Use the arelinked method to find out if two job ids are linked. A 'parent' job is a job that has one or more output files that are used as input to a 'child' job. Jobs are referenced using their job ids. Relationships cannot span projects in this implementation. This implementation also doesn't accommodate links between jobs that are not due to data flow - however in principle it would be possible to add arbitrary links using the addlink method. Note that the contents of a history object reflect the state of the project at the time that the construct method was invoked. Subsequent updates to the project data (addition or deletion of jobs, addition or removal of file references etc) will not be reflected in the history object. To circumvent this, re-invoke either the construct or the update method as required to rebuild the object contents.
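
The parent/child inference described above can be sketched as follows: a job is a parent of another job if at least one of its output files appears among the other job's input files. The 'infer_links' function and the layout of the 'jobs' dictionary are assumptions made for illustration and do not reflect how the history class stores its data.

```python
def infer_links(jobs):
    """Return {job_id: set(child_ids)} from each job's input/output files.

    'jobs' maps a job id to a dict with 'infiles' and 'outfiles' lists of
    full path names (a hypothetical layout used only for this sketch).
    """
    children = dict((job_id, set()) for job_id in jobs)
    for parent_id, parent in jobs.items():
        outputs = set(parent["outfiles"])
        for child_id, child in jobs.items():
            # A shared file name between outputs and inputs creates a link
            if child_id != parent_id and outputs.intersection(child["infiles"]):
                children[parent_id].add(child_id)
    return children


jobs = {
    1: {"infiles": ["/p/raw.mtz"], "outfiles": ["/p/scaled.mtz"]},
    2: {"infiles": ["/p/scaled.mtz"], "outfiles": ["/p/phased.mtz"]},
    3: {"infiles": ["/p/phased.mtz", "/p/model.pdb"],
        "outfiles": ["/p/refined.pdb"]},
}
links = infer_links(jobs)
```

Here job 1 is a parent of job 2, and job 2 is a parent of job 3; job 3 has no children.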

Methods defined here:

__init__(self, database, directories)
Create a new history instance. The history class requires an open database object (or a subclass) supplied via the 'database' argument, and a loaded directories object supplied via the 'directories' argument.

addlink(self, parent_job, child_job)
Record a link between two jobs. This method is used when populating the history object; it creates a parent-child link between the supplied job ids 'parent_job' and 'child_job'.

allchildrenof(self, job_id, result_ids=[])
Return a list of all child jobs descended from a specific job. Given a job number 'job_id' this method returns a list of all 'descendent' jobs, i.e. jobs that are immediate children of the specified job, plus jobs that are children of those children and so on. Essentially this returns the set of jobs descended from the specified job. The 'result_ids' argument is used internally when the method calls itself recursively - it is used to pass along and accumulate the results of earlier calls.
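
The recursive accumulation can be sketched with a stand-alone function. The names and the 'children' dictionary are hypothetical. Note that the sketch uses None rather than a list as the default for 'result_ids': in Python a mutable default value is shared between calls, so a fresh accumulator is created for each top-level call instead. Whether the module's own methods are affected by this is not stated in the documentation.

```python
def all_children_of(children, job_id, result_ids=None):
    """Collect every job descended from 'job_id'.

    'children' maps a job id to a list of its immediate child ids.
    """
    if result_ids is None:
        result_ids = []          # fresh accumulator for each top-level call
    for child in children.get(job_id, []):
        if child not in result_ids:
            result_ids.append(child)
            # Recurse, passing along the results gathered so far
            all_children_of(children, child, result_ids)
    return result_ids


children = {1: [2, 3], 2: [4], 3: [4], 4: []}
descendants = all_children_of(children, 1)
```

Job 4 is reachable through both job 2 and job 3 but appears only once in the result.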

allparentsof(self, job_id, result_ids=[])
Return a list of all parent jobs that a specific job is descended from. Given a job number 'job_id' this method returns a list of all 'antecedent' jobs, i.e. jobs that are immediate parents of the specified job, plus jobs that are parents of those parents and so on. Essentially this returns the set of jobs that anteceded the specified job. The 'result_ids' argument is used internally when the method calls itself recursively - it is used to pass along and accumulate the results of earlier calls.

arelinked(self, job_id1, job_id2)
Check whether two jobs are linked. This method returns True if the two jobs 'job_id1' and 'job_id2' are linked by a parent-child relationship within the history object, and False if not. The order that the job ids are supplied in is not important.

childrenof(self, job_id)
Return a list of the child jobs for a specific job. Given a job number 'job_id', this method returns a list of the child job ids linked within the history object. Raises an IndexError if the job is not found.

construct(self)
Initialise the relationship data between jobs in the database. This method constructs descriptions of the relationships between jobs in the database. It must be invoked before querying the history object. The construct method erases any existing data before building the relationships from scratch. Note that the history object only reflects the state of the project at the time that the construct method was last invoked. If the contents of the project database (and possibly the directories object) are updated, it is therefore necessary either to reinvoke the construct method to rebuild the relationship data from scratch or to invoke the update method. In the current implementation there is little practical difference between the construct and update methods.

infiles(self, job_id)
Return a list of the input files associated with a job. Supplied with a job number 'job_id', this method returns a list of the full path names for each of the files.

outfiles(self, job_id)
Return a list of the output files associated with a job. Supplied with a job number 'job_id', this method returns a list of the full path names for each of the files.

parentsof(self, job_id)
Return a list of the parent jobs for a specific job. Given a job number 'job_id', this method returns a list of the parent job ids linked within the history object. Raises an IndexError if the job is not found.

update(self)
Update the relationship data between jobs in the database. This method updates descriptions of the relationships between jobs in the database. It differs from the construct method in that it does not clear any existing data before updating. It is possible therefore that 'phantom' links or jobs may persist between updates using this method.

updatelinks(self, job_id)
Not implemented. This method was intended to (re)generate the links associated with the specified job 'job_id', but has not been implemented.

class lockfile

    CCP4i lock file class. This class provides methods for creating and manipulating lock files associated with arbitrary files within CCP4i. The lockfile object manages a lock file associated with the resource file that is specified on instantiation. The lockfile name is the same as that of the resource file, with the extension '.LOCK' appended to it. The class provides methods to set up, check and remove the associated lock file.

Methods defined here:

__init__(self, filen, resource_desc='')
Instantiate a lockfile object. A lockfile object is associated with the file supplied by the 'filen' argument - the resource. By default the 'resource description' is taken as the filename with any leading path stripped off, but this can be overridden by supplying a description string via the 'resource_desc' argument.

filename(self)
Return the lockfile name (including the path).

haslock(self)
Check whether the resource is locked by this lockfile object. This method returns True provided that the resource is locked (i.e. a physical lock file exists) and that the lock file was created by this instance of the lockfile class. If either condition is not satisfied then the method returns False.

islocked(self)
Check whether the resource is locked. This method returns True if the resource file that the lockfile object is associated with is 'locked' (that is, a physical lock file exists), and False if it is not locked. Note that it is possible for the resource to be locked but for the lock to be owned by a different lockfile instance - in this case, 'islocked' will still indicate that the resource is locked. Use the 'haslock' method to check if the resource is actually locked by this instance of the lockfile class.

lock(self)
Lock the file resource. Attempts to create the physical lock file associated with the resource. This will fail if the resource is already locked (by any lockfile object, not necessarily only this one). Returns True if the lock is successfully created, False if there is a failure.

resource(self)
Return the name of the resource (i.e. file) being locked.

unlock(self, force=False)
Unlock a file resource. This method attempts to unlock a file resource that was previously locked, by removing the physical lock file. The operation will fail if the resource is not currently locked, or if the lock on the resource belongs to a different lockfile object - in this second case, setting the 'force' argument to True forces the removal of the lock file regardless of ownership. This method returns True on successful removal of the lock and False otherwise.
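
The lock/unlock behaviour documented for this class can be emulated with a short stand-in. SketchLock is hypothetical and is not the ccp4i lockfile class: in particular, tracking ownership with an in-memory flag and creating the lock file with os.O_EXCL are assumptions, since the documentation does not say how the module records ownership.

```python
import os
import tempfile


class SketchLock(object):
    """Stand-in emulating the documented lockfile behaviour."""

    def __init__(self, filen):
        self._lockfile = filen + ".LOCK"
        self._owned = False            # True only if this instance made the lock

    def islocked(self):
        return os.path.exists(self._lockfile)

    def haslock(self):
        return self._owned and self.islocked()

    def lock(self):
        try:
            # O_EXCL makes creation fail if the lock file already exists
            fd = os.open(self._lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except OSError:
            return False
        os.close(fd)
        self._owned = True
        return True

    def unlock(self, force=False):
        if not self.islocked() or not (self._owned or force):
            return False
        os.remove(self._lockfile)
        self._owned = False
        return True


resource = os.path.join(tempfile.mkdtemp(), "database.def")
first, second = SketchLock(resource), SketchLock(resource)
results = [
    first.lock(),               # True: lock file created
    second.lock(),              # False: resource is already locked
    second.unlock(),            # False: the lock belongs to 'first'
    second.unlock(force=True),  # True: removed regardless of ownership
    first.haslock(),            # False: the lock file no longer exists
]
```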

class projectDB(database)

    CCP4i project database class. This extends the base 'database' class to allow subjobs to be defined and manipulated. It accepts the same arguments on instantiation as the parent class. Subjob databases are opened in a 'lazy' fashion i.e. only when they are needed.
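
'Lazy' opening can be pictured as a cache that is filled on first request. The LazySubjobs class and the 'opener' callable below are hypothetical stand-ins used only to show the pattern; they are not how projectDB is necessarily implemented.

```python
class LazySubjobs(object):
    """Open subjob databases on first request and remember them."""

    def __init__(self, opener):
        self._opener = opener   # callable that opens a subjob database for a job id
        self._open = {}         # job id -> already opened subjob database

    def get(self, job):
        if job not in self._open:
            self._open[job] = self._opener(job)   # opened on first request only
        return self._open[job]

    def close(self):
        """Forget all opened subjob databases; return the ids that were open."""
        closed = sorted(self._open)
        self._open.clear()
        return closed


opened = []


def fake_opener(job):
    opened.append(job)          # record each real 'open' operation
    return {"job": job}


cache = LazySubjobs(fake_opener)
cache.get(3)
cache.get(3)                    # served from the cache, no second open
cache.get(7)
```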

Methods defined here:

__init__(self, project, directory, interval=0)
Initialise the project database.

addfile(self, jobid, ftype, dirs, filename, alias)
Add a file to the list of files for a job. Over-ride the 'addfile' method of the parent class in order to also deal with subjobs.

addsubjob(self, job, taskname, title, status='STARTING')
Add a subjob to a job in the project database. If this is the first subjob to be added to the job, then a new subjob database will be created first. The id number of the subjob will be returned.

buildjobid(self, job, subjob=None)
Build a single job id given the job and subjob components. This is a wrapper for the 'MakeJobid' function.

close(self)
Close the project database. This extends the 'close' method of the base class, by also closing all the opened subjob databases.

deletejob(self, jobid)
Delete a job or subjob from the project database. Over-ride the 'deletejob' method of the parent class. 'jobid' can be a job or subjob identifier.

describejob(self, jobid, itemlist, formatlist)
Return a formatted string populated with job data. This over-rides the method in the base class in order to be able to also deal with subjobs.

getdata(self, jobid, item)
Retrieve the value of a data item stored for a job. This over-rides the 'getdata' method in the parent class. 'jobid' can be either an id for a job in the main database, or for a subjob.

getsubjobdb(self, job)
Return an existing subjobDB object associated with a job. This returns a subjobDB object associated with a job id, or 'False' if a subjob database doesn't already exist for the specified job.

has_subjobs(self, job)
Check if a job currently has subjobs defined.

hasjob(self, jobid)
Checks if a job with the specified id exists in the database. Given a job id, returns True if the database contains a job with that id and False if not. Overrides the version in the parent class.

itemexists(self, jobid, item)
Check for the existence of a data item for a particular job. This over-rides the 'itemexists' method in the parent class. 'jobid' can be either an id for a job in the main database, or for a subjob.

listjobs(self, jobid=None)
Return a list of jobs or subjobs. Over-ride the 'listjobs' method of the parent class. If no jobid is supplied then return the list of top-level jobs in the project; if a jobid is supplied for a top-level job then return the list of associated subjobs.

listjobs_withsubjobs(self)
Return a list of the jobs which also have subjobs. This could be quite slow for large projects since it attempts to open subjob databases for every job, to test for their existence. Possibly the projectDB object could keep track of the subjob data internally to speed this up in future.

opensubjobdb(self, job, create=True)
Open a subjob database for the specified job id. Returns an open subjobDB object, or 'False' if no subjob could be opened. If the 'create' argument is True (the default) and the subjob database does not exist in the file system, a new subjob database will be created. To avoid this set the 'create' argument to False.

removefile(self, jobid, ftype, dirs, filename)
Remove a file from the list of files for a job or subjob. Over-ride the 'removefile' method of the parent class in order to also deal with subjobs.

save(self)
Save the project database to file. This extends the 'save' method of the base class, by also performing save operations on all of the opened subjob databases.

selectsubjobs(self, job, item, pattern)
Retrieve a list of subjobs based on some selection criterion.

setdata(self, jobid, item, value)
Set the value of a data item for the specified job or subjob. This over-rides the 'setdata' method in the parent class. 'jobid' can be either an id for a job in the main database, or for a subjob.

splitjobid(self, jobid)
Split a job id into job and subjob components. This is a wrapper for the 'SplitJobid' function.
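
The x.y notation for subjob ids can be sketched as below. The 'split_jobid' and 'build_jobid' functions are illustrative only; the actual behaviour of the module's 'SplitJobid' and 'MakeJobid' functions is not documented here and may differ. The sketch works on strings because a float cannot distinguish subjob 1 from subjob 10 (9.1 and 9.10 are the same number).

```python
def split_jobid(jobid):
    """Return (job, subjob); 'subjob' is None for a top-level job."""
    text = str(jobid)
    if "." in text:
        job, subjob = text.split(".", 1)
        return int(job), int(subjob)
    return int(text), None


def build_jobid(job, subjob=None):
    """Return the id as a string in x.y notation (or just x)."""
    if subjob is None:
        return str(job)
    return "%d.%d" % (job, subjob)


top_level = split_jobid(9)
nested = split_jobid("9.5")
round_trip = split_jobid(build_jobid(12, 10))
```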

Methods inherited from database:

__del__(self)
del(project) Saves data, releases lock and deletes the object.

__len__(self)
len(project) Returns the number of jobs stored in the project, or zero if the project is not loaded.

__nonzero__(self)
Returns True if the project is loaded, False otherwise.

__repr__(self)
Returns the project name.

addinputfile(self, job, dirs, filename, alias)
Add a file to the list of input files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

addoutputfile(self, job, dirs, filename, alias)
Add a file to the list of output files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

auto_refresh(self)
Return the value of the auto_refresh flag. The 'auto_refresh' feature means that read attempts on a 'stale' database automatically cause the data to be refreshed from file in read-only mode. See the set_auto_refresh method for more information.

create(self)
Make a new project. Creating a new project includes creating the project directory (if it doesn't already exist), and creating the database subdirectory and database.def file. The project is then opened and the project object is returned. The create operation will fail (returning False) if the project already exists.

describe(self, joblist, itemlist, formatlist)
Return a list of formatted strings populated with job data. For each job id number in the joblist, the describe method retrieves the values of each of the data items in the itemlist and creates a description string by placing those values into fields with character widths supplied in the formatlist. It returns a list of these strings, with one list item per job. Allowed data items can be any stored in the project database plus JOB_ID (the id number of the job), which is an implicit data item.

exists(self)
Test if the project exists. Returns True if the project directory and database.def file both exist, and False otherwise.

getDbFile(self)
Return the database file for the project.

getDbItems(self)
Return a list of the data items stored for all jobs.

getDbdir(self)
Return the database directory of the project.

getProjectname(self)
Return the name of the project.

getmessage(self)
Retrieve last internal error message. If an operation fails then the client application can access the last error message using this method. Invoking getmessage clears the error message.

haslock(self)
Test whether the project is locked by this process. Returns True if the project database.def file has a lockfile owned by this process, and False otherwise. The check is only performed after a specified time interval; in between checks the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.

isloaded(self)
Test whether the project data has been loaded. Returns True if the database object has been populated from the file on disk.

islocked(self)
Test whether the project is locked. Returns True if the project database.def file has a lockfile from any process, and False otherwise.

isreadable(self)
Test if it is possible to get data from the object. Returns True if the database object has been loaded, and has data that is at least as recent as the persistent storage on disk.

isstale(self)
Check if the data in the object is older than on disk. Returns True if the modification time of the resource file is more recent than the last save, which suggests that the data in this object may not accurately mirror the data in the persistent storage. The check is only performed after a specified time interval; in between checks the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.
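
The interval-based caching described for isstale can be sketched as follows. This is an illustration only, not the module's code: the StaleChecker class, its attribute names and the touch_later helper are hypothetical, and the file's modification time at construction stands in for the "last save" time.

```python
import os
import tempfile
import time


class StaleChecker(object):
    """Hypothetical helper: caches an mtime comparison for 'interval' seconds."""

    def __init__(self, path, interval=0):
        self.path = path
        self.interval = interval
        # Stand-in for the time of the last save: the file's mtime right now
        self.last_save = os.path.getmtime(path)
        self._checked_at = None
        self._cached = False

    def isstale(self):
        now = time.time()
        # Within the interval, return the cached answer without touching disk
        if (self.interval > 0 and self._checked_at is not None
                and now - self._checked_at < self.interval):
            return self._cached
        # Otherwise compare the file's mtime with the recorded save time
        self._checked_at = now
        self._cached = os.path.getmtime(self.path) > self.last_save
        return self._cached


def touch_later(path, seconds):
    """Push the file's modification time forward, as an external writer would."""
    mtime = os.path.getmtime(path) + seconds
    os.utime(path, (mtime, mtime))


handle, demo_path = tempfile.mkstemp(suffix=".def")
os.close(handle)

uncached = StaleChecker(demo_path, interval=0)
cached = StaleChecker(demo_path, interval=3600)
before = (uncached.isstale(), cached.isstale())
touch_later(demo_path, 100)
after = (uncached.isstale(), cached.isstale())
os.remove(demo_path)
```

With an interval of 0 the change on disk is seen immediately, while the checker with a long interval keeps returning its cached answer until the interval expires.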

iswriteable(self)
Test if it is possible to modify and save the data. Returns True if the database object has been loaded, owns the lock on the persistent storage on disk, and has data that is at least as recent as the persistent storage.

listfiles(self, job, ftype, dirs)
Return a list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file list should be the associated 'input' or 'output' files. 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listinputfiles(self, job, dirs)
Return a list of input files for a job. The job is specified by an id number, 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listoutputfiles(self, job, dirs)
Return a list of output files for a job. The job is specified by an id number, 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

newjob(self, taskname='', status='STARTING', title='')
Create a new job record in the project. Creates a new entry in the current project with the current date and returns the id number for the new job. Optionally the taskname (TASKNAME data item), the status (STATUS item) and/or title (TITLE item) may also be set when the job is created. If no taskname is specified then it is set to 'none'; if no status is specified then it is set to 'STARTING'. (Note that blank values of the taskname can crash CCP4i and cause corruption of the database.) The operation will fail and return -1 if the project is not loaded, or if the current object has lost the lock on the database.def file.

njobs(self)
Return the value of the NJOBS data item. For historical reasons, the NJOBS data item records the current highest job id number used in the project, and is not generally the actual number of jobs in the project. To get the actual number of jobs in a project object, use len(project). Returns -1 if the project has not been correctly loaded.
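
The difference between NJOBS and the actual job count can be illustrated with a toy model. The TinyProject class below is hypothetical and only mimics the bookkeeping described above; it assumes that a new job takes the id NJOBS + 1 and that deleting a job never lowers NJOBS.

```python
class TinyProject(object):
    """Hypothetical stand-in for a project: tracks NJOBS and the live jobs."""

    def __init__(self):
        self.njobs_item = 0   # plays the role of the NJOBS data item
        self.jobs = {}        # job id -> record

    def newjob(self):
        # Assumption: the next id is one more than the highest id ever used
        self.njobs_item += 1
        self.jobs[self.njobs_item] = {}
        return self.njobs_item

    def deletejob(self, job):
        # Removing a job leaves the highest-id counter untouched
        del self.jobs[job]

    def njobs(self):
        return self.njobs_item

    def __len__(self):
        return len(self.jobs)


project = TinyProject()
for _ in range(3):
    project.newjob()
project.deletejob(2)

highest_id = project.njobs()   # highest id used so far
job_count = len(project)       # jobs actually present
```

After three jobs are created and one is deleted, njobs() still reports 3 while len(project) reports 2.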

open(self, grablock=False, readonly=False, strict=False)
Open a project. Read the data from the project database and return the project object. The open operation will fail (return False) if the project is locked, if the project is already loaded, or if the project doesn't exist. Set the 'grablock' argument to True to override any existing lock and force loading of the database. Specifying 'readonly' as True populates the database object from the file on disk without attempting to lock the resource. However it will clear an existing lock, so the process may still own the lock even if it is in 'read-only' mode. If the 'strict' flag is true then header information about the project name and database directory are checked against the internal values, and warnings issued if there is a discrepancy.

refresh(self, grablock=False, readonly=True)
Reload the data from the file into a database object. Use the 'refresh' method to re-read the data from the file on disk into an existing (loaded) database object. This is necessary for example if the file is updated by an external process. Essentially the database object is closed and then reopened. By default the data is reloaded in 'readonly' mode. Set the 'grablock' argument to True to override any existing lock and force loading of the database in write mode - see the 'open' method.

removeinputfile(self, job, dirs, filename)
Remove a file from the list of input files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

removeoutputfile(self, job, dirs, filename)
Remove a file from the list of output files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

selectjobs(self, item, pattern)
Retrieve a list of jobs based on some selection criterion. Returns a list of jobs for which the value of the given data item matches the supplied regular expression pattern.
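
A minimal sketch of this kind of selection, using a plain dictionary in place of the project database. The function name select_jobs and the sample records are hypothetical, and the use of re.search (rather than re.match) is an assumption, since the matching mode is not stated above.

```python
import re


def select_jobs(jobs, item, pattern):
    """Return the ids of jobs whose 'item' value matches the regular expression."""
    regex = re.compile(pattern)
    selected = []
    for job_id in sorted(jobs):
        record = jobs[job_id]
        # Jobs without the item are skipped rather than treated as errors
        if item in record and regex.search(str(record[item])):
            selected.append(job_id)
    return selected


jobs = {
    1: {"TASKNAME": "refmac5", "STATUS": "FINISHED"},
    2: {"TASKNAME": "mtzdump", "STATUS": "FAILED"},
    3: {"TASKNAME": "refmac5", "STATUS": "FINISHED"},
}
finished = select_jobs(jobs, "STATUS", "^FIN")
refinement = select_jobs(jobs, "TASKNAME", "refmac")
```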

set_auto_refresh(self, auto_refresh)
Set the auto_refresh flag. 'auto_refresh' takes a boolean value. Setting it to 'True' means that a read attempt on a 'stale' database (i.e. when the source database file is newer than the data in the object) will automatically invoke the 'refresh' method to load the data in read-only mode. Setting it to 'False' means that the 'refresh' method must be explicitly invoked.

updatetime(self, job)
Update the time of the job to be the current time. This wraps the setdata method to automatically set the DATE attribute of the job specified by the id number, and returns the result of the setdata operation.

class subjobDB(database)

    CCP4i subjob database class. This stores the job data for substeps defined for a job in the main project directory. The subjob data are stored in a subdirectory of the project database directory, called '<jobid>_database'. The name of the project database directory must be given explicitly as the 'dbdir' argument. Note that the subjob databases differ from the base class databases, in that when a subjob db is closed it will automatically delete the persistent storage (i.e. the subdirectory and database.def file) if there are no subjobs stored at this point. The subjobDB class accepts an 'interval' argument on instantiation which performs the same function as in the base class.

Methods defined here:

__init__(self, job, taskname, dbdir, interval=0)
Initialise the subjob database. The calling process should include the taskname and job id number of the main job to which the subjobs belong.

close(self)
Close the subjob database. This extends the 'close' method of the base class. If there are no subjobs defined at the point of closure then the subjob database is removed.

getId(self)
Return the identifier for the subjob database.

Methods inherited from database:

__del__(self)
del(project) Saves data, releases lock and deletes the object.

__len__(self)
len(project) Returns the number of jobs stored in the project, or zero if the project is not loaded.

__nonzero__(self)
Returns True if the project is loaded, False otherwise.

__repr__(self)
Returns the project name.

addfile(self, job, ftype, dirs, filename, alias)
Add a file to the list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file should be associated as an 'input' or 'output' file. 'filename' specifies the name of the file to be added to the list, and can be either a full path or a relative path. 'alias' specifies a project alias to be associated with the file. The alias is only used to resolve relative paths in the filename, and is ignored if a full path is supplied for the file being added. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem. Note that if the file name already appears in the list of files then this is not an error.

addinputfile(self, job, dirs, filename, alias)
Add a file to the list of input files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

addoutputfile(self, job, dirs, filename, alias)
Add a file to the list of output files for a job. This function wraps the addfile method: 'job' is a job number, 'dirs' is a directories object, 'filename' is an absolute or relative path, and 'alias' is an optional project name which is used to resolve relative paths but is otherwise ignored. The method returns True on success and False otherwise.

auto_refresh(self)
Return the value of the auto_refresh flag. The 'auto_refresh' feature means that read attempts on a 'stale' database automatically cause the data to be refreshed from file in read-only mode. See the set_auto_refresh method for more information.

create(self)
Make a new project. Creating a new project includes creating the project directory (if it doesn't already exist), and creating the database subdirectory and database.def file. The project is then opened and the project object is returned. The create operation will fail (returning False) if the project already exists.

deletejob(self, job)
Remove an existing job record from the project. The job with the specified id number will be removed from the project, also removing all associated data, returning True on success. The operation will fail if the database is not loaded, if the object has lost the lock on the database.def file, or if a job with the specified id number is not found.

describe(self, joblist, itemlist, formatlist)
Return a list of formatted strings populated with job data. For each job id number in the joblist, the describe method retrieves the values of each of the data items in the itemlist and creates a description string by placing those values into fields with character widths supplied in the formatlist. It returns a list of these strings, with one list item per job. Allowed data items can be any stored in the project database plus JOB_ID (the id number of the job), which is an implicit data item.

describejob(self, job, itemlist, formatlist)
Return a formatted string populated with job data. For the specified job id number, the describejob method retrieves the values of each of the data items in the itemlist and creates a description string by placing those values into fields with character widths supplied in the formatlist. Allowed data items can be any stored in the project database plus JOB_ID (the id number of the job), which is an implicit data item.
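
The field-width formatting used by describe and describejob can be sketched like this. The function describe_job and the sample records are hypothetical; the sketch assumes that values are left-justified and padded to the given width without truncation, which the description above does not specify.

```python
def describe_job(record, job_id, itemlist, formatlist):
    """Build one fixed-width description line for a job record (a dict)."""
    fields = []
    for item, width in zip(itemlist, formatlist):
        # JOB_ID is implicit: it is the job's id, not a stored data item
        value = job_id if item == "JOB_ID" else record[item]
        # Assumption: left-justify and pad to the requested width
        fields.append(str(value).ljust(width))
    return "".join(fields)


jobs = {
    1: {"TASKNAME": "refmac5", "STATUS": "FINISHED"},
    2: {"TASKNAME": "mtzdump", "STATUS": "FAILED"},
}
items = ["JOB_ID", "TASKNAME", "STATUS"]
widths = [4, 10, 10]

# One string per job, as the describe method is said to return
lines = [describe_job(jobs[j], j, items, widths) for j in sorted(jobs)]
```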

exists(self)
Test if the project exists. Returns True if the project directory and database.def file both exist, and False otherwise.

getDbFile(self)
Return the database file for the project.

getDbItems(self)
Return a list of the data items stored for all jobs.

getDbdir(self)
Return the database directory of the project.

getProjectname(self)
Return the name of the project.

getdata(self, job, item)
Retrieve the value of a data item stored for a job. The job is specified by an id number. If the job doesn't exist, or if the item is otherwise inaccessible as indicated by a call to the itemexists method, then the operation raises an IndexError exception. Otherwise, the specific value of the data item for the specified job id is returned (or the 'generic' value, if a specific value was not found). If the database is not readable then 'None' is returned.

getmessage(self)
Retrieve last internal error message. If an operation fails then the client application can access the last error message using this method. Invoking getmessage clears the error message.

hasjob(self, job)
Checks if a job with the specified id exists in the database. Given a job id, returns True if the database contains a job with that id and False if not.

haslock(self)
Test whether the project is locked by this process. Returns True if the project database.def file has a lockfile owned by this process, and False otherwise. The check is only performed after a specified time interval; in between checks the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.

isloaded(self)
Test whether the project data has been loaded. Returns True if the database object has been populated from the file on disk.

islocked(self)
Test whether the project is locked. Returns True if the project database.def file has a lockfile from any process, and False otherwise.

isreadable(self)
Test if it is possible to get data from the object. Returns True if the database object has been loaded, and has data that is at least as recent as the persistent storage on disk.

isstale(self)
Check if the data in the object is older than on disk. Returns True if the modification time of the resource file is more recent than the last save, which suggests that the data in this object may not accurately mirror the data in the persistent storage. The check is only performed after a specified time interval; in between checks the cached value is returned. Set the time interval to 0 to ensure that the value is never cached.

iswriteable(self)
Test if it is possible to modify and save the data. Returns True if the database object has been loaded, owns the lock on the persistent storage on disk, and has data that is at least as recent as the persistent storage.

itemexists(self, job, item)
Check for the existence of a data item for a particular job. Returns True if the data item is found for the specified job id number. If the item is not found for the specified job id, but was defined in the template database.def file, then this method also returns True. Otherwise, itemexists returns False, indicating that the item is not accessible for the specified job. itemexists will also return False in the event that the project is not loaded.
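
The relationship between itemexists and getdata (a job-specific value first, then the 'generic' value from the template) can be sketched with two dictionaries. The names item_exists, get_data, jobs and template are hypothetical, and the sketch assumes that a missing job makes every item inaccessible.

```python
def item_exists(jobs, template, job, item):
    """True if the item is stored for the job, or is defined in the template."""
    if job not in jobs:
        # Assumption: nothing is accessible for a job that does not exist
        return False
    return item in jobs[job] or item in template


def get_data(jobs, template, job, item):
    """Return the job-specific value, falling back to the template value."""
    if not item_exists(jobs, template, job, item):
        raise IndexError("no item %r for job %r" % (item, job))
    if item in jobs[job]:
        return jobs[job][item]
    return template[item]


template = {"STATUS": "", "TITLE": ""}   # generic values from the template
jobs = {1: {"STATUS": "FINISHED"}}       # job-specific values

specific = get_data(jobs, template, 1, "STATUS")
generic = get_data(jobs, template, 1, "TITLE")
```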

listfiles(self, job, ftype, dirs)
Return a list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file list should be the associated 'input' or 'output' files. 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listinputfiles(self, job, dirs)
Return a list of input files for a job. The job is specified by an id number, 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

listjobs(self)
Return the list of all jobs in the project. The list is an unsorted list of job id numbers, or an empty list if the project is not currently loaded.

listoutputfiles(self, job, dirs)
Return a list of output files for a job. The job is specified by an id number, 'dirs' is a ccp4i directories object. Returns a list with the full filenames for each of the files associated with the job, or a blank list if there are no files.

newjob(self, taskname='', status='STARTING', title='')
Create a new job record in the project. Creates a new entry in the current project with the current date and returns the id number for the new job. Optionally the taskname (TASKNAME data item), the status (STATUS item) and/or title (TITLE item) may also be set when the job is created. If no taskname is specified then it is set to 'none'; if no status is specified then it is set to 'STARTING'. (Note that blank values of the taskname can crash CCP4i and cause corruption of the database.) The operation will fail and return -1 if the project is not loaded, or if the current object has lost the lock on the database.def file.

njobs(self)
Return the value of the NJOBS data item. For historical reasons, the NJOBS data item records the current highest job id number used in the project, and is not generally the actual number of jobs in the project. To get the actual number of jobs in a project object, use len(project). Returns -1 if the project has not been correctly loaded.

open(self, grablock=False, readonly=False, strict=False)
Open a project. Read the data from the project database and return the project object. The open operation will fail (return False) if the project is locked, if the project is already loaded, or if the project doesn't exist. Set the 'grablock' argument to True to override any existing lock and force loading of the database. Specifying 'readonly' as True populates the database object from the file on disk without attempting to lock the resource. However it will clear an existing lock, so the process may still own the lock even if it is in 'read-only' mode. If the 'strict' flag is true then header information about the project name and database directory are checked against the internal values, and warnings issued if there is a discrepancy.

refresh(self, grablock=False, readonly=True)
Reload the data from the file into a database object. Use the 'refresh' method to re-read the data from the file on disk into an existing (loaded) database object. This is necessary for example if the file is updated by an external process. Essentially the database object is closed and then reopened. By default the data is reloaded in 'readonly' mode. Set the 'grablock' argument to True to override any existing lock and force loading of the database in write mode - see the 'open' method.

removefile(self, job, ftype, dirs, filename)
Remove a file from the list of files for a job. The job is specified by an id number. 'ftype' specifies whether the file is to be removed from the 'input' or 'output' file list. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

removeinputfile(self, job, dirs, filename)
Remove a file from the list of input files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

removeoutputfile(self, job, dirs, filename)
Remove a file from the list of output files for a job. The job is specified by an id number. 'dirs' is a ccp4i directories object. Returns True if the operation was successful, False if there was a problem.

save(self)
Save the project database contents to file. The save method writes the data in the object to persistent storage (i.e. the database.def file for the project). On successful completion True is returned. The save operation will fail if the project is not loaded, or if the object has lost the lock on the database.def file; in both cases False is returned.

selectjobs(self, item, pattern)
Retrieve a list of jobs based on some selection criterion. Returns a list of jobs for which the value of the given data item matches the supplied regular expression pattern.

set_auto_refresh(self, auto_refresh)
Set the auto_refresh flag. 'auto_refresh' takes a boolean value. Setting it to 'True' means that a read attempt on a 'stale' database (i.e. when the source database file is newer than the data in the object) will automatically invoke the 'refresh' method to load the data in read-only mode. Setting it to 'False' means that the 'refresh' method must be explicitly invoked.

setdata(self, job, item, value)
Set the value of a data item for the specified job. The job is specified by an id number. The operation returns True if the value was successfully updated and False if the database is not writeable, or if the data item is valid but could not be updated. An IndexError exception will be raised if the data item is invalid, either because the job doesn't exist or because the data item doesn't exist for the job, as indicated by a call to the itemexists method. Note that the validity of the new value is not checked, except for the TASKNAME data item, which must not be blank.

updatetime(self, job)
Update the time of the job to be the current time. This wraps the setdata method to automatically set the DATE attribute of the job specified by the id number, and returns the result of the setdata operation.

Functions

DirExists(directory)
Check for the existence of a directory. This will also return a negative result if the target exists but is not a directory.

FileExists(filen)
Check for the existence of a file.

FileMtime(filen)
Return the modification time of a file.

FindExecutable(program)
Return the full path for the named program. FindExecutable operates by looking for the named 'program' in each directory in the user's path. Under Windows it appends '.exe' automatically. If the program file is found then the full path is returned, otherwise the program name is returned.
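
The path search described above can be sketched as follows. The function find_executable and its 'path' and 'windows' parameters are hypothetical (they exist only to make the sketch testable); the sketch also assumes that '.exe' is appended only when it is not already present.

```python
import os


def find_executable(program, path=None, windows=None):
    """Look for 'program' in each directory of the search path."""
    if windows is None:
        windows = (os.name == "nt")
    if path is None:
        path = os.environ.get("PATH", "")
    name = program
    # Assumption: only add the extension if the caller has not supplied it
    if windows and not name.lower().endswith(".exe"):
        name = name + ".exe"
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # Not found: fall back to the bare program name, as described above
    return program
```

Returning the bare name on failure means the result can still be handed to the shell, which will then report the missing program itself.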

FormatDate(epoch)
Return an epoch formatted as a date. Given an epoch value in seconds this returns a more user-friendly string representation similar to that used in CCP4i: if the epoch is a time within the last 24 hours then it is returned as 'hours:minutes:seconds'; otherwise it is returned as 'day month year'.
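
A sketch of the two-branch formatting rule. The function format_date and its optional 'now' argument are hypothetical, and the exact strftime patterns ('%H:%M:%S' and '%d %b %Y') are assumptions chosen to match the description.

```python
import time


def format_date(epoch, now=None):
    """Format an epoch as a time of day if recent, otherwise as a date."""
    if now is None:
        now = time.time()
    local = time.localtime(epoch)
    # Times within the last 24 hours are shown as hours:minutes:seconds
    if 0 <= now - epoch < 24 * 60 * 60:
        return time.strftime("%H:%M:%S", local)
    # Anything older is shown as day, month and year
    return time.strftime("%d %b %Y", local)


reference = 1000000000.0
recent = format_date(reference - 60, now=reference)
older = format_date(reference - 3 * 24 * 60 * 60, now=reference)
```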

GetAbsolutePath(filen)
Return the absolute path, assuming that relative paths are rooted at the current working directory.

GetDate()
Return the current date. The date is in the format e.g. 27 Feb 2006 15:05:34.

GetDbDir(projectdir)
Return the name of the CCP4i database directory for a project. Given the project directory path, the name of the default CCP4i database subdirectory is returned.

GetDbFile(projectdir)
Return the name of the database.def file. The sole argument is the name of the project directory.

GetDefDate()
Return the current date for a def file header. The date is in the format dd/mm/yy HH:MM:SS.
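
The two date formats quoted for GetDate and GetDefDate correspond to the following strftime patterns. The function names get_date and get_def_date and the optional 't' argument are hypothetical; the patterns are inferred from the example formats in the descriptions.

```python
import time


def get_date(t=None):
    """Date in the style '27 Feb 2006 15:05:34' (month name depends on locale)."""
    return time.strftime("%d %b %Y %H:%M:%S", time.localtime(t))


def get_def_date(t=None):
    """Date in the def file header style 'dd/mm/yy HH:MM:SS'."""
    return time.strftime("%d/%m/%y %H:%M:%S", time.localtime(t))


# Build an epoch for a known local time so that the result is predictable
sample = time.mktime((2006, 2, 27, 15, 5, 34, 0, 0, -1))
long_form = get_date(sample)
header_form = get_def_date(sample)
```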

GetDotCCP4()
Return the location of the .CCP4 directory.

GetEnvVar(var)
Return the value of an environment variable.

GetFileRootname(filen)
Return the rootname of a filename.

GetOPSYS()
Return 'WINDOWS' or 'UNIX', depending on the operating system.

GetOpsys()
Return 'windows' or 'unix', depending on the operating system.

GetPid()
Return the current process id.

GetPlatform()
Return the platform information.

GetUserId()
Return the username of the current user. This function returns the value of the environment variable USER, which should be set on UNIX type systems. If USER is not set then USERNAME is retrieved, which should be set on Windows based systems. Otherwise 'None' will be returned.
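
The USER then USERNAME fallback can be sketched as below. The function get_user_id and its 'environ' parameter are hypothetical; passing a dictionary makes the lookup order easy to check without changing the real environment.

```python
import os


def get_user_id(environ=None):
    """Return USER if set, else USERNAME if set, else None."""
    if environ is None:
        environ = os.environ
    for var in ("USER", "USERNAME"):
        if var in environ:
            return environ[var]
    return None


unix_style = get_user_id({"USER": "pjx"})
windows_style = get_user_id({"USERNAME": "pjx"})
neither = get_user_id({})
```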

InitialiseDotCCP4()
Create or update the .CCP4 directory. This function replicates a subset of the functionality of the InitialiseDotCCP4 procedure in CCP4i's system.tcl.

ListDbItems()
Return a template list of the items in a database record.

MakeDbFile(dbfile, project)
Make a new CCP4i database.def file. The arguments are the absolute filename and the name of the project. A database file will be generated from the template database.def file in the CCP4 distribution.

MakeDir(directory)
Create a new directory.

MakeJobid(jobid, subjobid=None)
Return a job id from job and (optionally) subjob components. Given a job id number 'jobid' and (optionally) a subjob id number 'subjobid', construct a compound job id. If no subjob id is supplied then the result is just the job id; if a subjob id is also supplied then the compound id is returned that uniquely identifies the subjob as a child of the job.

SearchPath(topname, *elements)
Return the full path name for a CCP4i code- or data-file. This function is based on the CCP4i 'SearchPath' command. 'topname' should be either 'TOP' or 'HELP', with the remaining elements being subdirectories (typically ending with a filename). SearchPath will then look for the specified file first in the user's .CCP4, then in the DBCCP4I_TOP area, and finally in the CCP4I_TOP area. It will return the first match that it finds. For example: SearchPath('TOP','etc','database.def') will look in $USER/.CCP4I/CCP4I_TOP/etc/, $DBCCP4I_TOP/etc/ and $CCP4I_TOP/etc/ for the file 'database.def'.
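
The "first match wins" lookup can be sketched generically. The function search_path_sketch is hypothetical: it takes an explicit, ordered list of root directories rather than deriving them from 'topname' and the environment, and it returns None when nothing is found, which is an assumption.

```python
import os
import shutil
import tempfile


def search_path_sketch(roots, *elements):
    """Return the first existing file built from a root plus the path elements."""
    for root in roots:
        candidate = os.path.join(root, *elements)
        if os.path.isfile(candidate):
            return candidate
    return None


# Two temporary roots standing in for a user area and a distribution area
user_root = tempfile.mkdtemp()
dist_root = tempfile.mkdtemp()
os.makedirs(os.path.join(dist_root, "etc"))
dist_file = os.path.join(dist_root, "etc", "database.def")
open(dist_file, "w").close()

# Only the distribution area has the file, so that copy is returned
found = search_path_sketch([user_root, dist_root], "etc", "database.def")
missing = search_path_sketch([user_root, dist_root], "etc", "nosuch.def")

shutil.rmtree(user_root)
shutil.rmtree(dist_root)
```

Placing the user area first in the list is what lets a user's own copy of a file shadow the distributed one.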

SplitJobid(jobid)
Split a job id into job and subjob components. Job ids can be either single (integer) numbers, in which case they are taken as referring to jobs in the main project database, or they can be a pair of numbers joined with a dot '.', in which case they are taken as referring to subjobs. The first number is the main job number and the second the subjob number. This method returns a tuple containing two items. The first is the job number, and the second is either a subjob number or 'None'.
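
The compound id convention used by MakeJobid and SplitJobid can be sketched as a round trip. The functions make_jobid and split_jobid are hypothetical re-implementations; returning the compound id as a string and the components as integers is an assumption, since the return types are not stated above.

```python
def make_jobid(jobid, subjobid=None):
    """Join job and optional subjob numbers with a dot, e.g. 12 and 3 -> '12.3'."""
    if subjobid is None:
        return str(jobid)
    return "%s.%s" % (jobid, subjobid)


def split_jobid(jobid):
    """Split a job id into (job, subjob); subjob is None for a main job."""
    parts = str(jobid).split(".", 1)
    job = int(parts[0])
    subjob = int(parts[1]) if len(parts) == 2 else None
    return (job, subjob)


compound = make_jobid(12, 3)
main_only = split_jobid(7)
round_trip = split_jobid(compound)
```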

simulateFileAccessDelay()
Simulate a delay in a file access operation. This is used for testing purposes only; the value of the delay should be zero in normal usage.

tokenise(line)
Tokenise a string and return a list. Typically the string will be a line from a CCP4i def file. Tokens are delimited by whitespace, by double quotes, and by curly braces '{' and '}', with quoted whitespace within a token not counting as a delimiter. The tokens are returned with the surrounding quoting characters and spaces removed.
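
The tokenising rules can be sketched with a small character-by-character scanner. The function tokenise_sketch is a hypothetical re-implementation, not the module's code: it does not handle nested braces or escaped quotes, and it keeps the text inside quotes or braces exactly as written.

```python
def tokenise_sketch(line):
    """Split a line on whitespace, treating "..." and {...} as single tokens."""
    tokens = []
    current = []
    closer = None   # closing character awaited while inside quotes or braces

    def flush():
        # Emit the pending unquoted token, if there is one
        if current:
            tokens.append("".join(current))
            del current[:]

    for ch in line:
        if closer is not None:
            if ch == closer:
                # End of a quoted token; an empty one is kept as ''
                tokens.append("".join(current))
                del current[:]
                closer = None
            else:
                current.append(ch)
        elif ch == '"':
            flush()
            closer = '"'
        elif ch == "{":
            flush()
            closer = "}"
        elif ch.isspace():
            flush()
        else:
            current.append(ch)
    flush()
    return tokens


simple = tokenise_sketch('TITLE _text "My first job"')
braces = tokenise_sketch('A {b c} ""')
```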

version()
Return the version number of the module extracted from the RCS string.

Data

__version__ = '$Revision: 1.7 $'