I/O

MakeProject

Object Merging

We introduced a new explicit interface for providing merging capability. If a class has a method with the name and signature:
   Long64_t Merge(TCollection *input, TFileMergeInfo*);
it will be used by a TFileMerger (and thus by PROOF) to merge one or more other objects into the current object. Merge should return a negative value if the merging failed.

If this method does not exist, the TFileMerger will use a method with the name and signature:

   Long64_t Merge(TCollection *input);
TClass now provides quick access to these merging functions via TClass::GetMerge. The wrapper function is automatically created by rootcint and can be installed via TClass::SetMerge. The wrapper function should have the signature/type ROOT::MergeFunc_t:
   Long64_t (*)(void *thisobj, TCollection *input, TFileMergeInfo*);
We added the new Merge function to TTree and THStack. We also added the new Merge function to TQCommand, as the existing TQCommand::Merge does not have the right semantics (in part because TQCommand is a collection).
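The wrapper mechanism described above can be sketched without ROOT. The following is a minimal illustration only: MockCollection, MockMergeInfo, MockClass, Counter and CounterMergeWrapper are hypothetical stand-ins, not the real TCollection, TFileMergeInfo or TClass, and only the shape of the function pointer follows ROOT::MergeFunc_t.

```cpp
#include <cstdint>
#include <vector>

// Stand-in types for illustration only; these are NOT the ROOT classes.
using Long64_t = std::int64_t;
struct MockCollection { std::vector<const void *> items; };
struct MockMergeInfo { bool fIsFirst = true; };

// Same shape as ROOT::MergeFunc_t:
//    Long64_t (*)(void *thisobj, TCollection *input, TFileMergeInfo*);
using MockMergeFunc_t = Long64_t (*)(void *thisobj, MockCollection *input, MockMergeInfo *info);

// Hypothetical user class providing the two-argument Merge.
struct Counter {
   Long64_t fTotal = 0;

   Long64_t Merge(MockCollection *input, MockMergeInfo *) {
      if (!input) return -1;   // a negative value signals that merging failed
      for (const void *p : input->items)
         fTotal += static_cast<const Counter *>(p)->fTotal;
      return fTotal;           // non-negative on success (here: the merged total)
   }
};

// Type-erased wrapper of the kind a dictionary generator could emit.
Long64_t CounterMergeWrapper(void *thisobj, MockCollection *input, MockMergeInfo *info) {
   return static_cast<Counter *>(thisobj)->Merge(input, info);
}

// Hypothetical per-class record offering SetMerge/GetMerge-style access.
struct MockClass {
   MockMergeFunc_t fMerge = nullptr;
   void SetMerge(MockMergeFunc_t f) { fMerge = f; }
   MockMergeFunc_t GetMerge() const { return fMerge; }
};
```

A merger holding only a void pointer to the target object can then call the function returned by GetMerge without knowing the concrete class.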

In TFileMerger, we added a PrintLevel setting so that hadd can request more output than a regular TFileMerger produces.

We removed all hard dependencies of TFileMerger on TH1 and TTree. Soft dependencies still exist so that the merging of TTrees and the AutoAdd behavior of TH1 can be disabled.

The TFileMergeInfo object can be used inside the Merge function to pass information between successive calls to Merge (see below). In particular it contains:

   TDirectory  *fOutputDirectory;  // Target directory where the merged object will be written.
   Bool_t       fIsFirst;          // True if this is the first call to Merge for this series of objects.
   TString      fOptions;          // Additional text-based options passed down to customize the merge.
   TObject     *fUserData;         // Placeholder to pass extra information.  This object will be deleted at the end of each series of objects.
The default in TFileMerger is to call Merge once for every object in the series (i.e. the collection passed to Merge has exactly one element) in order to save memory by not holding all the objects in memory at the same time.

However, for histograms, the default is to first load all the objects and then merge them in one go; this is customizable when creating the TFileMerger object.
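The two calling patterns can be contrasted in a small standalone sketch. The types below are hypothetical stand-ins (MockMergeInfo is not TFileMergeInfo, and Sum is an invented example class), and we assume here that the merger clears fIsFirst after the first call of a series.

```cpp
#include <cstdint>
#include <vector>

using Long64_t = std::int64_t;

// Stand-in for TFileMergeInfo carrying only the fIsFirst flag.
struct MockMergeInfo {
   bool fIsFirst = true;
};

// Hypothetical mergeable object.
struct Sum {
   Long64_t fValue = 0;
   int fFirstCalls = 0;   // counts calls that saw fIsFirst == true

   Long64_t Merge(const std::vector<const Sum *> &input, MockMergeInfo *info) {
      if (!info) return -1;                 // negative value: merging failed
      if (info->fIsFirst) ++fFirstCalls;    // one-time setup for the series would go here
      for (const Sum *s : input) fValue += s->fValue;
      return fValue;
   }
};

// One Merge call per input object: the collection has exactly one element,
// so only one input needs to be in memory at a time.
Long64_t MergeOneByOne(Sum &target, const std::vector<Sum> &inputs) {
   MockMergeInfo info;
   Long64_t result = 0;
   for (const Sum &in : inputs) {
      std::vector<const Sum *> single{&in};
      result = target.Merge(single, &info);
      if (result < 0) return result;
      info.fIsFirst = false;                // assumption: cleared after the first call
   }
   return result;
}

// A single Merge call receiving every input at once (the histogram-style default).
Long64_t MergeAllAtOnce(Sum &target, const std::vector<Sum> &inputs) {
   MockMergeInfo info;
   std::vector<const Sum *> all;
   for (const Sum &in : inputs) all.push_back(&in);
   return target.Merge(all, &info);
}
```

Both paths give the same merged value; they differ in how many inputs are held at once and in how often Merge is entered, which is why fIsFirst matters for any per-series setup.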

Asynchronous Prefetching

The prefetching mechanism uses two new classes (TFilePrefetch and TFPBlock) to fetch blocks of tree entries ahead of the time they are needed. A separate thread takes care of transferring the blocks and making them available to the main requesting thread. As a result, the time the main thread spends waiting for data before processing decreases considerably. Besides the prefetching mechanism, there is also a local caching option. Both capabilities are disabled by default and must be explicitly enabled by the user.

To enable prefetching, the user must set the rootrc environment variable TFile.AsyncPrefetching as follows: gEnv->SetValue("TFile.AsyncPrefetching", 1). Only when prefetching is enabled can the user set the local cache directory in which the transferred files are saved. Subsequent reads of the same file then use the local copy from the cache. To set up a local cache directory, the client can use the following commands:

TString cachedir="file:/tmp/xcache/";
// or using xrootd on port 2000
// TString cachedir="root://localhost:2000//tmp/xrdcache1/";
gEnv->SetValue("Cache.Directory", cachedir.Data());
The TFilePrefetch class is responsible for reading and storing the requests received from the main thread. It also creates the worker thread which transfers all the information. Apart from managing the block requests, it also deals with caching the blocks on the local machine and retrieving them when necessary.

The TFPBlock class represents the encapsulation of a block request. It contains the chunks to be prefetched and also serves as a container for the information read.
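The division of labour between the worker thread and the main thread can be illustrated with a generic producer/consumer sketch in standard C++. This is not the TFilePrefetch implementation: Block, Prefetcher, FakeFetch and ReadAll are hypothetical names, and the "transfer" is simulated by filling a small buffer.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a prefetched block; not the real TFPBlock.
struct Block {
   std::size_t index;
   std::vector<char> data;
};

class Prefetcher {
public:
   // fetch: hypothetical callback that performs the (slow) transfer of one block.
   Prefetcher(std::size_t nBlocks, std::vector<char> (*fetch)(std::size_t))
      : fWorker([this, nBlocks, fetch] {
           for (std::size_t i = 0; i < nBlocks; ++i) {
              Block b{i, fetch(i)};                 // transfer happens off the main thread
              {
                 std::lock_guard<std::mutex> lock(fMutex);
                 fReady.push_back(std::move(b));
              }
              fCond.notify_one();                   // wake the main thread if it is waiting
           }
        }) {}

   ~Prefetcher() {
      if (fWorker.joinable()) fWorker.join();
   }

   // Called by the main thread; waits only if the next block has not arrived yet.
   // Must not be called more than nBlocks times.
   Block Next() {
      std::unique_lock<std::mutex> lock(fMutex);
      fCond.wait(lock, [this] { return !fReady.empty(); });
      Block b = std::move(fReady.front());
      fReady.pop_front();
      return b;
   }

private:
   std::mutex fMutex;
   std::condition_variable fCond;
   std::deque<Block> fReady;
   std::thread fWorker;   // declared last so the members above exist before it starts
};

// Hypothetical "transfer": fills a 4-byte block with a letter derived from its index.
std::vector<char> FakeFetch(std::size_t i) {
   return std::vector<char>(4, static_cast<char>('a' + i));
}

// Reads every block through the prefetcher and returns the total number of bytes.
std::size_t ReadAll(std::size_t nBlocks) {
   Prefetcher p(nBlocks, &FakeFetch);
   std::size_t total = 0;
   for (std::size_t i = 0; i < nBlocks; ++i) total += p.Next().data.size();
   return total;
}
```

While the main thread processes one block, the worker is already fetching the following ones, which is the source of the reduced waiting time described above.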