Tree
Branch creation enhancement and clarifications
- Make the leaflist optional if the address points to a single numerical variable:
Int_t value;
tree->Branch(branchname, &value);
- Introduce a way to create a branch directly from an object:
MyClass object;
TBranch *branch = tree->Branch(branchname, &object, bufsize, splitlevel);
- Clarify the ownership rules of user objects in a TTree. This clarification (and the improved auto-add-to-directory behavior
of the TH1*) allows the TTree to now delete the memory that
it has allocated and whose ownership was _not_ transferred back
to the user (ownership is transferred back any time the user gives the TTree
the address of a pointer):
For a top-level branch the meaning of addr is as follows:
If addr is zero, then we allocate a branch object
internally and the branch is the owner of the allocated
object, not the caller. However the caller may obtain
a pointer to the branch object with GetObject().
Example:
branch->SetAddress(0);
Event* event = branch->GetObject();
... Do some work.
If addr is not zero, but the pointer addr points at is
zero, then we allocate a branch object and set the passed
pointer to point at the allocated object. The caller
owns the allocated object and is responsible for deleting
it when it is no longer needed.
Example:
Event* event = 0;
branch->SetAddress(&event);
... Do some work.
delete event;
event = 0;
If addr is not zero and the pointer addr points at is
also not zero, then the caller has allocated a branch
object and is asking us to use it. The caller owns it
and must delete it when it is no longer needed.
Example:
Event* event = new Event();
branch->SetAddress(&event);
... Do some work.
delete event;
event = 0;
These rules affect users of TTree::Branch(),
TTree::SetBranchAddress(), and TChain::SetBranchAddress()
as well because those routines call this one.
An example of a tree with branches with objects allocated
and owned by us:
TFile* f1 = new TFile("myfile_original.root");
TTree* t1 = (TTree*) f1->Get("MyTree");
TFile* f2 = new TFile("myfile_copy.root", "recreate");
TTree* t2 = t1->CloneTree(0);
for (Int_t i = 0; i < 10; ++i) {
t1->GetEntry(i);
t2->Fill();
}
t2->Write();
delete f2;
f2 = 0;
delete f1;
f1 = 0;
An example of a branch with an object allocated by us,
but owned by the caller:
TFile* f = new TFile("myfile.root", "recreate");
TTree* t = new TTree("t", "A test tree.");
Event* event = 0;
TBranch* br = t->Branch("event.", &event);
for (Int_t i = 0; i < 10; ++i) {
... Fill event with meaningful data in some way.
t->Fill();
}
t->Write();
delete event;
event = 0;
delete f;
f = 0;
Notice that the only difference between this example
and the following example is that the event pointer
is zero when the branch is created.
An example of a branch with an object allocated and
owned by the caller:
TFile* f = new TFile("myfile.root", "recreate");
TTree* t = new TTree("t", "A test tree.");
Event* event = new Event();
TBranch* br = t->Branch("event.", &event);
for (Int_t i = 0; i < 10; ++i) {
... Fill event with meaningful data in some way.
t->Fill();
}
t->Write();
delete event;
event = 0;
delete f;
f = 0;
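The three address cases described above can be modeled outside ROOT. The following is only a toy sketch in standard C++, not ROOT's implementation: MockBranch, its OwnsObject() accessor and the Event struct are hypothetical names introduced for illustration. It shows who ends up owning the object in each case.

```cpp
#include <cassert>

// Hypothetical stand-in for a user event class (not a ROOT class).
struct Event {
    int id = 0;
};

// Toy model of the three SetAddress cases. This is NOT ROOT code;
// it only mirrors the ownership rules stated in the text.
class MockBranch {
public:
    MockBranch() = default;
    MockBranch(const MockBranch&) = delete;
    MockBranch& operator=(const MockBranch&) = delete;
    ~MockBranch() { ReleaseOwned(); }

    // addr is the address of the caller's Event pointer, or null.
    void SetAddress(Event** addr) {
        ReleaseOwned();
        if (addr == nullptr) {
            // Case 1: no address given. The branch allocates and owns.
            fObject = new Event();
            fOwnsObject = true;
        } else if (*addr == nullptr) {
            // Case 2: the branch allocates, the caller owns and deletes.
            *addr = new Event();
            fObject = *addr;
        } else {
            // Case 3: the caller allocated, and the caller still owns.
            fObject = *addr;
        }
    }

    Event* GetObject() const { return fObject; }
    bool OwnsObject() const { return fOwnsObject; }

private:
    // Delete the object only if this branch owns it.
    void ReleaseOwned() {
        if (fOwnsObject) delete fObject;
        fObject = nullptr;
        fOwnsObject = false;
    }

    Event* fObject = nullptr;
    bool fOwnsObject = false;
};
```

In this model the branch deletes only what it allocated in the first case; in the other two cases the caller must delete the object, as in the examples above.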
TTreeFormula (TTree::Draw, TTree::Scan)
- Fix CollectionTree->Scan("reco_ee_et[][2]:reco_ee_et[0][2]")
where reco_ee_et is a vector<vector<double> >. See http://root.cern.ch/phpBB2/viewtopic.php?t=6536
- Ensure that formulas used as indices or as arguments to special functions have their branch(es) loaded once. This fixes http://root.cern.ch/phpBB2/viewtopic.php?p=27080#27080
- Correct the drawing of "X[1]:X[5]" when X is a vector< vector<float> >
and X[1].size()!=X[5].size() (reported at http://root.cern.ch/phpBB2/viewtopic.php?p=27070).
- Correct the passing of NaN to functions called by TTree::Draw.
Splitting STL collections of pointers
An STL collection of pointers can now be split by calling
TBranch *branch = tree->Branch( branchname, STLcollection, buffsize, splitlevel );
where STLcollection is the address of a pointer to a std::vector, std::list,
std::deque, std::set or std::multiset containing pointers to objects.
If the splitlevel is a value bigger than 100, the collection
is written in split mode. I.e., if it contains objects of any
types deriving from TTrack, this function sorts the objects
based on their type and stores them in separate branches in split
mode.
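The sorting step described above can be pictured as grouping the pointers by their dynamic type. The following is a minimal sketch in standard C++, not the ROOT implementation: Track, MuonTrack, ElectronTrack and GroupByDynamicType are hypothetical names standing in for a TTrack-like hierarchy, and each resulting group corresponds to what would be stored in its own set of branches.

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical class hierarchy (stand-ins for TTrack and its derived types).
struct Track {
    virtual ~Track() = default;  // polymorphic, so typeid sees the dynamic type
};
struct MuonTrack : Track {};
struct ElectronTrack : Track {};

// Group the pointers of a collection by the dynamic type of the pointee.
// Null pointers are skipped.
std::map<std::type_index, std::vector<const Track*>>
GroupByDynamicType(const std::vector<Track*>& tracks) {
    std::map<std::type_index, std::vector<const Track*>> groups;
    for (const Track* t : tracks) {
        if (t == nullptr) continue;
        groups[std::type_index(typeid(*t))].push_back(t);
    }
    return groups;
}
```

Within each group all objects have the same layout, which is what makes member-wise (split) storage possible for a collection of pointers to mixed types.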
The ROOT test example in ROOTSYS/test/bench.cxx shows many examples of collections
and storage in a TTree when using split mode or not. This program illustrates the important
gain in space and time when using this new facility.
Parallel unzipping
Introduce a parallel unzipping algorithm for pre-fetched buffers. Since we already know which buffers are going to be read, we can decompress a few of them in advance in an additional thread and give the impression that the data decompression comes for free (we gain up to 30% in read-intensive jobs).
The size of this unzipping cache is 20% of the size of the TTreeCache and can be modified with TTreeCache::SetUnzipBufferSize(Long64_t bufferSize). Theoretically, we only need one buffer in advance, but in practice we might fall short if the unzipping cache is too small (synchronization costs).
This experimental feature is disabled by default. To activate it, use the static function
TTreeCache::SetParallelUnzip(TTreeCacheUnzip::EParUnzipMode option = TTreeCacheUnzip::kEnable).
The possible values to pass are:
- TTreeCacheUnzip::kEnable to enable it
- TTreeCacheUnzip::kDisable to disable it
- TTreeCacheUnzip::kForce to force it.
The TTreeCacheUnzip is activated only if you have more than one core. To activate it with only one core, use the TTreeCacheUnzip::kForce option (for example to measure the overhead).
Disk and Memory Space Gain
In ROOT older than v5.20/00, the branches' last basket, also known as the write basket, was always saved in the same "key" as the TTree object and was always present in memory when reading or writing.
When reading, this write basket was always present in memory even if the branch was never accessed.
Starting in v5.20/00, TTree::Write closes out, compresses (when requested) and writes to disk, in their own file records, the write baskets of all the branches.
(This is implemented via the new functions TTree::FlushBaskets, TBranch::FlushBaskets and TBranch::FlushOneBasket.)
TTree::AutoSave supports a new option "FlushBaskets" which will call FlushBaskets before saving the TTree object.
Benefits
Flushing the write baskets has several advantages:
- Reduce the file size of the TTree object (it no longer contains the last basket), improving the read time of the TTree object.
- Reduce the memory footprint of the TTree object.
- In a TTree with "flushed" buffers, there is now usually only zero or one buffer in memory.
- Previously each branch always had at least one basket in memory and usually two (the write basket and one read basket).
- Now only the baskets of the branches actually read are loaded in memory.
- Allow the baskets to be compressed and stored separately, increasing the compression factor.
Note: Calling FlushBaskets too often (either directly or via AutoSave("FlushBaskets")) can lead to unnecessary fragmentation of the ROOT file,
since it writes the baskets to disk (and a new basket will be started at the next fill) whether or not the content was close to filling the basket.
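The fragmentation effect mentioned in the note can be illustrated with a small counting model. This is a toy sketch in standard C++ under simplifying assumptions (a single branch, a fixed number of entries per full basket); CountBasketsWritten and its parameters are hypothetical and do not correspond to any ROOT function.

```cpp
#include <cassert>
#include <cstddef>

// Count how many baskets end up on disk in a simplified model.
//   entries        : number of fills
//   basketCapacity : entries that fit in one full basket (assumed fixed)
//   flushEvery     : force a flush after every flushEvery fills;
//                    0 means "never flush before the final write"
std::size_t CountBasketsWritten(std::size_t entries,
                                std::size_t basketCapacity,
                                std::size_t flushEvery) {
    std::size_t written = 0;
    std::size_t inBasket = 0;
    for (std::size_t i = 1; i <= entries; ++i) {
        ++inBasket;
        const bool full = (inBasket == basketCapacity);
        const bool forced = (flushEvery != 0 && i % flushEvery == 0);
        if (full || forced) {
            // The basket is written out, full or not, and a new one is started.
            ++written;
            inBasket = 0;
        }
    }
    // The final write flushes whatever is left in the last basket.
    if (inBasket > 0) ++written;
    return written;
}
```

In this model, 1000 entries with a capacity of 100 give 10 baskets when no flush is forced, but 34 smaller baskets when a flush is forced every 30 entries.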
Others
- The fast tree cloning (TTreeCloner) was enhanced to support copying in-memory TTrees (that have been saved as a single key on file). This issue was preventing hadd from fast cloning files containing any 'in-memory' tree.
- Re-enabled the splitting of TVector3 and of any class whose name starts with TVector
and that is not a TVectorT.
- Fix the list of StreamerInfo stored in the TFile in the case of a slow
CloneTree; previously some of the classes whose names contained '::' and any
of the STL container names were inadvertently omitted (in the case of classes
that are part of the TTree but had only a base and no member, or in some
cases where they had only object data members).
- Prevent storing a second time an object not derived from TObject in the case
where the object is both the top-level object of a branch and has
some of its sub-objects containing a pointer back to the object. (This was
actually activated in v5.18.)
- void TBranch::DeleteBaskets(Option_t* option)
New function which loops over all branch baskets. If the file where the branch buffers reside is writable, it frees the disk space associated with the baskets of the branch, then resets the branch by calling Reset(). If the option contains "all", it also deletes the baskets of the subbranches.
NOTE that this function must be used with extreme care. Deleting branch baskets
fragments the file and may introduce inefficiencies when adding new entries
in the Tree or later on when reading the Tree.
- Protect TTree::GetCurrentFile in case the current directory is gROOT.
This case may happen when a TChain calls TChain::Process and no files have been
connected to the chain yet, but a TFile has been opened meanwhile.
- Remove the calls to MapObject introduced in revision 21384 where they
are unnecessary, hence restoring lost performance in cases where
the TTree contains many simple types (double, int, etc.).
- In TBranchElement::Streamer when writing, call ForceWriteInfo
not only for the TStreamerInfo directly concerning this branch
but also (in the case of the top level branch of a split TClonesArray
or a split STL container) call ForceWriteInfo for the class of
the value.
This omission meant that slow CloneTree was (fatally) missing, in
some cases, the copy of the TStreamerInfo for classes that are
part of the TTree but had only a base and no member, or in
some cases where they had only object data members.
- Fix the return value of the lookup in TChainIndex
when the value searched for does not exist.