TMVA Package
Factory
- NormMode: as set in PrepareTrainingAndTestTree
The default has changed and is now "EqualNumEvents" (with a fixed meaning).
While previously NumEvents and EqualNumEvents (by mistake/miscommunication)
took training+test events into account, they now correctly
normalise only the training events. (The reason for these
normalisations is to make it easy to force the
effective (weighted) number of training events used for the signal (class
0) to equal the number of training events in the background (the sum of all
remaining classes in multiclass mode).)
- NumEvents:
- - the weighted number of events is scaled, independently for signal and
background, such that the sum of weights equals the number of events given in the
Factory::PrepareTrainingAndTestTree("",nTrain_Signal=3000,nTrain_Background=6000) call.
This example call will hence end up with 2x more background
events in the training than signal events, no matter what the
individual event weights were. (Watch out! If you specify
nTrain_Signal=0,nTrain_Background=0, then the ratio follows the
total numbers of MC events in the signal and background
respectively, which could be very different from the usually good
ratio of having about the same weighted number of signal and background
events in the training.) In that case it is better to use:
- EqualNumEvents:
- - for the signal events, the same is done as for NumEvents.
The background events, however, are reweighted such that their sum of weights
equals that of the signal events. This results in the same effective (weighted)
number of signal and background events being seen in the training.
- Transformations=I is the default again in the Factory (this option defines which
variable distribution plots are added to the TMVA output file and are
hence displayable via the TMVAGui)
Boosted Decision Trees
Some changes to the training options:
- nEventsMin (deprecated) please replace by --> MinNodeSize
- - The option nEventsMin, which specified the minimum number of training events
in a leaf node as an absolute number, has been replaced by "MinNodeSize",
which is given as a "percentage of the training sample". This way the training
options become less dependent on the actual training sample size.
- NNodesMax (deprecated) please replace by --> MaxDepth
- GradBaggingFraction and UseNTrainEvents replaced by BaggedSampleFraction
- - they both meant the same thing and are now
deprecated --> use BaggedSampleFraction instead
- UseBaggedGrad replaced by UseBaggedBoost
- - like this, the use of a bagged sample in Grad-Boost or AdaBoost has the same option name
- UseWeightedTrees --> removed
- - it was the default anyway and the only reasonable choice there is
- PruneBeforeBoost --> removed
- - it was mostly a debug/trial option
- NegWeightTreatment=IgnoreNegWeights --> replaced by NegWeightTreatment=IgnoreNegWeightsInTraining
- - Unfortunately the default "IgnoreNegWeights" of the BDT option "NegWeightTreatment"
collided with a global option and had to be replaced.
MethodBoost
- some cleanup (removed the strange experimental boosting options HighEdgeGaus,
HighEdgeCoPara, ...)
- removed the MethodWeightType... options; the weight type is now defined by the Boost Method
(these have been trial options, but for clarity it is much better to stick
to the "standard" ones, i.e. log(alpha) for AdaBoost etc.)
- up to now, the first classifier was trained on
the full sample; it should, however, also use a bagged
sample (particularly if smaller sample sizes for the bagged
samples were demanded). This has now been changed accordingly.