TMVA
TMVA version 4.0.1 is included in this ROOT release.
Main changes and new features introduced with TMVA 4
- Reorganised the internal data handling and the constructors
of methods so that arbitrary composite MVA methods can be built,
and so that multi-class classification and multi-target
regression can be handled
- Extended TMVA to multivariate multi-target regression
- Any TMVA method can now be boosted (linearly or
non-linearly)
- Transformations of input variables can be chained in any order
- Weight files are now in XML format
- New MVA methods "PDE-Foam" and "LD", both featuring
classification and regression
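
The idea of chaining input-variable transformations can be pictured
as plain function composition. The following is only a minimal
sketch under stated assumptions: the names Transform, applyChain,
makeShift and makeScale are hypothetical and are not part of the
TMVA API, and the two example steps are stand-ins for real
transformations.

```cpp
#include <functional>
#include <vector>

// Hypothetical type (not a TMVA class): one transformation step acting
// on all input variables of a single event.
using Transform =
    std::function<std::vector<double>(const std::vector<double>&)>;

// Apply the steps in order; the output of one step feeds the next.
std::vector<double> applyChain(const std::vector<Transform>& chain,
                               std::vector<double> values) {
    for (const Transform& step : chain) {
        values = step(values);
    }
    return values;
}

// Example step: shift every variable by a constant offset.
Transform makeShift(double offset) {
    return [offset](const std::vector<double>& in) {
        std::vector<double> out(in);
        for (double& v : out) v += offset;
        return out;
    };
}

// Example step: multiply every variable by a constant factor.
Transform makeScale(double factor) {
    return [factor](const std::vector<double>& in) {
        std::vector<double> out(in);
        for (double& v : out) v *= factor;
        return out;
    };
}
```

Because each step has the same signature, a chain of any length and
order can be built by appending steps to the vector.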
Comments
On XML format:
The old text format is obsolete though still readable in the
application. Backward compatibility is NOT guaranteed. Please
contact the authors if you require the reading of old text weight
files in TMVA 4.
Standard macros:
The structure of the standard macros has changed: macros are
still in the "$ROOTSYS/tmva/test" directory, but distinguished for
classification and regression examples:
TMVAClassification.C, TMVAClassificationApplication.C
TMVARegression.C, TMVARegressionApplication.C
The results of the classification and regression training are
analysed as usual with standard macros that can be called from
dedicated GUIs.
Regression:
- Not yet available for all MVA methods. It exists for:
PDE-RS, PDE-Foam, K-NN, LD, FDA, MLP, BDT for single targets
(1D), and MLP for multiple targets (nD).
- Not all transformations of input variables are available
(only "Norm" so far).
- Regression requires specific evaluation tools:
- During the training we provide a ranking of input
variables, using various criteria: correlations, transposed
correlation, correlation ratio, and "mutual information" between
input variables and regression target. (Correlation ratio and
mutual information implementations provided by Moritz Backes,
Geneva U)
- After the training, the trained MVA methods are ranked with
respect to the deviations between regression target and estimate.
- Macros plot various deviation and correlation quantities.
A new GUI (macros/TMVARegGui.C) collects these macros.
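
Ranking methods by the deviation between target and estimate can
be sketched as follows. This is an illustration only, not the TMVA
implementation: the MethodResult struct and both function names are
hypothetical, and the root-mean-square deviation is an assumed
choice of deviation measure.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container (not a TMVA class): a method name together
// with its regression estimates on the test sample.
struct MethodResult {
    std::string name;
    std::vector<double> estimates;
};

// Root-mean-square deviation between targets and estimates.
// Assumption: RMS is used here only as one possible deviation measure.
double rmsDeviation(const std::vector<double>& targets,
                    const std::vector<double>& estimates) {
    assert(!targets.empty() && targets.size() == estimates.size());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < targets.size(); ++i) {
        const double d = estimates[i] - targets[i];
        sumSq += d * d;
    }
    return std::sqrt(sumSq / static_cast<double>(targets.size()));
}

// Return the method names ordered from smallest to largest deviation.
std::vector<std::string> rankByDeviation(const std::vector<double>& targets,
                                         std::vector<MethodResult> results) {
    std::sort(results.begin(), results.end(),
              [&targets](const MethodResult& a, const MethodResult& b) {
                  return rmsDeviation(targets, a.estimates) <
                         rmsDeviation(targets, b.estimates);
              });
    std::vector<std::string> names;
    for (const MethodResult& r : results) names.push_back(r.name);
    return names;
}
```

The method with the smallest deviation comes first in the returned
list.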
Improvements of / new features for MVA methods
Other improvements
- Improved handling of small Likelihood values such that
Likelihood performance increases in analyses with many variables
(~>10). Thanks to Ralph Schaefer (Bonn U.) for reporting this.
- Nicer plotting: custom variable titles and units can be
assigned in the "AddVariable" call.
- Added an inverse transformation, InverseTransform, to the
variable transformations in the framework. It is not needed for
classification, but it is required for regression. So far, the
inverse of the normalization transformation has been
implemented.
- Started to extend the variable transformations to the
regression targets as well.
- MethodCuts now produces the 'optimal-cut' histograms needed
by macro mvaeffs.C. (macro 5a of TMVAGui.C)
- MsgLogger can be silenced in order to prevent excess output
during boosting.
- Third dataset type added centrally (Training, Validation
and Testing). The validation data is split off the original
training data set.
- Updated the GUI and other macros to reflect the new
features of PDF and the addition of MethodBoost.
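
One way to see why regression needs an inverse transformation: if a
regression target is normalized before training, the estimate comes
out in the normalized range and has to be mapped back to physical
units. The sketch below is not the TMVA code. The Range struct and
the function names are hypothetical, and the target interval
[-1, 1] is an assumption made for this example.

```cpp
#include <cassert>

// Hypothetical helper (not a TMVA class): the range of one variable
// or regression target as observed in the training sample.
struct Range {
    double min;
    double max;
};

// Map x linearly from [min, max] to [-1, 1] (assumed target interval).
double normalize(double x, const Range& r) {
    assert(r.max > r.min);
    return 2.0 * (x - r.min) / (r.max - r.min) - 1.0;
}

// Inverse mapping: take a normalized value back to [min, max].
double inverseNormalize(double y, const Range& r) {
    assert(r.max > r.min);
    return 0.5 * (y + 1.0) * (r.max - r.min) + r.min;
}
```

For a target with range [10, 30], the value 25 is normalized to 0.5,
and applying inverseNormalize to 0.5 gives 25 again.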
Updates in TMVA 4.0.1
- "Spectator" variables can now be defined. They are computed
just like the input variables and are written to the TestTree,
but they do not participate in any MVA calculation (useful for
correlation studies).
- New booking option "IgnoreNegWeightsInTraining" to test the
effect of events with negative weights on the training. This is
especially useful for methods that do not properly deal with
such events. Note that this new option is not available for all
methods (a training interrupt is issued if not available).
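
The effect of ignoring negative-weight events in training can be
pictured as a filter applied to the training sample. This is a
minimal sketch, not the TMVA implementation: WeightedEvent and
dropNegativeWeights are hypothetical names, and keeping events with
zero weight is an assumption.

```cpp
#include <vector>

// Hypothetical event type (not a TMVA class): input values and a weight.
struct WeightedEvent {
    std::vector<double> values;
    double weight;
};

// Keep only events whose weight is not negative.
// Assumption: events with weight exactly zero are kept.
std::vector<WeightedEvent> dropNegativeWeights(
    const std::vector<WeightedEvent>& events) {
    std::vector<WeightedEvent> kept;
    for (const WeightedEvent& ev : events) {
        if (ev.weight >= 0.0) {
            kept.push_back(ev);
        }
    }
    return kept;
}
```

Training once on the full sample and once on the filtered sample
allows the influence of the negative-weight events to be compared.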
Bug fixes:
- Fixed regression bug in VariableNormalizeTransform (Use
number of targets from Event instead of DataSet)
- Fixed multi-target regression in PDEFoam: the foam dimensions
were miscalculated
- Added writing of targets to the weight files in regression
mode to fix problems in RegressionApplication
- Added standard C++ header files that were missing from some
classes, which led to compilation failures on some
architectures (thanks to Lucian Ancu, Nijmegen, for reporting
these).
- Added checks for unused options to the interpretation of the
Factory and DataSetFactory configuration options. A complaint
is now issued if wrong option labels are used.
- Fixed standard creation of correlation matrix plots
- Fixed internal mapping problem giving a fatal error message
when destroying and recreating the Factory.
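
A check for unused or misspelled option labels could look roughly
like the sketch below. It is not the TMVA code: the function name
findUnknownOptions is hypothetical, and the option syntax assumed
here (colon-separated entries, an optional leading "!" for negated
flags, and "label=value" pairs) is an assumption for illustration.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a TMVA function): return every option label
// in the option string that is not in the set of known labels.
// Assumed syntax: entries separated by ':', optional leading '!',
// and "label=value" for options that carry a value.
std::vector<std::string> findUnknownOptions(
    const std::string& options, const std::set<std::string>& known) {
    std::vector<std::string> unknown;
    std::istringstream stream(options);
    std::string token;
    while (std::getline(stream, token, ':')) {
        if (token.empty()) continue;
        if (token[0] == '!') token.erase(0, 1);  // negated boolean flag
        // The label is everything before '=' (or the whole token).
        const std::string label = token.substr(0, token.find('='));
        if (!label.empty() && known.count(label) == 0) {
            unknown.push_back(label);
        }
    }
    return unknown;
}
```

A caller can then report each returned label, so that a typo such as
"NTree" instead of "NTrees" is noticed rather than silently ignored.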