StatPatternRecognition: A C++ Package for Multivariate Classification of HEP Data
Abstract
Modern analysis of HEP data needs advanced statistical tools to separate signal from background. A C++ package has been implemented to provide such tools to the HEP community. The package includes linear and quadratic discriminant analysis, decision trees, bump hunting (PRIM), boosting (AdaBoost and arc-x4), bagging and random forest algorithms, a multi-class learner, and interfaces to the standard backpropagation neural net and the radial basis function neural net as implemented in the Stuttgart Neural Network Simulator. Supplemental tools are also provided: bootstrap, estimation of data moments, a test of zero correlation between two variables with a joint elliptical distribution, and a multivariate goodness-of-fit method. The package offers a convenient set of tools for imposing requirements on input data, storing output in ROOT or HBOOK format, and handling multi-class data. Although integrated in the BABAR computing environment, the package maintains a minimal set of external dependencies and can be easily adapted to any other HEP environment. It has been tested on many idealized and realistic examples.
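To make the boosting idea named in the abstract concrete, here is a minimal sketch of discrete AdaBoost over one-variable cuts (decision stumps). It is not taken from StatPatternRecognition and does not reflect its API: the names `Stump`, `stumpPredict`, `trainAdaBoost`, and `classify` are hypothetical, and the exhaustive threshold search is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <limits>
#include <vector>

// Hypothetical illustration types; NOT part of the StatPatternRecognition API.
struct Stump {
    std::size_t feature;  // index of the input variable that is cut on
    double threshold;     // cut value
    int polarity;         // +1: predict signal if x > threshold; -1: reversed
    double alpha;         // weight of this stump in the final vote
};

// Returns +1 (signal) or -1 (background) for a single stump.
inline int stumpPredict(const Stump& s, const std::vector<double>& x) {
    return (x[s.feature] > s.threshold) ? s.polarity : -s.polarity;
}

// Discrete AdaBoost. Labels in y must be +1 (signal) or -1 (background).
std::vector<Stump> trainAdaBoost(const std::vector<std::vector<double>>& X,
                                 const std::vector<int>& y,
                                 std::size_t nRounds) {
    assert(X.size() == y.size());
    const std::size_t n = X.size();
    std::vector<Stump> model;
    if (n == 0) return model;
    const std::size_t d = X[0].size();
    const double eps = 1e-12;           // guards against division by zero
    std::vector<double> w(n, 1.0 / n);  // start from uniform event weights

    for (std::size_t round = 0; round < nRounds; ++round) {
        Stump best{0, 0.0, 1, 0.0};
        double bestErr = std::numeric_limits<double>::max();
        // Exhaustive search: every observed value of every variable is a cut.
        for (std::size_t f = 0; f < d; ++f) {
            for (std::size_t i = 0; i < n; ++i) {
                for (int pol : {1, -1}) {
                    const Stump s{f, X[i][f], pol, 0.0};
                    double err = 0.0;  // weighted misclassification rate
                    for (std::size_t k = 0; k < n; ++k)
                        if (stumpPredict(s, X[k]) != y[k]) err += w[k];
                    if (err < bestErr) { bestErr = err; best = s; }
                }
            }
        }
        if (bestErr >= 0.5) break;  // weak learner no better than guessing
        best.alpha = 0.5 * std::log((1.0 - bestErr + eps) / (bestErr + eps));
        model.push_back(best);
        if (bestErr < eps) break;   // a single cut already separates the data

        // Increase weights of misclassified events, decrease the others,
        // then renormalize so the weights sum to one.
        double sum = 0.0;
        for (std::size_t k = 0; k < n; ++k) {
            w[k] *= std::exp(-best.alpha * y[k] * stumpPredict(best, X[k]));
            sum += w[k];
        }
        for (std::size_t k = 0; k < n; ++k) w[k] /= sum;
    }
    return model;
}

// Final classifier: sign of the alpha-weighted vote of all stumps.
int classify(const std::vector<Stump>& model, const std::vector<double>& x) {
    double score = 0.0;
    for (const Stump& s : model) score += s.alpha * stumpPredict(s, x);
    return score >= 0.0 ? 1 : -1;
}
```

A caller would pass one row of input variables per event together with labels of +1 or -1, train for a fixed number of rounds, and then apply `classify` to new events. A production classifier would typically boost full decision trees and use a faster split search than the brute-force loop shown here.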
Publication date: 2006