StatPatternRecognition: A C++ Package for Multivariate Classification of HEP Data

Author

  • I. Narsky

Abstract

Modern analysis of HEP data needs advanced statistical tools to separate signal from background. A C++ package has been implemented to provide such tools for the HEP community. The package includes linear and quadratic discriminant analysis, decision trees, bump hunting (PRIM), boosting (AdaBoost and arc-x4), bagging and random forest algorithms, a multi-class learner, and interfaces to the standard backpropagation neural net and radial basis function neural net implemented in the Stuttgart Neural Network Simulator. Supplemental tools such as bootstrap, estimation of data moments, a test of zero correlation between two variables with a joint elliptical distribution, and a multivariate goodness-of-fit method are also provided. The package offers a convenient set of tools for imposing requirements on input data, storing output into Root or Hbook, and handling multi-class data. Integrated in the BABAR computing environment, the package maintains a minimal set of external dependencies and can be easily adapted to any other HEP environment. It has been tested on many idealized and realistic examples.

Related resources

Predictive Factors for General Health Status in Iranian University Students an Unvariate and Multivariate Logistic Regression Analysis

Background: Health is a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity. Social, economic and cultural factors are main effective factors on person’s health. As university students are the spiritual resources of each society and the future manufacturers of their own country, the present study aimed to determine the predictive factors f...

A New Framework for Distributed Multivariate Feature Selection

Feature selection is considered as an important issue in classification domain. Selecting a good feature through maximum relevance criterion to class label and minimum redundancy among features affect improving the classification accuracy. However, most current feature selection algorithms just work with the centralized methods. In this paper, we suggest a distributed version of the mRMR featu...

Using multivariate generalized linear latent variable models to measure the difference in event count for stranded marine animals

BACKGROUND AND OBJECTIVES: The classification of marine animals as protected species makes data and information on them to be very important. Therefore, this led to the need to retrieve and understand the data on the event counts for stranded marine animals based on location emergence, number of individuals, behavior, and threats to their presence. Whales are g...

An Object-Oriented Minimization Package for HEP

A portion of the HEP community has perceived the need for a minimization package written in C++ and taking advantage of the Object-Oriented nature of that language. To be acceptable for HEP, such a package must at least encompass all the capabilities of Minuit. Aside from the slight plus of not relying on outside Fortran compilation, the advantages that a C++ package based on O-O design would c...

MADE4: an R package for multivariate analysis of gene expression data

SUMMARY MADE4, microarray ade4, is a software package that facilitates multivariate analysis of microarray gene-expression data. MADE4 accepts a wide variety of gene-expression data formats. MADE4 takes advantage of the extensive multivariate statistical and graphical functions in the R package ade4, extending these for application to microarray data. In addition, MADE4 provides new graphical a...

Publication date: 2006