Analysis of Data Fusion Methods in Virtual Screening: Similarity and Group Fusion
Abstract
In a recent companion paper we related the operation of simple data fusion rules used in virtual screening to a multiple integral formalism. In this paper we extend these ideas to the analysis of data fusion methods applied to real data. We examine several cases of similarity fusion using different coefficients and different representations and consider the reasons for positive or negative results in terms of the similarity distributions. Results are obtained using the SUM-, MAX-, MIN-, and CombMNZ-fusion rules. We also develop a customized fusion rule, which provides an estimate of the optimal possible result for fusing multiple searches of a specific database; this shows that similarity fusion can, in principle, achieve retrieval enhancements even if this is not achieved in practice with current fusion rules. The methods are extended to analyze the comparatively successful results of group fusion with multiple actives, and we provide a rationale for the observed superiority of the MAX-rule over the SUM-rule in this context.
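As a rough illustration of the four fusion rules named in the abstract, the following Python sketch fuses the similarity scores from several searches into one score per molecule. It is not the authors' implementation: the function names `fuse` and `rank`, the `threshold` parameter, the toy scores, and the choice to score a missing molecule as 0.0 are all assumptions made for this example. CombMNZ is taken in its usual information-retrieval sense (the sum of scores multiplied by the number of searches in which the molecule scores above a threshold); the paper may define it on ranks or with a different cutoff.

```python
from typing import Dict, List

Scores = Dict[str, float]  # molecule id -> similarity score in [0, 1]


def fuse(searches: List[Scores], rule: str, threshold: float = 0.0) -> Scores:
    """Combine per-search similarity scores into one fused score per molecule."""
    molecules = set().union(*searches)
    fused: Scores = {}
    for mol in molecules:
        # Assumption: a molecule absent from a search scores 0.0 in that search.
        values = [s.get(mol, 0.0) for s in searches]
        if rule == "SUM":
            fused[mol] = sum(values)
        elif rule == "MAX":
            fused[mol] = max(values)
        elif rule == "MIN":
            fused[mol] = min(values)
        elif rule == "CombMNZ":
            # Number of searches in which the molecule scores above the threshold.
            hits = sum(1 for v in values if v > threshold)
            fused[mol] = hits * sum(values)
        else:
            raise ValueError(f"unknown fusion rule: {rule}")
    return fused


def rank(fused: Scores) -> List[str]:
    """Order molecules by decreasing fused score (ties broken by id)."""
    return sorted(fused, key=lambda m: (-fused[m], m))


# Toy data (hypothetical): two searches over the same three molecules.
search_a = {"m1": 0.9, "m2": 0.4, "m3": 0.0}
search_b = {"m1": 0.2, "m2": 0.5, "m3": 0.6}

for rule in ("SUM", "MAX", "MIN", "CombMNZ"):
    print(rule, rank(fuse([search_a, search_b], rule)))
# SUM     ['m1', 'm2', 'm3']
# MAX     ['m1', 'm3', 'm2']
# MIN     ['m2', 'm1', 'm3']
# CombMNZ ['m1', 'm2', 'm3']
```

In this sketch, similarity fusion would correspond to passing searches that use one reference structure with different coefficients or representations, while group fusion would correspond to passing searches from several different active reference structures scored with a single measure. The same `fuse` call covers both; only the origin of the input lists differs.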
Similar resources
Multiple search methods for similarity-based virtual screening: analysis of search overlap and precision
BACKGROUND Data fusion methods are widely used in virtual screening, and make the implicit assumption that the more often a molecule is retrieved in multiple similarity searches, the more likely it is to be active. This paper tests the correctness of this assumption. RESULTS Sets of 25 searches using either the same reference structure and 25 different similarity measures (similarity fusion) ...
Fusing similarity rankings in ligand-based virtual screening
Data fusion is the name given to a range of methods for combining multiple sources of evidence. This mini-review summarizes the use of one such class of methods for combining the rankings obtained when similarity searching is used for ligand-based virtual screening. Two main approaches are described: similarity fusion involves combining rankings from single searches based on multiple similarity...
Condorcet and borda count fusion method for ligand-based virtual screening
BACKGROUND It is known that no individual similarity measure will always give the best recall of active molecular structures for all types of activity classes. Recently, it has been shown that the effectiveness of ligand-based virtual screening approaches can be enhanced by using data fusion. Data fusion can be implemented using two different approaches: group fusion and similarity fusion. Similarity fusion involv...
Maximum Common Substructure-Based Data Fusion in Similarity Searching
Data fusion has been shown to work very well when applied to fingerprint-based similarity searching, yet little is known of its application to maximum common substructure (MCS)-based similarity searching. Two similarity search applications of the MCS will be focused on here. Typically, the number of bonds in the MCS, as well as the bonds in the two molecules being compared, are used in a simila...
Full combinatorial consensus scoring for ligand-based virtual fragment screening at membrane bound receptors
Virtual screening (VS) has become an integral part of fragment-based drug discovery (FBDD). In this study we have evaluated the applicability of ligand-based virtual screening (LBVS) methods for identifying small fragment-like biologically active molecules using different similarity descriptors and different consensus scoring approaches. For this purpose we have evaluated the performance of 14 ...
Journal title:
- Journal of chemical information and modeling
Volume 46, Issue 6
Pages: -
Publication date: 2006