A structural hierarchy matching approach for molecular similarity/substructure searching.

Authors

  • Shu-Shen Ji
  • Hong-Ju Dong
  • Xin-Xin Zhou
  • Ya-Min Liu
  • Feng-Xue Zhang
  • Qi Wang
  • Xin-An Huang

Abstract

An approach to molecular similarity and substructure searching based on structural hierarchy matching is proposed. Small molecules are divided into two categories, acyclic and cyclic. The cyclic molecules are further divided into three structural hierarchies, namely framework, complicated-, and mono-rings. During a search, the similarity coefficient between a structural query and each retrieved molecule is calculated using the hierarchy of the query as the reference. A total of 13,911 chemicals were used in this work; their minimal cyclic and acyclic substructures were extracted and further processed into fuzzy structural fingerprints. The fingerprints were then used as search indices for molecular similarity or substructure searching. Tests show that this approach lets users choose between one-substructure and multi-substructure searching, with sorted results. Moreover, the algorithm has the potential to be developed further for molecular similarity searching and substructure analysis.
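
The abstract does not give the formula for the similarity coefficient, so the following is only a minimal sketch of one way a query-referenced score over fingerprint indices could work. It assumes each fingerprint is a set of substructure labels and scores a candidate by the fraction of the query's labels it contains. The function names, the label strings, and the coefficient itself are hypothetical illustrations, not the authors' method.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is modelled as a set of substructure labels.
Fingerprint = FrozenSet[str]


def query_referenced_similarity(query: Fingerprint, candidate: Fingerprint) -> float:
    """Fraction of the query's substructure labels found in the candidate.

    The query is the reference, so the score is asymmetric (hypothetical
    coefficient, chosen only for illustration).
    """
    if not query:
        return 0.0
    return len(query & candidate) / len(query)


def search(
    query: Fingerprint,
    index: Dict[str, Fingerprint],
    min_score: float = 0.0,
) -> List[Tuple[str, float]]:
    """Score every indexed molecule against the query and return sorted hits."""
    hits = []
    for name, fingerprint in index.items():
        score = query_referenced_similarity(query, fingerprint)
        if score >= min_score:
            hits.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


# Hypothetical index with made-up substructure labels.
index = {
    "phenol": frozenset({"ring:benzene", "chain:C-O"}),
    "toluene": frozenset({"ring:benzene", "chain:C-C"}),
    "ethanol": frozenset({"chain:C-O", "chain:C-C"}),
    "hexane": frozenset({"chain:C-C"}),
}
query = frozenset({"ring:benzene", "chain:C-O"})

for name, score in search(query, index, min_score=0.5):
    print(f"{name}: {score:.2f}")
```

In this sketch, a query with a single label behaves like a one-substructure search, while a query with several labels behaves like a multi-substructure search in which partial matches rank lower.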

Similar resources

fmcsR: a Flexible Maximum Common Substructure Algorithm for Advanced Compound Similarity Searching

Maximum common substructure (MCS) algorithms rank among the most sensitive and accurate methods for measuring structural similarities among small molecules. This utility is critical for many research areas in drug discovery and chemical genomics. The MCS problem is a graph-based similarity concept that is defined as the largest substructure (sub-graph) shared among two compounds (Cao et al., 20...

Finding Characteristic Substructures for Metabolite Classes

We introduce a method for finding a characteristic substructure for a set of molecular structures. Different from common approaches, such as computing the maximum common subgraph, the resulting substructure does not have to be contained in its exact form in all input molecules. Our approach is part of the identification pipeline for unknown metabolites using fragmentation trees. Searching datab...

WebCSD: the online portal to the Cambridge Structural Database

WebCSD, a new web-based application developed by the Cambridge Crystallographic Data Centre, offers fast searching of the Cambridge Structural Database using only a standard internet browser. Search facilities include two-dimensional substructure, molecular similarity, text/numeric and reduced cell searching. Text, chemical diagrams and three-dimensional structural information can all be studie...

Chemical Similarity Searching

This paper reviews the use of similarity searching in chemical databases. It begins by introducing the concept of similarity searching, differentiating it from the more common substructure searching, and then discusses the current generation of fragment-based measures that are used for searching chemical structure databases. The next sections focus upon two of the principal characteristics of a...

Multiple Semi-flexible 3D Superposition of Drug-sized Molecules

In this paper we describe a new algorithm for multiple semi-flexible superpositioning of drug-sized molecules. The algorithm identifies structural similarities of two or more molecules. When comparing a set of molecules on the basis of their three-dimensional structures, one is faced with two main problems. (1) Molecular structures are not fixed but flexible, i.e., a molecule adopts different f...

Journal title:
  • Molecules

Volume 20, Issue 5

Pages -

Publication date: 2015