NNFSRR: Nearest Neighbor Feature Selection and Redundancy Removal Method for Nearest Neighbor Search in Microarray Gene Expression Data

Abstract

INTRODUCTION: Gene expression data analysis is a critical aspect of disease prediction and classification, playing a pivotal role in the field of bioinformatics and biomedical research. High-dimensional gene expression datasets hold a wealth of information, but their effective utilization is hindered by the presence of irrelevant dimensions and noise. The challenge lies in extracting meaningful features from these datasets to enhance the accuracy of classification while maintaining computational efficiency. Feature selection is a crucial step in addressing these challenges, as it aims to identify and retain only the most informative characteristics of large, high-dimensional microarray datasets. In the context of microarray gene expression data, characterized by its substantial dimensionality, selecting relevant features is essential for efficient nearest neighbor search, a fundamental component of various analytical tasks in data mining. Existing feature selection methods often face issues related to the trade-off between search accuracy and efficiency. This paper introduces a novel approach, the Nearest Neighbor Feature Selection with Symmetrical Uncertainty-based Redundancy Removal (NNFSRR) method, designed to enhance nearest neighbor search through feature selection. The NNFSRR method focuses on reducing the dimensionality of the dataset by identifying and removing redundant features, allowing subsequent nearest neighbor searches to operate solely on the relevant dimensions.

OBJECTIVES: The primary goal is to evaluate the NNFSRR method's effectiveness in improving nearest neighbor search accuracy while reducing dimensionality. The method utilizes feature correlation to enhance search accuracy and efficiency compared to existing methods.

METHODS: The NNFSRR method uses Symmetrical Uncertainty to remove redundant features. The reduced feature sets are then used for nearest neighbor search. Experiments are conducted using real-world microarray gene expression datasets, and comparisons are made based on search time and accuracy.

RESULTS: The NNFSRR method demonstrates improved nearest neighbor search performance, outperforming basic feature selection and brute force techniques. The selected feature sets exhibit strong class associations while minimizing correlations between features, enhancing search precision.

CONCLUSION: In conclusion, the NNFSRR method presents a promising approach to address the challenges posed by high-dimensional microarray gene expression data. It effectively reduces dimensionality, improves search accuracy, and enhances the efficiency of nearest neighbor search. Our experimental results demonstrate that this method outperforms existing techniques in terms of search accuracy and efficiency, making it a valuable tool for applications in bioinformatics, data mining, pattern recognition, and biological information retrieval. The NNFSRR method holds the potential to advance our understanding of complex biological processes and to support more accurate disease classification.
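The abstract does not spell out the NNFSRR algorithm, so the following is only a minimal sketch of the general idea it describes: rank features by Symmetrical Uncertainty with the class, drop features that are redundant with an already selected one, and run nearest neighbor search on the remaining dimensions. The redundancy rule is borrowed from the fast correlation-based filter (FCBF) style of selection and is an assumption, as are the function names, the `relevance_threshold` parameter, and the toy data. The sketch also assumes expression values have already been discretized, since Symmetrical Uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), is defined here over discrete variables.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)

def select_features(columns, labels, relevance_threshold=0.1):
    """FCBF-style selection (assumed rule, not taken from the paper)."""
    relevance = {j: symmetrical_uncertainty(col, labels)
                 for j, col in enumerate(columns)}
    # Keep only sufficiently relevant features, most relevant first.
    ranked = sorted((j for j in relevance if relevance[j] >= relevance_threshold),
                    key=lambda j: -relevance[j])
    kept = []
    for j in ranked:
        # Feature j is redundant if a kept feature correlates with it
        # at least as strongly as j correlates with the class.
        if all(symmetrical_uncertainty(columns[j], columns[k]) < relevance[j]
               for k in kept):
            kept.append(j)
    return kept

def nearest_neighbor(rows, query, features):
    """Brute-force 1-NN (squared Euclidean) over the selected features only."""
    def dist(row):
        return sum((row[j] - query[j]) ** 2 for j in features)
    return min(range(len(rows)), key=lambda i: dist(rows[i]))

# Hypothetical toy data: 8 samples, 4 discretized "genes" stored column-wise.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],  # gene 0: strongly associated with the class
    [0, 0, 0, 1, 1, 1, 1, 1],  # gene 1: exact copy of gene 0 (redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # gene 2: unrelated to the class
    [0, 0, 1, 0, 1, 1, 1, 0],  # gene 3: weaker, mostly independent signal
]
rows = [list(r) for r in zip(*columns)]

kept = select_features(columns, labels)
query = [1, 1, 0, 1]
match = nearest_neighbor(rows, query, kept)
print("selected features:", kept)
print("nearest sample:", match, "label:", labels[match])
```

On this toy data the duplicate gene and the unrelated gene are discarded, so the distance computation touches two dimensions instead of four. On real microarray data the discretization scheme and the threshold would both need tuning.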

Related resources

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a variety of library resources. However, classifying documents from a large amount of data is still an issue, and finding particular documents demands time and energy. Classifying similar documents into specific classes can reduce the time needed to search for the required data, particularly for text documents. This is further facilitated by using Artificial...

Continuous Nearest Neighbor Search

A continuous nearest neighbor query retrieves the nearest neighbor (NN) of every point on a line segment (e.g., “find all my nearest gas stations during my route from point s to point e”). The result contains a set of <point, interval> tuples, such that point is the NN of all points in the corresponding interval. Existing methods for continuous nearest neighbor search are based on the repetitiv...

Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph

We introduce a new nearest neighbor search algorithm. The algorithm builds a nearest neighbor graph in an offline phase and when queried with a new point, performs hill-climbing starting from a randomly sampled node of the graph. We provide theoretical guarantees for the accuracy and the computational complexity and empirically show the effectiveness of this algorithm.

Nearest Neighbor Search Algorithm

A fundamental activity common to image processing, pattern recognition, and clustering algorithms involves searching a set of n k-dimensional data points for the one that is nearest to a given target point with respect to a distance function. Our goal is to find search algorithms that are full-search equivalent, that is, whose resulting match is as good as we could obtain if we were to search the set exhaustively. 1...

Journal

Journal title: EAI Endorsed Transactions on Pervasive Health and Technology

Year: 2023

ISSN: 2411-7145

DOI: https://doi.org/10.4108/eetpht.9.3910