Data Mining Techniques in Parallel Environment- A Comprehensive Survey

نویسندگان

  • Kinjal Shah
  • Prashant Chauhan
  • M. B. Potdar
  • Jiawei Han
  • Micheline Kamber
  • Shenshen Liang
  • Ying Liu
  • Yike Guo
چکیده

Data mining is the process of discovering interesting and useful patterns and relationships in large volumes of data. The valuable knowledge can be discovered through the process of data mining for the further use and prediction. We have different data mining techniques like clustering classification and association. Classification is one of the major techniques to discover the patterns in huge amount of data. This technique is widely used in many fields. We have a large volume of data and if we extract the data sequentially then it will take a lot of timing. So if we extract the data parallely, the amount of time taken can be reduced. We can use parallel techniques when there is a large volume of data and we want to extract the data in very few seconds. We can implement this techniques using different approaches like MPI, OPENMP, using CUDA or using Map Reduce approach. Here in this paper we will discuss data mining techniques classification by decision tree induction and knearest neighbors using both sequential approach as well as parallel approach.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data Mining Techniques in Parallel and Distributed Environment- A Comprehensive Survey

Distributed sources of voluminous data have raised the need of distributed data mining. Conventional data mining techniques works well on structured data which is clean, pre-processed and properly arranged either in the form of structured files, databases or data warehouse. These techniques are based upon centralised data store however they have several limitations in distributed scenario where...

متن کامل

Mining Massive-Scale Spatiotemporal Trajectories in Parallel: A Survey

With the popularization of positioning devices such as GPS navigators and smart phones, large volumes of spatiotemporal trajectory data have been produced at unprecedented speed. For many trajectory mining problems, a number of computationally efficient approaches have been proposed. However, to more effectively tackle the challenge of big data, it is important to exploit various advanced paral...

متن کامل

Credit scoring in banks and financial institutions via data mining techniques: A literature review

This paper presents a comprehensive review of the works done, during the 2000–2012, in the application of data mining techniques in Credit scoring. Yet there isn’t any literature in the field of data mining applications in credit scoring. Using a novel research approach, this paper investigates academic and systematic literature review and includes all of the journals in the Science direct onli...

متن کامل

Prediction of Student Learning Styles using Data Mining Techniques

This paper focuses on the prediction of student learning styles using data mining techniques within their institutions. This prediction was aimed at finding out how different learning styles are achieved within learning environments which are specifically influenced by already existing factors. These learning styles, have been affected by different factors that are mainly engraved and found wit...

متن کامل

A comprehensive benchmark between two filter-based multiple-point simulation algorithms

Computer graphics offer various gadgets to enhance the reconstruction of high-order statistics that are not correctly addressed by the two-point statistics approaches. Almost all the newly developed multiple-point geostatistics (MPS) algorithms, to some extent, adapt these techniques to increase the simulation accuracy and efficiency. In this work, a scrutiny comparison between our recently dev...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014