
Data Mining – The Process

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.

According to Larose, data mining can be divided into six groups based on the task performed: description, estimation, prediction, classification, clustering, and association.

Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, “Which clients are most likely to respond to my next promotional mailing, and why?”

The data mining process consists of several steps. The first step is data cleaning: this stage cleans the source data so that it has no missing values, contains no noisy data, and is consistent.
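As a rough illustration of the cleaning step, the sketch below handles one missing value, one noisy value, and inconsistent spelling in a small in-memory dataset. The records, the field names, the plausible age range, and the choice of median imputation are all assumptions made for this example, not part of the process described above.

```python
from statistics import median

# Hypothetical raw records; field names and values are invented for illustration.
raw = [
    {"id": 1, "age": 34, "city": "Jakarta"},
    {"id": 2, "age": None, "city": "jakarta "},   # missing age, inconsistent city
    {"id": 3, "age": 29, "city": "Bandung"},
    {"id": 4, "age": 290, "city": "BANDUNG"},     # implausible age (noise)
    {"id": 5, "age": 41, "city": "Jakarta"},
]

def clean(records, min_age=0, max_age=120):
    """Return cleaned copies: consistent city names, no missing or noisy ages."""
    # Ages outside the assumed plausible range are treated as noise.
    valid_ages = [r["age"] for r in records
                  if r["age"] is not None and min_age <= r["age"] <= max_age]
    fill = median(valid_ages)  # one simple imputation choice among many
    cleaned = []
    for r in records:
        age = r["age"]
        if age is None or not (min_age <= age <= max_age):
            age = fill
        cleaned.append({"id": r["id"], "age": age,
                        "city": r["city"].strip().title()})
    return cleaned

cleaned = clean(raw)
```

Replacing bad values with the median keeps every record; dropping the affected records instead would be an equally reasonable choice when data is plentiful.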

The second step is data integration: this stage combines data from other or different sources of information into the single database that is required.
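The integration step can be pictured as a join between sources that share a key. The following is a minimal sketch assuming two hypothetical sources, a customer list and a transaction list, linked by a customer id; the names `crm`, `sales`, and `integrate` are invented for the example.

```python
# Two hypothetical sources that describe the same customers.
crm = [
    {"id": 1, "name": "Ani"},
    {"id": 2, "name": "Budi"},
    {"id": 3, "name": "Citra"},
]
sales = [
    {"customer_id": 1, "total": 120.0},
    {"customer_id": 3, "total": 75.5},
    {"customer_id": 1, "total": 30.0},
]

def integrate(customers, transactions):
    """Combine both sources into one record per customer (left join plus sum)."""
    totals = {}
    for t in transactions:
        key = t["customer_id"]
        totals[key] = totals.get(key, 0.0) + t["total"]
    # Customers with no transactions are kept, with a total of 0.0.
    return [{"id": c["id"], "name": c["name"], "total": totals.get(c["id"], 0.0)}
            for c in customers]

integrated = integrate(crm, sales)
```

In practice the same idea is usually expressed as a SQL join or a dataframe merge; the plain-Python version is used here only to keep the example self-contained.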

The third step is data selection: this stage selects the relevant data from the database.
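Selection usually means keeping only the rows and attributes that matter for the analysis. The sketch below assumes a hypothetical helper `select` and invented records; which attributes count as relevant depends entirely on the project.

```python
# Hypothetical integrated records; every field name and value is invented.
records = [
    {"id": 1, "name": "Ani", "age": 34, "total": 150.0, "fax": None},
    {"id": 2, "name": "Budi", "age": 29, "total": 0.0, "fax": None},
    {"id": 3, "name": "Citra", "age": 41, "total": 75.5, "fax": "021-000"},
]

def select(rows, attributes, keep=lambda row: True):
    """Keep only the rows passing `keep`, projected onto `attributes`."""
    return [{a: row[a] for a in attributes} for row in rows if keep(row)]

# Assume only customers with purchases, and only age and total, are relevant.
selected = select(records, ["age", "total"], keep=lambda row: row["total"] > 0)
```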

The fourth step is data transformation. At this stage the selected data is put into the form required by the modeling process in the data mining stage, so that the analysis can reveal hidden information and support later data mining calculations. For example, if the project uses a classification method, we define the “Predictor Attribute” and the “Class Label”.
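For a classification project, the transformation step might look like the sketch below: the predictor attributes are rescaled and separated from the class label. The attribute names, the sample rows, and the use of min-max scaling are assumptions for illustration; other transformations may suit a given project better.

```python
# Hypothetical selected rows; "responded" plays the role of the class label.
rows = [
    {"age": 20, "total": 0.0, "responded": "no"},
    {"age": 40, "total": 100.0, "responded": "yes"},
    {"age": 30, "total": 50.0, "responded": "yes"},
]
PREDICTORS = ["age", "total"]   # predictor attributes (assumed)
CLASS_LABEL = "responded"       # class label (assumed)

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

def transform(rows, predictors, class_label):
    """Return (X, y): scaled predictor vectors and their class labels."""
    columns = [min_max([r[p] for r in rows]) for p in predictors]
    X = [list(vals) for vals in zip(*columns)]
    y = [r[class_label] for r in rows]
    return X, y

X, y = transform(rows, PREDICTORS, CLASS_LABEL)
```

Scaling puts attributes measured in different units on a comparable footing, which matters for distance-based techniques.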

The fifth step is data mining: this stage determines patterns or interesting information in the data by using data mining techniques.
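To make the mining step concrete, here is a minimal sketch of one classification technique, a nearest-neighbor classifier, applied to predictor vectors and class labels. The training data and the function name `nearest_neighbor` are invented for the example, and nearest neighbor is only one of many techniques that could be used at this stage.

```python
import math

def nearest_neighbor(X_train, y_train, x):
    """Predict the class label of x from its closest training example."""
    distances = [math.dist(row, x) for row in X_train]  # Euclidean distance
    return y_train[distances.index(min(distances))]

# Hypothetical scaled predictor vectors with their class labels.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y_train = ["no", "no", "yes", "yes"]

prediction = nearest_neighbor(X_train, y_train, [0.8, 0.9])
```

Here the new point lies closest to `[0.9, 1.0]`, so the prediction takes that example's label.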

There are several major data mining techniques, namely association, classification, clustering, prediction, sequential patterns, and decision trees.
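As one example from this list, association looks for items that tend to occur together. The sketch below computes the two standard measures for a rule, support and confidence, over a handful of invented market-basket transactions; the function names are chosen for this example only.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent); antecedent must occur at least once."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

Bread and milk appear together in two of the four transactions, and bread appears in three, so the rule "bread implies milk" has support 0.5 and confidence of about 0.67.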

References:

Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, 3rd Edition (The Morgan Kaufmann Series in Data Management Systems), 2012

Daniel T. Larose, Chantal D. Larose, Data Mining and Predictive Analytics, 2nd Edition, March 2015


Written By
Eka Miranda, S.Kom., MMSI.
Subject Content Specialist | School of Information Systems http://sis.binus.ac.id