Data Mining Tools

The analytical techniques used in data mining are mostly well-known mathematical algorithms. What is new is their application to general business problems, made possible by the increased availability of data and by inexpensive storage and processing power. In addition, graphical interfaces have made tools available that business experts can use easily.
In addition to using a particular data mining tool, internal auditors can choose from a variety of data mining techniques. The most commonly used techniques include artificial neural networks, decision trees, rule induction, genetic algorithms, and the nearest-neighbor method. Each of these techniques analyzes data in a different way:
  • Artificial neural networks: Nonlinear predictive models that learn through training and resemble biological neural networks in structure.
  • Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.
  • Rule induction: The extraction of useful if-then rules from data, based on statistical significance.
  • Genetic algorithms: Optimization techniques based on the concepts of genetic combination, mutation, and natural selection.
  • Nearest neighbor: A classification technique that classifies each record based on the records most similar to it in a historical database.
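
To make the nearest-neighbor idea concrete, here is a minimal sketch in Python. The historical records, the two features (an amount and a number of days), the labels, and the `classify` function are all hypothetical and chosen only for illustration. The sketch also assumes the features are on comparable scales; real data would usually need to be rescaled first.

```python
import math
from collections import Counter

# Hypothetical historical database: (amount, days) -> known label.
HISTORY = [
    ((120.0, 5.0), "normal"),
    ((150.0, 6.0), "normal"),
    ((130.0, 4.0), "normal"),
    ((980.0, 1.0), "flagged"),
    ((1020.0, 0.5), "flagged"),
    ((950.0, 1.5), "flagged"),
]

def classify(record, history=HISTORY, k=3):
    """Label a record by majority vote among its k most similar historical records."""
    # Similarity here is plain Euclidean distance between feature tuples.
    by_distance = sorted(history, key=lambda item: math.dist(record, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(classify((1000.0, 1.0)))  # closest records are all "flagged"
print(classify((140.0, 5.0)))   # closest records are all "normal"
```

The choice of `k` and of the distance measure is a design decision: a larger `k` smooths out noisy records but can blur small groups.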


Each of these approaches has advantages and disadvantages that need to be considered before use. Neural networks are difficult to implement and require all input and resulting output to be expressed numerically, so the data must be encoded and the results interpreted in a way that depends on the nature of the data mining exercise. The decision tree technique is the most commonly used, because it is simple and straightforward to implement. Finally, the nearest-neighbor method relies on linking similar items and therefore works better for extrapolation than for predictive enquiries.
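
The requirement that neural network input be numeric can be illustrated with a small encoding sketch in Python. The record layout, the `DEPARTMENTS` list, the amount range of 0 to 1000, and the helper names are assumptions made for this example; one-hot encoding and min-max scaling are just two common ways to turn business data into numbers, not the only ones.

```python
# Hypothetical list of categories that the "department" field can take.
DEPARTMENTS = ["finance", "sales", "operations"]

def one_hot(value, categories):
    """Return a list with 1.0 at the position of value and 0.0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]

def min_max(value, low, high):
    """Rescale a number from the range [low, high] to [0, 1]."""
    return (value - low) / (high - low)

def encode(record):
    """Turn a mixed text/number record into a purely numeric input vector."""
    return one_hot(record["department"], DEPARTMENTS) + [
        min_max(record["amount"], 0.0, 1000.0)  # assumed amount range
    ]

print(encode({"department": "sales", "amount": 250.0}))
# prints [0.0, 1.0, 0.0, 0.25]
```

The network's numeric output then has to be mapped back to a business meaning in the same way, for example by treating a value above a chosen threshold as "review this record".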



