This paper presents a new computational approach for discovering interesting relations between variables, called association rules, in large, high-dimensional datasets. State-of-the-art techniques are computationally expensive because of high dimensionality, the generation of a huge number of candidate sets, and multiple database scans. Moreover, most of the many patterns discovered are obvious, redundant, or uninteresting to the user. This paper therefore aims to improve the Apriori algorithm so that it finds association rules pertaining only to the important attributes of high-dimensional data. We employ an information-theoretic method together with QR decomposition to identify the significant attributes and represent the data in its proper substructure form without losing its semantics. Experiments on real datasets and a comparison with the existing technique show that the proposed strategy is always faster computationally and always statistically comparable with the Apriori algorithm in terms of the rules generated.
Sandhya Harikumar, Dilipkumar, D. Usha, and Dr. M. R. Kaimal, “Efficient attribute selection strategies for association rule mining in high dimensional data”, International Journal of Computational Science and Engineering, vol. 15, pp. 201–213, 2017.
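The pipeline outlined in the abstract (select significant attributes, then mine rules only over those attributes) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the information-theoretic criterion or how QR decomposition is applied, so the sketch assumes a Shannon-entropy filter followed by a greedy column-pivoted Gram-Schmidt orthogonalization (a simple form of pivoted QR). The toy data, the function names, and the thresholds `min_entropy`, `tol`, and `min_support` are all hypothetical.

```python
import math

# Toy binary dataset (hypothetical): one column per attribute, one entry per transaction.
columns = {
    "bread":  [1, 1, 0, 1, 0, 1],
    "milk":   [1, 0, 1, 1, 1, 0],
    "butter": [1, 1, 0, 1, 0, 0],
    "dairy":  [1, 0, 1, 1, 1, 0],  # exact duplicate of "milk" (redundant)
    "bag":    [1, 1, 1, 1, 1, 1],  # constant, carries no information
}

def entropy(column):
    """Shannon entropy in bits of a binary column."""
    p = sum(column) / len(column)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def pivoted_qr_select(cols, tol=1e-9):
    """Greedy column-pivoted Gram-Schmidt: keep columns that add a new direction."""
    residual = {name: [float(v) for v in col] for name, col in cols.items()}
    selected = []
    while residual:
        # Pivot on the column with the largest remaining norm.
        pivot = max(residual, key=lambda k: sum(v * v for v in residual[k]))
        norm = math.sqrt(sum(v * v for v in residual[pivot]))
        if norm <= tol:  # everything left is (numerically) dependent on the selection
            break
        q = [v / norm for v in residual.pop(pivot)]
        selected.append(pivot)
        # Remove the component along q from every remaining column.
        updated = {}
        for name, col in residual.items():
            dot = sum(a * b for a, b in zip(q, col))
            updated[name] = [a - dot * b for a, b in zip(col, q)]
        residual = updated
    return selected

def apriori(transactions, items, min_support):
    """Simplified level-wise Apriori over `items` (join step only, no subset pruning)."""
    n = len(transactions)
    frequent = {}
    level = [frozenset([item]) for item in items]
    while level:
        k = len(level[0]) + 1
        support = {c: sum(c <= t for t in transactions) / n for c in level}
        kept = {c: s for c, s in support.items() if s >= min_support}
        frequent.update(kept)
        # Candidates of size k are unions of two frequent itemsets of size k - 1.
        level = list({a | b for a in kept for b in kept if len(a | b) == k})
    return frequent

# Step 1 (assumed criterion): drop attributes whose entropy is below a threshold.
min_entropy = 0.1
informative = {n: c for n, c in columns.items() if entropy(c) >= min_entropy}

# Step 2: drop attributes that are linearly dependent on those already selected.
selected = pivoted_qr_select(informative)

# Step 3: mine frequent itemsets over the reduced attribute set only.
n_rows = len(next(iter(columns.values())))
transactions = [
    frozenset(name for name, col in columns.items() if col[i]) for i in range(n_rows)
]
frequent = apriori(transactions, selected, min_support=0.5)

print("selected attributes:", selected)
for itemset, support in frequent.items():
    print(sorted(itemset), round(support, 3))
```

In this toy run the constant attribute is removed by the entropy filter and the duplicated attribute by the pivoting step, so Apriori generates candidates over three attributes instead of five. Production code would more likely rely on a library routine such as `scipy.linalg.qr(A, pivoting=True)` and a full Apriori implementation with subset pruning.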