Owing to the ever-growing need to manage huge volumes of data, together with the desire for consistent, scalable, reliable and efficient retrieval of information, an intelligent mechanism for designing the storage structure of distributed databases has become essential. The two critical facets of distributed databases are data fragmentation and allocation. Existing fragmentation techniques are based on the frequency and type of queries as well as statistics of the empirical data. However, very little work has been done on fragmenting data based on the patterns in the tuples and the attributes responsible for those patterns. This paper presents a unique approach to hybridized fragmentation that applies a subspace clustering algorithm to produce a set of fragments partitioning the data with respect to both tuples and attributes. Projected clustering determines clusters in the subspaces of high-dimensional data. This makes it possible to find the closely correlated attributes for different sets of instances, thereby giving good hybridized fragments for distributed databases. Experimental results show that fragmenting the database based on clustering results in reduced database access time compared with fragments chosen at design time using certain statistics. © 2015 IEEE.
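The idea of a hybridized fragment (a subset of tuples restricted to a subset of attributes) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the clustering procedure, so the cluster labels are taken as given, and the variance-based rule for picking each cluster's attributes, the function name `hybrid_fragments`, the `ratio` threshold and the sample table are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import pvariance


def hybrid_fragments(rows, labels, key, ratio=0.5):
    """Split rows horizontally by cluster label, then vertically by keeping
    only the attributes that are tightly grouped inside that cluster.

    Hypothetical helper: `labels` would come from any subspace or projected
    clustering algorithm; `ratio` is an assumed relevance threshold.
    """
    attrs = [a for a in rows[0] if a != key]
    # Variance of each attribute over the whole table.
    global_var = {a: pvariance([r[a] for r in rows]) for a in attrs}

    # Horizontal step: group tuples by their cluster label.
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    fragments = {}
    for label, members in groups.items():
        # Vertical step: treat an attribute as relevant to the cluster when
        # its variance inside the cluster is small relative to the table.
        relevant = [
            a for a in attrs
            if global_var[a] > 0
            and pvariance([r[a] for r in members]) <= ratio * global_var[a]
        ]
        # Keep the key in every fragment so tuples can be joined back.
        fragments[label] = [
            {key: r[key], **{a: r[a] for a in relevant}} for r in members
        ]
    return fragments


# Made-up sample table and cluster labels, for illustration only.
rows = [
    {"id": 1, "age": 21, "salary": 900, "visits": 3},
    {"id": 2, "age": 22, "salary": 100, "visits": 40},
    {"id": 3, "age": 23, "salary": 500, "visits": 20},
    {"id": 4, "age": 60, "salary": 510, "visits": 1},
    {"id": 5, "age": 30, "salary": 500, "visits": 45},
    {"id": 6, "age": 45, "salary": 490, "visits": 22},
]
labels = [0, 0, 0, 1, 1, 1]
fragments = hybrid_fragments(rows, labels, key="id")
```

In this toy table, cluster 0 keeps only `age` and cluster 1 keeps only `salary`, each alongside the key. An attribute such as `visits`, which is selected for no cluster, would have to be placed in a remainder fragment to keep the fragmentation complete; that step is omitted here.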
Cited by: 0; Conference: 2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems, SPICES 2015; Conference date: 19 February 2015 through 21 February 2015; Conference code: 112047
Sandhya Harikumar and Raji Ramachandran, “Hybridized fragmentation of very large databases using clustering”, in 2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems, SPICES 2015, 2015.