The Internet is a pool of information containing billions of text documents, which are stored in compressed format. The literature contains many text classification algorithms that work on uncompressed text documents. Because web pages contain text data stored in compressed format, the text documents must be restored to their original format before data mining can be carried out, and this decompression consumes additional computational time. This work therefore presents a study of different text classification and clustering algorithms and compares them in the compressed domain. Various methods for representing text in the compressed domain are explained, and experiments are conducted on the LZW method for comparison. Different classification and clustering algorithms are also discussed, and a comparative analysis of all these methods is presented. © 2015, Research India Publications, All rights Reserved.
S. Akshay, Nayana, K., and Karthika, S., “A survey on classification and clustering algorithms for uncompressed and compressed text”, International Journal of Applied Engineering Research, vol. 10, pp. 27355-27373, 2015.