Subathra P. currently serves as Assistant Professor in the Department of Computer Science and Engineering, School of Engineering, Coimbatore Campus. Her areas of interest are data mining, machine learning, and big data analytics. She is currently working on text analysis.


Publication Type: Journal Article
Year of Conference Publication Type Title
2016 Journal Article L. K. Devi, P. Subathra, and Kumar, P. N., “Performance evaluation of sentiment classification using query strategies in a pool based active learning scenario”, Advances in Intelligent Systems and Computing, vol. 412, pp. 65-75, 2016.

In order to perform Sentiment Classification in scenarios where there is availability of huge amounts of unlabelled data (as in Tweets and other big data applications), human annotators are required to label the data, which is very expensive and time consuming. This aspect is resolved by adopting the Active Learning approach to create labelled data from the available unlabelled data by actively choosing the most appropriate or most informative instances in a greedy manner, and then submitting to human annotator for annotation. Active learning (AL) thus reduces the time, cost and effort to label huge amount of unlabelled data. The AL provides improved performance over passive learning by reducing the amount of data to be used for learning; producing higher quality labelled data; reducing the running time of the classification process; and improving the predictive accuracy. Different Query Strategies have been proposed for choosing the most informative instances out of the unlabelled data. In this work, we have performed a comparative performance evaluation of Sentiment Classification in a Pool based Active Learning scenario adopting the query strategies—Entropy Sampling Query Strategy in Uncertainty Sampling, Kullback-Leibler divergence and Vote Entropy in Query By Committee using the evaluation metrics Accuracy, Weighted Precision, Weighted Recall, Weighted F-measure, Root Mean Square Error, Weighted True Positive Rate and Weighted False Positive Rate. We have also calculated different time measures in an Active Learning process viz. Accumulative Iteration time, Iteration time, Training time, Instances selection time and Test time. The empirical results reveal that Uncertainty Sampling query strategy showed better overall performance than Query By Committee in the Sentiment Classification of movie reviews dataset. © Springer Science+Business Media Singapore 2016.
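The two kinds of query strategy named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy probabilities, and the use of natural-log entropy are assumptions made for illustration only. Entropy sampling ranks pool instances by the entropy of a single classifier's predicted class distribution, while vote entropy measures disagreement among the labels voted by a committee.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_sampling(pool_probs, batch_size=1):
    """Indices of the pool instances with the most uncertain predictions.

    pool_probs[i] is a (hypothetical) classifier's class-probability
    vector for unlabelled instance i.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


def vote_entropy(votes, n_classes):
    """Disagreement of a committee on one instance.

    votes is a list of predicted class labels (0..n_classes-1),
    one per committee member.
    """
    counts = [votes.count(c) for c in range(n_classes)]
    return entropy([c / len(votes) for c in counts])


# Toy pool: made-up probabilities for a binary sentiment classifier.
pool_probs = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
selected = entropy_sampling(pool_probs, batch_size=2)
print(selected)

# A committee split 2-2 disagrees more than a unanimous one.
print(vote_entropy([0, 1, 0, 1], 2), vote_entropy([0, 0, 0, 0], 2))
```

In a pool-based loop, the selected instances would be sent to the annotator, added to the labelled set, and the classifier retrained before the next query round.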

2015 Journal Article R. Ramachandran, Rajeev, D. C., Krishnan, S. G., and P. Subathra, “Deep learning – An overview”, International Journal of Applied Engineering Research, vol. 10, pp. 25433-25448, 2015.

Deep Learning is a new and emerging field in Machine Learning, developed to model higher level abstraction in data. The goal of Deep Learning is to move towards Artificial Intelligence. It provides semi-supervised or unsupervised feature learning algorithms and hierarchical feature extraction, in place of the traditional handcrafted features. This survey paper is intended to provide an overall understanding of the basic concepts of Deep Learning, by providing answers to the following questions: What is Deep Learning? What is the importance of Deep Learning? How can Deep Learning improve Machine Learning? What are the different types of Deep Learning Architectures used? What tools are used for its implementation? What are its applications? How is it suitable for Big Data Analysis? What are the challenges it faces? © Research India Publications.

2015 Journal Article S. Jaysri, Priyadharshini, J., P. Subathra, and Kumar, P. N., “Analysis and performance of collaborative filtering and classification algorithms”, International Journal of Applied Engineering Research, vol. 10, pp. 24529-24540, 2015.

Machine learning is a method which is used to learn from data without any human involvement. Recommendation systems come under Machine learning technique which has become one of the essential systems in our day to day e-commerce internet interaction. Many algorithms are proposed to effectively capture the taste of the users and to provide recommendations accurately. Collaborative filtering is one such successful method to provide recommendation to the users. Classification which also falls under Machine learning technique contains many algorithms which can classify text, numerical data, etc. In this paper, we demonstrate two Collaborative Filtering algorithms viz, User based and Item based recommender systems; and three Classification algorithms viz, Naive-Bayes, Logistic Regression and Random Forest Classification. We analysed the results based on evaluation metrics. Our experiment suggests that in Recommender systems, Item based scores over User based; and in Classification, Naive-Bayes emerges superior. © Research India Publications.

2015 Journal Article P. Subathra, Deepika, R., Yamini, K., Arunprasad, P., and Vasudevan, S. K., “A study of open source data mining tools and its applications”, Research Journal of Applied Sciences, Engineering and Technology, vol. 10, pp. 1108-1132, 2015.

Data Mining is a technology used for analyzing data from different perspectives and summarizing it into useful information. The importance of choosing data mining software tools when developing applications that use mining algorithms has led to this analysis of the commercially available open source data mining tools. The study compares the following data mining tools: KNIME, WEKA, ORANGE, R Tool and Rapid Miner. For each tool it covers the historical development and state of the art, together with (i) the applications supported, (ii) the data mining algorithms supported, (iii) the prerequisites and the procedure to install the tool, and (iv) the input file formats supported. Data mining tools are used to extract useful information from data effectively and efficiently. With many open source data mining tools available, it is increasingly challenging to decide which of the rapidly updated tools to use for a given application. This study provides a brief survey of the open source knowledge discovery tools, covering their installation process, algorithm support and input file format support. © Maxwell Scientific Organization, 2015.

Publication Type: Conference Proceedings
Year of Conference Publication Type Title
2015 Conference Proceedings A. Ravindran, Kumar, P. N., and P. Subathra, “Similarity Scores Evaluation in Social Networking Sites”, Proceedings of the International Conference on Soft Computing Systems (ICSCS). Springer, 2015.

In today’s world, social networking sites are becoming increasingly popular. Often we find suggestions for friends from such social networking sites. These friend suggestions help us identify friends that we may have lost touch with or new friends that we may want to make. At the same time, these friend suggestions may not be that accurate. To recommend a friend, social networking sites collect information about a user’s social circle and then build a social network based on this information. This network is then used to recommend to a user the people he or she might want to befriend. The FoF algorithm is one of the traditional techniques used to recommend friends in a social network. Delta-SimRank is an algorithm used to compute the similarity between objects in a network. This algorithm is also applied on a social network to determine the similarity between users. Here, we evaluate Delta-SimRank and the FoF algorithm in terms of the friend suggestions provided by them, when applied on a Facebook dataset. It is observed that Delta-SimRank provides a more precise similarity score because it considers the entire network around a user.
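As a rough illustration of the friend-of-friend idea mentioned in this abstract, the sketch below ranks non-friends by how many mutual friends they share with a user. This is the common textbook reading of FoF, not necessarily the exact variant evaluated in the paper; the graph, the function name `fof_recommend`, and the tie-breaking rule are assumptions made for illustration. Delta-SimRank is not shown.

```python
from collections import Counter


def fof_recommend(graph, user, top_k=3):
    """Suggest friends for `user` by counting mutual friends.

    graph maps each user to the set of that user's friends
    (an undirected friendship network is assumed).
    """
    friends = graph[user]
    mutual = Counter()
    for friend in friends:
        for candidate in graph[friend]:
            # Skip the user and people who are already friends.
            if candidate != user and candidate not in friends:
                mutual[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    ranked = sorted(mutual.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_k]]


# Toy network with made-up users.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d", "e"},
    "d": {"b", "c"},
    "e": {"c"},
}
suggestions = fof_recommend(graph, "a")
print(suggestions)
```

Because this score only looks two hops from the user, it ignores the rest of the network, which is the limitation the abstract contrasts with Delta-SimRank.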
