Qualification: M.E.
p_subathra@cb.amrita.edu

Subathra P. currently serves as Assistant Professor at the Department of Computer Science and Engineering, School of Engineering, Coimbatore Campus. Her areas of interest are data mining, machine learning, and big data analytics. She is currently working on text analysis.

Awards and Achievements

  • Gold Partnership from Infosys for “Bigdata analytics”

Publications

Publication Type: Journal Article

Year of Publication Publication Type Title

2017

Journal Article

T. Rajasundari, Subathra P., and Kumar, P. N., “Performance Analysis of Topic Modeling Algorithms for News Articles”, Journal of Advanced Research in Dynamical and Control Systems, vol. 2017, pp. 175-183, 2017.


Topic Modeling is a statistical model which derives latent themes from a large collection of text. In this work we developed a topic model for the BBC news corpus to find the screened regional from the corpus. We have implemented the topic modeling algorithms Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA), and three different machine learning approaches (Naive Bayes, K-NN and K-means). We compared the performance of the topic modeling algorithms with the machine learning approaches using the measures precision and recall. Our results show that topic modeling algorithms work better for a corpus with multiple topic distribution.


2017

Journal Article

Subathra P., Thiyaneswaran, V., and Kumar, P. N., “Preventive System Against Cyber Bulling Using Topic Modeling Algorithm”, Journal of Advanced Research in Dynamical and Control Systems, vol. 2017, pp. 287-296, 2017.


Text Mining is one of the techniques used for deriving high-quality information from text. Topic Modeling is a major task involved in text mining for finding hidden subjects in a corpus. Cyber bullying is the act of bullying people through electronic means of communication, which primarily threatens teens in a deliberate manner. In the proposed work we have taken survey responses from students, blogs and YouTube comments, which are fed as input to LSA and LDA for finding the cyber bullying documents. Both topic modeling techniques are designed for categorizing the cyber bullying documents in the corpus. We experimented with the LSA model for document summarization and, using the LDA model, developed add-ons for Google Chrome to prevent cyber bullying words. Performance analysis is done for both the LSA and LDA algorithms. Our results show that LDA performs better for documents with multiple topic distribution.


2015

Journal Article

R. Ramachandran, Rajeev, D. C., Krishnan, S. G., and Subathra P., “Deep learning – An overview”, International Journal of Applied Engineering Research, vol. 10, pp. 25433-25448, 2015.


Deep Learning is a new and emerging field in Machine Learning, developed to model higher level abstraction in data. The goal of Deep Learning is to move towards Artificial Intelligence. It provides semi-supervised or unsupervised feature learning algorithms and hierarchical feature extraction, in place of the traditional handcrafted features. This survey paper is intended to provide an overall understanding of the basic concepts of Deep Learning, by providing answers to the following questions: What is Deep Learning? What is the importance of Deep Learning? How can Deep Learning improve Machine Learning? What are the different types of Deep Learning Architectures used? What tools are used for its implementation? What are its applications? How is it suitable for Big Data Analysis? What are the challenges it faces? © Research India Publications.


2015

Journal Article

S. Jaysri, Priyadharshini, J., Subathra P., and Dr. (Col.) Kumar P. N., “Analysis and Performance of Collaborative Filtering and Classification Algorithms”, International Journal of Applied Engineering Research, vol. 10, pp. 24529-24540, 2015.


Machine learning is a method which is used to learn from data without any human involvement. Recommendation systems come under Machine learning technique which has become one of the essential systems in our day to day e-commerce internet interaction. Many algorithms are proposed to effectively capture the taste of the users and to provide recommendations accurately. Collaborative filtering is one such successful method to provide recommendation to the users. Classification which also falls under Machine learning technique contains many algorithms which can classify text, numerical data, etc. In this paper, we demonstrate two Collaborative Filtering algorithms viz, User based and Item based recommender systems; and three Classification algorithms viz, Naive-Bayes, Logistic Regression and Random Forest Classification. We analysed the results based on evaluation metrics. Our experiment suggests that in Recommender systems, Item based scores over User based; and in Classification, Naive-Bayes emerges superior. © Research India Publications.


2015

Journal Article

Subathra P., Deepika, R., Yamini, K., Arunprasad, P., and Dr. Shriram K Vasudevan, “A Study of Open Source Data Mining Tools and its Applications”, Research Journal of Applied Sciences, Engineering and Technology, vol. 10, pp. 1108-1132, 2015.


Data Mining is a technology used for analyzing and summarizing useful information from different perspectives of data. The importance of choosing data mining software tools for developing applications using mining algorithms has led to the analysis of the commercially available open source data mining tools. The study discusses the following aspects of the data mining tools KNIME, WEKA, ORANGE, R Tool and Rapid Miner: the historical development and state of the art; (i) the applications supported, (ii) the data mining algorithms supported by each tool, (iii) the prerequisites and procedures to install a tool, and (iv) the input file formats supported by each tool. To extract useful information from these data effectively and efficiently, data mining tools are used. With the availability of many open source data mining tools, there is an increasing challenge in deciding upon the rapidly updated tools for a given application. This study has provided a brief survey of the open source knowledge discovery tools with their installation process, algorithm support and input file format support. © Maxwell Scientific Organization, 2015.


2015

Journal Article

L. K. Devi, Subathra P., and Kumar, P. N., “Tweet Sentiment Classification Using an Ensemble of Machine Learning Supervised Classifiers Employing Statistical Feature Selection Methods”, Advances in Intelligent Systems and Computing, vol. 415, pp. 1-13, 2015.


Twitter is considered to be the most powerful tool of information dissemination among the micro-blogging websites. Every day, large volumes of user-generated content are posted on Twitter, and determining the sentiment of this content can be useful to individuals, business companies, government organisations etc. Many Machine Learning approaches have been investigated for years and there is no consensus as to which method is most suitable for any particular application. Recent research has revealed the potential of ensemble learners to provide improved accuracy in sentiment classification. In this work, we conducted a performance comparison of ensemble learners like Bagging and Boosting with baseline methods like Support Vector Machines, Naive Bayes and Maximum Entropy classifiers. As against the traditional method of using Bag of Words for feature selection, we have incorporated statistical methods of feature selection like Pointwise Mutual Information and Chi-square methods, which resulted in improved accuracy. We performed the evaluation using a Twitter dataset and the empirical results revealed that ensemble methods provided more accurate results than baseline classifiers.


2015

Journal Article

L. K. Devi, Amrita, R., Subathra P., and Dr. (Col.) Kumar P. N., “An Analysis on the Performance Evaluation of Collaborative Filtering Algorithms Using Apache Mahout”, International Journal of Applied Engineering Research (IJAER), vol. 10, pp. 14797-14812, 2015.


Recommendation systems are being widely adopted in many areas which include social networking, e-commerce etc. Long years of research have led to the proposal of many algorithms in order to effectively capture the real tastes of users and deliver the recommendations accurately. Collaborative filtering is considered to be one of the popular and successful approaches to provide recommendations. In this paper, we conduct a performance evaluation of three popular collaborative filtering algorithms viz. User based, Item based and Slope-one recommender. We illustrate a brief overview on the different approaches of collaborative filtering, their method of working, advantages and limitations. We demonstrate the results based on the evaluation metrics precision, recall, f-measure, fallout and reach. Our experiments revealed that the Slope-one approach outperformed the other two approaches based on the evaluation metrics. We also explored different kinds of similarity metrics and highlighted the effect of size of the neighbourhood on the evaluation metrics. Keywords: Collaborative Filtering (CF), Recommendation systems, Apache Mahout, User based CF, Item based CF, Slope one.


2015

Journal Article

R. Dhivyapriya, Monisha, N., and Subathra P., “Performance Evaluation of Sequential Pattern Mining Algorithms”, International Journal of Applied Engineering Research, vol. 10, no. 55, pp. 1918-1921, 2015.


Sequential Pattern Mining is the extraction of relevant patterns satisfying the minimum support threshold, where the values are delivered in a sequence and the count of the data-sequences that contain the pattern is called the support of the pattern. The major goal of sequential pattern mining is to discover a complete set of possible sequential patterns from a large dataset with a small number of database scans, which is used to make some business developments. This paper investigates three sequential pattern mining algorithms viz. Apriori based approach, Pattern Growth approach and Constraint based approach for mining sequential patterns based on the number of patterns generated and running time of the algorithm. The dataset used for the experiments to mine sequential patterns is the Restaurant Database System. © Research India Publications.


2014

Journal Article

B. Shriladha, Magudeswaran, S., Sini Raj P., and Subathra P., “Library Book Recommendation System using CF-Apriori Algorithm”, International Journal of Applied Engineering Research, vol. 9, pp. 8089-8096, 2014.


There is eventually a transition from traditional libraries to digital libraries; there can also be a transition from digital library to recommended digital library. This paper proposes a better way to facilitate the user search process and recommend books based on past library usage and similar users' interests. We create the library recommendation system using the Apriori algorithm and Collaborative Filtering (CF). The Apriori algorithm produces association rules, which can be applied to large databases. The Collaborative Filtering algorithm is used to recommend the books of similar user profiles. For a recommendation system, data collection and processing of data along with user data are required, where user ratings play a crucial role. Automating the support count estimation in the Apriori algorithm can be done to improve the efficiency of the system as future work.


Publication Type: Conference Paper

Year of Publication Publication Type Title

2016

Conference Paper

L. K. Devi, Subathra P., and Dr. (Col.) Kumar P. N., “Performance Evaluation of Sentiment Classification using Query Strategies in a Pool Based Active Learning Scenario”, in Advances in Intelligent Systems and Computing, 2016, vol. 412, pp. 65-75.


In order to perform Sentiment Classification in scenarios where there is availability of huge amounts of unlabelled data (as in Tweets and other big data applications), human annotators are required to label the data, which is very expensive and time consuming. This aspect is resolved by adopting the Active Learning approach to create labelled data from the available unlabelled data by actively choosing the most appropriate or most informative instances in a greedy manner, and then submitting to human annotator for annotation. Active learning (AL) thus reduces the time, cost and effort to label huge amount of unlabelled data. The AL provides improved performance over passive learning by reducing the amount of data to be used for learning; producing higher quality labelled data; reducing the running time of the classification process; and improving the predictive accuracy. Different Query Strategies have been proposed for choosing the most informative instances out of the unlabelled data. In this work, we have performed a comparative performance evaluation of Sentiment Classification in a Pool based Active Learning scenario adopting the query strategies—Entropy Sampling Query Strategy in Uncertainty Sampling, Kullback-Leibler divergence and Vote Entropy in Query By Committee using the evaluation metrics Accuracy, Weighted Precision, Weighted Recall, Weighted F-measure, Root Mean Square Error, Weighted True Positive Rate and Weighted False Positive Rate. We have also calculated different time measures in an Active Learning process viz. Accumulative Iteration time, Iteration time, Training time, Instances selection time and Test time. The empirical results reveal that Uncertainty Sampling query strategy showed better overall performance than Query By Committee in the Sentiment Classification of movie reviews dataset. © Springer Science+Business Media Singapore 2016.


Publication Type: Book Chapter

Year of Publication Publication Type Title

2016

Book Chapter

A. Ravindran, Kumar, P. N., and Subathra P., “Similarity Scores Evaluation in Social Networking Sites”, in Advances in Intelligent Systems and Computing, vol. 398, 2016, pp. 601-614.

Publication Type: Conference Proceedings

Year of Publication Publication Type Title

2015

Conference Proceedings

A. Ravindran, Dr. (Col.) Kumar P. N., and Subathra P., “Similarity Scores Evaluation in Social Networking Sites”, Proceedings of the International Conference on Soft Computing Systems (ICSCS). Springer, 2015.


In today’s world, social networking sites are becoming increasingly popular. Often we find suggestions for friends from such social networking sites. These friend suggestions help us identify friends that we may have lost touch with or new friends that we may want to make. At the same time, these friend suggestions may not be that accurate. To recommend a friend, social networking sites collect information about a user’s social circle and then build a social network based on this information. This network is then used to recommend to a user the people he might want to befriend. The FoF algorithm is one of the traditional techniques used to recommend friends in a social network. Delta-SimRank is an algorithm used to compute the similarity between objects in a network. This algorithm is also applied on a social network to determine the similarity between users. Here, we evaluate the Delta-SimRank and FoF algorithms in terms of the friend suggestions provided by them, when applied on a Facebook dataset. It is observed that Delta-SimRank provides a more precise similarity score because it considers the entire network around a user.
