Qualification: MCA
Email: ramya@am.amrita.edu

Remya Rajesh currently serves as Assistant Professor (Senior Grade) in the Department of Computer Science Applications, Amrita School of Engineering, Amritapuri.

Publications

Publication Type: Conference Paper


2017

Remya Rajesh and Aswathy, N., “Document Summarization Using Dictionary Learning”, in 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 2017.


Document summarization is a strategy intended to extract information from multiple documents addressing the same subject. Many software applications handle document summarization, helping people grasp the main idea of a collection of documents within a short time. Automatic summaries present information extracted algorithmically from multiple sources, without human intervention. Experiments have produced ingenious algorithms that surmount the task of creating a short and salient summary. One such technique, suggested in this paper, is dictionary learning. This paper focuses on document summarization using dictionary learning and sparse coding techniques, considering the ordering of sentences and the redundancy of documents. We use Singular Value Decomposition (SVD) for dictionary learning and Orthogonal Matching Pursuit (OMP) for sparse coding. The application of SVD augments the semantics of the generated summary. The order of sparsity in the final sparse code is used to order the sentences in the final summary. Verification of our proposed methodology has shown 75% precision.
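The OMP step the abstract refers to can be illustrated with a minimal sketch (this is a generic greedy OMP in numpy under assumed inputs, not the paper's implementation): each input vector is approximated as a k-sparse combination of dictionary atoms.

```python
import numpy as np

def omp(D, x, k):
    """Greedy Orthogonal Matching Pursuit: approximate x with at most k
    columns (atoms) of the dictionary D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit on the selected atoms, then update the residual
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef

# Hypothetical example: a vector built from two atoms is recovered exactly.
D = np.eye(4)[:, :3]                 # toy dictionary with orthonormal atoms
x = 2 * D[:, 0] + 3 * D[:, 2]
code = omp(D, x, 2)
```

In the summarization setting of the paper, sentences would play the role of the vectors being coded, with the dictionary learnt via SVD.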


2016

Remya Rajesh, Kini, N. V., and Krishnan, G. A., “Harnessing the Discriminatory Strength of Dictionaries”, in Second International Symposium on Emerging Topics in Computing and Communication, 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, 2016.


Over the past few years there have been many developments in the area of classification in data mining. Classification is a supervised learning method that maps data into predefined groups or classes. Nowadays classification techniques are extensively used in different applications, and most research in this area has been done on text, images, signals, etc. The main goal of this paper is to use a dictionary-based approach to learn, represent and classify documents. We consider a dictionary as a collection of documents, and each document in the dictionary is represented as a collection of vectors. An algorithm is also implemented to easily locate a class-specific document in the dictionary and, if it is not present, update the dictionary. The existing method is based on a dictionary learning algorithm which only improves the document representation through Singular Value Decomposition (SVD) updates. Since SVD does not help discriminate between classes, our proposed algorithm uses Linear Discriminant Analysis (LDA) to learn a discriminating dictionary. On applying the proposed algorithm to a well-known dataset, the overall results show a 90% improvement in accuracy. The advantage is that it can be used for both representation and classification.
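The contrast the abstract draws (SVD reconstructs, LDA discriminates) can be sketched with the classical two-class Fisher discriminant; this is a generic illustration in numpy with made-up data, not the paper's dictionary-learning algorithm:

```python
import numpy as np

def lda_direction(X0, X1):
    """Fisher discriminant direction for two classes (rows = samples):
    w maximizes between-class over within-class scatter."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # within-class scatter: summed outer products of deviations
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # small ridge term keeps the solve stable if Sw is near-singular
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), m1 - m0)
    return w / np.linalg.norm(w)

# Hypothetical example: two well-separated clusters.
X0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X1 = X0 + 5.0
w = lda_direction(X0, X1)
```

Projecting the samples onto `w` separates the two classes, which is the discriminatory strength that plain SVD-based representation lacks.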


2016

Remya Rajesh, Gargi, S., and Samili, S., “Clustering of Words Using Dictionary-learnt Word Representations”, in 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, 2016.


Language is the means of communication through words, and it helps us gain a better insight into the world. Natural Language Processing (NLP) mainly concentrates on building systems that allow computers to communicate with people using everyday language. One of the challenges inherent in NLP is teaching computers to recognize the way humans learn and use language. Word representations make it possible to capture syntactic and semantic properties of words. The main purpose of this work is therefore to find the set of words that have similar semantics by matching the contexts in which the words occur. In this work we explore a new method for learning word representations using sparse coding, a technique usually applied to signals and images. We present an efficient sparse coding algorithm, Orthogonal Matching Pursuit, to generate the sparse codes. Sparse codes are generated for the given input, and the input term vectors are classified by grouping terms that have the same sparse code into one class. The K-Means algorithm is also used to cluster the input term vectors that have semantic similarities. Finally, this paper compares the two to determine which gives the better representation, the sparse code or K-Means. The results show an improved set of similar words using the sparse code when compared to K-Means. This is because SVD is used as part of dictionary learning, which captures the latent relationships that exist between the words.
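The grouping step described above (terms with the same sparse code fall into one class) can be sketched by keying each term on the support of its sparse code; this is an illustrative helper with toy data, not the paper's code:

```python
import numpy as np

def group_by_support(codes, terms):
    """Group terms whose sparse codes activate the same dictionary atoms.
    codes: 2-D array, one sparse code per row; terms: matching labels."""
    clusters = {}
    for term, code in zip(terms, codes):
        key = tuple(np.flatnonzero(code))   # indices of the active atoms
        clusters.setdefault(key, []).append(term)
    return clusters

# Hypothetical example: "cat" and "dog" share atom 0, "car" uses atom 1.
codes = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 2.0]])
clusters = group_by_support(codes, ["cat", "dog", "car"])
```

Terms sharing a support end up in the same cluster, which is the sparse-code analogue of a K-Means cluster assignment.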


2016

Remya Rajesh and Namitha, K., “Performance Analysis of Updating-QR Supported OLS Against Stochastic Gradient Descent”, in Intelligent Systems Technologies and Applications, Cham, 2016.


The regression model is a well-studied method for the prediction of real-valued data. Depending on the structure of the data involved, different approaches have been adopted for estimating the parameters, which include linear equation solvers, Gradient Descent, the Least Absolute Shrinkage and Selection Operator (LASSO) and the like. The performance of each varies with the data size and the computation involved. Many methods have been introduced to improve their performance, such as QR factorization in the least squares problem. Our focus is on analysing the performance of gradient descent and QR-based ordinary least squares for estimating and updating the parameters under varying data sizes. We have considered both tall/skinny and short/fat matrices. We have implemented the Block Householder method of QR factorization on the Compute Unified Device Architecture (CUDA) platform, using a GTX 645 Graphics Processing Unit (GPU), with the initial set of data. Newly arriving data is applied directly to the existing Q and R factors rather than recomputing the QR factorization from scratch. This updating-QR platform is then utilized for performing regression analysis on-the-fly. The results are compared against our gradient descent implementation. They show that the parallel-QR method for regression analysis achieves a speed-up of up to 22x over the gradient descent method when the attribute size is larger than the sample size, and up to 2x when the sample size is larger than the attribute size. Our results also show that the updating-QR method achieves a speed-up approaching 2x over the gradient descent method for large datasets when the sample size is less than the attribute size.
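The QR-supported OLS step can be sketched in a few lines of numpy (a minimal CPU sketch of the basic factorization-then-solve idea; the paper's contribution, a CUDA Block Householder implementation with incremental Q/R updates, is not reproduced here):

```python
import numpy as np

def ols_qr(A, b):
    """Ordinary least squares via QR: with A = QR, the normal equations
    reduce to the triangular system R x = Q^T b, which is numerically
    more stable than forming A^T A directly."""
    Q, R = np.linalg.qr(A)          # reduced QR: R is square
    return np.linalg.solve(R, Q.T @ b)

# Hypothetical example: fit y = 1 + 1*x through three exact points.
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])   # [intercept, x]
b = np.array([2.0, 3.0, 4.0])
params = ols_qr(A, b)
```

When new rows arrive, this sketch would refactorize from scratch; the updating-QR scheme in the paper instead folds new rows into the existing Q and R factors, which is where the reported speed-up comes from.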


2015

Remya Rajesh and Nedungadi, P., “New Methodology to Differentiate Instructional Strategies for ESL Learners in the Indian Context”, in 2015 IEEE Frontiers in Education Conference (FIE), 2015.


Many students struggle with reading when whole-group instruction forms the core of the reading program. This is especially true when teaching second-language students. The proposed intervention methodology combines multiple proven methods to improve students' reading skills. The study focused on using differentiated instruction and multiple assessments, such as the Informal Reading Inventory (IRI), Qualitative Spelling Inventory (QSI), Running Records, Dynamic Indicators of Basic Early Literacy Skills (DIBELS), high-frequency words and phonological awareness. After using these tools, students learnt to follow a firm reading routine, respect classroom procedures, work in teams and solve problems independently. This five-month study examined the benefits of differentiated instruction with thirty-six 5th-grade students who were second-language English learners in a school in the state of Karnataka, India. The key findings indicated that differentiating instruction and using small-group instruction improved students' reading and writing proficiency. With our proposed method, 94% of the students improved their reading comprehension by a minimum of three grade levels. An unexpected benefit was a positive change in the attitude and behavior of the students, along with increased confidence.

PDF: new-methodology-to-differentiate-instructional-strategies-for-esl-learners-in-the-indian-context.pdf

2014

Remya Rajesh, Shaji, C. P., and Kaimal, M. R., “Singular Value Decomposition - A Revisit on a CUDA Platform”, in International Conference on Emerging Research in Computing, Information, Communication and Applications, ERCICA 2014, 2014.

2014

Remya Rajesh, K., A., and Kaimal, M. R., “Parallel Sparse Coding for Categorical Data”, in Second International Conference on Emerging Research in Computing, Information, Communication and Applications (ERCICA), NMIT, Yelahanka, Bangalore, 2014.


The area of machine learning has been witnessing tremendous improvements, with new algorithms being developed to suit particular domains, existing algorithms being improved for computational cost benefits, and concepts from one domain being applied to another. This is an era of big data, generated from different sources such as the web, medicine, E-learning, networking, etc. Sparse representation is a promising area for handling big data. Reducing computational cost while handling this colossal data requires data distribution, parallel algorithms, or both. The focus of our paper is the application of sparse coding for the sparse representation of categorical data in a parallel environment using the GPGPU, comparing the results with the sequential iterative method for sparse coding and with the Message Passing Interface (MPI) environment. The categorical data is transformed into a vector space model. From among the different sparse coding algorithms, such as matching pursuit, basis pursuit, FOCUSS, etc., applicable mostly to the signal and image domains, we have parallelised the computational steps of the Batch Orthogonal Matching Pursuit algorithm, which generates a separate sparse code for each instance of a large dataset over the same dictionary. The algorithm is analysed and found to fare 90% better on the GPU compared to the sequential implementation, and 80% better compared to the MPI environment. The results are demonstrated on synthetic and real data.


2013

Remya Rajesh, Nair, S. S., Srindhya, K., and Kaimal, M. R., “Sparsity-based Representation for Categorical Data”, in 2013 IEEE Recent Advances in Intelligent Computational Systems (RAICS), Trivandrum, India, 2013.


Over the past few decades, many algorithms have been continuously evolving in the area of machine learning. This is an era of big data, generated by applications in various fields such as medicine, the World Wide Web, E-learning, networking, etc. We are therefore still in need of more efficient algorithms which are computationally cost-effective and thereby produce faster results. Sparse representation of data is one giant leap in the search for a solution to big data analysis. The focus of our paper is on algorithms for the sparsity-based representation of categorical data. For this, we adopt a concept from the image and signal processing domain called dictionary learning. We have implemented its sparse coding stage, which gives the sparse representation of the data, using Orthogonal Matching Pursuit (OMP) algorithms (both Batch and Cholesky based), and its dictionary update stage using the Singular Value Decomposition (SVD). We also use a preprocessing stage in which we represent the categorical dataset using a vector space model based on the TF-IDF weighting scheme. Our paper demonstrates how input data can be decomposed and approximated as a linear combination of a minimum number of elementary columns of a dictionary, which then forms a compact representation of the data. Classification or clustering algorithms can now easily be performed on the generated sparse coefficient matrix or on the dictionary. We also compare the dictionary learning algorithm when applying the different OMP algorithms. The algorithms are analysed and the results are demonstrated through synthetic tests and on real data.
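The TF-IDF preprocessing stage mentioned above can be sketched in pure Python (a standard textbook TF-IDF with raw term counts and idf = log(N/df); the paper's exact weighting variant is not specified here, so treat this as an assumption):

```python
import math
from collections import Counter

def tfidf(docs):
    """Build a TF-IDF vector space model from tokenized documents.
    Returns the sorted vocabulary and one weight vector per document."""
    N = len(docs)
    # document frequency: in how many documents each term appears
    df = Counter(t for d in docs for t in set(d))
    vocab = sorted(df)
    idf = {t: math.log(N / df[t]) for t in vocab}
    # term frequency (raw count) scaled by inverse document frequency
    return vocab, [[d.count(t) * idf[t] for t in vocab] for d in docs]

# Hypothetical example: "a" occurs everywhere, so its idf (and weight) is 0.
vocab, vectors = tfidf([["a", "b"], ["a", "c"]])
```

The resulting vectors are what the sparse coding stage would then decompose over the learnt dictionary.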


Publication Type: Journal Article


2017

Remya Rajesh, Joseph, D., and Kaimal, M. R., “Semantics-based topic inter-relationship extraction”, Journal of Intelligent and Fuzzy Systems, vol. 32, pp. 2941-2951, 2017.


Maintaining a large collection of documents is an important problem in many areas of science and industry. Analyses of a large document collection can be performed with ease only if a short or reduced description can be obtained, and topic modeling offers a promising solution. Topic modeling is a method that learns hidden themes from a large set of unorganized documents. Different approaches are available for finding topics, such as Latent Dirichlet Allocation (LDA), neural networks, Latent Semantic Analysis (LSA), probabilistic LSA (pLSA) and probabilistic LDA (pLDA). In topic models the inferred topics are based only on observed term occurrences; however, the terms may not be semantically related in a manner that is relevant to the topic. Understanding the semantics can yield improved topics for representing the documents. The objective of this paper is to develop a semantically oriented, probabilistic-model-based approach for generating a topic representation from the document collection. From the modified topic model, we generate two matrices: a document-topic matrix and a term-topic matrix. The reduced document-term matrix derived from these two matrices has 85% similarity with the original document-term matrix, i.e. we get 85% similarity between the original document collection and the documents reconstructed from the above two matrices. Also, a classifier applied to the document-topic matrix appended with the class label shows an 80% improvement in F-measure score. The paper also uses the perplexity metric to find the number of topics for a test set. © 2017 IOS Press and the authors. All rights reserved.
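The reconstruction the abstract measures (document-topic times topic-term approximating the original document-term matrix) can be sketched with a generic low-rank factorization; this uses truncated SVD as a stand-in, not the paper's semantically oriented probabilistic model:

```python
import numpy as np

def low_rank_reconstruction(X, k):
    """Rebuild a document-term matrix from k-topic factors, as an
    analogue of multiplying document-topic by topic-term matrices."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    doc_topic = U[:, :k] * s[:k]    # per-document topic weights
    topic_term = Vt[:k]             # per-topic term weights
    return doc_topic @ topic_term

# Hypothetical example: a rank-2 "document-term" matrix is rebuilt
# exactly from 2 topics.
X = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 1.0]) \
    + np.outer([0.0, 1.0, 0.0], [0.0, 1.0, 1.0])
X_hat = low_rank_reconstruction(X, 2)
```

Comparing `X_hat` to `X` (e.g. by cosine similarity per document) is the kind of similarity score the abstract reports at 85%.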


2016

Remya Rajesh and Aswathi, P., “Document classification with hierarchically structured dictionaries”, Advances in Intelligent Systems and Computing, vol. 385, pp. 387-397, 2016.


Classifying and clustering documents, detecting novel documents, and detecting emerging topics in a fast and efficient way are of high relevance these days, with the volume of documents generated online increasing rapidly. Experiments have resulted in innovative algorithms, methods and frameworks to address these problems. One such method is dictionary learning. We introduce a new 2-level hierarchical dictionary structure for classification, in which the dictionary at the higher level is utilized to classify the K classes of documents. The results show around an 85% recall during the classification phase. This model can be extended to a distributed environment, where the higher-level dictionary is maintained at the master node and the lower-level ones are kept at the worker nodes. © Springer International Publishing Switzerland 2016.
