Logistic regression within DBMS

Publication Type : Conference Paper

Publisher : Institute of Electrical and Electronics Engineers Inc.

Source : Proceedings of the 2016 2nd International Conference on Contemporary Computing and Informatics, IC3I 2016, Institute of Electrical and Electronics Engineers Inc., pp. 661-666 (2016)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85020008364&doi=10.1109%2fIC3I.2016.7918045&partnerID=40&md5=7b2f333ada1e6abf57896a7d812a775a

ISBN : 9781509052554

Keywords : Analytical queries, Classification algorithm, Data mining, Database systems, Feature engineering, Large scale systems, Logistic regression, Mining projects, Query processing, Real data sets, Regression analysis, Statistical packages, User Defined Functions

Campus : Amritapuri

School : Department of Computer Science and Engineering, School of Engineering

Center : AI (Artificial Intelligence) and Distributed Systems

Department : Computer Science

Year : 2016

Abstract : This paper presents an analytical query model for data categorization within a DBMS. Because the DBMS is a core asset for most organizations, classifying data inside it can give better insight into and control over that data. Conventionally, classification algorithms such as logistic regression and KNN are applied after exporting the data out of the DBMS, using non-DBMS tools such as R, matrix packages, generic data mining programs, or large-scale systems such as Hadoop and Spark. However, this incurs I/O overhead, since the data within the DBMS is updated frequently and usually cannot fit in main memory. This paper proposes an alternative strategy, based on SQL and UDFs, that integrates logistic regression for data categorization and prediction query processing within the DBMS. A comparison of SQL with user-defined functions (UDFs), as well as with statistical packages such as R, is presented through experiments on real datasets. The empirical results show the viability and validity of this approach for predicting the class of a given query. © 2016 IEEE.
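
Illustration : The abstract does not give the paper's actual queries, so the following is only a minimal sketch of the general idea of combining SQL with a UDF for logistic regression, not the authors' implementation. It assumes SQLite (via Python's sqlite3 module) as a stand-in DBMS, a hypothetical table `points` with one feature `x` and a 0/1 label `y`, and a hypothetical scalar UDF named `sigmoid`. The gradient for each step of batch gradient descent is computed by an aggregate SQL query, so the rows never leave the database engine; only the two model weights are held by the client.

```python
import math
import sqlite3


def sigmoid(z):
    """Numerically stable logistic function, registered below as a SQL UDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(conn, epochs=2000, lr=0.5):
    """Fit y ~ sigmoid(w0 + w1 * x) by batch gradient descent.

    Each epoch issues one aggregate query; the DBMS scans the table and
    returns only the two gradient sums.
    """
    w0, w1 = 0.0, 0.0
    n = conn.execute("SELECT COUNT(*) FROM points").fetchone()[0]
    grad_sql = (
        "SELECT SUM(sigmoid(? + ? * x) - y), "
        "       SUM((sigmoid(? + ? * x) - y) * x) "
        "FROM points"
    )
    for _ in range(epochs):
        g0, g1 = conn.execute(grad_sql, (w0, w1, w0, w1)).fetchone()
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
    return w0, w1


def predict(conn, w0, w1):
    """Prediction query: class 1 when the estimated probability is >= 0.5."""
    sql = "SELECT x, sigmoid(? + ? * x) >= 0.5 FROM points ORDER BY x"
    return conn.execute(sql, (w0, w1)).fetchall()


# Hypothetical toy data (not from the paper); deliberately not linearly separable.
conn = sqlite3.connect(":memory:")
conn.create_function("sigmoid", 1, sigmoid)
conn.execute("CREATE TABLE points (x REAL, y INTEGER)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1),
     (2.2, 0), (2.5, 1), (3.0, 1), (3.5, 1)],
)

w0, w1 = train(conn)
for x, label in predict(conn, w0, w1):
    print(x, label)
```

A production DBMS would typically register the UDF in its own extension language rather than in client-side Python, and a realistic model would have many features; the sketch keeps one feature so the SQL stays readable.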

Cite this Research Publication : J. Isaac and Sandhya Harikumar, “Logistic regression within DBMS”, in Proceedings of the 2016 2nd International Conference on Contemporary Computing and Informatics, IC3I 2016, 2016, pp. 661-666
