COVID-19 Semantic Search Engine Using Sentence-Transformer Models

Publication Type : Conference Paper

Publisher : Springer

Source : Computational Intelligence, Cyber Security and Computational Models. Recent Trends in Computational Models, Intelligent and Secure Systems. ICC3 2021. Communications in Computer and Information Science, vol 1631. Springer, Cham

Url : https://link.springer.com/chapter/10.1007/978-3-031-15556-7_14

Campus : Amritapuri

School : School of Computing

Center : AI (Artificial Intelligence) and Distributed Systems

Department : Computer Science and Engineering

Year : 2022

Abstract : With the onset of COVID-19, an enormous number of research papers has been published, carrying unprecedented amounts of information. It is impractical for stakeholders in the medical domain to keep pace with this newly generated knowledge by reading entire research papers and articles. In this work, a semantic search engine is proposed that uses sentence-transformer models such as BERT, DistilBERT, RoBERTa, ALBERT and DistilRoBERTa for semantic retrieval of information based on a user-provided query. The system begins by collecting COVID-19-related research papers, which are fed as input to the pre-trained sentence-transformer models. The collected papers are converted into embedded paragraphs, and the input query is sent to the same model, which in turn delivers an embedded query. The model uses cosine similarity to compare the embedded paragraphs with the embedded query and returns the top K most similar paragraphs, together with their paper ID, title, abstract, and abstract summary. The bidirectional nature of the sentence-transformer models allows them to read text sequences from both directions, capturing the meaning of each sequence more fully. Using these models, a COVID-19 semantic search engine has been developed and deployed for efficient query processing. The similarity score for each model was computed by averaging the scores of the top 100 queries. As a result, the RoBERTa model generates higher similarity scores while consuming less runtime.
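The retrieval step described in the abstract — comparing an embedded query against embedded paragraphs by cosine similarity and returning the top K — can be sketched as follows. This is a minimal illustration, not the paper's implementation: in the actual system the vectors would come from a pre-trained sentence-transformer model (e.g. RoBERTa), whereas here small hand-written vectors stand in for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, paragraph_embs, k=2):
    """Return the k (index, score) pairs for the paragraphs
    most similar to the query, highest score first."""
    scores = [(i, cosine_similarity(query_emb, emb))
              for i, emb in enumerate(paragraph_embs)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Toy stand-ins for sentence-transformer embeddings of paragraphs.
paragraphs = [
    [1.0, 0.0, 0.0],   # paragraph 0
    [0.0, 1.0, 0.0],   # paragraph 1
    [0.9, 0.1, 0.0],   # paragraph 2
]
query = [1.0, 0.0, 0.0]

# Paragraphs 0 and 2 are closest to the query.
results = top_k(query, paragraphs, k=2)
```

In the deployed engine, each returned index would be mapped back to the paragraph's paper ID, title, abstract, and abstract summary before being shown to the user.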

Cite this Research Publication : Jose, A., Harikumar, S. (2022). COVID-19 Semantic Search Engine Using Sentence-Transformer Models. In: Raman, I., Ganesan, P., Sureshkumar, V., Ranganathan, L. (eds) Computational Intelligence, Cyber Security and Computational Models. Recent Trends in Computational Models, Intelligent and Secure Systems. ICC3 2021. Communications in Computer and Information Science, vol 1631. Springer, Cham
