Kateryna Kuzma1 and Oleksandr Melnyk2, 1Department of Computer Science, Mykolaiv V. O. Sukhomlynskyi National University, Mykolaiv, Ukraine, 2Department of Economics and Information Technology, Mykolaiv Institute of Human Development of the Higher Education Establishment «Open International University of Human Development «Ukraine», Mykolaiv, Ukraine
The aim is to find a method for assessing answers to open-ended questions given as natural-looking text in Ukrainian, using machine learning methods. The problem is treated as a binary text classification task. The following results were obtained: the application of machine learning models to assessing detailed responses in natural-looking text format was studied; the use of the logistic regression model for this problem was substantiated; the calculation of the model’s parameters using the scikit-learn library was considered; a text normalization procedure was proposed; and a method for checking whether a word is an abbreviation was developed. Two approaches to text vectorization are argued for: text normalization with a Bag of Words and TF-IDF model, and character n-grams with TF-IDF. The proposed approach can be used to pre-process answers to open-ended questions in testing systems in order to determine the relevance of an answer to the content of the discipline.
Natural-Language Processing, Machine Learning, Answer Assessment, Open-Ended Questions, Natural-Looking Text.
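The second vectorization route named in the abstract above (character n-grams with TF-IDF) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the trigram size, and the sample answers are assumptions made for illustration, and the logistic regression step is not shown. The IDF formula is the smoothed form that scikit-learn uses by default.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def tfidf_vectors(docs, n=3):
    """Compute L2-normalized TF-IDF vectors (as dicts) over character n-grams."""
    grams = [Counter(char_ngrams(d, n)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(g for counts in grams for g in counts)
    n_docs = len(docs)
    vectors = []
    for counts in grams:
        # Smoothed IDF: ln((1 + N) / (1 + df)) + 1
        vec = {g: tf * (math.log((1 + n_docs) / (1 + df[g])) + 1)
               for g, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({g: w / norm for g, w in vec.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two already L2-normalized sparse vectors."""
    return sum(w * b.get(g, 0.0) for g, w in a.items())

# Hypothetical sample answers (not taken from the paper's data).
answers = ["база даних", "бази даних", "нейронна мережа"]
vecs = tfidf_vectors(answers)
```

Because character n-grams overlap across inflected forms, the first two answers share several trigrams and receive a higher cosine similarity than the first and third, without any explicit text normalization step.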
Rohan Mistry and Neel Shah and Ravi Patel and Dr. Ramchandra Mangrulkar, Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
The COVID-19 pandemic has rapidly affected our day-to-day life, disrupting world trade and movement. Wearing a protective face mask has become the new normal. In the near future, many public service providers will ask customers to wear masks correctly to avail of their services. Therefore, face mask detection has become a crucial task to help global society. This paper presents a simplified approach to this task using basic machine learning packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The proposed method detects the face in the image and then identifies whether it has a mask on it. As a surveillance task performer, it can also detect a face along with a mask in motion. The method attains accuracy of up to 95.77% and 94.58% on two different datasets, respectively. We explore optimized values of parameters using the Sequential Convolutional Neural Network model to detect the presence of masks correctly without causing over-fitting.
Image Processing, Deep Learning, Convolutional Neural Networks, Face Detection.
Mohd Suaib*1 and Dr. M Shahid Husain2, 1Dept. of Computer Science Integral University, Lucknow, 2Dept. of Information Technology UTAS, Oman
Ranking web pages is an important task because it helps the user find highly ranked pages that are relevant to the query. Different metrics have been proposed to rank web pages according to their quality. With the help of web usage analysis, we can effectively improve the ranking of web pages according to the user’s requirements. The objective of the proposed work is to provide an efficient framework for personalized web page ranking in search engines based on web usage analysis. The proposed framework consists of two modules. In the first module, association rules are generated from frequent patterns (access sequences of pages) for ranking the web pages. In the second module, the rules discovered in the first module are optimized using the Bat Algorithm, a nature-inspired optimization technique.
Personalized page ranking, Search ranking, web mining, web usage mining, association rule mining, web log analysis.
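The first module described in the abstract above, generating association rules from page access sequences, can be sketched as follows. This is a simplified illustration and not the authors' framework: it mines only ordered page pairs, the thresholds and session data are hypothetical, and the Bat Algorithm optimization of the second module is not shown.

```python
from collections import Counter
from itertools import combinations

def mine_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for ordered page pairs."""
    n = len(sessions)
    page_count = Counter()
    pair_count = Counter()
    for session in sessions:
        seen = list(dict.fromkeys(session))  # unique pages in first-visit order
        page_count.update(seen)
        # combinations() preserves order, so (a, b) means a was first visited before b
        pair_count.update(combinations(seen, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                 # share of sessions containing a then b
        confidence = count / page_count[a]  # share of sessions with a that go on to b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Highest confidence first, then support, then page names for a stable order.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

# Hypothetical access log: one list of visited pages per user session.
sessions = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "about"],
    ["products", "cart"],
]
rules = mine_rules(sessions)
```

With these four sessions, only the rules home → products and products → cart pass both thresholds, each with support 0.5 and confidence 2/3. A ranking module could then boost pages that appear as consequents of rules matching the user's recent history.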
Mohammad Naveed Hossain, Sheikh Fahim Uz Zaman, Tazria Zerin Khan, Sumiaya Azad Katha, Tawhid Anwar and Dr. Muhammad Iqbal Hossain
We live in a time when technology has the potential to improve or degrade our lives. We cannot imagine a day without technology in today’s digital world, and for security we rely heavily on single-factor or two-factor authentication (2FA). Even if we use 2FA, our data may still be hacked. 2FA has several features. Passwords have flaws and, as a result, may be easily compromised by hackers, even if the OTP is not available to them. To address this vulnerability and enhance the security and dependability of our data, we use three-factor authentication (3FA) to prevent any unauthorized user from accessing our data. Each of the five authentication steps requires three authentications. The first is the most commonly used: login and password. If both the password and the OTP are valid, the system prompts for a third piece of information. Biometric authentication, such as fingerprint or voice recognition, is one option, although not all devices support these features, so biometric identification is permitted only on specific devices. A graphical password is also available as an option. By encrypting our data and using these three authentication methods, we can ensure that it is secure and trustworthy for all of our users.
OTP, Authentication, 2FA, 3FA, Hacked, Bio-Metric, alphanumeric password, data protection, network security, three-factor authentication.
Awritrojit Banerjee1* and Aruna Chakraborty2, 1Department of Information Technology, St. Thomas’ College of Engineering & Technology, Kolkata, India, 2Department of Basic Science and Humanities, St. Thomas’ College of Engineering & Technology, Kolkata, India
The prevalent machine learning algorithms used for classification require extensive data pre-processing and a lot of training to find the values of the learnable parameters that maximize the prediction capability of the model. This makes such algorithms slow and often numerically unstable. We seldom find classifiers that can not only correctly classify points that belong to the dataset but also identify points that do not. In this paper, we propose a fuzzy algorithm to solve these problems. The proposed algorithm is highly scalable, is not affected by the curse of dimensionality, and can also identify points not belonging to the dataset with a certainty of 1. This fuzzy-logic-oriented approach uses membership functions to build classifiers that achieved 100% accuracy on the Iris dataset and 93.85% accuracy on the high-dimensional Wisconsin Breast Cancer dataset at thresholds greater than 0.65.
Fuzzy Logic, Pattern Classification, High dimensional dataset, Unknown pattern identification, Gaussian membership function.
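The idea in the abstract above of a membership-based classifier that can also reject unknown points can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything here is an assumption: the per-feature Gaussian membership parameterized by class mean and standard deviation, the averaging of feature memberships, the function names, and the toy data. Only the Gaussian membership function and the 0.65 threshold come from the abstract and keywords.

```python
import math
from statistics import mean, pstdev

def fit(samples, labels):
    """Store per-class, per-feature (mean, std) to parameterize Gaussian memberships."""
    model = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        cols = list(zip(*rows))
        # Guard against zero spread so the membership function stays defined.
        model[label] = [(mean(c), pstdev(c) or 1e-9) for c in cols]
    return model

def membership(x, params):
    """Mean Gaussian membership of point x across features, in [0, 1] (assumed aggregation)."""
    degrees = [math.exp(-((v - m) ** 2) / (2 * s ** 2)) for v, (m, s) in zip(x, params)]
    return sum(degrees) / len(degrees)

def classify(x, model, threshold=0.65):
    """Return the best class, or None when no class membership reaches the threshold."""
    label, degree = max(((l, membership(x, p)) for l, p in model.items()),
                        key=lambda t: t[1])
    return label if degree >= threshold else None

# Hypothetical two-class toy data (not the Iris or Breast Cancer datasets).
samples = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit(samples, labels)
```

A point near a class center gets a membership close to 1 and is assigned to that class, while a point far from every class, such as (3.0, 3.0) here, falls below the threshold for all classes and is reported as not belonging to the dataset.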
Gleb Kiselev1,2*, Daniil Weizenfeld1,2 and Yaroslava Gorbunova1,2, 1Department of Information Technology, Peoples’ Friendship University of Russia, Miklukho-Maklaya str. 6, Moscow, 117198, Russia, 2Artificial Intelligence Research Institute, FRC CSC RAS, Vavilova str. 44, Moscow, 119333, Russia
The paper considers the problem of automatically analyzing a user’s natural-language query about an image. The mechanism synthesizes a logically correct non-binary response. Synthesis is carried out by combining the outputs of convolutional and recurrent networks and projecting onto a set of valid answers. A three-dimensional dataset has been developed to search for an answer in a complex environment using a robotic arm. Examples of similar systems and a comparison between them are given. The experimental results show that our method achieves performance comparable with known models.
computer science, machine learning, computer vision, neural networks.
Carlo Petalver, Roderick Bandalan and Gregg Victor Gabison, Graduate School of Computer Studies, University of San Jose – Recoletos, Cebu City, Philippines
Categorizing books and other archaic paper sources by course reference or syllabus is a challenge in library science. Traditionally, categorization is done manually by professionals, and the process of seeking and retrieving information can be frustrating. It requires intellectual effort and conceptual analysis by a human to recognize similarities among items and assign each subject to the correct category. Unlike the traditional categorization process, the author implemented automatic document categorization for libraries using text mining. The project involves the creation of a web app and a mobile app. Categorization is accomplished with a supervised machine learning classification model based on the Support Vector Machine algorithm, which predicts the course syllabus to which data from a book or other archaic paper source belongs.
Text Mining, Document Categorization, Classification algorithm, Support Vector Machine, Library.
Mishahira.N1, Mohammad Talal Houkan1, Kishor Kumar Sadasivuni1*, Mithra Geetha1, Somaya Al-Maadeed2, Asiya Albusaidi3, Nandhini Subramanian2, Huseyin Cagatay Yalcin4, Hassen M. Ouakad3, Issam Bahadur5, 1Center for Advanced Materials Qatar University, Doha, Qatar, 2Department of Computer Science and Engineering, Qatar University, Qatar, 3Mechanical and Industrial Engineering Department, Sultan Qaboos University, Muscat, Oman, 4Biomedical Research Center, Qatar University, Qatar, 5College of Engineering, Sultan Qaboos University, Muscat, Oman
Globally, cardiovascular problems are the leading cause of death. Early identification of heart failure helps patients and healthcare practitioners take better measures to avoid risks. The purpose of this study is to identify a method that can accurately predict the risk of cardiovascular diseases. These predictions are made by deep learning algorithms, such as the Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), using the training data we provide. Insufficient medical data decreases prediction accuracy. As part of our study, we analyzed DNN architectures for predicting heart failure. Existing deep learning algorithms were applied to the training data. Comparing the accuracy of the existing model with that of the proposed model leads to a new deep learning algorithm that can predict heart failure from RR interval measurements. The NSR-RR and CHF-RR databases from Physiobank were used to obtain the results. In experiments on these two open-source RR interval databases, the proposed model achieved 94% accuracy, compared with 93.1% for the existing model.
Heart Failure, Deep learning, Time series, Time-LeNet, Database.
Om Mane and Sarvanakumar Kandasamy, Department of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
The stock market is a network that provides a platform for almost all major economic transactions. While investing in the stock market is a good idea, investing in individual stocks may not be, especially for the casual investor. Smart stock-picking requires in-depth research and plenty of dedication. Predicting stock values offers enormous arbitrage profit opportunities. The appeal of finding a solution has prompted researchers to look for ways past problems such as volatility, seasonality, and dependence on time. This paper surveys recent literature on natural language processing and machine learning techniques used to predict stock market movements. The main contributions of this paper are a detailed categorization of many recent articles and an illustration of recent research trends in stock market prediction and related areas.
Stock market Prediction, Sentiment Analysis, Opinion Mining, Natural Language Processing, Deep Learning.
Achraf Lassoued, University Paris 2, France
Given a stream of text, we associate a stream of edges in a graph G and study its large clusters by analysing the giant components of random subgraphs, obtained by sampling some edges with different distributions. For a stream of Tweets, we show that the large giant components of uniformly sampled edges of the Twitter graph reflect the large clusters of G. For a stream of text, uniform sampling is inefficient, but weighted sampling, where the weight is proportional to the Word2vec similarity, provides good results. High-degree nodes of the giant components define the central words and central sentences of the text.
Streaming algorithms, Clustering, Dynamic graphs.
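The uniform-sampling step described in the abstract above can be sketched in a few lines: keep each edge with a fixed probability, then find the largest connected component of the sampled subgraph with a union-find structure. This is an illustrative sketch, not the author's streaming algorithm. The function names and the toy graph are assumptions, and the Word2vec-weighted sampling variant is not shown.

```python
import random
from collections import Counter

def largest_component(edges):
    """Return the node set of the largest connected component, via union-find."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving keeps trees shallow
            u = parent[u]
        return u

    for u, v in edges:
        parent[find(u)] = find(v)  # union the two endpoints' components
    if not parent:
        return set()
    sizes = Counter(find(u) for u in list(parent))
    root = sizes.most_common(1)[0][0]
    return {u for u in parent if find(u) == root}

def sample_edges(edges, p, rng):
    """Keep each edge independently with probability p (uniform sampling)."""
    return [e for e in edges if rng.random() < p]

# Hypothetical graph: one dense 20-node cluster plus a short 3-node path.
cluster = [(i, j) for i in range(20) for j in range(i + 1, 20)]
path = [(100, 101), (101, 102)]
edges = cluster + path

rng = random.Random(0)
giant = largest_component(sample_edges(edges, 0.5, rng))
```

Since sampling only removes edges, the giant component of the sample always lies inside a single component of the original graph. Dense clusters tend to stay connected under sampling while sparse structures fragment, which is why the sampled giant component can serve as a proxy for a large cluster.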
Priya K P, Harikrishnan T P, Datahub Technologies and R&D, Kochi, Ernakulam, Kerala, India
Question Answering (QA) is a branch of the Natural Language Understanding (NLU) field, which falls under the NLP umbrella. It aims to implement systems that, given a question in natural language, can extract relevant information from the provided data and present it as a natural-language answer. Building a fully functional question answering system is a problem that has been popular among researchers. Information Extraction (IE) systems take natural-language text as input and produce structured information, specified by certain criteria, that is relevant to a particular use case. This paper introduces Information Extraction technology and its sub-tasks with a focus on question answering, and highlights state-of-the-art research in various IE subtasks, current challenges, and future research directions.
Natural language processing (NLP), Information Retrieval (IR), Information Extraction (IE), Question Answering (QA).
Bonil Shah, P. M. Jat and Kalyan Sasidhar, DAIICT, Gandhinagar, India
The growth of big-data sectors such as the Internet of Things (IoT) generates enormous volumes of data. Because IoT devices generate vast volumes of time-series data, the popularity of Time Series Databases (TSDBs) has grown alongside the rise of IoT. Time series databases are developed to manage and analyze huge amounts of time series data. However, it is not easy to choose the best one among them. The most popular benchmarks compare the performance of different databases with each other but use random or synthetic data that applies to only one domain. As a result, these benchmarks may not always accurately represent real-world performance. A comprehensive comparison of time series database performance on real datasets is therefore needed. The experiment shows significant differences in data injection time and query execution time between real and synthetic datasets. The results are reported and analyzed.
Timeseries database, benchmark, real-world application.
Sadarabalaji, Ambati Naga Praneetha Reddy, L Raghav Kalyan, Kuruba Madhu, Nikhath Tabassum, Ashwini P, Geetha D.D, School of Electronics and Communication Engineering, REVA University, Bangalore, India
The pandemic has brought a paradigm shift in wireless communication technology. Due to social distancing norms, wireless technologies have emerged much stronger and better in the consumer sector. Our proposed project, Smart Cosmetic Selector (SCS), brings this touchless technology to the cosmetic industry. We have built a smart cosmetic selection unit that helps a person choose lipstick shades without applying them to the lips. The image of the person is captured by a high-resolution camera in real time. The user can choose lipstick shades and see the color on their lips in real time. OpenCV and Haar cascade files are used to analyze facial characteristics and can recommend the best-fitting lip colors based on the individual’s complexion. The proposed method is much better than the existing library detection method in terms of efficiency, memory and speed.
Haar cascade, Lip color, OpenCV, segmentation, facial recognition and Image processing.
Pushya Chaparala, SatyaSri Pothula, Ramya Bellamkonda, Vignan’s Foundation for Science, Technology and Research, Deemed to be University, Vadlamudi, Guntur, India
Network intrusion detection is a challenge in the real world. An intrusion is a malicious attack on a network, and multiple algorithms exist to prevent such attacks. With the increase in network usage, it has become difficult to distinguish malicious attacks from normal network traffic. Several machine learning algorithms have been introduced for this purpose; among them, we need to find the best-performing algorithm and, to reduce complexity, choose the feature selection method best suited to it. We use supervised machine learning techniques to identify these attacks on the network.
Machine Learning, Networking, Network Intrusion Detection, Intrusion Detection System (IDS), Support Vector Machine (SVM), Artificial Neural Network (ANN), Coefficient correlation, Feature Selection.