2019, Volume 23, № 1

Information and Computer Technologies

A study on MapReduce job failures in Hadoop

Ehsan Shirzad, Hamid Saadatfar
Faculty of Electrical and Computer Engineering, University of Birjand, Daneshgah Blvd, Birjand, Iran

Today, many large companies such as Facebook, Yahoo, and Google use Hadoop for a variety of purposes. Hadoop is an open-source software framework, based on the MapReduce parallel programming model, for processing big data. Because of the importance of big data systems such as Hadoop, many studies have examined them with goals such as efficient resource management, effective scheduling, and identification of failure causes. By studying the causes of failures, we can detect and resolve them, increase the system's efficiency, and prevent wasted resources and time. In this paper, we studied the log files of a research cluster named OpenCloud in order to characterize job failures. OpenCloud has a long history of using the Hadoop framework and has been used by researchers in various fields. Our study showed that factors such as execution duration, number of executor hosts, volume of input/output data, and configuration settings affect the success or failure rate of MapReduce jobs in Hadoop.
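As a rough illustration of this kind of analysis, the sketch below groups job records by one factor and computes the failure rate per group. It is not the authors' method: the record fields (`hosts`, `duration_s`, `status`), the status strings, the threshold of 10 hosts, and the toy records are all hypothetical, and treating every non-successful job as a failure is an assumption.

```python
from collections import defaultdict

def failure_rate_by(jobs, key):
    """Return {group: fraction of jobs in that group that did not succeed}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job in jobs:
        group = key(job)
        totals[group] += 1
        # Assumption: any status other than "SUCCESS" counts as a failure.
        if job["status"] != "SUCCESS":
            failures[group] += 1
    return {group: failures[group] / totals[group] for group in totals}

# Toy records for illustration only; they do not reflect the OpenCloud logs.
jobs = [
    {"hosts": 2, "duration_s": 40, "status": "SUCCESS"},
    {"hosts": 2, "duration_s": 55, "status": "SUCCESS"},
    {"hosts": 16, "duration_s": 900, "status": "FAILED"},
    {"hosts": 16, "duration_s": 1200, "status": "SUCCESS"},
    {"hosts": 32, "duration_s": 4000, "status": "KILLED"},
]

# Group by a hypothetical size bucket based on the number of executor hosts.
rates = failure_rate_by(jobs, key=lambda j: "small" if j["hosts"] < 10 else "large")
print(rates)
```

The same helper can be reused with a different `key` function, for example one that buckets jobs by duration or by input volume.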

Automatic photometric processing methods for star variability identification

Sergey Bratarchuk1, Zlata Potiļicina2
1RTU MTAF AERTI, Lomonosova Str, 1v, Riga, LV-1003, Latvia
2Longenesis, Hong-Kong


Variable star detection suffers from a missing-data problem. Users of shared telescope networks such as LCO often compete for observation time, and this competition makes it difficult to take many images of the same region of the sky. The author proposes a new method for handling missing or unevenly sampled data in variable star detection. The method supplements the researcher's own variable star data with data from other researchers. The author suggests an algorithm that identifies the star of interest in a series of images. The algorithm automatically identifies the stars in different images regardless of how they are shifted or rotated from one image to another. It then extracts the flux and magnitude of the stars in each image. By obtaining magnitude and flux data for a star from different sources in this way, it is possible to fill the gaps in the data, which increases the probability that the star will be identified as variable.
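The gap-filling idea can be sketched as follows. This is only an illustration under stated assumptions, not the author's algorithm: it assumes the star has already been identified in every image, that fluxes from the different sources are on a common calibrated scale, and that the zero point of 25.0 is an arbitrary placeholder. The function names and observation values are hypothetical. The flux-to-magnitude conversion uses the standard relation m = zp - 2.5 log10(flux).

```python
import math

def flux_to_magnitude(flux, zero_point=25.0):
    """Instrumental magnitude from flux; the zero point is an assumed placeholder."""
    return zero_point - 2.5 * math.log10(flux)

def merge_light_curves(*sources):
    """Combine (time, flux) observations from several sources, sorted by time."""
    merged = [obs for source in sources for obs in source]
    return sorted(merged, key=lambda obs: obs[0])

def largest_gap(light_curve):
    """Longest interval between consecutive observations."""
    times = [t for t, _ in light_curve]
    return max((b - a for a, b in zip(times, times[1:])), default=0.0)

# Toy observations (time in days, flux in arbitrary units), for illustration only.
own = [(0.0, 1000.0), (4.0, 1200.0)]
other = [(1.0, 900.0), (2.0, 800.0), (3.0, 950.0)]

combined = merge_light_curves(own, other)
magnitudes = [(t, flux_to_magnitude(f)) for t, f in combined]

print(largest_gap(own), largest_gap(combined))
```

With the toy data, adding the second source shrinks the longest gap between observations, which is the effect the abstract relies on to improve variability detection.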

Using convolutional neural network for Android malware detection

Isil Karabey Aksakalli
Erzurum Technical University, Faculty of Engineering and Architecture, Department of Computer Engineering, ERZURUM

With the increasing use of smart mobile devices, the number of applications developed for these devices is growing day by day. Nearly all tasks once performed on a computer (sending e-mail, searching the internet, messaging over the internet, making bank transactions, etc.) are now carried out on mobile devices. However, malicious applications on these devices misuse personal information and can render the devices unusable. New methodologies for mobile malware detection have been proposed in the literature and in industry; nevertheless, identifying malware in mobile applications and taking precautions against it remains a research challenge. In this paper, a permission-based model is implemented to detect malware applications on mobile devices running the Android operating system. Permission-based features are extracted from the apk files in the AndroTracker data set, which was created in earlier work. Four types of machine learning techniques (including Support Vector Machine, k-Nearest Neighbor, and Back Propagation) are applied and evaluated, and their results are compared with those of a Convolutional Neural Network (CNN). The experimental results show that the permission-based model is highly successful on the AndroTracker data set with both the machine learning techniques and deep learning. Back Propagation gives the best result among the machine learning techniques, with an accuracy of 96.1%, while the Convolutional Neural Network achieves an accuracy of 96.71%. This demonstrates that the accuracy rates of the CNN and the classical machine learning techniques are close to each other, and that both are high because there are only two target classes, benign and malware.
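A permission-based model of this kind can be sketched as a binary feature vector per application, followed by a classifier. The example below is a minimal illustration and not the paper's implementation: the four-permission vocabulary, the toy labelled apps, the Hamming distance, and k = 3 are all assumptions, and a plain k-Nearest Neighbor vote stands in for the classifiers compared in the paper.

```python
from collections import Counter

# Hypothetical, tiny permission vocabulary; a real model would use many more.
PERMISSIONS = ["INTERNET", "SEND_SMS", "READ_CONTACTS", "ACCESS_FINE_LOCATION"]

def to_vector(requested):
    """Encode the set of requested permissions as a 0/1 feature vector."""
    return [1 if p in requested else 0 for p in PERMISSIONS]

def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy labelled apps, for illustration only (not taken from AndroTracker).
train = [
    (to_vector({"INTERNET"}), "benign"),
    (to_vector({"INTERNET", "ACCESS_FINE_LOCATION"}), "benign"),
    (to_vector({"INTERNET", "SEND_SMS", "READ_CONTACTS"}), "malware"),
    (to_vector({"SEND_SMS", "READ_CONTACTS"}), "malware"),
    (to_vector({"INTERNET", "SEND_SMS"}), "malware"),
]

query = to_vector({"INTERNET", "SEND_SMS", "READ_CONTACTS"})
print(knn_predict(train, query))
```

Because the features are binary and there are only two classes, even a simple distance-based vote separates the toy examples; real data sets need a much larger vocabulary and a held-out evaluation.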