Adaptive anomaly based fraud detection model for handling concept drift in short-term profile

Date
2018
Publisher
Universiti Teknologi Malaysia
Abstract
Fraud is a cybercrime whose purpose is to obtain money by illegal means. It causes significant losses to organizations, companies and government agencies. Detecting fraud accurately reduces such losses, for instance through anomaly detection, which relies on behavioural modelling methods. An anomaly-based Fraud Detection System (FDS) model aims to detect and recognize fraudulent activities or anomalies as they enter a system and to report them accordingly. Many anomaly-based FDSs have been proposed in the literature. However, current anomaly-based FDS models suffer from low accuracy, high false alarm rates and delayed detection because of behaviour that drifts over time (the concept drift issue), evolving customer behavioural patterns, hidden indicators, and high dimensionality. The main purpose of this research is to design and develop an adaptive anomaly-based FDS model, built on a concept drift detection technique and a short-term aggregation profile, to improve fraud detection accuracy and support early fraud detection. To achieve this purpose, two main phases are involved: the data pre-processing phase, and the fraud and concept drift detection phase. The data pre-processing phase contains two stages: first, feature derivation and profile building; second, feature selection. The first stage supports early detection by combining newly derived features with features taken from the literature. The feature selection stage uses a hybrid rank-search approach consisting of two steps: the Support Vector Machines Recursive Feature Elimination (SVM-RFE) rank method and the Greedy Stepwise (GS) search method. It improves fraud detection accuracy by selecting the optimum features of user behaviour.
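As a rough illustration of what a short-term aggregation profile could look like, the following Python sketch groups call records per subscriber per day. The record layout, the daily granularity and the four feature names are assumptions made for illustration; the abstract does not list the derived features actually used in the thesis.

```python
from collections import defaultdict
from datetime import datetime


def build_daily_profiles(records):
    """Aggregate raw call records into one profile per (subscriber, day).

    Each record is a hypothetical tuple:
    (caller_id, callee_id, ISO start time, duration in seconds).
    """
    acc = defaultdict(lambda: {"calls": 0, "total_duration": 0, "callees": set()})
    for caller, callee, start, duration in records:
        day = datetime.fromisoformat(start).date().isoformat()
        slot = acc[(caller, day)]
        slot["calls"] += 1
        slot["total_duration"] += duration
        slot["callees"].add(callee)

    profiles = {}
    for key, slot in acc.items():
        profiles[key] = {
            "calls": slot["calls"],
            "total_duration": slot["total_duration"],
            "distinct_callees": len(slot["callees"]),
            # Share of calls that went to a distinct number (illustrative feature).
            "callee_ratio": len(slot["callees"]) / slot["calls"],
        }
    return profiles


# Hypothetical sample data, not taken from the thesis dataset.
records = [
    ("A", "X", "2018-01-01T08:00:00", 60),
    ("A", "Y", "2018-01-01T09:30:00", 30),
    ("A", "X", "2018-01-02T10:00:00", 45),
    ("B", "Z", "2018-01-01T11:00:00", 120),
]
profiles = build_daily_profiles(records)
```

Each resulting profile is a small feature vector for one subscriber on one day, which is the kind of input a feature selection stage and a streaming classifier could then consume.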
In the second phase of the proposed adaptive FDS model, the fraud and drift detection phase, an online streaming approach based on an incremental classifier is adopted to accurately discriminate fraudulent from normal data. For concept drift detection, a trigger-based approach is used for adaptive learning, and an adaptive training window is used to manage training data. The Statistical Process Control (SPC) technique is used as a drift detector to identify sudden and gradual drift in users’ behaviour. A Call Detail Records (CDR) dataset containing Subscriber Identity Module (SIM) Box fraud is used to test and evaluate the proposed model. The proposed adaptive model, which combines an Incremental Learning Strategy with a Concept Drift Detection Technique (FDS-ILS-CDDT) and is integrated with the rank-search feature selection approach, improves the detection accuracy of SIM Box fraud in the presence of concept drift. On DATA-CP (continuous pattern), the average daily detection accuracy reached 91.40%, compared with 91.16%, 88.08% and 90.81% for the FDS-SLS, FDS-PLS and FDS-ILS models respectively. The same improvement occurred on DATA-CDP (continuous and discrete pattern), where accuracy reached 89.34%, compared with 84.55%, 83.81% and 85.11% for the FDS-SLS, FDS-PLS and FDS-ILS models respectively. Furthermore, FDS-ILS-CDDT obtained the best false negative rate and false positive rate among the FDS models compared. Feature selection reduced the feature set to the two and eleven most relevant and influential features for DATA-CP and DATA-CDP respectively.
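A trigger-based SPC drift detector of the kind described in the abstract can be sketched as an error-rate control chart: it tracks the running error rate of the classifier and signals when that rate rises well above its recorded minimum. The class name, the 30-sample warm-up and the thresholds of 2 and 3 standard deviations are assumptions in this sketch, not details taken from the thesis.

```python
import math


class SPCDriftDetector:
    """Minimal SPC-style detector over a stream of classifier errors."""

    def __init__(self, min_samples=30, warn_k=2.0, drift_k=3.0):
        self.min_samples = min_samples  # warm-up before any signal (assumed)
        self.warn_k = warn_k            # warning level in standard deviations
        self.drift_k = drift_k          # drift level in standard deviations
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """Feed one outcome (True = misclassified); return the current state."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n                 # running error rate
        s = math.sqrt(p * (1 - p) / self.n)      # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:      # remember the best level seen
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_k * self.s_min:
            self.reset()                         # caller would retrain here
            return "drift"
        if p + s > self.p_min + self.warn_k * self.s_min:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 100 samples, then every sample is an error.
stream = [i % 10 == 0 for i in range(1, 101)] + [True] * 50
detector = SPCDriftDetector()
first_drift = None
for i, err in enumerate(stream, start=1):
    if detector.update(err) == "drift":
        first_drift = i
        break
```

In an adaptive setup, a "warning" state would typically start buffering recent samples and a "drift" state would trigger retraining of the incremental classifier on that buffer, which is one way to realize an adaptive training window.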
Description
Thesis (Ph.D (Computer Science))
Keywords
Computer crimes, Computer fraud