Specializing network analysis to detect anomalous insider actions
1 Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, 37203, USA
2 Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, TN, 37203, USA
Security Informatics 2012, 1:5 doi:10.1186/2190-8532-1-5Published: 27 February 2012
Collaborative information systems (CIS) enable users to coordinate efficiently over shared tasks in complex distributed environments. For flexibility, they provide users with broad access privileges, which, as a side-effect, leave such systems vulnerable to various attacks. Some of the more damaging malicious activities stem from internal misuse, where users are authorized to access system resources. A promising class of insider threat detection models for CIS focuses on mining access patterns from audit logs, however, current models are limited in that they assume organizations have significant resources to generate label cases for training classifiers or assume the user has committed a large number of actions that deviate from "normal" behavior. In lieu of the previous assumptions, we introduce an approach that detects when specific actions of an insider deviate from expectation in the context of collaborative behavior. Specifically, in this paper, we introduce a specialized network anomaly detection model, or SNAD, to detect such events. This approach assesses the extent to which a user influences the similarity of the group of users that access a particular record in the CIS. From a theoretical perspective, we show that the proposed model is appropriate for detecting insider actions in dynamic collaborative systems. From an empirical perspective, we perform an extensive evaluation of SNAD with the access logs of two distinct environments: the patient record access logs a large electronic health record system (6,015 users, 130,457 patients and 1,327,500 accesses) and the editing logs of Wikipedia (2,394,385 revisors, 55,200 articles and 6,482,780 revisions). We compare our model with several competing methods and demonstrate SNAD is significantly more effective: on average it achieves 20-30% greater area under an ROC curve.