Behavioral Modelling from Encrypted Remote Desktop Protocol Network Traffic
Traditional traffic monitoring relies on unencrypted payload data in network packets, against which pattern matching is performed. As more network traffic is encrypted, this kind of user behavior monitoring has been severely hindered. Alternative methods are therefore being explored, such as machine learning for encrypted traffic classification, which does not require decryption prior to analysis. In this study, I analyze encrypted Remote Desktop Protocol (RDP) traffic from a behavioral perspective, using a network traffic dataset that I generate. I develop a heterogeneous ensemble model that performs multi-label classification of five common RDP behaviors: Download, Browsing, Notepad, YouTube and Clipboard. The task is complicated by the fact that each generated sample may belong to one or more classes at the same time. I use Shapley values to identify significant features and perform classification with the following techniques: SVM, KNN, Neural Network, Decision Tree, AdaBoost, Random Forest and XGBoost. The final model achieves a cross-validated Precision of at least 97% and Recall of at least 94% for each of the five behavioral classes. Finally, I discuss some of the privacy risks associated with Remote Desktop Protocol traffic.
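To make the multi-label setting concrete, the following is a minimal sketch of how a sample that belongs to several behaviors at once can be encoded as a binary indicator vector, and how a per-class minimum Precision and Recall can be computed from such vectors. It is an illustration only, not the study's pipeline: the helper names `to_indicator` and `per_class_precision_recall` are hypothetical, and the toy labels and predictions are invented for the example rather than taken from the dataset.

```python
# The five behavior classes named in the abstract, in a fixed order.
CLASSES = ["Download", "Browsing", "Notepad", "YouTube", "Clipboard"]


def to_indicator(labels):
    """Encode a set of behavior names as a 0/1 vector ordered like CLASSES."""
    unknown = set(labels) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown behavior(s): {sorted(unknown)}")
    return [1 if name in labels else 0 for name in CLASSES]


def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for lists of indicator vectors."""
    scores = {}
    for j, name in enumerate(CLASSES):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in pairs if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 0)
        # Report 0.0 when a class has no predicted or no true positives.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        scores[name] = (precision, recall)
    return scores


# Invented toy data: each sample may carry more than one behavior label.
true_sets = [
    {"Download", "Browsing"},
    {"Notepad"},
    {"YouTube", "Clipboard"},
    {"Browsing"},
    {"Clipboard", "Notepad"},
]
pred_sets = [
    {"Download", "Browsing"},
    {"Notepad", "Clipboard"},  # spurious Clipboard prediction
    {"YouTube"},               # missed Clipboard
    {"Browsing"},
    {"Clipboard", "Notepad"},
]

y_true = [to_indicator(s) for s in true_sets]
y_pred = [to_indicator(s) for s in pred_sets]

scores = per_class_precision_recall(y_true, y_pred)
min_precision = min(p for p, _ in scores.values())
min_recall = min(r for _, r in scores.values())

for name, (p, r) in scores.items():
    print(f"{name:10s} precision={p:.2f} recall={r:.2f}")
print(f"minimum precision={min_precision:.2f}, minimum recall={min_recall:.2f}")
```

Reporting the minimum over classes, rather than an average, means a single weak class cannot be hidden by strong results on the other four.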