Makine Öğrenme Yöntemi Kullanılarak DarkWEB Trafiği Tespiti ve Sınıflandırılması

Esen Gül İlgün; Yusuf Sönmez; Murat Dener

Research Article

DarkWEB Traffic Detection and Classification Using Machine Learning Method

Year 2023, Volume: 9 Issue: 4, 126 - 140, 31.12.2023


Abstract

DarkWEB makes up 6% of DeepWEB, which contains data that search engines cannot index and accounts for approximately 96% of all websites. DarkWEB is encrypted network traffic tunneled through special software such as TOR (The Onion Router), and it provides a high level of anonymity through a series of anonymized connections that make the IP address untraceable. This makes it easier to carry out criminal activities such as media piracy, drug dealing, terrorism and child pornography. In this study, the statistical information of the packets was analyzed without decrypting this encrypted network traffic. Within the proposed methodology for high-accuracy detection and classification of DarkWEB traffic, the pre-processing steps of categorical data encoding, scaling, feature selection and data balancing were applied to the CIC-Darknet2020 data set, both separately and in combination, to obtain different data sets. Using these data sets and the Logistic Regression (LR), Gaussian Naive Bayes (GNB), Decision Tree (DT), K-Nearest Neighbor (KNN), Multi Layer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM) and Category Boosting (CatBoost) machine learning algorithms, a large number of DarkWEB traffic detection and classification models were created. With these models, 2-class, 4-class and 8-class classifications were performed on the Encryption (Encrypted, Standard), Category (Tor, Non-Tor, Non-VPN, VPN) and Subcategory (Audio-Stream, Browsing, Chat, E-mail, P2P, Transfer, Video-Stream, VOIP) classes, respectively. A correct detection and classification rate of 99.9% was achieved for DarkWEB traffic in the 2-class and 4-class classifications, and 94% in the 8-class classification.
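The pre-processing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy flow records, the feature names (protocol, duration, byte count) and all helper functions are hypothetical, feature selection is omitted, random oversampling stands in for whichever balancing method the study used, and a small KNN stands in for the nine algorithms listed above.

```python
import random
from collections import Counter

def label_encode(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column becomes all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy flows: (protocol, duration in seconds, bytes, label).
# These are illustrative values, not records from CIC-Darknet2020.
flows = [
    ("TCP", 0.2, 300, "Non-Tor"),
    ("TCP", 0.3, 350, "Non-Tor"),
    ("UDP", 0.1, 200, "Non-Tor"),
    ("UDP", 0.25, 280, "Non-Tor"),
    ("TCP", 5.0, 9000, "Tor"),
    ("TCP", 6.0, 9500, "Tor"),
]

# 1) Categorical encoding, 2) scaling, 3) balancing, 4) classification.
protocol_codes, protocol_map = label_encode([f[0] for f in flows])
durations = min_max_scale([f[1] for f in flows])
sizes = min_max_scale([f[2] for f in flows])
X = [list(t) for t in zip(protocol_codes, durations, sizes)]
y = [f[3] for f in flows]
X_bal, y_bal = random_oversample(X, y, seed=0)

# The query is given directly in the already-scaled feature space.
prediction = knn_predict(X_bal, y_bal, [0, 0.9, 0.95], k=3)
print(Counter(y_bal), prediction)
```

In a real experiment the scaling parameters and the oversampling would normally be fitted on the training split only, so that no information from the test split leaks into the model; the sketch skips the split to stay short.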

Keywords

DeepWEB, DarkWEB, encrypted network traffic, machine learning, classification

Details

Primary Language	Turkish
Subjects	Software Engineering (Other)
Section	Research Article
Authors	Esen Gül İlgün (ORCID: 0000-0002-1719-5727), Yusuf Sönmez, Murat Dener
Publication Date	December 31, 2023
Submission Date	November 19, 2023
Acceptance Date	December 20, 2023
Published in Issue	Year 2023 Volume: 9 Issue: 4

Cite

IEEE	E. G. İlgün, Y. Sönmez, and M. Dener, “Makine Öğrenme Yöntemi Kullanılarak DarkWEB Trafiği Tespiti ve Sınıflandırılması”, GMBD, vol. 9, no. 4, pp. 126–140, 2023.

Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY)