
Abstract

This paper investigates how training data composition and model complexity affect the reliability and generalization ability of a lightweight DDoS detector deployed at the edge of Amazon Web Services cloud infrastructure. A minimal processing pipeline was developed using Amazon CloudFront logs and one-minute aggregated features: request rate, the number of unique IP addresses, and counts of 4xx/5xx responses. A separate validation set was used to select and fix a single decision threshold before testing. Two supervised learning algorithms, Random Forest (RF) and Logistic Regression (LR), were trained on a dataset comprising 89 minutes of traffic (17 malicious and 72 legitimate minutes). The dataset covered background traffic, a known high-RPS attack, and a legitimate flash crowd. Evaluation was conducted on a continuous four-scenario timeline with a total duration of 159 minutes, where the final scenario represented a previously unseen attack. Both detectors achieved a zero false-positive rate on the 125 legitimate test minutes. On the known attack, RF detected 17 of 21 malicious minutes, whereas LR detected 20 of 21; on the unseen attack, RF detected 9 of 13 minutes, while LR failed to detect any.

Keywords: Amazon Web Services, CloudFront access logs, cloud security, DDoS attack detection, edge traffic analysis, lightweight machine learning methods.
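The one-minute aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes the CloudFront log lines have already been parsed into `(timestamp, client_ip, status_code)` tuples; that record layout, the function name `aggregate_per_minute`, and the feature key names are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_per_minute(records):
    """Aggregate parsed log records into one-minute feature rows.

    `records` is an iterable of (iso_timestamp, client_ip, status_code)
    tuples. This layout is an assumption for the sketch; real CloudFront
    logs need to be parsed into this shape first.
    """
    buckets = defaultdict(
        lambda: {"requests": 0, "ips": set(), "4xx": 0, "5xx": 0}
    )
    for ts, ip, status in records:
        # Truncate the timestamp to the start of its minute.
        minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
        bucket = buckets[minute]
        bucket["requests"] += 1
        bucket["ips"].add(ip)
        if 400 <= status < 500:
            bucket["4xx"] += 1
        elif 500 <= status < 600:
            bucket["5xx"] += 1

    # One feature row per minute, in chronological order.
    return {
        minute: {
            "request_rate": b["requests"],   # requests in this minute
            "unique_ips": len(b["ips"]),     # distinct client addresses
            "count_4xx": b["4xx"],
            "count_5xx": b["5xx"],
        }
        for minute, b in sorted(buckets.items())
    }


# Hypothetical sample records, invented for illustration only.
sample = [
    ("2024-01-01T12:00:05", "1.1.1.1", 200),
    ("2024-01-01T12:00:40", "1.1.1.1", 404),
    ("2024-01-01T12:00:59", "2.2.2.2", 503),
    ("2024-01-01T12:01:00", "3.3.3.3", 200),
]
features = aggregate_per_minute(sample)
```

Each resulting row corresponds to one minute of traffic, the unit in which the abstract reports both training data (89 minutes) and detection results (for example, 17 of 21 malicious minutes).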


