Large Language Models for Intrusion Detection

Tobias Becher

Technische Universität Berlin

379929

July 23, 2025

Introduction

The Cybersecurity Threat Landscape

Cyberattacks are rising rapidly

Early detection is critical

Security analysts are overwhelmed by alerts

“Alert fatigue” leads to missed threats

Can LLMs Help?

Can Large Language Models (LLMs) improve network intrusion detection?

LLMs can be used to explain network traffic features and what they might mean. What can the user expect?

Studies have shown that LLMs can be used for regression. Can they do complex numerical tasks?

Using LLMs on network traffic data is novel. Does deeper exploration warrant the cost?

Research Question (RQ1)


Are LLMs effective for classifying malicious network traffic?

RQ1a

Do LLMs improve classification performance compared to a multi-stage SOTA baseline model?

RQ1b

Do LLMs perform better than random guessing in classifying malicious network traffic?

Null Hypotheses

LLM-enhanced IDS is no better than the baseline

\[ H_0^1:\;\; P_{\mathrm{IDS}}^{(\mathrm{ours})} \leq P_{\mathrm{IDS}}^{(\mathrm{baseline})} \]

LLM-enhanced IDS is no better than random guessing

\[ H_0^2:\;\; P_{\mathrm{IDS}}^{(\mathrm{ours})} \leq P_{\mathrm{IDS}}^{(\mathrm{random})} \]
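As an illustration of how the second null hypothesis could be tested, here is a minimal sketch of a one-sided exact binomial test. It assumes that the performance measure is plain accuracy and that random guessing means picking one of the classes uniformly at random; neither assumption is stated on the slides, and the function name, the counts, and the significance level are hypothetical.

```python
from math import comb


def binomial_p_value(correct: int, total: int, chance: float) -> float:
    """One-sided exact test: P(X >= correct) for X ~ Binomial(total, chance)."""
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical numbers for illustration only: 70 of 100 flows classified
# correctly in a two-class (benign vs. malicious) setting.
p_value = binomial_p_value(correct=70, total=100, chance=0.5)

# Assumed significance level of 0.05; reject the null hypothesis below it.
reject_h0 = p_value < 0.05
```

With imbalanced classes, a uniform guesser is a weak reference point, so a different chance level or metric may be more appropriate in practice.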

Related Work

Evolution of Intrusion Detection Systems


Signature

Anomaly

Hybrid

ML

DL

Large Language Models for Textual and Numerical Predictive Tasks


Textual

Numerical


Open question: Can LLMs learn to classify real-world network traffic from raw numbers alone?

Table 1: Sample data points used in the study by Biji and Kim (2024).

Classification Biji and Kim (2024)

Measurements Age Range Code
73 192 36 12 10 22 34 25 19 42 43 32 8 42 35 18 38 17 9 51 58 3 53 45 20 10 18 1 1
75 253 42 12 10 21 34 25 20 49 48 32 8 43 37 18 41 17 10 53 61 3 55 48 20 10 18 3 3
73 240 40 11 11 23 36 26 19 46 46 32 8 44 35 20 40 18 11 52 60 3 55 46 20 11 19 3 3

Input Format Vacareanu et al. (2024)

Feature 1: <number>
Feature 2: <number>
Output: <number>
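The input format above can be produced with a few lines of Python. This is a minimal sketch: the helper names, the blank line between examples, and the numeric values are illustrative assumptions rather than details taken from Vacareanu et al. (2024).

```python
def format_example(features, output=None):
    """Render one example as 'Feature i: <number>' lines plus an 'Output:' line."""
    lines = [f"Feature {i}: {value}" for i, value in enumerate(features, start=1)]
    # The query example leaves the output empty so the model completes it.
    lines.append("Output:" if output is None else f"Output: {output}")
    return "\n".join(lines)


def build_prompt(examples, query):
    """Concatenate labelled examples followed by the unlabelled query."""
    blocks = [format_example(features, output) for features, output in examples]
    blocks.append(format_example(query))
    return "\n\n".join(blocks)


# Made-up values, for illustration only.
prompt = build_prompt(
    examples=[((1.0, 2.0), 3.0), ((0.5, 4.0), 4.5)],
    query=(2.0, 2.0),
)
```

The resulting prompt ends with an open `Output:` line, so the model's next tokens are read as the prediction.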

Background

Network Traffic Features


Flow Statistics

Duration and traffic rate (bytes/s, packets/s) across the entire flow.

Packet Count & Size

Number and size of packets in forward and backward directions.

Timing

Inter-arrival times and active/idle durations in the flow lifecycle.

Header Information

Header lengths and TCP window data; shows protocol-level overhead.

TCP Flags

Counts of control flags (SYN, ACK, PSH, etc.) signaling connection state.

Bulk Transfer Stats

Average size and rate of data sent in large chunks per direction.

Subflow Metrics

Stats for subdivisions of a flow (packet and byte counts per subflow).

Miscellaneous

Other traffic descriptors like payload packet counts and rates.
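To make the feature groups concrete, the sketch below maps a few CIC-IDS2017 column names (as they appear in the header of Listing 1) to the groups above. The assignment of columns to groups is an illustrative assumption and only covers a sample of the columns; `FEATURE_GROUPS` and `group_of` are hypothetical names.

```python
# Assumed, partial mapping from feature group to example columns.
FEATURE_GROUPS = {
    "Flow Statistics": ["Flow Duration", "Flow Bytes/s", "Flow Packets/s"],
    "Packet Count & Size": ["Total Fwd Packets", "Total Backward Packets",
                            "Fwd Packet Length Mean", "Bwd Packet Length Mean"],
    "Timing": ["Flow IAT Mean", "Fwd IAT Mean", "Active Mean", "Idle Mean"],
    "Header Information": ["Fwd Header Length", "Bwd Header Length",
                           "Init_Win_bytes_forward", "Init_Win_bytes_backward"],
    "TCP Flags": ["FIN Flag Count", "SYN Flag Count", "PSH Flag Count",
                  "ACK Flag Count"],
    "Bulk Transfer Stats": ["Fwd Avg Bytes/Bulk", "Bwd Avg Bulk Rate"],
    "Subflow Metrics": ["Subflow Fwd Packets", "Subflow Fwd Bytes"],
    "Miscellaneous": ["act_data_pkt_fwd", "Down/Up Ratio"],
}


def group_of(column: str) -> str:
    """Return the feature group of a column, or 'Unassigned' if not listed."""
    name = column.strip()  # header fields in the CSV carry leading spaces
    for group, columns in FEATURE_GROUPS.items():
        if name in columns:
            return group
    return "Unassigned"
```

For example, `group_of(" SYN Flag Count")` resolves to the TCP Flags group, while identifier columns such as `Flow ID` or `Label` stay unassigned.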

Listing 1: Example flows from the CIC-IDS2017 dataset by Sharafaldin, Habibi Lashkari, and Ghorbani (2018).
                                Flow ID,     Source IP, Source Port, Destination IP, Destination Port, Protocol,    Timestamp, Flow Duration, Total Fwd Packets, Total Backward Packets,Total Length of Fwd Packets, Total Length of Bwd Packets, Fwd Packet Length Max, Fwd Packet Length Min, Fwd Packet Length Mean, Fwd Packet Length Std,Bwd Packet Length Max, Bwd Packet Length Min, Bwd Packet Length Mean, Bwd Packet Length Std,Flow Bytes/s, Flow Packets/s, Flow IAT Mean, Flow IAT Std, Flow IAT Max, Flow IAT Min,Fwd IAT Total, Fwd IAT Mean, Fwd IAT Std, Fwd IAT Max, Fwd IAT Min,Bwd IAT Total, Bwd IAT Mean, Bwd IAT Std, Bwd IAT Max, Bwd IAT Min,Fwd PSH Flags, Bwd PSH Flags, Fwd URG Flags, Bwd URG Flags, Fwd Header Length, Bwd Header Length,Fwd Packets/s, Bwd Packets/s, Min Packet Length, Max Packet Length, Packet Length Mean, Packet Length Std, Packet Length Variance,FIN Flag Count, SYN Flag Count, RST Flag Count, PSH Flag Count, ACK Flag Count, URG Flag Count, CWE Flag Count, ECE Flag Count, Down/Up Ratio, Average Packet Size, Avg Fwd Segment Size, Avg Bwd Segment Size, Fwd Header Length.1,Fwd Avg Bytes/Bulk, Fwd Avg Packets/Bulk, Fwd Avg Bulk Rate, Bwd Avg Bytes/Bulk, Bwd Avg Packets/Bulk,Bwd Avg Bulk Rate,Subflow Fwd Packets, Subflow Fwd Bytes, Subflow Bwd Packets, Subflow Bwd Bytes,Init_Win_bytes_forward, Init_Win_bytes_backward, act_data_pkt_fwd, min_seg_size_forward,Active Mean, Active Std, Active Max, Active Min,Idle Mean, Idle Std, Idle Max, Idle Min, Label
192.168.10.5-104.16.207.165-54865-443-6,104.16.207.165,         443,   192.168.10.5,            54865,        6,7/7/2017 3:30,             3,                 2,                      0,                         12,                           0,                     6,                     6,                    6.0,                   0.0,                    0,                     0,                      0,                     0,   4000000.0,    666666.6667,           3.0,          0.0,            3,            3,            3,          3.0,         0.0,           3,           3,            0,            0,           0,           0,           0,            0,             0,             0,             0,                40,                 0,  666666.6667,           0.0,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 9.0,                  6.0,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                12,                   0,                 0,                    33,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
  192.168.10.5-104.16.28.216-55054-80-6, 104.16.28.216,          80,   192.168.10.5,            55054,        6,7/7/2017 3:30,           109,                 1,                      1,                          6,                           6,                     6,                     6,                    6.0,                   0.0,                    6,                     6,                      6,                     0, 110091.7431,    18348.62385,         109.0,          0.0,          109,          109,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                20,                20,  9174.311927,   9174.311927,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              1,              0,              0,             1,                 9.0,                  6.0,                    6,                  20,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 6,                   1,                 6,                    29,                     256,                0,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
  192.168.10.5-104.16.28.216-55055-80-6, 104.16.28.216,          80,   192.168.10.5,            55055,        6,7/7/2017 3:30,            52,                 1,                      1,                          6,                           6,                     6,                     6,                    6.0,                   0.0,                    6,                     6,                      6,                     0, 230769.2308,    38461.53846,          52.0,          0.0,           52,           52,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                20,                20,  19230.76923,   19230.76923,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              1,              0,              0,             1,                 9.0,                  6.0,                    6,                  20,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 6,                   1,                 6,                    29,                     256,                0,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.16-104.17.241.25-46236-443-6, 104.17.241.25,         443,  192.168.10.16,            46236,        6,7/7/2017 3:30,            34,                 1,                      1,                          6,                           6,                     6,                     6,                    6.0,                   0.0,                    6,                     6,                      6,                     0, 352941.1765,    58823.52941,          34.0,          0.0,           34,           34,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                20,                20,  29411.76471,   29411.76471,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              1,              0,              0,             1,                 9.0,                  6.0,                    6,                  20,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 6,                   1,                 6,                    31,                     329,                0,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.5-104.19.196.102-54863-443-6,104.19.196.102,         443,   192.168.10.5,            54863,        6,7/7/2017 3:30,             3,                 2,                      0,                         12,                           0,                     6,                     6,                    6.0,                   0.0,                    0,                     0,                      0,                     0,   4000000.0,    666666.6667,           3.0,          0.0,            3,            3,            3,          3.0,         0.0,           3,           3,            0,            0,           0,           0,           0,            0,             0,             0,             0,                40,                 0,  666666.6667,           0.0,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 9.0,                  6.0,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                12,                   0,                 0,                    32,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
 192.168.10.5-104.20.10.120-54871-443-6, 104.20.10.120,         443,   192.168.10.5,            54871,        6,7/7/2017 3:30,          1022,                 2,                      0,                         12,                           0,                     6,                     6,                    6.0,                   0.0,                    0,                     0,                      0,                     0, 11741.68297,    1956.947162,        1022.0,          0.0,         1022,         1022,         1022,       1022.0,         0.0,        1022,        1022,            0,            0,           0,           0,           0,            0,             0,             0,             0,                40,                 0,  1956.947162,           0.0,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 9.0,                  6.0,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                12,                   0,                 0,                    32,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
 192.168.10.5-104.20.10.120-54925-443-6, 104.20.10.120,         443,   192.168.10.5,            54925,        6,7/7/2017 3:30,             4,                 2,                      0,                         12,                           0,                     6,                     6,                    6.0,                   0.0,                    0,                     0,                      0,                     0,   3000000.0,       500000.0,           4.0,          0.0,            4,            4,            4,          4.0,         0.0,           4,           4,            0,            0,           0,           0,           0,            0,             0,             0,             0,                40,                 0,     500000.0,           0.0,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 9.0,                  6.0,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                12,                   0,                 0,                    32,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
 192.168.10.5-104.20.10.120-54925-443-6, 104.20.10.120,         443,   192.168.10.5,            54925,        6,7/7/2017 3:30,            42,                 1,                      1,                          6,                           6,                     6,                     6,                    6.0,                   0.0,                    6,                     6,                      6,                     0, 285714.2857,    47619.04762,          42.0,          0.0,           42,           42,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                20,                20,  23809.52381,   23809.52381,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             1,                 9.0,                  6.0,                    6,                  20,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 6,                   1,                 6,                    32,                     256,                0,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
  192.168.10.8-104.28.13.116-9282-443-6, 104.28.13.116,         443,   192.168.10.8,             9282,        6,7/7/2017 3:30,             4,                 2,                      0,                         12,                           0,                     6,                     6,                    6.0,                   0.0,                    0,                     0,                      0,                     0,   3000000.0,       500000.0,           4.0,          0.0,            4,            4,            4,          4.0,         0.0,           4,           4,            0,            0,           0,           0,           0,            0,             0,             0,             0,                40,                 0,     500000.0,           0.0,                 6,                 6,                6.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 9.0,                  6.0,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                12,                   0,                 0,                    32,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.5-104.97.123.193-55153-443-6,104.97.123.193,         443,   192.168.10.5,            55153,        6,7/7/2017 3:30,             4,                 2,                      0,                         37,                           0,                    31,                     6,                   18.5,           17.67766953,                    0,                     0,                      0,                     0,   9250000.0,       500000.0,           4.0,          0.0,            4,            4,            4,          4.0,         0.0,           4,           4,            0,            0,           0,           0,           0,            1,             0,             0,             0,                40,                 0,     500000.0,           0.0,                 6,                31,        22.66666667,       14.43375673,            208.3333333,             0,              1,              0,              0,              1,              0,              0,              0,             0,                34.0,                 18.5,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                37,                   0,                 0,                   946,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.5-104.97.125.160-55143-443-6,104.97.125.160,         443,   192.168.10.5,            55143,        6,7/7/2017 3:30,             3,                 2,                      0,                         37,                           0,                    31,                     6,                   18.5,           17.67766953,                    0,                     0,                      0,                     0,  12300000.0,    666666.6667,           3.0,          0.0,            3,            3,            3,          3.0,         0.0,           3,           3,            0,            0,           0,           0,           0,            1,             0,             0,             0,                40,                 0,  666666.6667,           0.0,                 6,                31,        22.66666667,       14.43375673,            208.3333333,             0,              1,              0,              0,              1,              0,              0,              0,             0,                34.0,                 18.5,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                37,                   0,                 0,                   980,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.5-104.97.125.160-55144-443-6,104.97.125.160,         443,   192.168.10.5,            55144,        6,7/7/2017 3:30,             1,                 2,                      0,                         37,                           0,                    31,                     6,                   18.5,           17.67766953,                    0,                     0,                      0,                     0,  37000000.0,      2000000.0,           1.0,          0.0,            1,            1,            1,          1.0,         0.0,           1,           1,            0,            0,           0,           0,           0,            1,             0,             0,             0,                40,                 0,    2000000.0,           0.0,                 6,                31,        22.66666667,       14.43375673,            208.3333333,             0,              1,              0,              0,              1,              0,              0,              0,             0,                34.0,                 18.5,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                37,                   0,                 0,                   980,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.5-104.97.125.160-55145-443-6,104.97.125.160,         443,   192.168.10.5,            55145,        6,7/7/2017 3:30,             4,                 2,                      0,                         37,                           0,                    31,                     6,                   18.5,           17.67766953,                    0,                     0,                      0,                     0,   9250000.0,       500000.0,           4.0,          0.0,            4,            4,            4,          4.0,         0.0,           4,           4,            0,            0,           0,           0,           0,            1,             0,             0,             0,                40,                 0,     500000.0,           0.0,                 6,                31,        22.66666667,       14.43375673,            208.3333333,             0,              1,              0,              0,              1,              0,              0,              0,             0,                34.0,                 18.5,                    0,                  40,                 0,                    0,                 0,                  0,                    0,                0,                  2,                37,                   0,                 0,                   980,                      -1,                1,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
 192.168.10.5-104.97.139.37-55254-443-6, 104.97.139.37,         443,   192.168.10.5,            55254,        6,7/7/2017 3:30,             3,                 3,                      0,                         43,                           0,                    31,                     6,            14.33333333,           14.43375673,                    0,                     0,                      0,                     0,  14300000.0,      1000000.0,           1.5,  0.707106781,            2,            1,            3,          1.5, 0.707106781,           2,           1,            0,            0,           0,           0,           0,            0,             0,             0,             0,                60,                 0,    1000000.0,           0.0,                 6,                31,              12.25,              12.5,                 156.25,             0,              0,              0,              0,              1,              0,              0,              0,             0,         16.33333333,          14.33333333,                    0,                  60,                 0,                    0,                 0,                  0,                    0,                0,                  3,                43,                   0,                 0,                   946,                      -1,                2,                   20,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
 192.168.10.16-104.97.140.32-36206-80-6, 104.97.140.32,          80,  192.168.10.16,            36206,        6,7/7/2017 3:30,            54,                 1,                      1,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,    37037.03704,          54.0,          0.0,           54,           54,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                32,                32,  18518.51852,   18518.51852,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              1,              0,              0,             1,                 0.0,                  0.0,                    0,                  32,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 0,                   1,                 0,                   939,                    1269,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.25-121.29.54.141-53524-443-6, 121.29.54.141,         443,  192.168.10.25,            53524,        6,7/7/2017 3:30,             1,                 2,                      0,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,      2000000.0,           1.0,          0.0,            1,            1,            1,          1.0,         0.0,           1,           1,            0,            0,           0,           0,           0,            0,             0,             0,             0,                64,                 0,    2000000.0,           0.0,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 0.0,                  0.0,                    0,                  64,                 0,                    0,                 0,                  0,                    0,                0,                  2,                 0,                   0,                 0,                   130,                      -1,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.25-121.29.54.141-53524-443-6, 121.29.54.141,         443,  192.168.10.25,            53524,        6,7/7/2017 3:30,           154,                 1,                      1,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,    12987.01299,         154.0,          0.0,          154,          154,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                32,                32,  6493.506494,   6493.506494,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             1,                 0.0,                  0.0,                    0,                  32,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 0,                   1,                 0,                   130,                   65535,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.25-121.29.54.141-53526-443-6, 121.29.54.141,         443,  192.168.10.25,            53526,        6,7/7/2017 3:30,             1,                 2,                      0,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,      2000000.0,           1.0,          0.0,            1,            1,            1,          1.0,         0.0,           1,           1,            0,            0,           0,           0,           0,            0,             0,             0,             0,                64,                 0,    2000000.0,           0.0,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             0,                 0.0,                  0.0,                    0,                  64,                 0,                    0,                 0,                  0,                    0,                0,                  2,                 0,                   0,                 0,                   130,                      -1,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.25-121.29.54.141-53526-443-6, 121.29.54.141,         443,  192.168.10.25,            53526,        6,7/7/2017 3:30,           118,                 1,                      1,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,    16949.15254,         118.0,          0.0,          118,          118,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                32,                32,  8474.576271,   8474.576271,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             1,                 0.0,                  0.0,                    0,                  32,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 0,                   1,                 0,                   130,                   65535,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
192.168.10.25-121.29.54.141-53527-443-6, 121.29.54.141,         443,  192.168.10.25,            53527,        6,7/7/2017 3:30,           239,                 1,                      1,                          0,                           0,                     0,                     0,                    0.0,                   0.0,                    0,                     0,                      0,                     0,         0.0,    8368.200837,         239.0,          0.0,          239,          239,            0,          0.0,         0.0,           0,           0,            0,            0,           0,           0,           0,            0,             0,             0,             0,                32,                32,  4184.100418,   4184.100418,                 0,                 0,                0.0,               0.0,                    0.0,             0,              0,              0,              0,              1,              0,              0,              0,             1,                 0.0,                  0.0,                    0,                  32,                 0,                    0,                 0,                  0,                    0,                0,                  1,                 0,                   1,                 0,                   130,                   65535,                0,                   32,          0,          0,          0,          0,        0,        0,        0,        0,BENIGN
  • Threat Groupings: Benign, (D)DOS, Port Scan, Brute Force, Botnet, Web Attack etc.
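
The first column of the rows above appears to be a CICFlowMeter-style flow identifier of the form `<ip>-<ip>-<port>-<port>-<protocol>`. The following is a minimal sketch of splitting it into typed fields. The names `FlowKey` and `parse_flow_id` are hypothetical, and the sketch assumes IPv4 addresses. Because the identifier in the sample rows lists 192.168.10.25 first while the next column lists 121.29.54.141, the fields are named neutrally rather than as source and destination.

```python
from typing import NamedTuple

# IANA protocol numbers for the protocols most common in flow exports.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


class FlowKey(NamedTuple):
    """Hypothetical container for the five parts of a flow identifier."""
    endpoint_a: str
    endpoint_b: str
    port_a: int
    port_b: int
    protocol: str


def parse_flow_id(flow_id: str) -> FlowKey:
    """Split '<ip>-<ip>-<port>-<port>-<proto>' into typed fields (IPv4 only)."""
    ip_a, ip_b, port_a, port_b, proto = flow_id.strip().split("-")
    number = int(proto)
    # Fall back to the raw number when the protocol is not in the small table.
    name = PROTOCOL_NAMES.get(number, str(number))
    return FlowKey(ip_a, ip_b, int(port_a), int(port_b), name)


print(parse_flow_id("192.168.10.25-121.29.54.141-53526-443-6"))
```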

Large Language Models

Transformer Architecture

  • Neural sequence models with varying context windows
  • Replace RNNs and LSTMs for long-range modeling (no recurrence; highly parallelizable)
  • Introduced by Vaswani et al. (2017); built on self-attention
  • Forms basis of virtually all modern LLMs

Prompting Techniques: Examples

Prompting Strategies

  • Instruction Tuning
  • Few-Shot Prompting
  • Chain-of-Thought (CoT) (Wei et al. (2022))
  • Tree-of-Thought (ToT) (Yao et al. (2023))

Bootstrapping: Use high-confidence model outputs as demonstrations for the prompt (e.g., including CoT steps); reduces manual labeling while maintaining alignment with the desired structure (Opsahl-Ong et al. (2024)).


❌ classify this {...}

Instruction Tuning
------------------------
Classify this network traffic as either Benign or Malicious.
The input is structured as a JSON packet capture.
Base your classification on protocol type, ports, and payload signatures.
Input: {...}


Few-Shot
------------------------
Classify the following network session:

Example 1:  
Input: {...}  
Label: Malicious

Example 2:  
Input: {...}  
Label: Benign

Now classify:  
Input: {...}


Chain-of-Thought
------------------------
Input: {...}

Think step-by-step:  
- Identify source/destination and protocol  
- Evaluate payload entropy and known threat indicators  
- Compare timing patterns with known scanning behavior  
Come to a conclusion and label the input accordingly.


Tree-of-Thought
------------------------
Imagine three different experts are answering this question.

All experts will write down 1 step of their thinking,  
then share it with the group.  

Then all experts will go on to the next step, etc.  
If any expert realizes they're wrong at any point then they leave.

The question is: How should we classify this network session?  
Input: {...}
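
The Few-Shot template above can be assembled from a list of labeled demonstrations. The following is a minimal sketch, not the pipeline used in this work; `build_few_shot_prompt` is a hypothetical helper and the flow strings are placeholders.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(demos: Sequence[Tuple[str, str]], query: str) -> str:
    """Render (input, label) demonstrations followed by the unlabeled query."""
    lines = ["Classify the following network session:", ""]
    for i, (flow, label) in enumerate(demos, start=1):
        lines += [f"Example {i}:", f"Input: {flow}", f"Label: {label}", ""]
    lines += ["Now classify:", f"Input: {query}"]
    return "\n".join(lines)


# Placeholder inputs; a real prompt would carry serialized flow features.
demos = [("{flow A}", "Malicious"), ("{flow B}", "Benign")]
print(build_few_shot_prompt(demos, "{flow C}"))
```

Bootstrapped demonstrations would enter the same list, with the label taken from a high-confidence model output instead of a manual annotation.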

Methodology

System Architecture: 4-Stage IDS

Performance Metrics & Notation


Malicious

True Positive = Attack
False Positive = False Alarm

Benign

True Negative = Benign Activity
False Negative = Missed Attack

Accuracy

\(\frac{TP + TN}{TP + TN + FP + FN}\)

Proportion of correctly classified samples.

Precision

\(\frac{TP}{TP + FP}\)

How many predicted attacks were truly malicious (avoiding false alarms).

Recall

\(\frac{TP}{TP + FN}\)

How many attacks were detected (avoiding missed threats).

F1-Score

\(2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}\)

Balances recall and precision.
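
The four formulas above can be checked on a small confusion matrix. The sketch below uses made-up counts (they are not results from this study) and returns 0.0 where a denominator is zero, which is a convention chosen here.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 100 flows, 18 of them attacks, 10 attacks missed.
for name, value in binary_metrics(tp=8, tn=80, fp=2, fn=10).items():
    print(f"{name}: {value:.4f}")
```

With these counts accuracy stays at 0.88 although fewer than half of the attacks are detected, which is why recall and F1 are reported alongside it.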

Table 2: Aggregated metrics for multi-class classification.
Metric | Formula | Description
Weighted F1 | \[\sum_{i=1}^{7} w_i \cdot F_{1,i}\] | F1 score per class, weighted by how often each class occurs in the dataset.
Balanced Accuracy | \[\frac{1}{7} \sum_{i=1}^{7} \text{Recall}_i\] | Average recall across all classes. Ensures each class contributes equally, even if imbalanced.
Macro F1 | \[\frac{1}{7} \sum_{i=1}^{7} F_{1,i}\] | Unweighted average F1 across all classes, treating rare and common attack types equally.
Micro F1 | \[\frac{2 \cdot \sum TP_i}{2 \cdot \sum TP_i + \sum FP_i + \sum FN_i}\] | Global view: aggregates all predictions across classes before computing a single F1.
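
The aggregated metrics in Table 2 can be sketched in plain Python. `aggregate_metrics` is a hypothetical helper, the labels are toy data, and classes without true samples are counted with recall 0 to mirror the fixed 1/7 factor in the table (an assumption; libraries may average only over classes that occur).

```python
from collections import Counter


def aggregate_metrics(y_true, y_pred, classes):
    """Weighted/macro/micro F1 and balanced accuracy for single-label data."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    support = Counter(y_true)
    n = len(y_true)
    f1, recall = {}, {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1[c] = 2 * tp[c] / denom if denom else 0.0
        recall[c] = tp[c] / support[c] if support[c] else 0.0
    micro_denom = 2 * sum(tp.values()) + sum(fp.values()) + sum(fn.values())
    return {
        "weighted_f1": sum(support[c] / n * f1[c] for c in classes),
        "balanced_accuracy": sum(recall.values()) / len(classes),
        "macro_f1": sum(f1.values()) / len(classes),
        "micro_f1": 2 * sum(tp.values()) / micro_denom if micro_denom else 0.0,
    }


# Toy data: a classifier that always answers with the majority class.
classes = ["Benign", "(D)DOS", "Port Scan"]
y_true = ["Benign"] * 4 + ["(D)DOS", "Port Scan"]
y_pred = ["Benign"] * 6
print(aggregate_metrics(y_true, y_pred, classes))
```

In this toy case micro F1 equals plain accuracy (4 of 6 correct), as it always does for single-label classification, while balanced accuracy and macro F1 drop to roughly 0.33 and 0.27 because the two minority classes are never predicted.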

Evaluated LLMs & Prompting Techniques


  • Three open-source LLMs:
    • DeepSeek-R1 (reasoning/thinking general-purpose; Qwen2.5 and Llama base; distilled from DeepSeek-R1-Zero)
    • Gemma-3 (general-purpose; distilled from Gemini)
    • Qwen2.5-Coder (coding-focused, Qwen2.5 base)
  • Model sizes: 1B to 32B parameters (Q4_K_M quantization)
  • Three custom fine-tuned variants: DeepSeek-R1 7B, Qwen2.5-Coder 7B, Gemma-3 12B

deepseek-r1

gemma3

qwen2.5-coder



Table 3: Network traffic flow LLM input formats.
Abbr. | Description
raw | Network traffic flow features as comma-separated integers
ctx | Network traffic flow features as Description, Value pairs with additional statistics
sumctx | A summarized version of ctx

Table 4: LLM Prompting Techniques.
Abbr. | Full Name | Description
inst | Pure Instruction Tuning | Refined system prompt, no demonstrations
6 | 6-Shot Prompting | 3 bootstrapped + 3 labeled examples
CoT-6 | 6-Shot Chain-of-Thought Prompting | Step-by-step reasoning, 3 bootstrapped + 3 labeled examples
ToT-V | 5-Expert 4-Shot Tree-of-Thought | 5 experts, 4-shot, independent reasoning paths, 3 labeled + 1 bootstrapped

Input Format Examples

Listing 2: Example of the raw input format.
[[ ## flow_parameters ## ]]
6, 8686549, 2, 0, 12, 0, 6, 6, ...
[[ ## Label ## ]]
(D)DOS
[[ ## completed ## ]]
Listing 3: Example of the ctx input format.
[[ ## feature_contextualization ## ]]
The Protocol is TCP:
- 'Benign', where TCP is 56.62% of flows
- '(D)DOS', where TCP is 100.00% of flows
- 'Port Scan', where TCP is 100.00% of flows
- 'Brute Force', where TCP is 100.00% of flows
- 'Botnet', where TCP is 100.00% of flows
- 'Web Attack', where TCP is 100.00% of flows

The Flow_Duration is 8.686549 s:
- 'Benign', 8.686548 s above min (0.000001 s) and 3.114020 s below mean (11.800569 s)
- '(D)DOS', 8.686543 s above min (0.000006 s) and 44.797952 s below mean (53.484501 s)
- 'Port Scan', 8.466484 s above mean (0.220065 s) and 64.916533 s below max (73.603082 s)
- 'Brute Force', 0.844334 s above mean (7.842215 s) and 8.679114 s below max (17.365663 s)
- 'Botnet', 8.296008 s above mean (0.390541 s) and 52.317346 s below max (61.003895 s)
- 'Web Attack', 1.803673 s above mean (6.882876 s) and 61.516507 s below max (70.203056 s)

The total forward packets are 2.0:
- 'Benign', 1 packets above min (1) and 4 packets below mean (6)
- '(D)DOS', 1 packets above min (1) and 3 packets below mean (5)
- 'Port Scan', 1 packets above mean (1) and 3 packets below max (5)
- 'Brute Force', 1 packets above min (1) and 9 packets below mean (11)
- 'Botnet', 1 packets above min (1) and 1 packets below mean (3)
- 'Web Attack', 1 packets above min (1) and 10 packets below mean (12)

...
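
The per-class lines in Listing 3 appear to follow one rule: a value below the class mean is placed between min and mean, otherwise between mean and max. The sketch below reproduces that wording under this assumption. `contextualize` is a hypothetical helper, not the generator used in this work, and `None` marks statistics that the listing does not show.

```python
from typing import Dict, List, Optional, Tuple

Stats = Tuple[Optional[float], float, Optional[float]]  # (min, mean, max)


def contextualize(value: float, stats: Dict[str, Stats],
                  unit: str = "s", digits: int = 6) -> List[str]:
    """Describe a feature value relative to per-class min/mean/max."""
    lines = []
    for label, (lowest, mean, highest) in stats.items():
        if value < mean:
            low_name, low, high_name, high = "min", lowest, "mean", mean
        else:
            low_name, low, high_name, high = "mean", mean, "max", highest
        lines.append(
            f"- '{label}', {value - low:.{digits}f} {unit} above {low_name} "
            f"({low:.{digits}f} {unit}) and {high - value:.{digits}f} {unit} "
            f"below {high_name} ({high:.{digits}f} {unit})"
        )
    return lines


# Statistics copied from Listing 3; None = not shown there (and unused here).
flow_duration_stats = {
    "Benign": (0.000001, 11.800569, None),
    "Port Scan": (None, 0.220065, 73.603082),
}
print("\n".join(contextualize(8.686549, flow_duration_stats)))
```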
Listing 4: Example of the sumctx input format.
[[ ## benign_summary ## ]]
The flow exhibits some characteristics that partially align with benign traffic, such as being a TCP protocol (which is 56.62% typical for benign flows). However, the flow's metrics deviate 
significantly from benign norms: it has only two forward packets (below the benign mean of 6), zero backward packets (below the benign mean of 6), and an extremely low byte rate. The flow duration 
of 8.686549 seconds is longer than many benign flow minimums but still below the mean, suggesting an atypical benign interaction.

[[ ## ddos_summary ## ]]
While the flow uses TCP (100% consistent with DDoS), almost all other metrics diverge from typical (D)DOS patterns. The flow's packet count (2 forward, 0 backward) is below DDoS minimums, and the 
byte rate of 0.000977 kB/s is substantially lower than expected. The inter-arrival times are also inconsistent with DDoS characteristics, being much longer and more uniform than typical DDoS 
traffic's rapid, varied packet exchanges.

[[ ## port_scan_summary ## ]]
The flow fundamentally contradicts port scan traffic characteristics. While using TCP (100% consistent with port scans), the flow has critical mismatches: only two forward packets (above the port 
scan mean, but with zero backward packets), extremely low byte rates, and inter-arrival times that are much longer and more consistent than typical port scanning behavior. The flow lacks the rapid,
probing nature characteristic of port scanning.

[[ ## brute_force_summary ## ]]
The flow shows minimal alignment with brute force attack patterns. Although it uses TCP (100% consistent with brute force), the packet count (2 forward, 0 backward) is far below brute force 
minimums. The extremely low byte rate, zero payload, and long, uniform inter-arrival times are antithetical to the rapid, multiple authentication attempt pattern typical of brute force attacks.

[[ ## botnet_summary ## ]]
While the TCP protocol matches botnet traffic (100%), other metrics strongly diverge from botnet characteristics. The flow has only two forward packets and zero backward packets, which is 
inconsistent with botnet communication patterns. The extremely low byte rate, zero payload, and long, uniform inter-arrival times do not reflect the typically more dynamic and data-rich botnet 
network interactions.

[[ ## web_attack_summary ## ]]
The flow shows minimal correspondence with web attack traffic. Although it uses TCP (100% consistent with web attacks), the metrics are fundamentally different: only two forward packets, zero 
backward packets, extremely low byte rate, and no payload. Web attacks typically involve more complex packet exchanges, varied packet sizes, and more substantial data transfer, none of which are 
present in this flow.

[[ ## overall_summary ## ]]
This flow represents an anomalous network interaction that does not cleanly fit any of the examined traffic categories. Its defining characteristics are extremely low data transfer (zero payload), 
minimal packet count (two forward, zero backward), long but uniform inter-arrival times, and TCP protocol. While it shares the TCP protocol with all examined traffic types, its metrics are too 
sparse and uniform to confidently classify as malicious or even typical benign traffic. The flow appears to be a minimal, potentially incomplete or aborted network connection that lacks the dynamic
characteristics of established traffic patterns.

Dataset Splitting and Stage Distribution

Figure 6: Overview of data sampling and distribution across stages. Left: Initial sampling from CIC-IDS2017. Right: Detailed distribution for the first two stages. All stages are sampled from our dataset \(\mathcal{F}\). Stages 3 and 4 are omitted because they contain only testing data and all data, respectively.

Results

Baseline Performance

Table 5: Baseline performance of the Verkerken et al. (2023) system across evaluation scopes. Reported corresponds to published results; Reproduction reflects results under our setup.
System | Balanced Acc. | Acc. | F1 (weighted) | F1 (macro)
Reported (End-to-End) | 0.9608 | 0.9877 | 0.9897 | 0.8276
Reproduction (End-to-End) | 0.734 | 0.9787 | 0.9804 | 0.7103
Reproduction (Stage-Specific) | 0.208 | 0.9461 | 0.9394 | 0.1504

System Performance by Model Variant

Figure 7: System Performance measured by weighted \(F_1\)-Score for configurations with varying model, model size and input format.
Figure 8: Stage-Specific – Model Family. Best configurations by specified metric.

System Performance by Prompting Technique

Figure 9: Stage-Specific – Prompting Technique. All configurations in comparison, with the confusion matrix for the best-performing configuration.

System Performance Under Fine-Tuning

Table 6: Stage-Specific – Fine-Tuning. Isolated Stage 4 results for system performance under fine-tuning.
Configuration | Balanced Acc. | Acc. | Prec. | Rec. | F1 (weighted) | F1 (micro) | F1 (macro)
deepseek-r1:7b-ft-raw | 0.1670 | 0.1088 | 0.7054 | 0.1088 | 0.1122 | 0.1088 | 0.0477
qwen2.5-coder:7b-ft-raw | 0.1141 | 0.0372 | 0.2679 | 0.0372 | 0.0084 | 0.0372 | 0.0202
gemma3:12b-ft-raw | 0.1643 | 0.0306 | 0.2784 | 0.0306 | 0.0058 | 0.0306 | 0.0155

Auxiliary Analyses

Table 7: End-to-End – Auxiliary. Results for balanced evaluation across the (full) test set, using the gemma:27b-raw configuration.
Test Split Version | Balanced Acc. | Acc. | Prec. | Rec. | F1 (weighted) | F1 (micro) | F1 (macro)
Original | 0.2076 | 0.6670 | 0.6567 | 0.6670 | 0.6583 | 0.6670 | 0.1732
Binary-Balanced | 0.1828 | 0.4533 | 0.3154 | 0.4533 | 0.3660 | 0.4533 | 0.1652
Multi-Class-Balanced | 0.2117 | 0.2117 | 0.1735 | 0.2117 | 0.1469 | 0.2117 | 0.1259
Figure 10: End-to-End – Auxiliary. Confusion matrices showing model performance on three evaluation splits. All evaluations used the gemma:27b-raw configuration. Panel (a) serves as the reference for the other two panels.
Figure 11: Stage-Specific – Auxiliary. Percentage of samples for which the final LLM output remained malformatted even after self-correction, shown by model and input format, averaged over three seeded runs. Error bars denote the minimum and maximum values from these runs. The pipeline labels these samples as “N/A” before converting them to Unknown.
Table 8: Stage-Specific – Auxiliary. Stratified random guessing baseline: mean performance metrics and 95% CI (N=10,000 Monte Carlo trials).
Metric | Mean | 95% CI
Accuracy | 0.6924 | (0.6547, 0.7265)
Balanced Accuracy | 0.1430 | (0.1157, 0.1858)
Precision | 0.6924 | (0.6746, 0.7121)
Recall | 0.6924 | (0.6547, 0.7265)
\(F_1\)-Score (Weighted) | 0.6922 | (0.6674, 0.7162)
\(F_1\)-Score (Micro) | 0.6924 | (0.6547, 0.7265)
\(F_1\)-Score (Macro) | 0.1423 | (0.1164, 0.1844)
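
A stratified random guessing baseline like the one in Table 8 can be approximated with a short Monte Carlo simulation. The sketch below rests on several assumptions: "stratified" is taken to mean that guesses are drawn with the empirical class frequencies, the interval is a simple percentile interval, and the label distribution is made up for illustration. `stratified_guess_accuracy` is a hypothetical name.

```python
import random
from statistics import mean


def stratified_guess_accuracy(labels, trials=2000, seed=0):
    """Mean accuracy and 95% percentile interval of prior-weighted guessing."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    weights = [labels.count(c) for c in classes]  # empirical class priors
    accuracies = []
    for _ in range(trials):
        guesses = rng.choices(classes, weights=weights, k=len(labels))
        hits = sum(g == t for g, t in zip(guesses, labels))
        accuracies.append(hits / len(labels))
    accuracies.sort()
    lower = accuracies[int(0.025 * trials)]
    upper = accuracies[int(0.975 * trials) - 1]
    return mean(accuracies), (lower, upper)


# Hypothetical, imbalanced label set (not the distribution of this study).
labels = ["Benign"] * 80 + ["(D)DOS"] * 15 + ["Port Scan"] * 5
avg, (low, high) = stratified_guess_accuracy(labels)
print(f"accuracy: {avg:.4f} (95% interval {low:.4f} to {high:.4f})")
```

The expected accuracy of such a guesser is the sum of squared class priors (here 0.8² + 0.15² + 0.05² = 0.665), so a dominant benign class alone yields a high accuracy without any detection ability.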
Figure 12: System Performance measured by balanced accuracy for configurations with varying model, model size and input format.

Results Summary

  • All LLM configurations underperformed the baseline, especially in the final (LLM) stage
  • Class imbalance undermines the seemingly high End-to-End results
  • Stage-Specific results drastically underperform the baseline
  • No consistent improvement from advanced prompting or larger model sizes
  • Fine-tuning (LoRA) was ineffective or detrimental

Discussion

What the Negative Result Tells Us

  • LLMs failed to match baseline performance in multi-class network traffic classification
  • Performance often near random chance on challenging samples
  • Critical reliability issues: malformatted outputs and frequent single-class prediction bias
  • No evidence that more data, larger models, or advanced techniques would change this conclusion

Current general-purpose LLMs are not suitable as standalone classifiers for low-level network flow data due to fundamental task-model misalignment.

Implications for IDS Design & LLM Research

For Practice:

  • LLMs not suitable for core IDS classification tasks
  • Recommend caution and focus on auxiliary, NLP-centric roles (e.g., report generation)
  • Critical reliability and performance issues make deployment in high-throughput IDS impossible

For Research:

  • LLMs show limited numerical reasoning capabilities and a poor task-model fit
  • Neither comma-separated nor longer, contextualized input representations proved suitable for LLMs
  • Adaptation techniques (PEFT, full fine-tuning, domain-adaptive pre-training) merit deeper study as a possible route to a better task-model fit

Conclusion

Main Findings

  • General-purpose LLMs, as evaluated, are not suitable enhancements for established ML techniques in multi-class network traffic classification.
  • LLMs consistently underperformed the reported SOTA baseline across all metrics.
  • Occasional wins over the reproduced baseline are artifacts of dataset imbalance.

RQ1a (Improvement over Baseline)

No. All configurations performed worse than the reported baseline across all metrics.

RQ1b (Improvement over Random-Guessing)

No. Some configurations beat random guessing on single metrics, but none on all, as required.
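
The "all metrics" criterion can be illustrated with numbers already reported: the deepseek-r1:7b-ft-raw row of Table 6 against the random guessing means of Table 8. The sketch assumes that both tables share the Stage-Specific scope and that a metric counts as a win when it exceeds the baseline mean; whether the mean or the interval bounds were used is not stated here. `metrics_beating` is a hypothetical helper.

```python
# Means of the stratified random guessing baseline (Table 8).
RANDOM_MEAN = {"b_acc": 0.1430, "acc": 0.6924, "prec": 0.6924, "rec": 0.6924,
               "f1_w": 0.6922, "f1_m": 0.6924, "f1_M": 0.1423}

# deepseek-r1:7b-ft-raw (Table 6).
DEEPSEEK_FT = {"b_acc": 0.1670, "acc": 0.1088, "prec": 0.7054, "rec": 0.1088,
               "f1_w": 0.1122, "f1_m": 0.1088, "f1_M": 0.0477}


def metrics_beating(candidate: dict, reference: dict) -> list:
    """Names of the metrics on which the candidate exceeds the reference."""
    return [m for m in reference if candidate[m] > reference[m]]


wins = metrics_beating(DEEPSEEK_FT, RANDOM_MEAN)
beats_all = len(wins) == len(RANDOM_MEAN)
print(f"wins: {wins}, beats random on all metrics: {beats_all}")
```

Under these assumptions the configuration wins on balanced accuracy and precision only, so it does not satisfy the criterion.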

While LLMs are not ready for core IDS classification, careful integration in auxiliary roles remains a promising direction for advancing cybersecurity operations.

References

Biji, Divya Mary, and Yong-Woon Kim. 2024. “Evaluating the Performance of Large Language Models in Classifying Numerical Data.” In 2024 15th International Conference on Information and Communication Technology Convergence (ICTC), 840–44. https://doi.org/10.1109/ICTC62082.2024.10827495.
DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, et al. 2025. “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.” January 22, 2025. https://doi.org/10.48550/arXiv.2501.12948.
Hui, Binyuan, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, et al. 2024. “Qwen2.5-Coder Technical Report.” November 12, 2024. https://doi.org/10.48550/arXiv.2409.12186.
Injadat, Mohammad Noor, Abdallah Moubayed, Ali Bou Nassif, and Abdallah Shami. 2021. “Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection.” IEEE Transactions on Network and Service Management 18 (2): 1803–16. https://doi.org/10.1109/TNSM.2020.3014929.
Manocchio, Liam Daly, Siamak Layeghy, Wai Weng Lo, Gayan K. Kulatilleke, Mohanad Sarhan, and Marius Portmann. 2024. “FlowTransformer: A Transformer Framework for Flow-Based Network Intrusion Detection Systems.” Expert Syst. Appl. 241 (C). https://doi.org/10.1016/j.eswa.2023.122564.
Opsahl-Ong, Krista, Michael J Ryan, Josh Purtell, David Broman, Christopher Potts, Matei Zaharia, and Omar Khattab. 2024. “Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs.” In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, edited by Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen, 9340–66. Miami, Florida, USA: Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.525.
Requeima, James, John F Bronskill, Dami Choi, Richard E. Turner, and David Duvenaud. 2024. “LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language.” In Advances in Neural Information Processing Systems, 37:109609–71. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c5ec22711f3a4a2f4a0a8ffd92167190-Abstract-Conference.html.
Sharafaldin, Iman, Arash Habibi Lashkari, and Ali A. Ghorbani. 2018. “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization.” In Proceedings of the 4th International Conference on Information Systems Security and Privacy, 108–16. Funchal, Madeira, Portugal: SCITEPRESS - Science and Technology Publications. https://doi.org/10.5220/0006639801080116.
Team, Gemma, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, et al. 2025. “Gemma 3 Technical Report.” March 25, 2025. https://doi.org/10.48550/arXiv.2503.19786.
Vacareanu, Robert, Vlad Andrei Negru, Vasile Suciu, and Mihai Surdeanu. 2024. “From Words to Numbers: Your Large Language Model Is Secretly a Capable Regressor When Given in-Context Examples.” In First Conference on Language Modeling. https://openreview.net/forum?id=LzpaUxcNFK.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In.
Verkerken, Miel, Laurens D’hooge, Didik Sudyana, Ying-Dar Lin, Tim Wauters, Bruno Volckaert, and Filip De Turck. 2023. “A Novel Multi-Stage Approach for Hierarchical Intrusion Detection.” IEEE Transactions on Network and Service Management 20 (3): 3915–29. https://doi.org/10.1109/TNSM.2023.3259474.
Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V. Le, and Denny Zhou. 2022. “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” In Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–37. NIPS ’22. Red Hook, NY, USA: Curran Associates Inc.
Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. 2023. “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” In Advances in Neural Information Processing Systems, edited by A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, 36:11809–22. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2023/file/271db9922b8d1f4dd7aaef84ed5ac703-Paper-Conference.pdf.