Hacking

Creating An AI Honeypot To Have interaction With Attackers Sophisticatedly

17 September 2024

Honeypots, decoy methods, detect and analyze malicious exercise by coming in numerous kinds and might be deployed on cloud platforms to supply insights into attacker conduct, enhancing safety.

The research proposes to create an interactive honeypot system utilizing a Massive Language Mannequin (LLM) to imitate Linux server conduct.

By fine-tuning the LLM with a dataset of attacker-generated instructions, the objective is to boost honeypot effectiveness in detecting and analyzing malicious actions.

– Commercial –

The authors mixed three datasets of Linux instructions, together with real-world attacker knowledge, frequent instructions, and command explanations, and processed this knowledge by simulating command execution and preprocessing the textual content, creating a strong dataset for coaching their language mannequin to imitate a honeypot.

Immediate engineering concerned refining prompts to align with analysis targets and improve mannequin interplay with the dataset, resulting in a more practical honeypot system.

The Llama3 8B mannequin was chosen for honeypot LLM on account of its stability of linguistic proficiency and computational effectivity.

Bigger fashions have been too gradual, whereas code-centric fashions have been much less efficient for honeypot simulation.

Decoding Compliance: What CISOs Must Know – Be a part of Free Webinar

They fine-tuned a pre-trained language mannequin utilizing LlamaFactory, using LoRA, QLoRA, NEFTune noise, and Flash Consideration 2 to boost coaching effectivity and efficiency, leading to a honeypot server-like mannequin.

It proposes an LLM-Honeypot framework utilizing an SSH server and a fine-tuned LLM to work together with attackers in pure language, enabling real looking simulation and attacker conduct evaluation.

The customized SSH server, constructed utilizing Python’s Paramiko library, employs a fine-tuned language mannequin to generate real looking responses to consumer instructions.

It logs SSH connections, consumer credentials, and command interactions, offering precious knowledge for cybersecurity evaluation.

The fine-tuned mannequin’s coaching losses exhibited a gradual decline, indicating efficient studying from the dataset.

A studying fee of 5×10−4 was used for 36 coaching steps, leading to constant efficiency enchancment and enhanced skill to generate real looking and contextually applicable responses.

Histogram of Cosine Similarity Scores over 140 Samples

It demonstrated superior efficiency in producing terminal outputs in comparison with the bottom mannequin, as evidenced by persistently larger similarity scores and decrease distance metrics throughout all samples, which signifies the mannequin’s effectiveness in producing outputs that carefully align with anticipated responses from a Cowrie honeypot server.

The paper proposes a brand new technique for creating interactive and real looking honeypot methods utilizing LLMs. By fine-tuning an LLM on attacker knowledge, the system enhances response high quality, improves menace detection, and gives deeper insights into attacker conduct.

They plan to develop coaching datasets, discover different fine-tuning, and incorporate behavioral evaluation by deploying the system publicly to gather assault logs and create data graphs to investigate attacker methods.

They will even consider efficiency utilizing metrics like accuracy and interplay high quality to refine the mannequin and improve honeypots for higher cyber-threat detection and evaluation.

Are You From SOC/DFIR Groups? - Strive Superior Malware and Phishing Evaluation With ANY.RUN - 14-day free trial

LEAVE A REPLY Cancel reply