Autopentest-drl

The agent learns the basics: scan → detect a vulnerable service → execute the correct exploit. Rewards are given immediately.
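The loop above can be sketched as a toy environment. This is a hypothetical illustration, not autopentest-drl's actual API: the agent must scan before exploiting, and the reward arrives immediately on the correct action.

```python
import random

# Toy sketch (hypothetical, not autopentest-drl's real interface): the agent
# must scan first, then fire the exploit matching the detected service.
# Reward is immediate: +10 for the correct exploit, -1 otherwise.
SERVICES = ["ftp", "smb", "http"]

class ToyPentestEnv:
    def reset(self):
        self.service = random.choice(SERVICES)  # hidden vulnerable service
        self.scanned = False
        return "start"

    def step(self, action):
        if action == "scan":
            self.scanned = True
            return f"found:{self.service}", 0, False   # obs, reward, done
        if action == f"exploit_{self.service}" and self.scanned:
            return "shell", 10, True                   # correct exploit
        return "nothing", -1, True                     # wrong or blind exploit

env = ToyPentestEnv()
env.reset()
obs, r, done = env.step("scan")
service = obs.split(":")[1]
obs, r, done = env.step(f"exploit_{service}")
print(obs, r)  # shell 10
```

Because the reward follows the correct action with no delay, even simple Q-learning converges quickly on environments of this shape.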

Simulators are imperfect: they do not model network latency jitter, packet loss, or ephemeral service failures. An agent that thrives in CybORG may freeze when a real web server occasionally drops a FIN packet, interpreting the silence as a firewall.
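One common mitigation is to inject failures during training. The wrapper below is a sketch under an assumed `reset`/`step` interface (not CybORG's real API): a fraction of responses are silently dropped, so the agent learns that silence can mean packet loss rather than a block.

```python
import random

# Sketch (assumed env interface, not CybORG's actual API): wrap a simulator
# so some responses are silently dropped, mimicking the packet loss and
# flaky services that clean simulators never show the agent.
class FlakyEnvWrapper:
    def __init__(self, env, drop_prob=0.05, seed=0):
        self.env = env
        self.drop_prob = drop_prob
        self.rng = random.Random(seed)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done = self.env.step(action)
        if self.rng.random() < self.drop_prob:
            # Response lost in transit: the agent sees silence, not a
            # firewall block -- it must learn to retry, not freeze.
            return "timeout", 0, done
        return obs, reward, done

class StubEnv:  # minimal stand-in simulator for demonstration
    def reset(self): return "start"
    def step(self, action): return "ok", 1, False

env = FlakyEnvWrapper(StubEnv(), drop_prob=0.5, seed=1)
observed = {env.step("scan")[0] for _ in range(50)}
print(observed)  # both "ok" and "timeout" appear
```

Training against the flaky wrapper costs some sample efficiency but produces policies that tolerate the noise real networks exhibit.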

Defenders deploy simple firewalls and IDS alerts. The agent learns to add random delays or route through decoys.
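The delay tactic can be made concrete against a simple rate-threshold IDS. The numbers here are hypothetical: an IDS flags any source exceeding a probe count in a sliding window, and spacing probes with random jitter keeps the rate below the threshold.

```python
import random

# Sketch (hypothetical thresholds): a simple IDS flags any source sending
# more than THRESHOLD probes within a sliding WINDOW; the learned evasion
# is to space probes with random delays so the rate stays under it.
WINDOW = 10.0        # seconds
THRESHOLD = 5        # probes per window before an alert fires

def ids_alerts(timestamps):
    """Count sliding-window threshold violations (naive IDS model)."""
    alerts = 0
    for i, t in enumerate(timestamps):
        recent = [s for s in timestamps[:i + 1] if t - s <= WINDOW]
        if len(recent) > THRESHOLD:
            alerts += 1
    return alerts

rng = random.Random(0)
burst = [i * 0.5 for i in range(20)]      # naive agent: 2 probes/second
jittered, t = [], 0.0
for _ in range(20):                       # evasive agent: 2.5-4 s gaps
    t += rng.uniform(2.5, 4.0)
    jittered.append(t)

print(ids_alerts(burst), ids_alerts(jittered))  # burst alerts, jittered has 0
```

With a minimum gap of 2.5 s, at most five probes ever fall inside a 10 s window, so the jittered scan never crosses the threshold; the burst scan trips it almost immediately.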

Training a single robust policy requires 50,000 to 200,000 episodes. In real time, at 30 seconds per episode (optimistic for a small network), that is roughly 17 to 69 days of continuous simulation. Distributed training on GPU clusters cuts this to days, but hyperparameter tuning remains an art.
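The wall-clock estimate is simple arithmetic under the assumptions stated above (serial episodes, 30 s each):

```python
# Back-of-envelope check (assumptions from the text): 50k-200k episodes
# at 30 s each, run serially in real time.
SECONDS_PER_EPISODE = 30
SECONDS_PER_DAY = 86_400

for episodes in (50_000, 200_000):
    days = episodes * SECONDS_PER_EPISODE / SECONDS_PER_DAY
    print(f"{episodes:>7} episodes -> {days:.0f} days")
# prints:  50000 episodes -> 17 days
#         200000 episodes -> 69 days
```

Running N environments in parallel divides the wall-clock time by roughly N, which is why distributed rollouts bring the figure down to days.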