A new AI-based malware experiment has raised serious concerns in the cybersecurity world. A researcher named Avery has developed a proof-of-concept (PoC) that uses artificial intelligence to create malware capable of bypassing Microsoft Defender for Endpoint, one of the most widely deployed endpoint security platforms in the world.

What makes this more alarming is that the PoC uses reinforcement learning (RL), a type of AI that learns by trial and error. The AI was trained to create malware samples and then test them inside a system running Microsoft Defender. If a sample managed to execute without being blocked or detected, the model received a reward. Over time, the model learned to generate increasingly stealthy malware.

To set this up, Avery used an open-source large language model called Qwen 2.5. The model was placed in a sandbox environment with Microsoft Defender installed, and every malware file it created was automatically tested. If a file got caught by Defender, the model would adjust and try again, improving with each round by avoiding the behaviors that triggered detection.
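The generate-test-adjust cycle can be pictured as a small loop. The sketch below is purely illustrative: the scanner, the "signatured" behaviors, and the substitution table are toy stand-ins invented for this example, not anything from the researcher's actual setup or from Defender itself.

```python
# Toy sketch of the generate-test-adjust loop described above.
# SIGNATURES and ALTERNATIVES are invented placeholders, not real
# detection signatures or evasion techniques.

SIGNATURES = {"raw_syscall", "known_packer"}        # behaviors the toy scanner flags
ALTERNATIVES = {"raw_syscall": "api_call",          # swaps the "model" can try instead
                "known_packer": "custom_encoder"}

def scan(behaviors: set) -> bool:
    """Stand-in for the sandboxed Defender check: True means detected."""
    return bool(behaviors & SIGNATURES)

def refine(behaviors: set, max_rounds: int = 10):
    """Each round, replace one flagged behavior and rescan,
    mimicking 'adjust and try again' until the sample evades."""
    behaviors = set(behaviors)
    for round_no in range(1, max_rounds + 1):
        if not scan(behaviors):
            return behaviors, round_no              # evaded: positive feedback
        flagged = sorted(behaviors & SIGNATURES)[0] # pick one detected behavior
        behaviors.remove(flagged)
        behaviors.add(ALTERNATIVES[flagged])        # detected: adjust and retry
    return behaviors, max_rounds
```

In this toy setup, a sample starting with two flagged behaviors converges to an undetected variant in three rounds; the real system replaces the lookup table with an LLM proposing new samples.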

What’s even more interesting is how the model received feedback. Defender’s cloud-based system scores each file it scans and assigns it a severity rating that indicates how risky the file is. The AI used these scores as feedback to guide its learning process, with the goal of lowering the severity rating, ideally to the point where the malware was flagged as low risk or not flagged at all.
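Conceptually, turning a severity rating into a learning signal is just a matter of inverting the scale. The function below assumes a hypothetical 0–100 severity score (the article does not specify Defender's actual scale, so this is an assumption for illustration); lower severity maps to a higher reward.

```python
def reward_from_severity(severity: float, max_severity: float = 100.0) -> float:
    """Map a hypothetical 0-100 severity rating to a 0-1 reward.
    The scale is an illustrative assumption; the point is that a
    lower severity (a stealthier sample) earns a higher reward."""
    severity = min(max(severity, 0.0), max_severity)  # clamp to the valid range
    return 1.0 - severity / max_severity
```

Under this scheme an unflagged file (severity 0) earns the maximum reward of 1.0, while a high-risk file earns almost nothing, which is exactly the gradient a training loop needs to steer the model toward stealthier output.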

After enough training cycles, the model was able to evade Microsoft Defender in about 8% of cases. That means if it created 100 files, roughly 8 of them would bypass detection completely. While 8% may not sound like much, it’s significantly better than other attempts: similar tests using models from Anthropic and DeepSeek achieved evasion rates of less than 1% and about 0.5%, respectively.

The researcher put the total cost at around $1,500 to $1,600, covering the compute resources for training the model and setting up the testing environment. That is a modest budget for such a capable result, which means attackers with limited resources could potentially recreate the process.

Although this is still just a PoC and not a live malware campaign, it shows how AI is becoming more capable of helping attackers. It’s also a wake-up call for security professionals and software vendors. If basic reinforcement learning can already outsmart Microsoft Defender in some cases, more advanced AI setups could be even more dangerous.

The experiment also raises questions about how antivirus systems like Defender will need to evolve. Traditional detection methods that rely on signatures or common behavior patterns may not be enough anymore. We may be entering a new phase of cybersecurity where threats are generated by machines, for machines, and they keep getting smarter over time.

It’s worth noting that Microsoft Defender still successfully blocked the vast majority of malware samples created during the experiment. This shows that it remains a strong tool overall. However, the fact that any AI-generated samples were able to sneak through highlights the need for new defensive strategies, like behavioral analysis, AI-assisted threat detection, and multi-layered security.

In conclusion, this proof-of-concept doesn’t mean AI malware is already a widespread threat, but it proves that it could become one soon. It also shows that even trusted security software like Microsoft Defender is not invincible in the age of AI. As AI tools become more affordable and available, both attackers and defenders will need to level up their strategies.

Stay alert, and keep your security measures updated!

Source: Follow cybersecurity88 on X and LinkedIn for the latest cybersecurity news