Computer vision can help detect cyber threats with surprising accuracy


This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence.

The surge of interest in deep learning over the past decade was triggered by the proven ability of neural networks to perform computer vision tasks. If you train a neural network on enough labeled photos of cats and dogs, it will find recurring patterns in each category and classify unseen images with decent accuracy.

What else can you do with an image classifier?

In 2019, a group of cybersecurity researchers wondered whether they could treat security threat detection as an image classification problem. Their intuition proved well-placed, and they created a machine learning model that detects malware from images generated from the contents of application files. A year later, the same technique was used to develop a machine learning system that detects phishing websites.

The combination of binary visualization and machine learning is a powerful technique that can provide new solutions to old problems. It shows promise in cybersecurity, but it could be applied to other areas as well.

Detect malware with deep learning

The traditional way to detect malware is to scan files for known signatures of malicious payloads. Malware detectors maintain a database of virus definitions that include opcode sequences or snippets of code, and they scan new files for the presence of those signatures. Unfortunately, malware developers can easily circumvent these detection methods with techniques such as obfuscating their code or using polymorphism to mutate it at runtime.
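To make the limitation concrete, here is a minimal sketch of signature-based scanning in Python. The signature names and byte patterns are invented for illustration and are not real virus definitions. The `xor_obfuscate` helper is a deliberately simple stand-in for the obfuscation techniques attackers use.

```python
from typing import Dict, List

# Hypothetical signature database: definition name -> byte pattern.
# These patterns are made up for the example, not real malware signatures.
SIGNATURES: Dict[str, bytes] = {
    "demo.dropper": bytes.fromhex("deadbeef"),
    "demo.keylogger": b"\x90\x90\xcc\xcc",
}


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures found anywhere in `data`."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def xor_obfuscate(data: bytes, key: int) -> bytes:
    """Toy obfuscation (an assumption for this sketch): XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


# A payload that embeds one of the known patterns.
payload = b"header" + bytes.fromhex("deadbeef") + b"tail"

detected = scan(payload)                       # the plain payload matches a signature
evaded = scan(xor_obfuscate(payload, 0x5A))    # the same payload, trivially re-encoded
```

In this sketch the plain payload is flagged, while the XOR-encoded copy no longer contains the byte pattern and passes unnoticed. That is the weakness the paragraph above describes.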

Dynamic analysis tools try to detect malicious behavior at runtime, but they are slow and require a sandbox environment in which to test suspicious programs.

In recent years, researchers have also tried a range of machine learning techniques to detect malware. These ML models have made progress on some of the challenges of malware detection, including code obfuscation. But they present new challenges of their own, including the need to learn too many features and the need for a virtual environment to analyze target samples.

Binary visualization can redefine malware detection by turning it into a computer vision problem. In this methodology, files are run through algorithms that transform binary and ASCII values into color codes.
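As a rough illustration of the idea, the Python sketch below maps each byte of a file to an RGB color and arranges the colors into rows of pixels. The color classes (black for 0x00, white for 0xFF, blue for printable ASCII, green for control characters, red for extended values) and the simple row-by-row layout are assumptions made for this example. Real binary visualization tools may use different palettes and layouts.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def byte_to_color(b: int) -> RGB:
    """Map one byte value (0-255) to a color class (illustrative palette)."""
    if b == 0x00:
        return (0, 0, 0)          # null byte -> black
    if b == 0xFF:
        return (255, 255, 255)    # 0xFF -> white
    if 0x20 <= b <= 0x7E:
        return (0, 0, 255)        # printable ASCII -> blue
    if b < 0x20 or b == 0x7F:
        return (0, 255, 0)        # control characters -> green
    return (255, 0, 0)            # extended values 0x80-0xFE -> red


def bytes_to_image(data: bytes, width: int = 4) -> List[List[RGB]]:
    """Lay the byte colors out row by row, padding the last row with black."""
    pixels = [byte_to_color(b) for b in data]
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([(0, 0, 0)] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Five sample bytes: two printable characters, a null, 0xFF, and an extended value.
image = bytes_to_image(b"MZ\x00\xff\x90", width=4)
```

The resulting grid of pixels can be saved as an image and passed to an ordinary image classifier, so the model learns from the visual texture of a file rather than from hand-picked features.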

In a paper published in 2019, researchers at the University of Plymouth and the University of the Peloponnese showed that when benign and malicious files are visualized using this method, new patterns emerge that separate malicious files from safe ones. These differences would have gone unnoticed using traditional malware detection methods.

[Figure: binary visualization of malware]