Microsoft and Intel recently partnered on a research project that takes a new approach to detecting and classifying malware. The project is called STAMINA (STAtic Malware-as-Image Network Analysis) and is based on a technique that converts malware samples into grayscale images and then scans each image for structural patterns characteristic of malware. The research team of Microsoft and Intel followed several steps. The first was to take an input file and convert its binary form into a stream of raw pixel data. The researchers then took this one-dimensional (1D) pixel stream and converted it into a 2D image, so that standard image analysis algorithms could analyze it. Finally, they resized the resulting image to make it smaller.
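The conversion steps described above can be illustrated with a short Python sketch. It is not the researchers' code: the function names, the fixed row width, the zero padding of the last row and the nearest-neighbor resizing are all assumptions made for illustration, since the article does not say how STAMINA chooses the image width or which resizing method it uses.

```python
from typing import List


def bytes_to_pixel_stream(data: bytes) -> List[int]:
    """Treat every byte (0-255) as one grayscale pixel intensity."""
    return list(data)


def reshape_to_2d(pixels: List[int], width: int) -> List[List[int]]:
    """Cut the 1D pixel stream into rows of `width` pixels.

    Assumption: the last row is padded with zeros (black pixels).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        row = row + [0] * (width - len(row))
        rows.append(row)
    return rows


def resize_nearest(image: List[List[int]], new_height: int,
                   new_width: int) -> List[List[int]]:
    """Resize with nearest-neighbor sampling (an illustrative choice)."""
    if not image or new_height <= 0 or new_width <= 0:
        raise ValueError("image and target size must be non-empty")
    old_height, old_width = len(image), len(image[0])
    return [
        [image[r * old_height // new_height][c * old_width // new_width]
         for c in range(new_width)]
        for r in range(new_height)
    ]


if __name__ == "__main__":
    sample = bytes(range(16))           # stand-in for a file's raw bytes
    stream = bytes_to_pixel_stream(sample)
    picture = reshape_to_2d(stream, 4)  # 4x4 grayscale image
    print(resize_nearest(picture, 2, 2))
```

In practice the bytes would come from reading a file in binary mode, and the resized image would be passed on to the classifier.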
The Microsoft and Intel team said that resizing the raw image did not negatively affect the classification result, noting that this was a necessary step so that the computing resources did not have to work with images consisting of billions of pixels, which would probably slow down processing. The images were then fed into a pre-trained deep neural network (DNN), which scanned the 2D representation of each sample and classified it as "clean" or "infected". Microsoft said it provided a sample of 2.2 million infected PE (Portable Executable) files as the basis for the research. In particular, the researchers used 60% of the known malware samples to train the original DNN algorithm, 20% of the files to validate the DNN and the other 20% for the actual testing process. The research team stated that STAMINA achieved 99.07% accuracy in identifying and classifying malware samples, with a false positive rate of 2.58%.
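As a rough illustration of the 60/20/20 split and of the two metrics quoted above, here is a minimal Python sketch. The function names, the seeded random shuffle and the label convention (1 for "infected", 0 for "clean") are assumptions; the article does not describe how the researchers partitioned the files or computed their figures.

```python
import random
from typing import List, Sequence, Tuple


def split_60_20_20(samples: Sequence, seed: int = 0) -> Tuple[list, list, list]:
    """Shuffle the samples and split them into train / validation / test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # assumption: random split
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains (about 20%)
    return train, val, test


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Share of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def false_positive_rate(y_true: List[int], y_pred: List[int]) -> float:
    """Share of clean samples (label 0) wrongly flagged as infected (1)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)


if __name__ == "__main__":
    train, val, test = split_60_20_20(range(10))
    print(len(train), len(val), len(test))
    print(accuracy([1, 1, 0, 0], [1, 0, 1, 0]))
    print(false_positive_rate([1, 1, 0, 0], [1, 0, 1, 0]))
```

The false positive rate is measured over clean files only, which is why a classifier can have high overall accuracy and still raise a noticeable number of false alarms.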
This research is part of Microsoft's recent efforts to improve malware detection using machine learning techniques. STAMINA used a technique called deep learning. It is essentially a subset of machine learning (ML), itself a branch of artificial intelligence (AI), and refers to intelligent computer networks that can learn on their own from input data stored in an unstructured or unlabeled format - in this case, an arbitrary malware binary. Microsoft said that while STAMINA was accurate and fast when working with smaller files, the same was not true of larger files. In particular, Microsoft noted that for larger applications STAMINA is less effective because of the limitations of converting billions of pixels into JPEG images and then resizing them. However, this is probably not a major drawback, as the project could be applied only to small files and still deliver excellent results. In an interview with ZDNet, Tanmay Ganacharya, Director of Security Research at Microsoft Threat Protection, said that Microsoft now relies heavily on machine learning to detect emerging threats, and that this system uses different machine learning modules deployed on customer systems or on Microsoft servers. At present, Microsoft can make this approach work better than other companies, mainly because of the huge amount of data it collects from its hundreds of millions of Windows Defender installations.