Sponsored by: National Institute of Standards and Technology
Project Type: SBIR Phase I
The virus detection systems used on desktop devices are bulky and must be updated frequently, which makes them poorly suited to handheld devices with limited storage and intermittent network connectivity. We investigated a different technique for detecting malicious executables, one based on machine learning. Such a malicious code detector can be built with a much smaller footprint while retaining some ability to detect novel attacks, which makes frequent updates less crucial. As an added benefit, updating a machine-learning-based detector entails less manual effort for the vendor. Our malicious-code detection technology can advance the state of the art for desktop computers as well as PDAs.
Existing antivirus solutions scale poorly to resource-constrained devices such as handheld computers and embedded systems. The processing and data-storage requirements of antivirus tools are too high for the tools to be fully implemented in these scaled-down environments. This project investigates the use of machine-learning techniques to develop an efficient and effective antivirus solution that can function with minimal resources. Specifically, we are exploring generalized virus signatures, in which a single "signature" detects a multitude of viruses. If successful, this approach may have the added advantage of detecting novel viruses that have not yet been included in an antivirus signature database.
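To make the idea of a generalized signature concrete, the following is a minimal sketch and not the project's actual method, which the description above does not specify. It assumes a deliberately simple form of generalization: given equal-length byte fragments from several variants, keep the bytes on which all variants agree and replace the rest with wildcards. The function names (`generalize`, `matches`) and the sample bytes are hypothetical.

```python
from typing import List, Optional, Sequence

# A signature is a list of byte values; None marks a wildcard position.
Signature = List[Optional[int]]


def generalize(samples: Sequence[bytes]) -> Signature:
    """Keep the bytes all samples share at a position; wildcard the rest."""
    if not samples:
        raise ValueError("need at least one sample")
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("samples must have equal length")
    signature: Signature = []
    for column in zip(*samples):  # one tuple of byte values per position
        first = column[0]
        signature.append(first if all(b == first for b in column) else None)
    return signature


def matches(signature: Signature, data: bytes) -> bool:
    """Return True if the signature occurs at any offset in data."""
    n = len(signature)
    for start in range(len(data) - n + 1):
        window = data[start:start + n]
        if all(s is None or s == b for s, b in zip(signature, window)):
            return True
    return False


# Hypothetical fragments standing in for three variants of one family.
variants = [
    b"\x90\xe8\x01\xcd\x21",
    b"\x90\xe8\x02\xcd\x21",
    b"\x90\xe8\x7f\xcd\x21",
]
sig = generalize(variants)
```

Here one stored signature covers all three variants and also matches an unseen variant that differs only in the wildcarded byte, which illustrates both the storage saving and the limited ability to catch novel viruses. A learned detector would generalize in richer ways than this positional wildcarding.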
Our work on this project can be divided into two areas: identifying malicious software and understanding malicious software. We are developing methods of locating malicious software that has attached itself to a benign host program (as is commonly the case with executable viruses and Trojan horses). These techniques enable an analyst to quickly distinguish regions of an application that need to be analyzed from those that are unimportant to the analysis process. The second year of this project focuses on leveraging domain knowledge of malicious software to improve automated program understanding capabilities. Combined, these approaches will greatly reduce the work required by an expert to locate and analyze malicious software in a laboratory environment.
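As an illustration of how regions of a program might be triaged for an analyst, the sketch below scores fixed-size windows of a file and reports the byte ranges whose score exceeds a threshold. The scoring function is an assumption chosen for simplicity: it uses Shannon byte entropy, a common heuristic for spotting packed or encrypted content, and is not the technique developed in this project. The names `entropy` and `flag_regions`, the window size, and the threshold are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flag_regions(
    data: bytes, window: int = 256, threshold: float = 7.0
) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges whose windows score at or above threshold.

    Adjacent flagged windows are merged into a single range.
    """
    regions: List[Tuple[int, int]] = []
    for start in range(0, len(data), window):
        block = data[start:start + window]
        if entropy(block) >= threshold:
            end = start + len(block)
            if regions and regions[-1][1] == start:
                regions[-1] = (regions[-1][0], end)  # extend previous range
            else:
                regions.append((start, end))
    return regions


# Synthetic input: uniform padding around a high-entropy stretch.
sample = bytes(512) + bytes(range(256)) * 2 + bytes(512)
flagged = flag_regions(sample)
```

In this synthetic input only the middle stretch is reported, so an analyst would start there and skip the padding. Any per-window score, including the output of a trained classifier, could be substituted for `entropy` without changing the surrounding logic.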