Most established industry malware research labs have grown organically over the course of years and, in the case of many founding members of the anti-virus field, over decades.
A lab usually started as a simple virus repository - a physically isolated space in which samples were stored in a secure fashion so that they could only be accessed, analysed and tested by skilled and authorized researchers. Over time, sample-collection systems such as honeypots, spam traps and web crawlers were added, along with systems for sharing samples between trusted vendors.
Eventually, these systems, together with the increase in activity by malware writers, raised sample volumes to a point where human researchers could not manually process all samples. This point marked the start of the age of automated analysis.
Soon, though, existing automated analysis systems were inundated with ever-increasing traffic volumes. The need for clustering, correlation and automated classification became clear. All this organic growth caused malware labs to become extremely complex, with systems that were interdependent and tied to existing technology used by each company's products.
Recently, we have seen an increase in the number of newcomers to the field of malware research, who each bring their own ideas on how malware problems should be tackled. These newcomers include incident response companies as well as the emergency response teams of big companies and government organizations, and they all need their own labs.
Unfortunately, it is not always clear how to evaluate the available options and begin building an integrated environment for threat collection, analysis, correlation, and incident tracking and management. There is a clear need for a process that can be followed to build a malware research lab from scratch.
Our paper will propose a simple process for building a fully functional malware research lab in a relatively short time. It will provide criteria for evaluating existing systems in each of the mandatory areas of a fully functional malware lab: collection, analysis, classification, protection, testing, sharing and integration.