Malicious documents are commonly attached to phishing emails or downloaded during web browsing. Such documents typically contain exploit code that triggers a vulnerability when the user opens the document, and the exploit code includes a shellcode that performs the malicious behavior. Many approaches to detecting malicious documents have been studied, including static analysis, dynamic analysis, and machine learning. Nonetheless, because attackers continue to create new types of shellcode to bypass existing detection methods, detecting malicious documents is an endless race with the attackers. For prevention rather than detection, content randomization can split a shellcode by randomizing the structure of the document. However, a shellcode smaller than 512 bytes may lie entirely within a single sector and therefore remain unsplit, a case that content randomization does not handle.
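The 512-byte limitation can be illustrated with a minimal sketch. It assumes that randomization reorders whole 512-byte sectors and that offsets are measured from a sector boundary. The function names are hypothetical and are not part of DeDocs or of any content randomization tool.

```python
SECTOR_SIZE = 512  # sector size in bytes, as stated in the text


def fits_in_one_sector(offset: int, length: int,
                       sector_size: int = SECTOR_SIZE) -> bool:
    """Return True if bytes [offset, offset + length) lie in a single sector.

    A payload that fits in one sector is not split when whole sectors
    are reordered (an assumption of this sketch).
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first_sector = offset // sector_size
    last_sector = (offset + length - 1) // sector_size
    return first_sector == last_sector


def unsplit_fraction(length: int, sector_size: int = SECTOR_SIZE) -> float:
    """Fraction of in-sector start offsets at which the payload stays whole."""
    if length <= 0:
        raise ValueError("length must be > 0")
    if length > sector_size:
        return 0.0  # a longer payload always crosses a sector boundary
    return (sector_size - length + 1) / sector_size


if __name__ == "__main__":
    for size in (64, 256, 512, 600):
        print(size, unsplit_fraction(size))
```

Under these assumptions, the smaller the shellcode, the more start offsets leave it intact inside one sector, which is the case the abstract identifies as unhandled by sector-level randomization.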
In this thesis, we propose DeDocs, a Moving Target Defense (MTD) approach that uses “content transformation” to prevent exploits in documents. The content transformation method consists of document resizing, dummy code insertion, and content reordering.
DeDocs not only blocks exploits smaller than 512 bytes but also causes no harm when applied to benign documents, because the dummy code is filled with document elements that are legitimate but have no effect. Thus, the dummy code is processed as a normal element of the document while also preventing exploits from executing. We collected a total of 7,786 malicious documents in .doc, .xls, and .ppt formats from Contagio, VirusTotal, and VirusSign, dated from 2014 to 2017. Of these 7,786 samples, 135 were used in the tests: those that not only included a shellcode but also had exploits that could be successfully reproduced. In the tests, DeDocs incurred a reasonable size overhead of less than 2% and prevented the exploit in 120 of the 135 malicious documents (88.9%). This prevention rate is 34.1 percentage points higher than the 54.8% achieved by content randomization.
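The reported rates can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above. The count of documents prevented by content randomization is not reported, so the value computed here is inferred from the 54.8% rate and should be read as an assumption.

```python
TOTAL_SAMPLES = 135          # tested malicious documents (from the text)
PREVENTED_DEDOCS = 120       # documents prevented by DeDocs (from the text)
RATE_RANDOMIZATION = 54.8    # percent, content randomization (from the text)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def implied_count(rate_percent: float, whole: int) -> int:
    """Nearest whole-number count consistent with a reported percentage."""
    return round(rate_percent / 100.0 * whole)


rate_dedocs = percent(PREVENTED_DEDOCS, TOTAL_SAMPLES)
# Inferred, not reported in the text: count behind the 54.8% figure.
inferred_randomization_count = implied_count(RATE_RANDOMIZATION, TOTAL_SAMPLES)
# Difference between two percentages, expressed in percentage points.
gap_points = round(round(rate_dedocs, 1) - RATE_RANDOMIZATION, 1)

if __name__ == "__main__":
    print(f"DeDocs prevention rate: {rate_dedocs:.1f}%")
    print(f"Inferred randomization count: {inferred_randomization_count}")
    print(f"Gap: {gap_points} percentage points")
```

The gap is a difference between two percentages, so it is expressed in percentage points rather than as a relative increase.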
DeDocs presents a more radical and innovative alternative to malicious document detection. It prevents exploit execution through content transformation without needing to detect malicious documents, so no additional defense technique is required even when a new type of shellcode appears. In addition, by operating on a smaller unit of the document file, DeDocs prevents the execution of small exploit code. In this thesis, we describe the principles of DeDocs, evaluate its defense effectiveness, and analyze the results.