
Analyzing Malicious Documents Cheat Sheet

This cheat sheet provides essential tips and tools for analyzing malicious documents, including Microsoft Office, RTF, and PDF files. It outlines a general approach to document analysis, useful commands for file analysis, and highlights risky tags in PDF formats. Additionally, it lists various tools for deobfuscation and analysis of embedded code and shellcode.



This cheat sheet outlines tips and tools for analyzing malicious documents, such as Microsoft Office, RTF and Adobe Acrobat (PDF) files. To print it, use the one-page PDF version; you can also edit the Word version to customize it for your own needs.

The SANS malware analysis course I’ve co-authored explains the techniques summarized in this cheat sheet and covers many other reverse-engineering topics. If you like this reference, take a look at my other IT and security cheat sheets.

General Approach to Document Analysis

1. Examine the document for anomalies, such as risky tags, scripts, or other anomalous aspects.
2. Locate embedded code, such as shellcode, VBA macros, JavaScript or other suspicious objects.
3. Extract suspicious code or object from the file.
4. If relevant, deobfuscate and examine JavaScript or macro code.
5. If relevant, disassemble and/or debug shellcode.
6. Understand the next steps in the infection chain.

Microsoft Office Format Notes

Binary document files supported by Microsoft Office use the OLE2 (a.k.a. Structured Storage) format.
SRP streams in OLE2 documents sometimes store a cached version of earlier macro code.
OOXML documents (.docx, .xlsm, etc.) supported by MS Office use zip compression to store contents.
Macros embedded in OOXML files are stored inside the OLE2 binary file, which is within the zip archive.
RTF documents don’t support macros, but can contain other files embedded as OLE1 objects.
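
The format notes above suggest a first triage step: check the leading bytes of a sample to see which container it really is, regardless of its file extension. The sketch below is a minimal illustration, not part of any tool listed in this cheat sheet. The function name `sniff_document_type` and the returned labels are hypothetical. It assumes the well-known signatures for OLE2 (`D0 CF 11 E0 A1 B1 1A E1`), zip (`PK\x03\x04`), RTF (`{\rt`) and PDF (`%PDF-`), and it assumes a lenient PDF check that looks anywhere in the first 1024 bytes.

```python
# Signature of an OLE2 (Structured Storage) compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_document_type(data: bytes) -> str:
    """Guess the container format from the first bytes of a file.

    The returned labels are illustrative only.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"          # legacy .doc/.xls/.ppt
    if data.startswith(b"PK\x03\x04"):
        return "zip"           # possibly OOXML (.docx, .xlsm, ...)
    if data.lstrip()[:4] == b"{\\rt":
        return "rtf"           # truncated headers such as {\rt are tolerated here
    if b"%PDF-" in data[:1024]:
        return "pdf"           # assumption: header may be preceded by junk
    return "unknown"
```

A sample named invoice.doc that reports "rtf" or "zip" is worth a closer look, since the extension and the container disagree.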

Useful MS Office File Analysis Commands

unzip file.pptx                                 Extract contents of OOXML file file.pptx.
olevba.py file.xlsm                             Locate and extract macros from file.xlsm or file.doc.
olevba.py file.doc
oledump.py file.xls                             List all OLE2 streams present in file.xls.
oledump.py -s 3 -v file.xls                     Extract macros stored inside stream 3 in file.xls.
oledump.py file.xls -p plugin_http_heuristics   Find obfuscated URLs in file.xls macros.
msoffice-crypt -d -p pass file.docm file2.docm  Decrypt OOXML file file.docm using password pass to create file2.docm.
pcodedmp.py -d file.doc                         Disassemble p-code macro code from file.doc.
rtfobj.py file.rtf                              Extract objects embedded into RTF-formatted file.rtf.
rtfdump.py file.rtf                             List groups and structure of RTF-formatted file.rtf.
rtfdump.py file.rtf -f O                        List groups in file.rtf that enclose an object.
rtfdump.py file.rtf -s 5 -H -d > out.bin        Extract object from group 5 and save it into out.bin.
pyxswf.py -xo file.doc                          Extract Flash (SWF) objects from OLE2 file file.doc.
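
When none of the tools above is at hand, the unzip step can be approximated with the Python standard library. The sketch below is a rough stand-in, not a replacement for olevba.py or oledump.py: it only lists the members of an OOXML archive that begin with the OLE2 signature, which is where macro code would be stored. The function name `find_ole2_parts` is hypothetical, and the part name `word/vbaProject.bin` in the usage note is the conventional location in Word files, stated here as an assumption.

```python
import io
import zipfile

# Signature of an OLE2 (Structured Storage) compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_ole2_parts(ooxml_bytes: bytes) -> list:
    """Return names of zip members whose content starts with the OLE2 signature."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(ooxml_bytes)) as archive:
        for info in archive.infolist():
            # Read only the first bytes of each member; nothing is written to disk.
            with archive.open(info) as member:
                if member.read(len(OLE2_MAGIC)) == OLE2_MAGIC:
                    hits.append(info.filename)
    return hits
```

For a macro-enabled Word document this would typically report something like word/vbaProject.bin, which can then be handed to oledump.py or olevba.py for stream-level analysis.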

Risky PDF Format Tags


/OpenAction and /AA specify the script or action to run automatically.
/JavaScript and /JS specify JavaScript to run.
/GoTo changes the view to a specified destination within the PDF or in another PDF file.
/Launch can launch a program or open a document.
/URI accesses a resource by its URL.
/SubmitForm and /GoToR can send data to URL.
/RichMedia can be used to embed Flash in a PDF.
/ObjStm can hide objects inside an Object Stream.
Be mindful of obfuscation with hex codes, such as /JavaScript vs. /J#61vaScript. (See examples.)
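
The hex-code obfuscation mentioned above can be undone before searching for risky tags. The sketch below is a simplified illustration and not how pdfid.py is implemented: it decodes `#xx` escapes in PDF name tokens and counts the tags listed in this section. The names `decode_name`, `count_risky_names` and `RISKY_NAMES` are hypothetical, and the scan works on raw bytes only, so names hidden inside compressed streams (for example within /ObjStm) are not seen.

```python
import re

# Tags taken from the list in this section.
RISKY_NAMES = {
    b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/GoTo", b"/Launch",
    b"/URI", b"/SubmitForm", b"/GoToR", b"/RichMedia", b"/ObjStm",
}

# A name token: a slash followed by characters that are neither
# PDF whitespace nor PDF delimiters.
_NAME = re.compile(rb"/[^\x00\t\n\x0c\r ()<>\[\]{}/%]+")
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(name: bytes) -> bytes:
    """Replace each #xx escape with the byte it encodes."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), name)


def count_risky_names(data: bytes) -> dict:
    """Count risky names in raw PDF bytes after undoing hex obfuscation."""
    counts = {}
    for match in _NAME.finditer(data):
        name = decode_name(match.group(0))
        if name in RISKY_NAMES:
            key = name.decode("ascii")
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Matching whole name tokens rather than substrings keeps /JS from being counted inside /JavaScript, and /GoTo from being counted inside /GoToR.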

Useful PDF File Analysis Commands


pdfid.py file.pdf                                      Scan file.pdf for risky keywords and dictionary entries.
peepdf.py -fl file.pdf                                 Examine file.pdf for risky tags and malformed objects.
pdf-parser.py --object id file.pdf                     Display contents of object id in file.pdf. Add “--filter --raw” to decode the object’s stream.
qpdf --password=pass --decrypt infile.pdf outfile.pdf  Decrypt infile.pdf using password pass to create outfile.pdf.
swf_mastah.py -f file.pdf -o out                       Extract Flash (SWF) objects from file.pdf into the out directory.

Shellcode and Other Analysis Commands


xorsearch -W -d 3 file.bin                        Locate shellcode patterns inside the binary file file.bin.
scdbg file.bin /foff 0x2B                         Emulate execution of shellcode in file.bin starting at offset 0x2B.
shellcode2exe file.bin                            Generate PE executable file.exe that runs shellcode from file.bin.
jmp2it file.bin 0x2B                              Execute shellcode in file file.bin starting at offset 0x2B.
base64dump.py file.txt                            List Base64-encoded strings present in file file.txt.
base64dump.py file.txt -e bu -s 2 -d > file.bin   Convert backslash Unicode-encoded Base64 string #2 from file.txt as file.bin file.
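
The idea behind listing Base64-encoded strings can be sketched with the standard library. This is a simplified approximation and not the logic of base64dump.py: it finds runs of Base64 characters, keeps those that decode cleanly, and returns each offset with its decoded bytes. The function name `find_base64_strings` and the 16-character minimum are assumptions chosen for illustration. Runs that touch adjacent alphanumeric text may be skipped because their length stops being a multiple of four.

```python
import base64
import binascii
import re

# Runs of at least 16 Base64 alphabet characters, with optional padding.
_B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_strings(data: bytes) -> list:
    """Return (offset, decoded_bytes) for each run that decodes as Base64."""
    results = []
    for match in _B64_RUN.finditer(data):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue  # not a complete Base64 string
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue
        results.append((match.start(), decoded))
    return results
```

Decoded output that starts with bytes such as MZ or contains readable URLs is a good candidate for saving to disk and examining with the shellcode tools listed above.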

Additional Document Analysis Tools


SpiderMonkey, V8 and box-js help deobfuscate JavaScript that you extract from document files.
PDF Stream Dumper combines several PDF analysis utilities under a single graphical user interface.
ViperMonkey emulates VBA macro execution.
VirusTotal and some automated analysis sandboxes can analyze aspects of malicious document files.
Hachoir-urwid can display OLE2 stream contents.
010 Editor (commercial) and FileInsight hex editors can parse and edit OLE structures.
ExeFilter can filter scripts from Office and PDF files.
REMnux distro includes many of the free document analysis tools mentioned above.

Post-Scriptum
Special thanks for feedback to Pedro Bueno and Didier Stevens. If you have suggestions for
improving this cheat sheet, please let me know. Creative Commons v3 “Attribution” License for
this cheat sheet version 3.0.

Updated September 6, 2017


About the Author

Lenny Zeltser develops teams, products, and programs that use information security to achieve business results. Over the past two decades, Lenny has been leading efforts to establish resilient security practices and solve hard security problems. As a respected author and speaker, he has been advancing cybersecurity tradecraft and contributing to the community. His insights build upon 20 years of real-world experiences, a Computer Science degree from the University of Pennsylvania, and an MBA degree from MIT Sloan.


Copyright © 1995-2019 Lenny Zeltser. All rights reserved.
