Static malware analysis examines a suspicious file without executing it. Because the sample never runs, this is the safest form of malware analysis: the only route to infection or network communication is accidental execution, which the environment setup below is meant to prevent. The goal is to extract as much information as possible from the binary's structure, content, and code before committing to dynamic analysis in a sandbox. For incident responders and threat intelligence analysts, the ability to rapidly triage and characterize malware samples is essential. This article walks through the static analysis methodology used by professional malware analysts.
Safety First: Analysis Environment
# Disable network on analysis VM (or use host-only adapter)
# Take a clean snapshot before any analysis
# Work with copies, never originals — preserve chain of custody
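# REMnux setup (legacy installer script; current REMnux releases are installed
# with the remnux CLI, see docs.remnux.org). Review any script before piping it to a root shell.
curl -sSL https://remnux.org/get-remnux.sh | sudo bash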
# Calculate initial hashes for reference
md5sum malware.exe
sha1sum malware.exe
sha256sum malware.exe
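The three commands above each read the sample separately. As a minimal sketch, assuming Python 3.8+ is available in the analysis VM, the same digests can be computed in one pass over the file. `hash_stream` is a hypothetical helper name, not part of any tool mentioned here.

```python
import hashlib
import io

def hash_stream(stream, chunk_size=1 << 20):
    """Return MD5, SHA-1 and SHA-256 hex digests after reading the stream once."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    while chunk := stream.read(chunk_size):
        for digest in digests.values():
            digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}

# Demo on in-memory bytes so nothing touches a real sample.
demo = hash_stream(io.BytesIO(b"abc"))
for name, value in demo.items():
    print(f"{name}: {value}")
```

For a real sample, open the file in binary mode and pass the handle, for example `with open("malware.exe", "rb") as fh: hash_stream(fh)`.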
Phase 1: Initial Triage and Hash Analysis
File Identification
# Identify true file type (ignore extension)
file malware.exe
# Output: PE32 executable (GUI) Intel 80386, for MS Windows
file suspicious.pdf
# May reveal: PE32 executable or HTML or actual PDF
# Calculate cryptographic hashes
md5sum malware.exe # Fast hash for deduplication
sha256sum malware.exe # Strong hash for definitive identification
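# Example output format (placeholder values: these are the well-known digests
# of an empty file, so seeing them for a real sample means the file is 0 bytes):
# MD5: d41d8cd98f00b204e9800998ecf8427e
# SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855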
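The `file` utility identifies a type from the content, not the extension. The sketch below illustrates that idea with a deliberately tiny, hypothetical signature table. The real `file` database is far larger, and `identify` and `extension_mismatch` are names invented for this example.

```python
import os

# Hypothetical, deliberately small table: (magic bytes, label, usual extensions)
SIGNATURES = [
    (b"MZ", "DOS/PE executable", {".exe", ".dll", ".sys", ".ocx"}),
    (b"\x7fELF", "ELF executable", set()),
    (b"%PDF-", "PDF document", {".pdf"}),
    (b"PK\x03\x04", "ZIP container", {".zip", ".docx", ".xlsx", ".jar"}),
]

def identify(header: bytes):
    """Return (label, usual extensions) for the first matching magic value."""
    for magic, label, extensions in SIGNATURES:
        if header.startswith(magic):
            return label, extensions
    return "unknown", set()

def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when the content type is known and the extension does not fit it."""
    label, extensions = identify(header)
    if label == "unknown" or not extensions:
        return False
    return os.path.splitext(filename)[1].lower() not in extensions

# A PE header hiding behind a .pdf extension, as in the example above.
print(identify(b"MZ\x90\x00")[0], extension_mismatch("suspicious.pdf", b"MZ\x90\x00"))
```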
VirusTotal Analysis
# Search by hash (never upload sensitive/classified samples to VT)
# https://www.virustotal.com/gui/file/SHA256HASH
# CLI with vt-cli
vt file SHA256HASH
vt file --format json SHA256HASH | jq '.last_analysis_stats'
# Key fields to review:
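# - Detection ratio (X of roughly 70 engines; the total varies by file type)
# - First submission date (an old date suggests a known sample; a very recent one may indicate a new or targeted build)
# - File names (reveals attacker-used filenames)
# - Behavioral summaries
# - Network indicators
# - Cross-check other services: MalwareBazaar (sample repository), Hybrid-Analysis and ANY.RUN (sandboxes)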
# MalwareBazaar lookup
curl -s "https://mb-api.abuse.ch/api/v1/" -d "query=get_info&hash=SHA256" | jq .
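The `jq` filter above returns VirusTotal's `last_analysis_stats` object. The sketch below turns it into the familiar detection ratio, assuming the field names used by the v3 API (`malicious`, `suspicious`, `undetected`, and so on). The numbers are invented for illustration and `detection_ratio` is a hypothetical helper.

```python
# Verdict buckets for engines that never produced a usable result.
NON_VERDICTS = {"type-unsupported", "timeout", "confirmed-timeout", "failure"}

def detection_ratio(stats: dict):
    """Return (engines that flagged the file, engines that produced a verdict)."""
    flagged = stats.get("malicious", 0) + stats.get("suspicious", 0)
    total = sum(count for key, count in stats.items() if key not in NON_VERDICTS)
    return flagged, total

# Invented example values shaped like a last_analysis_stats object.
example_stats = {
    "malicious": 50,
    "suspicious": 2,
    "undetected": 18,
    "harmless": 0,
    "timeout": 1,
    "type-unsupported": 4,
    "failure": 0,
}
flagged, total = detection_ratio(example_stats)
print(f"{flagged}/{total} engines flagged the sample")
```

Excluding engines that timed out or do not support the file type keeps the denominator honest. Whether to count `suspicious` as a detection is a judgment call.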
Phase 2: PE File Format Analysis
Windows executables (EXE, DLL, SYS, OCX) follow the Portable Executable (PE) format. Understanding PE structure is fundamental to Windows malware analysis.
PE Structure Overview
| Component | Description | Security Relevance |
|---|---|---|
| DOS Header | Starts with the MZ signature (bytes 4D 5A) | Malformed or unusual headers can indicate packing or tampering |
| PE Header | Machine type, timestamp, characteristics | Compile timestamp may be spoofed |
| Optional Header | Entry point, image base, subsystem | Entry point outside the first code section (for example, in the last section) suggests packing |
| Section Table | List of .text, .data, .rsrc sections | High entropy = packed/encrypted |
| Import Table (IAT) | DLLs and functions the binary imports | Most revealing static indicator |
| Export Table | Functions exported by DLLs | Presence in EXE is unusual |
| Resources (.rsrc) | Icons, strings, embedded files | Can hide packed malware |
| Overlay | Data appended after the PE structure | Common technique to append payloads |
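To make the first rows of the table concrete, here is a minimal sketch that walks from the DOS header to the COFF file header using only the standard library. It parses a synthetic header built in memory. `parse_pe_header` and `build_demo_header` are hypothetical names, and a real workflow would normally rely on a full parser such as pefile.

```python
import struct
from datetime import datetime, timezone

IMAGE_FILE_DLL = 0x2000  # Characteristics flag set for DLLs

def parse_pe_header(data: bytes) -> dict:
    """Read the DOS header, PE signature and COFF file header from raw bytes."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the 'PE\0\0' signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes in total).
    machine, sections, timestamp, _, _, opt_size, characteristics = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4
    )
    return {
        "machine": machine,
        "sections": sections,
        "compiled": datetime.fromtimestamp(timestamp, timezone.utc),
        "optional_header_size": opt_size,
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }

def build_demo_header() -> bytes:
    """Synthetic 32-bit header used only to exercise the parser."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x014C, 3, 1700000000, 0, 0, 0xE0, 0x0102)
    return bytes(dos) + b"PE\x00\x00" + coff

info = parse_pe_header(build_demo_header())
print(hex(info["machine"]), info["sections"], info["compiled"].isoformat(), info["is_dll"])
```

The timestamp is attacker-controlled, so treat the decoded compile time as a hint rather than evidence.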
PE Analysis with pefile (Python)
# Install the dependency first: pip install pefile
import datetime
import pefile
pe = pefile.PE('malware.exe')
# Compile timestamp (attacker-controlled, so treat it as a hint only)
ts = pe.FILE_HEADER.TimeDateStamp
print("Compile time:", datetime.datetime.fromtimestamp(ts, datetime.timezone.utc))
# Check if packed (high section entropy)
for section in pe.sections:
    # Section names are NUL-padded to 8 bytes and may contain non-UTF-8 bytes
    name = section.Name.rstrip(b'\x00').decode(errors='replace')
    print(f"Section: {name}")
    print(f"  Entropy: {section.get_entropy():.2f}")  # > 7.0 suggests packing
    print(f"  Virtual Size: {section.Misc_VirtualSize}")
    print(f"  Raw Size: {section.SizeOfRawData}")
# Import table
if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(f"\nDLL: {entry.dll.decode()}")
        for imp in entry.imports:
            if imp.name:  # imports by ordinal have no name
                print(f"  {imp.name.decode()}")
pecheck / Detect-It-Easy (DIE)
# Detect-It-Easy — identifies packers, compilers, linkers
die malware.exe
# Output:
# PE32
# Packer: UPX(3.91)[NRV,brute]
# Compiler: Microsoft Visual C/C++(-)[-]
# pecheck.py
python pecheck.py malware.exe
# exiftool for PE metadata
exiftool malware.exe
Phase 3: Strings Extraction
Strings analysis is often the highest-value, lowest-effort static analysis technique. Even packed malware frequently leaves revealing strings in the unpacking stub, which must remain readable for the loader to run.
# Basic strings (GNU strings)
strings malware.exe
strings -n 8 malware.exe # minimum length 8 characters
strings -el malware.exe # wide (Unicode) strings
strings -a malware.exe # scan entire file
# Both ASCII and Unicode
strings -a malware.exe > strings_ascii.txt
strings -a -el malware.exe > strings_unicode.txt
# Sort and deduplicate
strings -a malware.exe | sort -u
# FLOSS: extracts obfuscated and stack strings that regular strings misses
floss malware.exe
# Skip plain static strings (FLOSS 2.x syntax; 1.x used --no-static-strings)
floss --no static -- malware.exe
# Look for high-value indicators (single quotes keep the backslashes intact for grep):
strings malware.exe | grep -iE 'http|ftp|\.exe|\.dll|cmd\.exe|powershell|regsvr32'
strings malware.exe | grep -iE 'HKEY_|SOFTWARE\\|CurrentVersion'
strings malware.exe | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
strings malware.exe | grep -iE 'CreateRemoteThread|VirtualAlloc|WriteProcessMemory'
strings malware.exe | grep -iE '\\pipe\\|\\mailslot\\'
Key String Categories for Malware
| Category | Examples | Malware Use |
|---|---|---|
| Network IOCs | IPs, domains, URLs | C2 infrastructure |
| Registry paths | HKCU\Software\Microsoft\Windows\CurrentVersion\Run | Persistence |
| File paths | %TEMP%\payload.exe | Drop location |
| API names | CreateRemoteThread | Process injection |
| Crypto constants and embedded payloads | AES S-box bytes, hard-coded RC4 key, embedded MZ header | Payload decryption and staging |
| Mutex names | Global\a1b2c3d4 | Single-instance check (avoids re-infection) |
| Error messages | Internal debug strings | Developer language/origin |
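The categories in the table can be approximated in a few lines of Python. This is a rough sketch, not a replacement for `strings` or FLOSS: it extracts printable ASCII and UTF-16LE runs from a byte buffer and buckets them with simple, hypothetical regular expressions. The sample buffer is fabricated and uses a documentation-range IP address.

```python
import re

def extract_strings(data: bytes, min_len: int = 6):
    """Return printable ASCII runs followed by UTF-16LE ('wide') runs."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found

# Hypothetical patterns, one per indicator category.
CATEGORY_PATTERNS = {
    "url": re.compile(r"https?://[^\s\"']+", re.I),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "registry": re.compile(r"\b(?:HKEY_[A-Z_]+|HKLM|HKCU)\\", re.I),
    "file_path": re.compile(r"(?:[A-Za-z]:\\|%[A-Za-z]+%\\)"),
    "mutex": re.compile(r"\b(?:Global|Local)\\"),
}

def categorize(strings):
    """Map each category name to the strings that match its pattern."""
    hits = {name: [] for name in CATEGORY_PATTERNS}
    for s in strings:
        for name, pattern in CATEGORY_PATTERNS.items():
            if pattern.search(s):
                hits[name].append(s)
    return hits

# Fabricated buffer standing in for a binary: one ASCII URL, one wide registry path, one mutex.
blob = (
    b"\x00\x01http://198.51.100.7/gate.php\x00\x01"
    + "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run".encode("utf-16-le")
    + b"\x00\x00\x01Global\\a1b2c3d4\x00"
)
hits = categorize(extract_strings(blob))
for category, values in hits.items():
    print(category, values)
```

A string can land in several buckets (the URL above also matches the IPv4 pattern), which is usually what an analyst wants during triage.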
Phase 4: Import Table Analysis
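The import table lists the DLLs and Windows API functions the loader resolves for the binary; at load time those addresses populate the Import Address Table (IAT). On its own it often characterizes the malware's capabilities without running it, with one caveat: packed samples, and malware that resolves APIs dynamically through LoadLibrary and GetProcAddress, expose only a handful of imports.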
# Using dumpbin (Windows SDK)
dumpbin /imports malware.exe
# Using objdump (Linux/cross-platform)
objdump -p malware.exe | grep -A 1000 "DLL Name"
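# Python pefile
import pefile
pe = pefile.PE('malware.exe')
# DIRECTORY_ENTRY_IMPORT is absent when the file has no import directory
for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):
    print(entry.dll.decode())
    for func in entry.imports:
        if func.name:  # imports by ordinal have no name
            print(f"  {func.name.decode()}")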
Suspicious API Functions by Category
| Capability | Key APIs |
|---|---|
| Process Injection | VirtualAllocEx, WriteProcessMemory, CreateRemoteThread, NtCreateThreadEx |
| Persistence | RegCreateKeyEx, RegSetValueEx, CreateService, ITaskService (Task Scheduler COM interface) |
| Network | WSAStartup, connect, InternetOpen, HttpOpenRequest, URLDownloadToFile |
| Keylogging | SetWindowsHookEx, GetAsyncKeyState, GetForegroundWindow |
| Credential Theft | CryptAcquireContext, CryptDecrypt, CryptUnprotectData, CredEnumerate |
| Anti-Analysis | IsDebuggerPresent, CheckRemoteDebuggerPresent, GetTickCount, Sleep |
| File Operations | CreateFile, ReadFile, WriteFile, DeleteFile, CopyFile |
| Process/Memory | OpenProcess, VirtualAlloc, HeapAlloc, NtUnmapViewOfSection |
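One way to use the table is to map a sample's import list to coarse capability tags. The sketch below assumes the import names have already been extracted (for example with pefile) and covers only part of the table. `CAPABILITY_APIS` and `capabilities` are hypothetical names, and the import list is invented.

```python
# Subset of the table above, keyed by capability.
CAPABILITY_APIS = {
    "process_injection": {"VirtualAllocEx", "WriteProcessMemory",
                          "CreateRemoteThread", "NtCreateThreadEx"},
    "persistence": {"RegCreateKeyEx", "RegSetValueEx", "CreateService"},
    "network": {"WSAStartup", "connect", "InternetOpen",
                "HttpOpenRequest", "URLDownloadToFile"},
    "keylogging": {"SetWindowsHookEx", "GetAsyncKeyState", "GetForegroundWindow"},
    "anti_analysis": {"IsDebuggerPresent", "CheckRemoteDebuggerPresent",
                      "GetTickCount", "Sleep"},
}

def capabilities(imports):
    """Return {capability: set of imported names that suggest it}."""
    found = {}
    for name in imports:
        # Many APIs are exported with an A (ANSI) or W (Unicode) suffix.
        candidates = {name}
        if name.endswith(("A", "W")):
            candidates.add(name[:-1])
        for capability, apis in CAPABILITY_APIS.items():
            if candidates & apis:
                found.setdefault(capability, set()).add(name)
    return found

# Invented import list for illustration.
sample_imports = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread",
                  "RegSetValueExA", "Sleep", "GetProcAddress"]
for capability, names in sorted(capabilities(sample_imports).items()):
    print(f"{capability}: {', '.join(sorted(names))}")
```

A single match such as `Sleep` is weak evidence by itself. The combination of several APIs from one row is what makes a capability plausible.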
Phase 5: Entropy Analysis
Shannon entropy is measured here in bits per byte and ranges from 0.0 to 8.0. Values close to 8.0 in a PE section indicate compression or encryption, a hallmark of packing and obfuscation.
# Calculate entropy with Python
import math
import pefile
def calculate_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    freq = {}
    for byte in data:
        freq[byte] = freq.get(byte, 0) + 1
    entropy = 0.0
    length = len(data)
    for count in freq.values():
        p = count / length
        entropy -= p * math.log2(p)
    return entropy
pe = pefile.PE('malware.exe')
for section in pe.sections:
    data = section.get_data()
    entropy = calculate_entropy(data)
    # Section names are NUL-padded, so strip the padding before printing
    name = section.Name.rstrip(b'\x00').decode(errors='replace')
    print(f"{name:10s} entropy: {entropy:.4f}")
# Interpretation (rules of thumb, not hard thresholds):
# 0.0 - 1.0: Mostly null bytes (sparse section)
# 4.0 - 6.5: Normal code/data
# 7.0 - 8.0: Packed/encrypted/compressed, suspicious
# ~7.9: Very high, consistent with strong encryption or random data
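The quantity being computed is $H = -\sum_i p_i \log_2 p_i$, where $p_i$ is the relative frequency of byte value $i$. A per-section average can hide a small encrypted blob, so a common refinement is to compute entropy over fixed-size windows. The sketch below is self-contained and runs on synthetic data; the function names and the window size of 256 bytes are arbitrary choices made for this example.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())

def windowed_entropy(data: bytes, window: int = 256):
    """Return (offset, entropy) for consecutive non-overlapping windows."""
    return [(offset, shannon_entropy(data[offset:offset + window]))
            for offset in range(0, len(data), window)]

# Synthetic buffer: padding, a two-symbol region, then every byte value equally often.
sample = b"\x00" * 512 + b"ab" * 128 + bytes(range(256)) * 2
for offset, entropy in windowed_entropy(sample):
    print(f"0x{offset:04x}  {entropy:.2f}")
```

The null padding scores 0.0, the two-symbol region scores exactly 1.0 (two equally likely values), and the uniform region scores the maximum of 8.0. That is the jump an analyst looks for when locating an embedded payload.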
Phase 6: YARA Rules
YARA is the malware analyst's pattern matching engine. YARA rules match patterns in files to identify malware families, detect variants, and hunt across large file collections.
YARA Rule Syntax
rule ExampleMalware {
meta:
description = "Detects Example RAT v2.x"
author = "Analyst Name"
date = "2024-01-15"
hash = "e3b0c44298fc1c149afbf4c8996fb924..."
family = "ExampleRAT"
strings:
// String literals
$s1 = "ExampleRAT Control Panel"
$s2 = "C:\\Users\\Admin\\Desktop\\rat.pdb"
// Case insensitive
$s3 = "keylogger" nocase
// Wide (Unicode)
$s4 = "malicious.exe" wide
// Both ASCII and Unicode
$s5 = "backdoor" wide ascii
// Regular expression
$re1 = /https?:\/\/[a-z0-9]{8,20}\.xyz\/[a-z]{4,8}/
// Hex pattern with wildcards
$hex1 = { E8 ?? ?? ?? ?? 83 C4 04 85 C0 74 }
// Hex with jumps
$hex2 = { FC E8 [2-4] 00 00 00 60 }
condition:
// All conditions must be true
uint16(0) == 0x5A4D // MZ magic bytes (PE file)
and filesize < 2MB
// Every declared string must be referenced, or YARA rejects the rule
and (2 of ($s*) or $re1 or any of ($hex*))
}
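To build intuition for how hex strings with `??` wildcards and `[n-m]` jumps match, the sketch below translates a simple hex pattern into a Python bytes regex. It illustrates the matching semantics only, not how YARA is implemented. It ignores nibble wildcards such as `E?` and alternations, and `hex_pattern_to_regex` is a hypothetical helper.

```python
import re

def hex_pattern_to_regex(pattern: str):
    """Translate a YARA-style hex string (bytes, ??, [n] or [n-m]) into a bytes regex."""
    parts = []
    for token in pattern.strip("{} ").split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        elif token.startswith("["):
            low, _, high = token.strip("[]").partition("-")
            # jump: between low and high arbitrary bytes
            parts.append(b".{%s,%s}" % (low.encode(), (high or low).encode()))
        else:
            parts.append(re.escape(bytes([int(token, 16)])))  # literal byte
    # DOTALL so that '.' also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)

# The two hex strings from the rule above, tested against fabricated byte sequences.
call_pattern = hex_pattern_to_regex("{ E8 ?? ?? ?? ?? 83 C4 04 85 C0 74 }")
jump_pattern = hex_pattern_to_regex("{ FC E8 [2-4] 00 00 00 60 }")

code = b"\x90\xe8\x10\x20\x30\x40\x83\xc4\x04\x85\xc0\x74\x05"
print(bool(call_pattern.search(code)))
print(bool(jump_pattern.search(b"\xfc\xe8\x82\x00\x00\x00\x00\x60")))
```

The four `??` tokens absorb the 32-bit relative offset of the `call` (opcode `E8`), which changes between builds. The fixed bytes around it stay stable, which is why the pattern survives recompilation.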
Real-World YARA Rules
// Detect Meterpreter shellcode pattern
rule Meterpreter_Shellcode {
meta:
description = "Detects common Meterpreter shellcode patterns"
strings:
$a = { FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 }
$b = "metsrv.x86.dll" wide ascii
$c = "ReflectiveLoader" ascii
condition:
any of them
}
// Detect Cobalt Strike beacon
rule CobaltStrike_Beacon {
meta:
description = "Cobalt Strike Beacon config pattern"
strings:
$a = { 69 68 69 68 69 6B } // config header bytes 00 01 00 01 00 02 XORed with 0x69 (the key commonly reported for Beacon 3.x)
$b = ".cobaltstrike.artifact" ascii
$c = { 00 00 00 00 00 00 00 00 ?? 00 01 00 ?? 00 00 00 ?? ?? 00 00 00 00 00 00 }
condition:
$b or ($a and $c)
}
// Generic suspicious PowerShell downloader
rule Suspicious_PowerShell_Download {
meta:
description = "Detects common PowerShell download cradle patterns"
strings:
$ps1 = "DownloadString" nocase wide ascii
$ps2 = "DownloadFile" nocase wide ascii
$ps3 = "IEX" nocase wide ascii
$ps4 = "Invoke-Expression" nocase wide ascii
$ps5 = "Net.WebClient" nocase wide ascii
$ps6 = "-EncodedCommand" nocase wide ascii
$ps7 = "FromBase64String" nocase wide ascii
condition:
3 of ($ps*)
}
Running YARA
# Single file scan
yara rules.yar malware.exe
# Directory scan
yara -r rules.yar /samples/
# Multiple rule files
yara rule1.yar rule2.yar malware.exe
# Print all matching strings
yara -s rules.yar malware.exe
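# Scan a running process: pass the PID as the target
# (-p sets the number of scanning threads, not a PID)
yara rules.yar 1234
# YARA has no switch to scan every process; loop over PIDs instead (Linux example)
for pid in $(ps -e -o pid=); do yara rules.yar "$pid" 2>/dev/null; done
# Compile rules for faster loading, then scan with -C
yarac rules.yar compiled.rules
yara -C compiled.rules malware.exe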
Phase 7: Ghidra Reverse Engineering Basics
Ghidra is NSA's open-source reverse engineering suite. It disassembles and decompiles binaries into readable pseudo-C code, enabling deep static analysis of malware logic.
# Install Ghidra
# Download from https://ghidra-sre.org
# Requires Java JDK 17+ (JDK 21 for Ghidra 11.2 and later; check the release notes for your version)
# Launch Ghidra
./ghidraRun # Linux/macOS
ghidraRun.bat # Windows
# Key workflow:
# 1. File > New Project > Non-Shared > Create Project
# 2. File > Import File > malware.exe
# 3. Double-click the imported file to open it in CodeBrowser
# 4. Click Yes to auto-analyze
# 5. Accept default analysis options
# Key Ghidra windows:
# Program Trees: Section layout
# Symbol Tree: Functions, Labels, Imports
# Listing: Disassembly view
# Decompiler: Pseudo-C view (right side)
# References: Cross-references (Ctrl+Shift+F or right-click > References)
Finding Key Functions in Ghidra
# From the CodeBrowser menus:
# Find all calls to a specific Windows API
# Symbol Tree > Imports > KERNEL32.DLL > VirtualAlloc, then right-click > Show References to
# Find all strings
# Window > Defined Strings
# Navigate to entry point
# Symbol Tree > Functions > entry
# Search for suspicious patterns
# Search > For Instruction Patterns
# Search > Memory
# Rename variables and functions as you understand them
# Press 'L' to rename, ';' to add a comment
Phase 8: Obfuscation Detection and Unpacking
# Common packers and their signatures (DIE detects these)
# UPX: "UPX0", "UPX1" section names, easy to unpack
upx -d packed.exe -o unpacked.exe
# MPRESS, Themida, VMProtect, ASPack require dynamic unpacking
# General approach:
# 1. Load in debugger (x64dbg/OllyDbg)
# 2. Set hardware breakpoint on memory write to code section
# 3. Run until unpacked code executes
# 4. Dump process memory with Scylla (x64dbg plugin)
# Detect encoding/obfuscation patterns
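# XOR obfuscation: tight loops that XOR a buffer with a constant; with a repeating key, the key itself shows up wherever the plaintext had runs of null bytes
# RC4: loops that initialize a 256-byte S-box with the values 0x00-0xFF
# Base64: the 64-character alphabet (A-Z, a-z, 0-9, +, /) stored as a string, or long runs of those characters ending in '='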
# Automated unpacking tools
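# PE-sieve: Scan a running process and dump modified PEs from memory
# (the sample must be running, so use this only inside the isolated VM)
pe-sieve.exe /pid 1234 /dir C:\dumps\
# hollows_hunter: Find process hollowing across running processes
hollows_hunter.exe /dir C:\dumps\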
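Single-byte XOR, the simplest of the encodings listed above, can often be recovered statically. Instead of decoding the whole file 255 times, the sketch below encodes a known plaintext marker with each candidate key and searches for it. The marker used is the DOS stub message found in most PE files. `find_xor_key` and `xor_bytes` are hypothetical helpers, and the encoded payload is fabricated.

```python
DOS_STUB_MARKER = b"This program cannot be run in DOS mode"

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key."""
    return bytes(b ^ key for b in data)

def find_xor_key(data: bytes, marker: bytes = DOS_STUB_MARKER):
    """Return (key, offset of the encoded marker), or None if no key fits."""
    for key in range(1, 256):
        offset = data.find(xor_bytes(marker, key))
        if offset != -1:
            return key, offset
    return None

# Fabricated example: a fake embedded PE, XOR-encoded with 0x5A and prefixed with junk.
embedded = b"MZ" + b"\x00" * 10 + DOS_STUB_MARKER
blob = b"junk" + xor_bytes(embedded, 0x5A)

result = find_xor_key(blob)
if result:
    key, offset = result
    print(f"key=0x{key:02x}, marker at offset {offset}")
    decoded = xor_bytes(blob[4:], key)
```

In the encoded blob, the ten null bytes after `MZ` become ten copies of the key (`0x5A`). That is the "key shows up where the plaintext was null" effect analysts use to spot XOR by eye.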
IOC Extraction and Reporting
# After analysis, compile IOCs:
# File IOCs
- MD5/SHA1/SHA256 hashes of sample and dropped files
- File names and paths used
# Network IOCs
- C2 domains and IPs
- User-Agent strings
- URL patterns
- Certificate thumbprints
# Host IOCs
- Mutexes created
- Registry keys created/modified
- Scheduled tasks
- Services installed
- Startup folder entries
- Expected Prefetch artifacts
# YARA rules for the specific sample
# Export to MISP, OpenCTI, or plain text/CSV for detection rule creation
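For the plain text/CSV route, a small structure is enough to keep indicators deduplicated and grouped by the categories above. This is a minimal sketch using only the standard library. The field names, `build_report`, and `to_csv` are assumptions made for this example (MISP and OpenCTI define their own import formats), and all indicator values are invented.

```python
import csv
import io
import json

def build_report(sample_sha256: str, **categories):
    """Group indicators by category, dropping duplicates and sorting for stable output."""
    report = {"sample_sha256": sample_sha256}
    for category, values in categories.items():
        report[category] = sorted(set(values))
    return report

def to_csv(report: dict) -> str:
    """Flatten the report into category,value rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "value"])
    writer.writerow(["sha256", report["sample_sha256"]])
    for category, values in report.items():
        if category == "sample_sha256":
            continue
        for value in values:
            writer.writerow([category, value])
    return buffer.getvalue()

# Invented indicators for illustration only.
report = build_report(
    "ab" * 32,
    network=["c2.example", "198.51.100.7", "c2.example"],
    host=[r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run", r"Global\a1b2c3d4"],
)
print(json.dumps(report, indent=2))
print(to_csv(report))
```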