Static Malware Analysis: Tools, Techniques and Indicators of Compromise

Master static malware analysis — PE header examination, strings extraction, YARA rules, Ghidra/IDA basics, import table analysis, obfuscation and IOC extraction.

lazyhackers
Mar 27, 2026 · 17 min read

Static malware analysis examines a suspicious file without executing it. Because the sample never runs, this is the lowest-risk form of malware analysis, although careless handling (an accidental double-click, a preview pane, a vulnerable parser) can still trigger it, which is why the isolation steps below matter. Static analysis extracts as much information as possible from the binary's structure, content, and code before you commit to dynamic analysis in a sandbox. For incident responders and threat intelligence analysts, the ability to triage and characterize samples quickly is essential. This article walks through a static analysis workflow, from initial triage to IOC reporting.

Safety First: Analysis Environment

Never analyze malware on your primary workstation. Use an isolated virtual machine with no network access, snapshotted before analysis. Tools like REMnux (Linux) and FlareVM (Windows) provide pre-configured analysis environments. Disable automatic execution features (autorun, preview panes) before working with samples.
# Disable network on analysis VM (or use host-only adapter)
# Take a clean snapshot before any analysis
# Work with copies, never originals — preserve chain of custody

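# REMnux setup on a fresh Ubuntu VM (the older get-remnux.sh script is deprecated;
# confirm the current steps and installer checksum at https://docs.remnux.org)
wget https://REMnux.org/remnux-cli
mv remnux-cli remnux && chmod +x remnux && sudo mv remnux /usr/local/bin
sudo remnux install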

# Calculate initial hashes for reference
md5sum malware.exe
sha1sum malware.exe
sha256sum malware.exe
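
On an analysis VM without the coreutils tools, the same three hashes can be computed in one pass with Python's standard library. This is a minimal sketch; the function name `hash_file` and the 1 MiB chunk size are illustrative choices, not part of any tool mentioned above.

```python
import hashlib
import sys


def hash_file(path, chunk_size=1024 * 1024):
    """Return MD5, SHA-1 and SHA-256 hex digests, reading the file only once."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory
        while chunk := handle.read(chunk_size):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


if __name__ == "__main__":
    for sample in sys.argv[1:]:
        for name, value in hash_file(sample).items():
            print(f"{name:7s}{value}  {sample}")
```

Recording all three hashes up front makes it easy to confirm later that the copy you analyzed is byte-identical to the preserved original.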

Phase 1: Initial Triage and Hash Analysis

File Identification

# Identify true file type (ignore extension)
file malware.exe
# Output: PE32 executable (GUI) Intel 80386, for MS Windows

file suspicious.pdf
# May reveal: PE32 executable or HTML or actual PDF

# Calculate cryptographic hashes
md5sum malware.exe      # Fast hash for deduplication
sha256sum malware.exe   # Strong hash for definitive identification

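# Example output format. These particular values are the MD5 and SHA-256 of an
# empty file, so treat them as placeholders rather than a real sample's hashes:
# MD5:    d41d8cd98f00b204e9800998ecf8427e
# SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855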
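
What `file` does for the common cases can be approximated by comparing the first bytes of a sample against known magic values. The sketch below covers only a handful of signatures; the table and the function names `identify_magic` and `identify_file` are illustrative assumptions, and the real `file` utility knows far more formats.

```python
# Leading "magic" bytes for a few formats commonly seen during triage
MAGIC_SIGNATURES = [
    (b"MZ", "DOS/PE executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also DOCX/XLSX/JAR/APK)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Office, MSI)"),
]


def identify_magic(header: bytes) -> str:
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path) -> str:
    """Read only the first bytes of the file; nothing is parsed or executed."""
    with open(path, "rb") as handle:
        return identify_magic(handle.read(16))
```

A mismatch between the reported type and the file extension (for example a `.pdf` that starts with `MZ`) is itself worth recording as a finding.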

VirusTotal Analysis

# Search by hash (never upload sensitive/classified samples to VT)
# https://www.virustotal.com/gui/file/SHA256HASH

# CLI with vt-cli
vt file SHA256HASH
vt file --format json SHA256HASH | jq '.last_analysis_stats'

# Key fields to review:
# - Detection ratio (X/72 engines)
# - First submission date (older = likely known malware)
# - File names (reveals attacker-used filenames)
# - Behavioral summaries
# - Network indicators
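# - Sandbox reports (Hybrid-Analysis, ANY.RUN) and repository entries (MalwareBazaar)

# MalwareBazaar lookup (replace SHA256HASH; abuse.ch now expects an Auth-Key header,
# so check the current API documentation for how to obtain a key)
curl -s -H "Auth-Key: YOUR_AUTH_KEY" "https://mb-api.abuse.ch/api/v1/" -d "query=get_info&hash=SHA256HASH" | jq .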
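
The hash lookup can also be scripted against the VirusTotal v3 REST API, which takes the API key in an `x-apikey` header. The following is a hedged sketch using only the standard library: the helper names (`build_vt_request`, `detection_ratio`, `lookup`) are hypothetical, and counting "suspicious" verdicts alongside "malicious" is a choice made here, not VirusTotal's definition of a detection.

```python
import json
import re
import urllib.request

VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"
# MD5 (32), SHA-1 (40) or SHA-256 (64) hex digits
HASH_PATTERN = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def build_vt_request(file_hash, api_key):
    """Build (but do not send) a lookup request; only the hash leaves the VM."""
    if not HASH_PATTERN.match(file_hash):
        raise ValueError(f"not an MD5/SHA-1/SHA-256 hex digest: {file_hash!r}")
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash.lower()),
        headers={"x-apikey": api_key},
    )


def detection_ratio(report):
    """Return (flagged, total) from a file report's last_analysis_stats."""
    stats = report["data"]["attributes"]["last_analysis_stats"]
    flagged = stats.get("malicious", 0) + stats.get("suspicious", 0)
    return flagged, sum(stats.values())


def lookup(file_hash, api_key):
    """Send the request; an unknown hash surfaces as urllib.error.HTTPError (404)."""
    request = build_vt_request(file_hash, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return detection_ratio(json.load(response))
```

Run the lookup from a separate, networked machine rather than the isolated analysis VM, and pass it only the hash.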

Phase 2: PE File Format Analysis

Windows executables (EXE, DLL, SYS, OCX) follow the Portable Executable (PE) format. Understanding PE structure is fundamental to Windows malware analysis.

PE Structure Overview

Component | Description | Security Relevance
DOS Header | Starts with the MZ signature (bytes 4D 5A) | Malformed headers may indicate packing or tampering
PE Header | Machine type, timestamp, characteristics | Compile timestamp may be spoofed
Optional Header | Entry point, image base, subsystem | Entry point outside the first code section often indicates packing
Section Table | List of .text, .data, .rsrc sections | High entropy suggests packed/encrypted content
Import Table (IAT) | DLLs and functions the binary imports | Most revealing static indicator
Export Table | Functions exported by DLLs | Presence in EXE is unusual
Resources (.rsrc) | Icons, strings, embedded files | Can hide packed malware
Overlay | Data appended after the PE structure | Common technique to append payloads
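
To make the layout in the table concrete, here is a minimal sketch that walks the headers by hand with `struct`: DOS header, PE signature, COFF file header, then the section table. The function name `parse_pe_headers` and the returned dictionary shape are illustrative assumptions; it does no bounds hardening and is not a substitute for a real parser such as pefile.

```python
import struct
from datetime import datetime, timezone

MACHINE_TYPES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def parse_pe_headers(data: bytes) -> dict:
    """Parse the DOS header, COFF header and section table of a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature
    coff = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    machine, section_count, timestamp, _, _, optional_size, _ = coff
    # The section table starts right after the optional header
    table = e_lfanew + 24 + optional_size
    sections = []
    for index in range(section_count):
        name, vsize, vaddr, raw_size, raw_offset = struct.unpack_from(
            "<8sIIII", data, table + index * 40)
        sections.append({
            "name": name.rstrip(b"\x00").decode(errors="replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_offset": raw_offset,
        })
    # Anything past the last section's raw data is overlay; in signed files
    # this region also contains the Authenticode certificate table
    raw_end = max((s["raw_offset"] + s["raw_size"] for s in sections), default=0)
    return {
        "machine": MACHINE_TYPES.get(machine, hex(machine)),
        "compile_time": datetime.fromtimestamp(timestamp, timezone.utc),
        "sections": sections,
        "overlay_size": max(0, len(data) - raw_end),
    }
```

Usage would look like `parse_pe_headers(open("malware.exe", "rb").read())`. A large virtual size paired with a tiny raw size, or a non-zero overlay on an unsigned file, are the kinds of details this output makes visible.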

PE Analysis with pefile (Python)

# Shell: install the library
pip install pefile

# Python
import datetime
import pefile

pe = pefile.PE('malware.exe')

# Compile timestamp (attacker-controlled, so treat it as a hint only)
ts = pe.FILE_HEADER.TimeDateStamp
print("Compile time:", datetime.datetime.fromtimestamp(ts, datetime.timezone.utc))

# Check if packed (high section entropy)
for section in pe.sections:
    # Section names are 8 bytes, NUL-padded, and not guaranteed to be valid text
    name = section.Name.rstrip(b'\x00').decode(errors='replace')
    print(f"Section: {name}")
    print(f"  Entropy: {section.get_entropy():.2f}")  # > 7.0 suggests packing
    print(f"  Virtual Size: {section.Misc_VirtualSize}")
    print(f"  Raw Size: {section.SizeOfRawData}")

# Import table (absent in some packed samples, hence the hasattr check)
if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(f"\nDLL: {entry.dll.decode(errors='replace')}")
        for imp in entry.imports:
            if imp.name:  # imports by ordinal have no name
                print(f"  {imp.name.decode(errors='replace')}")

pecheck / Detect-It-Easy (DIE)

# Detect-It-Easy identifies packers, compilers, linkers
# (diec is the console build; die opens the GUI)
diec malware.exe
# Output:
# PE32
# Packer: UPX(3.91)[NRV,brute]
# Compiler: Microsoft Visual C/C++(-)[-]

# pecheck.py
python pecheck.py malware.exe

# exiftool for PE metadata
exiftool malware.exe

Phase 3: Strings Extraction

Strings analysis is often the highest-value, lowest-effort static analysis technique. Even packed malware frequently contains revealing strings in its unpacking stub.

# Basic strings (GNU strings)
strings malware.exe
strings -n 8 malware.exe     # minimum length 8 characters
strings -el malware.exe      # 16-bit little-endian (UTF-16LE "wide") strings
strings -a malware.exe       # scan entire file, not just loadable sections

# Both ASCII and Unicode
strings -a malware.exe > strings_ascii.txt
strings -a -el malware.exe > strings_unicode.txt

# Sort and deduplicate
strings -a malware.exe | sort -u

# FLOSS: extracts obfuscated and stack strings that regular strings misses
floss malware.exe
floss --no static -- malware.exe   # FLOSS 2.x syntax: skip plain static strings (check floss -h for your version)

# Look for high-value indicators (single quotes stop the shell from rewriting
# the backslashes; repeat with -el to cover wide strings):
strings -a malware.exe | grep -iE 'http|ftp|\.exe|\.dll|cmd\.exe|powershell|regsvr32'
strings -a malware.exe | grep -iE 'HKEY_|SOFTWARE\\|CurrentVersion'
strings -a malware.exe | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
strings -a malware.exe | grep -iE 'CreateRemoteThread|VirtualAlloc|WriteProcessMemory'
strings -a malware.exe | grep -iE '\\pipe\\|\\mailslot\\'

Key String Categories for Malware

Category | Examples | Malware Use
Network IOCs | IPs, domains, URLs | C2 infrastructure
Registry paths | HKCU\Software\Microsoft\Windows\CurrentVersion\Run | Persistence
File paths | %TEMP%\payload.exe | Drop location
API names | CreateRemoteThread | Process injection
Crypto constants and embedded data | Hard-coded RC4 key, embedded MZ headers | Payload decryption, embedded payloads
Mutex names | Global\a1b2c3d4 | Anti-reinfection
Error messages | Internal debug strings | Developer language/origin
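
The extraction and grep steps above can be combined in a short script that pulls both ASCII and UTF-16LE strings and sorts them into the categories from the table. This is a sketch under stated assumptions: the regular expressions are deliberately simple, the names `extract_strings`, `find_iocs`, and `IOC_PATTERNS` are illustrative, and it does nothing for obfuscated strings (that is what FLOSS is for).

```python
import re

IOC_PATTERNS = {
    "url": re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"),
    "registry": re.compile(r"\b(?:HKEY_[A-Z_]+|HKLM|HKCU)\\[^\s\"']+", re.IGNORECASE),
    "named_pipe": re.compile(r"\\\\\.\\pipe\\[^\s\"']+", re.IGNORECASE),
}


def extract_strings(data: bytes, min_len: int = 6) -> list:
    """Return printable ASCII runs followed by UTF-16LE ("wide") runs."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


def find_iocs(strings) -> dict:
    """Group regex hits by category; every pattern uses non-capturing groups."""
    hits = {kind: set() for kind in IOC_PATTERNS}
    for text in strings:
        for kind, pattern in IOC_PATTERNS.items():
            hits[kind].update(pattern.findall(text))
    return {kind: sorted(values) for kind, values in hits.items()}
```

Typical usage is `find_iocs(extract_strings(open("malware.exe", "rb").read()))`. Expect false positives, for example version numbers that look like IPv4 addresses, so review the hits before reporting them.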

Phase 4: Import Table Analysis

The Import Address Table (IAT) reveals which Windows API functions the malware uses. This alone can characterize the malware's capabilities without running it.

# Using dumpbin (Windows SDK)
dumpbin /imports malware.exe

# Using objdump (Linux/cross-platform)
objdump -p malware.exe | grep -A 1000 "DLL Name"

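# Python pefile
import pefile
pe = pefile.PE('malware.exe')
# DIRECTORY_ENTRY_IMPORT is missing when the file has no import directory
for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):
    print(entry.dll.decode(errors='replace'))
    for func in entry.imports:
        if func.name:  # imports by ordinal have no name
            print(f"  {func.name.decode(errors='replace')}")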

Suspicious API Functions by Category

Capability | Key APIs
Process Injection | VirtualAllocEx, WriteProcessMemory, CreateRemoteThread, NtCreateThreadEx
Persistence | RegCreateKeyEx, RegSetValueEx, CreateService (scheduled tasks usually go through the Task Scheduler COM interfaces or schtasks.exe, so they rarely appear as a named import)
Network | WSAStartup, connect, InternetOpen, HttpOpenRequest, URLDownloadToFile
Keylogging | SetWindowsHookEx, GetAsyncKeyState, GetForegroundWindow
Credential Theft | CryptAcquireContext, CryptDecrypt, CryptUnprotectData, CredEnumerate
Anti-Analysis | IsDebuggerPresent, CheckRemoteDebuggerPresent, GetTickCount, Sleep
File Operations | CreateFile, ReadFile, WriteFile, DeleteFile, CopyFile
Process/Memory | OpenProcess, VirtualAlloc, HeapAlloc, NtUnmapViewOfSection

A very small or empty import table is a strong indicator of packing or obfuscation. A packed sample needs only a few APIs to unpack itself (typically VirtualAlloc, LoadLibrary, and GetProcAddress); the real imports are resolved at runtime.
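
Once the import names are extracted, mapping them to capabilities is a simple lookup. The sketch below uses a subset of the table above; `CAPABILITY_APIS`, `classify_imports`, and `looks_packed` are hypothetical names, and the ten-import threshold is an arbitrary assumption to tune against your own samples.

```python
CAPABILITY_APIS = {
    "process_injection": {"VirtualAllocEx", "WriteProcessMemory",
                          "CreateRemoteThread", "NtCreateThreadEx"},
    "persistence": {"RegCreateKeyEx", "RegSetValueEx", "CreateService"},
    "network": {"WSAStartup", "connect", "InternetOpen",
                "HttpOpenRequest", "URLDownloadToFile"},
    "keylogging": {"SetWindowsHookEx", "GetAsyncKeyState", "GetForegroundWindow"},
    "anti_analysis": {"IsDebuggerPresent", "CheckRemoteDebuggerPresent"},
}
RUNTIME_RESOLVERS = {"LoadLibrary", "GetProcAddress"}


def base_name(api: str) -> str:
    """Drop a trailing A/W so RegSetValueExA matches RegSetValueEx."""
    return api[:-1] if len(api) > 1 and api[-1] in "AW" else api


def classify_imports(imports) -> dict:
    """Map each capability to the imported names that suggest it."""
    result = {}
    for capability, apis in CAPABILITY_APIS.items():
        matched = sorted(n for n in set(imports)
                         if n in apis or base_name(n) in apis)
        if matched:
            result[capability] = matched
    return result


def looks_packed(imports, max_imports: int = 10) -> bool:
    """Heuristic: very few imports, but the runtime resolvers are present."""
    names = {base_name(n) for n in imports}
    return len(set(imports)) <= max_imports and RUNTIME_RESOLVERS <= names
```

A match is a lead, not a verdict: legitimate software imports many of these functions too, so weigh the combination of capabilities rather than any single API.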

Phase 5: Entropy Analysis

High entropy (close to 8.0) in PE sections indicates compression or encryption — a hallmark of packing and obfuscation.

# Calculate entropy with Python
import math

def calculate_entropy(data):
    if not data:
        return 0
    freq = {}
    for byte in data:
        freq[byte] = freq.get(byte, 0) + 1
    entropy = 0
    length = len(data)
    for count in freq.values():
        p = count / length
        entropy -= p * math.log2(p)
    return entropy

import pefile
pe = pefile.PE('malware.exe')
for section in pe.sections:
    data = section.get_data()
    entropy = calculate_entropy(data)
    # Section names are NUL-padded to 8 bytes
    name = section.Name.rstrip(b'\x00').decode(errors='replace')
    print(f"{name:10s} entropy: {entropy:.4f}")

# Interpretation (rules of thumb in bits per byte, not hard thresholds):
# 0.0 - 1.0: Mostly null bytes (sparse section)
# 4.0 - 6.5: Normal code/data
# 7.0 - 8.0: Packed/encrypted/compressed, suspicious
# ~7.9+:     Very high, consistent with encrypted or random data

Phase 6: YARA Rules

YARA is the malware analyst's pattern matching engine. YARA rules match patterns in files to identify malware families, detect variants, and hunt across large file collections.

YARA Rule Syntax

rule ExampleMalware {
    meta:
        description = "Detects Example RAT v2.x"
        author = "Analyst Name"
        date = "2024-01-15"
        hash = "e3b0c44298fc1c149afbf4c8996fb924..."
        family = "ExampleRAT"

    strings:
        // String literals
        $s1 = "ExampleRAT Control Panel"
        $s2 = "C:\\Users\\Admin\\Desktop\\rat.pdb"

        // Case insensitive
        $s3 = "keylogger" nocase

        // Wide (Unicode)
        $s4 = "malicious.exe" wide

        // Both ASCII and Unicode
        $s5 = "backdoor" wide ascii

        // Regular expression
        $re1 = /https?:\/\/[a-z0-9]{8,20}\.xyz\/[a-z]{4,8}/

        // Hex pattern with wildcards
        $hex1 = { E8 ?? ?? ?? ?? 83 C4 04 85 C0 74 }

        // Hex with jumps
        $hex2 = { FC E8 [2-4] 00 00 00 60 }

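    condition:
        // All conditions must be true
        uint16(0) == 0x5A4D           // "MZ" read as a little-endian 16-bit value (PE file)
        and filesize < 2MB
        // Every defined string must be referenced, or YARA rejects the rule
        and (2 of ($s*) or $re1 or any of ($hex*))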
}
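
To see what the hex-string syntax actually matches, it helps to translate a pattern into an equivalent byte regex. The following is a teaching sketch, not part of YARA: `yara_hex_to_regex` is a hypothetical helper that supports plain bytes, `??` wildcards, and `[n]` / `[n-m]` jumps only (no nibble wildcards or alternations).

```python
import re


def yara_hex_to_regex(pattern: str) -> "re.Pattern[bytes]":
    """Translate a simple YARA hex string into a compiled bytes regex."""
    parts = []
    for token in pattern.strip().strip("{}").split():
        jump = re.fullmatch(r"\[(\d+)(?:-(\d+))?\]", token)
        if token == "??":
            parts.append(b".")                       # any single byte
        elif re.fullmatch(r"[0-9A-Fa-f]{2}", token):
            parts.append(re.escape(bytes.fromhex(token)))
        elif jump:
            low, high = jump.group(1), jump.group(2)
            quantifier = low if high is None else f"{low},{high}"
            parts.append(f".{{{quantifier}}}".encode())  # n to m arbitrary bytes
        else:
            raise ValueError(f"unsupported token: {token!r}")
    # DOTALL so that '.' also matches the 0x0A byte
    return re.compile(b"".join(parts), re.DOTALL)
```

For example, `yara_hex_to_regex("{ E8 ?? ?? ?? ?? 83 C4 04 85 C0 74 }")` matches a `call` with any 32-bit displacement followed by the fixed stack cleanup and test bytes. Wildcarding the operand bytes this way keeps a signature stable across recompiles that move code around.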

Real-World YARA Rules

// Detect Meterpreter shellcode pattern
rule Meterpreter_Shellcode {
    meta:
        description = "Detects common Meterpreter shellcode patterns"
    strings:
        $a = { FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 }
        $b = "metsrv.x86.dll" wide ascii
        $c = "ReflectiveLoader" ascii
    condition:
        any of them
}

// Detect Cobalt Strike beacon
rule CobaltStrike_Beacon {
    meta:
        description = "Cobalt Strike Beacon config pattern"
    strings:
        $a = { 69 68 69 68 69 6B }  // config header 00 01 00 01 00 02 XORed with 0x69 (older Beacon versions)
        $b = ".cobaltstrike.artifact" ascii
        $c = { 00 00 00 00 00 00 00 00 ?? 00 01 00 ?? 00 00 00 ?? ?? 00 00 00 00 00 00 }
    condition:
        $b or ($a and $c)
}

// Generic suspicious PowerShell downloader
rule Suspicious_PowerShell_Download {
    meta:
        description = "Detects common PowerShell download cradle patterns"
    strings:
        $ps1 = "DownloadString" nocase wide ascii
        $ps2 = "DownloadFile" nocase wide ascii
        $ps3 = "IEX" nocase wide ascii
        $ps4 = "Invoke-Expression" nocase wide ascii
        $ps5 = "Net.WebClient" nocase wide ascii
        $ps6 = "-EncodedCommand" nocase wide ascii
        $ps7 = "FromBase64String" nocase wide ascii
    condition:
        3 of ($ps*)
}

Running YARA

# Single file scan
yara rules.yar malware.exe

# Directory scan
yara -r rules.yar /samples/

# Multiple rule files
yara rule1.yar rule2.yar malware.exe

# Print all matching strings
yara -s rules.yar malware.exe

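# Scan a running process: pass the PID as the target
# (-p sets the number of scanning threads, it does not select a process)
yara rules.yar 1234
# YARA has no built-in flag to scan every process; loop over PIDs in a script instead

# Compile rules for faster loading, then pass -C when scanning with them
yarac rules.yar compiled.rules
yara -C compiled.rules malware.exe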

Phase 7: Ghidra Reverse Engineering Basics

Ghidra is NSA's open-source reverse engineering suite. It disassembles and decompiles binaries into readable pseudo-C code, enabling deep static analysis of malware logic.

# Install Ghidra
# Download from https://ghidra-sre.org
# Requires a 64-bit JDK (17 for Ghidra 10.2 through 11.1, 21 for 11.2 and later; check the release notes)

# Launch Ghidra
./ghidraRun   # Linux/macOS
ghidraRun.bat # Windows

# Key workflow:
# 1. File > New Project > Non-Shared > Create Project
# 2. File > Import File > malware.exe
# 3. Double-click the imported file to open it in CodeBrowser
# 4. Click Yes to auto-analyze
# 5. Accept default analysis options

# Key Ghidra windows:
# Program Trees: Section layout
# Symbol Tree: Functions, Labels, Imports
# Listing: Disassembly view
# Decompiler: Pseudo-C view (right side)
# References: Cross-references (right-click > References > Show References to, or Ctrl+Shift+F)

Finding Key Functions in Ghidra

# Most of these are menu actions in CodeBrowser; the same queries can be scripted
# from the Script Manager or interpreter.

# Find all calls to a specific Windows API
# Symbol Tree > Imports > KERNEL32.DLL > VirtualAlloc, then right-click > Show References to

# Find all strings
# Window > Defined Strings

# Navigate to entry point
# Symbol Tree > Functions > entry

# Search for suspicious patterns
# Search > For Instruction Patterns
# Search > Memory

# Rename variables and functions as you understand them
# Press 'L' to rename, ';' to add a comment ('C' clears code bytes in the Listing)

Phase 8: Obfuscation Detection and Unpacking

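# Common packers and their signatures (DIE detects these)
# UPX: "UPX0", "UPX1" section names, easy to unpack
upx -d packed.exe -o unpacked.exe

# MPRESS, Themida, VMProtect, ASPack usually require dynamic unpacking
# (this runs the sample, so do it only inside the isolated VM)
# General approach:
# 1. Load in debugger (x64dbg/OllyDbg)
# 2. Set a hardware or memory breakpoint on writes to (or execution of) the code section
# 3. Run until the unpacked code is about to execute (the original entry point)
# 4. Dump process memory and rebuild imports with Scylla (x64dbg plugin)

# Detect encoding/obfuscation patterns
# XOR obfuscation: short loops XORing a buffer with a constant; in the data, the key
#   shows up wherever the plaintext had runs of null bytes
# RC4: loop initializing a 256-byte table with 0..255 (key scheduling)
# Base64: strings limited to A-Z, a-z, 0-9, '+', '/' with '=' padding

# Memory-scanning tools (they inspect a running process, so the sample must
# already be executing in the VM)
# PE-sieve: Scan and dump modified PE from memory
pe-sieve.exe /pid 1234 /dir C:\dumps\

# hollows_hunter: Find process hollowing
hollows_hunter.exe /dir C:\dumps\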
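
Single-byte XOR is common enough in droppers that it is worth checking for statically before reaching for a debugger. The sketch below XORs a few known plaintext markers with every possible key and searches the raw file for the result, which is cheaper than decoding the whole file 255 times. The marker list and the names `xor_bytes` and `find_single_byte_xor` are illustrative assumptions; multi-byte or rolling keys are out of scope.

```python
# Plaintext fragments that are likely to appear in an embedded payload or config
MARKERS = (
    b"This program cannot be run in DOS mode",
    b"http://",
    b"https://",
)


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(byte ^ key for byte in data)


def find_single_byte_xor(data: bytes, markers=MARKERS) -> list:
    """Return (key, marker, offset) for each marker found under some XOR key."""
    hits = []
    for key in range(1, 256):  # key 0 would be plaintext, already visible to strings
        for marker in markers:
            offset = data.find(xor_bytes(marker, key))
            if offset != -1:
                hits.append((key, marker, offset))
    return hits
```

When a hit comes back, `xor_bytes(data[offset:], key)` gives the decoded region for further inspection, for example feeding it back into the PE and strings steps from the earlier phases.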

IOC Extraction and Reporting

# After analysis, compile IOCs:

# File IOCs
- MD5/SHA1/SHA256 hashes of sample and dropped files
- File names and paths used
- Mutexes created

# Network IOCs
- C2 domains and IPs
- User-Agent strings
- URL patterns
- Certificate thumbprints

# Host IOCs
- Registry keys created/modified
- Scheduled tasks
- Services installed
- Startup folder entries
- Prefetch artifacts expected

# YARA rules for the specific sample

# Export to MISP, OpenCTI, or plain text/CSV for detection rule creation
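
For the plain text/CSV route, a small script keeps the output consistent across samples. This is a sketch with assumed conventions: the column names, the set of "network" types, and the defanging style (`hxxp`, `[.]`) are choices made here for illustration, and it does not produce MISP or OpenCTI import formats.

```python
import csv
import io

NETWORK_TYPES = {"url", "domain", "ipv4"}
FIELDS = ["sample_sha256", "type", "value"]


def defang(value: str) -> str:
    """Make a network indicator non-clickable before sharing it."""
    if value.lower().startswith("http"):
        value = "hxxp" + value[4:]
    return value.replace(".", "[.]")


def build_report(sample_sha256: str, iocs: dict) -> list:
    """Flatten {type: [values]} into sorted, de-duplicated rows."""
    rows = []
    for ioc_type, values in sorted(iocs.items()):
        for value in sorted(set(values)):
            shareable = defang(value) if ioc_type in NETWORK_TYPES else value
            rows.append({"sample_sha256": sample_sha256,
                         "type": ioc_type, "value": shareable})
    return rows


def to_csv(rows) -> str:
    """Serialize report rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keep an un-defanged copy for tooling that consumes the indicators directly; defanged values are for reports and chat messages that humans will read.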

Always note what you do NOT find. The absence of network strings, persistence mechanisms, or suspicious APIs is as informative as their presence: it may indicate heavy obfuscation that requires dynamic analysis, or that the sample is a dropper/loader rather than the primary payload.