Building Offensive Tooling

Off-the-shelf tools cover the common ground, but real engagements have gaps no scanner fills — and that is where you write your own. Offensive tooling from first principles: when to build versus buy, the language trade-offs (Python, Go, Rust), and the anatomy of a tool you can trust — bounded concurrency, rate-limit and jitter, timeouts, structured JSONL. Then we build a real async prober in ~25 lines and make it compose into any recon pipeline via the stdin/stdout contract.

Build vs buy · anatomy of a good tool · a real async prober

Every operator hits the same wall eventually. You are mid-engagement, you know exactly what you want to check, and no tool does it. The target speaks a protocol nobody wrote a client for. Two tools you love output formats that do not line up. A scanner that would work is far too loud for this network. At that point you stop looking for a tool and you write one.

This is the first part of the Offensive Tooling & Automation series. It is about building offensive tools that hold up on a real job — not toy scripts. We will cover when it is actually worth building, what to never rebuild, how to pick a language, and the handful of properties that separate a tool you trust from a script you babysit. Then we build a real, runnable async prober in about twenty-five lines and make it compose into any recon pipeline.

Everything here is for authorised security work: your own labs, CTFs, and engagements with written permission. The point of building your own tools is precision and control — including the control to stay strictly in scope.

When it is worth building

"Build versus buy" is the wrong framing for offensive work, because most of the tools you would "buy" are free and open source anyway. The real question is build versus use the existing tool, and the honest default answer is use the existing tool. Rebuilding nmap to feel clever is time stolen from the target. Build only when one of these signals is true:

| Signal | What it looks like | Why build |
|---|---|---|
| Capability gap | No tool does this exact check/logic | The cleanest reason: you fill a real gap |
| OPSEC | Known scanners are too loud here | A small custom tool matches no known scanner signature |
| Scale | Tens of thousands of targets | Tune concurrency + rate for your case |
| Niche target | Odd protocol / format / API | Nobody wrote a parser; you will |
| Glue | Reshape tool A’s output for tool B | Fifty lines saves hours every job |
| Repeatable | Same steps every engagement | Automate once, win on every future job |

🔑 Key idea: If none of these hold, do not build. The mature tool is mature for a reason: years of edge cases you will not re-discover. Your scarce time belongs on the target, not on reinventing a solved problem.

Know the map — do not rebuild these

Before you write a line, know the landscape cold — so you instantly recognise what is already excellent and refuse to rebuild it. These are mature, battle-tested, and almost always the right call. Your code lives in the seams between them and in the gaps none of them cover.

| Category | Use these | What they already solve |
|---|---|---|
| Recon | subfinder, amass, httpx, dnsx | Subdomain + DNS + HTTP probing at scale |
| Content / fuzzing | ffuf, feroxbuster | Directory + parameter fuzzing |
| Port + vuln scan | nmap, naabu, nuclei | Host scans, fast port sweeps, templated checks |
| Web proxy | Burp Suite, mitmproxy | Intercept, replay; extend with plugins/addons |
| AD + lateral | BloodHound, netexec (nxc), Impacket | Attack paths, spraying, protocol clients |
| Exploitation / C2 | sqlmap, Metasploit | Deep, tested engines: wrap, don’t replace |

The pattern across that whole table: extend and compose, do not replace. Add a nuclei template, a custom Cypher query for BloodHound, a netexec module, a Burp extension. You get the maturity of the host tool plus the one thing only you needed.

Pick the language for the job

There is no single right language — there is a right language per job. Three cover almost everything an operator builds, and fluency in two of them is plenty.

| Language | Why | Best for |
|---|---|---|
| Python | asyncio, huge ecosystem, fast to write | Glue, recon, PoCs, anything you iterate on |
| Go | One static binary, goroutines, cross-compile | Drop-on-target scanners and agents |
| Rust / C | Low-level, memory control, no runtime | Shellcode, loaders, evasion, hot loops |

Much of the modern recon tooling (nuclei, httpx, the whole ProjectDiscovery suite) is written in Go, and for good reason: go build produces a single binary with no runtime to install, and cross-compiling for another OS or architecture only takes two environment variables, GOOS and GOARCH. Prototype in Python, ship the heavy scanner in Go, and reach for Rust or C only when you genuinely need to go low.
```shell
# Go cross-compiles to anything from one machine, no toolchain juggling:
GOOS=windows GOARCH=amd64 go build -ldflags="-s -w" -o probe.exe .
GOOS=linux   GOARCH=arm64 go build -ldflags="-s -w" -o probe-arm .
# -s -w strips the symbol table + debug info: smaller binary, less for a defender to read.
```

Anatomy of a tool you can trust

Here is the part that actually separates a tool from a script. A script does the happy path once on your machine. A tool survives a real engagement — a flaky network, a hostile WAF, fifty thousand targets, and a client who will read the logs. Eight properties get you there.

| Property | What it means | Where / note |
|---|---|---|
| Bounded concurrency | A fixed worker pool, never thread-per-target | "Concurrency" section |
| Rate-limit + jitter | Cap req/s and randomise timing | "Rate-limit + jitter" section |
| Timeouts + retries | Every call has a deadline + bounded retry | never hangs |
| Structured output | Emit JSONL, not pretty text | "Why JSONL" section |
| stdin in, stdout out | Read targets in, write results out | "The contract" section |
| Resumable | Checkpoint so a crash does not restart from zero | --resume |
| Configurable | Targets/rate/output via flags + env | no hardcoding |
| Quiet by default | Data on stdout, logs on stderr | pipe-safe |

🔑 Key idea: You do not need all eight on day one. But every one you skip is a way the tool fails you at the worst moment, usually live, in front of the client. The next sections work through the ones that matter most.
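
The "Configurable" row is the cheapest of the eight to get right. A minimal sketch, assuming a hypothetical CLI whose flag names (--concurrency, --rate, --timeout, --output) and PROBE_* environment variables are invented here for illustration: flags override the environment, and the environment overrides built-in defaults, so nothing about the engagement is hardcoded.

```python
import argparse
import os

def parse_args(argv=None, env=None):
    """Flags win over environment variables, which win over built-in defaults.

    All flag and variable names here are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    p = argparse.ArgumentParser(description="probe (hypothetical CLI sketch)")
    p.add_argument("--concurrency", type=int,
                   default=int(env.get("PROBE_CONCURRENCY", "50")),
                   help="max requests in flight")
    p.add_argument("--rate", type=float,
                   default=float(env.get("PROBE_RATE", "20")),
                   help="max requests per second")
    p.add_argument("--timeout", type=float,
                   default=float(env.get("PROBE_TIMEOUT", "8")),
                   help="per-request deadline in seconds")
    p.add_argument("--output", default=env.get("PROBE_OUTPUT", "-"),
                   help="JSONL output path, '-' for stdout")
    return p.parse_args(argv)
```

Passing argv and env explicitly keeps the function testable; called with no arguments it falls back to the real command line and os.environ.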

Concurrency — a bounded worker pool

Concurrency is where naive tools die. The instinct is either too slow or too violent, and both are wrong. There is exactly one pattern you want.

The three approaches

| Approach | What it does | Verdict |
|---|---|---|
| One at a time | A loop: probe, wait, next | Correct but useless: hours for 10k hosts |
| Thread per target | Spawn one per host | 10k sockets → you DoS yourself and the target |
| Bounded worker pool | N workers pull from a shared queue | The answer: predictable load, capped at N |

A bounded pool means you, not luck, decide the parallelism. Set it to 50 and at most 50 requests are ever in flight, no matter whether you feed it ten hosts or a million. Results stream out the instant each one finishes, so a single slow host never holds up the fast ones. In Python the simplest form of that cap is one asyncio.Semaphore; in Go it is a buffered channel feeding a fixed set of goroutines.
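
The literal "N workers pull from a shared queue" shape can be sketched in a few lines of asyncio. This is an illustration under assumptions, not the only way to do it: run_pool and handler are hypothetical names, and handler stands in for whatever probe you are running.

```python
import asyncio

async def run_pool(items, handler, workers=50):
    """Run handler(item) for every item with at most `workers` calls in flight."""
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    results = []

    async def worker():
        while True:
            try:
                item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return                      # queue drained: this worker exits
            results.append(await handler(item))

    # exactly `workers` coroutines exist, so that is the ceiling on parallelism
    await asyncio.gather(*(worker() for _ in range(workers)))
    return results
```

Results land in completion order rather than input order, which is why each record should carry its own target identifier.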

Rate-limit + jitter — stay under the radar

Concurrency controls how many requests run at once; the rate limit controls how fast they go out. They are different knobs, and confusing them gets you banned. You can have a pool of 50 workers and still cap the whole tool at 20 requests per second.

Fire with no throttle and you send a burst the moment the run starts. It is fast for about three seconds — then the WAF rate-limit trips, the 429s begin, your source IP is banned, and a SOC analyst has a fresh alert with your traffic all over it. Speed bought you a detection and a dead IP. A token bucket fixes the rate: refill N tokens per second, each request spends one, and you hold a steady pace that stays under the limit instead of slamming into it.
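
A token bucket as described above fits in a small class. This is a minimal sketch, assuming a single asyncio event loop; TokenBucket, try_take and take are hypothetical names, and the injectable clock exists only to make the behaviour easy to check.

```python
import asyncio
import time

class TokenBucket:
    """Refill `rate` tokens per second up to `burst`; each request spends one."""

    def __init__(self, rate, burst=1, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)          # start full
        self.clock = clock
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_take(self):
        """Spend one token if available; return False instead of waiting."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    async def take(self):
        """Wait until a token is available, then spend it."""
        while not self.try_take():
            await asyncio.sleep((1 - self.tokens) / self.rate)
```

A worker would call await bucket.take() immediately before each request. With burst=1 the pace is strictly steady; a larger burst allows short spikes while keeping the same long-run average.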

Perfectly even timing is itself a signature: nothing human sends a request exactly every 50 ms. Add jitter, randomising the spacing so the traffic stops looking machine-generated. On a real engagement, quiet beats fast almost every time. The slow scan that no one noticed beats the fast one that got you blocked on host #200.
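
One simple way to add jitter is to scale each delay by a random factor. The sketch below assumes a uniform distribution around the base delay; jittered and its spread parameter are hypothetical names, and a real tool might prefer a different distribution.

```python
import random

def jittered(base, spread=0.5, rng=random):
    """Return a delay drawn uniformly from base*(1-spread) to base*(1+spread)."""
    if not 0 <= spread <= 1:
        raise ValueError("spread must be between 0 and 1")
    return base * rng.uniform(1 - spread, 1 + spread)
```

In an async worker this becomes await asyncio.sleep(jittered(0.05)) between requests. Because the factor is centred on 1, the average pace stays the same while the individual gaps vary.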

Build it — an async prober

Theory is cheap. Here is a real, runnable async HTTP prober — it reads URLs on stdin, fetches each with bounded concurrency and a hard timeout, grabs the page title, and prints one JSON object per result. About twenty-five lines, and it already obeys half the anatomy rules.

```python
import asyncio, json, re, sys
import aiohttp
from aiohttp import ClientTimeout

CONCURRENCY = 50                     # cap: at most 50 requests in flight
TIMEOUT = ClientTimeout(total=8)     # never hang on a dead host
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title", re.I | re.S)

async def probe(session, sem, url):
    async with sem:                  # wait for a free slot
        try:
            # ssl=False skips certificate verification: handy for lab and
            # internal hosts, but the peer is not authenticated
            async with session.get(url, ssl=False) as r:
                body = await r.text(errors="replace")    # tolerate bad encodings
                m = TITLE_RE.search(body)                # case-insensitive, spans newlines
                title = " ".join(m.group(1).split()) if m else ""
                return {"url": url, "status": r.status, "title": title[:80]}
        except Exception as e:       # a dead host becomes a record, not a crash
            return {"url": url, "error": type(e).__name__}

async def main():
    urls = [l.strip() for l in sys.stdin if l.strip()]   # targets from stdin
    sem = asyncio.Semaphore(CONCURRENCY)                 # created inside the running loop
    async with aiohttp.ClientSession(timeout=TIMEOUT) as s:
        tasks = [probe(s, sem, u) for u in urls]
        for fut in asyncio.as_completed(tasks):
            print(json.dumps(await fut), flush=True)     # JSONL to stdout

asyncio.run(main())
```

Run it

```bash
$ pip install aiohttp
$ cat urls.txt | python3 probe.py
{"url": "https://a.example", "status": 200, "title": "Admin Login"}
{"url": "https://b.example", "status": 403, "title": ""}
{"url": "https://c.example", "error": "TimeoutError"}
```

Notice the failure handling: a dead host becomes a result with an error field, not an exception that kills the run. One unreachable target must never stop the other 9,999. That single try/except is the difference between a tool you start and walk away from and a script you sit and restart all afternoon.
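
The try/except records a failure on the first error. For transient faults, the "Timeouts + retries" property adds a bounded retry on top. A minimal sketch, assuming exponential backoff with jitter: with_retries, make_call and the default retry_on tuple are hypothetical choices, and with aiohttp you would likely add aiohttp.ClientError to that tuple.

```python
import asyncio
import random

async def with_retries(make_call, attempts=3, base_delay=0.5, timeout=8.0,
                       retry_on=(asyncio.TimeoutError, ConnectionError, OSError)):
    """Await make_call() under a deadline, retrying transient failures a bounded number of times."""
    for attempt in range(1, attempts + 1):
        try:
            # make_call is a zero-argument function returning a fresh coroutine
            return await asyncio.wait_for(make_call(), timeout)
        except retry_on:
            if attempt == attempts:
                raise                       # out of attempts: let the caller record the error
            # exponential backoff (base, 2x base, 4x base, ...) with jitter
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay * random.uniform(0.5, 1.5))
```

The retry count is a hard ceiling, so a host that never answers costs a known maximum amount of time and then becomes an error record like any other.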

The contract — stdin in, stdout out

The prober has a quiet superpower: it reads stdin and writes stdout. That one decision is what lets fifty lines of yours slot into a pipeline of world-class tools. This is the unix philosophy, and the entire ProjectDiscovery ecosystem is built on it.

```bash
# Your tool is just one stage; it composes with everything:
subfinder -d target.com -silent \
  | dnsx -silent \
  | httpx -silent \
  | python3 probe.py \
  | jq -r 'select(.status == 200) | .url' \
  | nuclei -silent -tags cve
```

Each stage reads lines, does its one job, and writes lines. subfinder finds subdomains, dnsx keeps the names that resolve (with -resp-only it would print bare IP addresses and httpx would lose the hostnames), httpx keeps the live ones, your prober enriches each with a status and title, jq turns the JSONL back into plain URLs, and nuclei scans them. The jq stage is there because nuclei expects one target per line, not JSON objects. Your custom logic, the niche check no off-the-shelf tool does, drops straight into the middle of the chain. The rule is simple: data on stdout, logs and progress on stderr, so a noisy progress bar never corrupts the data flowing down the pipe.
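
The stdout/stderr split is easy to enforce with two small helpers. This is a sketch under assumptions: emit and log are hypothetical names, and the "[probe]" prefix is just an example format.

```python
import json
import sys

def emit(record, out=None):
    """Write one result as a JSONL line on stdout (the data channel)."""
    out = sys.stdout if out is None else out
    out.write(json.dumps(record) + "\n")
    out.flush()                     # flush per record so downstream stages see it at once

def log(message, err=None):
    """Write progress or diagnostics on stderr, which a pipe does not carry."""
    err = sys.stderr if err is None else err
    err.write(f"[probe] {message}\n")
    err.flush()
```

If every line of output goes through one of these two functions, a stray debug print can never end up in the middle of the data stream.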

🔑 Key idea: Build small tools that do one thing and speak stdin/stdout. A pile of small, composable tools beats one monolith every time — you recombine them per engagement instead of rewriting one giant program.

Why JSONL — structured output

The prober prints JSONL — one self-contained JSON object per line — and that choice pays off constantly. No wrapping array, no commas between records, so the output is streamable: you can tail it live, or kill the run halfway and still have a perfectly valid file of everything done so far.

```bash
# Each line stands alone, so jq + standard unix tools become your analytics:

# only the live 200s, just the URLs:
$ jq -r 'select(.status==200) | .url' out.jsonl

# a status-code histogram from the same file
# (records that failed have no status, so count them as "error"):
$ jq -r '.status // "error"' out.jsonl | sort | uniq -c

# feed the live ones straight into the next tool:
$ jq -r 'select(.status==200) | .url' out.jsonl | nuclei -silent
```

You never write a parser again. jq treats each line as a document, filters and reshapes it, and pipes the result onward. Pretty-printed tables look nice in a terminal and are useless the moment you need to filter, count, diff, or feed them to another tool — which on a real engagement is always. Emit machine-readable data first; pretty-print later if a human needs to read it.
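
Because every line stands alone, the same file can double as a checkpoint for the "Resumable" property. A minimal sketch, assuming each record carries a "url" field as the prober's output does; already_done and remaining are hypothetical helper names.

```python
import json

def already_done(lines):
    """Collect the URLs that already have a record in earlier JSONL output."""
    done = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue                # a killed run can leave one truncated last line
        if isinstance(record, dict) and "url" in record:
            done.add(record["url"])
    return done

def remaining(targets, done):
    """Keep targets that have no record yet, preserving input order."""
    return [t for t in targets if t not in done]
```

On restart, a tool could read its previous output file, pass the lines to already_done, and probe only what remaining returns. Whether errored targets should count as done or be retried is a design choice this sketch leaves to the caller.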

Ship it — from script to tool

The gap between a clever script and something you trust on a paid engagement is mostly packaging and discipline. Run this checklist before the tool touches a client network.

| Step | How | Why it matters |
|---|---|---|
| Pin dependencies | requirements.txt / go.mod with exact versions | The tool you tested is the tool that runs |
| One artefact | CGO_ENABLED=0 go build -ldflags "-s -w" | Single static binary; nothing to install |
| Configurable | Targets/rate/output as flags or env | No scope baked into the binary |
| Believable User-Agent | Not the default library string | A default UA screams “tool” |
| Proxy / SOCKS | A --proxy option | Route via a redirector; choose your source IP |
| Resumable | Checkpoint progress | Survive a dropped run on a big sweep |
| Scope enforced in code | Load scope; refuse anything outside it | Make out-of-scope hits impossible |
| Kill-switch | Clean SIGINT + a hard time-box | When the window closes, it stops now |

Two of these are not optional on a real job. Enforce scope in the tool itself: load the authorised target list and have the code refuse everything else, so a fat-fingered flag can never hit a host you were not allowed to touch. And ship a kill-switch: honour Ctrl-C cleanly and support a hard stop time, because “stop now” from the client means now, mid-flight.
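
Scope enforcement and the time-box can both be sketched with the standard library. The block below is an illustration under assumptions: the scope file is taken to be one CIDR range or domain per line, and Scope, allows and past_deadline are hypothetical names. It checks the host named in the target, not the address that host later resolves to.

```python
import ipaddress
import time
from urllib.parse import urlsplit

class Scope:
    """Allow-list of CIDR ranges and domain suffixes; everything else is refused."""

    def __init__(self, entries):
        self.networks, self.domains = [], []
        for entry in entries:
            entry = entry.strip().lower()
            if not entry or entry.startswith("#"):
                continue                    # skip blanks and comments
            try:
                self.networks.append(ipaddress.ip_network(entry, strict=False))
            except ValueError:
                self.domains.append(entry.lstrip("*."))   # "*.target.com" -> "target.com"

    def allows(self, target):
        """True only if the target's host is inside an authorised range or domain."""
        host = urlsplit(target if "://" in target else "//" + target).hostname
        if not host:
            return False
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            # hostname: exact match or a true subdomain, never a lookalike suffix
            return any(host == d or host.endswith("." + d) for d in self.domains)
        return any(ip in net for net in self.networks)

def past_deadline(stop_at, now=time.time):
    """Hard time-box: True once the engagement window has closed."""
    return now() >= stop_at
```

A prober would call scope.allows(url) before queueing each target and check past_deadline(stop_at) before each request, dropping anything that fails either test and logging the refusal on stderr.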

Closing

Building offensive tooling is not about writing the next nmap. It is about the fifty lines that fill the gap nmap leaves — the niche check, the glue between two scanners, the quiet prober tuned for the network in front of you. Know the landscape so you never rebuild what is already excellent, reach for the right language for the job, and bake in the anatomy that separates a throwaway script from something you can trust: bounded concurrency, rate-limit and jitter, sane timeouts, structured JSONL, and a clean stdin/stdout contract.

Get those right and the async prober we built holds up under a real engagement — it does one job, stays exactly inside the scope you handed it, and passes its results to the next tool in a form that tool can read. Start there: find one gap your current toolkit cannot cover, write the smallest thing that fills it, and make it behave well enough to keep.
