OSINT for Threat Intelligence

Master threat intelligence OSINT — breach data analysis with HaveIBeenPwned API, dark web monitoring, paste site tracking, MISP/OpenCTI, attribution and OPSEC.

Threat intelligence (TI) transforms raw data from security incidents, breach disclosures, and adversary activity into actionable knowledge that organizations can use to defend themselves proactively. OSINT-based threat intelligence leverages publicly available sources — including breach databases, dark web forums, paste sites, and adversary infrastructure — to identify risks, track threat actors, and monitor for indicators of compromise. This article covers the professional threat intelligence lifecycle and the OSINT techniques that power it.

The Threat Intelligence Lifecycle

Professional threat intelligence follows a structured cycle to ensure collected data becomes actionable intelligence:

Planning and Direction: Define intelligence requirements. What threats are you monitoring? Which threat actors target your sector? What assets need protection?
Collection: Gather raw data from OSINT sources, dark web, technical feeds, and human intelligence
Processing: Normalize, deduplicate, and structure collected data for analysis
Analysis: Apply context, correlation, and expert judgment to produce intelligence from data
Dissemination: Share intelligence in appropriate formats (STIX/TAXII, PDF reports, SIEM rules)
Feedback: Evaluate intelligence quality and refine collection requirements
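
The Processing step can be sketched in a few lines. This is a minimal illustration, not a complete normalizer: the function names `refang` and `process_indicators` are hypothetical, and it assumes indicators arrive as plain strings that may be "defanged" (for example `evil[.]example[.]com` or `hxxp://`) as is common in public reports.

```python
import re


def refang(value: str) -> str:
    """Undo common defanging: hxxp -> http, [.] or (.) -> ., [:] -> :, [at] -> @."""
    value = value.strip()
    value = re.sub(r"^hxxp", "http", value, flags=re.IGNORECASE)
    value = re.sub(r"\[\.\]|\(\.\)|\[dot\]", ".", value, flags=re.IGNORECASE)
    value = value.replace("[:]", ":")
    value = re.sub(r"\[at\]|\[@\]", "@", value, flags=re.IGNORECASE)
    return value


def process_indicators(raw_values):
    """Refang, normalize, and deduplicate indicators, keeping first-seen order."""
    seen = {}
    for raw in raw_values:
        cleaned = refang(raw)
        if "://" not in cleaned:
            # Hostnames, IPs, and hashes are case-insensitive; URL paths may not be.
            cleaned = cleaned.lower().rstrip(".")
        if cleaned:
            seen[cleaned] = None
    return list(seen)
```

URLs are deliberately left in their original case because paths can be case-sensitive; everything else is lowercased so that `Evil.Example.com` and `evil.example.com.` collapse into one entry.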

Intelligence Types

Type	Description	Consumers	Example
Strategic	Long-term trends, threat actor campaigns	C-suite, board	Nation-state APT targeting your sector
Operational	Upcoming or ongoing attacks	Security management	Ransomware group targeting critical infrastructure
Tactical	TTPs, MITRE ATT&CK techniques	SOC, IR teams	LockBit uses specific persistence mechanism
Technical	Specific IOCs: IPs, hashes, domains	SIEM, EDR, firewall	C2 IP: 198.51.100.1, Hash: abc123...

Breach Data Sources and Analysis

HaveIBeenPwned API

# HaveIBeenPwned (HIBP) API: check whether accounts appear in known breaches
# An API key is required for per-account lookups (breachedaccount, pasteaccount):
# https://haveibeenpwned.com/API/Key
# HIBP also expects a User-Agent header that identifies your application

# Check if an email was breached (user%40example.com is a placeholder, URL-encoded)
curl -H "hibp-api-key: YOUR_API_KEY" -H "user-agent: YOUR_APP_NAME" \
  "https://haveibeenpwned.com/api/v3/breachedaccount/user%40example.com?truncateResponse=false"

# By default only breach names are returned; truncateResponse=false adds
# dates, data classes, and the other breach fields
# Example response:
# [{"Name":"Adobe","Title":"Adobe","Domain":"adobe.com","BreachDate":"2013-10-04",
#   "DataClasses":["Email addresses","Password hints","Passwords","Usernames"]}]

# List every breach in the system (this endpoint needs no API key)
curl -H "user-agent: YOUR_APP_NAME" \
  "https://haveibeenpwned.com/api/v3/breaches"

# Check pastes (Pastebin, GitHub, etc.)
curl -H "hibp-api-key: YOUR_API_KEY" -H "user-agent: YOUR_APP_NAME" \
  "https://haveibeenpwned.com/api/v3/pasteaccount/user%40example.com"

# k-Anonymity password check (privacy-preserving)
# Hash the password with SHA1, send first 5 chars
echo -n "password123" | sha1sum | head -c 5 | tr 'a-z' 'A-Z'
# → CBFDA
curl "https://api.pwnedpasswords.com/range/CBFDA"
# Response contains all hashes starting with CBFDA and breach counts
# Match against your full hash to check if password was breached

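# Python implementation
import hashlib
import requests

def is_password_pwned(password):
    """Return True if the password's SHA-1 suffix appears in the range response."""
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    response = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    response.raise_for_status()
    # Each response line is "HASH_SUFFIX:COUNT"; compare suffixes exactly
    return any(line.split(":")[0] == suffix for line in response.text.splitlines())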
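
The matching step of the k-anonymity check can also return how often a password was seen, not just whether it was. The sketch below works offline on a range response body you have already fetched; `split_sha1` and `pwned_count` are hypothetical helper names, and the only assumption about the API is the `SUFFIX:COUNT` line format described above.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def pwned_count(password: str, range_body: str) -> int:
    """Look up the password's suffix in a range response body.

    range_body is the text returned for the password's prefix, one
    "SUFFIX:COUNT" entry per line. Returns 0 when the suffix is absent.
    """
    _, suffix = split_sha1(password)
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

Only the prefix ever leaves your machine; the comparison against the full suffix happens locally, which is what makes the lookup privacy-preserving.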

Dehashed — Leaked Credential Search

# Dehashed aggregates breach data with search by email, username, domain, IP
# API: https://www.dehashed.com/docs (check it for the current endpoint version)
# Authentication is HTTP Basic (account email + API key); curl -u builds the
# base64 Authorization header for you

# Search by email (user@example.com is a placeholder)
curl -u "ACCOUNT_EMAIL:API_KEY" -H "Accept: application/json" \
  "https://api.dehashed.com/search?query=email:user@example.com"

# Search by domain (finds all credentials with @target.com emails)
curl -u "ACCOUNT_EMAIL:API_KEY" -H "Accept: application/json" \
  "https://api.dehashed.com/search?query=domain:target.com"

# Search by username
curl -u "ACCOUNT_EMAIL:API_KEY" -H "Accept: application/json" \
  "https://api.dehashed.com/search?query=username:johndoe"

# Search by IP address (find other accounts seen from the same public IP)
curl -u "ACCOUNT_EMAIL:API_KEY" -H "Accept: application/json" \
  "https://api.dehashed.com/search?query=ip_address:198.51.100.1"

# Response fields:
# id, email, username, password, hashed_password, name,
# vin, address, phone, database_name, leaked_date

Analyzing Leaked Credential Dumps

# After obtaining breach data (through authorized CTI work):
# Extract domain-specific credentials (-F matches the string literally, so "." is not a wildcard)
grep -F "@target.com" breach_data.txt | cut -d':' -f1,2 > target_creds.txt

# Count the most frequently reused passwords
grep -F "@target.com" breach_data.txt | awk -F: '{print $2}' | sort | uniq -c | sort -rn

# Find plaintext passwords vs hashes
# (single quotes keep the shell from rewriting "\$", which would break the bcrypt pattern)
grep -E ':[a-fA-F0-9]{32}$' target_creds.txt    # MD5 hashes
grep -E ':[a-fA-F0-9]{40}$' target_creds.txt    # SHA1 hashes
grep -E ':\$2[aby]\$' target_creds.txt           # bcrypt hashes
grep -E ':[^:$]{6,20}$' target_creds.txt         # likely plaintext (6-20 chars)

# Identify password patterns for threat intelligence
# Common patterns reveal organizational password policies
# e.g., CompanyName2023! = company name + year + symbol pattern
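
The same triage can be done in one pass in Python. This is a rough sketch under the same assumption as the grep commands, namely that each line looks like `email:secret`; `classify_secret` and `summarize_dump` are hypothetical names, and the length-based hash detection is a heuristic (a 32-character hex string could also be NTLM, for instance).

```python
import re
from collections import Counter

# Patterns mirror the grep expressions above; the first match wins.
_PATTERNS = [
    ("bcrypt", re.compile(r"^\$2[aby]\$\d{2}\$.{53}$")),
    ("sha1", re.compile(r"^[a-fA-F0-9]{40}$")),
    ("md5", re.compile(r"^[a-fA-F0-9]{32}$")),
]


def classify_secret(secret: str) -> str:
    """Label a secret as bcrypt, sha1, md5, plaintext, or empty."""
    for label, pattern in _PATTERNS:
        if pattern.match(secret):
            return label
    return "plaintext" if secret else "empty"


def summarize_dump(lines, domain):
    """Count secret types for 'user@domain:secret' lines belonging to one domain."""
    counts = Counter()
    suffix = "@" + domain.lower()
    for line in lines:
        # Split on the first colon only, so passwords containing ":" survive.
        identity, sep, secret = line.rstrip("\n").partition(":")
        if not sep or not identity.lower().endswith(suffix):
            continue
        counts[classify_secret(secret)] += 1
    return counts
```

A high share of `plaintext` or `md5` entries for a domain is itself a finding: it suggests the breached service stored passwords weakly, which raises the priority of forced resets.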

Dark Web Monitoring

Tor Network Fundamentals

The Tor network provides anonymized access to .onion services. These hidden services host both legitimate privacy tools and criminal marketplaces, forums, and data leak sites.

Purchasing stolen data, drugs, or other illegal goods on dark web marketplaces is a criminal offense regardless of your role. Limit threat intelligence monitoring to passive observation of publicly accessible information, and consult legal counsel before conducting dark web investigations professionally.

Tor OPSEC for Investigators

# Essential OPSEC before accessing dark web:
# 1. Use a dedicated, isolated environment (separate hardware or a VM) that holds no personal data
# 2. Never access the dark web from a work or home network
# 3. Use Tails OS (amnesic OS, leaves no traces)
# 4. Never log in to personal accounts while on Tor
# 5. Use Tor Browser; never route a regular browser over Tor
# 6. Disable JavaScript where possible (Security Level: Safest)
# 7. Use a bridge if Tor is blocked or censored on your network
# 8. Document everything for legal compliance

# Tails OS setup:
# Download from https://tails.net (formerly tails.boum.org) and verify the OpenPGP signature
# Boot from USB: runs in RAM, leaves no disk traces
# All traffic automatically routed through Tor

# Tor Browser configuration for investigators:
# Security Level: Safest (NoScript blocks JS by default)
# New Identity when switching investigations (Ctrl+Shift+U)
# Never maximize browser window (prevents screen size fingerprinting)

Dark Web Search Engines and Indexes

# Dark web search engines (accessible via Tor Browser):
# Ahmia: https://ahmia.fi (clearnet); an .onion version is linked from the site
# Torch: .onion only (addresses change; confirm the current one from a trusted index)
# Haystak: .onion only; indexes millions of .onion pages
# DarkSearch: https://darksearch.io (clearnet)

# Ahmia from command line
curl "https://ahmia.fi/search/?q=target+company+breach"

# Dark web threat intelligence sources:
# RaidForums archive sites (after seizure)
# BreachForums.is (clearnet mirror often available)
# Ransomware group leak sites (LockBit, ALPHV/BlackCat, etc.)
# Stolen credentials markets
# Underground hacking forums

# Ransomware leak sites monitoring:
# Most ransomware groups maintain public-facing .onion sites listing victims
# Monitor for your organization or clients: DDoSecrets, ransomwatch.telemetry.ltd

Monitoring Ransomware Leak Sites

# RansomWatch — automated ransomware group monitoring
# https://ransomwatch.telemetry.ltd (clearnet)
# Aggregates posts from 60+ ransomware group sites

# Manual monitoring workflow:
# 1. Check known ransomware leak sites weekly
# 2. Search for target organization name
# 3. Alert: Any mention before public disclosure = early warning
# 4. Document: Screenshot, timestamp, URL

# Known ransomware .onion sites (change frequently — verify current status)
# LockBit, ALPHV/BlackCat, Cl0p, RansomHub, Akira, etc.
# Vx-underground and Krebs on Security track new groups
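
Steps 2 and 3 of the manual workflow lend themselves to automation once leak-site posts have been collected. The sketch below assumes the posts are already loaded as a list of dictionaries with `post_title`, `group_name`, and `discovered` keys; those field names are an assumption modelled on aggregator-style JSON feeds, so adjust them to whatever your source actually provides. `Hit` and `find_victim_mentions` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    watch_term: str
    group: str
    title: str
    discovered: str


def _normalize(text: str) -> str:
    """Lowercase and collapse punctuation so 'Target-Corp, Inc.' becomes 'target corp inc'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_victim_mentions(posts, watchlist):
    """Return a Hit for every post whose title contains a watchlist term as whole words."""
    terms = [(term, _normalize(term)) for term in watchlist]
    hits = []
    for post in posts:
        title = post.get("post_title", "")
        padded = f" {_normalize(title)} "
        for original, normalized in terms:
            # Padding with spaces enforces whole-word matches ("target" != "untargeted").
            if normalized and f" {normalized} " in padded:
                hits.append(Hit(original, post.get("group_name", "unknown"),
                                title, post.get("discovered", "")))
    return hits
```

Whole-word matching keeps noise down, but short or generic organization names will still produce false positives, so treat each hit as a prompt for manual verification and documentation rather than a confirmed incident.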

Paste Site Monitoring

# Paste sites are frequently used to distribute leaked credentials and data

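# PasteHunter: automated paste site monitoring that scans pastes against YARA rules
# https://github.com/kevthehermit/PasteHunter
pip install pastehunter
# Configure inputs, rules, and outputs in the settings file, then start the CLI
# entry point installed by the package (see the project docs for the exact
# command and config path for your version)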

# psbdmp.ws — archive of removed Pastebin pastes
curl "https://psbdmp.ws/api/search/TARGET_KEYWORD"

# Pastebin monitoring (requires account for some features)
# https://pastebin.com/api — search API for Pro accounts

# Sites to monitor:
# pastebin.com, paste.ee, ghostbin.com
# hastebin.com, privatebin.net
# rentry.co, paste2.org

# Manual search strings for each target:
"@target.com" password
"target.com" "confidential"
"TARGET_COMPANY" "leaked"
"api_key" "target.com"
TARGET_DOMAIN credentials

# Commercial solutions for automated monitoring:
# Intel471, Recorded Future, Digital Shadows, Flashpoint
# BreachIQ, SpyCloud (focuses on employee credentials)
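
Once a paste has been retrieved, the manual search strings above translate into a simple triage function. This is an illustrative sketch only: `scan_paste` is a hypothetical helper, the default keyword list simply echoes the search strings, and the `email<separator>secret` heuristic will miss formats that use other delimiters.

```python
import re


def scan_paste(text: str, domain: str,
               keywords=("confidential", "leaked", "api_key", "password")):
    """Return target-domain emails, likely email/secret pairs, and keyword hits in a paste."""
    escaped = re.escape(domain)
    # Matches addresses at the domain or any of its subdomains.
    email_re = re.compile(
        rf"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*{escaped}\b", re.IGNORECASE)
    # An address followed by ':', ';' or '|' and a non-space token looks like a credential pair.
    combo_re = re.compile(
        rf"({email_re.pattern})[ \t]*[:;|][ \t]*(\S+)", re.IGNORECASE)

    emails = sorted({m.group(0).lower() for m in email_re.finditer(text)})
    pairs = [(m.group(1).lower(), m.group(2)) for m in combo_re.finditer(text)]
    lowered = text.lower()
    hits = [k for k in keywords if k.lower() in lowered]
    return {"emails": emails, "credential_pairs": pairs, "keywords": hits}
```

In a monitoring pipeline, a non-empty `credential_pairs` list would typically be the highest-priority alert, with bare keyword hits routed to a lower-priority review queue.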

OSINT Intelligence Frameworks

MISP (Malware Information Sharing Platform)

# MISP is an open-source threat intelligence platform
# Install MISP: https://www.misp-project.org/download/
# Docker: https://github.com/MISP/misp-docker

# Key MISP concepts:
# Events: Container for threat intelligence (one incident/campaign)
# Attributes: Individual IOCs (IP, domain, hash, email, URL)
# Tags: Taxonomy labels (TLP, PAP, threat actor, sector)
# Galaxies: Structured knowledge (MITRE ATT&CK, threat actors)
# Feeds: External threat intelligence feeds (CIRCL, Abuse.ch, etc.)

# MISP feeds to enable (Sync Actions > Feeds):
# CIRCL OSINT Feed
# Abuse.ch URLhaus
# Abuse.ch MalwareBazaar
# PhishTank
# Emerging Threats

# Python API usage
from pymisp import PyMISP, MISPEvent   # ExpandedPyMISP is a deprecated alias of PyMISP

misp = PyMISP('https://your-misp-instance.com', 'YOUR_API_KEY')

# Build the event locally
event = MISPEvent()
event.info = "Target Corp phishing campaign 2024-03"
event.distribution = 1   # This community only
event.threat_level_id = 2   # Medium

# Add attributes: add_attribute(type, value, **kwargs)
event.add_attribute('ip-dst', '198.51.100.42', comment='C2 server')

# Add domain
event.add_attribute('domain', 'malicious-domain.example.com')

# Add hash (placeholder: this is the MD5 of an empty file, so never publish it as a real IOC)
event.add_attribute('md5', 'd41d8cd98f00b204e9800998ecf8427e')

# Push the event to the MISP instance
result = misp.add_event(event)

OpenCTI — Open Cyber Threat Intelligence Platform

# OpenCTI uses STIX2 as its data model
# Install via Docker: https://github.com/OpenCTI-Platform/docker

# OpenCTI connectors for data ingestion:
# MITRE ATT&CK (built-in)
# VirusTotal
# Shodan
# OpenCTI datasets
# AlienVault OTX
# MISP

# Python client
from pycti import OpenCTIApiClient

api = OpenCTIApiClient('https://your-opencti.com', 'YOUR_API_KEY')

# Create threat actor
# (recent pycti releases split this into api.threat_actor_group and
#  api.threat_actor_individual; use the one that matches your client version)
threat_actor = api.threat_actor.create(
    name="APT Group Name",
    description="Nation-state threat actor",
    first_seen="2020-01-01T00:00:00.000Z",
    last_seen="2024-01-01T00:00:00.000Z",
    sophistication="advanced",
    resource_level="government",
    primary_motivation="dominance"   # value from the STIX attack-motivation vocabulary
)

# Add indicator
indicator = api.indicator.create(
    name="Malicious IP",
    pattern="[ipv4-addr:value = '198.51.100.42']",
    pattern_type="stix",
    valid_from="2024-01-01T00:00:00.000Z",
    x_opencti_main_observable_type="IPv4-Addr"   # pycti expects the x_opencti_ prefix
)
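
Because OpenCTI indicators carry STIX patterns, it helps to generate the pattern string from a raw IOC instead of typing it by hand. The helper below is a small sketch, not part of pycti: `stix_pattern` is a hypothetical name, hash types are guessed from length alone, and only IPs, MD5/SHA-1/SHA-256 hashes, URLs, and domain names are covered.

```python
import ipaddress
import re

# Hash algorithm keys as they appear in STIX patterns; hyphenated names must be quoted.
_HASH_ALGOS = {32: "MD5", 40: "'SHA-1'", 64: "'SHA-256'"}


def _quote(value: str) -> str:
    """Escape backslashes and single quotes for a STIX string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def stix_pattern(ioc: str) -> str:
    """Build a single-observable STIX pattern for an IP, hash, URL, or domain."""
    value = ioc.strip()
    try:
        ip = ipaddress.ip_address(value)
        kind = "ipv4-addr" if ip.version == 4 else "ipv6-addr"
        return f"[{kind}:value = '{ip}']"
    except ValueError:
        pass  # not an IP address; try the other types
    if re.fullmatch(r"[A-Fa-f0-9]+", value) and len(value) in _HASH_ALGOS:
        return f"[file:hashes.{_HASH_ALGOS[len(value)]} = '{value.lower()}']"
    if value.lower().startswith(("http://", "https://")):
        return f"[url:value = '{_quote(value)}']"
    if re.fullmatch(r"(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}", value):
        return f"[domain-name:value = '{value.lower()}']"
    raise ValueError(f"unrecognized indicator: {ioc!r}")
```

The returned string can be passed as the `pattern` argument of an indicator, with `pattern_type="stix"`.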

Attribution Techniques

Attribution — determining who is responsible for a cyberattack — is one of the most complex and contested activities in threat intelligence. It requires correlating multiple data points across technical and non-technical dimensions.

Technical Attribution Indicators

# Infrastructure overlap
# Check if known malicious IPs/domains share registration data
# Tools: DomainTools Iris, Maltego, PassiveTotal (RiskIQ, now part of Microsoft Defender Threat Intelligence)

# Passive DNS: Find other domains that resolved to same IP
# https://api.passivetotal.org/v2/dns/passive
curl -u "user:apikey" "https://api.passivetotal.org/v2/dns/passive?query=198.51.100.42"

# Malware code similarity
# Find code reuse between known malware families and new samples
# Tools: BinDiff, TLSH, ssdeep fuzzy hashing

# Fuzzy hash comparison
ssdeep -b sample1.exe > hash1.txt
ssdeep -bm hash1.txt sample2.exe   # compare sample2 against hash1

# TLSH (Trend Micro Locality Sensitive Hash)
tlsh -f sample1.exe
tlsh -c sample1.exe -f sample2.exe   # compare; a lower distance means more similar

# TTP correlation (MITRE ATT&CK)
# If a new campaign uses same techniques as known APT, correlate in ATT&CK Navigator
# https://mitre-attack.github.io/attack-navigator/

# Language artifacts
# Error messages, debug strings, PDB paths, code comments in specific language
# Keyboard layout fingerprinting (charset, encoding)
# Time zone of compilation timestamps / activity

Open Source Attribution Resources

# MITRE ATT&CK Groups
# https://attack.mitre.org/groups/
# Comprehensive TTP profiles for 100+ tracked threat actors

# Malpedia (malware family encyclopedia)
# https://malpedia.caad.fkie.fraunhofer.de
# Free API access with registration

# Vx-underground (malware samples and reports)
# https://vx-underground.org
# Malware sample repository and threat actor papers

# Mandiant Advantage Threat Intelligence
# Public reports: https://www.mandiant.com/resources/blog

# CISA Known Exploited Vulnerabilities
curl "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json" | \
  jq '.vulnerabilities[] | select(.vendorProject == "Microsoft") | .cveID, .vulnerabilityName'

Integrating Threat Intelligence into Defense

# STIX/TAXII — standardized sharing format
# Import CTI feeds into your SIEM via TAXII

# Sigma rules from CTI (github.com/SigmaHQ/sigma)
# Convert threat actor TTPs to detection rules
# Sigma → Splunk, Elastic, Microsoft Sentinel

# IOC enrichment workflow:
# 1. Receive alert with suspicious IP/domain/hash
# 2. Query MISP/OpenCTI for existing intelligence
# 3. Query VirusTotal, Shodan for additional context
# 4. Query HIBP if user credentials potentially compromised
# 5. Add new intelligence back to platform for future correlation

# Automated enrichment with Python
import os
import requests

TIMEOUT = 15  # seconds; avoid hanging on a slow provider

def enrich_ip(ip):
    """Query VirusTotal, Shodan, and AbuseIPDB for one IP; API keys come from environment variables."""
    results = {}

    # VirusTotal
    vt = requests.get(f"https://www.virustotal.com/api/v3/ip_addresses/{ip}",
                      headers={"x-apikey": os.environ["VT_API_KEY"]}, timeout=TIMEOUT)
    results['virustotal'] = vt.json()

    # Shodan
    shodan = requests.get(f"https://api.shodan.io/shodan/host/{ip}",
                          params={"key": os.environ["SHODAN_KEY"]}, timeout=TIMEOUT)
    results['shodan'] = shodan.json()

    # AbuseIPDB
    abuse = requests.get("https://api.abuseipdb.com/api/v2/check",
                        headers={"Key": os.environ["ABUSEIPDB_KEY"], "Accept": "application/json"},
                        params={"ipAddress": ip, "maxAgeInDays": "90"}, timeout=TIMEOUT)
    results['abuseipdb'] = abuse.json()

    return results
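
Raw API responses still need to be reduced to something an analyst or a SOAR playbook can act on. The sketch below condenses the dictionary returned by `enrich_ip` into a verdict. `summarize_enrichment` and its thresholds are hypothetical and should be tuned to your environment; the field paths (`last_analysis_stats.malicious`, `abuseConfidenceScore`, `ports`) are assumed from the respective providers' JSON responses.

```python
def summarize_enrichment(results, vt_threshold=3, abuse_threshold=50):
    """Reduce enrich_ip() output to a verdict plus the reasons behind it."""
    vt_stats = (results.get("virustotal", {}).get("data", {})
                .get("attributes", {}).get("last_analysis_stats", {}))
    malicious = vt_stats.get("malicious", 0)
    abuse_score = (results.get("abuseipdb", {}).get("data", {})
                   .get("abuseConfidenceScore", 0))
    ports = sorted(results.get("shodan", {}).get("ports", []))

    reasons = []
    if malicious >= vt_threshold:
        reasons.append(f"{malicious} VirusTotal engines flag the IP as malicious")
    if abuse_score >= abuse_threshold:
        reasons.append(f"AbuseIPDB confidence score is {abuse_score}")

    # Two independent sources agreeing is treated as strong enough to block.
    if len(reasons) == 2:
        verdict = "block"
    elif reasons:
        verdict = "review"
    else:
        verdict = "no-signal"
    return {"verdict": verdict, "reasons": reasons, "open_ports": ports}
```

Requiring agreement between two independent sources before returning `block` is a design choice that trades some speed for fewer false positives; a single source alone only escalates the IP to `review`.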

The most valuable threat intelligence is timely, specific, and actionable. A report noting "APT28 generally targets defense contractors" is strategic but not immediately actionable. An alert stating "APT28 C2 domain resolving to 198.51.100.42 is identical to infrastructure used against organizations in your sector last week — block it now" is technical intelligence that drives immediate defensive action.

Published	Mar 27, 2026
Updated	Jul 13, 2026