AI/LLM Security
Training Data Extraction from LLMs
Large language models do not only generalise; they also memorise, and memorised text can be pulled back out word for word. Feed a model the right prefix and it may complete a phone number it saw in its training crawl; make it repeat a word forever and it can fall out of chat mode and dump verbatim training data. Here is why memorisation happens, the divergence trick that triggered it on a live model, and why deduplication is the main defence.
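As a rough illustration of the deduplication idea, the sketch below drops repeated documents from a toy corpus before it would be used for training. It is a minimal example under stated assumptions, not a production pipeline: the helper names `normalise` and `deduplicate` and the sample corpus are hypothetical, and it only catches copies that are identical after case and whitespace normalisation. Near-duplicates and repeated substrings inside longer documents would need more than this.

```python
import hashlib
import re


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def deduplicate(documents: list[str]) -> list[str]:
    """Keep the first occurrence of each document, comparing normalised hashes."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in documents:
        digest = hashlib.sha256(normalise(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


# Hypothetical toy corpus: the first two entries differ only in case and spacing.
corpus = [
    "Call me on 555-0100.",
    "call me on   555-0100.",
    "An unrelated sentence.",
]
deduped = deduplicate(corpus)
```

The intuition is that a string the model sees many times is more likely to be memorised, so reducing repeats lowers the chance that it can be extracted verbatim later.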