How Binary Data Embeds in Image Pixels
Discover the step-by-step process of converting text to binary and embedding each bit into the least significant bits of image pixel channels.
Detailed Explanation
From Text to Pixels: The Embedding Pipeline
Embedding a text message into an image via LSB steganography involves a precise pipeline that converts human-readable characters into individual bits, then slots each bit into pixel data.
Step 1 — UTF-8 Encoding
The message is first encoded as UTF-8 bytes. For example, the letter "A" becomes:
"A" → 0x41 → 01000001
Non-ASCII characters take more than one byte. For example, "ü" (U+00FC) becomes two bytes, 0xC3 0xBC: 11000011 10111100.
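As a rough illustration of this step, the sketch below turns a string into a flat list of bits. The function name textToBits is hypothetical, and the most-significant-bit-first order is an assumption taken from the way the bytes are written out above.

```javascript
// Hypothetical helper: encode text as UTF-8, then expand each byte into 8 bits.
// Bit order (most significant bit first) is an assumption, not confirmed by the source.
function textToBits(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  const bits = [];
  for (const byte of bytes) {
    for (let shift = 7; shift >= 0; shift--) {
      bits.push((byte >> shift) & 1);
    }
  }
  return bits;
}

console.log(textToBits("A").join(""));      // 01000001
console.log(textToBits("\u00FC").join("")); // 1100001110111100 (the two bytes of "ü")
```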
Step 2 — Length Header
Before the message bits, a 32-bit length header is embedded, most significant byte first. It tells the extractor how many message bytes to read:
Message: "Hi" (2 bytes)
Header: 00000000 00000000 00000000 00000010
Without this header, the extractor would not know where the message ends and unmodified pixel data begins.
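One way to assemble the header and message into a single byte stream is sketched below. The names lengthHeaderBytes and buildPayload are hypothetical, and the big-endian byte order is an assumption inferred from the "Hi" example above.

```javascript
// Hypothetical helper: 32-bit length as four bytes, most significant byte first.
function lengthHeaderBytes(messageByteLength) {
  const header = new Uint8Array(4);
  // The third argument (false) selects big-endian byte order.
  new DataView(header.buffer).setUint32(0, messageByteLength, false);
  return header;
}

// Hypothetical helper: header followed by the UTF-8 message bytes.
function buildPayload(text) {
  const message = new TextEncoder().encode(text);
  const payload = new Uint8Array(4 + message.length);
  payload.set(lengthHeaderBytes(message.length), 0);
  payload.set(message, 4);
  return payload;
}

console.log(Array.from(buildPayload("Hi"))); // [0, 0, 0, 2, 72, 105]
```

Note that the header counts bytes, not characters, so a message containing "ü" is one byte longer than its character count.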
Step 3 — Bit Replacement
The combined header + message bit stream is iterated bit by bit. For each bit, the algorithm:
- Reads the next color channel value from the image (R, G, or B)
- Clears the least significant bit: value & 0xFE
- Sets the new LSB: value | messageBit
// Core embedding logic
pixels[i] = (pixels[i] & 0xFE) | bitValue;
Step 4 — Sequential Channel Traversal
Pixels are traversed left-to-right, top-to-bottom. Within each pixel, channels are visited in R → G → B order. Each channel stores exactly one bit of the payload, so each pixel carries three bits. An alpha channel, if present, is left unchanged.
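The bit replacement and traversal steps can be sketched together as follows. This assumes the pixel buffer uses the RGBA byte layout of canvas ImageData.data, which is already in left-to-right, top-to-bottom order. The function name embedBits is hypothetical.

```javascript
// Hypothetical helper: write one payload bit into the LSB of each R, G and B byte.
// Assumes an RGBA layout (4 bytes per pixel), so every 4th byte (alpha) is skipped.
function embedBits(rgba, bits) {
  const capacityBits = (rgba.length / 4) * 3;
  if (bits.length > capacityBits) {
    throw new RangeError("payload does not fit in image");
  }
  let bitIndex = 0;
  for (let i = 0; i < rgba.length && bitIndex < bits.length; i++) {
    if (i % 4 === 3) continue; // skip the alpha byte
    rgba[i] = (rgba[i] & 0xfe) | bits[bitIndex++];
  }
  return rgba;
}

// Two opaque pixels: (255, 254, 0) and (1, 128, 77)
const pixels = new Uint8ClampedArray([255, 254, 0, 255, 1, 128, 77, 255]);
embedBits(pixels, [0, 1, 1, 0, 1, 0]);
console.log(Array.from(pixels)); // [254, 255, 1, 255, 0, 129, 76, 255]
```

Each color value changes by at most 1, which is why the modification is hard to see in the resulting image.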
Capacity
An image of width W and height H can store:
capacity = (W × H × 3) / 8 bytes
Round the result down to a whole number of bytes, then subtract 4 bytes for the length header to get the usable payload capacity.
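The capacity formula can be checked with a small helper. The name usableCapacityBytes is hypothetical, and clamping to zero for very small images is an added assumption.

```javascript
const HEADER_BYTES = 4; // 32-bit length header

// Hypothetical helper: usable message bytes for a width x height image,
// at 3 bits per pixel (one each in R, G and B).
function usableCapacityBytes(width, height) {
  const totalBytes = Math.floor((width * height * 3) / 8);
  return Math.max(0, totalBytes - HEADER_BYTES);
}

console.log(usableCapacityBytes(800, 600)); // 179996
```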
Practical Consideration
The entire process uses the Canvas API in the browser: getImageData() to read pixels and putImageData() to write them back. No server communication is needed, and the original image data never leaves your device.
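A minimal sketch of the Canvas round trip might look like the following. The function name embedInCanvas is hypothetical, and the code assumes the image has already been drawn onto the canvas and that bits is an array of 0/1 values containing the header and message.

```javascript
// Hypothetical helper: read pixels from a canvas, overwrite the RGB LSBs
// with payload bits, and write the modified pixels back.
function embedInCanvas(canvas, bits) {
  const ctx = canvas.getContext("2d");
  const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  const data = imageData.data; // Uint8ClampedArray in RGBA order

  if (bits.length > (data.length / 4) * 3) {
    throw new RangeError("payload does not fit in image");
  }

  let bitIndex = 0;
  for (let i = 0; i < data.length && bitIndex < bits.length; i++) {
    if (i % 4 === 3) continue; // leave alpha untouched
    data[i] = (data[i] & 0xfe) | bits[bitIndex++];
  }

  ctx.putImageData(imageData, 0, 0);
}
```

In a page, this would be called with a real canvas element, for example embedInCanvas(document.querySelector("canvas"), bits). The result should then be saved in a lossless format such as PNG, since lossy compression would likely destroy the embedded bits.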
Use Case
A developer building a steganography library needs to understand the exact bit-manipulation pipeline to implement encoding correctly in JavaScript or Python.