Base64 in Python
Complete guide to Base64 encoding and decoding in Python using the base64 module. Covers strings, files, URL-safe variants, and binary data handling.
Detailed Explanation
Python's base64 module (part of the standard library) provides a comprehensive set of functions for Base64 encoding and decoding. Unlike JavaScript, Python makes a clear distinction between byte strings (bytes) and text strings (str), which you must handle correctly.
Basic string encoding and decoding:
import base64
# Encoding: str -> bytes -> base64 bytes -> str
text = "Hello, World!"
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(encoded) # "SGVsbG8sIFdvcmxkIQ=="
# Decoding: str -> bytes -> decoded bytes -> str
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded) # "Hello, World!"
The key detail is that b64encode() takes bytes and returns bytes. You must .encode("utf-8") your string first, and .decode("ascii") the result if you need a string.
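The UTF-8 step matters most when the text is not plain ASCII. The sketch below wraps the two conversions in a pair of helpers; the names encode_text and decode_text are hypothetical and not part of the base64 module.

```python
import base64


def encode_text(text: str) -> str:
    """Hypothetical helper: str -> UTF-8 bytes -> Base64 bytes -> ASCII str."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Hypothetical helper: Base64 str -> raw bytes -> UTF-8 str."""
    return base64.b64decode(encoded).decode("utf-8")


original = "Grüße, 世界!"          # non-ASCII sample text
encoded = encode_text(original)    # contains only ASCII characters
restored = decode_text(encoded)    # identical to the original text
```

Decoding with the same character encoding that was used before encoding (UTF-8 here) is what makes the round trip lossless.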
File encoding:
import base64
# Encode a file
with open("image.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")
# Decode back to file
with open("output.png", "wb") as f:
    f.write(base64.b64decode(encoded))
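The example above reads the whole file into memory. For larger files, one possible approach is to encode in chunks whose size is a multiple of 3 bytes, so that padding can only appear at the very end. This is a minimal sketch; encode_stream and CHUNK_SIZE are hypothetical names, and it assumes read() returns full chunks until end of file, as regular files opened in binary mode do.

```python
import base64

CHUNK_SIZE = 3 * 1024  # multiple of 3, so no "=" padding appears mid-stream


def encode_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: Base64-encode binary stream src into dst.

    Assumes src.read(n) returns exactly n bytes except at end of file.
    """
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(base64.b64encode(chunk))
```

Both arguments are binary file objects, for example the results of open("image.png", "rb") and open("image.b64", "wb").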
URL-safe Base64:
import base64
data = b"any binary payload"  # sample input; any bytes object works
# Uses - and _ instead of + and /
encoded = base64.urlsafe_b64encode(data).decode("ascii")
decoded = base64.urlsafe_b64decode(encoded)
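To see the difference between the two alphabets, it helps to encode bytes that map onto the last two symbols. The two-byte input below is chosen purely for illustration.

```python
import base64

sample = b"\xfb\xff"  # bit pattern that produces alphabet indexes 62 and 63

standard = base64.b64encode(sample).decode("ascii")
url_safe = base64.urlsafe_b64encode(sample).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# Both forms decode back to the same bytes with their matching decoder.
assert base64.b64decode(standard) == sample
assert base64.urlsafe_b64decode(url_safe) == sample
```

Note that the URL-safe variant still emits = padding, which may itself need percent-encoding or stripping depending on where the value is placed in a URL.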
Other variants available:
- base64.b32encode() / b32decode() -- Base32 encoding
- base64.b16encode() / b16decode() -- Base16 (hexadecimal)
- base64.a85encode() / a85decode() -- Ascii85 encoding
- base64.encodebytes() -- adds newlines every 76 characters (MIME style)
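As a quick illustration of these variants, the sketch below runs each one on small sample inputs. The inputs are arbitrary and chosen only to keep the output short.

```python
import base64

data = b"hello"

b32 = base64.b32encode(data)
b16 = base64.b16encode(data)
print(b32)  # b'NBSWY3DP'
print(b16)  # b'68656C6C6F'

# Ascii85 round trip
a85 = base64.a85encode(data)
a85_restored = base64.a85decode(a85)

# encodebytes() breaks the output into lines of at most 76 characters,
# each terminated by a newline.
mime = base64.encodebytes(b"x" * 100)
line_lengths = [len(line) for line in mime.splitlines()]
print(line_lengths)  # [76, 60]
```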
Common mistakes in Python:
- Passing a str directly to b64encode() instead of bytes. This raises a TypeError.
- Using b64encode() when you need MIME-formatted output with line breaks. Use encodebytes() instead.
- Forgetting to open files in binary mode ("rb"/"wb"), which causes encoding errors or file corruption on Windows.
- Not handling padding correctly when decoding URL-safe Base64 from external sources. Such sources often strip the trailing = characters, and urlsafe_b64decode() raises binascii.Error ("Incorrect padding") when required padding is missing, so restore it before decoding.
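Two of these mistakes are easy to reproduce. The sketch below triggers the TypeError and then restores padding explicitly before decoding an unpadded URL-safe value; decode_urlsafe_unpadded is a hypothetical helper name, not a function in the base64 module.

```python
import base64
import binascii

# Mistake: passing str instead of bytes.
str_error = None
try:
    base64.b64encode("Hello")
except TypeError as exc:
    str_error = exc  # a bytes-like object is required


def decode_urlsafe_unpadded(value: str) -> bytes:
    """Hypothetical helper: re-add stripped '=' padding, then decode."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)


# Decoding an unpadded value directly fails in CPython.
padding_error = None
try:
    base64.urlsafe_b64decode("SGVsbG8")
except binascii.Error as exc:
    padding_error = exc

print(decode_urlsafe_unpadded("SGVsbG8"))  # b'Hello'
```

The expression -len(value) % 4 gives the number of = characters needed to bring the length up to a multiple of 4, and is zero when the value is already padded.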
Use Case
Encoding uploaded document files as Base64 strings for storage in a PostgreSQL JSONB column when a dedicated file storage service is not available.
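A minimal sketch of that use case follows, assuming the document is wrapped in a small JSON object before being written to the JSONB column. The field names (filename, size_bytes, content_b64) and both function names are hypothetical, and the database insert itself is left out.

```python
import base64
import json


def document_to_json(filename: str, content: bytes) -> str:
    """Hypothetical helper: build a JSON string holding a Base64 document."""
    payload = {
        "filename": filename,
        "size_bytes": len(content),  # size of the original, not the Base64 text
        "content_b64": base64.b64encode(content).decode("ascii"),
    }
    return json.dumps(payload)


def document_from_json(serialized: str) -> bytes:
    """Hypothetical helper: recover the original bytes from the JSON string."""
    payload = json.loads(serialized)
    return base64.b64decode(payload["content_b64"])


serialized = document_to_json("report.pdf", b"%PDF-1.7 sample bytes")
restored_bytes = document_from_json(serialized)
```

Base64 produces 4 output characters for every 3 input bytes, so the stored text is roughly one third larger than the original file.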