The Base64 Alphabet
Explore the 64-character Base64 alphabet (A-Z, a-z, 0-9, +, /). Learn how characters map to 6-bit values, variants like Base64url, and encoding mechanics.
Detailed Explanation
The Base64 encoding scheme uses exactly 64 characters to represent data, plus an optional 65th character (=) for padding. The standard alphabet defined in RFC 4648 is:
Value  Char    Value  Char    Value  Char    Value  Char
    0  A          16  Q          32  g          48  w
    1  B          17  R          33  h          49  x
    2  C          18  S          34  i          50  y
    3  D          19  T          35  j          51  z
    4  E          20  U          36  k          52  0
    5  F          21  V          37  l          53  1
    6  G          22  W          38  m          54  2
    7  H          23  X          39  n          55  3
    8  I          24  Y          40  o          56  4
    9  J          25  Z          41  p          57  5
   10  K          26  a          42  q          58  6
   11  L          27  b          43  r          59  7
   12  M          28  c          44  s          60  8
   13  N          29  d          45  t          61  9
   14  O          30  e          46  u          62  +
   15  P          31  f          47  v          63  /
Why 64? Because 64 is 2^6, meaning each Base64 character represents exactly 6 bits of data. This clean power-of-two relationship is what makes the encoding scheme efficient and reversible. Three input bytes (24 bits) map perfectly to four Base64 characters (4 x 6 = 24 bits) with no wasted bits.
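The 3-bytes-to-4-characters mapping can be sketched in a few lines of Python. This is an illustrative sketch, not a full encoder: the names `ALPHABET` and `encode_group` are made up for this example, and the function assumes a complete 3-byte group, so it does no padding.

```python
import base64
import string

# Standard alphabet from RFC 4648: A-Z, a-z, 0-9, then '+' and '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters (hypothetical helper)."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into one 24-bit integer.
    n = int.from_bytes(three_bytes, "big")
    # Slice out four 6-bit values, most significant first.
    indices = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Each 6-bit value is an index into the 64-character alphabet.
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))                  # TWFu
print(base64.b64encode(b"Man").decode())     # TWFu, from the standard library
```

For real work, the standard library's `base64.b64encode` handles arbitrary lengths and padding; the helper above only makes the bit slicing visible.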
Why these specific characters? The 62 alphanumeric characters (A-Z, a-z, 0-9) are safe in virtually all text-based systems. The two remaining characters (+ and /) were chosen because they are printable ASCII. However, they cause problems in URLs (where + is decoded as a space in form-encoded query strings and / is a path separator) and in filenames (where / is a directory separator).
Variant alphabets:
- Base64url (RFC 4648 Section 5): replaces + with - and / with _. Used in JWTs and URL parameters.
- Base64 for filenames (sometimes seen): replaces + with - and / with _, same as Base64url.
- Base64 for XML identifiers: replaces + with . and / with _.
- Base64 for regular expressions: replaces + with ! and / with -.
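The effect of swapping the last two characters can be seen with Python's standard `base64` module. The sketch below uses three input bytes picked so that the output contains only values 62 and 63. The `altchars` call for the XML-identifier variant is an assumption about how one might produce that alphabet; the standard library has no dedicated function for it.

```python
import base64

# 0xFB 0xEF 0xFF splits into the 6-bit values 62, 62, 63, 63.
data = bytes([0xFB, 0xEF, 0xFF])

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
# Any two replacement characters can be supplied through altchars.
xml_style = base64.b64encode(data, altchars=b"._").decode("ascii")

print(standard)   # ++//
print(url_safe)   # --__
print(xml_style)  # ..__

# Converting between variants is a plain character substitution.
assert standard.translate(str.maketrans("+/", "-_")) == url_safe

# Decoding must use the same alphabet that was used for encoding.
assert base64.urlsafe_b64decode(url_safe) == data
assert base64.b64decode(xml_style, altchars=b"._") == data
```

Only the characters for values 62 and 63 differ between these variants; the first 62 entries of the alphabet are the same in all of them.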
Case sensitivity: Base64 IS case-sensitive. A (value 0) and a (value 26) represent different 6-bit values. A decoder that performs case-insensitive matching will produce incorrect output. This is a common source of bugs when Base64 strings pass through systems that normalize text to lowercase.
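A short Python sketch of the lowercase-normalization bug described above, using the standard `base64` module. The sample string is arbitrary.

```python
import base64

original = b"Hello"
encoded = base64.b64encode(original).decode("ascii")
print(encoded)  # SGVsbG8=

# Simulate a system that normalizes text to lowercase.
lowered = encoded.lower()
print(lowered)  # sgvsbg8=

# The lowercased string is still syntactically valid Base64,
# so decoding succeeds but yields different bytes.
decoded = base64.b64decode(lowered)
print(decoded == original)  # False
```

Note that no exception is raised: every lowercase letter is itself a valid alphabet character, so the corruption is silent and only shows up when the decoded bytes are compared or used.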
Use Case
Validating user-submitted Base64 strings in a web form by checking that every character belongs to the valid Base64 alphabet before attempting to decode.
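One possible way to implement such a check in Python is sketched below. The function name `is_valid_base64` and the pattern `BASE64_RE` are hypothetical, and the sketch assumes the standard alphabet with mandatory `=` padding; Base64url or unpadded input would need a different pattern.

```python
import base64
import binascii
import re

# Zero or more full 4-character groups, optionally followed by one padded group.
BASE64_RE = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)


def is_valid_base64(text: str) -> bool:
    """Return True if text is well-formed standard Base64 (hypothetical helper)."""
    # Reject anything containing characters outside the alphabet,
    # or with a length or padding layout that cannot be valid.
    if BASE64_RE.fullmatch(text) is None:
        return False
    try:
        # validate=True makes the decoder raise on non-alphabet characters
        # instead of silently discarding them.
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(is_valid_base64("SGVsbG8="))   # True
print(is_valid_base64("SGVsbG8"))    # False: padding is missing
print(is_valid_base64("SGVs bG8="))  # False: contains a space
print(is_valid_base64("SGVsbG8-"))   # False: '-' is not in the standard alphabet
```

In this sketch the empty string is accepted, since it is the encoding of zero bytes; a form handler that requires non-empty input would need to check for that separately.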