S3 URL with Special Characters in Object Key
Parse an S3 URL where the object key contains spaces, Unicode characters, or special characters that require URL encoding.
Virtual-Hosted Style
Detailed Explanation
Handling Special Characters in S3 Object Keys
S3 object keys can contain virtually any UTF-8 character, but when these keys appear in URLs, certain characters must be percent-encoded. Understanding how encoding works is critical for correctly constructing and parsing S3 URLs.
Example URL
https://media-bucket.s3.eu-west-1.amazonaws.com/uploads/2024/My%20Document%20%28final%29.pdf
Parsed Components
| Component | Value |
|---|---|
| Bucket | media-bucket |
| Key | uploads/2024/My Document (final).pdf |
| Region | eu-west-1 |
| Style | Virtual-Hosted |
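The breakdown above can be reproduced with Python's standard library. This is a minimal sketch, not a complete S3 URL parser: the function name `parse_s3_url` and the host pattern are illustrative assumptions, and it only handles regional virtual-hosted-style hosts of the form `<bucket>.s3.<region>.amazonaws.com`.

```python
import re
from urllib.parse import urlsplit, unquote

# Assumed host layout: <bucket>.s3.<region>.amazonaws.com
# (other S3 endpoint forms are deliberately not handled in this sketch)
_VHOST_PATTERN = re.compile(
    r"^(?P<bucket>.+)\.s3\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)

def parse_s3_url(url):
    """Split a virtual-hosted-style S3 URL into bucket, region, and decoded key."""
    parts = urlsplit(url)
    match = _VHOST_PATTERN.match(parts.hostname or "")
    if match is None:
        raise ValueError(f"not a recognized virtual-hosted-style S3 URL: {url}")
    # Drop only the single leading "/" that separates host from key,
    # then percent-decode. unquote() leaves "+" untouched.
    raw_key = parts.path[1:] if parts.path.startswith("/") else parts.path
    return {
        "bucket": match.group("bucket"),
        "key": unquote(raw_key),
        "region": match.group("region"),
        "style": "Virtual-Hosted",
    }

result = parse_s3_url(
    "https://media-bucket.s3.eu-west-1.amazonaws.com"
    "/uploads/2024/My%20Document%20%28final%29.pdf"
)
print(result["key"])  # uploads/2024/My Document (final).pdf
```

The key is decoded exactly once, after the URL has been split, so an encoded `%2F` or `%23` inside the key cannot be mistaken for a path separator or fragment delimiter.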
Characters That Need Encoding
| Character | Encoded | Notes |
|---|---|---|
| Space | %20 | Most common encoding issue |
| ( | %28 | Parentheses in filenames |
| ) | %29 | Parentheses in filenames |
| + | %2B | Often confused with space |
| # | %23 | Fragment delimiter |
| & | %26 | Query parameter separator |
Safe Characters
The following characters can appear unencoded when an S3 key is placed in a URL path: A-Z, a-z, 0-9, /, -, _, ., ~.
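Going the other direction, a raw key can be turned into a URL path with `urllib.parse.quote`. The sketch below assumes a hypothetical helper named `build_s3_url` and the regional virtual-hosted host form; `quote` leaves letters, digits, and `-`, `_`, `.`, `~` alone by default, and `safe="/"` keeps the slashes that separate key prefixes.

```python
from urllib.parse import quote

def build_s3_url(bucket, region, key):
    """Build a virtual-hosted-style URL from a raw (unencoded) object key."""
    # Percent-encode everything except unreserved characters and "/".
    # Non-ASCII characters are encoded as their UTF-8 bytes.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"

url = build_s3_url("media-bucket", "eu-west-1", "uploads/2024/My Document (final).pdf")
print(url)
# https://media-bucket.s3.eu-west-1.amazonaws.com/uploads/2024/My%20Document%20%28final%29.pdf

print(quote("a+b#c&d é.txt", safe="/"))
# a%2Bb%23c%26d%20%C3%A9.txt
```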
Common Pitfalls
- Plus sign vs space: In query strings, `+` means space, but in the path portion of a URL, `+` is a literal plus sign. S3 treats the path according to URL standards, so `+` in the key is literal. Encoding a literal plus as `%2B` removes any ambiguity for clients that decode it differently.
- Double encoding: Be careful not to encode a key that is already encoded. For example, encoding `My%20File` again produces `My%2520File`, which is a different key.
- Forward slashes: `/` characters in the key are NOT encoded because they define the "folder" hierarchy in S3 (even though S3 is flat storage).
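Each of these pitfalls can be reproduced in a few lines. The snippet below is an illustrative sketch using only Python's `urllib.parse`; it shows how the client-side encoding and decoding functions behave, not how the S3 service itself responds, and the sample strings are made up for the demonstration.

```python
from urllib.parse import quote, unquote, unquote_plus

# Plus sign vs space: path decoding keeps "+", form/query decoding does not.
path_segment = "a+b%20c.txt"
as_path = unquote(path_segment)        # "a+b c.txt"
as_query = unquote_plus(path_segment)  # "a b c.txt"

# Double encoding: quoting an already-encoded key escapes the "%" itself.
once = quote("My File")    # "My%20File"
twice = quote(once)        # "My%2520File"
decoded_twice = unquote(twice)  # "My%20File", a different key from "My File"

# Forward slashes: keep "/" in the safe set when encoding a whole key.
kept = quote("uploads/My File.pdf", safe="/")    # "uploads/My%20File.pdf"
escaped = quote("uploads/My File.pdf", safe="")  # "uploads%2FMy%20File.pdf"

print(as_path, as_query, twice, kept, escaped, sep="\n")
```

A practical rule that follows: store and pass around the raw key, and encode it exactly once at the point where the URL is built.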
Use Case
Troubleshooting broken download links in a web application where user-uploaded files with spaces and special characters in their names produce 404 errors due to incorrect URL encoding.