Parse an S3 Protocol URI (s3://)

Parse the s3:// protocol URI format used by AWS CLI, SDKs, and tools like Spark and Hadoop. Understand how it maps to HTTP endpoints.

Detailed Explanation

The s3:// Protocol URI

The s3:// protocol is not an HTTP URL — it is a URI scheme used by AWS tools (CLI, SDKs), Apache Spark, Hadoop, and other data processing frameworks to reference S3 objects. It provides a concise, human-readable format that abstracts away the HTTP endpoint details.

URI Structure

s3://BUCKET/KEY

Example

s3://data-lake-prod/raw/events/2024/01/15/events.parquet

Parsed Components

Component   Value
Bucket      data-lake-prod
Key         raw/events/2024/01/15/events.parquet
Region      (not embedded in URI)
Style       S3 Protocol
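
The split shown in the table can be sketched in a few lines of Python. This is a minimal illustration, not the parser behind any AWS tool: the names S3Uri and parse_s3_uri are hypothetical, and the sketch assumes everything after the first slash following the bucket is the key. It splits the string by hand rather than with a general URL parser so that characters such as "?" and "#", which are legal in S3 keys, stay part of the key.

```python
from typing import NamedTuple


class S3Uri(NamedTuple):
    """Bucket and key extracted from an s3:// URI (hypothetical type)."""
    bucket: str
    key: str


def parse_s3_uri(uri: str) -> S3Uri:
    """Split s3://BUCKET/KEY into its two components (hypothetical helper)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return S3Uri(bucket=bucket, key=key)
```

Calling parse_s3_uri("s3://data-lake-prod/raw/events/2024/01/15/events.parquet") yields the bucket data-lake-prod and the key raw/events/2024/01/15/events.parquet. A URI with no key, such as s3://data-lake-prod, yields an empty key.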

Where s3:// Is Used

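Context        Example
AWS CLI        aws s3 cp s3://bucket/key ./local-file
AWS SDK        Configuration values that name S3 locations (most SDK calls take the bucket and key as separate parameters)
Apache Spark   spark.read.parquet("s3://bucket/path")
Hadoop         hadoop fs -ls s3://bucket/prefix/ (on Amazon EMR; Apache Hadoop itself uses s3a://)
AWS Glue       Data catalog table locations
AWS Athena     Query result locations
Terraform      S3 backend state storage (configured with separate bucket and key arguments rather than an s3:// URI)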

Region Resolution

The s3:// URI does not include region information. The client resolves the region from one of the following sources:

  1. The AWS_DEFAULT_REGION environment variable
  2. The ~/.aws/config file profile
  3. The EC2 instance metadata (when running on AWS)
  4. The bucket's actual region (via a HEAD request on the bucket, which returns it in the x-amz-bucket-region response header)
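
The first two sources in the list can be checked locally, so they lend themselves to a short sketch. The function resolve_region below is hypothetical and only follows the order given above; real tools may apply additional sources or a different precedence. It assumes the shared config file uses a [default] section for the default profile and [profile NAME] sections for named profiles. Sources 3 and 4 need network access and are left out.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_region(
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Path] = None,
    profile: str = "default",
) -> Optional[str]:
    """Return a region from the environment or config file, else None."""
    if env is None:
        env = os.environ
    if config_path is None:
        config_path = Path.home() / ".aws" / "config"

    # 1. Environment variable
    region = env.get("AWS_DEFAULT_REGION")
    if region:
        return region

    # 2. Profile in the config file; named profiles live in
    #    sections of the form "profile NAME".
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is skipped silently
    section = "default" if profile == "default" else f"profile {profile}"
    if parser.has_option(section, "region"):
        return parser.get(section, "region")

    # 3 and 4 (instance metadata, bucket lookup) are not attempted here.
    return None
```

Passing env and config_path explicitly keeps the function easy to test; with no arguments it reads the real environment and ~/.aws/config.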

Related URI Schemes

  • s3a:// — Used by Hadoop 2.x+ for improved S3 access with the S3A filesystem connector.
  • s3n:// — Legacy Hadoop S3 native filesystem (deprecated).
  • s3-external-1: Not a URI scheme but a legacy endpoint hostname (s3-external-1.amazonaws.com) that routes explicitly to us-east-1. Rarely used.
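
Because s3://, s3a:// and s3n:// share the same BUCKET/KEY layout, moving a path between them is a matter of swapping the scheme. The sketch below assumes that is all a given job needs; with_scheme and HADOOP_S3_SCHEMES are hypothetical names, and the function does nothing about connector configuration or credentials.

```python
HADOOP_S3_SCHEMES = ("s3", "s3a", "s3n")


def with_scheme(uri: str, scheme: str) -> str:
    """Swap the scheme of an S3-style URI, leaving bucket and key untouched."""
    if scheme not in HADOOP_S3_SCHEMES:
        raise ValueError(f"unsupported target scheme: {scheme!r}")
    current, sep, rest = uri.partition("://")
    if not sep or current not in HADOOP_S3_SCHEMES:
        raise ValueError(f"not an S3-style URI: {uri!r}")
    return f"{scheme}://{rest}"
```

For example, with_scheme("s3n://bucket/prefix/part-0000", "s3a") returns "s3a://bucket/prefix/part-0000".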

Use Case

Converting between s3:// URIs used in AWS Glue job scripts and HTTPS URLs needed for browser-based access or API calls in a web application frontend.
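
One direction of that conversion can be sketched as follows. The function s3_uri_to_https is hypothetical and targets the virtual-hosted-style endpoint https://BUCKET.s3.REGION.amazonaws.com/KEY. Since the URI carries no region, the caller has to supply one. The sketch also assumes the object is reachable without request signing; private objects would additionally need a signed or presigned request, which is not shown.

```python
from urllib.parse import quote


def s3_uri_to_https(uri: str, region: str) -> str:
    """Build a virtual-hosted-style HTTPS URL from an s3:// URI."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    # Percent-encode the key but keep "/" as the path separator.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"
```

With the region us-east-1, the example URI from earlier becomes https://data-lake-prod.s3.us-east-1.amazonaws.com/raw/events/2024/01/15/events.parquet.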
