BCP 47 Language Tags — The Web Standard for Locale Identifiers

Complete guide to BCP 47 (IETF) language tags covering syntax, subtag types, and practical examples for web applications.


Detailed Explanation

What Is BCP 47?

BCP 47 (Best Current Practice 47) is the IETF standard for identifying languages. It currently comprises two documents: RFC 5646, which defines the syntax of language tags, and RFC 4647, which defines how to match them. These tags are used across the web: in HTTP headers, HTML, JavaScript's Intl API, and most internationalization frameworks.

Tag Structure

A BCP 47 tag is composed of subtags separated by hyphens:

language[-script][-region][-variant][-extension]

Subtag    Length                          Example         Description
Language  2-3 letters                     en, zh          ISO 639-1 code, or an ISO 639-2/639-3 code where no two-letter code exists
Script    4 letters                       Hans, Latn      ISO 15924 script code
Region    2 letters or 3 digits           US, 419         ISO 3166-1 alpha-2 code or UN M.49 area code
Variant   5-8 characters (or 4 if the     valencia, 1996  Registered dialect or spelling variant
          first is a digit)
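
The length rules in the table are enough to split a simple tag into its parts. The following is a minimal Python sketch, not a full RFC 5646 parser: the names `LanguageTag` and `parse_tag` are hypothetical, extended language subtags, extensions, private-use subtags, and grandfathered tags are deliberately not handled, and no subtag is checked against the registry. Tags are case-insensitive, so the sketch normalizes to the conventional casing (lowercase language, title-case script, uppercase region).

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageTag:
    """Hypothetical container for the four subtag types in the table."""
    language: str
    script: Optional[str] = None
    region: Optional[str] = None
    variants: List[str] = field(default_factory=list)


def parse_tag(tag: str) -> LanguageTag:
    """Split a simple BCP 47 tag by subtag length and character class.

    Assumption: only language, script, region, and variant subtags occur.
    Anything else raises ValueError.
    """
    subtags = tag.split("-")
    language = subtags.pop(0)
    if not re.fullmatch(r"[A-Za-z]{2,3}", language):
        raise ValueError(f"unsupported language subtag: {language!r}")
    result = LanguageTag(language=language.lower())

    # Script: exactly four letters, directly after the language.
    if subtags and re.fullmatch(r"[A-Za-z]{4}", subtags[0]):
        result.script = subtags.pop(0).title()
    # Region: two letters or three digits.
    if subtags and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", subtags[0]):
        result.region = subtags.pop(0).upper()
    # Variants: 5-8 alphanumerics, or a digit followed by three alphanumerics.
    for subtag in subtags:
        if not re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", subtag):
            raise ValueError(f"unexpected subtag: {subtag!r}")
        result.variants.append(subtag.lower())
    return result
```

For example, `parse_tag("zh-hant-tw")` yields language `zh`, script `Hant`, and region `TW`, while `parse_tag("ca-valencia")` yields language `ca` with the single variant `valencia`.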

Real-World Examples

Tag         Meaning
en          English (any region)
en-US       English as used in the United States
en-GB       English as used in the United Kingdom
zh-Hans     Chinese written in Simplified Han script
zh-Hant-TW  Chinese written in Traditional Han script, as used in Taiwan
sr-Latn-RS  Serbian written in Latin script, as used in Serbia
pt-BR       Portuguese as used in Brazil
es-419      Spanish as used in Latin America and the Caribbean (UN M.49 area code 419)

Matching Rules

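RFC 4647, the matching half of BCP 47, defines three ways to compare a language range (what the user asks for) against language tags (what is available):

  1. Basic Filtering: prefix matching on whole subtags (the range en matches en and en-US); returns every matching tag
  2. Extended Filtering: like basic filtering, but the range may contain wildcard (*) subtags
  3. Lookup: returns the single best match by progressively truncating subtags from the end of the range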
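
The three schemes above can be sketched in a few lines of Python. This is an illustrative reading of RFC 4647, not a reference implementation: the function names are hypothetical, each function takes a single language range rather than a full priority list, and the tags are assumed to be well-formed already.

```python
from typing import List, Optional


def basic_filter(language_range: str, tags: List[str]) -> List[str]:
    """Return every tag equal to the range or starting with the range plus '-'."""
    rng = language_range.lower()
    if rng == "*":
        return list(tags)
    return [t for t in tags if t.lower() == rng or t.lower().startswith(rng + "-")]


def extended_match(language_range: str, tag: str) -> bool:
    """Extended filtering test for one range and one tag ('*' is a wildcard)."""
    range_parts = language_range.lower().split("-")
    tag_parts = tag.lower().split("-")
    # The first subtags must match, unless the range starts with a wildcard.
    if range_parts[0] != "*" and range_parts[0] != tag_parts[0]:
        return False
    i = 1
    for part in range_parts[1:]:
        if part == "*":
            continue  # a later wildcard places no constraint on the tag
        while True:
            if i >= len(tag_parts):
                return False  # the range has a subtag the tag lacks
            if tag_parts[i] == part:
                i += 1
                break
            if len(tag_parts[i]) == 1:
                return False  # never skip past a singleton subtag
            i += 1  # skip an unmatched subtag in the tag
    return True


def lookup(language_range: str, tags: List[str],
           default: Optional[str] = None) -> Optional[str]:
    """Return the single best tag by truncating the range from the right."""
    available = {t.lower(): t for t in tags}
    parts = language_range.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in available:
            return available[candidate]
        parts.pop()
        # A trailing single-character subtag is dropped together with its successor.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return default
```

Filtering suits cases where several results are useful, such as listing all available English content. Lookup suits cases where exactly one result is needed, such as choosing the locale in which to render a page, which is why it accepts a default.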

The IANA Subtag Registry

All valid subtags are listed in the IANA Language Subtag Registry. This is the authoritative source for language, script, region, and variant subtags.
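
The registry is published as a plain-text file of records separated by `%%` lines, each record a list of `Field: value` lines. As a rough illustration, the Python sketch below parses an abbreviated, hand-copied sample in that layout. `SAMPLE`, `parse_registry`, and `subtags_of_type` are hypothetical names, and real records carry additional fields that the sample omits.

```python
import re
from typing import Dict, List, Set

# Abbreviated sample in the registry's record layout (illustration only).
SAMPLE = """\
%%
Type: language
Subtag: en
Description: English
%%
Type: script
Subtag: Hans
Description: Han (Simplified variant)
%%
Type: region
Subtag: 419
Description: Latin America and the Caribbean
%%
Type: variant
Subtag: valencia
Description: Valencian
"""

Record = Dict[str, List[str]]


def parse_registry(text: str) -> List[Record]:
    """Parse '%%'-separated records into field-name -> list-of-values maps."""
    records: List[Record] = []
    for chunk in re.split(r"^%%[ \t]*$", text, flags=re.MULTILINE):
        record: Record = {}
        last_field = ""
        for line in chunk.splitlines():
            if not line.strip():
                continue
            if line[0].isspace() and last_field:
                # Folded continuation line: append to the previous value.
                record[last_field][-1] += " " + line.strip()
                continue
            name, _, value = line.partition(":")
            last_field = name.strip()
            # A field such as Description may repeat, so values are lists.
            record.setdefault(last_field, []).append(value.strip())
        if record:
            records.append(record)
    return records


def subtags_of_type(records: List[Record], subtag_type: str) -> Set[str]:
    """Collect the Subtag values of all records with the given Type."""
    return {r["Subtag"][0] for r in records
            if r.get("Type") == [subtag_type] and "Subtag" in r}
```

A set built this way could back a validation step, for instance checking that a region subtag taken from user input is actually registered.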

Use Case

BCP 47 tags appear throughout the web platform: in HTML lang attributes, HTTP Accept-Language headers, the JavaScript Intl API, the CSS :lang() selector, and CLDR locale data. Major internationalization libraries such as i18next, react-intl, and vue-i18n also use them to identify locales.
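
As one concrete case, an HTTP Accept-Language header is a comma-separated list of language ranges, each with an optional quality weight such as `;q=0.8`. The Python sketch below turns a header into an ordered preference list. `parse_accept_language` is a hypothetical helper, and it assumes `q` is the only parameter present, which is the usual case.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language range, quality) pairs, highest quality first."""
    ranges: List[Tuple[str, float]] = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        language_range, _, params = item.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip items with a malformed weight
        if quality > 0:  # q=0 marks a range as not acceptable
            ranges.append((language_range.strip(), quality))
    # sorted() is stable, so ranges with equal weight keep header order.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)
```

Each range in the resulting list can then be compared against the locales an application supports, in order, until one matches.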
