text-wrap with Japanese, Chinese, and Thai Text
How text-wrap: balance and pretty behave with CJK and Thai text, where words are not separated by spaces.
Detailed Explanation
Why CJK Wrapping is Different
English line-breaking algorithms work on word boundaries (spaces). Japanese, Chinese, and Thai do not use spaces between words. Instead:
- Japanese can break between most characters (kana and kanji), with rules forbidding certain combinations like opening punctuation at line-end (kinsoku shori).
- Chinese can break between any two Han characters; punctuation rules apply.
- Thai must use a dictionary or rule-based segmenter to find word boundaries (this is hard).
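The strictness of the punctuation rules mentioned above can be tuned in CSS with the line-break property. The following is a minimal sketch; the class names are hypothetical and chosen only for illustration:

```css
/* Hypothetical class names, for illustration only. */
.kinsoku-strict {
  /* Most stringent rule set, e.g. no break before small kana */
  line-break: strict;
}

.kinsoku-loose {
  /* Least restrictive rule set; can help in very narrow columns */
  line-break: loose;
}
```

The default value, auto, leaves the choice to the browser, so an explicit value is only worth setting where the default breaks look wrong.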
How browsers handle CJK with text-wrap
For Japanese and Chinese, break opportunities are much denser: they occur between nearly every pair of characters, subject to kinsoku constraints. text-wrap: balance therefore has many more break candidates to consider. The algorithm still works, and it can even out line lengths at a much finer grain than it can with space-separated words.
text-wrap: pretty is less impactful for CJK because its orphan-prevention notion (a single short word left alone on the last line) doesn't translate directly to character-based breaking.
Practical recommendations
For Japanese/Chinese headings:
h1[lang="ja"], h1[lang="zh"] {
  text-wrap: balance;
  /* Headings can use a tighter line-height than CJK body text */
  line-height: 1.4;
}
For Japanese/Chinese body text:
article[lang="ja"] p, article[lang="zh"] p {
  /* pretty has minimal effect; default wrapping is usually fine */
  text-wrap: wrap;
  line-height: 1.7; /* CJK needs more breathing room */
}
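Use word-break: keep-all if you want to suppress breaks between CJK characters, so that each run of characters is treated as a unit and lines break only at spaces and punctuation. Note that keep-all does not detect word boundaries. Pairing it with overflow-wrap: anywhere lets a long run still break when it would otherwise overflow its container:
.no-cjk-break {
  word-break: keep-all;
  overflow-wrap: anywhere;
}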
Mixed-script text
If your text mixes English and Japanese ("React の useState は便利"), the browser applies different rules to each script segment automatically. balance and pretty work with this, but you may see slight asymmetries because the two scripts have different break-point densities.
Thai
Thai requires word segmentation to find break points. Modern browsers ship a Thai dictionary and use it to locate word boundaries. For Thai content, text-wrap: pretty and balance work the same way as they do for English, just based on the dictionary-derived word boundaries instead of spaces.
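Since Thai behaves like English once word boundaries are known, the same split between headings and body text applies. A minimal sketch, assuming the lang attribute sits directly on the matched elements as in the selectors used earlier in this article:

```css
/* Illustrative selectors, assuming lang="th" is set on the h1 or article itself. */
h1[lang="th"] {
  /* Even out line lengths using the dictionary-derived word breaks */
  text-wrap: balance;
}

article[lang="th"] p {
  /* Avoid a lone short word on the last line, where supported */
  text-wrap: pretty;
}
```

Browsers that do not recognize a text-wrap value ignore the declaration and fall back to normal wrapping.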
Setting the lang attribute
For all of the above to work correctly, set the document or block language:
<html lang="ja">
or per-element:
<p lang="zh">這是中文段落。</p>
Without lang, the browser falls back to language-neutral defaults, which can produce poorer results for CJK, for example in punctuation handling and font selection.
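One caveat with attribute selectors such as h1[lang="ja"]: they match only when the attribute is on the h1 itself. The :lang() pseudo-class matches on the element's inherited language, so it also applies when lang is set on html or another ancestor. A minimal sketch, with selectors chosen for illustration:

```css
/* Matches any h1 whose language is Japanese or Chinese, including
   subtags such as zh-Hant, even when lang is set on an ancestor. */
h1:lang(ja),
h1:lang(zh) {
  text-wrap: balance;
}

/* Attribute selector for comparison: matches only an h1 that
   carries lang="ja" itself. */
h1[lang="ja"] {
  text-wrap: balance;
}
```

This matters most on pages that switch language at runtime by changing the lang attribute on html, since :lang() rules follow the change without extra attributes on each element.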
Use Case
Internationalized sites with CJK content, Japanese e-commerce, Chinese tech documentation, Thai-language news sites, multilingual marketing pages that switch language at runtime, Asian app dashboards.