Accessibility · April 29, 2026 · 8 min read

html lang: One Attribute That Decides Pronunciation, Translation, and SEO

A single attribute — <html lang="en"> — controls whether a screen reader pronounces your page in the right voice or in robotic gibberish. The same attribute drives the browser's “Translate from Spanish?” toolbar, controls hyphenation, and tells search engines what language market the page is for.

1. What is the html-has-lang violation?

axe-core's html-has-lang rule requires the root <html> element to carry a lang attribute. A companion rule, html-lang-valid, requires the value to be a BCP 47 code (e.g., en, tr, en-GB).

Both map directly to WCAG 2.1 3.1.1 Language of Page (Level A) — the lowest tier of conformance. If you skip lang, you fail the easiest bar in accessibility.
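As a rough illustration of what a "valid BCP 47 code" looks like in practice, the sketch below uses the standard Intl.getCanonicalLocales function to test whether a value is a well-formed language tag. The helper name canonicalLangTag is hypothetical, and this is not how axe-core implements html-lang-valid. It checks structure only, so a made-up but well-shaped value can still pass.

```typescript
// Hypothetical helper: returns the canonical form of a language tag,
// or null when the value is not structurally well-formed.
function canonicalLangTag(value: string): string | null {
  try {
    // Throws RangeError for malformed input such as "" or "en_GB".
    const [canonical] = Intl.getCanonicalLocales(value.trim());
    return canonical ?? null;
  } catch {
    return null;
  }
}

console.log(canonicalLangTag("EN-gb")); // casing is normalized
console.log(canonicalLangTag("en_GB")); // underscore is not a valid separator
```

A null result means the value would not be usable as a lang value at all. A non-null result only says the shape is right, not that the language exists or matches the content.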

2. Why it matters

A lot of platforms read lang actively:

  • Screen-reader voice: NVDA and VoiceOver pick the speech engine from lang. A Turkish page left as lang="en" reads “merhaba” with an English engine — unintelligible.
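  • Browser translation: Chrome and Edge use lang as one signal when deciding whether to offer translation. A wrong value can suppress a prompt the user needs or trigger one they do not.
  • Spell check: Browsers can use the declared language to pick a spell-check dictionary. A wrong lang can flag correctly spelled words as typos.
  • Typography: CSS hyphens, the quotes property, and font selection all depend on the declared language.
  • Search-engine classification: Search engines weigh lang, hreflang, and the visible content, and they do not all treat lang the same way (Google's documentation says it relies on the visible content rather than lang). Keep all three consistent so a mismatch does not point the page at the wrong market.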

One attribute, multiple rails. Getting it wrong dings accessibility, UX, and SEO at the same time.

3. When does it fire?

  1. No lang attribute: <html> opens with no attributes. The most common offender on legacy templates.
  2. Empty lang: lang="" or whitespace.
  3. Invalid code: lang="eng" (BCP 47 says en), lang="english", or any made-up value. Triggers the html-lang-valid rule.
  4. Wrong code: A Turkish page set to lang="en". Passes the rule but breaks screen-reader pronunciation entirely.
  5. xml:lang vs lang mismatch: XHTML compatibility sets both, but the two disagree. axe-core's html-xml-lang-mismatch rule catches this.
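The cases above can be sketched as a small classifier over the raw markup. Everything here is an assumption made for illustration: the function names, the finding labels, the regex-based attribute reader, and the rough shape check (which does not consult the language registry, so a three-letter value such as eng slips through). It is not axe-core's implementation, and case 4 (a well-formed but wrong code) cannot be detected without knowing the content language.

```typescript
type LangFinding = "missing" | "empty" | "malformed" | "xml-lang-mismatch" | "ok";

// Reads one attribute from an opening tag; undefined when it is absent.
function readAttr(tag: string, name: string): string | undefined {
  const re = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)'|([^\\s>]+))`, "i");
  const m = re.exec(tag);
  if (!m) return undefined;
  return m[1] ?? m[2] ?? m[3] ?? "";
}

// Primary language subtag, lowercased: "en-GB" -> "en".
function primarySubtag(tag: string): string {
  return tag.trim().split("-")[0].toLowerCase();
}

function classifyHtmlLang(html: string): LangFinding {
  const open = /<html\b[^>]*>/i.exec(html);
  const tag = open ? open[0] : "";
  const lang = readAttr(tag, "lang");
  if (lang === undefined) return "missing";            // case 1
  if (lang.trim() === "") return "empty";              // case 2
  // Rough shape check only: 2-3 letter language, then optional subtags.
  if (!/^[a-z]{2,3}(-[a-z0-9]{2,8})*$/i.test(lang.trim())) return "malformed"; // case 3
  const xmlLang = readAttr(tag, "xml:lang");
  if (xmlLang !== undefined && primarySubtag(xmlLang) !== primarySubtag(lang)) {
    return "xml-lang-mismatch";                         // case 5
  }
  return "ok";
}

console.log(classifyHtmlLang('<html lang="english">'));
```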

4. The fixes

4.1 Single-language site

Wrong: No lang
<!DOCTYPE html>
<html>
  <head>...</head>
  <body>...</body>
</html>
Right: BCP 47 code for English
<!DOCTYPE html>
<html lang="en">
  <head>...</head>
  <body>...</body>
</html>

4.2 Regional variants

BCP 47 lets you append a region subtag (en-GB, en-US, pt-BR) or a script subtag (zh-Hant). Add one only when spelling, pronunciation, date format, or writing system actually differs. en-GB is conventional for British English; for many languages the bare two-letter code is enough.

Right
<html lang="en-GB">  <!-- British English -->
<html lang="pt-BR">  <!-- Brazilian Portuguese -->
<html lang="zh-Hant"> <!-- Traditional Chinese -->
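One way to see whether a region subtag makes a visible difference is to format the same date under each candidate tag. This is a minimal sketch: the function name formatForTags and the sample date are illustrative, and Intl does not read the page's lang attribute by itself, so the tags are passed in explicitly.

```typescript
// Formats one fixed date under several language tags so any regional
// difference becomes visible. The date and tag list are illustrative.
function formatForTags(date: Date, tags: string[]): Map<string, string> {
  const out = new Map<string, string>();
  for (const tag of tags) {
    // timeZone is pinned to UTC so the result does not depend on the host.
    out.set(tag, new Intl.DateTimeFormat(tag, { timeZone: "UTC" }).format(date));
  }
  return out;
}

const sample = new Date(Date.UTC(2026, 3, 29)); // 29 April 2026 (months are 0-based)
const formatted = formatForTags(sample, ["en-US", "en-GB"]);
console.log(formatted);
```

If two tags produce identical output for everything the page displays, the bare language code is probably sufficient.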

4.3 Multilingual sites

When you serve TR and EN versions, update lang on every variant. Leaving a single lang="en" in place across both means screen readers use the wrong engine on the Turkish version.

Right: Next.js root layout with dynamic lang
// app/layout.tsx
import { cookies } from "next/headers";

export default async function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  const c = await cookies();
  const locale = c.get("locale")?.value === "tr" ? "tr" : "en";

  return (
    <html lang={locale}>
      <body>{children}</body>
    </html>
  );
}
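The ternary in the layout works for two locales. As the list grows, the same decision could be pulled into a small pure function, sketched below. The names SUPPORTED_LOCALES, DEFAULT_LOCALE, and resolveLocale are hypothetical, and the cookie value is assumed to arrive as a plain string or undefined.

```typescript
// Hypothetical locale list; extend it as new language variants ship.
const SUPPORTED_LOCALES = ["tr", "en"] as const;
type SupportedLocale = (typeof SUPPORTED_LOCALES)[number];
const DEFAULT_LOCALE: SupportedLocale = "en";

// Maps an untrusted cookie value to a supported locale, falling back
// to the default for anything unknown or missing.
function resolveLocale(raw: string | undefined): SupportedLocale {
  const candidate = raw?.trim().toLowerCase();
  const match = SUPPORTED_LOCALES.find((locale) => locale === candidate);
  return match ?? DEFAULT_LOCALE;
}

console.log(resolveLocale("tr"), resolveLocale("de"));
```

Because the return type is limited to supported values, whatever reaches the lang attribute is always one of the tags the site actually serves.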

4.4 Mixed-language content within one page

A Turkish article that quotes English needs the quote marked with its own lang — that triggers the screen reader to switch engines mid-paragraph.

Right: Marking nested languages
<p>
  As Steve Jobs famously said:
  <q lang="en">Stay hungry, stay foolish.</q>
</p>
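When quotes are generated from data rather than written by hand, the lang attribute is easy to forget. A minimal sketch of a helper that always emits it, assuming the output is an HTML string; quoteInLanguage and escapeHtml are hypothetical names, not part of any library.

```typescript
// Escapes the characters that are unsafe in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wraps a quotation in <q> and tags it with its own language.
function quoteInLanguage(text: string, lang: string): string {
  return `<q lang="${escapeHtml(lang)}">${escapeHtml(text)}</q>`;
}

console.log(quoteInLanguage("Stay hungry, stay foolish.", "en"));
```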

4.5 SPA language switching

When an SPA toggles language at runtime, update document.documentElement.lang alongside the content.

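Right
type Locale = "tr" | "en" | "ar";

function setLocale(locale: Locale) {
  // Keep the root element's language in sync with the rendered content.
  document.documentElement.lang = locale;
  // Bonus: handle text direction too ("ar" is the only RTL locale here).
  document.documentElement.dir = locale === "ar" ? "rtl" : "ltr";
}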
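A variant that also copes with region-qualified tags such as ar-EG might look like the sketch below. The RootLike interface, the applyLocale name, and the short RTL list are assumptions for illustration; the list covers only a few well-known right-to-left languages and is not exhaustive.

```typescript
// Minimal shape of the element being updated. document.documentElement
// satisfies it, and so does a plain object in a unit test.
interface RootLike {
  lang: string;
  dir: string;
}

// Illustrative, non-exhaustive set of right-to-left languages.
const RTL_LANGUAGES = new Set(["ar", "he", "fa", "ur"]);

function applyLocale(root: RootLike, tag: string): void {
  // Direction depends on the primary language subtag, not the region.
  const language = tag.split("-")[0].toLowerCase();
  root.lang = tag;
  root.dir = RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

const fakeRoot: RootLike = { lang: "en", dir: "ltr" };
applyLocale(fakeRoot, "ar-EG");
console.log(fakeRoot);
```

In a browser the call would be applyLocale(document.documentElement, "ar-EG"). Passing the element in keeps the function testable without a DOM.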

5. How to pick the right code

BCP 47 tags combine ISO 639 language codes with optional ISO 15924 script codes and ISO 3166-1 region codes. Common patterns:

  • en — English (generic)
  • en-US, en-GB, en-AU — regional Englishes
  • tr — Turkish
  • de — German
  • de-CH, de-AT — regional Germans
  • fr — French
  • ar — Arabic (right-to-left; pair with dir="rtl")
  • zh-Hans, zh-Hant — Simplified / Traditional Chinese
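To see how the tags in the list break down into subtags, the standard Intl.Locale class can parse them. This is a small sketch; splitLangTag and TagParts are hypothetical names.

```typescript
interface TagParts {
  language: string;
  script: string | undefined;
  region: string | undefined;
}

// Splits a BCP 47 tag into its language, script, and region subtags.
// Intl.Locale throws a RangeError if the tag is not well-formed.
function splitLangTag(tag: string): TagParts {
  const locale = new Intl.Locale(tag);
  return {
    language: locale.language,
    script: locale.script,
    region: locale.region,
  };
}

console.log(splitLangTag("en-GB"));   // language plus region
console.log(splitLangTag("zh-Hant")); // language plus script
```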

6. Cases that get missed

  • iframes: Each embedded document needs its own <html lang>. Embedded maps, video players, payment iframes are all separate documents.
  • PDFs: Downloadable PDFs have their own embedded language metadata — WCAG covers them too.
  • HTML email templates: Frequently shipped with no lang. Screen readers may then read the message with the wrong voice.
  • Mixed-language single-document sites: When two languages share one HTML, set lang on <html> to the primary language and mark each block.
  • CJK pages stuck on "en": Typography, line breaking, and font selection all suffer. Always tag correctly.

7. How to test

  1. View source: One-second check on <html lang="...">.
  2. axe DevTools / Lighthouse: Both flag missing or invalid lang instantly.
  3. Screen-reader test: Run NVDA or VoiceOver and load the page. If Turkish content reads with an English voice, lang is wrong.
  4. Keysonar SEO Tools: Site-wide listing of every page's lang — surfaces inconsistencies across language variants and broken templates.
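A site-wide consistency check can also be approximated by hand. The sketch below assumes the pages have already been fetched, and that language variants live under a two-letter path prefix such as /tr/ or /en/. Both are assumptions about the site, and the names Page, LangReport, extractLang, and auditPages are hypothetical. It does not reflect how any particular crawler works.

```typescript
interface Page {
  url: string;
  html: string;
}

interface LangReport {
  url: string;
  lang: string | null;     // value found on <html>, if any
  expected: string | null; // language implied by the URL prefix, if any
  ok: boolean;
}

// Pulls the quoted lang value from the opening <html> tag.
function extractLang(html: string): string | null {
  const m = /<html\b[^>]*?\slang\s*=\s*["']([^"']*)["']/i.exec(html);
  return m ? m[1].trim() || null : null;
}

function auditPages(pages: Page[]): LangReport[] {
  return pages.map(({ url, html }) => {
    const lang = extractLang(html);
    // Assumed convention: the first path segment is the language code.
    const segment = new URL(url).pathname.split("/")[1] ?? "";
    const expected = /^[a-z]{2}$/.test(segment) ? segment : null;
    const primary = lang ? lang.split("-")[0].toLowerCase() : null;
    const ok = lang !== null && (expected === null || primary === expected);
    return { url, lang, expected, ok };
  });
}

const report = auditPages([
  { url: "https://example.com/tr/hakkinda", html: '<html lang="en"><body></body></html>' },
  { url: "https://example.com/en/about", html: '<html lang="en-GB"><body></body></html>' },
  { url: "https://example.com/tr/iletisim", html: "<html><body></body></html>" },
]);
console.log(report.filter((entry) => !entry.ok).map((entry) => entry.url));
```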

8. Quick checklist

  • Every page's <html> carries a lang attribute.
  • Lang value is a valid BCP 47 code (tr, en, en-GB, …).
  • Lang matches the actual content language.
  • Each language variant in a multilingual site sets its own lang.
  • SPA language toggles also update document.documentElement.lang.
  • Foreign-language quotes/terms inside the page carry their own lang.
  • iframes have their own <html lang>.
  • RTL languages also set dir="rtl" alongside lang.

9. References

  • WCAG 2.1 SC 3.1.1 — Language of Page — Level A
  • WCAG 2.1 SC 3.1.2 — Language of Parts — Level AA
  • BCP 47 — Tags for Identifying Languages
  • HTML Living Standard — The lang attribute
  • W3C: Choosing a Language Tag
