URL slug rules: transliteration, normalization, and SEO impact

The “url-slug-rules” portion of a URL such as /blog/url-slug-rules/ is the slug. This article covers how to generate slugs, how to handle special and non-ASCII characters, and what slugs mean for SEO.

What’s a slug

A slug is the part of a URL path that identifies a specific resource in readable form:

https://example.com/blog/how-to-make-pizza/
                         ─────────────────
                         ↑ slug

Compared to a numeric ID (/blog/12345), a slug:

  • Carries meaning for humans.
  • Helps SEO (keywords in URL).
  • Reveals content in link previews.

Basic generation rules

A common pipeline from title to slug:

  1. Lowercase everything.
  2. Replace whitespace with hyphens (-).
  3. Strip punctuation and symbols (!?,.()[]"'&).
  4. Collapse consecutive hyphens.
  5. Trim leading/trailing hyphens.
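
The five steps above can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: the name `basic_slug` is made up for illustration, and the stripped set includes `&` so that a title like "7 Tips & Tricks" comes out as "7-tips-tricks".

```python
import re

# Characters removed outright in step 3. Including "&" is an assumption
# made for this sketch.
_PUNCTUATION = re.compile(r"""[!?,.()\[\]"'&]""")


def basic_slug(title: str) -> str:
    slug = title.lower()                  # 1. lowercase everything
    slug = re.sub(r"\s+", "-", slug)      # 2. whitespace -> hyphen
    slug = _PUNCTUATION.sub("", slug)     # 3. strip punctuation
    slug = re.sub(r"-{2,}", "-", slug)    # 4. collapse consecutive hyphens
    return slug.strip("-")                # 5. trim leading/trailing hyphens


print(basic_slug("How to Make Pizza!"))   # how-to-make-pizza
print(basic_slug("What's New?"))          # whats-new
print(basic_slug("7 Tips & Tricks"))      # 7-tips-tricks
```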

Examples:

"How to Make Pizza!" → "how-to-make-pizza"
"What's New?"        → "whats-new"
"7 Tips & Tricks"    → "7-tips-tricks"

Hyphens vs underscores

  • Hyphen (-): Google treats it as a word separator.
  • Underscore (_): Google has historically treated it as joining the words on either side.

For SEO, prefer hyphens: “new-post” reads as two words, while “new_post” may be read as a single token.

Non-ASCII (CJK, etc.)

Several strategies for non-ASCII titles:

1. URL-encode the original

"日本語のスラグ" → "%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%81%AE%E3%82%B9%E3%83%A9%E3%82%B0"

Long, hard to read, often broken when copied.
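
To see how much percent-encoding inflates a slug, the standard library's `urllib.parse.quote` can reproduce the example above. This is only an illustration of the length cost; the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

title = "日本語のスラグ"

# safe="" forces every non-unreserved character to be escaped.
# quote() encodes the text as UTF-8 by default, so each of these
# characters becomes three bytes, i.e. nine characters in the URL.
encoded = quote(title, safe="")

print(encoded)
print(len(title), "characters ->", len(encoded), "characters")

# Decoding gets the original title back.
assert unquote(encoded) == title
```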

2. Transliterate

"日本語のスラグ" → "nihongo-no-suragu"

Readable, but “猫” and “ねこ” both become “neko”, so distinct titles can collide.
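
Transliterating Japanese needs a dictionary of readings, which Python's standard library does not provide. As a smaller illustration of the same idea and the same collision risk, the sketch below folds accented Latin letters to ASCII using Unicode decomposition. The name `strip_accents` is chosen for this example, and the approach does nothing useful for CJK text.

```python
import unicodedata


def strip_accents(text: str) -> str:
    # NFKD splits "é" into "e" plus a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Crème Brûlée"))  # Creme Brulee

# Two different titles now map to the same text: a collision.
print(strip_accents("résumé") == strip_accents("resume"))  # True
```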

3. Separate English slug

Title: "日本語のスラグ"
Slug:  "japanese-slug-rules"

Most flexible. Standard for multilingual sites.

4. Numeric ID

"日本語のスラグ" → "post-123"

Worse for SEO but mechanically easy.

Length limits

Technical bounds:

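  • Full URL: the specification sets no limit; staying under roughly 2,000 characters is the conservative bound, and many servers reject request lines beyond about 8,000 bytes by default.
  • Slug alone: no hard limit, but 50–80 characters is a practical maximum.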

Overly long slugs:

  • Wrap awkwardly in email/chat link previews.
  • Get truncated by character limits (Twitter, SMS).
  • Risk being flagged as keyword stuffing.
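
A length cap is easy to get slightly wrong, because cutting at a fixed index can leave half a word at the end. The sketch below cuts at the last hyphen instead. The function name `truncate_slug` and the default of 60 characters are assumptions picked from the 50–80 range above.

```python
def truncate_slug(slug: str, max_length: int = 60) -> str:
    if len(slug) <= max_length:
        return slug
    cut = slug[:max_length]
    # If the cut landed in the middle of a word, drop the partial word.
    # A single word longer than the cap is simply cut at the cap.
    if slug[max_length] != "-" and "-" in cut:
        cut = cut.rsplit("-", 1)[0]
    return cut.strip("-")


print(truncate_slug("how-to-make-pizza-at-home", 12))  # how-to-make
print(truncate_slug("how-to-make-pizza", 11))          # how-to-make
```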

Stop-word removal

Some pipelines drop low-information words (“the”, “a”, “of”, “to”):

"The Best Way to Use Git" → "best-way-use-git"

Pro: shorter slugs with denser keywords. Con: meaning is sometimes lost (“The Who” becomes “who”).

Some CMSs and SEO plugins offer this as a setting.
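
Stop-word removal can be applied to an already generated slug. In this sketch the word list and the name `drop_stop_words` are assumptions; real lists are longer and language-specific.

```python
# Assumed, deliberately tiny stop-word list for illustration.
STOP_WORDS = frozenset({"a", "an", "the", "of", "to"})


def drop_stop_words(slug: str) -> str:
    kept = [word for word in slug.split("-") if word not in STOP_WORDS]
    # If every word was a stop word, keep the original rather than
    # returning an empty slug.
    return "-".join(kept) if kept else slug


print(drop_stop_words("the-best-way-to-use-git"))  # best-way-use-git
```

Falling back to the original slug when nothing is left is one possible choice; falling back to an ID would work too.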

Collision handling

When the slug already exists:

1. Append a counter

"how-to-cook" → "how-to-cook"
"how-to-cook" (2nd) → "how-to-cook-2"

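This is WordPress’s default behavior. Static site generators such as Hugo and Jekyll generally do not deduplicate slugs automatically, so collisions there have to be resolved by hand.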

2. Append a date

"how-to-cook" → "how-to-cook-2024-04-26"

Encodes time order in the URL.

3. Append an ID

"how-to-cook" → "how-to-cook-abc123"

Guarantees uniqueness.
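
The counter strategy (option 1) might look like the following sketch. It assumes the existing slugs fit in an in-memory set called `taken`; in a real application this would be a database lookup, ideally backed by a unique constraint to avoid races. `unique_slug` is a name chosen for this example.

```python
def unique_slug(base: str, taken: set[str]) -> str:
    if base not in taken:
        return base
    # Start at 2 so the second post becomes "how-to-cook-2".
    counter = 2
    while f"{base}-{counter}" in taken:
        counter += 1
    return f"{base}-{counter}"


existing = {"how-to-cook", "how-to-cook-2"}
print(unique_slug("how-to-cook", existing))   # how-to-cook-3
print(unique_slug("how-to-bake", existing))   # how-to-bake
```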

Case sensitivity

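URL paths are technically case-sensitive:

example.com/About and example.com/about can be different resources

Conventions:

  • Schemes and hostnames are case-insensitive (Example.com = example.com).
  • Paths are lowercased by convention.

Servers do not normalize case on their own. Many treat mixed-case paths as distinct resources or return 404, so configure an explicit 301 redirect to the lowercase form.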
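
One way to enforce the lowercase convention is to compute a redirect target whenever the requested path contains uppercase letters. The sketch below uses `urllib.parse` from the standard library; the function name `lowercase_redirect_target` is hypothetical, and it assumes slugs are plain ASCII without percent-escapes.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    lowered = parts.path.lower()
    if lowered == parts.path:
        return None
    # Only the path changes; the query string is left untouched because
    # its values may be case-sensitive.
    return urlunsplit(parts._replace(path=lowered))


print(lowercase_redirect_target("https://example.com/About"))  # https://example.com/about
print(lowercase_redirect_target("https://example.com/about"))  # None
```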

Safe characters

“Unreserved” URL characters per RFC 3986:

  • Alphanumeric — A-Z, a-z, 0-9.
  • Marks — -, ., _, ~.

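Other characters are either reserved (/, ?, # and similar, which carry structural meaning) or must be percent-encoded (%xx). Stick to a minimal set in slugs: lowercase letters, digits, and hyphens.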
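
Both character sets can be checked with regular expressions. In this sketch, `is_unreserved` tests the RFC 3986 unreserved set listed above, while `is_minimal_slug` tests a stricter, assumed house style of lowercase letters, digits, and single hyphens between words. Both names are illustrative.

```python
import re

# RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
_UNRESERVED = re.compile(r"[A-Za-z0-9\-._~]+")

# Stricter slug style (an assumption): lowercase words joined by single hyphens.
_MINIMAL_SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_unreserved(text: str) -> bool:
    return _UNRESERVED.fullmatch(text) is not None


def is_minimal_slug(text: str) -> bool:
    return _MINIMAL_SLUG.fullmatch(text) is not None


print(is_unreserved("My_File.v2~"))          # True
print(is_minimal_slug("My_File.v2~"))        # False
print(is_minimal_slug("how-to-make-pizza"))  # True
```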

SEO impact

Google’s guidance, as commonly summarized:

  • Short, keyword-bearing slugs preferred.
  • Hyphen-separated recommended.
  • Meaningful slugs marginally outperform opaque IDs.
  • Keyword stuffing is penalized.

“best-best-best-pizza-recipe-pizza-pizza” is bad.

Risk of changing URLs

After a URL is published, changing it:

  • Breaks existing links (404).
  • Invalidates bookmarks.
  • May reset SEO ranking.

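Mitigations:

  • 301-redirect old → new, and keep the redirect in place.
  • Resubmit the updated sitemap in Google Search Console so the new URL is recrawled sooner.

Links in old social shares can’t be edited, so they depend on the redirect.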

“Cool URIs don’t change” (Tim Berners-Lee).
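
When a slug is renamed more than once, redirects can pile up into chains. A small lookup that follows the chain lets every old URL redirect straight to the current one. This is a sketch under assumptions: the redirect table is a plain dict, the paths are invented, and `resolve_redirect` is not taken from any framework.

```python
def resolve_redirect(path: str, redirects: dict[str, str], max_hops: int = 10) -> str:
    """Follow old -> new mappings until a path with no further redirect is reached."""
    seen: set[str] = set()
    while path in redirects:
        if path in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {path!r}")
        seen.add(path)
        path = redirects[path]
    return path


# Hypothetical history of one post that was renamed twice.
redirects = {
    "/blog/how-to-cook/": "/blog/how-to-cook-pasta/",
    "/blog/how-to-cook-pasta/": "/blog/pasta-basics/",
}

print(resolve_redirect("/blog/how-to-cook/", redirects))  # /blog/pasta-basics/
```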

File system vs URL

OS path constraints are separate:

  • Windows: disallows < > : " / \ | ? * in file names.
  • macOS / Linux: only / and the null byte are disallowed.

URL slug rules are independent of the OS: the lowercase and hyphen conventions exist for SEO and readability, not for file-system compatibility.

Implementation checklist

When building a slug generator:

  • Unicode normalization (NFKC).
  • Lowercase.
  • Replace non-alphanumeric characters with hyphens (strip apostrophes first if “what’s” should become “whats” rather than “what-s”).
  • Collapse consecutive hyphens.
  • Trim leading/trailing hyphens.
  • Empty-string fallback (ID, etc.).
  • Collision check.
  • Length cap.
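
Put together, the checklist could look like the sketch below. It is one possible arrangement, not a definitive implementation: the name `make_slug`, the 60-character cap, the "post" fallback, and the in-memory `taken` set are all assumptions. It keeps ASCII letters and digits only, so a title with no ASCII characters falls through to the fallback.

```python
import re
import unicodedata
from typing import AbstractSet


def make_slug(
    title: str,
    taken: AbstractSet[str] = frozenset(),
    max_length: int = 60,
    fallback: str = "post",
) -> str:
    # Unicode normalization (NFKC folds full-width letters to ASCII), then lowercase.
    text = unicodedata.normalize("NFKC", title).lower()
    # Replace each run of non-alphanumeric characters with one hyphen,
    # which also takes care of collapsing consecutive hyphens.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Trim, apply the length cap, and trim again in case the cut ended on a hyphen.
    text = text.strip("-")[:max_length].rstrip("-")
    # Empty-string fallback.
    if not text:
        text = fallback
    # Collision check: append -2, -3, ... until the slug is free.
    candidate, counter = text, 2
    while candidate in taken:
        candidate = f"{text}-{counter}"
        counter += 1
    return candidate


print(make_slug("How to Make Pizza!"))            # how-to-make-pizza
print(make_slug("日本語のスラグ"))                  # post
print(make_slug("How to Cook", {"how-to-cook"}))  # how-to-cook-2
```

Note that the counter suffix is added after the length cap, so a deduplicated slug can exceed `max_length` by a few characters.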

Summary

  • Slugs make URLs human-readable.
  • Hyphens beat underscores for SEO.
  • For non-ASCII, transliterate or maintain a separate English slug.
  • Keep them short (50–80 chars) and meaningful.
  • 301-redirect when slugs change.

To generate a slug from a title, the slug generator on this site applies these rules.