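Some text editors prepend a UTF-8 BOM (0xEF 0xBB 0xBF), which breaks header() calls, JSON parsing, and PHP output: the three bytes are emitted ahead of everything else, so headers can no longer be sent, and json_decode() rejects a payload that starts with them. Detect and strip it defensively when ingesting external files.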
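A minimal sketch of the detect-and-strip step, assuming the file contents are already loaded into a string. The function name `strip_utf8_bom` is hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper: removes a single leading UTF-8 BOM, if present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";

    // Compare only the first three bytes; the rest of the string is untouched.
    if (strncmp($text, $bom, 3) === 0) {
        return substr($text, 3);
    }

    return $text;
}
```

A typical call site would be `json_decode(strip_utf8_bom(file_get_contents($path)), true)`, where `$path` stands in for whatever external file is being ingested.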
Convert any string into a clean URL-safe slug. Transliterates accented characters to ASCII, lowercases, replaces non-alphanumerics with dashes, and collapses runs of dashes. Works on UTF-8 input out of the box.
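One way the slug steps could be sketched is shown below. It assumes the intl extension's `transliterator_transliterate()` is preferred when available, and falls back to a deliberately tiny, illustrative character map otherwise. The name `slugify` and the fallback map are assumptions, not part of the original snippet.

```php
<?php
// Hypothetical helper: turns arbitrary UTF-8 text into a lowercase, dash-separated slug.
function slugify(string $text): string
{
    if (function_exists('transliterator_transliterate')) {
        // intl path: convert to Latin script, then fold accents to plain ASCII.
        $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
        if ($ascii !== false) {
            $text = $ascii;
        }
    } else {
        // Fallback path: an illustrative map only, far from complete
        // (uppercase accented letters, for example, are not covered).
        $text = strtr($text, [
            'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
            'ç' => 'c',
            'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
            'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
            'ñ' => 'n',
            'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o',
            'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        ]);
    }

    $text = strtolower($text);

    // Every run of non-alphanumeric bytes becomes one dash, which also
    // collapses repeated dashes and drops any bytes left untransliterated.
    $text = (string) preg_replace('/[^a-z0-9]+/', '-', $text);

    // Remove dashes left over at either end.
    return trim($text, '-');
}
```

Replacing whole runs with a single dash handles "replace" and "collapse" in one pass, so no second regex is needed.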
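Count words in a string in a way that works beyond ASCII, including text whose words are separated by non-ASCII whitespace or punctuation. Built on the \p{L} and \p{N} Unicode regex categories: each run of letters and digits counts as one word, so scripts written without word separators (such as Chinese or Thai) would still need a dedicated segmenter.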
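A minimal sketch of the counting approach, assuming a word is any unbroken run of characters in the \p{L} (letter) or \p{N} (number) categories. The function name `count_words_unicode` is hypothetical.

```php
<?php
// Hypothetical helper: counts runs of Unicode letters and digits.
function count_words_unicode(string $text): int
{
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $count = preg_match_all('/[\p{L}\p{N}]+/u', $text);

    // preg_match_all() returns false on error (for example, invalid UTF-8).
    return $count === false ? 0 : $count;
}
```

Under this definition an apostrophe or a standalone combining mark ends a run, so "don't" counts as two words. Widening the character class (for instance with \p{M}) is one possible adjustment if that matters for your input.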
Cleanly truncate a string to a maximum length without cutting a word or a multi-byte character in half. Adds a trailing ellipsis when truncated and never exceeds the requested length, ellipsis included.
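The truncation rules could be sketched roughly as follows, assuming the mbstring extension is loaded, that lengths are measured in characters rather than bytes, and that words are separated by plain spaces. The name `truncate_words` and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: truncates on a word boundary, ellipsis counted in the limit.
function truncate_words(string $text, int $maxLength, string $ellipsis = '…'): string
{
    if (mb_strlen($text, 'UTF-8') <= $maxLength) {
        return $text; // Already short enough; no ellipsis added.
    }

    // Characters available for text once the ellipsis is accounted for.
    $budget = $maxLength - mb_strlen($ellipsis, 'UTF-8');
    if ($budget <= 0) {
        // No room for any text: return as much of the ellipsis as fits.
        return mb_substr($ellipsis, 0, max(0, $maxLength), 'UTF-8');
    }

    // Take one extra character so a cut that lands exactly at the end of
    // a word is recognised as a clean boundary.
    $slice = mb_substr($text, 0, $budget + 1, 'UTF-8');
    $lastSpace = mb_strrpos($slice, ' ', 0, 'UTF-8');

    if ($lastSpace !== false && $lastSpace > 0) {
        // Back up to the last complete word.
        $slice = mb_substr($slice, 0, $lastSpace, 'UTF-8');
    } else {
        // A single long word: fall back to a hard cut at a character boundary.
        $slice = mb_substr($slice, 0, $budget, 'UTF-8');
    }

    return rtrim($slice) . $ellipsis;
}
```

For example, `truncate_words('The quick brown fox jumps', 12)` yields `The quick…`: the budget is 11 characters, and the last full word that fits ends at "quick".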
Wrap mb_* with sensible defaults so you stop accidentally chopping multi-byte characters in half. Defensive helpers for projects where input is not guaranteed ASCII.
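A small sketch of what such wrappers might look like, assuming the mbstring extension is loaded and that UTF-8 is the "sensible default" encoding. The `u_*` function names are hypothetical and not part of any library.

```php
<?php
// Hypothetical wrappers: each one pins the encoding to UTF-8 so callers
// never depend on the runtime's configured internal encoding.

function u_strlen(string $s): int
{
    return mb_strlen($s, 'UTF-8');
}

function u_substr(string $s, int $start, ?int $length = null): string
{
    // A null length means "to the end of the string".
    return mb_substr($s, $start, $length, 'UTF-8');
}

function u_upper(string $s): string
{
    return mb_strtoupper($s, 'UTF-8');
}

function u_lower(string $s): string
{
    return mb_strtolower($s, 'UTF-8');
}
```

The difference from the byte-oriented built-ins shows up immediately: `strlen('héllo')` is 6 because "é" occupies two bytes, whereas `u_strlen('héllo')` is 5, and `u_substr('héllo', 0, 2)` returns `hé` instead of a broken byte sequence.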