Unicode Escape Converter

Convert text to Unicode escape sequences (\uAC00) and back. Supports \uXXXX and \u{X} syntax.

About this tool

Unicode is a universal character encoding standard that assigns a unique code point to each character in the writing systems it covers, along with symbols, mathematical notation, and emoji. Code points are written as U+XXXX (4 to 6 hex digits) and range from U+0000 to U+10FFFF. That space holds 1,114,112 code points, of which over 140,000 are currently assigned to characters. Unicode is maintained by the Unicode Consortium and updated regularly as new scripts and symbols are added.
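As a small illustration of the U+XXXX notation, the Python sketch below formats a character's code point. The helper name `code_point_label` is made up for this example and is not part of the tool.

```python
def code_point_label(ch: str) -> str:
    """Return the code point of a single character in U+XXXX notation."""
    # ord() gives the code point as an integer; pad to at least 4 hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))           # U+0041
print(code_point_label("\uac00"))      # U+AC00 (Hangul syllable GA)
print(code_point_label("\U0001F600"))  # U+1F600 (grinning face emoji)
```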

Unicode escape sequences represent characters by their code point value rather than their literal glyph. In JSON, each \uXXXX escape is one UTF-16 code unit, so a single escape can represent any character in the Basic Multilingual Plane (U+0000 to U+FFFF). Characters outside this range require a surrogate pair: two consecutive \uXXXX escapes, a high surrogate (U+D800 to U+DBFF) followed by a low surrogate (U+DC00 to U+DFFF). Some languages can express any code point in a single escape: \UXXXXXXXX (uppercase U and exactly 8 hex digits) in Python, C, C++, and Go, or the braced \u{X} form in JavaScript (ES2015 and later), Rust, and Swift.
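The conversion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not this tool's actual implementation: the function names are hypothetical, ASCII is passed through unescaped, and a lone (unpaired) surrogate in the input to `unescape` raises an error rather than being repaired.

```python
import re


def escape_json_style(text: str) -> str:
    """Escape non-ASCII characters as \\uXXXX, using surrogate pairs above U+FFFF."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                      # ASCII stays literal
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")          # one UTF-16 code unit
        else:
            v = cp - 0x10000                    # 20-bit offset into the supplementary planes
            high = 0xD800 + (v >> 10)           # top 10 bits -> high surrogate
            low = 0xDC00 + (v & 0x3FF)          # bottom 10 bits -> low surrogate
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


def escape_braced(text: str) -> str:
    """Escape non-ASCII characters in the braced form, one escape per code point."""
    return "".join(ch if ord(ch) < 0x80 else f"\\u{{{ord(ch):X}}}" for ch in text)


# Matches either the braced form (1 to 6 hex digits) or the 4-digit form.
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")


def unescape(text: str) -> str:
    """Decode both escape forms, joining adjacent surrogate pairs into one character."""
    decoded = _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)
    # Round-trip through UTF-16 so a high+low surrogate pair becomes one code point.
    return decoded.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


print(escape_json_style("\uac00"))           # \uAC00
print(escape_json_style("\U0001F600"))       # \uD83D\uDE00
print(escape_braced("\U0001F600"))           # \u{1F600}
print(unescape("\\uD83D\\uDE00") == "\U0001F600")  # True
```

The surrogate arithmetic is the standard UTF-16 rule: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves.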

UTF-8 is the dominant encoding on the web, used by over 98% of websites. It encodes ASCII characters in 1 byte; accented Latin letters and scripts such as Greek, Cyrillic, Hebrew, and Arabic in 2 bytes; most CJK (Chinese, Japanese, Korean) characters in 3 bytes; and characters outside the Basic Multilingual Plane, including most emoji, in 4 bytes. UTF-16 uses 2 bytes for characters in the Basic Multilingual Plane and 4 bytes (a surrogate pair) for everything else, and it is the string representation exposed by JavaScript, Java, and .NET. The distinction between code points and code units matters when measuring or indexing strings: a single emoji is one code point but may be 2 code units in UTF-16 or 4 bytes in UTF-8.
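To make the size differences concrete, the Python sketch below counts UTF-8 bytes and UTF-16 code units for one sample character from each length class. The helper name `encoded_sizes` and the choice of sample characters are assumptions made for illustration.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 code unit count) for a string."""
    utf8_bytes = len(ch.encode("utf-8"))
    # utf-16-le avoids the 2-byte byte order mark; each code unit is 2 bytes.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


samples = ["A", "\u00e9", "\uac00", "\U0001F600"]  # A, e-acute, Hangul GA, emoji
for ch in samples:
    utf8_bytes, utf16_units = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: {utf8_bytes} UTF-8 byte(s), {utf16_units} UTF-16 unit(s)")

# U+0041: 1 UTF-8 byte(s), 1 UTF-16 unit(s)
# U+00E9: 2 UTF-8 byte(s), 1 UTF-16 unit(s)
# U+AC00: 3 UTF-8 byte(s), 1 UTF-16 unit(s)
# U+1F600: 4 UTF-8 byte(s), 2 UTF-16 unit(s)
```

Python's own `len()` counts code points, so `len("\U0001F600")` is 1, whereas the same string reports a length of 2 in languages whose strings are indexed by UTF-16 code unit.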