
Unicode - UTF-8, UTF-16 and UTF-32

Last Updated : 23 Jul, 2025

Unicode is a universal character encoding standard designed to represent text and symbols from all writing systems around the world.

  • Unicode is the most widely adopted character encoding standard. Every character is assigned a unique code point, written as U+ followed by a 4 to 6-digit hexadecimal number.
  • Unicode is standardized among all global computing platforms, devices and programs, enabling consistent representation and manipulation of text across different systems and applications.
  • Unicode supports multiple languages, mathematical symbols, emojis and specialized symbols.
  • Unicode is flexible. It allows new characters to be added, supporting the evolving communication and language needs.

How is Unicode Compatible with ASCII?

  • The first 128 Unicode code points are identical to ASCII, so ASCII is a subset of Unicode.
  • But wait! For the character 'A', the ASCII value is usually written as 65, while the Unicode code point is U+0041. How is Unicode backward compatible with ASCII?
  • The two values are the same number in different bases: U+0041 is written in hexadecimal, and hexadecimal 41 equals decimal 65, i.e. (0041)16 = (0065)10.
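The base conversion above can be checked with a short Python sketch. It uses only the built-in `ord` function and string formatting; the variable names are illustrative and not part of any standard.

```python
# Look at the code point of 'A' in decimal and in hexadecimal.
code_point = ord("A")             # integer code point of the character
as_hex = f"U+{code_point:04X}"    # conventional U+XXXX notation

print(code_point)  # 65
print(as_hex)      # U+0041

# Decode all 128 ASCII byte values and confirm that each resulting
# character has a Unicode code point equal to its ASCII value.
ascii_text = bytes(range(128)).decode("ascii")
matches_ascii = all(ord(ch) == i for i, ch in enumerate(ascii_text))
print(matches_ascii)  # True
```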

Size and Growth

As of today, Unicode supports over 149,000 characters, and the set continues to grow to accommodate new symbols, emojis and characters. Here are some characters with their Unicode code points:

Character    Unicode
😊           U+1F60A
👍           U+1F44D
1            U+0031
+            U+002B
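To look up the code point of any character yourself, a small Python sketch like the following can help. The helper name `code_point_label` is hypothetical and chosen only for this example.

```python
def code_point_label(ch):
    """Return the U+XXXX label for a single character (illustrative helper)."""
    # ord() gives the integer code point; :04X pads to at least 4 hex digits.
    return f"U+{ord(ch):04X}"

# Characters are written with escapes so the source file stays plain ASCII.
samples = ["\U0001F60A", "\U0001F44D", "1", "+"]
labels = {ch: code_point_label(ch) for ch in samples}

for ch, label in labels.items():
    print(ch, label)
```

Going the other way, `chr(0x1F60A)` returns the character for a given code point.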

How To Type in Unicode Characters?

  • Log in to your operating system.
  • Open the built-in character picker.
    • On a Windows machine, press the Windows key (🪟) + period key (dot key).
    • On macOS, press Control + Command + Space.
  • This opens a small window with Unicode characters.
  • Search for the character you want and click on it. The character will appear on the screen.

Unicode Transformation Format (UTF)

A Unicode Transformation Format is a method of encoding Unicode characters for storage and communication. It specifies how each Unicode code point is converted into a sequence of bytes. The most common forms are UTF-8, UTF-16 and UTF-32.

UTF-8

  • UTF-8 is a variable-width encoding in which each Unicode code point is encoded as 1 to 4 bytes.
  • UTF-8 is backward compatible with ASCII. All ASCII characters, (0-127)10 or (00-7F)16, are encoded in UTF-8 as a single byte with the same value.
  • All other Unicode characters are encoded in UTF-8 using 2 to 4 bytes.
  • UTF-8 is widely used on the internet and in UNIX-like operating systems.
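The variable width of UTF-8 is easy to see in Python with the built-in `str.encode` method. This is a minimal sketch; the sample characters are chosen only for illustration.

```python
# One sample character for each UTF-8 length, written as escapes:
# 'A' (U+0041), e-acute (U+00E9), euro sign (U+20AC), smiling face (U+1F60A).
samples = ["A", "\u00e9", "\u20ac", "\U0001F60A"]

utf8_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    utf8_lengths[ch] = len(encoded)
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')} ({len(encoded)} byte(s))")

# An ASCII character produces the same single byte in ASCII and in UTF-8.
same_as_ascii = "A".encode("utf-8") == "A".encode("ascii")
print(same_as_ascii)  # True
```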

UTF-16

  • UTF-16 is also a variable-width encoding: each Unicode code point is encoded as one or two 16-bit code units, that is, 2 or 4 bytes. Code points above U+FFFF use two units, known as a surrogate pair.
  • UTF-16 is used in Microsoft Windows and in programming languages like Java.
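The sketch below shows in Python how a code point above U+FFFF ends up as 4 bytes in UTF-16. The helper `surrogate_pair` is a hypothetical name written for this example; it follows the standard surrogate arithmetic and is compared against Python's built-in `utf-16-be` codec.

```python
def surrogate_pair(code_point):
    """Split a code point above U+FFFF into (high, low) 16-bit units."""
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

# A character in the 16-bit range needs a single 2-byte unit.
bmp_bytes = "A".encode("utf-16-be")
print(bmp_bytes.hex())    # 0041

# The smiling face U+1F60A needs two units, so 4 bytes.
emoji_bytes = "\U0001F60A".encode("utf-16-be")
high, low = surrogate_pair(0x1F60A)
print(emoji_bytes.hex())  # d83dde0a
print(f"{high:04X} {low:04X}")  # D83D DE0A
```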

UTF-32

  • UTF-32 is a fixed-width encoding in which every Unicode code point is encoded as exactly 4 bytes.
  • This gives a simple one-to-one correspondence between code points and encoded units, but it is less space-efficient: a character that needs only 1 byte of data (Example: 01) takes up 4 bytes (Example: 00000001).
  • UTF-32 is less commonly used in mainstream applications and systems due to its space inefficiency and compatibility considerations.
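To compare the space used by the three encodings, a small Python sketch such as the following can be used. The helper name `encoded_sizes` and the sample strings are assumptions made for this example; the `-le` codec variants are used so that no byte order mark is added to the output.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")

def encoded_sizes(text):
    """Return the encoded length in bytes of text for each encoding."""
    return {name: len(text.encode(name)) for name in ENCODINGS}

# Pure ASCII text: UTF-32 needs four times the space of UTF-8.
hello_sizes = encoded_sizes("Hello")
print(hello_sizes)

# Mixed text: 'A', the euro sign (U+20AC) and the smiling face (U+1F60A).
mixed_sizes = encoded_sizes("A\u20ac\U0001F60A")
print(mixed_sizes)

# The 1-byte value 01 becomes 00000001 in UTF-32.
padded = "\x01".encode("utf-32-be").hex()
print(padded)  # 00000001
```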

History of Unicode

Numerous versions of Unicode have been released so far:

Unicode Version   Year of Release   Month (Day)
15.1.0            2023              September 12
15.0.0            2022              September 13
14.0.0            2021              September 14
13.0.0            2020              March 10
12.1.0            2019              May 7
12.0.0            2019              March 5
11.0.0            2018              June 5
10.0.0            2017              June 20
9.0.0             2016              June 21
8.0.0             2015              June 17
7.0.0             2014              June 16
6.3.0             2013              September 30
6.2.0             2012              September 26
6.1.0             2012              January 31
6.0.0             2010              October 11
5.2.0             2009              October 1
5.1.0             2008              April 4
5.0.0             2006              July 14
4.1.0             2005              March 31
4.0.1             2004              March
4.0.0             2003              April
3.2.0             2002              March
3.1.1             2001              August
3.1.0             2001              March
3.0.1             2000              August
3.0.0             1999              September
2.1.9             1999              April
2.1.8             1998              December
2.1.5             1998              August
2.1.2             1998              May
2.0.0             1996              July
1.1.5             1995              July
1.1.0             1993              June
1.0.1             1992              June
1.0.0             1991              October
