Unicode

Unicode, also called thong-iōng-bé (通用碼, "universal code") or bān-kok-bé (萬國碼; Mandarin: Wanguoma), is a character encoding standard. The word Unicode combines the English elements uni and code: uni means "universal" and code means "code". A central idea of Unicode is to design a single encoding that can handle all of the world's writing systems.

Simply put, Unicode is an international standard. Its goal is to encode the characters of the scripts used by the world's languages. Each character is mapped to an integer, and this integer is called the character's code point (bé-ūi). In this way text is converted to numbers, which is what makes it possible to process and store it on a computer.
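The mapping between characters and integers can be seen directly in Python, whose built-in `ord()` and `chr()` functions convert between a character and its code point. The helper name `code_point_label` below is only an illustrative choice, not part of any standard.

```python
def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for one character."""
    return f"U+{ord(ch):04X}"

# Print each character, its code point as an integer, and its U+XXXX label.
for ch in "A\u00e1\u660e":  # A, á, 明
    print(ch, ord(ch), code_point_label(ch))

# chr() goes the other way: from the integer back to the character.
restored = chr(0x660E)
```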

Unicode has some technical limitations and problems, and it has also drawn some criticism. Even so, Unicode has gradually become the mainstream encoding for software internationalization and for multilingual software environments. Microsoft Windows NT and its successors Microsoft Windows 2000 and Microsoft Windows XP use UTF-16 to store the text used internally by the system. Unix-like systems such as Linux, BSD (OpenBSD, FreeBSD), and Mac OS X use UTF-8 to represent multilingual text.
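As a rough illustration of how the two encoding forms differ, the sketch below encodes the same short string as UTF-8 and as UTF-16. The sample text and the choice of little-endian byte order without a byte order mark are assumptions made for the example.

```python
text = "Unicode \u660e"  # 8 ASCII characters followed by 明 (U+660E)

utf8_bytes = text.encode("utf-8")       # 1 byte per ASCII character, 3 for 明
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character here, no BOM

print(len(text), "characters")
print(len(utf8_bytes), "bytes in UTF-8:", utf8_bytes.hex(" "))
print(len(utf16_bytes), "bytes in UTF-16:", utf16_bytes.hex(" "))
```

Both byte sequences decode back to the same string; only the storage layout differs.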

Early computer encodings were designed mainly for English and were suitable only for processing English text. Letters used by the other major European languages were added gradually, but the letters that each country needed and added were different. The result was a large number of encodings that could not interoperate. Data saved with a French encoding would be garbled when read and processed with a German encoding. Software designed for the encoding of one language could handle only that language, and changing it to handle another language was laborious. Processing more than one language on a computer was quite difficult. Once the world's other languages and scripts are taken into account, the problem only becomes more serious.
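The garbling described above can be reproduced by writing bytes with one legacy encoding and reading them back with another. The sketch below uses ISO 8859-1 (Western European) and ISO 8859-5 (Cyrillic) as stand-ins; these two codecs are an assumption chosen because they disagree on the upper half of the byte range, not the specific French and German encodings the text alludes to.

```python
original = "caf\u00e9"  # café

# Store the text with one 8-bit encoding ...
stored = original.encode("iso8859_1")

# ... and read the same bytes back with a different 8-bit encoding.
garbled = stored.decode("iso8859_5")
print(original, "->", garbled)

# With a single shared encoding such as UTF-8 the round trip is lossless.
roundtrip = original.encode("utf-8").decode("utf-8")
```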

If there were one encoding that could handle all of the world's scripts, exchanging data between different languages would become simple, and so would processing several languages at the same time. Software designed around such an encoding, even if it was originally written for one particular language, could be changed much more easily to support other languages and scripts. These benefits can be regarded as the motivation behind the early efforts to promote Unicode.

To understand the motivation for promoting an encoding standard such as Unicode, one first needs to understand what an encoding is. Take English as an example. English needs 26 uppercase letters (ABC...XYZ), 26 lowercase letters (abc...xyz), the Arabic numerals (0123456789), and some punctuation marks. To process English on a computer, a lookup table is needed that maps each character to a unique binary number. What matters most is that everyone uses the same table. Only then can people exchange data, and only then can these binary numbers be translated back into English without being garbled.
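A lookup table of this kind can be sketched in a few lines. The table below is hypothetical: it covers only letters, digits, and five punctuation marks, and it borrows the ASCII values so that the binary numbers agree with what other programs would expect. The names `ENCODE_TABLE`, `encode`, and `decode` are illustrative.

```python
import string

# Characters covered by this small, hypothetical code table.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + " .,!?"

# Each character maps to a unique 7-bit binary number (its ASCII value).
ENCODE_TABLE = {ch: format(ord(ch), "07b") for ch in ALPHABET}
# The reverse table is what the receiving side needs to read the data back.
DECODE_TABLE = {bits: ch for ch, bits in ENCODE_TABLE.items()}

def encode(text: str) -> list:
    """Turn text into a list of 7-bit binary strings."""
    return [ENCODE_TABLE[ch] for ch in text]

def decode(codes: list) -> str:
    """Turn a list of 7-bit binary strings back into text."""
    return "".join(DECODE_TABLE[bits] for bits in codes)

print(encode("Hi!"))
```

Decoding works only because both sides use the same table; a receiver with a different table would produce garbled text.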

For details, see: ASCII

How many characters an encoding can cover depends on how many bits it uses for each entry of the code table. A 7-bit binary number ranges from 0 to 2^7 - 1 = 127 (2^7 is read "2 to the 7th power"), so a 7-bit encoding can cover at most 128 characters. By the same reasoning, an 8-bit encoding can cover 256 characters and a 16-bit encoding can cover 2^16 = 65,536 characters. An encoding with more bits can cover more characters, but storing each character also takes more RAM.
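The arithmetic above is easy to check. In the sketch below, `capacity` and `bits_needed` are illustrative helper names; the value 0x10FFFF is the upper end of the code point range given in the encoding table later in this article.

```python
def capacity(bits: int) -> int:
    """Number of distinct values an encoding with this many bits can hold."""
    return 2 ** bits

def bits_needed(max_value: int) -> int:
    """Smallest number of bits that can represent values from 0 to max_value."""
    return max_value.bit_length()

for bits in (7, 8, 16):
    print(f"{bits}-bit encoding: {capacity(bits)} characters")

print("bits needed for code points up to 10FFFF:", bits_needed(0x10FFFF))
```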

On early computers RAM was a very precious resource, so people used small encodings. For English a 7-bit encoding was enough, and this led to the 7-bit ASCII encoding standard. However, other European languages written in the Latin script also need letters with diacritics, such as 'å', or ligatures, such as 'œ'. These characters are not included in ASCII, so European countries began to define 8-bit encodings. In these 8-bit encodings the code points from 0 to 127 are identical to ASCII.
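The claim that 8-bit encodings keep the ASCII range unchanged can be checked against a few codecs that ship with Python. The three codecs below are an assumed sample of Western European encodings, not a complete list, and `shares_ascii_range` is an illustrative helper.

```python
ASCII_RANGE = bytes(range(128))  # byte values 0 to 127

def shares_ascii_range(codec: str) -> bool:
    """True if the codec decodes bytes 0..127 exactly as ASCII does."""
    return ASCII_RANGE.decode(codec) == ASCII_RANGE.decode("ascii")

for codec in ("iso8859_1", "iso8859_15", "cp1252"):
    print(codec, shares_ascii_range(codec))

# Letters outside ASCII land in the upper half (128..255) of the byte range.
a_ring = "\u00e5".encode("iso8859_1")       # å
oe_ligature = "\u0153".encode("iso8859_15")  # œ
print(a_ring.hex(), oe_ligature.hex())
```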

Precomposed characters and composed characters

To support as many scripts as possible within a limited code space, Unicode makes use of combining characters. Take the letter á as an example. Unicode gives this letter its own code point (U+00E1). However, the letter can also be thought of as a combination of a (code point U+0061) and ˊ. Unicode defines a combining ˊ (code point U+0301). When the two code points U+0061 U+0301 appear one after the other, they are to be understood as the a represented by U+0061 combined with the ˊ represented by the following U+0301, which together give á. Representing the Latin letter á with the single code point U+00E1 is called a precomposed character. Representing it as U+0061 U+0301 is called a composed character. A character such as U+0301 is called a combining character.
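The two representations of á can be inspected with Python's standard `unicodedata` module. The sketch below only looks at the code points involved; the variable names are illustrative.

```python
import unicodedata

precomposed = "\u00e1"  # á as one code point
composed = "a\u0301"    # a followed by the combining acute accent

# Both strings look the same on screen but contain different code points.
for label, s in (("precomposed", precomposed), ("composed", composed)):
    print(label, [f"U+{ord(ch):04X}" for ch in s])

# Without further processing, a plain comparison treats them as different.
same_code_points = precomposed == composed

# unicodedata.combining() returns a non-zero class for combining characters.
accent_class = unicodedata.combining("\u0301")
base_class = unicodedata.combining("a")
print(unicodedata.name("\u0301"), accent_class)
```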

A base character can be followed by one or more combining characters to form a composed character. If every such composed character were turned into a precomposed character with its own code point, a great many code points would be used up, because the number of possible combinations is very large. Earlier encodings, however, generally did not use combining. To make it convenient to process and exchange data with those older encodings, the letters used by the major European languages generally do have corresponding precomposed characters in Unicode. Because a letter may therefore have more than one representation (precomposed and composed), Unicode specifies when two representations are to be treated as the same; this rule is called canonical equivalence.
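One common way to test canonical equivalence in practice is to normalize both strings to the same normalization form and compare the results. The sketch below does this with `unicodedata.normalize` and the NFD form; the function name `canonically_equivalent` is an illustrative choice.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after full canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Precomposed á versus a + combining acute accent.
print(canonically_equivalent("\u00e1", "a\u0301"))

# A base letter with two combining marks, typed in a different order than
# the decomposition of the precomposed letter U+1EC7; normalization
# reorders the marks before comparing.
print(canonically_equivalent("\u1ec7", "e\u0302\u0323"))

# NFC goes the other way and prefers the precomposed form when one exists.
recomposed = unicodedata.normalize("NFC", "a\u0301")
```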

Take the Han character '明' as a hypothetical example (in practice Unicode treats 明 as a single character, not as a combination of two characters). This character is normally printed with a single glyph and has its own code point. But it could also be split into two glyphs, '日' and '月'. Printed with these two glyphs, it might come out looking like '日月', which is very ugly.

Still, splitting '明' into two glyphs for printing has one advantage: fewer glyphs are needed. With one glyph per character, printing '日', '月', and '明' takes three glyphs. If '明' is printed from two glyphs, only two are needed. The number of available glyphs is always finite. To print more characters than there are glyphs, characters have to be split into more than one glyph each. In other words, glyphs are used to compose (assemble) new characters.

Composing one character from two or more glyphs requires some typesetting techniques; otherwise the printed character looks like '日月', which is very ugly.

Display problems

Displaying composed characters correctly requires fairly sophisticated font rendering technology, and this technology is not part of the Unicode standard. Because most scripts need only precomposed characters, software support for composed characters is still far from adequate. The underlying font technologies that can display composed characters correctly include OpenType (defined by Adobe Systems and Microsoft), AAT (defined by Apple Computer), and Graphite (defined by SIL International). However, most software does not use these font technologies, and most fonts do not support them either, so composed characters cannot be displayed correctly. Take Pe̍h-ōe-jī as an example: many Pe̍h-ōe-jī letters have no precomposed form in Unicode, which means they must be written with combining characters. Most software garbles these letters when displaying them.
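Whether a letter has a precomposed form can be probed by normalizing it to NFC and counting the resulting code points. This is only a heuristic sketch: it relies on the Unicode data bundled with the Python interpreter, and `composes_to_single_code_point` is an illustrative name. The two Pe̍h-ōe-jī samples use the combining marks U+0358 and U+030D.

```python
import unicodedata

def composes_to_single_code_point(sequence: str) -> bool:
    """True if NFC normalization folds the sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1

samples = {
    "e + U+0301": "e\u0301",  # has the precomposed form U+00E9
    "o + U+0358": "o\u0358",  # Pe̍h-ōe-jī o with dot above right
    "a + U+030D": "a\u030d",  # Pe̍h-ōe-jī a with vertical line above
}

for label, sequence in samples.items():
    print(label, composes_to_single_code_point(sequence))
```

Sequences that stay at two code points after NFC depend entirely on the font and rendering engine to position the mark correctly.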

Bidirectional text

Some writing systems are written from left to right, such as the Latin script. Others are written from right to left, such as Hebrew and Arabic.
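Each character's writing direction is recorded in the Unicode character database, and Python exposes it through `unicodedata.bidirectional`. The sketch below maps the classes `L`, `R`, and `AL` to a direction; the function name `direction` and the simplified three-way result are assumptions for illustration and ignore the many neutral and weak classes.

```python
import unicodedata

def direction(ch: str) -> str:
    """Classify one character by its bidirectional class (simplified)."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "left-to-right"
    if bidi_class in ("R", "AL"):  # AL is used for Arabic letters
        return "right-to-left"
    return "other"

for ch in ("a", "\u05d0", "\u0627", "1"):  # Latin a, Hebrew alef, Arabic alef, digit
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), direction(ch))
```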

| Code point range (hexadecimal) | UTF-16 (binary) | UTF-8 (binary) | Notes |
|---|---|---|---|
| 000000 - 00007F | 00000000 0xxxxxxx | 0xxxxxxx | ASCII equivalence range; byte begins with zero |
| 000080 - 0007FF | 00000xxx xxxxxxxx | 110xxxxx 10xxxxxx | first byte begins with 110 or 1110, the following byte(s) begin with 10 (applies to this row and the next) |
| 000800 - 00FFFF | xxxxxxxx xxxxxxxx | 1110xxxx 10xxxxxx 10xxxxxx | |
| 010000 - 10FFFF | 110110xx xxxxxxxx 110111xx xxxxxxxx | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | UTF-16 requires surrogates; an offset of 0x10000 is subtracted, so the bit pattern is not identical with UTF-8 |
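The bit patterns in the table can be turned into code almost line by line. The following is a minimal sketch, not a production encoder: `utf8_encode` and `utf16_units` are illustrative names, and the functions check only the overall range, so surrogate code points (D800 to DFFF) are not rejected.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point as UTF-8 following the table's bit patterns."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if cp < 0x80:        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

def utf16_units(cp: int) -> list:
    """Return the 16-bit code units of one code point in UTF-16."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if cp < 0x10000:
        return [cp]
    offset = cp - 0x10000              # subtract the 0x10000 offset first
    high = 0xD800 | (offset >> 10)     # 110110xx xxxxxxxx
    low = 0xDC00 | (offset & 0x3FF)    # 110111xx xxxxxxxx
    return [high, low]

for cp in (0x41, 0xE1, 0x660E, 0x1F600):
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X}  UTF-8: {utf8_encode(cp).hex(' ')}  UTF-16: {units}")
```

The results can be compared with Python's built-in codecs, for example `chr(cp).encode("utf-8")` and `chr(cp).encode("utf-16-be")`.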

Under construction: this section has not been written yet. Please help add content.

Han unification

Under construction: this section has not been written yet. Please help add content.

Version history

  • 1991: Unicode 1.0
  • 1993: Unicode 1.1
  • 1996: Unicode 2.0
  • 1998: Unicode 2.1
  • 1999: Unicode 3.0
  • 2001: Unicode 3.1
  • 2002: Unicode 3.2
  • 2003: Unicode 4.0
  • 2005: Unicode 4.1
  • 2006: Unicode 5.0

Pe̍h-ōe-jī and Unicode

See the article Taigi Unicode.
