8 Mar 2001


Character Entities

  1. HTML 4.0 Character Entities
    1. HTML Latin1 Character Entity Set
    2. HTML Special Character Entity Set
    3. HTML Symbol Character Entity Set
  2. Some UNICODE symbols not found above
    1. HTML Special Character Entity Set
    2. HTML Symbol Character Entity Set
    3. Unclassified
  3. Notes on HR letters
    1. Charsets
    2. Letters Nj and Lj
  4. Reader comments

HTML 4.0 Character Entities

Source:
http://www.w3.org/TR/1998/REC-html40-19980424/sgml/entities.html

HTML Latin1 Character Entity Set

   nbsp  #160  no-break space = non-breaking space, U+00A0 ISOnum
¡  iexcl  #161  inverted exclamation mark, U+00A1 ISOnum
¢  cent  #162  cent sign, U+00A2 ISOnum
£  pound  #163  pound sign, U+00A3 ISOnum
¤  curren  #164  currency sign, U+00A4 ISOnum
¥  yen  #165  yen sign = yuan sign, U+00A5 ISOnum
¦  brvbar  #166  broken bar = broken vertical bar, U+00A6 ISOnum
§  sect  #167  section sign, U+00A7 ISOnum
¨  uml  #168  diaeresis = spacing diaeresis, U+00A8 ISOdia
©  copy  #169  copyright sign, U+00A9 ISOnum
ª  ordf  #170  feminine ordinal indicator, U+00AA ISOnum
«  laquo  #171  left-pointing double angle quotation mark = left pointing guillemet, U+00AB ISOnum
¬  not  #172  not sign, U+00AC ISOnum
   shy  #173  soft hyphen = discretionary hyphen, U+00AD ISOnum
®  reg  #174  registered sign = registered trade mark sign, U+00AE ISOnum
¯  macr  #175  macron = spacing macron = overline = APL overbar, U+00AF ISOdia
°  deg  #176  degree sign, U+00B0 ISOnum
±  plusmn  #177  plus-minus sign = plus-or-minus sign, U+00B1 ISOnum
²  sup2  #178  superscript two = superscript digit two = squared, U+00B2 ISOnum
³  sup3  #179  superscript three = superscript digit three = cubed, U+00B3 ISOnum
´  acute  #180  acute accent = spacing acute, U+00B4 ISOdia
µ  micro  #181  micro sign, U+00B5 ISOnum
¶  para  #182  pilcrow sign = paragraph sign, U+00B6 ISOnum
·  middot  #183  middle dot = Georgian comma = Greek middle dot, U+00B7 ISOnum
¸  cedil  #184  cedilla = spacing cedilla, U+00B8 ISOdia
¹  sup1  #185  superscript one = superscript digit one, U+00B9 ISOnum
º  ordm  #186  masculine ordinal indicator, U+00BA ISOnum
»  raquo  #187  right-pointing double angle quotation mark = right pointing guillemet, U+00BB ISOnum
¼  frac14  #188  vulgar fraction one quarter = fraction one quarter, U+00BC ISOnum
½  frac12  #189  vulgar fraction one half = fraction one half, U+00BD ISOnum
¾  frac34  #190  vulgar fraction three quarters = fraction three quarters, U+00BE ISOnum
¿  iquest  #191  inverted question mark = turned question mark, U+00BF ISOnum
À  Agrave  #192  latin capital letter A with grave = latin capital letter A grave, U+00C0 ISOlat1
Á  Aacute  #193  latin capital letter A with acute, U+00C1 ISOlat1
Â  Acirc  #194  latin capital letter A with circumflex, U+00C2 ISOlat1
Ã  Atilde  #195  latin capital letter A with tilde, U+00C3 ISOlat1
Ä  Auml  #196  latin capital letter A with diaeresis, U+00C4 ISOlat1
Å  Aring  #197  latin capital letter A with ring above = latin capital letter A ring, U+00C5 ISOlat1
Æ  AElig  #198  latin capital letter AE = latin capital ligature AE, U+00C6 ISOlat1
Ç  Ccedil  #199  latin capital letter C with cedilla, U+00C7 ISOlat1
È  Egrave  #200  latin capital letter E with grave, U+00C8 ISOlat1
É  Eacute  #201  latin capital letter E with acute, U+00C9 ISOlat1
Ê  Ecirc  #202  latin capital letter E with circumflex, U+00CA ISOlat1
Ë  Euml  #203  latin capital letter E with diaeresis, U+00CB ISOlat1
Ì  Igrave  #204  latin capital letter I with grave, U+00CC ISOlat1
Í  Iacute  #205  latin capital letter I with acute, U+00CD ISOlat1
Î  Icirc  #206  latin capital letter I with circumflex, U+00CE ISOlat1
Ï  Iuml  #207  latin capital letter I with diaeresis, U+00CF ISOlat1
Ð  ETH  #208  latin capital letter ETH, U+00D0 ISOlat1
Ñ  Ntilde  #209  latin capital letter N with tilde, U+00D1 ISOlat1
Ò  Ograve  #210  latin capital letter O with grave, U+00D2 ISOlat1
Ó  Oacute  #211  latin capital letter O with acute, U+00D3 ISOlat1
Ô  Ocirc  #212  latin capital letter O with circumflex, U+00D4 ISOlat1
Õ  Otilde  #213  latin capital letter O with tilde, U+00D5 ISOlat1
Ö  Ouml  #214  latin capital letter O with diaeresis, U+00D6 ISOlat1
×  times  #215  multiplication sign, U+00D7 ISOnum
Ø  Oslash  #216  latin capital letter O with stroke = latin capital letter O slash, U+00D8 ISOlat1
Ù  Ugrave  #217  latin capital letter U with grave, U+00D9 ISOlat1
Ú  Uacute  #218  latin capital letter U with acute, U+00DA ISOlat1
Û  Ucirc  #219  latin capital letter U with circumflex, U+00DB ISOlat1
Ü  Uuml  #220  latin capital letter U with diaeresis, U+00DC ISOlat1
Ý  Yacute  #221  latin capital letter Y with acute, U+00DD ISOlat1
Þ  THORN  #222  latin capital letter THORN, U+00DE ISOlat1
ß  szlig  #223  latin small letter sharp s = ess-zed, U+00DF ISOlat1
à  agrave  #224  latin small letter a with grave = latin small letter a grave, U+00E0 ISOlat1
á  aacute  #225  latin small letter a with acute, U+00E1 ISOlat1
â  acirc  #226  latin small letter a with circumflex, U+00E2 ISOlat1
ã  atilde  #227  latin small letter a with tilde, U+00E3 ISOlat1
ä  auml  #228  latin small letter a with diaeresis, U+00E4 ISOlat1
å  aring  #229  latin small letter a with ring above = latin small letter a ring, U+00E5 ISOlat1
æ  aelig  #230  latin small letter ae = latin small ligature ae, U+00E6 ISOlat1
ç  ccedil  #231  latin small letter c with cedilla, U+00E7 ISOlat1
è  egrave  #232  latin small letter e with grave, U+00E8 ISOlat1
é  eacute  #233  latin small letter e with acute, U+00E9 ISOlat1
ê  ecirc  #234  latin small letter e with circumflex, U+00EA ISOlat1
ë  euml  #235  latin small letter e with diaeresis, U+00EB ISOlat1
ì  igrave  #236  latin small letter i with grave, U+00EC ISOlat1
í  iacute  #237  latin small letter i with acute, U+00ED ISOlat1
î  icirc  #238  latin small letter i with circumflex, U+00EE ISOlat1
ï  iuml  #239  latin small letter i with diaeresis, U+00EF ISOlat1
ð  eth  #240  latin small letter eth, U+00F0 ISOlat1
ñ  ntilde  #241  latin small letter n with tilde, U+00F1 ISOlat1
ò  ograve  #242  latin small letter o with grave, U+00F2 ISOlat1
ó  oacute  #243  latin small letter o with acute, U+00F3 ISOlat1
ô  ocirc  #244  latin small letter o with circumflex, U+00F4 ISOlat1
õ  otilde  #245  latin small letter o with tilde, U+00F5 ISOlat1
ö  ouml  #246  latin small letter o with diaeresis, U+00F6 ISOlat1
÷  divide  #247  division sign, U+00F7 ISOnum
ø  oslash  #248  latin small letter o with stroke = latin small letter o slash, U+00F8 ISOlat1
ù  ugrave  #249  latin small letter u with grave, U+00F9 ISOlat1
ú  uacute  #250  latin small letter u with acute, U+00FA ISOlat1
û  ucirc  #251  latin small letter u with circumflex, U+00FB ISOlat1
ü  uuml  #252  latin small letter u with diaeresis, U+00FC ISOlat1
ý  yacute  #253  latin small letter y with acute, U+00FD ISOlat1
þ  thorn  #254  latin small letter thorn, U+00FE ISOlat1
ÿ  yuml  #255  latin small letter y with diaeresis, U+00FF ISOlat1
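
The name, the number and the U+ code point in each row describe the same character, so they can be cross-checked mechanically. The following is a minimal sketch using Python's standard `html` module; the helper name `describe` is made up for this illustration and is not part of the W3C source.

```python
import html
from html.entities import name2codepoint


def describe(name: str) -> str:
    """Return 'glyph name #number U+XXXX' for a named HTML entity."""
    number = name2codepoint[name]  # e.g. 'copy' -> 169
    return f"{chr(number)} {name} #{number} U+{number:04X}"


for entity_name in ("iexcl", "copy", "eacute", "yuml"):
    print(describe(entity_name))

# A named reference and its numeric reference decode to the same character.
assert html.unescape("&eacute;") == html.unescape("&#233;") == "\u00e9"
```

The number after `#` is decimal, while the U+ value is the same number in hexadecimal (169 is 0xA9).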

HTML Special Character Entity Set

C0 Controls and Basic Latin

"  quot  #34  quotation mark = APL quote, U+0022 ISOnum
&  amp  #38  ampersand, U+0026 ISOnum
<  lt  #60  less-than sign, U+003C ISOnum
>  gt  #62  greater-than sign, U+003E ISOnum
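
These four characters are the ones that have markup meaning in HTML itself, so they are the ones that must be written as entities when they appear as ordinary text. As a small illustration (the sample string is invented), Python's standard `html.escape` performs exactly this replacement:

```python
import html

raw = '<a href="x.html">Tom & Jerry</a>'

# quote=True is the default, so the double quote is escaped as well.
escaped = html.escape(raw)
print(escaped)

# Decoding the entities gives back the original text.
assert html.unescape(escaped) == raw
```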

Latin Extended-A

Œ  OElig  #338  latin capital ligature OE, U+0152 ISOlat2
œ  oelig  #339  latin small ligature oe, U+0153 ISOlat2
    ligature is a misnomer, this is a separate character in some languages
Š  Scaron  #352  latin capital letter S with caron, U+0160 ISOlat2
š  scaron  #353  latin small letter s with caron, U+0161 ISOlat2
Ÿ  Yuml  #376  latin capital letter Y with diaeresis, U+0178 ISOlat2

Spacing Modifier Letters

ˆ  circ  #710  modifier letter circumflex accent, U+02C6 ISOpub
˜  tilde  #732  small tilde, U+02DC ISOdia

General Punctuation

   ensp  #8194  en space, U+2002 ISOpub
   emsp  #8195  em space, U+2003 ISOpub
   thinsp  #8201  thin space, U+2009 ISOpub
   zwnj  #8204  zero width non-joiner, U+200C NEW RFC 2070
   zwj  #8205  zero width joiner, U+200D NEW RFC 2070
   lrm  #8206  left-to-right mark, U+200E NEW RFC 2070
   rlm  #8207  right-to-left mark, U+200F NEW RFC 2070
–  ndash  #8211  en dash, U+2013 ISOpub
—  mdash  #8212  em dash, U+2014 ISOpub
‘  lsquo  #8216  left single quotation mark, U+2018 ISOnum
’  rsquo  #8217  right single quotation mark, U+2019 ISOnum
‚  sbquo  #8218  single low-9 quotation mark, U+201A NEW
“  ldquo  #8220  left double quotation mark, U+201C ISOnum
”  rdquo  #8221  right double quotation mark, U+201D ISOnum
„  bdquo  #8222  double low-9 quotation mark, U+201E NEW
†  dagger  #8224  dagger, U+2020 ISOpub
‡  Dagger  #8225  double dagger, U+2021 ISOpub
‰  permil  #8240  per mille sign, U+2030 ISOtech
‹  lsaquo  #8249  single left-pointing angle quotation mark, U+2039 ISO proposed
    lsaquo is proposed but not yet ISO standardized
›  rsaquo  #8250  single right-pointing angle quotation mark, U+203A ISO proposed
    rsaquo is proposed but not yet ISO standardized
€  euro  #8364  euro sign, U+20AC NEW

HTML Symbol Character Entity Set

Latin Extended-B

ƒ  fnof  #402  latin small f with hook = function = florin, U+0192 ISOtech

Greek

Α  Alpha  #913  greek capital letter alpha, U+0391
Β  Beta  #914  greek capital letter beta, U+0392
Γ  Gamma  #915  greek capital letter gamma, U+0393 ISOgrk3
Δ  Delta  #916  greek capital letter delta, U+0394 ISOgrk3
Ε  Epsilon  #917  greek capital letter epsilon, U+0395
Ζ  Zeta  #918  greek capital letter zeta, U+0396
Η  Eta  #919  greek capital letter eta, U+0397
Θ  Theta  #920  greek capital letter theta, U+0398 ISOgrk3
Ι  Iota  #921  greek capital letter iota, U+0399
Κ  Kappa  #922  greek capital letter kappa, U+039A
Λ  Lambda  #923  greek capital letter lambda, U+039B ISOgrk3
Μ  Mu  #924  greek capital letter mu, U+039C
Ν  Nu  #925  greek capital letter nu, U+039D
Ξ  Xi  #926  greek capital letter xi, U+039E ISOgrk3
Ο  Omicron  #927  greek capital letter omicron, U+039F
Π  Pi  #928  greek capital letter pi, U+03A0 ISOgrk3
Ρ  Rho  #929  greek capital letter rho, U+03A1
    there is no Sigmaf, and no U+03A2 character either
Σ  Sigma  #931  greek capital letter sigma, U+03A3 ISOgrk3
Τ  Tau  #932  greek capital letter tau, U+03A4
Υ  Upsilon  #933  greek capital letter upsilon, U+03A5 ISOgrk3
Φ  Phi  #934  greek capital letter phi, U+03A6 ISOgrk3
Χ  Chi  #935  greek capital letter chi, U+03A7
Ψ  Psi  #936  greek capital letter psi, U+03A8 ISOgrk3
Ω  Omega  #937  greek capital letter omega, U+03A9 ISOgrk3
α  alpha  #945  greek small letter alpha, U+03B1 ISOgrk3
β  beta  #946  greek small letter beta, U+03B2 ISOgrk3
γ  gamma  #947  greek small letter gamma, U+03B3 ISOgrk3
δ  delta  #948  greek small letter delta, U+03B4 ISOgrk3
ε  epsilon  #949  greek small letter epsilon, U+03B5 ISOgrk3
ζ  zeta  #950  greek small letter zeta, U+03B6 ISOgrk3
η  eta  #951  greek small letter eta, U+03B7 ISOgrk3
θ  theta  #952  greek small letter theta, U+03B8 ISOgrk3
ι  iota  #953  greek small letter iota, U+03B9 ISOgrk3
κ  kappa  #954  greek small letter kappa, U+03BA ISOgrk3
λ  lambda  #955  greek small letter lambda, U+03BB ISOgrk3
μ  mu  #956  greek small letter mu, U+03BC ISOgrk3
ν  nu  #957  greek small letter nu, U+03BD ISOgrk3
ξ  xi  #958  greek small letter xi, U+03BE ISOgrk3
ο  omicron  #959  greek small letter omicron, U+03BF NEW
π  pi  #960  greek small letter pi, U+03C0 ISOgrk3
ρ  rho  #961  greek small letter rho, U+03C1 ISOgrk3
ς  sigmaf  #962  greek small letter final sigma, U+03C2 ISOgrk3
σ  sigma  #963  greek small letter sigma, U+03C3 ISOgrk3
τ  tau  #964  greek small letter tau, U+03C4 ISOgrk3
υ  upsilon  #965  greek small letter upsilon, U+03C5 ISOgrk3
φ  phi  #966  greek small letter phi, U+03C6 ISOgrk3
χ  chi  #967  greek small letter chi, U+03C7 ISOgrk3
ψ  psi  #968  greek small letter psi, U+03C8 ISOgrk3
ω  omega  #969  greek small letter omega, U+03C9 ISOgrk3
ϑ  thetasym  #977  greek small letter theta symbol, U+03D1 NEW
ϒ  upsih  #978  greek upsilon with hook symbol, U+03D2 NEW
ϖ  piv  #982  greek pi symbol, U+03D6 ISOgrk3

General Punctuation

•  bull  #8226  bullet = black small circle, U+2022 ISOpub
    bullet is NOT the same as bullet operator, U+2219
…  hellip  #8230  horizontal ellipsis = three dot leader, U+2026 ISOpub
′  prime  #8242  prime = minutes = feet, U+2032 ISOtech
″  Prime  #8243  double prime = seconds = inches, U+2033 ISOtech
‾  oline  #8254  overline = spacing overscore, U+203E NEW
⁄  frasl  #8260  fraction slash, U+2044 NEW

Letterlike Symbols

℘  weierp  #8472  script capital P = power set = Weierstrass p, U+2118 ISOamso
ℑ  image  #8465  blackletter capital I = imaginary part, U+2111 ISOamso
ℜ  real  #8476  blackletter capital R = real part symbol, U+211C ISOamso
™  trade  #8482  trade mark sign, U+2122 ISOnum
ℵ  alefsym  #8501  alef symbol = first transfinite cardinal, U+2135 NEW
    alef symbol is NOT the same as hebrew letter alef, U+05D0 although the same glyph could be used to depict both characters

Arrows

←  larr  #8592  leftwards arrow, U+2190 ISOnum
↑  uarr  #8593  upwards arrow, U+2191 ISOnum
→  rarr  #8594  rightwards arrow, U+2192 ISOnum
↓  darr  #8595  downwards arrow, U+2193 ISOnum
↔  harr  #8596  left right arrow, U+2194 ISOamsa
↵  crarr  #8629  downwards arrow with corner leftwards = carriage return, U+21B5 NEW
⇐  lArr  #8656  leftwards double arrow, U+21D0 ISOtech
    Unicode does not say that lArr is the same as the 'is implied by' arrow but also does not have any other character for that function. So lArr can be used for 'is implied by' as ISOtech suggests
⇑  uArr  #8657  upwards double arrow, U+21D1 ISOamsa
⇒  rArr  #8658  rightwards double arrow, U+21D2 ISOtech
    Unicode does not say this is the 'implies' character but does not have another character with this function, so rArr can be used for 'implies' as ISOtech suggests
⇓  dArr  #8659  downwards double arrow, U+21D3 ISOamsa
⇔  hArr  #8660  left right double arrow, U+21D4 ISOamsa

Mathematical Operators

∀  forall  #8704  for all, U+2200 ISOtech
∂  part  #8706  partial differential, U+2202 ISOtech
∃  exist  #8707  there exists, U+2203 ISOtech
∅  empty  #8709  empty set = null set = diameter, U+2205 ISOamso
∇  nabla  #8711  nabla = backward difference, U+2207 ISOtech
∈  isin  #8712  element of, U+2208 ISOtech
∉  notin  #8713  not an element of, U+2209 ISOtech
∋  ni  #8715  contains as member, U+220B ISOtech
    should there be a more memorable name than 'ni'?
∏  prod  #8719  n-ary product = product sign, U+220F ISOamsb
    prod is NOT the same character as U+03A0 'greek capital letter pi' though the same glyph might be used for both
∑  sum  #8721  n-ary summation, U+2211 ISOamsb
    sum is NOT the same character as U+03A3 'greek capital letter sigma' though the same glyph might be used for both
−  minus  #8722  minus sign, U+2212 ISOtech
∗  lowast  #8727  asterisk operator, U+2217 ISOtech
√  radic  #8730  square root = radical sign, U+221A ISOtech
∝  prop  #8733  proportional to, U+221D ISOtech
∞  infin  #8734  infinity, U+221E ISOtech
∠  ang  #8736  angle, U+2220 ISOamso
∧  and  #8743  logical and = wedge, U+2227 ISOtech
∨  or  #8744  logical or = vee, U+2228 ISOtech
∩  cap  #8745  intersection = cap, U+2229 ISOtech
∪  cup  #8746  union = cup, U+222A ISOtech
∫  int  #8747  integral, U+222B ISOtech
∴  there4  #8756  therefore, U+2234 ISOtech
∼  sim  #8764  tilde operator = varies with = similar to, U+223C ISOtech
    tilde operator is NOT the same character as the tilde, U+007E, although the same glyph might be used to represent both
≅  cong  #8773  approximately equal to, U+2245 ISOtech
≈  asymp  #8776  almost equal to = asymptotic to, U+2248 ISOamsr
≠  ne  #8800  not equal to, U+2260 ISOtech
≡  equiv  #8801  identical to, U+2261 ISOtech
≤  le  #8804  less-than or equal to, U+2264 ISOtech
≥  ge  #8805  greater-than or equal to, U+2265 ISOtech
⊂  sub  #8834  subset of, U+2282 ISOtech
⊃  sup  #8835  superset of, U+2283 ISOtech
    note that nsup, 'not a superset of, U+2285' is not covered by the Symbol font encoding and is not included. Should it be, for symmetry? It is in ISOamsn
⊄  nsub  #8836  not a subset of, U+2284 ISOamsn
⊆  sube  #8838  subset of or equal to, U+2286 ISOtech
⊇  supe  #8839  superset of or equal to, U+2287 ISOtech
⊕  oplus  #8853  circled plus = direct sum, U+2295 ISOamsb
⊗  otimes  #8855  circled times = vector product, U+2297 ISOamsb
⊥  perp  #8869  up tack = orthogonal to = perpendicular, U+22A5 ISOtech
⋅  sdot  #8901  dot operator, U+22C5 ISOamsb
    dot operator is NOT the same character as U+00B7 middle dot
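
Several notes in this table warn that an operator is NOT the same character as a similar-looking letter or punctuation mark. A short sketch with Python's standard `unicodedata` module makes the distinction visible; the list name `LOOKALIKES` is invented for this example, and the pairs are the ones named in the notes.

```python
import unicodedata

# (operator, look-alike) pairs taken from the notes in the table.
LOOKALIKES = [
    ("\u220f", "\u03a0"),  # prod vs. greek capital letter pi
    ("\u2211", "\u03a3"),  # sum vs. greek capital letter sigma
    ("\u223c", "\u007e"),  # sim (tilde operator) vs. tilde
    ("\u22c5", "\u00b7"),  # sdot (dot operator) vs. middle dot
]

for operator, lookalike in LOOKALIKES:
    print(
        f"U+{ord(operator):04X} {unicodedata.name(operator)}"
        f"  is not  U+{ord(lookalike):04X} {unicodedata.name(lookalike)}"
    )
```

Because the code points differ, a text search for one character of a pair will not find the other, even if a font draws them identically.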

Miscellaneous Technical

⌈  lceil  #8968  left ceiling = apl upstile, U+2308 ISOamsc
⌉  rceil  #8969  right ceiling, U+2309 ISOamsc
⌊  lfloor  #8970  left floor = apl downstile, U+230A ISOamsc
⌋  rfloor  #8971  right floor, U+230B ISOamsc
〈  lang  #9001  left-pointing angle bracket = bra, U+2329 ISOtech
    lang is NOT the same character as U+003C 'less than' or U+2039 'single left-pointing angle quotation mark'
〉  rang  #9002  right-pointing angle bracket = ket, U+232A ISOtech
    rang is NOT the same character as U+003E 'greater than' or U+203A 'single right-pointing angle quotation mark'

Geometric Shapes

◊  loz  #9674  lozenge, U+25CA ISOpub

Miscellaneous Symbols

♠  spades  #9824  black spade suit, U+2660 ISOpub
    black here seems to mean filled as opposed to hollow
♣  clubs  #9827  black club suit = shamrock, U+2663 ISOpub
♥  hearts  #9829  black heart suit = valentine, U+2665 ISOpub
♦  diams  #9830  black diamond suit, U+2666 ISOpub

Some UNICODE symbols not found above

Source:
I have found these using IE 5.0 on Windows 98, Verdana font.
I arranged them into groups as I found appropriate by looking at glyphs and encoding.

HTML Special Character Entity Set

Latin Extended-A

Ć  #0262
ć  #0263
Č  #0268
č  #0269
Đ  #0272
đ  #0273
Ž  #0381
ž  #0382

HTML Symbol Character Entity Set

Letterlike Symbols

№  #8470

Arrows

↕  #8597

Geometric Shapes

■  #9632
□  #9633
▪  #9642
▫  #9643
▲  #9650
▼  #9660
○  #9675
●  #9679

Miscellaneous Symbols

♀  #9792
♂  #9794
♪  #9834

Mathematical Operators

∆  #8710
∕  #8725
∙  #8729
∟  #8735

Unclassified

Group 1

٭  #1645

Group 2

─  #9472
│  #9474
┌  #9484
┐  #9488
└  #9492
┘  #9496
├  #9500
┤  #9508
┬  #9516
┴  #9524
┼  #9532
═  #9552
║  #9553
╒  #9554
╓  #9555
╔  #9556
╕  #9557
╖  #9558
╗  #9559
╘  #9560
╙  #9561
╚  #9562
╛  #9563
╜  #9564
╝  #9565
╞  #9566
╟  #9567
╠  #9568
╡  #9569
╢  #9570
╣  #9571
╤  #9572
╥  #9573
╦  #9574
╧  #9575
╨  #9576
╩  #9577
╪  #9578
╫  #9579
╬  #9580
▄  #9604
█  #9608
▌  #9612
▒  #9618
▓  #9619

Notes on HR letters

Charsets

I am aware of two charsets that may be used for Croatian (hr) diacritical letters: ISO-8859-2 and Windows-1250. I would discourage encoding HTML pages in Windows-1250, because user agents on non-Windows platforms may not support it.
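
To illustrate why the choice of charset matters, here is a small sketch that prints the byte each charset assigns to the ten Croatian diacritical letters, using Python's built-in `iso-8859-2` and `cp1250` codecs. The names `LETTERS` and `byte_table` are invented for this example.

```python
LETTERS = "ĆćČčĐđŠšŽž"


def byte_table(letters: str = LETTERS):
    """Return (letter, ISO-8859-2 byte, Windows-1250 byte, same?) rows."""
    rows = []
    for ch in letters:
        iso = ch.encode("iso-8859-2")
        win = ch.encode("cp1250")
        rows.append((ch, iso.hex(), win.hex(), iso == win))
    return rows


for ch, iso_hex, win_hex, same in byte_table():
    status = "same" if same else "DIFFERENT"
    print(f"{ch}  ISO-8859-2=0x{iso_hex}  Windows-1250=0x{win_hex}  {status}")
```

The two charsets agree on Ć, ć, Č, č, Đ and đ but assign different bytes to Š, š, Ž and ž, so a page read with the wrong charset shows those four letters incorrectly.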

In situations where encoding must be ISO-8859-1 (e.g. one cannot specify meta-tags), the solution is to use character entities from the following table:

Ć  #0262
ć  #0263
Č  #0268
č  #0269
Đ  #0272
đ  #0273
Š  Scaron  #352  latin capital letter S with caron, U+0160 ISOlat2
š  scaron  #353  latin small letter s with caron, U+0161 ISOlat2
Ž  #0381
ž  #0382
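
The substitution can also be done mechanically rather than by hand. As a sketch, Python's standard `xmlcharrefreplace` error handler replaces every character that ISO-8859-1 cannot represent with its numeric character reference; the function name `to_latin1_html` and the sample place names are invented for this example.

```python
import html


def to_latin1_html(text: str) -> bytes:
    """Encode text as ISO-8859-1, writing unencodable characters as &#N;."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


sample = "Čakovec, Đakovo, Šibenik, Žminj"
encoded = to_latin1_html(sample)
print(encoded)

# A browser reading the page as ISO-8859-1 decodes the references again.
assert html.unescape(encoded.decode("iso-8859-1")) == sample
```

Note that this produces the numeric form (`&#352;`) for Š as well, rather than the named form `&Scaron;`; both refer to the same character.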

Letters Nj and Lj

There are two reasons I find those letters should also be separate entities:

  1. Any algorithmic manipulation needs to treat 'nj' and 'lj' as single letters. For example, it is not self-evident that a sorting algorithm should look at the 'next letter' when repositioning the 'current letter'. Such a requirement, in turn, renders most existing algorithms unusable.
  2. There is a precedent: I have found the Unicode symbols IJ (#306) and ij (#307), which appear to be the letters i and j joined together.
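
The sorting problem from the first point can be sketched as follows. This is only an illustration under stated assumptions: the names `ALPHABET`, `RANK`, `LETTER_RE` and `croatian_key` are invented here, the alphabet list also includes the third Croatian digraph 'dž' (not discussed above), and the greedy matching wrongly treats every 'nj' or 'lj' sequence as one letter, including the rare words where n and j are separate letters.

```python
import re

# Croatian alphabet order, with digraphs as single letters.
ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f", "g", "h", "i", "j",
    "k", "l", "lj", "m", "n", "nj", "o", "p", "r", "s", "š", "t", "u", "v",
    "z", "ž",
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

# Digraphs are listed first so that "lj" is not split into "l" + "j".
LETTER_RE = re.compile("dž|lj|nj|.", re.DOTALL)


def croatian_key(word: str):
    """Sort key: the alphabet position of each letter, digraphs included."""
    letters = LETTER_RE.findall(word.lower())
    # Characters outside the alphabet all sort after 'ž' in this sketch.
    return [RANK.get(letter, len(ALPHABET)) for letter in letters]


words = ["ljeto", "luk", "njiva", "nula", "lopta"]
print(sorted(words))                    # plain code point order
print(sorted(words, key=croatian_key))  # digraph-aware order
```

The key function has to look ahead one character to decide where a letter ends, which is exactly the extra requirement that ordinary character-by-character comparison does not meet.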

There is one reason why implementation of these letters would be difficult.

  1. Which glyph is a capital letter? 'NJ' or 'Nj'?

If only one is chosen, one would either have to begin sentences with words like "NJegovatelj (...)", or could only write all-capital words as in "RANjIV".

If both glyphs were implemented, however, correct sort ordering would be achieved by having one symbol immediately follow the other in the alphabetic sequence. Spelling programs would still have to be aware of both glyphs.

I would expect to see 6 entities, in ascending order: LJ, Lj, lj, NJ, Nj and nj.
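
For what it is worth, Unicode does define six code points of this kind, U+01C7 to U+01CC, in the same order as the list above. HTML 4.0 has no named entities for them, so only numeric references would apply. The sketch below lists them with Python's standard `unicodedata` module; the name `DIGRAPHS` is invented for this example.

```python
import unicodedata

# U+01C7..U+01CC: LJ, Lj, lj, NJ, Nj, nj
DIGRAPHS = [chr(code_point) for code_point in range(0x01C7, 0x01CD)]

for ch in DIGRAPHS:
    print(f"#{ord(ch)}  U+{ord(ch):04X}  {unicodedata.name(ch)}")

lj_small = "\u01c9"
# The all-capital and the sentence-initial forms are separate characters.
assert lj_small.upper() == "\u01c7"   # LJ, as in all-capital words
assert lj_small.title() == "\u01c8"   # Lj, as at the start of a sentence
```

Having three case forms per digraph addresses the 'NJ' versus 'Nj' question, although ordinary Croatian text is usually typed with two separate letters.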

Reader Comments


Hayo Schmidt, 15. Dec 2000

There are no comments on this page.

Well, what I mean is, the jumpmark to the reader comments does not exist, since there are none. You could use this as the first comment on the page and put the jumpmark there, but then this comment would be out-of-date.

Isn't it a paradox ;-).

Hayo


Danke, Hayo!

You have forced me into about an hour of hard thinking.

I must admit - it is a paradox. A paradox is not an impossible thing, but something that is unexpected, out of the ordinary, opposite to common belief, but may be seen to be true after some thought.

Namely, comments may be out of date - their purpose is to change things (or, else, why bother?).

What gives me a headache is my original intent not to reply to comments on this page - it should be a place of "final judgement by the readers" and not a newsgroup. But there you have it - nothing is as one would wish for.

Tomislav (Thomas? in German) - mated in one move :-)).

This document: <http://www.inet.hr/~tsereg/en/charent.html>