CSS Question...charset

liunx

Guest
Hi-<br />
<br />
Character sets have never made sense to me. I used the following just to be able to validate my page, which worked; but what is utf-8 for? Is there a list? Isn't this supposed to tell the browser what language the page is in?<br />
<br />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><br />
<br />
Thanks, Gandalf<br />
:D<!--content-->This tells the browser which character set the page is encoded in, so it knows how to turn the bytes it receives into letters. As you can imagine, there are different alphabets around the world, and different ways of encoding them as well...<br />
<br />
That help?<!--content-->Hmm... interesting, I have always wondered what that meant.<!--content-->So what's utf-8? Latin? What is the charset code for Western English?<br />
<br />
Ty, G<br />
:D<!--content-->There are many standards:<br />
<br />
ISO-8859-1 is for Western Europe.<br />
ISO-8859-7 is the Greek character set.<br />
<br />
In these single-byte character sets, characters are numbered from 0 to 255, and each character is stored and transmitted as that number in a single byte (eight binary digits).<br />
<br />
Whilst some characters have the same number in many character sets (A to Z for example), others do not: a number that stands for an accented character in one set may stand for a line drawing character in another, and so on.<br />
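<br />
To make that concrete, here is a small Python sketch (Python is just my choice for illustration, and the helper name character_name is made up) that decodes the same byte value under several character sets, using the codecs that ship with Python:<br />

```python
import unicodedata


def character_name(byte_value, charset):
    """Decode one byte using `charset` and return the Unicode name of the result."""
    character = bytes([byte_value]).decode(charset)
    return unicodedata.name(character)


# Byte 65 is the letter A in every one of these character sets.
for charset in ("ascii", "iso-8859-1", "iso-8859-7", "cp437"):
    print(65, charset, character_name(65, charset))

# Byte 0xE1 is an accented Latin letter in ISO-8859-1 but a Greek letter in ISO-8859-7.
print(hex(0xE1), "iso-8859-1", character_name(0xE1, "iso-8859-1"))
print(hex(0xE1), "iso-8859-7", character_name(0xE1, "iso-8859-7"))

# Byte 0xC4 is an accented letter in ISO-8859-1 but a line drawing
# character in the old IBM PC code page 437.
print(hex(0xC4), "iso-8859-1", character_name(0xC4, "iso-8859-1"))
print(hex(0xC4), "cp437", character_name(0xC4, "cp437"))
```

A byte means nothing on its own; the declared character set is what turns it into a letter.<br />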
<br />
Some Asian character sets use TWO bytes to represent each character, because they have far more than 256 different characters that need to be displayed.<br />
<br />
UTF-8 is one of the Unicode encodings. It uses between one and four bytes per character and can represent over a million different code points, so text from many scripts can be displayed on the same page, just as long as the user has suitable fonts installed.<br />
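<br />
As a rough illustration of one-byte versus multi-byte storage, this Python sketch counts how many bytes a few characters take in different encodings. The helper name encoded_length and the sample characters are my own picks, not anything standard:<br />

```python
import sys


def encoded_length(text, charset):
    """Return how many bytes `text` occupies when stored in `charset`."""
    return len(text.encode(charset))


samples = [
    ("A", "iso-8859-1"),
    ("A", "utf-8"),
    ("\u00e9", "iso-8859-1"),   # e with acute accent
    ("\u00e9", "utf-8"),
    ("\u65e5", "shift_jis"),    # a Japanese kanji
    ("\u65e5", "utf-8"),
    ("\U0001F600", "utf-8"),    # an emoji, well beyond the first 65,536 code points
]

for text, charset in samples:
    size = encoded_length(text, charset)
    print(f"U+{ord(text):04X} in {charset}: {size} byte(s)")

# The highest code point Unicode defines, which is also Python's limit.
print(hex(sys.maxunicode))
```

Plain ASCII letters cost one byte in UTF-8, which is why English-only pages look identical whether they are declared as ISO-8859-1 or UTF-8.<br />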
<br />
<br />
You can easily verify this by visiting a Japanese or Thai website.<br />
<br />
If the characters look like:<br />
B$,J]M-$7$F$$$^$9!#$^$?!"K]Lu$K$O8m$j$,$"$k$+$b$7$l$^$;$s!#8+$D$1$?J}$O<br />
then your browser is decoding the bytes one at a time as Western text, typically because support for the Japanese character set is not installed. Once that support is added, the above is rendered as proper Japanese characters instead (but only if the content is actually labelled with the correct Character Set Declaration, which tells the browser to decode those characters as byte pairs, not singly).<br />
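<br />
The sample above looks to me like 7-bit JIS (ISO-2022-JP) text with its escape sequences stripped out. That is an assumption on my part, but under that assumption a short fragment of it can be decoded in Python by putting the escape sequences back:<br />

```python
import unicodedata

ESC = b"\x1b"

# Six bytes copied from the garbled sample: three byte pairs.
fragment = b"$^$9!#"

# Read one byte at a time, they are just ASCII punctuation and digits.
print(fragment.decode("ascii"))

# ISO-2022-JP switches into two-byte Japanese mode with ESC $ B
# and back to ASCII with ESC ( B.
framed = ESC + b"$B" + fragment + ESC + b"(B"
text = framed.decode("iso2022_jp")

# Print the Unicode names so the output is readable on any terminal.
for character in text:
    print(f"U+{ord(character):04X}", unicodedata.name(character))
```

The same six bytes come out as three Japanese characters once the decoder knows to read them in pairs.<br />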
<br />
The Character Set declaration tells the browser, and spiders, what character set the page is encoded in: for example, that character 65 is the letter A, not something else (using ISO 646 ASCII, ISO 2022, ISO 8859, ISO 10646 (Unicode, of which UTF-8 is one encoding), etc).<br />
<br />
This is separate from the Content-Language declaration which tells the browser, spiders, and translation tools what human language the page content actually represents.<br />
<br />
The Language Declaration is made from 2 parts e.g. EN-US. The EN is the Language Code from ISO 639. The US is the Country Code from ISO 3166, and uses roughly the same letters as used for Internet country codes in web addresses and email.<br />
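<br />
If you ever need to pull the two parts apart in a script, something like this Python sketch would do for simple two-part tags. The function name split_language_tag is made up, and it does not attempt to handle longer tags:<br />

```python
def split_language_tag(tag):
    """Split a simple tag such as 'EN-US' into (language, country).

    Tags are case-insensitive; by convention the language code is written
    in lower case and the country code in upper case.
    """
    language, _, country = tag.partition("-")
    return language.lower(), (country.upper() or None)


print(split_language_tag("EN-US"))
print(split_language_tag("en-gb"))
print(split_language_tag("fr"))
```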
<br />
Two other "Code" standards you need to be aware of are ISO 4217 for currency codes (USD, GBP, HKD, and so on), and ISO 8601 for YYYY-MM-DD dates and HH:MM:SS times.<br />
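<br />
For the ISO 8601 part, Python's standard datetime module already reads and writes those formats. A minimal sketch, with dates I picked arbitrarily:<br />

```python
from datetime import date, time

# Writing: isoformat() produces YYYY-MM-DD and HH:MM:SS.
print(date(2003, 7, 4).isoformat())
print(time(9, 5, 0).isoformat())

# Reading: fromisoformat() parses the same notation (Python 3.7 or later).
parsed = date.fromisoformat("2003-07-04")
print(parsed.year, parsed.month, parsed.day)

# A handy side effect of the format: sorting the strings sorts the dates.
stamps = ["2003-12-01", "2003-02-15", "2002-11-30"]
print(sorted(stamps))
```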
<br />
<br />
My 'minimum' header has all of this: <br />
<br />
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <!-- DOCTYPE --><br />
<HTML> <br />
<HEAD> <br />
<TITLE> Your Title Here </TITLE> <br />
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> <!-- Character Set --><br />
<META HTTP-EQUIV="Content-Language" CONTENT="EN-GB"> <!-- Content Language --><br />
<META NAME="Keywords" CONTENT=" your keyword list here "> <br />
<META NAME="Description" CONTENT=" Your Description Here. "> <br />
</HEAD><br />
<br />
You may need EN-US rather than EN-GB for the Content Language.<br />
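<br />
To see roughly what a spider or tool might do with those two META tags, here is a Python sketch using only the standard library. The class and function names are my own, the sample is a trimmed copy of the header above, and a real browser's detection rules are more involved than this:<br />

```python
from email.message import Message
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect <META HTTP-EQUIV=... CONTENT=...> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.http_equiv = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        if name:
            self.http_equiv[name.lower()] = attributes.get("content", "")


def declared_charset_and_language(html):
    """Return (charset, content_language) as declared in the META tags."""
    parser = MetaHeaderParser()
    parser.feed(html)

    charset = None
    content_type = parser.http_equiv.get("content-type")
    if content_type:
        # Reuse the standard header parser to pick out the charset parameter.
        header = Message()
        header["Content-Type"] = content_type
        charset = header.get_content_charset()

    return charset, parser.http_equiv.get("content-language")


SAMPLE = """<HTML><HEAD><TITLE> Your Title Here </TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META HTTP-EQUIV="Content-Language" CONTENT="EN-GB">
</HEAD>"""

print(declared_charset_and_language(SAMPLE))
```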
<br />
The Title, Keywords, and Description are useful to some search engines (and the Title is displayed by the browser along the top of the window). The DocType is useful for validation and tells the browser which version of HTML is being used. The Character Set and Content Language tags are likely to become more important to search engines in the future. At present, the browser uses this information to decide which character set to use to display the page (important for people who also have DBCS and shifted character set support installed, or non-US defaults in their browser). The tags also help online translation tools, and help visitors to your web site whose default settings are not the US ones.<!--content-->
 