JP Character Set

liunx

Guest
The W3C validator says I'm not using a correct character set, so it cannot validate my page, or even read it for that matter.<br />
<br />
LEGEND:<br />
Orange = Please correct if wrong<br />
<br />
My code:<br />
<br />
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"><br />
<html><br />
<head><br />
<title>Our Site</title><br />
<meta http-equiv="content-type" content="text/html; charset="iso-2022-jp"><br />
<meta http-equiv="content-language" content="iso-2022-jp"><br />
<meta name="keywords" content="blah, blah, blah"><br />
<meta name="author" content="blah, blah, blah"><br />
<meta name="description" content="blah blah blah."><br />
<meta name="robots" content="all"><br />
<meta name="rating" content="general"><br />
<meta name="revisit" content="10"><br />
<link href="abc.css" rel="stylesheet" type="text/css"><br />
</head><br />
<br />
<br />
Copy from W3C validation results: <br />
<br />
<br />
Doctype: <br />
Encoding: <br />
<br />
I was not able to extract a character encoding labeling from any of the valid sources for such information. Without encoding information it is impossible to validate the document. The sources I tried are: <br />
<br />
The HTTP Content-Type field. <br />
The XML Declaration. <br />
The HTML "META" element. <br />
And I even tried to autodetect it using the algorithm defined in Appendix F of the XML 1.0 Recommendation. <br />
<br />
Since none of these sources yielded any usable information, I will not be able to validate this document. Sorry. Please make sure you specify the character encoding in use. <br />
<br />
IANA maintains the list of official names for character sets. <br />
<br />
<br />
Thanks,<br />
Gandalf<br />
:D<!--content-->I've come to the conclusion that there are several different variants under a single character set, and I have no idea which one to use...<br />
<br />
Any helpful comment would be appreciated, since the research I have done so far hasn't been useful.<br />
<br />
Ty,<br />
Gandalf<br />
:D<!--content-->The version in the DOCTYPE should be HTML 4.01 rather than HTML 4.0.<br />
<br />
<br />
...charset="iso-2022-jp"> -- well, what character set is the file actually saved in? The name is conventionally written in capitals, ISO-2022-JP, as in the IANA list, although charset names are case-insensitive.<br />
<br />
<br />
<meta http-equiv="content-language" content="iso-2022-jp"> Hmm, content-language takes a language code, not a charset name, so iso-2022-jp here should be just the two letters ja.<!--content-->Still not working...<br />
<br />
Browsing some Japanese sites to see what they did.<br />
<br />
G<br />
:D<!--content-->I found this in two different sites:<br />
<br />
<META http-equiv="Content-Type" content="text/html; charset=Shift_JIS"><br />
<META http-equiv="Content-Language" content="ja"><br />
<br />
However, when I mimicked this in my code, W3C still didn't recognize the page.<br />
<br />
This wouldn't have anything to do with the page being uploaded as opposed to using a URI would it??<br />
<br />
Thanks,<br />
G<br />
:D<!--content-->Please don't let my replies to my own thread here give you the impression that I've figured out the answer to my problem.<br />
<br />
The following is simply a note to myself, since I come back to my postings for reference from time to time. It may also benefit the reader, if only as a novelty.<br />
<br />
Posted from: <br />
<!-- m --><a class="postlink" href="http://www.iana.org/assignments/character-sets">http://www.iana.org/assignments/character-sets</a><!-- m --> <br />
<br />
<br />
<strong>ISO-2022-JP-2</strong> (RFC 1554) is a subset of 7bit version of ISO 2022 and superset of ISO-2022-JP. Difference between ISO-2022-JP and ISO-2022-JP-2 is that ISO-2022-JP-2 has more coded character sets than ISO-2022-JP. Character sets included in ISO-2022-JP-2 are:<br />
<br />
ASCII (ESC 0x28 0x42)<br />
JIS X 0201-1976 Roman (ESC 0x28 0x4a),<br />
JIS X 0208-1978 (old JIS) (ESC 0x24 0x40),<br />
JIS X 0208-1983 (new JIS) (ESC 0x24 0x42),<br />
GB2312-80 (simplified Chinese) (ESC 0x24 0x41),<br />
KS C 5601 (Korean) (ESC 0x24 0x28 0x43),<br />
JIS X 0212-1990 (ESC 0x24 0x28 0x44),<br />
ISO 8859-1 (Latin-1) (ESC 0x2e 0x41), and<br />
ISO 8859-7 (Greek) (ESC 0x2e 0x46).<br />
<br />
<br />
<br />
<br />
Gandalf<br />
:D<!--content-->Question:<br />
<br />
What in the heck does this mean in English:<br />
Line 8, column 70: an attribute value literal can occur in an attribute specification list only after a VI delimiter <br />
...nt-type" content="text/html; charset="Shift_JIS"><br />
<br />
<br />
By the way, I finally got the W3C Validator to read my code after I forced a DOCTYPE, since for some reason it wasn't reading mine which is: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">. Though I think this problem will fix itself once we get this CHARset stuff taken care of along with the other page errors.<br />
<br />
Second Question:<br />
In this line, [ ...nt-type" content="text/html; charset="Shift_JIS"> ] does the charset need to be as is or does case matter? Can I write it in all lower case letters and get the same result?<br />
<br />
I really hope we get this problem taken care of soon...<br />
<br />
Thanks,<br />
Gandalf<br />
:D<!--content-->I already said above to change the HTML 4.0 to be HTML 4.01 instead; right there in the DOCTYPE.<br />
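<br />
For reference, here is a sketch of the HTML 4.01 Transitional declaration, assuming the page stays Transitional as in the original post. The thread only mentions changing the version number; the URL on the second line is the system identifier given in the HTML 4.01 specification and is shown here as the usual full form.<br />

```html
<!-- HTML 4.01 Transitional: public identifier plus system identifier (DTD URL) -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
```

The declaration has to be the very first thing in the file, before the opening html tag.<br />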
<br />
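Case does not matter for the charset name: HTML 4.01 treats character encoding names as case-insensitive, so ISO-2022-JP and iso-2022-jp are equivalent. Writing the name exactly as it appears in the W3C examples and the IANA list is still the safest habit.<br />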
<br />
<br />
Post the URL of the page so I can take a look at what is going on. It really should not be this difficult.<br />
<br />
<br />
<br />
<br />
<br />
Umm, dude, you changed TWO things in your version. You fixed one thing and broke something else at the same time.<br />
<br />
You posted an example with:<br />
<br />
content="text/html; charset=Shift_JIS"<br />
<br />
Now you are saying you have:<br />
<br />
content="text/html; charset="Shift_JIS"<br />
<br />
<br />
Look carefully at how many quote marks you have in the last one. That one is wrong. The first one has the correct number of "quotes" in it.<br />
<br />
The error about the attribute specification list and delimiters is the validator's way of complaining about the extra quote mark. The content value ends at the second quote, so the validator then runs into another quoted string that is not preceded by an = sign (the "VI delimiter" in the message).<br />
<br />
An attribute can hold a list of values, as in <font face="Verdana, Courier, Arial">, but the whole list still sits inside a single pair of quotes. A quote mark in the middle of a value always ends it, which is what is happening in your meta tag.<br />
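<br />
The two versions quoted above can be put side by side as a rough sketch. Shift_JIS is simply the charset used in the thread; it is an assumption that this matches the encoding the file is actually saved in, and the charset named here should be whatever that real encoding is.<br />

```html
<!-- Broken: three quote marks. The content value ends at the second
     quote, so it is only "text/html; charset=". The leftover
     Shift_JIS" is what the validator complains about. -->
<meta http-equiv="content-type" content="text/html; charset="Shift_JIS">

<!-- Fixed: one pair of quotes around the whole content value. -->
<meta http-equiv="content-type" content="text/html; charset=Shift_JIS">
```

Only the quoting differs between the two lines; the attribute names and the charset itself are unchanged.<br />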
<br />
<br />
<br />
.<!--content-->
 