[Greasemonkey] 0.5.1 (a few small things), and a plan
Nikolas Coukouma
lists at atrus.org
Wed Aug 24 15:16:09 EDT 2005
Bill Donnelly wrote:
> Although charset auto-detect is cool, what about putting in an
> @charset metadata item where people can specifically specify
> (I love saying that) what charset their script is written for/under?
> Like the one used for HTML docs:
>
> <meta http-equiv="Content-Type" content="text/html; ">
>
> Then you don't have to worry about auto-detect. (default to UTF-8?)
The problem is that charset declarations embedded in a document are a
pain to use[1], as pointed out in the original discussion[2]: you have
to guess an encoding before you can read the declaration that names it.
Using normalization form C (NFC) makes sense because it's used in the
IRI spec[3] and the W3C's character model for the web[4]. As for what
NFC is, I recommend a FAQ[5] for a quick overview and the comments on
Mark's blog[6] for why it causes nightmares.
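
To make the NFC point a bit more concrete, here is a minimal sketch. It
is written in Python purely for illustration (Greasemonkey itself is
JavaScript), and the helper name to_nfc is made up for this example. It
shows two spellings of the same visible character that compare unequal
until both are normalized to NFC.

```python
import unicodedata

# Two ways to spell "é": one precomposed code point, or a plain "e"
# followed by a combining accent. They render identically.
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 + U+0301 COMBINING ACUTE ACCENT


def to_nfc(text: str) -> str:
    """Return text in Unicode normalization form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# Without normalization the two strings are different code point sequences.
print(composed == decomposed)                    # False
# After NFC both collapse to the single precomposed code point.
print(to_nfc(composed) == to_nfc(decomposed))    # True
print(len(decomposed), len(to_nfc(decomposed)))  # 2 1
```

Anything that compares or hashes script text would presumably want to
do so on the normalized form, which is roughly why picking one form up
front is attractive.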
[1] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing
[2] http://www.mozdev.org/pipermail/greasemonkey/2005-May/002142.html
[3] http://ietf.org/rfc/rfc3987
[4] http://www.w3.org/TR/charmod/
[5] http://www.cl.cam.ac.uk/~mgk25/unicode.html#ucsutf
[6] http://diveintomark.org/archives/2004/07/06/nfc