[Project_owners] Detecting file charset
zack.carter at gmail.com
Fri Feb 16 08:53:26 PST 2007
On 2/14/07, Karsten Düsterloh <mnenhy at tprac.de> wrote:
> Well, how should this work?
> If all characters are below 0x80, it's most probably(!) ASCII, and else?
compare each char to do so... but still this wouldn't be reliable.
> What makes 0xA4 be a euro sign instead of mere currency symbol?
I'm not sure if these are rhetorical questions or not. :(
I have found some interesting things though:
Universal Encoding Detector written in Python
which linked to this article,
I haven't read them yet, but they seem promising. I know Mozilla has some
detection built in, but no scriptable interfaces, it would seem...
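
For what it's worth, here is a rough sketch of the heuristic discussed above, using only the Python standard library. The function name guess_charset and the "unknown-8bit" label are made up for illustration. This is not how the Universal Encoding Detector or Mozilla's detector works; it only shows why the below-0x80 check is the easy part and why a byte like 0xA4 stays ambiguous.

```python
def guess_charset(data: bytes) -> str:
    """Very rough charset guess for raw bytes (hypothetical helper)."""
    # If every byte is below 0x80, the data is most probably ASCII.
    if all(b < 0x80 for b in data):
        return "ascii"
    # High bytes that form valid UTF-8 sequences rarely occur by accident,
    # so a clean decode is a reasonably strong hint.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Anything else is some 8-bit charset; the bytes alone do not say which.
    return "unknown-8bit"


# The same byte 0xA4 decodes to different characters depending on the charset:
as_latin1 = b"\xa4".decode("iso-8859-1")    # U+00A4, generic currency sign
as_latin9 = b"\xa4".decode("iso-8859-15")   # U+20AC, euro sign
```

Nothing in the byte itself says which of the two readings is right, so a real detector has to fall back on statistics or on outside information such as a declared charset.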