



Hello,
Today I tried opening a 1.3 MB text file, actually a MySQL database dump with a .sql extension. The content is UTF-8, but it was recognized as Latin-1.
So I went to the preferences, and tried to ask for UTF-8 by default and to disable Auto-Detect. Didn't work. I tried maybe six or seven configurations, restarted Komodo Edit (4.2.1) each time, and I always got Latin-1 in the end. Duh.
I tried setting Language-specific Default Encoding as "UTF-8" for SQL, but it didn't work either.
So in the end I can't get Komodo Edit to recognize the file encoding as UTF-8, even though the file [i]is[/i] UTF-8, the system encoding is UTF-8 too, etc.
And I can't use Current File Settings to set the encoding as "UTF-8", because that would only mess up the data when I save the file, instead of reinterpreting the file with a different encoding. It would go like this:
1. UTF-8 file is opened. Let's say it contains a "é" character.
2. Komodo Edit thinks the file is Latin-1. It prints "Ã©" in the editor.
3. I go to Current File Settings and change file charset from "Latin-1/ISO-8859-1" to "UTF-8".
4. Display is not affected. Komodo still prints "Ã©".
5. I save the file. "Ã©" gets saved as UTF-8 data. So all my data is corrupted.
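The steps above can be reproduced outside the editor. This is only a minimal sketch in plain Python of what the bytes go through, assuming the editor decodes as Latin-1 and re-encodes as UTF-8 on save. It does not use any Komodo API, and the helper name `undo_double_encoding` is made up for illustration.

```python
# Simulates steps 1-5 with plain Python strings; no Komodo API is involved.

original_text = "é"

# Step 1: the file on disk holds the UTF-8 bytes for "é".
file_bytes = original_text.encode("utf-8")

# Step 2: the editor wrongly decodes those bytes as Latin-1.
shown_in_editor = file_bytes.decode("latin-1")

# Steps 3-5: the charset is switched to UTF-8 and the file is saved,
# so the already-misread characters are encoded again.
saved_bytes = shown_in_editor.encode("utf-8")

print(file_bytes)       # b'\xc3\xa9'
print(shown_in_editor)  # Ã©
print(saved_bytes)      # b'\xc3\x83\xc2\xa9'


def undo_double_encoding(data: bytes) -> str:
    """Hypothetical helper: reverse one round of Latin-1/UTF-8 double encoding."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


print(undo_double_encoding(saved_bytes))  # é
```

The two original bytes become four after the save, which is why changing the charset in Current File Settings alone cannot fix the display.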
Is there a way to tell Komodo Edit to reinterpret a file with a different encoding? If not, shouldn't that be an option in Current File Settings, or simply a «View» menu option like in Firefox (best tool for guessing what a file's real encoding is, save for hex editors)?
I could open the file in Firefox, tell Firefox it's UTF-8 (in case it chooses something else), and then copy-paste the result into Komodo with file encoding set as UTF-8. That would work, but I'm not sure copy-pasting 1.3 MB of Unicode text is such a good idea. And it's only a really small database. Imagine the pain with a 40 MB database dump...
So... did I miss something? Or is it that:
- Auto-Detect is broken and
- there is no "Reinterpret file as <charset>" feature in Komodo?
After a few tests, my reckoning is that:
1. Default Charset is only for creating new files. Opening a file doesn't use this parameter; the encoding on open is handled only by Auto-Detect (and possible exceptions to Auto-Detect as seen in the Internationalization options). Right?
2. Auto-Detect didn't work in my case because the file is not valid UTF-8. Some editors wouldn't open the file because of that. I opened it with Firefox, set the encoding as UTF-8, cut 'n pasted the result in Komodo and saved that as UTF-8. I get a file that has 35 more bytes than the original one. And now it opens ok in those editors, and Komodo recognizes it as UTF-8.
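A file like this can also be checked without an editor. The sketch below is one possible way, in Python, to list the byte offsets where strict UTF-8 decoding fails and to write a cleaned copy. The function names `find_invalid_utf8` and `repair_utf8` are made up for illustration. Replacing bad bytes with U+FFFD is an assumption about what a repair might do; whether Firefox did that here, and whether it explains the 35 extra bytes, cannot be told from the thread.

```python
def find_invalid_utf8(data: bytes) -> list[int]:
    """Return the byte offsets where strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice data[pos:].
            offsets.append(pos + err.start)
            pos += err.end  # resume just after the offending bytes
    return offsets


def repair_utf8(data: bytes) -> bytes:
    """Replace each undecodable sequence with U+FFFD and re-encode as UTF-8."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


# "caf" + a lone Latin-1 e-acute byte + " ok " + a byte never valid in UTF-8
sample = b"caf\xe9 ok \xff"
print(find_invalid_utf8(sample))             # [3, 8]
print(len(sample), len(repair_utf8(sample)))  # 9 13
```

Each lone invalid byte turns into a three-byte U+FFFD here, so a repaired file comes out slightly larger than the original. Slicing on every error copies the remaining data, which is fine for a small dump but slow for a file with many bad bytes.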
So that problem is solved (why I get an invalid dump is another problem ^_^;). I think my related remarks are still valid though. :)
First, I would suggest trying the 4.3 beta to see if you have the same issue.
Does changing the encoding in the file properties work? If not, can you create a bug on bugs.activestate.com, and attach a zipped file that we can test against?