Olaf Groeger wrote:
> http://www.unicode.org/faq/utf_bom.html#28). To have an example out of my
> work: We save content of a database in CVS. In the DB the unicode string
> has no BOM and when we save it to file and put it to CVS we don't add a
> BOM. All four variations are fully legal.

If you're saving the DB to a file the context is lost, so you should use a BOM. Otherwise it's just a binary file, not a Unicode text file.

> What cvsnt seems to do during commit is to cut off the first two bytes where
> it expects to have the BOM. And while checkout/update, it adds "0xFF 0xFE"

Basically the output of a -ku file is always a correct UTF-16LE file (actually UCS-2; UTF-16 is just an abstraction and not used in practice), as this is what Windows uses. Internally it's stored as UTF-8 anyway, so there's absolutely no difference between the types once it's in the repository (in fact, if you do update -kkv on the -ku file it'll give you a valid UTF-8 file instead).

The ability to check out into different types may be added someday, but it's not there at the moment. This just requires client-side changes and some more -k options. Support for UCS-4 is needed for a complete implementation.

If you don't have a BOM you get ambiguities, which CVS does its best to resolve, but it's not really a supported configuration. An automated tool like CVS can't be expected to guess 100% of the time what the file actually is (if you have any Japanese/Arabic/etc. then it's got no chance).

Tony
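
For illustration only, the commit/checkout behaviour described above can be sketched in Python. This is not cvsnt's code: the function names to_repository and to_working_copy are hypothetical, and falling back to little-endian when no BOM is present is an assumption of this sketch rather than what cvsnt actually does.

```python
import codecs


def to_repository(data: bytes) -> bytes:
    """Hypothetical helper: turn working-copy UTF-16 bytes into UTF-8 for storage."""
    if data.startswith(codecs.BOM_UTF16_LE):
        # BOM FF FE: drop it and decode as little-endian.
        text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    elif data.startswith(codecs.BOM_UTF16_BE):
        # BOM FE FF: drop it and decode as big-endian.
        text = data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        # No BOM: the byte order is ambiguous. This sketch simply assumes
        # little-endian; a real tool would have to guess.
        text = data.decode("utf-16-le")
    return text.encode("utf-8")


def to_working_copy(stored: bytes) -> bytes:
    """Hypothetical helper: always emit FF FE followed by UTF-16LE on checkout."""
    return codecs.BOM_UTF16_LE + stored.decode("utf-8").encode("utf-16-le")
```

With this sketch, a file committed without a BOM comes back from checkout with FF FE prepended, which matches what Olaf observed. It also shows why the BOM matters: the two bytes 4E 00 decode to "N" as UTF-16LE but to U+4E00 (a CJK character) as UTF-16BE, so without a BOM the tool cannot tell which was meant.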