Tony Hoyle wrote:
> Binary files in particular can get truncated with the wrong options
> (especially on Win32.. Unix doesn't really make the distinction).

Thanks for clarifying... I did try looking through the source, but 'twas a bit confusing.

> A 16bit Unicode file would behave poorly if treated as an ordinary text
> file, as every alternate byte is null. CR/LF expansion would completely
> destroy them (turning "'A' NULL CR NULL 'B' NULL" into "'A' NULL CR LF
> NULL 'B' NULL" which is a completely different string).

When the client sends the files to the server, how are they encoded? text/unicode as utf8, binary as ?

Meanwhile... I've written a small app that scans through a set of directories to check that any unicode/binary files are correctly wrapped. It is quite useful to run before an import or add/commit, to check that nothing is going to go wrong.

I will make it available later today for anybody who is interested. It uses a heuristic algorithm that I hacked together to differentiate binary from text/unicode files. It works quite well for me, but could do with a few other people testing it out (especially anybody with unicode files that contain CJK). BTW, does anybody use UTF-32?

Perhaps the detection code could be put into cvsnt, so that binary/unicode files either get correctly wrapped or throw up a "danger will robinson" message -:)

Greetings from (rainy!) Luxembourg

-- 
David Somers
VoIP: FWD 622885
PGP Key = 7E613D4E
Fingerprint = 53A0 D84B 7F90 F227 2EAB 4FD7 6278 E2A8 7E61 3D4E
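
To make the two technical points above concrete, here is a rough Python sketch. It is not the scanning app mentioned in the message and not CVSNT code; the function names `expand_lf_to_crlf` and `classify`, the 80% threshold, and the "any NUL byte means binary" rule are all assumptions chosen for illustration. The first function shows how a byte-level newline expansion corrupts 16-bit Unicode data (it expands LF rather than the CR used in the quoted example, which is an assumption about the translation); the second is one possible heuristic for telling text, Unicode and binary files apart.

```python
import codecs


def expand_lf_to_crlf(data: bytes) -> bytes:
    """Naive byte-level text-mode translation: every LF byte becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def classify(data: bytes) -> str:
    """Hypothetical heuristic returning 'utf-32', 'utf-16', 'utf-8', 'text' or 'binary'."""
    # A byte order mark is the most reliable signal. UTF-32 must be tested
    # first, because the UTF-32 LE mark begins with the UTF-16 LE mark.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"

    # No mark: mostly-ASCII UTF-16 has a NUL in (nearly) every alternate byte.
    # CJK text does not show this pattern, so it is only caught via the mark.
    half = len(data) // 2
    if half and len(data) % 2 == 0:
        even_nuls = data[0::2].count(0)
        odd_nuls = data[1::2].count(0)
        if max(even_nuls, odd_nuls) >= 0.8 * half and min(even_nuls, odd_nuls) == 0:
            return "utf-16"

    # Assumption: any remaining NUL byte marks the file as binary.
    return "binary" if 0 in data else "text"


if __name__ == "__main__":
    original = "A\nB".encode("utf-16-le")
    damaged = expand_lf_to_crlf(original)
    print(original.hex(" "))  # bytes before expansion
    print(damaged.hex(" "))   # one extra CR byte shifts every later code unit
    print(classify(original), classify(b"plain text\n"))
```

The expansion inserts a single CR byte in front of the low byte of the newline, so the data ends up with an odd length and every following 16-bit unit is misaligned. A checker along these lines would flag such files before an import or commit so that they can be marked as Unicode or binary rather than plain text.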