[cvsnt] Re: cvswrappers

Fri Jul 8 12:39:01 BST 2005

Tony Hoyle wrote:
> Binary files in particular can get truncated with the wrong options
> (especially on Win32.. Unix doesn't really make the distinction).

Thanks for clarifying... I did try looking through the source, but 'twas a
bit confusing. 
> 
> A 16bit Unicode file would behave poorly if treated as an ordinary text
> file, as every alternate byte is null.  CR/LF expansion would completely
> destroy them (turning "'A' NULL CR NULL 'B' NULL" into "'A' NULL CR LF
> NULL 'B' NULL" which is a completely different string).

When the client sends the files to the server, how are they encoded?
text/unicode as utf8, binary as ?

Meanwhile... I've written a small app that scans through a set of
directories to check that any unicode/binary files are correctly wrapped...
quite useful before doing an import or add/commit to check that you aren't
going to go wrong.

I will make it available later today for anybody who is interested... it
uses a heuristic algorithm that I hacked together to differentiate binary
from text/unicode files... it works for me quite well... but could do with
a few other people testing it out (especially anybody with unicode files
that contain CJK).
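
For anybody curious what such a check might look like, here is a rough Python sketch of one possible heuristic. It is not the code from my app; the function name classify, the thresholds, and the three result labels are all made up for illustration. It looks for a byte-order mark first, then for the "every alternate byte is null" pattern, and finally counts control characters.

```python
import codecs

# Control bytes that commonly appear in ordinary text files.
TEXT_CONTROLS = {9, 10, 12, 13}  # tab, LF, FF, CR


def classify(data: bytes) -> str:
    """Guess 'text', 'unicode', or 'binary' (hypothetical labels)."""
    # A UTF-16 or UTF-32 byte-order mark is the strongest hint.
    boms = (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE,
            codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
    if data.startswith(boms):
        return "unicode"
    if not data:
        return "text"

    if b"\x00" in data:
        # BOM-less UTF-16 holding mostly Latin text has NULs at nearly
        # all even offsets or nearly all odd offsets, but not both.
        half = max(len(data) // 2, 1)
        even = data[0::2].count(0) / half
        odd = data[1::2].count(0) / half
        if max(even, odd) > 0.4 and min(even, odd) < 0.05:
            return "unicode"
        return "binary"

    # No NULs: call it binary only if control characters are frequent.
    controls = sum(1 for b in data if b < 32 and b not in TEXT_CONTROLS)
    return "binary" if controls / len(data) > 0.1 else "text"


print(classify("hello\r\n".encode("utf-16")))        # BOM present
print(classify("hello world\n".encode("utf-16-le")))  # no BOM
print(classify(b"plain text\n"))
print(classify(bytes(range(256))))
```

A sketch like this leans on the NUL pattern, so BOM-less UTF-16 that is mostly CJK (few or no null bytes) would be misjudged, which is exactly the sort of case that needs real-world testing.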

BTW, does anybody use UTF-32?

Perhaps the detection code could be put into cvsnt so that binary/unicode
files get correctly wrapped or throw up a "danger will robinson" message
-:)

Greetings from (rainy!) Luxembourg

-- 
David Somers
VoIP: FWD 622885
PGP Key = 7E613D4E
Fingerprint = 53A0 D84B 7F90 F227 2EAB  4FD7 6278 E2A8 7E61 3D4E