> From: cvsnt-bounces at cvsnt.org
> [mailto:cvsnt-bounces at cvsnt.org] On Behalf Of Glen Starrett
> Sent: Friday, 19 May, 2006 11:49
>
> If you were thinking of doing an overall checksum for the file it would
> be impractical -- having to calculate the checksum on the entire RCS
> file after every commit, tag, etc. would kill performance.

I believe that's highly dubious. CVS has to rewrite the whole file for most operations, since the RCS file format is plain-text (and consequently full of, in effect, variable-length records). Computing a checksum over the contents before writing them out would be a matter of a handful of cycles for typical algorithms.

I'm willing to bet that for the vast majority of users even a cryptographic hash like MD5 or (better) SHA-256 would not have a noticeable effect on performance. It's negligible relative to disk I/O time - not to mention network I/O for remote repositories, which is what most people use anyway.

I wrote an implementation of MD5 in COBOL (!) recently, and on an IBM laptop that's a couple of years old (so hardly top-of-the-line hardware) it gets about 5MB/s throughput - including the disk I/O. And COBOL is not an ideal language for this operation. Calculating checksums is not expensive.

Frankly, I think Walter Tichy should have included at least a CRC (the output of the "sum" utility would have been fine) in the original RCS file format. Even for machines of the day the cost would have been negligible.

I'd patch it in myself, but I'm not using a current build (we're still stuck on 2.0.51d at the moment), so there wouldn't be much point in my hacking the version I use.

--
Michael Wojcik
Principal Software Systems Developer, Micro Focus
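For anyone who wants to check the cost argument on their own hardware, here is a minimal sketch in Python. It is only an illustration, not anything from CVSNT: the function name checksum_report and the sample buffer are hypothetical, and the buffer merely stands in for the text of an RCS file that is about to be rewritten. It computes a CRC-32, an MD5 and a SHA-256 over the in-memory contents and times each one.

```python
import hashlib
import time
import zlib


def checksum_report(data: bytes) -> dict:
    """Return {name: (hex_digest, seconds)} for CRC-32, MD5 and SHA-256.

    `data` stands in for file contents already held in memory, i.e. the
    point just before a rewritten file would be flushed to disk.
    """
    algorithms = (
        ("crc32", lambda b: format(zlib.crc32(b) & 0xFFFFFFFF, "08x")),
        ("md5", lambda b: hashlib.md5(b).hexdigest()),
        ("sha256", lambda b: hashlib.sha256(b).hexdigest()),
    )
    results = {}
    for name, fn in algorithms:
        start = time.perf_counter()
        digest = fn(data)
        elapsed = time.perf_counter() - start
        results[name] = (digest, elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample: a few MB of repetitive text, not a real RCS file.
    data = b"@@ sample revision text\n" * 200_000
    size_mb = len(data) / (1024 * 1024)
    for name, (digest, elapsed) in checksum_report(data).items():
        # Guard against a zero reading from a very coarse timer.
        rate = size_mb / max(elapsed, 1e-9)
        print(f"{name:7s} {digest}  {rate:10.1f} MB/s")
```

The throughput figures will vary by machine, so none are quoted here; the point is only to compare them with the time the same machine needs to write a file of that size to disk.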