View Single Post
  #7 (permalink)  
Old January 6th, 2002
Unregistered
Guest
 
Posts: n/a
Talking Follow up

About Huge, when i said Huge look complicate i mean from what you tell me it's more about verification of data integrity.

I prefer less verification and better speed (smaller protocol)
as long as verification is good enough.

About CRC it's really not a good idea to use CRC16 or CRC32 the last one give 4 Billions values, that not enough you could get wrong duplicates files. SHA1 use a 20 bytes (160bits) this give a lot more possibility. To give an idea it would be arround,
4Billions * 4Billions * 4Billions, ... You get the point, with this amount of possibility you reduce the change of making false duplicates.

SHA1 speed is fast enough +1MB/Sec but it's not that important, a client program could generate all the sha1 at startup and cache
this information in memory 1000 files would required only 20KB
of memory. Doing a hash for each query would not be a good idea...

Marc
Reply With Quote