September 22nd, 2006
Trivolve
Join Date: September 22nd, 2006
Posts: 4
"Merging" Identical files

I believe everybody has had this experience before. You search for something, and you get a few thousand results. Furthermore, you notice that some of the different files in the search are actually identical!

If two people download the same file from the same website and both share it on Limewire, would their files show up separately in searches? I believe so. But they are still essentially the same file, so they should be merged together in searches.

As far as I know, search results on the Gnutella network carry the properties of each shared file, so Limewire has access to the size of the file. I would suggest that the "actual size of file" be used instead of "file size on disk", because the size on disk varies with the cluster sizes of different hard disks and partition types.

So files with EXACTLY THE SAME NUMBER OF BYTES should be "merged together", so that a search shows 1 file with 2 hosts rather than 2 files with 1 host each. This would GREATLY increase the number of hosts for a particular file, especially at the start of a download (so a download can begin with many hosts rather than only 1), and would reduce the number of duplicate results.
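
To make the idea concrete, here is a minimal sketch in Python of grouping search hits by extension and exact byte count. It is not how Limewire actually works; the `(name, size_bytes, host)` tuple layout, the function name `merge_by_size`, and the sample hosts are all assumptions made up for illustration.

```python
import os
from collections import defaultdict


def merge_by_size(results):
    """Group hits that share a file extension and an exact byte count.

    `results` is a hypothetical list of (name, size_bytes, host) tuples.
    """
    merged = defaultdict(list)
    for name, size_bytes, host in results:
        ext = os.path.splitext(name)[1].lower()
        merged[(ext, size_bytes)].append((name, host))
    return dict(merged)


# Made-up sample data: two hosts share the same 10 MB file.
hits = [
    ("12345.mp3", 10_485_760, "host-a"),
    ("12345.mp3", 10_485_760, "host-b"),
    ("other.mp3", 9_999_999, "host-c"),
]

for (ext, size), entries in merge_by_size(hits).items():
    hosts = sorted({host for _, host in entries})
    print(f"{ext} {size} bytes: {len(hosts)} host(s) {hosts}")
```

With this sample data the two copies on host-a and host-b end up in one group, while other.mp3 stays on its own.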

Furthermore, as we are only "merging" files that have the exact same number of bytes, just what is the chance that two different files with the same extension would have the same number of bytes? For a typical 10 MB file (about 10,000,000 bytes), that chance is roughly one in 10,000,000 for any given pair of files, if sizes are spread evenly. So this WILL work.
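
As a rough check of that estimate, the arithmetic can be sketched in Python. This assumes file sizes are spread evenly over `possible_sizes` distinct byte counts, which is a simplification of mine and not something measured on the network; the function names are made up for illustration.

```python
import math


def pairwise_chance(possible_sizes):
    """Chance that two unrelated files have the same size, assuming every
    size from 1 to possible_sizes bytes is equally likely (an assumption)."""
    return 1.0 / possible_sizes


def any_collision_chance(num_files, possible_sizes):
    """Approximate chance that at least one pair among num_files unrelated
    files shares a size (the usual birthday-problem approximation)."""
    pairs = num_files * (num_files - 1) / 2.0
    return -math.expm1(-pairs / possible_sizes)


print(pairwise_chance(10_000_000))
print(any_collision_chance(1000, 10_000_000))
```

For a single pair this gives one in 10,000,000. Across a list of 1,000 unrelated results of that size range, the chance that some pair matches by coincidence comes out at roughly 5% under the same assumption, so an occasional coincidental match is possible in a very large result list.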

Also, what if, of the two people who downloaded the same file from some website, one changes the file name (but not the file extension, duh)? How can we merge these two? They STILL CAN BE MERGED.
Let's say the file name is 12345.mp3 but one of the people changes it to 123456.mp3. What I suggest is that while "merging" these two files, we also "merge" the two file names and other properties like the description, so whether I search for 12345.mp3 or 123456.mp3, I would still get the same file with 2 hosts.
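
The renamed-file case above can be sketched in Python as well. This is only an illustration under assumptions: the `MergedFile` class, `build_index`, `search`, and the host names are hypothetical and not part of Limewire, and search here is a plain exact match on the file name.

```python
import os
from dataclasses import dataclass, field


@dataclass
class MergedFile:
    """One merged search entry: every known name and host for a file."""
    extension: str
    size_bytes: int
    names: set = field(default_factory=set)
    hosts: set = field(default_factory=set)


def build_index(shared):
    """Merge (name, size_bytes, host) tuples by extension and exact size."""
    index = {}
    for name, size_bytes, host in shared:
        ext = os.path.splitext(name)[1].lower()
        key = (ext, size_bytes)
        if key not in index:
            index[key] = MergedFile(ext, size_bytes)
        index[key].names.add(name)
        index[key].hosts.add(host)
    return index


def search(index, query):
    """Return merged entries where any of the merged names equals the query."""
    return [entry for entry in index.values() if query in entry.names]


# Made-up sample: the same file shared under two different names.
shared = [
    ("12345.mp3", 10_485_760, "host-a"),
    ("123456.mp3", 10_485_760, "host-b"),
]
index = build_index(shared)
for query in ("12345.mp3", "123456.mp3"):
    for entry in search(index, query):
        print(query, "->", sorted(entry.names), len(entry.hosts), "hosts")
```

Either name finds the same merged entry, which lists both names and both hosts.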

So, what does everyone think of this new feature I am suggesting?

Last edited by Trivolve; September 23rd, 2006 at 02:20 AM. Reason: Make the idea clearer