November 8th, 2002
arne_bab
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881

I only wrote how it felt when I read your messages.
I don't consider it good style to insult the person you are discussing with in the second message you write.
Sadly, I let myself get drawn in, too. Please accept my apology, especially for the following:

Quote:
... still none. Was that only useless rant, or did you want to tell me something?
Like most people, I don't like being told that I

Quote:
also obviously don't know much about the lawsuit... or it's chances of flying,
especially if I get no answer to a request to learn more about it.

I never said I knew everything, and I can't even say I know much about kazaa, but I try to learn how the programs work whenever I can, and I think I know the principle of gnutella and the maths behind it.

My idea was to include a counter in the search queries that get forwarded through the network.
A client that receives a query adds to that counter the number of search replies (results) it sends back.
If the counter reaches or exceeds a given maximum, the client doesn't forward the query any further.
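
To make the rule above concrete, here is a minimal sketch of what a client could do with an incoming query. It is not taken from any real gnutella client: the names Query, handle_query and MAX_QRC are made up for illustration, the limit of 10 is an assumption, and the simple substring match only stands in for whatever matching a real client does.

```python
from dataclasses import dataclass

MAX_QRC = 10  # assumed limit, chosen only for illustration


@dataclass
class Query:
    text: str
    qrc: int = 0  # query reply count carried along with the query


def handle_query(query, local_files, neighbours):
    """Answer a query locally, then decide whether to forward it.

    Returns (replies, forwards); forwards is a list of
    (neighbour, Query) pairs and is empty once the counter
    has reached the limit.
    """
    # Hypothetical matching: plain case-insensitive substring search.
    replies = [name for name in local_files
               if query.text.lower() in name.lower()]

    # Add the number of replies this client sends back to the counter.
    new_qrc = query.qrc + len(replies)

    # Stop forwarding once the counter has reached the limit.
    if new_qrc >= MAX_QRC:
        return replies, []

    forwarded = Query(query.text, new_qrc)
    return replies, [(n, forwarded) for n in neighbours]
```

A host without matching files passes the query on with an unchanged counter, so rare files are searched for just as far as before.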

An extreme example: a user searches for ".mp3" (that's what forced the programmers to include filters, iirc).

Normal behavior (I ignore filters for now):
The query reaches about 5 contact hosts. Each of them sends its answers and forwards the query to 5 other contact hosts.
Every one of them sends many replies, which eat bandwidth.
If every host has 100 mp3s, we get 500 replies at the first step, 2,500 at the second, 12,500 at the third and so on.

Behavior with QRC (query reply count):
The query reaches the first 5 contacts. They send their answers back, and each adds its number of replies to the QRC. Now the queries have a QRC of 100 and are no longer forwarded.
The query results in 500 mp3s.
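
The numbers of this extreme example can be checked with a few lines. This is only a sketch under the assumptions of the example (5 contacts per host, 100 matching mp3s on every host); the limit of 10 and the 4 steps are my own choice for the illustration, and any limit up to 100 gives the same result.

```python
FANOUT = 5            # contacts per host
FILES_PER_HOST = 100  # every host has 100 matching mp3s
STEPS = 4             # how far the query may travel (assumed)
MAX_QRC = 10          # assumed limit


def flooding_replies(steps):
    """Replies per step when every host answers and always forwards."""
    return [FANOUT ** step * FILES_PER_HOST for step in range(1, steps + 1)]


def qrc_replies(steps, max_qrc):
    """Replies per step when hosts stop forwarding at the QRC limit."""
    replies = []
    hosts, qrc = FANOUT, 0
    for _ in range(steps):
        if hosts == 0:
            break
        replies.append(hosts * FILES_PER_HOST)
        # Every host on the path adds its own replies to the counter.
        qrc += FILES_PER_HOST
        hosts = hosts * FANOUT if qrc < max_qrc else 0
    return replies


print(flooding_replies(STEPS), sum(flooding_replies(STEPS)))
print(qrc_replies(STEPS, MAX_QRC), sum(qrc_replies(STEPS, MAX_QRC)))
```

Without the counter the replies add up to 78,000 over four steps; with it the query stops after the first step with 500.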


Now a more standard example: 2 of 5 hosts have 10 files that match the query (which means the search is much too unspecific), and the maximum QRC is 10 replies:

Normal behavior:
The query gets to the first five hosts. Two of them have multiple files that match the query, and they send their replies. The other three don't have any matching files.
All five hosts forward the query to five other hosts each.
At the first step you contact 5 hosts and get 2x10=20 results.
At the second step you contact 25 hosts and get 10x10=100 results.
At the third step you contact 125 hosts and get 50x10=500 results.
At the fourth step it's 625 hosts and 2,500 results.

You will never read through all those 2,500 results, so most of them are just garbage sent through the network.


Behavior with QRC (query reply count):
The query gets to the first five hosts. Two of them have multiple files that match the query.
They add their number of replies to the QRC, and as the number is equal to (or higher than) 10, they stop forwarding.
The other three hosts forward the query to five other hosts each.
Of those 15 hosts contacted in the second step, 6 have the files asked for, stop forwarding and give you 60 replies.
The other 9 hosts forward to 5 other hosts each.
At the first step you contact 5 hosts and get 20 results.
At the second step you contact 15 hosts and get 60 results.
At the third step you contact 45 hosts and get 180 results.
At the fourth step you contact 135 hosts (if I calculate correctly) and get 540 results.
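
To double-check both step tables of this example, here is a small simulation. It is a sketch that only uses the assumptions of the example (5 contacts per host, 2 of 5 hosts with 10 matching files each, maximum QRC of 10, four steps); the function name simulate is made up.

```python
FANOUT = 5                    # contacts per host
MATCH_NUM, MATCH_DEN = 2, 5   # 2 of 5 hosts have matching files
FILES_PER_MATCH = 10          # matching files on such a host
MAX_QRC = 10                  # maximum QRC from the example
STEPS = 4


def simulate(use_qrc):
    """Return a list of (step, hosts contacted, results) tuples."""
    rows = []
    contacted = FANOUT
    for step in range(1, STEPS + 1):
        matching = contacted * MATCH_NUM // MATCH_DEN
        rows.append((step, contacted, matching * FILES_PER_MATCH))
        if use_qrc and FILES_PER_MATCH >= MAX_QRC:
            # Hosts that answered have hit the limit and stop forwarding.
            forwarding = contacted - matching
        else:
            forwarding = contacted
        contacted = forwarding * FANOUT
    return rows


for label, use_qrc in (("normal", False), ("QRC", True)):
    rows = simulate(use_qrc)
    for step, hosts, results in rows:
        print(label, "step", step, "hosts", hosts, "results", results)
    print(label, "total results", sum(r for _, _, r in rows))
```

The totals over four steps come out as 3,120 results for normal behavior and 800 with QRC.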


Now the last example: a more specific search for a file that about every hundredth host can answer, and each with fewer than five results:

Here normal behavior and behavior with QRC match nearly exactly.
For the first two steps nothing can change.
To miss a file at the third step, one of your first contacts would have to be a host that has the files (one out of a hundred). Then one of that host's contacts would also need to have the files (again, one in a hundred), and one of the contacts of that host would need to be one in a hundred as well.
The chance is about 5/100=1/20 for the first host, 1/400 for the second host and 1/8000 for the third host.
And when that happens, you will already have at least 10 results.
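
The chances above can be recomputed as follows. This is only a sketch under the assumptions of this example (5 contacts per host, one host in a hundred has the file, at most 4 results per host, maximum QRC of 10). It uses the same rough estimate of 5/100 per step and prints the exact per-step value next to it for comparison.

```python
import math
from fractions import Fraction

FANOUT = 5                    # contacts per host
P_HOST = Fraction(1, 100)     # one host in a hundred has the file
MAX_RESULTS_PER_HOST = 4      # "fewer than five results"
MAX_QRC = 10                  # assumed limit, as in the previous example

# Hosts with files that must follow each other on one path
# before the counter can reach the limit.
chain_length = math.ceil(MAX_QRC / MAX_RESULTS_PER_HOST)

# Rough estimate: chance that one of the 5 contacts has the file.
p_step = FANOUT * P_HOST

# Chance for a chain of 1, 2, 3 such hosts in a row.
chain = [p_step ** k for k in range(1, chain_length + 1)]

# Exact chance that at least one of the 5 contacts has the file.
exact_step = 1 - (1 - P_HOST) ** FANOUT

print(chain_length)
print([str(p) for p in chain])
print(float(exact_step))
```

So a query is only cut short after a chain of three such hosts, which the estimate puts at about 1 in 8000.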


Result:

Almost nothing changes for rare files, but searches for popular files no longer clutter the network.

Also, the music industry's chances of attacking gnutella by setting up spamming hosts are greatly reduced. (I can make that clearer in another message, if necessary.)

The numbers again:
For a spam query you get 500 replies instead of over 50,000 with an HTL of 4.
For a popular query you get 20+60+180+540=800 replies instead of 3,120 with an HTL of 4.
For a specific search almost nothing changes.

Last edited by arne_bab; November 8th, 2002 at 09:19 AM.