Gnutella Forums  


General Gnutella Development Discussion For general discussion about Gnutella development.


  #1 (permalink)  
Old September 26th, 2002
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default [0.7 proposal] Count search replies

I already posted this once, but a typo in the subject may have kept most people from ever looking inside the post.
So here it is again under a new subject; I hope you forgive me for posting the same thing twice.

Summary: Insert a Query Reply Count (QRC) into the queries.
Fewer hosts get contacted for popular files.
Fewer replies get sent that are never seen.
Stops the *.mp3 query flood without ever forbidding it.

-----

You could include the number of search replies a query already got in the forwarded query.

That would sort out *.mp3 (and similar requests) without ever needing to forbid them and it would enable users to find rare files more easily, while still keeping the network traffic down, when searching for popular files.

I got the idea when I calculated how many hosts you can reach if every node has just 3 connections and the maximum hop count (HTL) is 7: fewer than 3,000.
I also calculated that with an HTL of 7 you'd need more than 5 connections per node to be able to reach all gnutella users (about 150,000).

With HTL=7 you'd need over 17 connections per node to reach every computer user in the world. With HTL=15 you'd need fewer than 5 (about 4.5).
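
The reach figures above can be reproduced with a small sketch. It assumes the same simplified model as the post: every hop multiplies the number of reachable hosts by the number of connections, with no overlap between paths. The function names are made up for illustration.

```python
def reachable_hosts(connections: int, htl: int) -> int:
    """Upper bound from the post's model: each hop multiplies the reach."""
    return connections ** htl


def connections_needed(target_hosts: float, htl: int) -> float:
    """Connections per node so that connections ** htl equals target_hosts."""
    return target_hosts ** (1.0 / htl)


# 3 connections, HTL 7: fewer than 3,000 hosts
print(reachable_hosts(3, 7))  # 2187

# about 150,000 gnutella users with HTL 7: a bit more than 5 connections
print(round(connections_needed(150_000, 7), 2))
```

A real network reaches fewer hosts than this, because some connections lead back to hosts that were already contacted.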

The way I suggest may allow you to increase the HTL again without choking the network.

As Kazaa might go down soon, gnutella could get many more users, and nodes would then need more connections to find most files, especially rare ones. Another approach would be to increase the HTL, but that would slow down the network, as popular files would get more and more replies which aren't really needed.

So I got the idea that you can reduce the number of hosts contacted for popular files by counting the replies each query has already received.

The Query Reply Count (QRC) would be included in each query. When a node sends its replies, it counts them and adds them to the QRC. If the QRC has then reached a limit (for example 10), the node doesn't forward the query.
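
As a rough sketch of the rule just described, one node's decision could look like the code below. This is not taken from any gnutella client; the `Query` class, the field names and the way the HTL is counted down are assumptions for illustration. It treats reaching the limit as enough to stop, as in the worked example later in this thread.

```python
from dataclasses import dataclass
from typing import Optional

MAX_QRC = 10  # assumed limit, the example value from the post


@dataclass
class Query:
    text: str
    htl: int      # hops the query may still travel (assumed semantics)
    qrc: int = 0  # replies counted so far along this path


def handle_query(query: Query, local_matches: int,
                 max_qrc: int = MAX_QRC) -> Optional[Query]:
    """Return the query to pass on to neighbours, or None to stop it here."""
    new_qrc = query.qrc + local_matches  # count the replies this node sends
    if new_qrc >= max_qrc or query.htl <= 1:
        return None                      # enough replies, or out of hops
    return Query(query.text, query.htl - 1, new_qrc)


# A node without matches forwards the query unchanged apart from the HTL.
print(handle_query(Query("rare song", htl=7), local_matches=0))
# A node with 10 matches answers and stops the query.
print(handle_query(Query("*.mp3", htl=7), local_matches=10))
```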

Here are some sample calculations:

For the first one I assume that every user has 3 connections and every third user has some of the requested files. To make it easier I set the maximum QRC to 1, so a host that sends any reply stops forwarding. The HTL is 7.
The number of hosts contacted without the QRC would be 3^7 = 2187 hosts.
With the QRC you have 2^7 = 128 contacted hosts and 2059 hosts which are not contacted. That means:
If
HTL=7
Hosts=3
popularity is 1/3
contacted hosts: o=128
not contacted hosts: x=2059

The same for
HTL=7
Hosts=6
Pop=1/3
contacted hosts: o=16,384
not contacted hosts: x=263,552

If the popularity is smaller you have:
HTL=7
Hosts=3
Pop=1/6
o=610
x=1576

or:
HTL=7
Hosts=6
Pop=1/6
o=78,125
x=201,811

That means if every 3rd user has the file, you contact only about 5.9% of the nodes within your reach (128 of 2187).
If every 6th has it, you contact only about 28% (610 of 2187).

That number increases a bit, as you want more than one reply for each request.
If every hundredth has files you'd contact 93% of the Hosts.
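
The sample calculations can be checked with a short sketch. It assumes the same model as above: a maximum QRC of 1, so a host with a matching file never forwards, and on average only the share (1 - popularity) of each host's connections passes the query on. The share of contacted hosts is taken here as o / (o + x). The function name is made up.

```python
from fractions import Fraction


def qrc_reach(connections, popularity, htl):
    """Return (contacted o, not contacted x, share o / (o + x))."""
    total = Fraction(connections) ** htl
    contacted = (connections * (1 - popularity)) ** htl
    return float(contacted), float(total - contacted), float(contacted / total)


cases = [
    (3, Fraction(1, 3)),
    (6, Fraction(1, 3)),
    (3, Fraction(1, 6)),
    (6, Fraction(1, 6)),
    (3, Fraction(1, 100)),
]
for hosts, pop in cases:
    o, x, share = qrc_reach(hosts, pop, 7)
    print(f"HTL=7 Hosts={hosts} Pop={pop}: o={o:.0f} x={x:.0f} share={share:.1%}")
```

The share depends only on the popularity and the HTL, not on the number of connections, because it reduces to (1 - popularity)^HTL.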

Requests like *.mp3 stop at the first nodes and get at most replies from 5 hosts (the direct neighbors).

That trick would enable you to scale gnutella up again without creating too much traffic. Naturally it depends like everything on users sharing.

Closing summary: Insert a Query Reply Count (QRC) into the queries.
Fewer hosts get contacted for popular files.
Fewer replies get sent that are never seen.
Stops the *.mp3 flood without ever forbidding it.

Finally, I attach a chart which shows how it works for popularity = 1/3 and hosts per node = 3.
(.gif)(please ignore the "half-transparent"-parts. Only the black lines count)
Arne Bab.

Comments? Are there errors in the calculation? Something else?
Attached Images
File Type: gif qrc-chart.gif (13.2 KB, 321 views)
  #2 (permalink)  
Old October 2nd, 2002
arne_bab

No, I don't know enough about it. I get the information through others, so I should never fully trust them (even if I sometimes do, shame on me).

So, how does it go? How great are the chances, that kazaa will survive it?

What do you think about the idea to improve the efficiency of gnutella?

PS: Napster went down and with it most of the network (the userbase). I expect the same with kazaa.
  #3 (permalink)  
Old November 5th, 2002
arne_bab
Default no answer

... still none. Was that only useless rant, or did you want to tell me something?
  #4 (permalink)  
Old November 6th, 2002
arne_bab

Quote:
Originally posted by bpmax

No... no shame on us... as a community...for listening to such rubbish........ But have you ever heard the term "troll" before?

Sure, but I think it means something completely different to me than to you.

homo sapiens ingentis:
Height: about 2.8m
Weight: 225kg
Vision spectrum: infrared and human spectrum

I never really was a member of the internet community, and I don't think I want to be one, at least not if the internet community is where you are.

Quote:

That depends on the money they sink into it.... But does this have to do with gnutella anyway, troller?
Think a bit.
P2p-filesharing depends on a userbase. Few Users mean few files, mean fewer Users.
Kazaa has many Users.
Most Users of Kazaa don't use Gnutella.
If Kazaa goes down, many Users might switch to Gnutella.
=> Gnutella gets more Users.
=> Gnutella has more files.
=> Gnutella gets better.

Quote:

... "the network"??? What do you mean by that, pal??? What network are you referring to? What "userbase" are you referring to? Do you even have a clue to what you are talking about?
*sighs* Seems I must write it much simpler.

NapsterClient = Program
NapsterNetwork = Form of distribution
UserBase = number of Users of the Network
NapsterClient = original Napster
NapsterNetwork = All Napster Servers
UserBase = All who use a Client which connects to a NapsterServer

NapsterClient went down but the
NapsterNetwork was still up, but the
NapsterNetwork didn't get the Users back, so the
Userbase went down.

Without Users the NapsterNetwork (all Napster-Servers) wasn't useful anymore, as it no longer had enough files.

=> If the UserBase is too small the best Program is rubbish.

Clear enough?

- twinstar end -

Last edited by arne_bab; November 6th, 2002 at 06:36 AM.
  #5 (permalink)  
Old November 7th, 2002
arne_bab

Quote:
Originally posted by bpmax

Wonderful thought... but it's not gonna happen anytime soon. The case was just presented to Scandinavian courts... in which US interest has no weight. How do you figure the US based associations have any "clout" in that region?
Ah, that was the first interesting bit of information I got from you.


Quote:

dead in the water since 5/01... and such a platform will NEVER again be possible. You are talking about a year and a half old DEAD issue.

P2P is a whole other ballgame from the server-
You do know that Kazaa needs a central server? How else could it be stopped?
FastTrack and Kazaa aren't the same, you said so yourself.

Besides: Learn from the past.

I wish I had self-confidence like yours: never having to prove anything, never needing to think about anything, but always sure that you are right.

This was my last message in this part of the thread, except if anyone posts anything really interesting.

Good to hear from you, and to know that you gave me at least one bit of information I didn't have: Kazaa is being sued in Scandinavia at the moment.
Sadly, the rest of what you wrote wasn't at all important or even interesting, and your discussion style wasn't worth a penny.

I'm sorry that I ever answered your first question.

Bye.
  #6 (permalink)  
Old November 8th, 2002
arne_bab

I just wrote how it felt when I read your messages.
I don't consider it good style to insult someone you're discussing with in the second message you write.
Sadly, I let myself get drawn in, too. Please accept my apology, especially for the following:

Quote:
... still none. Was that only useless rant, or did you want to tell me something?
Like most people, I don't like being told that I

Quote:
also obviously don't know much about the lawsuit... or it's chances of flying,
especially if I get no answer when I ask to learn more about it.

I never said I knew everything, and I can't even say I know much about Kazaa, but I try to learn how the programs work whenever I can, and I think I know the principle of gnutella and the maths behind it.

My idea was that you include a counter in the search queries which get forwarded through the network.
A client which receives a query adds to that counter the number of search replies (results) it sends back.
If the counter reaches a given limit, the client doesn't forward the query any further.

An extreme example: A user asks for *.mp3 (that's what forced the programmers to include filters, iirc).

Normal behavior (I ignore filters for now):
The query reaches about 5 contact-hosts. Those send their answers and forward it to 5 other contact-hosts, each.
From everyone we get many replies, which eat bandwidth.
If every host has 100 mp3s, we have 500 replies at the first step, 2500 at the second, 12500 at the third and so on.

Behavior with QRC (query reply count):
The query reaches the first 5 contacts. Those send their answers back. Then each of them adds its number of replies to the QRC. Now each copy of the query has a QRC of 100 and is no longer forwarded.
The query results in 500 mp3s.


Now a more standard example (2 of 5 hosts have 10 files, which match the query, that means the search is much too unspecific)(Maximum QRC is 10 replies):

Normal behavior:
The query gets to the first five hosts. Two of them have multiple files, which match the query. They send the replies. Three others don't have any matching files.
All five hosts forward the query to five other hosts, each.
At the first step you contact 5 hosts and get 2x10=20 results.
At the second step you contact 25 hosts and get 10x10=100 results.
At the third step you contact 125 hosts and get 50x10=500 results.
At the fourth step it's 625 hosts and 2,500 results.

You will never read through all those 2,500 results, so most of them are just garbage sent through the network.


Behavior with QRC (query reply count):
The query gets to the first five hosts. Two of them have multiple files, which match the query.
They add the number of replies to the QRC and as the number is equal to (or higher than) 10, they stop forwarding.
The other three hosts forward the query to five other hosts, each.
From those 15 hosts contacted in the 2nd step 6 have the files asked for, stop forwarding and give you 60 replies.
The other 9 hosts forward to 5 other hosts, each.
At the first step you contact 5 hosts and gain 20 results.
At the second step you contact 15 hosts and get 60 results.
At the third step you contact 45 hosts and get 180 results.
At the fourth step you contact 135 hosts (if I calculate correctly) and get 540 results.
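
The two step lists above can be recomputed with a small sketch. It assumes the same simplified model: every host forwards to 5 fresh hosts, 2 of 5 hosts have 10 matching files, and with QRC every host that answers reaches the limit of 10 on its own and stops forwarding. The function name and parameters are made up for illustration.

```python
from fractions import Fraction


def flood_steps(fanout, hit_share, hits_per_host, steps, use_qrc):
    """Return one (hosts contacted, results) pair per step."""
    rows = []
    contacted = Fraction(fanout)
    for _ in range(steps):
        answering = contacted * hit_share
        rows.append((int(contacted), int(answering * hits_per_host)))
        # With QRC, hosts that answered do not forward the query.
        forwarding = contacted - answering if use_qrc else contacted
        contacted = forwarding * fanout
    return rows


plain = flood_steps(5, Fraction(2, 5), 10, 4, use_qrc=False)
with_qrc = flood_steps(5, Fraction(2, 5), 10, 4, use_qrc=True)
print(plain, sum(results for _, results in plain))
# [(5, 20), (25, 100), (125, 500), (625, 2500)] 3120
print(with_qrc, sum(results for _, results in with_qrc))
# [(5, 20), (15, 60), (45, 180), (135, 540)] 800
```

Calling it with `hit_share=Fraction(1)` and `hits_per_host=100` gives the *.mp3 case: 500 results at the first step and nothing afterwards when QRC is used.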


Now the last example: A more specific search for a file, which about every hundredth host can answer, and that with less than five results:

Here normal behavior and behavior with QRC are nearly identical.
For the first two steps nothing can change.
To miss a file at the third step, one of your direct contacts would need to be one of the hosts which have the files (one out of a hundred). Then one of the contacts of that host would also need to have the files (again, one out of a hundred), and one of the contacts of that host would also need to be one out of a hundred.
The chance is 5/100=1/20 for the first host, 1/400 for the second host and 1/8000 for the third host.
When that happens you will already have at least 10 results.
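
The chances quoted above follow from multiplying 5/100 once per step. A minimal sketch of that estimate, assuming 5 contacts per host and a popularity of 1/100 as in the example (it is the post's rough estimate, not an exact probability, and the function name is made up):

```python
from fractions import Fraction


def chain_chance(fanout, popularity, depth):
    """Rough chance that `depth` hosts in a row along one path have the file."""
    return (fanout * popularity) ** depth


for depth in (1, 2, 3):
    print(depth, chain_chance(5, Fraction(1, 100), depth))
```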


Result:

Almost nothing changes for rare files, but searches for popular files no longer clutter the network.

Also, the chances of the music industry attacking gnutella by creating spamming hosts are greatly reduced. (I can make that clearer in another message, if necessary.)

The numbers again:
For a spam query you get 500 replies instead of 78,000 (500+2,500+12,500+62,500) with an HTL of 4.
For a popular query you get 20+60+180+540=800 replies instead of 20+100+500+2,500=3,120 with an HTL of 4.
For a specific search almost nothing changes.

Last edited by arne_bab; November 8th, 2002 at 09:19 AM.