General Gnutella Development Discussion: For general discussion about Gnutella development.
Re: Search based blocking and network clustering

And what good would come from separating the network exactly? Let's not start trying to add politics to a network that's supposed to be politically neutral, shall we? Thanks.
Re: Re: Search based blocking and network clustering

Personally I am more interested in reducing the traffic of 'illegal' porn through my PC - I wouldn't advocate the use of gnutella to many people I know because I would be embarrassed. Yes, I believe in free speech. I think this solution would be a good one: it would leave people with the right to publish what they like, but also give people the right not to take part in that publishing if they wish not to.

Nos
Re: Re: Re: Search based blocking and network clustering

It would be seriously destructive to the Gnutella network for some clients (or users of those clients) to arbitrarily decide that they don't want to forward queries just because they contain the word "porn" or something.

I think your idea would be better suited to a proprietary network where there are only 1 or 2 clients that exist to access it. Unless you can convince ALL other Gnutella client developers that your idea just rocks and to implement it (which I doubt would happen... you haven't yet convinced me it's worth implementing into my client), producing a client with features like that would be useless because there would be a dozen other clients out there that would be ignoring your efforts.

Last edited by Smilin' Joe Fission; April 8th, 2002 at 12:43 AM.
Re: Re: Re: Re: Search based blocking and network clustering

For HIGHLY UNPOPULAR terms such as lolita preteen xxx I expect it would have a significant impact on the time taken for searches to come back and for the client to find other clients who do not block those terms, assuming everyone installs a client which supports this system and configures the blocking. But because of the pluralistic nature of the population of gnutella users, for most search terms it wouldn't be any kind of problem.

Also, as I said, people would like to move clients closer together in some cases for some searches. Uses proposed have included language-specific searches. Once the grep searching ability is facilitated in gnutella searches, you will be able to block searches which do not contain a particular term.

If no one else thinks this is a good idea, not many clients will support it - maybe none, if I don't get my finger out. On the other hand, if it is a good idea, we will figure out how to do it and do it, and the idea will spread - users will demand it or move clients. Your opinion as an individual that 'the idea just plain sux' is only the opinion of one person, and you speak for no one except yourself. Thanks for your opinion. Anyone else with the same opinion, consider yourselves spoken for already by Smilin' Joe.

I believe the appropriate protocol response would be 'relay improper queries (402)'. So, many clients will understand what is being said to them (at least enough to display an error message to the user). BYE can even be retrofitted to 0.4 clients. If a client does not understand, that's fine, it just won't necessarily have the same degree of effect in moving clients closer/further away based on searches. But they will be disconnected by my client every time they try an unwanted search .. which is not a great penalty unless it is being done by many many clients.

Nos

<A HREF="http://www.sdf.se/~simon/marvin/songs/save_the_children.html">Who really cares? Who's willing to try?</A>

Last edited by Nosferatu; April 8th, 2002 at 02:31 AM.
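For illustration, here is a minimal sketch of how a client might construct such a BYE payload. The byte layout (a 2-byte little-endian status code followed by a NUL-terminated reason string) is an assumption based on the draft BYE proposal as I recall it, not a verified wire format; the code 402 and the reason text are just the values discussed above.

```python
import struct

def build_bye_payload(code: int, reason: str) -> bytes:
    # Assumed layout (per the draft BYE proposal, not verified here):
    # 2-byte little-endian status code, then a NUL-terminated ASCII reason.
    return struct.pack("<H", code) + reason.encode("ascii") + b"\x00"

payload = build_bye_payload(402, "Improper query")
print(payload)  # b'\x92\x01Improper query\x00'
```

A 0.4 client that doesn't understand the message simply sees the connection drop, which matches the "retrofitted" behaviour described above.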
Re: Re: Re: Re: Re: Search based blocking and network clustering
Re: Re: Re: Re: Re: Re: Search based blocking and network clustering

Can you at least tell me what type of abuse you think is inevitable?
Here is <A HREF="http://www.limewire.com/index.jsp/size">the real figure - it's around 300k - 1/2 million at the moment.</A> I guess their guess is better than mine. It's around 30 times the size of the generally accepted average horizon size.
You are assuming that the 10 randomly selected hosts you have chosen are ALL searching for something widely considered inappropriate. This is already unlikely, but no doubt will happen very, very occasionally. By definition, because this term is widely considered inappropriate, and as you say you are searching only for things widely considered legitimate, the next ten hosts you pick up are pretty much guaranteed to accept your searches and connections.
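A rough way to quantify "already unlikely": if some fraction f of hosts issue a widely-rejected search, the chance that all 10 independently chosen neighbours do so at once is f raised to the tenth power. The 5% figure below is purely an illustrative assumption, not a measurement:

```python
# f is an assumed, illustrative fraction of 'naughty' searchers, not data.
f = 0.05

# Probability that all 10 independently chosen neighbours are 'naughty'.
p_all_ten = f ** 10
print(p_all_ten)  # about 9.8e-14, i.e. effectively never
```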
Considered 'impossible'. Even if the impossible happened, all that would be experienced is that everyone would, for 10 seconds to a minute, be searching for 10 new hosts. Since the other 'hundreds or even thousands of clients' in your impossible scenario all connected to these 10 'naughty' hosts are receiving pongs through the 'naughty' hosts up until the time the 'naughty' guys perform their 'naughty' search, they will already have knowledge of a great number of 'nice' hosts, so they should find a new one without even having to visit a host cache.

Remember, the above is not going to happen. 'Naughty' searchers are going to appear rarely, one at a time. There is a scenario where what you describe is going to happen, which is during start up, if a very wide number adopt the strategy of 'specialising' their searches.

Let's look at it this way. I will try to describe a reasonable, but worst-case, scenario, where you the user are not searching for something considered inappropriate by most people. Say a very high number of people think specialist searches implemented using grep is a good idea, i.e. "search me for iso files only", or "search me for mp3s only". What do you think the upper limit would be? 40% of people might think this way? I think that is a very, very conservative figure.

OK, for a very back-of-the-envelope kind of figure:

n * p = t

where
n: total number of trials
p: probability that a connection will not reject you when you search
t: target number of host connections

which gives

n = t / p

If we say p = 0.6 (i.e. 60% 'good' connections, 40% 'bad') and you want to keep up 10 connections:

n = 10 / 0.6
n = 16.67

On average, a user searching for something which the specialists reject has to talk to 17-odd hosts at startup in order to establish 10 good connections. This does not consider any additional host-rejection scenarios.
We can generalise the answer by setting t = 1:

n = 1 / 0.6
n = 1.67

You have to connect to, on average, about 1.7 times as many hosts if 40% of people want to specialise and you search for something else. 40% is an astonishingly high proportion, and you have said yourself that you don't think this idea will take off at all. How about if we assume 20%, still a very high figure, but perhaps a realistic high point:

p = 0.8
n = 1 / 0.8
n = 1.25

Only a 25% increase in the number of initial host connections required at startup. And as I said before, this ignores the effect of hosts caching hosts who are similar to themselves. (Perhaps this effect would be insignificant anyway until you have done a few searches.)

Anyway, I guess this means it might be a good idea, when you are rejected by a host and have plenty of hosts in your cache, to delete the rejecting host from your cache, thus increasing the chances that clients cache hosts with similar search/drop criteria.
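The back-of-the-envelope figures above can be checked with a few lines of Python. This is only a sketch of the same n = t/p arithmetic, under the same assumption that each host accepts independently with probability p:

```python
def expected_contacts(target: int, accept_prob: float) -> float:
    """Expected number of hosts to contact to establish `target` good
    connections, assuming each host independently accepts your searches
    with probability `accept_prob`."""
    return target / accept_prob

print(round(expected_contacts(10, 0.6), 2))  # 16.67 - the 40%-rejection case
print(round(expected_contacts(10, 0.8), 2))  # 12.5  - the 20%-rejection case
print(round(expected_contacts(1, 0.8), 2))   # 1.25  - generalised per-connection cost
```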
I am more likely to find a result, eventually. I re-search every ten minutes, and get a different group of results. I can still download from most of the machines I located ten minutes ago if I don't find anything. The only ones I might not be able to download from are firewalled IPs.
<I>Added later: oops - I confused statistics and probability. No, I do not have the statistics, but I can model guessed probabilities - see later posting.</I>
Nos

"We can't train that boy as a Jedi because he is too old and too full of fear"

Last edited by Nosferatu; April 8th, 2002 at 11:53 PM.
Where do people get the idea that Gnutella isn't political? Read this thread and tell me it ain't!
Re: Search based blocking and network clustering

First and foremost, it'll give anyone "control" over the content seen by others - this is one of the things that's been bugging many developers: if you can control, you can also be asked to control specific files. One can endlessly discuss that particular issue, but basically a "hands-off" approach is the most appropriate in this case.

Second, the network isn't meant to block certain traffic - it's an open and free protocol. Blocking certain traffic is vendor specific, and that can lead into the great debates you've seen elsewhere on this forum (commercial vs. non-commercial, et al.). It is better to let the end-user decide what he/she is willing to see, for example the "Family Filters" seen in some clients. Obviously, some developers should make that password protected.

Think of GnutellaNet as an Internet atop the Internet. The Internet is an ungoverned place - things you find on GnutellaNet can also be found on the Internet itself, however distasteful that content may be. But as with the Internet, it is at your sole discretion to block/avoid these things, not the maker of an Internet browser.

-- Mike

PS: Mods!! (Morgwen, Cyclo) - for some reason my account is "disabled", I was unable to post with my actual account and had to go as Unregistered. I couldn't even PM you two - wassup?
Some figures

OK, the statistical math is very hard going and I'd probably get it wrong, and you probably wouldn't understand it (even if you do understand statistical maths!).

Using the binomial calculator at http://www.anu.edu.au/nceph/surfstat...me/tables.html I can quickly plug in n=17 and p=0.6 as determined previously and find the standard deviation: 2.

So we can say, for the horror scenario where 40% of people disallow searches that aren't for some specific resource and you aren't searching for that specific resource, that:

5% of the time you get 10 hosts in under 13 tries
33% of the time you get 10 hosts in under 15 tries
67% of the time you get 10 hosts in under 19 tries
95% of the time you get 10 hosts in under 21 tries

For a still fairly bad situation where 20% of people disallow, by plugging in n=12 and n=13 (the mean is 12.5) and p=0.8 as determined earlier, and finding that the standard deviation is 1.4:

5% of the time, 10 hosts in under 10 tries
33% of the time, 10 hosts in under 11 tries
67% of the time, 10 hosts in under 14 tries
95% of the time, 10 hosts in under 15-16 tries

I couldn't find any online application which will graph these outcomes in a useful way. I wonder whether any of the big commercial vendors have their own gnutella network modellers. If so, they could figure out better what would happen. I guess they wouldn't tell us though.

I wonder if there is a project yet to write a gnutella network model? It would be useful for exploring proposed protocol modifications, and I guess not much different from writing a client. The hard part would be writing analysis routines to make the data meaningful.

Nos
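The standard deviations quoted above follow from the Binomial(n, p) formula sqrt(n * p * (1 - p)); here is a small sketch reproducing them. Only the deviations are checked, since the percentile figures themselves came from the online calculator's normal approximation:

```python
import math

def binomial_sd(n: float, p: float) -> float:
    # Standard deviation of a Binomial(n, p) success count.
    return math.sqrt(n * p * (1 - p))

print(round(binomial_sd(17, 0.6), 1))    # 2.0 - the 40%-rejection case
print(round(binomial_sd(12.5, 0.8), 1))  # 1.4 - the 20%-rejection case (mean n = 12.5)
```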