Gnutella Forums

David91 · #3 · April 27th, 2003

I too have been trying to work out how Gnutella works and, although I have only looked at it briefly, I think the answers to your questions are:

If there were only a small number of peers to connect, time and the efficient use of bandwidth could achieve a stable linkage without any central server. But the moment the number of machines scales up, and those machines have differing hardware and software, varying versions of different client packages and variable amounts of bandwidth, the name of the game changes. Even in a small system of hosts with roughly equal resources, each member acts as a potential server for the others, so the premise of a "serverless" network does not really hold in any P2P context, small or large. P2P does not emulate the architecture of a LAN or WAN, even though file sharing facilities are common to all of them.

Because a flat, daisy-chained linkage between an ever-growing number of peers would take up too much bandwidth, the topology must be tiered rather than flat (i.e. some nodes with greater resources are designated ultrapeers, and they provide a gateway for leaf nodes to the greater whole). This system depends on routing tables, which are dynamic stores of the current linkages, both vertical (ultrapeer to leaf) and horizontal (ultrapeer to ultrapeer), for queries and downloads.
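To make the tiered picture a bit more concrete, here is a rough Python sketch of an ultrapeer's routing table as I understand the idea. It is not taken from any real client: the `Ultrapeer` class, its method names and the host names are all made up for illustration, and a real table would hold far more than a pair of sets.

```python
from dataclasses import dataclass, field


@dataclass
class Ultrapeer:
    """Hypothetical model of one ultrapeer and its dynamic link table."""
    name: str
    leaves: set = field(default_factory=set)      # "vertical" links to leaf nodes
    neighbours: set = field(default_factory=set)  # "horizontal" links to other ultrapeers

    def attach_leaf(self, leaf_name):
        # A leaf connects and is added to the table.
        self.leaves.add(leaf_name)

    def detach_leaf(self, leaf_name):
        # A leaf drops off; discard() is a no-op if it was never attached.
        self.leaves.discard(leaf_name)

    def link(self, other):
        # Horizontal links are recorded on both sides.
        self.neighbours.add(other.name)
        other.neighbours.add(self.name)

    def unlink(self, other):
        self.neighbours.discard(other.name)
        other.neighbours.discard(self.name)


a = Ultrapeer("up-a")
b = Ultrapeer("up-b")
a.attach_leaf("leaf-1")
a.link(b)
print(sorted(a.leaves), sorted(b.neighbours))
```

The point is only that the table is dynamic: entries come and go as leaves and ultrapeers connect and disconnect.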

But even this system cannot cope with the present numbers. So, to avoid saturating the network with mere "connection/maintain the linkage" messages, the search/query mechanism is limited by the TTL (time-to-live) system, under which a query only lasts two or three hops between ultrapeers. Your search horizon is therefore always limited by the "accident of availability": a search reaches only the ultrapeers that happen to be available within two or three hops, plus their leaf nodes. Wait five minutes, run the same search, and you might hit a completely different set of ultrapeers and leaf nodes, so dynamic is the system.
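The effect of the TTL on the search horizon can be sketched in a few lines of Python. This is a toy model under my own assumptions, not the actual protocol: the function name `search_horizon`, the two dictionaries and the host names are hypothetical, the TTL is counted purely in ultrapeer-to-ultrapeer hops, and the starting ultrapeer's own leaves are counted as reachable.

```python
from collections import deque


def search_horizon(links, leaves, start, ttl):
    """Return (ultrapeers, leaf nodes) a query from `start` can reach.

    links  maps each ultrapeer to the ultrapeers it is connected to.
    leaves maps each ultrapeer to its attached leaf nodes.
    ttl    is the number of ultrapeer-to-ultrapeer hops the query survives.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded any further
        for neighbour in links.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, remaining - 1))
    reached_leaves = set()
    for ultrapeer in seen:
        reached_leaves |= set(leaves.get(ultrapeer, ()))
    return seen, reached_leaves


# A chain of five ultrapeers: A - B - C - D - E
links = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
leaves = {"B": ["b1"], "C": ["c1", "c2"], "D": ["d1"], "E": ["e1"]}

ultrapeers, found = search_horizon(links, leaves, "A", ttl=2)
print(sorted(ultrapeers), sorted(found))
```

With a TTL of 2 the query from A never gets past C, so the leaves hanging off D and E are invisible however relevant their files are. If the links change between two runs, so does the horizon.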

As trap_jaw says, you must always start somewhere by announcing your presence to those already on the net, hence the listing of the "regularly available" hosts in the cache. (These listings are not comprehensive and do not present any greater security hazard than any other system for collecting the broadcast addresses of internet users.) You are correct that the metasystem is not yet using any rule-based protocol for selecting the nodes to which you are subsequently connected. For example, it would be convenient if you were only connected to ultrapeers with at least 50 leaf nodes, to ensure as wide a propagation of queries as possible. If you want to encourage the efficiency of the system, you can experiment by adding the efficient ultrapeers to the connection box (but, for several reasons, this is not guaranteed to improve results).
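The kind of rule I have in mind would look something like the Python sketch below. To be clear, this is wishful thinking rather than anything a client does today: `pick_ultrapeers`, its parameters and the host names are invented, and it assumes you somehow know each candidate's leaf count, which the host cache does not necessarily tell you.

```python
def pick_ultrapeers(candidates, min_leaves=50, limit=3):
    """Choose up to `limit` ultrapeers that have at least `min_leaves` leaves.

    candidates maps a host name to its (assumed known) leaf count.
    The busiest eligible hosts come first.
    """
    eligible = [(host, count) for host, count in candidates.items()
                if count >= min_leaves]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [host for host, _ in eligible[:limit]]


# Hypothetical candidates with made-up leaf counts.
known = {"up-a": 10, "up-b": 75, "up-c": 50, "up-d": 120}
print(pick_ultrapeers(known, min_leaves=50, limit=2))
```

Even with a rule like this, a high leaf count says nothing about how long the ultrapeer will stay up or what its leaves are sharing, which is one of the reasons the results are not guaranteed to improve.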