Gnutella Forums

Gnutella Forums (https://www.gnutellaforums.com/)
-   General Gnutella Development Discussion (https://www.gnutellaforums.com/general-gnutella-development-discussion/)
-   -   Packet routing (https://www.gnutellaforums.com/general-gnutella-development-discussion/13158-packet-routing.html)

Cakkie July 3rd, 2002 03:47 AM

Packet routing
 
I ahve a few questions about routing packets. If I send a PING or QUERY, I assign a GUID to my packet, I store the packet so that I can look it up when I receive a PONG or QUERYHIT. No problem here, since I created the GUID, and I started the packet.

However, what should I do if I receive a packet, and need to forward it. A PING or QUERY I forward to each connection (except the originating), but what about PONG and QUERYHIT, to which socket must I send this? I can imagine I have to keep track of this too, but I don't really have a clue on how to do this. I could just send it to each connection, but that would create a large "polution", wouldn't it?

I was also told that each packet must have a unique ID, and if you come across the same ID twice, the second packet should be dropped from the network, since you already procedded it. But what happens when someone does a QUERYHIT, using the GUID from the QUERY, that would result in the packet beeing removed from the network almost immediatly.

cultiv8r July 3rd, 2002 03:33 PM

You also need to keep track of the GUIDs for other packtes that are not generated by you. And for Query Hits, you must also store the ServentID for a little while, because Push requests will use that.

This way, you know how to send a return packet to the originator. For Pongs and Query Hits, you can see multiple come back with the same GUID - you should not drop those. But for multiple Pings or Queries with the same GUID - yes, those must be dropped.

IMO, this is one of the drawbacks of the coding part of Gnutella. But it ensures the traffic won't go haywire.

Cakkie July 3rd, 2002 11:06 PM

Thanx, I though it would be something like that. Another question, how long must I keep a GUID? I can imaging that a ping (and it's pongs) should be long out of the network after 1 minute. However, a queryhit not, since it can receive a push, which could be sent a few mintes later, or is it a good idea to always keep track of servent id's, and not remove them after some time?

cultiv8r July 4th, 2002 12:39 AM

Generally, one would say "the longer the better". But you'll notice you'll end up with a lot of GUIDs and Servent IDs to track that way. I would like to see some comments from other Gnutella developers on that as well.

Vinnie July 4th, 2002 08:29 PM

Solution
 
BearShare's solution is to create two tables.

One starts out empty (call this full) and the other starts out big enough for 1,000 GUIDs (call this recent).

Each time you see a GUID, check full, and then check recent, in that order, to see if it already exists.

Each table can use a binary tree for efficient searching.

If it already exists, do nothing.

If it doesn't exist, then the binary traversal that you took will show you where to insert it. Insert the new GUID into recent.

When recent is full, perform the following algorithm:

- Subtract the time that an entry was first stored in recent from the current time. This is the time in seconds it took to fill recent

- Calculate the number of entries required to store 10 minutes worth of GUIDs, based on the number of GUIDs stored in recent (1,000 initially). The formula is:

needed = ( count * 10 * 60 ) / seconds

where count is the number of entries in recent, seconds is the time to fill recent, and needed is the number of entries of storage needed to hold 10 minutes worth of data.

- Discard the contents of full and reallocate full to needed entries.

- Swap full and recent

At this point, table full is really full, and table recent is empty, but has enough room to hold needed items.

This algorithm requires only two variably sized allocations. Using a binary tree, you get the fastest possible find. As an added bonus, insertions execute in constant time because you can re-use the information gleaned during the find (which you had to perform anyway to check for duplicates) to determine the insertion point.

Using a slight variation, the same table structure can be used to store PUSH routes. The main difference for storing push routes, is that entries are always added into recent even if they already exist in full. It is also wise to keep these push routing tables on a per-host basis, and do a binary search in each pair of tables, for each host, when routing pushes.

This solution automatically adapts to changing network conditions. It guarantees at least 10 minutes worth of storage, and usually provides more (since there are two tables).

The memory demands are modest, and there is, on average, only a single allocation performed every 10 minutes.

I rather doubt a better algorithm exists.

Vinnie July 4th, 2002 08:33 PM

Quote:

Originally posted by Cakkie
I can imaging that a ping (and it's pongs) should be long out of the network after 1 minute.
Pings are rarely routed using the GUID anymore, LimeWire's "pong caching" algorithm should be used instead.

Quote:

However, a queryhit not, since it can receive a push, which could be sent a few mintes later, or is it a good idea to always keep track of servent id's, and not remove them after some time?
You are correct, that pushes can be sent MUCH later from when the associated query hit was received.

And duplicate queries can arrive some minutes later from the original.

BearShare tracks these latencies in the Statistics page, under the "Oldest" column.

1) Oldest Query: longest time span between duplicate queries

2) Oldest Query Hits: longest interval between a query and its associated query hits

3) Oldest Push: longest interval between a query hit and an assoicated push

For 1, average time frames are between 5 and 15 minutes

For 2, average time frame is from 2 to 8 minutes

For 3, I have seen values as high as 13 hours (running for a long time with very stable hosts/horizon). This means that someone requested a download 13 HOURS after receiving the query hit.

I strongly recommend the two-table approach for storing push routes - this single change resulted in a vast improvement in push routing for the entire network. LimeWire switched to this scheme after they discovered that a fixed 10,000 element table is only good enough for about 80 seconds worth of data.

Syfonic July 21st, 2002 12:09 PM

very good ideas/solutions :)


All times are GMT -7. The time now is 10:16 PM.

Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
SEO by vBSEO 3.6.0 ©2011, Crawlability, Inc.

Copyright © 2020 Gnutella Forums.
All Rights Reserved.