July 15th, 2002
notgermy
crawling: routing and bandwidth conservation

Hi, I'm working on implementing a crawler which will eventually provide xcache-like functionality with some additional bells and whistles. I'm currently using jtella for the gnutella stack implementation.

For the implementation I'm researching ways to perform crawling by repurposing PONG/QUERYHIT messages without the need to flood the network with PINGs, which I think is a good thing as most crawlers waste precious bandwidth.
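As a rough illustration of the passive approach, here is a minimal Python sketch that harvests host addresses from PONG payloads that merely pass by. It assumes the Gnutella 0.4 PONG layout as I understand it (2-byte little-endian port, 4-byte IP address in network order, then two 4-byte little-endian counters). `parse_pong` and `HostCache` are hypothetical names for illustration only and are not part of jtella.

```python
import socket
import struct

# Assumed Gnutella 0.4 PONG payload: port(2) + ip(4) + files(4) + kilobytes(4)
PONG_PAYLOAD_LEN = 14


def parse_pong(payload: bytes):
    """Decode a PONG payload into (ip, port, files_shared, kbytes_shared)."""
    if len(payload) < PONG_PAYLOAD_LEN:
        raise ValueError("PONG payload too short")
    # '<' = little-endian, no padding; the 4 IP bytes are kept raw.
    port, ip_raw, files, kbytes = struct.unpack(
        "<H4sII", payload[:PONG_PAYLOAD_LEN]
    )
    return socket.inet_ntoa(ip_raw), port, files, kbytes


class HostCache:
    """Hypothetical host list built only from PONGs that were overheard."""

    def __init__(self):
        self.hosts = {}  # (ip, port) -> (files_shared, kbytes_shared)

    def observe_pong(self, payload: bytes):
        ip, port, files, kbytes = parse_pong(payload)
        self.hosts[(ip, port)] = (files, kbytes)
```

The same idea would apply to QUERYHIT payloads, which also carry a port and IP address, though their layout differs and is not covered here.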

In my initial investigation I'm seeing a lot of traffic which, if my reading of the protocol is correct, I shouldn't be receiving. Specifically, when connecting to a single host (e.g. connect1.gnutellanet.com:6346) and not sending a single PING message, I'm still being forwarded PONGs and QUERYHITs.

My question: is this expected behavior (PONGs/QUERYHITs are broadcast) or are there just a lot of broken implementations?

It's my understanding that the only messages which are broadcast (i.e. sent to all connected servents except the sender) are PING and QUERY messages. All other messages (PONG and QUERYHIT) are "routed" back to the sender: each servent records the GUID of a message and the connection it arrived on, and uses this information to send the reply back the same way.
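To make that reading concrete, here is a minimal in-memory Python sketch of the rule as described above. The `Servent` class and its methods are hypothetical (this is not jtella's API), and TTL/hop handling is left out on purpose.

```python
REQUESTS = {"PING", "QUERY"}  # broadcast; PONG and QUERYHIT are routed back


class Servent:
    """Toy model of a servent: broadcasts requests, back-routes replies."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.routes = {}     # GUID -> neighbor the request arrived from
        self.seen = []       # every (kind, guid) this servent received
        self.delivered = []  # replies to requests this servent originated

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def originate(self, kind, guid):
        self.routes[guid] = None  # None marks this servent as the originator
        for n in self.neighbors:
            n.receive(kind, guid, self)

    def reply(self, kind, guid):
        # Send a reply back over the connection the request came in on.
        self.routes[guid].receive(kind, guid, self)

    def receive(self, kind, guid, sender):
        self.seen.append((kind, guid))
        if kind in REQUESTS:
            if guid in self.routes:
                return  # duplicate request, drop it
            self.routes[guid] = sender
            for n in self.neighbors:
                if n is not sender:  # everyone except the sender
                    n.receive(kind, guid, self)
        elif guid in self.routes:
            back = self.routes[guid]
            if back is None:
                self.delivered.append((kind, guid))
            else:
                back.receive(kind, guid, self)
        # A reply with an unknown GUID is silently dropped.
```

In this model a reply only travels along the path its request took, so a servent that never relayed the matching PING or QUERY should not see the PONG or QUERYHIT at all.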

Are these assumptions correct?