crawling: routing and bandwidth conservation

Hi, I'm working on implementing a crawler which will eventually provide xcache-like functionality with some additional bells and whistles. I'm currently using jtella for the gnutella stack implementation.

For the implementation I'm researching ways to perform crawling by repurposing PONG/QUERYHIT messages without the need to flood the network with PINGs. I think avoiding the flood is a good thing, as most crawlers waste precious bandwidth.

In my initial investigation I'm noticing a lot of traffic which, if my reading of the protocol is correct, I shouldn't be seeing. Specifically, when connecting to a single host (e.g. connect1.gnutellanet.com:6346) and not sending a single PING message, I'm being forwarded PONGs and QUERYHITs.

My question: is this expected behavior (PONGs/QUERYHITs are broadcast) or are there just a lot of broken implementations? It's my understanding that the only messages which are broadcast (i.e. sent to all connected servents except for the sender) are PING and QUERY messages. All other messages (PONG and QUERYHIT messages) are "routed" back to the sender, by recording the GUID and sender of the original message and using this information when replying. Are these assumptions correct?
Yes, your assumptions are correct. Normally, PONGs and QUERYHITs should not be forwarded to you (nor, probably, to every other client connected to the one forwarding these messages). Why is this happening? Who knows... my guess is that there are indeed a number of defective clients out there. I even see clients that don't send a User-Agent clause in their connection string, which makes it really difficult to track down the defective client. Normally, your application shouldn't be able to work (at least it shouldn't gather any information). Unfortunately, some clients prefer to forward an unroutable message to all clients rather than dropping it.
I think it is correct.
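For reference, the routing rule discussed in this thread can be sketched roughly as follows. This is only a minimal illustration, not jtella's API or any particular client's implementation: the `Router` class, its `targets` method, and the string connection names are made up for the example. The payload descriptor values (0x00 PING, 0x01 PONG, 0x80 QUERY, 0x81 QUERYHIT) are the ones from the Gnutella 0.4 message header. TTL/hops handling, PUSH routing, and replies to the servent's own requests are left out.

```python
from typing import Dict, List, Tuple

# Payload descriptor values from the Gnutella 0.4 message header.
PING, PONG, QUERY, QUERY_HIT = 0x00, 0x01, 0x80, 0x81

# Each reply type travels back along the path of its matching request type.
REQUEST_FOR_REPLY = {PONG: PING, QUERY_HIT: QUERY}


class Router:
    """Hypothetical helper: decides where a received message is forwarded."""

    def __init__(self) -> None:
        # (request type, GUID) -> connection the request first arrived on.
        self.routes: Dict[Tuple[int, bytes], str] = {}

    def targets(self, msg_type: int, guid: bytes, sender: str,
                connections: List[str]) -> List[str]:
        if msg_type in (PING, QUERY):
            key = (msg_type, guid)
            if key in self.routes:
                return []  # duplicate request: drop it
            self.routes[key] = sender  # remember the way back
            # Broadcast to every connection except the one it came from.
            return [c for c in connections if c != sender]

        if msg_type in REQUEST_FOR_REPLY:
            origin = self.routes.get((REQUEST_FOR_REPLY[msg_type], guid))
            if origin is None or origin not in connections:
                return []  # unroutable reply: drop it, never broadcast
            return [origin]

        return []  # other message types are out of scope for this sketch


router = Router()
conns = ["a", "b", "c"]
print(router.targets(PING, b"guid-1", "a", conns))   # request is broadcast
print(router.targets(PONG, b"guid-1", "b", conns))   # reply goes back to "a"
print(router.targets(PONG, b"unknown", "b", conns))  # no route: dropped
```

A client that behaves like this never hands a PONG or QUERYHIT to a connection that did not carry the matching PING or QUERY. The defective behavior described above corresponds to replacing the "unroutable" branch with a broadcast.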
Copyright © 2020 Gnutella Forums.
All Rights Reserved.