Gnutella Forums


arne_bab August 3rd, 2006 05:30 AM

Playing with the rules
 
I want to share some of the funny things I do with the search rules.

To follow this playing guide, you'll have to compile Phex for yourself (there's a guide to that in our wiki).

First step: Creating funny rules.

Most of my rules are pretty straightforward. For example, they hide all files with a size of "at least 197.7kB and at most 197.7kB", which means all files that are exactly 197.7kB in size (a standard spam file that I no longer wanted to see).

Another one hides all files smaller than 0 bytes, because I got spam with files reporting sizes below 0.
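
As a rough sketch of what these two size rules check, here is a small Python model. It is not Phex code: `SizeRule`, `is_hidden` and the byte value used for "197.7kB" are assumptions made for illustration (a size displayed as 197.7kB in a GUI really covers a small range of byte counts).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SizeRule:
    """Hypothetical hide rule: matches when min_bytes <= size <= max_bytes."""
    min_bytes: Optional[int] = None  # None means "no lower bound"
    max_bytes: Optional[int] = None  # None means "no upper bound"

    def matches(self, size: int) -> bool:
        if self.min_bytes is not None and size < self.min_bytes:
            return False
        if self.max_bytes is not None and size > self.max_bytes:
            return False
        return True


# Assumption: 1 kB = 1024 bytes, rounded to a whole number of bytes.
SPAM_SIZE = round(197.7 * 1024)

# "at least 197.7kB and at most 197.7kB" collapses to an exact-size match.
exact_spam = SizeRule(min_bytes=SPAM_SIZE, max_bytes=SPAM_SIZE)
# "smaller than 0 bytes" is an upper bound of -1 with no lower bound.
negative_size = SizeRule(max_bytes=-1)


def is_hidden(size: int) -> bool:
    """A result is hidden if any of the hide rules matches its size."""
    return exact_spam.matches(size) or negative_size.matches(size)
```

Setting the lower and upper bound to the same value is what turns a range rule into an "exactly this size" rule.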

But those are the simple and useful ones. More fun (and certainly more risky) are ban and download rules.

Ban rules (only available in svn at the moment) have the big advantage that scammers get banned automatically. But since Phex has to process a list of more than 8,000 IP addresses on each ban, auto-banning can very easily bring down your system, so use it with care.
Banning the hosts is more efficient than hiding or deleting results, because Phex doesn't acknowledge any reply from a banned host. Those replies don't get counted in Dynamic Querying, so your searches travel much further and you get more results, provided you manage to do it in a way that keeps your Phex running (I told you, it's a game ;) ).

For example, I sometimes activate a filter that just bans hosts that serve known spam (text translated, as I use a German Phex):
"on getting search results
with 'buylegalmp3.com' or 'efreeclub' or '----------------' in the filename
ban source"
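
The rule above can be modelled in a few lines of Python. This is only a sketch of the idea, not how Phex implements it: the function names, the `(host, filename)` tuples and the case-insensitive matching are all assumptions.

```python
# Substrings taken from the rule text above.
SPAM_MARKERS = ("buylegalmp3.com", "efreeclub", "----------------")


def is_spam_name(filename: str) -> bool:
    """True if the filename contains any known spam marker (case-insensitive)."""
    lowered = filename.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def apply_ban_rule(results, banned):
    """Hypothetical ban rule.

    results: iterable of (host_ip, filename) pairs from one search.
    banned:  set of host IPs, updated in place.
    Returns the results that come from hosts that are not banned.
    """
    results = list(results)
    # First pass: ban every host that offered a spam-named file.
    for host, name in results:
        if is_spam_name(name):
            banned.add(host)
    # Second pass: drop everything from banned hosts, not just the spam hits.
    return [(host, name) for host, name in results if host not in banned]
```

Note that the second pass drops every result from a banned host, which is the difference between banning a source and merely hiding one result.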

This sometimes makes Phex stop working (I assume it might come back after an hour or so, but I don't want to wait that long :) ), but if Phex survives that strain, I get far better results for some time (until the spammers change their IP addresses).

The second part of the game is download rules.

If I have a search that will yield many results for songs I want, but I don't want to click all 300 of them by hand, I just activate a rule that automatically downloads all music files, as long as they don't contain spam names or "Preview-t" or "INCOMPLETE".

Since the spam-rule processes and weeds out bad files before the download-rule runs, I mostly get good files with this, and downloading new files is as simple as hitting "start search" again.
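
The ordering described here (spam rule first, download rule second) can be sketched as a two-stage filter. Again this is an illustration, not Phex's rule engine: the function names and the list of music extensions are assumptions.

```python
# Assumption: which extensions count as "music files".
MUSIC_EXTENSIONS = (".mp3", ".ogg", ".flac")
SPAM_MARKERS = ("buylegalmp3.com", "efreeclub", "----------------")
BLOCKED_WORDS = ("preview-t", "incomplete")


def spam_rule(names):
    """Stage 1: remove results whose name contains a spam marker."""
    return [n for n in names
            if not any(marker in n.lower() for marker in SPAM_MARKERS)]


def download_rule(names):
    """Stage 2: pick music files that contain none of the blocked words."""
    return [n for n in names
            if n.lower().endswith(MUSIC_EXTENSIONS)
            and not any(word in n.lower() for word in BLOCKED_WORDS)]


def select_downloads(names):
    """Run the spam rule before the download rule, as described above."""
    return download_rule(spam_rule(names))
```

Because the download rule only ever sees what the spam rule let through, a new spam pattern needs to be added in one place only.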

But if you do this, don't be mad if you get duplicate files, virus-ridden mp3s, files with strange names, things you didn't really search for, or similar. You have to weed out bad files by hand, before you put them into your playlist.

When you do this, you will really love the ability of Phex to manage a few thousand downloads :)
Many will not work, but as there might well be 5 versions of a song, you might not have to worry about that so much. Just remember to remove duplicates, before putting them into your player :).
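
One way to find duplicates before they reach the player is to compare content hashes. The sketch below is an assumption about how one might do it outside Phex; `find_duplicates` and `sha1_of` are hypothetical helpers. It only catches byte-identical copies, not different encodings of the same song.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory):
    """Return files whose content matches an earlier file (by sorted name)."""
    seen = {}         # hash -> first path seen with that content
    duplicates = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        key = sha1_of(path)
        if key in seen:
            duplicates.append(path)
        else:
            seen[key] = path
    return duplicates
```

The function only reports duplicates; deleting them is left as a manual step, in keeping with checking files by hand.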


But please remember: These are funny games. They won't empty your refrigerator and eat your cat, but they may well get viruses on your computer, eat your video files or download scamware, because this is the risk if you don't check each download yourself.
But sometimes that's the price for careless playing. If you are willing to pay it, then good luck to you; if not, then take care and check automatic action-rules at least thrice before using them :)

Have fun!

Hyper-kun August 9th, 2006 06:13 PM

"Another one hides all files smaller than 0byte, because I got spam with files having sizes below 0."

That must be a bug in Phex. Search results use a fixed 32-bit field to indicate a file's size. This should be considered an unsigned integer, not signed, simply because the latter makes little sense. If it's exactly -1 (0xffffffff), there's probably a GGEP LF block, which is used for files as large as or larger than 4 GiB. This is recommended for files as large as or larger than 2 GiB because, as Phex shows, the interpretation of values beyond that (0x7fffffff) might differ. Anyway, I doubt that was spam, but it could be, of course. At least I've never seen such spam thus far.
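
To see how the same four bytes can come out as a negative size, here is a small Python illustration using the standard `struct` module. The helper name `read_size_field` is made up, and the little-endian byte order is an assumption about the wire format.

```python
import struct


def read_size_field(raw: bytes):
    """Decode a 4-byte size field both ways: (unsigned, signed)."""
    (unsigned,) = struct.unpack("<I", raw)  # unsigned 32-bit, little-endian
    (signed,) = struct.unpack("<i", raw)    # signed 32-bit, little-endian
    return unsigned, signed


# 0xffffffff: 4294967295 when unsigned, -1 when read as signed.
all_ones = read_size_field(b"\xff\xff\xff\xff")

# 0x80000000 (2 GiB): the first value that turns negative when read as signed.
two_gib = read_size_field(struct.pack("<I", 0x80000000))

# 0x7fffffff: the largest value on which both interpretations agree.
largest_safe = read_size_field(struct.pack("<I", 0x7FFFFFFF))
```

Any size of 2 GiB or more has the top bit set, so a client that reads the field as signed will report it as negative.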

"I sometimes activate a filter, which just bans hosts that serve known spam"

Sorry, but this is definitely stupid. It's not only ineffective, you also perform a DoS attack and a Joe job against yourself. I see quite a lot of requests for files I never had. When I check the SHA-1 checksums, that's almost always efreeclub spam. This means you would have banned me for nothing. Trust me this far: I'm no spammer, nor did I ever download or upload those files. In fact, nothing was ever downloaded from the host in question, and no, the IP address is not dynamic. And no, the configured port is not a standard port. These spammers really do pass addresses of random victims in search results. They are not really random, though: the addresses point to running peers, at least most of the time. The only other explanation would be a download mesh bug in some client, most likely LimeWire considering the frequency, but I doubt that. You should only ban hosts that really upload the file, and even then you shouldn't really ban all of those. You know how partial file-sharing works, don't you? However, if you do check the IP addresses with whois and/or just keep looking at the uploaders, you'll see, at least in some cases, who the real spammers are.

You might want to check Gtk-Gnutella's list of hostiles:
https://svn.sourceforge.net/viewvc/*...s/hostiles.txt

Even if you don't plan to copy it, you should give it a try. For example, you can use it in Ultrapeer mode, look at the passive search results and mark those listed as hostile. I don't know whether Phex can do that out of the box but you should see that's pretty effective and causes few to no false-positives. This is certainly far from being perfect but much better than banning random hosts without knowing what you're doing.
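
Marking results from listed hosts could look roughly like the following Python sketch. The file format is an assumption (one address or CIDR range per line, `#` starting a comment); check the real hostiles.txt before relying on it. `parse_hostiles` and `is_hostile` are hypothetical names.

```python
import ipaddress


def parse_hostiles(text):
    """Parse a hostiles-style list into network objects.

    Assumed format: one IP address or CIDR range per line,
    with '#' starting a comment. Blank lines are ignored.
    """
    networks = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False tolerates entries with host bits set, e.g. 192.0.2.5/24
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_hostile(address, networks):
    """True if the address falls inside any listed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


# Made-up sample using documentation address ranges, not real hostile hosts.
SAMPLE = """
# example entries
192.0.2.0/24    # a whole range
198.51.100.7    # a single host
"""
hostiles = parse_hostiles(SAMPLE)
```

With a list of a few thousand entries a linear scan like this is fine for experimenting; a client doing this per packet would want a faster lookup structure.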

Gtk-Gnutella also has a file blacklist (spam.txt). I'm not giving the URL here since the file is already huge (over 5 MB, listing 42464 items). This file should be obtained and updated through Subversion to keep the traffic to a minimum.

arne_bab August 9th, 2006 06:59 PM

I might lose some results by banning those who allegedly serve known spam files, but my experience shows that my search results got far better after banning efreeclub sources a few times, if only because they can then no longer hinder the dynamic querying. Also, I only ban based on search results, where you won't/shouldn't see hosts from the download mesh.

And heck, I told you, I am playing :)
This isn't a play-guide for those who don't want to risk losing results or outright losing contact with most sources :)

Still, your information about fake IPs being included is interesting (I only knew about that as a theoretical possibility).

And I think checking that hostiles file against the hostiles file in Phex could prove quite effective.

Integrating the spam-file will afaik take some coding, so it might take some more time.

arne_bab August 9th, 2006 07:05 PM

But I know a very simple explanation for adding random IPs into the results:

People tend to download files which are served by many hosts, and since the hosts that are listed are not known spammers, the files might get downloaded quite frequently.

It's quite a nice way to get your files to be seen (and your website to be hated... :) ).

Hyper-kun August 9th, 2006 07:39 PM

Yes, for example, you can discard everything with more than 16 sources in a GGEP ALT block. That's 100% spam. I don't know the limit for BearShare, but for LimeWire it's 10 and for Gtk-Gnutella it's 15.

A few months ago, some spammers appeared who use this for more efficient spamming. The old-school spammers still use multiple query hits. Some GUIs don't show this difference.
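
A filter along these lines is easy to sketch in Python. The hit representation (a dict with an optional `"alt"` list of alternate sources) is an assumption for illustration; a real client would first have to parse the GGEP ALT block out of the query hit.

```python
# Threshold from the post: more than 16 alternate sources is treated as spam.
MAX_ALT_SOURCES = 16


def drop_alt_spam(hits):
    """Keep only hits whose alternate-source list is plausibly genuine.

    hits: list of dicts; the "alt" key (hypothetical) holds the sources
    advertised in the hit's GGEP ALT block, and may be missing.
    """
    return [hit for hit in hits
            if len(hit.get("alt", ())) <= MAX_ALT_SOURCES]
```

Hits without any ALT block are kept, since the rule only targets hits that advertise more sources than the common clients would send.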

It's best to dump the raw packets and analyze the raw data. That gives you some clues what can be promptly blocked. Most of the time you can either block packets with certain abnormal characteristics, exact files or otherwise an IP (range). In a few cases, you really have to wait for more information. Just try to get as much information from a suspicious host as possible, that will give you a good idea whether it's really a spammer and if yes, how they work.


All times are GMT -7.
