Gnutella Forums

Junk Filter (https://www.gnutellaforums.com/limewire-beta-archives/49917-junk-filter.html)

Grandpa December 17th, 2005 11:34 PM

Yes, I played with the slide bar, but my finding was that it didn't make much difference. When training, I did software searches that returned large amounts of the 851.7 virus, then I sorted by file size and chose the whole group of 851.7 files, which was apparently a mistake.

I still haven't figured out a way to make it work without it filtering out a large number of good files. I will keep trying, but I have a feeling that this filter is no better than any I have seen in the past, and eventually I will give up trying to make it work and totally disable it.

Lord of the Rings December 18th, 2005 04:11 AM

For me it worked well with 4.9.37 re: auto-spam. But I installed 4.9.39 before I did any more. I then concentrated upon the virii sizes as you were. I found it wasn't as successful. I don't know whether one can conclude that it works well with auto-spam such as mp3 & ipod spam, but not so well with virii.

* Some notes about how to use the spam filter to its greatest efficiency would be beneficial.

kmag December 19th, 2005 08:04 PM

Thanks for the feedback.

In 4.9.40, Roger fixed some display confusion to make the threshold for showing a trash can in the "quality" column equal to the threshold for hiding junk (if the hiding junk option is enabled).

These problems with too many files being marked junk over time are likely a problem with the filter being set up to learn more from bad hints than from good hints. (In the code, these hints are called "tokens".) Hopefully LW 4.9.40 is much better about this; the learning should be much less biased toward the bad unless you set the sensitivity above 50%.

The spam filter is actually a set of filters, where the file starts out being 100% good, and each filter multiplies the goodness by some value between 1.0 (inclusive) and 0.0 (exclusive). It's probably a host of different filters that are whittling the files down to a "junk" rating.
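As a rough illustration of that multiplicative scheme, here is a minimal Python sketch. The filter functions, their factor values, and the result fields are hypothetical and are not taken from LimeWire's code; only the idea of starting at 100% good and multiplying by a factor in (0.0, 1.0] comes from the description above.

```python
from typing import Callable, Iterable

Filter = Callable[[dict], float]

def combined_goodness(result: dict, filters: Iterable[Filter]) -> float:
    """Start at 100% good and let each filter scale the score down."""
    goodness = 1.0
    for f in filters:
        factor = f(result)
        # Each factor must lie in (0.0, 1.0]: 1.0 means "no opinion".
        if not (0.0 < factor <= 1.0):
            raise ValueError(f"filter returned out-of-range factor {factor}")
        goodness *= factor
    return goodness

# Hypothetical filters, for illustration only.
def size_filter(result: dict) -> float:
    return 0.5 if result["size_kb"] == 851.7 else 1.0

def keyword_filter(result: dict) -> float:
    return 0.6 if "crack" in result["name"].lower() else 1.0

suspect = {"name": "Some Crack.zip", "size_kb": 851.7}
clean = {"name": "holiday photos.zip", "size_kb": 4200.0}
suspect_score = combined_goodness(suspect, [size_filter, keyword_filter])
clean_score = combined_goodness(clean, [size_filter, keyword_filter])
```

Because the factors multiply, several mildly suspicious signals can push a file below the junk threshold even when no single filter would do so alone.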

Basically, if you search for some terms and end up getting a result that you mark spam, LW will internally create a bunch of tokens for different things LW knows about the file. There's a token for the size of the file, a token for each word in the title, etc. Tokens that keep showing up in the search results for "very bad" spam rated files gradually get marked more and more "bad". Tokens that keep showing up in the search results for "very good" spam rated files gradually get marked more and more "good". Part of the problem is probably that the standard for "very good" was tougher than the standard for "very bad" (hard-coded to below 15% junk vs. above 70% junk), so with each search, the effect of the "bad" tokens relative to the "good" tokens was multiplied. Basically, lots of very "bad" tokens mean lots of search results get very bad spam ratings, which means lots of tokens slowly get marked more "bad"... it's a snowball effect, and we need an opposing "good" snowball effect to cancel it out. This is an over-simplification, but hopefully it helps you get a general idea of what goes on inside the spam filter.
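To make the snowball effect concrete, here is a simplified Python sketch of that kind of token reinforcement. Only the two cut-offs (below 15% junk counts as "very good", above 70% junk counts as "very bad") come from the description above. The token format, the neutral starting value of 0.5, and the step size are assumptions for illustration, not LimeWire's actual implementation.

```python
import re

VERY_GOOD_BELOW = 0.15  # junk rating below this reinforces "good" (from the post)
VERY_BAD_ABOVE = 0.70   # junk rating above this reinforces "bad" (from the post)
STEP = 0.1              # assumed learning step, purely illustrative

def tokens_for(name: str, size_kb: float) -> set:
    """One token per word in the title plus one token for the file size."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return {f"word:{w}" for w in words} | {f"size:{size_kb}"}

def reinforce(token_badness: dict, name: str, size_kb: float, junk_rating: float) -> None:
    """Nudge every token of a search result toward bad (1.0) or good (0.0)."""
    for tok in tokens_for(name, size_kb):
        current = token_badness.get(tok, 0.5)  # unseen tokens start neutral
        if junk_rating > VERY_BAD_ABOVE:
            token_badness[tok] = current + STEP * (1.0 - current)
        elif junk_rating < VERY_GOOD_BELOW:
            token_badness[tok] = current - STEP * current
        # Ratings between the two cut-offs teach the filter nothing.

badness = {}
reinforce(badness, "Free Setup.exe", 851.7, junk_rating=0.90)   # very bad
reinforce(badness, "Linux Distro.iso", 700.0, junk_rating=0.10)  # very good
reinforce(badness, "Some Song.mp3", 4100.0, junk_rating=0.40)    # ignored
```

With these cut-offs, a result needs a junk rating in the top 30% of the range to reinforce "bad" but in the bottom 15% to reinforce "good", so the window that teaches "bad" is twice as wide.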

Give 4.9.40 a try and let us know how it works for you.

Don't be shy about going into the options and changing the sensitivity of the junk filter. In 4.9.40 (unlike 4.9.39 and 4.9.38), the sensitivity of the junk filter affects, to some extent, the balance of influence between bad junk ratings and good junk ratings. Below a sensitivity of 50%, it's hard to say which way the learning is biased. Above 50% sensitivity, the learning becomes more and more biased toward increasing the "bad-ness" of tokens. Hopefully with feedback from real-world usage, we can tweak the filter to have very little bias in the sensitivity ranges that people actually use.

Grandpa December 19th, 2005 10:28 PM

Thanks for the explanation.

Now I am going to have to eat my words: the filter in .40 seems to work extremely well so far. I have never liked filters, and none of them ever seemed to work, but so far this one does. It is very easy to train and very effective. I do like the fact that you can view the junk results, so if it makes a mistake you can correct it.

Time will tell if it is going to be good or not, but so far it is very good. You guys have once again proved to me that you are among the best.

Now that that is out of the way, how about working on direct connect? You know, in the future that may be all that we have.

Anyway, keep it up. You are the King of the hill and everybody is going to try and push you off. But if you keep making improvements like you have the last few years, they are going to have a hell of a time doing it. ;)

Lord of the Rings December 19th, 2005 11:49 PM

One thing I've noticed about the win 4.9.40 is that the stop search button sometimes seems to become faded out, so I can only stop a search by right-clicking the search tab. This has been happening in spurts/groups of searches. Actually, I noted this seems to happen after I right-click the tab to repeat the search. Or is this normal? I don't always want to stop a search.

As far as the filter goes, I find that the result still seems to display despite being set not to (I'm concentrating on virus sizes for program files; ie: 765/851 KB.) One moment they'll appear as normal, a few secs later with a trash can as it recognises them as junk. At present I have settings at around 85-90% for the slider. I've increased it bit by bit over this session.

Sometimes a new one shows up in the results window, I click junk, and the others with a trash icon disappear as well as that one. I mean that only a small % of those detected as spam showed up in the results window. Is this normal? And why do they suddenly disappear only sometimes when another one is designated as junk?

Lord of the Rings December 21st, 2005 07:49 PM

It's greyed out until you do a search. Then select any item in the search results that you feel needs to be filtered out. Then the trash button should be accessible. Or did you already try that?

Under LW prefs>Filters>Junk you'll find options for how you want to use it.

I've been using the windows version so I'm not sure about the mac version.

Morb December 27th, 2005 01:43 PM

I had set the filter to strict all the way and not to show junk files. While selecting 4 files and marking them as junk in a total list of 8, I suddenly have only 1 file remaining. Sometimes all the results in the search window disappear. Seems odd to me. Mostly I will mark the smaller files as junk because they're only spam. Some huge files are too, though, so I mark them as well. Now I go on a completely different search for another file, totally different name, and find a ton of files that aren't junk are marked as junk! Why? I set the filter to show the junk files so that I can now see what's marked as junk that I DIDN'T mark as junk, to unmark it. Ugh. I've since just turned the whole damn thing off as it seems to be more of a hassle. I still say just put a file size filter on. I don't see how you're supposed to train something when you don't know how it works or is designed in the first place. :rolleyes: Then when you sort it, the junk files won't sort and there are valid files above and below. Just not worth it.

c_robertson December 27th, 2005 03:59 PM

Junk filter on file size
 
There has been quite a bit of discussion about filtering by file size, and I don't understand why the filter came out the way it did.
I (and most others) would be so very happy if we could mark a file with ??k file size as junk, and all files of that size were filtered out. It seems so simple. Apparently there is more to it than that.
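For comparison, the plain size-based filter requested here could be sketched in Python as below. This is only an illustration of the request, not anything LimeWire ships; the function name, the result fields, and the sample sizes are hypothetical.

```python
def hide_junk_sizes(results: list, junk_sizes_kb: set) -> list:
    """Drop every search result whose size exactly matches a size marked as junk."""
    return [r for r in results if r["size_kb"] not in junk_sizes_kb]

# Hypothetical search results, for illustration only.
results = [
    {"name": "setup.exe", "size_kb": 851.7},
    {"name": "installer.zip", "size_kb": 765.0},
    {"name": "real program.zip", "size_kb": 12040.3},
]
visible = hide_junk_sizes(results, junk_sizes_kb={851.7, 765.0})
```

An exact-size rule like this also hides any legitimate file that happens to have the same size, and it stops working as soon as the junk files change size.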

I have to conclude that this filter was also made for filtering out similar file names. That seems risky, since there are good files with the same file name as bad ones.

If you want to go with that, make a third pane which may be expanded with the junk files. Of course, you will then find many good files are being filtered out and you will never use the junk filter again.

That is what will happen, and what I will be doing.

Grandpa December 27th, 2005 06:33 PM

Well, if you train it, it actually works pretty well. Move the slide bar back to around 30% to 40% and mark the files you know to be good as not junk. It will then start distinguishing between name and size. I myself have never liked filters of any kind, but if you take the time, this one works better than any I have used before.

If you don't want to take the time to try and figure it out then don't use it. Just keep crying about the filter and maybe they will change it. But I hope not.

mfenech December 29th, 2005 09:29 AM

Why do I still see junk files in my search results (items flagged with the trashcan icon) if I select 'Do not display junk' in the Filter Options? Am I missing something? :confused:



Copyright 2020 Gnutella Forums.
All Rights Reserved.