#219
July 13th, 2005
Sputnik
Guest
 

Pointless. Better Bitzi integration is what we need, not more and more manual filtering on filenames and hosts. The spammers keep moving hosts and they keep changing their filenames. They even keep changing the files, so they come in different sizes and such. The only reason they stick to 356x598 image dimensions is because nobody as yet has the capability to filter on image dimensions. Once they do, the spammers will start varying that, too.

In fact, they will keep dodging every filtering method and forcing you to run a treadmill of keeping your filtering up to date. That is why the filtering has to use file hashes, and why the process of flagging a file as bad, disseminating this information, and generally staying up to date has to be integrated as smoothly as possible and made as automatic as possible, while still resistant to spoofing attacks.

Any filtering method not based on file hashes will cause false positives. As time passes and the number of names, hosts, etc. they've moved to grows, the proportion of false positives will get worse and worse. But the file hash of a spam is still the file hash of a spam, even if the spams they're sending now have different hashes. It'll be a very long time before the first legitimate file turns up that has a hash identical to one previously seen on a spam. Hashes can therefore pretty much eliminate false positives, and they're already used throughout the network anyway, so they are the ideal filter criterion. There is even already a way to rate them -- Bitzi. The problem is that as of yet it is too hard to use effectively -- you have to surf to some web site, look up your search results manually, and probably register and sign up for a load of spam to actually vote on files and not just use the ratings. Spam, of course, being precisely what we are all trying to AVOID here.
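To make the difference concrete, here is a rough Python sketch of filtering by hash rather than by name or host. It is an illustration only, not Limewire's actual code: the result layout (`name`, `host`, `sha1`), the `filter_results` helper, and the use of hex SHA-1 digests are all assumptions made up for the example.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of a file's contents (assumed hash format)."""
    return hashlib.sha1(data).hexdigest()

def filter_results(results, bad_hashes):
    """Keep only results whose content hash has not been flagged as spam.

    `results` is a hypothetical list of dicts with 'name', 'host', 'sha1'.
    """
    return [r for r in results if r["sha1"] not in bad_hashes]

spam = b"spam payload"
legit = b"holiday photo"
bad_hashes = {sha1_hex(spam)}

results = [
    # Legit file that happens to share a name with a spam: kept.
    {"name": "holiday.jpg", "host": "10.0.0.1", "sha1": sha1_hex(legit)},
    # Same spam on a new host and under a new name: both dropped.
    {"name": "holiday.jpg", "host": "10.0.0.2", "sha1": sha1_hex(spam)},
    {"name": "renamed.jpg", "host": "10.0.0.3", "sha1": sha1_hex(spam)},
]
kept = filter_results(results, bad_hashes)
```

A filename filter would have thrown out the legit holiday.jpg along with the spam, and a host filter would have missed the copy on the new host. The hash check gets both cases right, though a changed file still needs a fresh flag.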

Voting especially must be made easier. We need a preview feature that works for non-music media files and lets you delete files, delete-and-vote-bad files, and start-sharing-and-vote-good files you've downloaded. We also need to be able to vote files good without automatically sharing them, at least so long as Limewire continues to scale really poorly to sharing large numbers of files. (500 -- ok. 1000 -- sluggish and unresponsive. 2000 -- it starts crashing. 10000 -- it basically no longer works.)

As for looking up files -- the search results should show ratings by each file in some graphical format. The current equivalent is to right click every file in turn, hit "Bitzi lookup", and wait for a browser to spend ages (during which your whole system is unresponsive) starting up and more ages loading a slow, gratuitously graphics-encrufted Web page. Once for EACH RESULT, mind you. That is simply unacceptable. Nobody will bother. It takes less time to just view the files and delete the bad ones in Explorer -- or rather it would, save for the niggling problem that if you delete anything from Limewire's download directory, the next time it gets a file Explorer locks up solid with 100% CPU use and makes you reboot. Microsoft's fault, of course, rather than Limewire's, but it makes it just as slow to weed out bad files by testing and deleting as it is to weed them out before downloading them using the Bitzi lookup. Which means the Bitzi lookup is just too damn slow and manual.

The rating info needs to be fetched for each file automatically and displayed to the left, replacing the current worthless crop of "quality" indicators, whose uselessness was driven home by the large series of sequentially-numbered interesting four-star results I saw earlier this evening, not one of which downloaded. All went to "need more sources" nearly immediately. Requerying them didn't result in anything but a five minute wait, then "awaiting sources".

Four stars means about as much as a politician's campaign promises, as near as I can tell; possibly slightly less. Every other icon there is equally useless. Green checks mean you have the file, or a different file with the same content, or no file like it at all. At best they are probabilistic indications that you already have the file. Yellow folders mean you have the file, or it's downloading. Torn paper means you have the file, don't have the file, the file is downloading, the file was corrupt, or the file started downloading and then got interrupted -- i.e. that particular one doesn't tell you a damn thing. And any number of stars means you have the file, the file's downloading, you don't have the file but should be able to get it quickly (even one star), you don't have the file and shouldn't hold your breath (even four stars), and so on. You get the picture. Every revision to the system has left it at least as broken as before, despite frequent claims that "THIS time for SURE!"; it increasingly looks like it is not actually technically possible to make the things reliable, and any further attempts are doomed to the same sorts of failures we've seen in the past. So let's replace them with Bitzi lookup info. That should be technically feasible and actually very useful.
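For what it's worth, the automatic per-result lookup could be as simple as the Python sketch below. Everything in it is assumed for illustration: `lookup_rating` stands in for whatever call would actually ask Bitzi about a hash (I'm not claiming any real Bitzi API here), the "good"/"bad" labels are made up, and the text markers stand in for icons.

```python
# Hypothetical rating labels mapped to a marker shown left of each result.
RATING_MARK = {"good": "[+]", "bad": "[-]"}

def annotate_with_ratings(results, lookup_rating, cache=None):
    """Attach a rating marker to every search result.

    `lookup_rating` is an assumed callable taking a hash and returning a
    rating label. Each distinct hash is looked up only once; pass in a
    shared `cache` dict to reuse answers across searches.
    """
    cache = {} if cache is None else cache
    rows = []
    for r in results:
        h = r["sha1"]
        if h not in cache:
            cache[h] = lookup_rating(h)
        # Anything not explicitly rated good or bad shows as unknown.
        rows.append((RATING_MARK.get(cache[h], "[?]"), r["name"]))
    return rows
```

The point of the cache is that the same file usually turns up from many hosts, so a full page of results may need only a handful of lookups, all done in the background instead of one browser launch per result.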

But the most critical thing is to make participating in rating files easier. Make it possible from within Limewire, along with deleting/organizing/deciding whether to share new arrivals. Only if it's sufficiently easy will votes by legitimate people be sure to outshout the spurious votes the spammers put into the ballot box. Although they don't seem to stuff it as yet, they will once filtering by Bitzi lookup is easy enough that everyone routinely does it. And then the lookup results will become worthless unless actually voting becomes correspondingly easier for everyone.

In the meantime, here's a proposed fix for the Limewire scaling problem with sharing big numbers of files (present since at least 3.something). A file with a huge number of sources elsewhere on the network doesn't need your help. Limewire can detect this through the mesh, I believe, and if so, it can, each session, share only those "shared" files that are sufficiently rare -- say the 500 rarest, in terms of the number of other hosts online sharing each file. If there are fewer than 500 files, it shares them all. If there are over 500 files that don't exist anywhere else on the network, it picks 500 of these and rotates each session, so as to eventually cover them all. Or even rotates every hour or something.
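The selection rule above can be sketched in a few lines of Python. This is just my reading of the proposal, not anything Limewire does: `source_counts` (a map from file to the number of other hosts currently sharing it) is assumed to come from the mesh somehow, and `pick_shared` and the `session` counter are names invented for the example.

```python
def pick_shared(files, source_counts, session, limit=500):
    """Choose which of the user's shared files to actually serve this session.

    files:         list of file names the user has marked as shared
    source_counts: assumed map of file name -> other hosts sharing it
    session:       counter that goes up each session (or each hour)
    """
    # Few enough files: share them all.
    if len(files) <= limit:
        return list(files)

    # Rarest first; ties broken by name so the order is stable.
    ranked = sorted(files, key=lambda f: (source_counts.get(f, 0), f))

    # Files nobody else has at all.
    unique = [f for f in ranked if source_counts.get(f, 0) == 0]
    if len(unique) > limit:
        # Rotate a window over the unique files so every one of them
        # eventually gets its turn.
        start = (session * limit) % len(unique)
        window = unique[start:] + unique[:start]
        return window[:limit]

    # Otherwise just take the `limit` rarest.
    return ranked[:limit]
```

With, say, 1200 files that exist nowhere else, sessions 0, 1 and 2 between them serve all 1200, and the client never has more than 500 files active at once.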