May 11th, 2005
Johan Mattsson
Unhierarchical distribution of web-search-engine-index.

Open letter to the developers and supporters of free p2p technologies.

I am developing a search engine that spiders the web for Ogg and MP3 files. At the time of writing, about 70,000 files are indexed.

This is a brief discussion of ways to distribute this index in an unhierarchical way on top of existing p2p networks. I am submitting it here to ask for opinions on how it could be implemented without violating the protocol specifications or disturbing the usability of the networks.

The motivation for implementing this is that many unsigned artists release their music on the internet, but those files are rarely available on p2p networks. The design also deliberately avoids central servers, since it would take too much computing power to provide such a service to all of the world's p2p networks.

The spidering and updating of the index is done centrally by our servers, with an engine released under the GNU GPL on SourceForge. It could be done in an unhierarchical way, but we insist that p2p clients must remain free from such add-ons and implement only things that directly benefit the user.

A conceptual approach to the problem:
The index is split into many small files (for example 100) of about 300 kB each. The index is ordered by artist name, and a client searching for an artist downloads the meta file that covers that artist. That file contains the URL being sought plus additional, redundant information about other artists. This makes it impossible to search for a particular song without knowing the artist, but I consider that acceptable. The client then shares the meta file, which makes it available to more users. The wanted target file itself is downloaded via HTTP.
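To make the idea more concrete, here is a rough sketch in Python of how the index could be split by artist name and how a client could work out which meta file to fetch. Everything in it is an assumption made for illustration only: the function names, the (artist, URL) entry format, the case-insensitive matching and the example.org URLs are not part of the actual engine. It also assumes the client already has the short list of first artist names, one per meta file.

```python
from bisect import bisect_right
from collections import defaultdict


def build_shards(entries, shard_count):
    """Group (artist, url) pairs by artist and split them into ordered shards.

    Hypothetical helper: returns the shards (one dict per meta file) and the
    first artist name of each shard, which serves as its lower boundary.
    """
    by_artist = defaultdict(list)
    for artist, url in entries:
        by_artist[artist.lower()].append(url)

    names = sorted(by_artist)
    # Ceiling division, so that at most shard_count shards are produced.
    size = max(1, -(-len(names) // shard_count))
    shards = [
        {name: by_artist[name] for name in names[i:i + size]}
        for i in range(0, len(names), size)
    ]
    boundaries = [names[i] for i in range(0, len(names), size)]
    return shards, boundaries


def shard_for_artist(boundaries, artist):
    """Return the number of the meta file that would list this artist."""
    pos = bisect_right(boundaries, artist.lower()) - 1
    return max(pos, 0)


def lookup(shards, boundaries, artist):
    """Return the HTTP URLs listed for an artist, or an empty list."""
    if not shards:
        return []
    shard = shards[shard_for_artist(boundaries, artist)]
    return shard.get(artist.lower(), [])


# Made-up entries, only to show the flow.
entries = [
    ("Alpha", "http://example.org/alpha/one.ogg"),
    ("beta", "http://example.org/beta/song.mp3"),
    ("Alpha", "http://example.org/alpha/two.ogg"),
    ("Gamma", "http://example.org/gamma/track.ogg"),
    ("delta", "http://example.org/delta/demo.ogg"),
]
shards, boundaries = build_shards(entries, shard_count=2)
print(shard_for_artist(boundaries, "Gamma"))
print(lookup(shards, boundaries, "alpha"))
```

All files by one artist are kept in the same shard here, so a client never needs more than one meta file per search. With the real index that would mean roughly 700 entries per meta file (70,000 files over 100 files), though the actual split may well look different.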

Inconvenience
This might clash with clients that are not designed with this use in mind, since the files containing the index will be seen by people browsing the host who expect to find "real" files and not a set of metadata.

Is this worth implementing or not?

Sincerely, Johan Mattsson

You can find a part of the index here: http://openmusic.op.funpic.org/catalog/