| General Gnutella Development Discussion For general discussion about Gnutella development. |
| Quote:
|
| |||
| Quote:
As for the number of atoms in the universe.... I don't think that number is even close. Whatever scientist came up with that number is on drugs.
__________________ - SJF |
| |||
| The number of atoms in the universe is about 10^78. You can estimate this number by counting galaxies, measuring how bright they are and estimating how big their mass is. You don't need any drugs for that. |
| ||||
| I just had a conversation on IRC .. someone had a good idea, maybe some of you have heard it before. Anyway, the idea is this: hash the first meg of the file as well as the whole file. That way you can tell that 'buffy vampire.divx' (20M) is the start of the same file as 'buffy vampyre.divx' (80M), and get at least the first 20M. Then you repeat the search later for files with first-meg hash = x.
To implement this reliably and sensibly, instead of the HUGE proposal's technique of always and only hashing the whole file, the best implementation would be to have a query 'please hash the file from range x-y'.
This shouldn't be totally automated .. because someone might have a text file which includes a smaller text file that should be considered complete .. e.g. they may have tacked some personal notes onto the end of some classic document. You probably don't want the extended version, so a user control is needed: a 'Find bigger files which start off the same' button that the user can press or not.
In fact a really good implementation (not every client needs to implement it for it to work, as long as clients support the 'hash this part of the file please' extension) would be the one suggested below:
<Justin_> bigger or smaller
<Justin_> or have a slider hehe
<Justin_> the way md5sum works having 100 sums is not that intensive to make right?
<Justin_> cause its incremental no?
<Justin_> so if you had a slider in the program, that starts at 100%, that you can lower by 10% incremnts to find more files
<Justin_> as in, the default is files that match 100%, or files that match at the 90% mark, well not % it would have to be 10M intervals
Having the ability to request hashes for arbitrary portions of files would additionally make their use for verifying contents reliable - if someone could generate two files with the same hash (or when this happens randomly), simply checking the hash for a given subportion would detect the difference.
Nos ---------- Quote:
Certainly we are not going to generate each possible version of a 1G file .. ever (well, unless some pr!ck sits down in the far future and does it on purpose as a programming exercise using some newfangled superdupercomputer we can't even imagine yet .. but I stray from the topic). We do need a hash that has enough values that most probably each individual file we generate will have a unique value .. but it can't be known for sure unless you actually generate the hash for each file (i.e. generate each file). Hashes are funny things. (I'm still searching for a good reference to back that statement up .. but don't have time to find one right now .. see later posting.) I think if you look at the file size and the hash, you have enough certainty to call it a definite match when searching for alternate download sources. A better technique is described above in the first portion of this post. Quote:
This ( http://groups.google.com/groups?q=number+atoms+universe&hl=en&scoring=r&selm=4kc1fu%24gej%40agate.berkeley.edu&rnum=1 ) will do as a reference - at least the guy has the word 'physics' in his email, as well as the word 'berkeley'. I couldn't be bothered checking any more thoroughly than that. Nos [Edited 14-04-2000 to add URL reference for atom count left out of initial post]
__________________ "It has served us well, this myth of Christ" -- Pope Leo X Last edited by Nosferatu; April 14th, 2002 at 12:01 AM. |
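A minimal sketch of the 'please hash the file from range x-y' query and of the slider idea from the IRC log, in Python. The function names `hash_range` and `prefix_hashes` are made up for illustration and are not part of HUGE or any Gnutella client; SHA-1 and MD5 are used only because the thread mentions them.

```python
import hashlib

def hash_range(path, start, end, algo="sha1", bufsize=64 * 1024):
    """Hex digest of bytes [start, end) of the file at `path` (hypothetical helper)."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        f.seek(start)
        remaining = end - start
        while remaining > 0:
            chunk = f.read(min(bufsize, remaining))
            if not chunk:  # file is shorter than `end`: hash what is there
                break
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()

def prefix_hashes(path, step, algo="md5"):
    """One pass over the file, recording the digest of the first n bytes
    at every multiple of `step`, plus the digest of the whole file."""
    h = hashlib.new(algo)
    out = {}
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(step)
            if not block:
                break
            h.update(block)
            total += len(block)
            # hexdigest() reports the digest of the data fed so far;
            # the hash object can keep being updated afterwards.
            out[total] = h.hexdigest()
    return out
```

With something like this, `hash_range(path, 0, 1024 * 1024)` would give the 'first meg' hash, and `prefix_hashes(path, 10 * 1024 * 1024)` would give the 10M-interval sums for the slider in a single read of the file, which is why having many sums is cheap.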
| |||||
| Quote:
Quote:
Quote:
Quote:
Quote:
__________________ - SJF |
| ||||
| Quote:
http://rfc-gnutella.sourceforge.net/...-huge-0_92.txt or http://rfc-gnutella.sourceforge.net/...-huge-0_93.txt . Perhaps it was discussed and then dropped .. ? Got a reference?
I found that http://bitzi.com/ propose/use tiger-tree ( http://bitzi.com/developer/bitprint ) as part of an attempt to index as many files as they can .. looks like a good project to incorporate into gnutella clients - have a bitzi index lookup. Also found the Tiger Hash algorithm homepage ( http://www.cs.technion.ac.il/~biham/Reports/Tiger/ ) and the tiger-tree homepage ( http://sourceforge.net/projects/tigertree/ ). Unfortunately between these three sources I can't find a description of the tiger-tree process in words I can understand. "TigerTree is based on the Tiger hash algorithm, applied to each 1024-byte block of a file, then combined up through a binary hash tree to a final summary value" ( http://bitzi.com/developer/bitprint ) really doesn't cut it for me. Anyone know what it means? They imply that it can be used for incremental portions of the file .. but I don't understand the process.
These bitzi guys are JUST doing hashing of files, and are covering any files you care to name .. so they probably have thrashed this issue out pretty well. Also, if there aren't competing schemes to index all of filespace, then it really makes a lot of sense to use their hashing scheme so that you can link in and allow users to search bitzi's index to see what it has to say about what the user receives in their search results. I think this is a really exciting idea. Could save a lot of bandwidth downloading broken mp3s etc, for example. Quote:
Nos
__________________ "It has served us well, this myth of Christ" -- Pope Leo X |
| |||
| Quote:
Quote:
| Code:
A   B   C   D
 \ /     \ /
  E       F
   \     /
    \   /
      G
I hope this makes a shred of sense, it's in the early morning as I'm writing this and my brain is falling asleep. Besides that, I can't seem to find the reference material I got this from. Quote:
__________________ - SJF |
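For what it's worth, the diagram above can be read as a binary hash tree: A to D are hashes of file blocks, E and F are hashes of pairs, and G is the summary value. A rough Python sketch of that reading follows. It is an assumption, not Bitzi's actual TigerTree code: SHA-1 stands in for Tiger (which is not in Python's `hashlib`), the name `tree_levels` is invented, and promoting an unpaired node unchanged is just one possible convention.

```python
import hashlib

def sha1(data: bytes) -> bytes:
    """SHA-1 digest, used here only as a stand-in for Tiger."""
    return hashlib.sha1(data).digest()

def tree_levels(blocks):
    """Return every level of a binary hash tree: leaf hashes first, root last.

    For four blocks this gives [[A, B, C, D], [E, F], [G]] in the
    notation of the diagram above.
    """
    level = [sha1(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                # parent = hash of the two child hashes joined together
                nxt.append(sha1(level[i] + level[i + 1]))
            else:
                # odd node out: carried up unchanged (assumed convention)
                nxt.append(level[i])
        level = nxt
        levels.append(level)
    return levels
```

Calling `tree_levels(blocks)[-1][0]` gives the root for a non-empty block list; changing any single block changes its leaf hash and every hash on the way up to G.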
| |||
| First, tiger is a hash algorithm just like md5 or sha1. "Tree" describes a way of using that algorithm where segments of the file are hashed individually. The tigertree implementation used by Bitzi uses 1024-byte blocks (though they could use any size). I have no evidence, but I think that around 1 MB would be best.
The tree hash is the best way to share partial files. A tree hash can use any hash algorithm (e.g. md5, sha1, tiger, etc.). Small chunks of the file are individually hashed and all of these hashes make up the tree hash. Because of this you could set it so that there is a hash for every 1 MB of the file. Then you could securely and confidently download 1 MB pieces from multiple hosts, including hosts that only have partial files.
An added bonus of the tree hash method is the ability to resume from nearly identical files. For example: I want to download songx, so I search for it and find it. There are quite a few versions with the same size, bitrate, etc., but they have different metadata so the whole-file hash is different. Well, with the tree hash you could use all of those sources to swarm from for the parts of the file that are the same!!! This would greatly increase swarming speeds while providing the same security and confidence we currently have with hashed files! |
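To make the 'nearly identical files' point concrete, here is a small Python sketch that compares two per-chunk hash lists and reports which chunks could be swarmed from either source. The helper names `chunk_hashes` and `shared_chunks` are hypothetical, SHA-1 is an arbitrary choice, and the sketch assumes the differing metadata does not shift the rest of the file (i.e. the chunks still line up).

```python
import hashlib

def chunk_hashes(data: bytes, chunk_size: int):
    """SHA-1 hex digest of each consecutive chunk of `data`."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

def shared_chunks(hashes_a, hashes_b):
    """Indices of chunks whose hashes match in both lists."""
    return [i for i, (a, b) in enumerate(zip(hashes_a, hashes_b)) if a == b]

# Two same-sized 'files' that differ only in a trailing tag.
song_1 = bytes(3000) + b"tag-one!"
song_2 = bytes(3000) + b"tag-two!"
common = shared_chunks(chunk_hashes(song_1, 1024), chunk_hashes(song_2, 1024))
# `common` lists the chunk indices that either source could serve.
```

Only the chunk containing the tag differs here, so every other chunk could come from either host while the whole-file hashes still disagree.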
| ||||
| Hmm .. hash trees ...
What I have understood from searching the web: tree hashes are also known as Merkle trees. The idea is covered by a patent filed in 1979 ( http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=%274,309,569%27.WKU.&OS=PN/4,309,569&RS=PN/4,309,569 ), but I read somewhere the patent ran out in 1999. The tree works like this:
Hash tree
      (Whole File)
          /   \
        /       \
     (m)         (n)
    /   \       /   \
  (i)   (j)   (k)   (l)
  / \   / \   / \   / \
(a)(b)(c)(d)(e)(f)(g)(h)
You get parts a and b of a file. You get the hash of the entire file, and the hash values for only j and n, and you can verify that a and b are part of the file by generating i, then (with j) m, then (with n) the whole file hash.
But in gnutella you wouldn't do this - it's absolutely pointless. For it to work for all parts of the file, all values in the tree hash need to be stored centrally where you can get them to check. If you have an index available (bitzi ( http://bitzi.com/ ) only stores the whole file hashes) you would just download the correct hash for section x and check it. I can't see a feasible way to make that aspect work without a central server storing all the intermediate hash values, otherwise you might just as well do this:
If you didn't use a tree, you might store all the values a-h and download and check each one individually. For a big file this is a lot of downloading and checking. So you might make a hash of all the hashes together and store that value - but that is useless unless you have downloaded all the sub parts. So the tree covers these in-between cases. BUT you need to have all these values available somewhere for verification for it to be useful, i.e. if you find only parts a, c and n, you still need someone to tell you the correct hashes for b and d and the whole hash to verify the whole hash.
The idea is that you can download part of a file from someone who doesn't know what the whole file looks like, so you have to find the rest of the hash info from someone else.
Now, to set up a service like Bitzi storing all the subtree values, I guess the storage would blow out. That's obviously why they don't do it. And I can't see a sane way to track all these sub-portions within the gnutella protocol. I guess you could automatically make lists and store them in config files .. then automatically share them out as virtual files and automatically query for the virtual files .. but it sounds like a big headache to me. Another option is calculating them on request, but this seems .. bad too.
So the hash tree idea doesn't seem helpful to me (except in maybe an all-out swarming effort .. which seems like too much work for not enough benefit at this stage). Can anyone point out something I've missed? Is anyone implementing this as a way to do swarmed downloads?
I'm back to thinking that the easy and reliable solution which works is to just query for a hash of a given byte-range. This has an additional benefit I didn't mention before: you could ask for the hash of mp3 files offset by 64 bytes or 64k or whatever size the id3 tag is. Then files which are the same except for meta data could be easily found: Search for files matching words blah foo bananas. Query hash of file "foo bananas go blah.mp3" 2.35M offset by 64 bytes or whatever it is. Same with file "foo bananass goes blah.mp3" 2.37M. They match! Queue both as alternative downloads! Unfortunately "bananas go blah foo.mp3" 2.34M turned out to be a different file (must be 160bit or something ;P )
Nos [Edited 15 Apr 2002 to clean up drawing - sorry pic looks OK in (gak!) IE but maybe not in your browser]
__________________ "It has served us well, this myth of Christ" -- Pope Leo X Last edited by Nosferatu; April 14th, 2002 at 10:16 PM. |
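The check described in the post above (verify parts a and b using only the hashes of j and n plus the whole-file hash) might look roughly like this in Python. This is a sketch under assumptions: SHA-1 stands in for Tiger, a parent is taken to be the hash of its two child hashes concatenated, and the names `root_from_path` and `verify_first_two_parts` are invented for the example.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-1 digest, used here as a stand-in for Tiger."""
    return hashlib.sha1(data).digest()

def root_from_path(node_hash, siblings):
    """Walk from a node up to the root.

    `siblings` lists (sibling_hash, sibling_is_left) pairs from the
    bottom of the tree to the top.
    """
    cur = node_hash
    for sib, sib_is_left in siblings:
        cur = h(sib + cur) if sib_is_left else h(cur + sib)
    return cur

def verify_first_two_parts(part_a, part_b, hash_j, hash_n, whole_file_hash):
    """True if parts a and b fit under the given whole-file hash.

    i is rebuilt from the two downloaded parts; j and n are the only
    extra hashes needed, and both sit to the right of the path.
    """
    i = h(h(part_a) + h(part_b))
    return root_from_path(i, [(hash_j, False), (hash_n, False)]) == whole_file_hash
```

For the eight-part tree in the drawing this needs two extra hashes instead of six leaf hashes, which is the in-between case the tree is meant to cover; the open question raised in the post is still where j, n and the root come from.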
| |||
| No need for a central server. Whoever is hosting (sharing) the file keeps the whole file hash as well as the 1 MB incremental hashes. These are stored just like the sha1 for HUGE. Then if I start a download from you I get that hash info. Now I can use it to search the gnet for the other parts, even if they are partial files, to swarm from. |
| |||
| What would be better is if a method were used where one could determine the component hashes by disassembling the full file hash. Then, only the full file hash would need to be sent when requesting a file. I suppose that may be asking a bit much though.
__________________ - SJF |
| |||
| just a bit ;-) Yes, this would be great. But a downside might be that an actual set of data could be calculated that would match such a hash, and then it would be possible to create fake data with the same hash and screw up the downloads on gnet. |
| ||||
| It's another way of doing it, but I didn't mention it because basically it's not. You just make the hash for the whole file the concatenation of the hashes for the parts. It means that either you select parts as being pretty big compared with the size of the whole file, or you end up with a long hash. Nos
__________________ "It has served us well, this myth of Christ" -- Pope Leo X |
| |||
| After talking with Gordon from Bitzi I think tree hashes are overkill. Instead you could simply hash ranges of the file with sha1. This could be done in 1 MB chunks. So basically all files would be hashed twice: once for a full file hash, and once where a hash is generated for each 1 MB portion of the file starting from the beginning. Since the file will usually not be an exact multiple of 1 MB, the last hash may be of a segment shorter than 1 MB.
I don't have any basis for choosing 1 MB of course. A bit of trial and error would be needed to optimize the system. Anything larger than 1 MB, say 5 MB or 10 MB, would be good for large files but would not provide the benefit, especially the meta data benefit, for small files such as mp3s.
Does anyone know more about meta data? Is it always stored at the end of files, even for videos etc.? |
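A minimal Python sketch of the scheme in the post above: one SHA-1 over the whole file plus one SHA-1 per 1 MB chunk, with a shorter final chunk when the size is not an exact multiple. The function name `full_and_chunk_hashes` and the `chunk_size` parameter are assumptions for illustration; the 1 MB default simply follows the post and has no tuning behind it.

```python
import hashlib

MB = 1024 * 1024

def full_and_chunk_hashes(path, chunk_size=MB):
    """Return (whole_file_sha1_hex, [sha1_hex of each chunk]).

    The file is read once; each block feeds the running whole-file hash
    and also gets a hash of its own. The last block may be shorter than
    `chunk_size`.
    """
    full = hashlib.sha1()
    chunks = []
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            full.update(block)
            chunks.append(hashlib.sha1(block).hexdigest())
    return full.hexdigest(), chunks
```

Although the file is described as being 'hashed 2x', both results can come out of a single read, so the extra cost is CPU time rather than disk access. Trying other sizes is just a matter of passing a different `chunk_size`.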