Gnutella Protocoll v0.7 Proposal Moak's protocoll v0.7 proposal (REVISED, see rev 1 posting below, click here) * Take v0.6 handshaking and leave the 3rd step away ([1]). * Replace GIV with a real HTTP style PUT/POST header * Document GUID tagging * Rename Ultrapeers back to 'Superpeers' or 'Superservants' * Use ISO Latin 1 character set (ISO 8859-1) in search queries/queryhits/HTTP filenames. * Think about an Unicode addon, for world wide file trade (including Asia)! Hope you like it. Comments, more ideas? Moak :) PS: Okay some explanations are needed. Take a look into current 0.6 handshaking in [1] [2]. The client initates a connection together with HTTP-style headers, the other side answers with an OK and more HTTP-style headers. So far so good (great idea and very flexible)! Then there is this 3rd step (before sending binary data stream), unnecessary I think: the client sends another OK. I see no purpose for the 3rd step, kick it? Here is a sample interaction between a client and a server with a new v0.7 handshaking: Code: Client Server In the very rare case the servants need to shake hands once more and exchange more information (does it ever happen?), they can just continue with another "GNUTELLA/0.7 200 OK" round, but they don't need to. Suprised? This works because the binary message stream will never start with "GNUTELLA" (the first 16 bytes will be the GUID in a descriptor header, which byte 9 is allways 0xFF following the modern GUID-tagging rules, pseudo code: guid[8]=0xFF). (Revised, see rev 1 posting below) Next point, replacing GIV with PUT or POST, see [3] [4]. Should be self explanary, the HTTP protocoll [5] allready provides a "upload" functionality... we can use this instead of cooking our own HTTP. For sure Gnutella's Push descriptor is still needed, no change here. Next point, document GUID tagging as mentioned in [6]. New developers need to have a full documented protocoll to contribute. Next point, umm, Giga-, Hyper- and Ultrapeers [7]. 
Can we just forget about marketing and name them simply 'Superpeers' or 'Superservants' please. :-) Only the name 'Supernode' is trademarked (Kazaa/Morpheus/Grokster use only this term on their webpages, eDonkey uses the term 'server'), everything else is free. AFAIK good old Clip2 was first mentioning superpeers or refelectors. Last point, character set. I think we still have none and Gnutella is world wide, eeks! In a first step I suggest to orientate at the web and use ISO Latin 1 (ISO 8859-1) at least, US_ASCII or "wild wild west" isn't enough [8]. Okay, ISO Latin 1 is very selfish, I live in Europe and this will fit my needs. We also should think about a Unicode alternative, perhaps together with file hashs this will perfectly fit into a new HUGE proposal [9]. PPS: Thx to Tamama and Mike Green (Emixode), #gnutelladev [1] Gnutella v0.6 Handshake Summary - http://www.gnucleus.com/research/connect.html [2] Gnutella v0.6 Handshake - http://groups.yahoo.com/group/the_gd...ing%20Protocol (Yahoo account required) [3] Gnutella V0.4/0.6 File Transfer Summary - http://www.gnucleus.com/research/transfer.html [4] Gnutella protocoll specification v0.4 revision 1.2 - http://www.clip2.com/GnutellaProtocol04.pdf [5] HTTP/1.0, RFC 1945 - http://www1.ics.uci.edu/pub/ietf/http/rfc1945.html [6] Gnutella GUID tagging - http://groups.yahoo.com/group/the_gdf/message/1397 [7] "Ultra"peers - http://groups.yahoo.com/group/the_gd...ltrapeers.html (Yahoo account required) [8] ISO 8859-1 character set - http://www.htmlhelp.com/reference/charset/ [9] "HUGE" - http://groups.yahoo.com/group/the_gd...roposals/HUGE/ (Yahoo account required) |
some more thought ? Without the 3rd step in connecting to a server additional requests are very hard (read: non-conforming, like HTTP/1.0 and GNUTELLA/0.4 on the same port) to do. Thus one could see this as a 'future guarantee' that extensions could be implemented. However, what would a server do if it got a bad (read: unknown) header? Right! Usually they disconnect. This needs to be better documented. However, you most likely do not need the additional requests, because the two headers that are sent to and from the server contain all the information one would probably ever need. If anyone can think of a request that needs to be done outside the initial 2 handshake headers, please post it here :)

The Unicode character sets could be negotiated in the header. However, if not all clients support Unicode then searches might come out wrong (or worse, block) at a given server in the network. Also, Unicode is 2 bytes, so it will add extra overhead on the network. Example:

A doesn't do Unicode
B does
C does
A - B - C

B needs to 'downgrade' search requests and 'upgrade' searchhits from A to B or C. This would be very hard for example chinese to standard ASCII. Some more thought might be useful, but I like the idea :)

I didn't even know about the 0xff at byte 9. It serves no real long-term purpose either. New clients will simply use the new standard, and they can be just as bad at routing (read: none) and ping/pongs as the old clients. Some non-Yahoo-group (emailed) paper on these GUIDs would be nice indeed.

Tam |
Re: some more thought ? Hi Tamama! > Without the 3rd step in connecting to a server additional > requests are very hard [...] to do. Hmm, really? You can just repeat the first steps again and again (if you really need it, 4 steps, 6 steps, 8 steps...). So we are flexible here. In most cases two steps will be fine, I guess. The transition from connect-sequence (HTTP-headers) to binary messages (Gnutella descriptors) could be easily detected: If there is no further string "GNUTELLA/", the binary data stream starts and will run until disconnection of that peer. (Revised, see rev 1 posting below) > However what would a server do if it would get a bad [...] header? > Right!, usually they disconnect. They ignore it as far as I understood the v0.6 handshaking? Each side tells what it supports or wants (2 sides = two steps), then they start to talk Gnutella. If one side doesn't know a feature of the other side, it ignores it. When both sides supports a feature (e.g. 'Query Routing'), then they can use this feature. It's similar to what happens when a v0.4 client connects a v0.6 client. The old client will connect with "GNUTELLA CONNECT/0.4<lf><lf>", the modern client will recognize this and not continue with a 0.6 connect sequence, instead use the "common language". > If one could think of a request that needs to be done outside > the initial 2 handshake headers, please post them here :) Dito :) > B needs to 'downgrade' search requests and 'upgrade' searchhits from A to B or C. > This would be very hard for example chinese to standard ASCII. Yep, veeeery hard! *g* Therfore I suggested to use only ISO LATIN 1 as a default, no negotiation in the connection header! But within Query/Queryhits a asian client might want to use Unicode (e.g. chinese characters). All other asian clients will understand and could reply... non asian clients don't understand (they don't have asian files), but can route without problems. 
This type of communication does perfectly fit into the HUGE proposal. With a unicode standard all world wide servants can communicate and speak the same language. Actually also non asian clients could be part of a world wide swarming, when they understand UNICODE. I also prefer ISO_LATIN 1 as default, because it is a one byte code per character, while UNICODE is a multibyte code! UNICODE as default would unnecessarily increase the allready high broadcast traffic. > Some non-yahoo group (emailed) based paper on these GUIDs would be nice indeed. Hehe, yep. At least the required Yahoo account is very anoying IMHO and doesn't help to attract new developers. Greets, Moak |
ok after some talking etc etc

[] = optional
--- = separator; the lines in between are \r\n separated
... = repeat

client:
---
CONNECT GNUTELLA/0.7
User-Agent: FastFinder 3.4
ultrapeer: 0.1, 0.2
Gnutella: 0.7
[more header-information]
---

server:
---
GNUTELLA/0.7 OK All dandy and fine nice mister client
User-Agent: Peerhana 0.5
ultrapeer: 0.2
ultrapeer: 0.3
Gnutella: 0.7, 0.8, 0.9
[more header-information]
---
<binary data starts>

As you can see, header fields are used to tag _all_ allowed extensions. The extensions that are common to both sides are automatically activated.

Header fields are built up as follows:
<name> : <content>
<name> = <string>
<content> = <string> [,<string>...]
<string> = array of characters not containing special characters like \r\n : , etc.

This allows for shorter headers, as version numbers can be put on 1 line.

Tam |
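The "common extensions are automatically activated" rule from Tam's post can be tried out with a small sketch. This is only an illustration under assumptions: the function names `parse_headers` and `common_extensions` are made up here, header names are compared case-insensitively (the post does not say), and a header repeated on several lines is treated like one comma-separated line.

```python
CLIENT = ("CONNECT GNUTELLA/0.7\r\n"
          "User-Agent: FastFinder 3.4\r\n"
          "ultrapeer: 0.1, 0.2\r\n"
          "Gnutella: 0.7\r\n\r\n")

SERVER = ("GNUTELLA/0.7 OK All dandy and fine nice mister client\r\n"
          "User-Agent: Peerhana 0.5\r\n"
          "ultrapeer: 0.2\r\n"
          "ultrapeer: 0.3\r\n"
          "Gnutella: 0.7, 0.8, 0.9\r\n\r\n")


def parse_headers(block):
    """Turn '<name> : <string>[,<string>...]' lines into {name: set of strings}."""
    fields = {}
    for line in block.split("\r\n")[1:]:      # first line is the request/status line
        if not line:
            continue                          # blank line ends the header block
        name, _, content = line.partition(":")
        values = {v.strip() for v in content.split(",") if v.strip()}
        # assumption: repeated headers are merged, names are case-insensitive
        fields.setdefault(name.strip().lower(), set()).update(values)
    return fields


def common_extensions(client, server):
    """Keep only the fields and values that both sides announced."""
    common = {}
    for name in client.keys() & server.keys():
        shared = client[name] & server[name]
        if shared:
            common[name] = shared
    return common


active = common_extensions(parse_headers(CLIENT), parse_headers(SERVER))
print(active)
```

With the two sample headers, only `ultrapeer` 0.2 and `Gnutella` 0.7 survive; the User-Agent strings differ, so they drop out on their own.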
Moak's Gnutella Protocol v0.7 Proposal rev 1 - March 2002 Hi, talking with Tamama, Emixode and Bmk I found it necesarry to overwork the proposal a little bit. :-) Main goals for a new 0.7 protocol are: - full HTTP alike connections, to achieve an easy parser - straight and simple connection sheme, easy and fast - very flexible desgin for later needs, avoiding a 0.8 in near future *g* - don't hurt old v0.4 servants, let them still be operable Appendix A: Extended HTTP style connecting First of all we make the connection sequence full HTTP like (to use the same parser on all incoming connections). I suggest to use 'CONNECT REQUEST GNUTELLA/0.7', the response 'GNUTELLA/0.7 200 OK' is allready correct. Then I suggest a slightly different connection handshake to what I wrote in the original post. A v0.7 servant will implement too alternatives: * 1st handshake alternative: Fast-Connect Overview: Default handshake, fixed 2 steps (may be repeated in case of error codes) Here is a sample interaction: Code: Client Server Comments Two exceptions: a) If the server does not respond with '200 OK' the same procedure is started again (perhaps the server disconnects, but it doesn't need to). b) If the client or the server sends an Header 'Full-Handshake: Yes', they _have_ to continue with the full handshake (described next). * 2nd handshake alternative: Full-Handshake Overview: Optional, minimum 4 steps (+ 2*x steps) Here is a sample interaction: Code: Client Server Comments Why doing all this? The whole connection sheme was designed for easy parsing + effectivity + flexibility. The technical background: While we mostly will use the first intuitive two-step handshake, we introduced the optional flexible handshake for future/extended use (currently we do not need it, especially if someone wants to transfar proprietary data then let those clients deal with it, proprietary is nothing that a handshake should care about, better improve the protocol for the sake of everyone). 
Especially the second alternative may look oversized, but after long discussions I think this is the way to fit all needs and it's highly HTTP orientated (HTTP we allready use and it promises to be a robust and successfull design). Developers, plz play arround with all posibilities you could imagine, this design should hopefully make your code easy & flexible. Summary: A V0.7 client will understand the following HTTP and GNUTELLA methods: GET, PUT, CONNECT, START (and GIV, GNUTELLA). GET for downloading files - e.g. 'GET /get/1283/gnutti.deb HTTP/1.0' PUT for uploading files (was GIV in v0.4) - e.g. 'PUT 1283:72814A49E69D0F43AAB400/gnutti.deb HTTP/1.0' GIV for backwards compatibility only, see PUT - e.g. 'GIV 1283:72814A49E69D0F43AAB400/gnutti.deb HTTP/1.0' CONNECT to initiate a connection handshake - e.g. 'CONNECT REQUEST GNUTELLA/0.7' CONNECT to continue a started connection handshake - e.g. 'CONNECT HANDSHAKE GNUTELLA/0.7' GNUTELLA for backwards compatibility only, see CONNECT - e.g. 'GNUTELLA CONNECT/0.4' START to end the handshake and start the binary data stream - e.g. 'START BINARY GNUTELLA/0.7' Notes: Do not continue an established handshake with CONNECT REQUEST (this will reset headers and start over), use CONNECT HANDSHAKE. You should not add proprietary data to a Fast-Connect, it would be ignored by all other clients and wastes bandwith, instead take a look at the Full-Handshake example. Some geek notes: Our Gnutella specification consists of HTTP 1.0 methods (GET, PUT) and GNUTELLA specific methods (e.g. CONNECT, START), all are full HTTP like. Our Gnutella specific methods allow multiple headers, so they actually look more like a HTTP 1.1 request. We define 'Content-Length=0' if the Conten-Length header is not given in Gnutella specific methods (Tamama *g*). This avoids adding 'Content-Length=0' to every CONNECT|START request, since we have no content/payload here. Response codes are typcial HTTP like responses including status code - e.g. 
'GNUTELLA/0.7 200 OK'. [1] Appendix B: New GUID-Tagging We introduce a new GUID-Tagging style, first the documentation then additional explanations: * Byte 16 is used to show the highest supported protocol version, pseudo code: guid[15]=0x07 in this protocol version (up to v25.5 is possible with 8 bits, every value below 4 is treated as v0.4). * Byte 9 is used to sign if the peer supports special features, bitcoded, low active, 0xFF means no features (backwards compatibility with older clients). The coded features in those bits are important for routing/broadcasting messages: Bit 0: Peer acts as superpeer (important for network structure) Bit 1: Peer acts as proxy/tunneling peer (important for firewalled or NAT-routed hosts) Bit 2: Peer does understand and request metadata Bit 3: Peer does understand and request file hash Bit 4: Peer does understand and use UNICODE (UTF-8 encoded) Bit 5: Peer does have chat support (important for community idea) Bit 6-7: reserved The 2 higest bits are reserved/locked and should be allways 1! Again backwards compatibilty, very old clients had a 10XXXXXX setting, which we have to avoid. The default for current v0.4/v0.6 clients is 11111111 which is also our default = no features. Why using (oldfashioned) GUID tagging over Handshaking? No, it's a new idea behind and both work together. GUID-features are routed along the network (tell status features a far away host), while handshaking features can only used with the direct connected hosts. You can shake hands with direct connected hosts, but not with a hop=5 away peer. Mainly the GUID feature bits are used to avoid unnecessarily routed/broadcasted messages, e.g. only when a peer understands metadata you need to send him metadata. When a peer just searches for alternative download locations (searching with a hash), the other peers should not answer with metadatas to save bandwith. Secondarily GUID protocol version provides a faster connect. 
A host cache will receive a PING from any host connecting, and it will answer with collected PONGs. [2] Those PONGs will now carry the Gnutella protocol version of the servant inside the GUID. E.g. you can now directly connect to an old 0.4 servant with the old protocol, or connect only to superpeers. Connecting is faster.

Notes: Feature bits should be discussed before finishing this protocol version. To detect client version and features use the following pseudo code (this is NO 100% guarantee!):

    int protocol = 4;            // fallback to v0.4
    unsigned char features = 0;  // no features yet
    // feature masks, one per bit of Byte 9 as listed in Appendix B
    enum { bSuperpeer = 0x01, bProxyTunnel = 0x02, bMetadata = 0x04,
           bFileHash = 0x08, bUnicode = 0x10, bChat = 0x20 };
    if ((guid[8] & 0xC0) == 0x80) protocol = 4;             // very old clients used 10XXXXXX
    else if (guid[15] >= 0x04)    protocol = guid[15];
    if (protocol >= 7) features = (guid[8] ^ 0xFF) & 0x3F;  // bits are low active, drop the 2 reserved bits

Appendix C: Character set and UNICODE expansion

While talking with some other developers, I found out that my idea was too short/confusing, so here is a more detailed suggestion:

* First use ISO_LATIN 1 as basic charset in all HTTP headers and/or Gnutella strings (probably you already did this). This won't hurt anyone and will fit most European/American/Australian needs to transport national special characters (e.g. äåæ). But... and this is a big "but", this character set will not fit Asian or Russian needs and many more languages. While Gnutella messages are routed through various servants it is impossible to translate one character set into another; the solution is UNICODE.

* Second use optional UNICODE in Gnutella strings (Query/Queryhits), if you really need it. Unfortunately Unicode (as UCS-2) is a two-byte code per character and contains zero bytes ('A' = 'A', '\0'), which is a problem too. So Unicode must be encoded; we use UTF-8 (see later post for details).

Why not use Unicode always? 
Oops, the binary datastream would not be backwards compatible with v0.4 and it would also blow up Query/Queryhit unnecessarily compared to ISO_LATIN 1 (1 byte vs multibyte characters, bigger size = bigger query traffic, eeks). Why not use another character set than ISO_LATIN 1? Whatever you choose, it wouldn't fit a world wide need. Do we need to negotiate UNICODE in the connect handshake? No, since UNICODE is routed in Query/Queryhits it will travel the whole network mesh and is not affected by directly connected hosts. Instead we use the GUID tagging (see above).

Notes: The two step solution above is my suggestion; perhaps it is better to skip Latin1 completely and use UTF-8 only (topic of further investigations). With the two step solution all clients speak a common character set (ISO_LATIN 1), which is very easy to implement; UNICODE (UTF-8) is optional for clients which want to offer extended language support or do not use the ISO_LATIN1 codepage on their system. While UNICODE (UTF-8) is encoded inside the Query/Queryhit, every "basic" client will route it without any problem. Okay, an old client will not understand it, but that is no problem, e.g. a non-Asian client will usually have no Asian files. I might post a new proposal for a "specialized horizons" header. This allows grouping similar interests and traffic together + keeping in touch world wide. In this special context it will help to route e.g. Asian Unicode primarily between ppl offering Asian content.

Appendix D: Backwards/Upwards compatibility

Backwards compatibility with incoming protocol v0.4 is fully guaranteed. All incoming v0.4 clients should be answered with the simple v0.4 handshaking. Outgoing connections should be started depending on the GUID tagging; however, if v0.7 doesn't succeed, try v0.4. About v0.6 backwards compatibility: while the handshake is incompatible with v0.7, I suggest making this decision client vendor dependent. If you wanna support v0.6 just do it, it's optional. 
Here is a sample handshake, old client knocking at the door: Code: Client Server Comments Here is a sample handshake, newer client tries to connect: Code: Client Server Comments More stuff for brainstorming and documentation.... Let's collect and write down everything that was introduced since v0.4 and is fact in modern clients! I think about blocking webbrowsers (with Referrer header), what more? Let's think about a chat specification. Perhaps IRC based or IRC DCC for inbetween client communication, perhaps HTTP based. Integrate the Bye-Descriptor [3] into the v0.7 protocol. Make this message also full HTTP (pure text string, no byte codes) [1]. Perhaps we should also notify downloader and send them a new HTTP BYE message? Define clients research and retry behaviour. When a peer drops the connection, you want to resume/reconnect. Such a retry should be well defined to avoid "hammering" foreign IPs. Most important points at the end, a big TODO, workout and define protocol features for: superpeers, dynamic traffic routing, hashs, metadata and (friendly) anti-freeloading behaviour (we had various posts in this forum discussing those topics). Ah, don't forget to vote for 'Superpeer'... not Ultra-Hyper-Giga-Peer. :-) Keep Marketing and prorprietary ideas out of Gnutella! Comments, more ideas? Greets, Moak :) PS: I might summarize all ideas into one paper later, we still need to work out this proposal. Feedback is highly welcome! Send me a PM (private message) or meet me on IRC. :-) [1] HTTP/1.0 Status Codes, RFC 1945 - http://www1.ics.uci.edu/pub/ietf/htt...l#Status-Codes [2] Gnutella Host Caches - http://www.gnutellaforums.com/showth...&threadid=5807 [3] Bye Descriptor - http://groups.yahoo.com/group/the_gd...ls/BYE/Bye.txt (Yahoo account required) |
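The GUID tagging from Appendix B can be tried out with a small sketch covering both directions, writing the tag and reading it back. It is Python rather than the C-like pseudo code in the appendix, and the names `FEATURES`, `tag_guid` and `read_guid` are made up for illustration. The bit positions are the proposed ones and, as the notes say, still open for discussion.

```python
import os

# Feature bits of GUID byte 9 (guid[8]) as proposed in Appendix B; bits 6-7 are reserved.
FEATURES = {
    "superpeer": 0x01,
    "proxy_tunnel": 0x02,
    "metadata": 0x04,
    "file_hash": 0x08,
    "unicode": 0x10,
    "chat": 0x20,
}


def tag_guid(features=(), protocol=7):
    """Create a random 16-byte GUID carrying the v0.7 tags."""
    guid = bytearray(os.urandom(16))
    mask = 0
    for name in features:
        mask |= FEATURES[name]
    guid[8] = 0xFF ^ mask     # low active: a cleared bit means "feature present"
    guid[15] = protocol       # 0x07 stands for v0.7
    return bytes(guid)


def read_guid(guid):
    """Return (protocol, set of feature names); a hint only, not a guarantee."""
    if (guid[8] & 0xC0) == 0x80:              # very old clients used 10XXXXXX
        return 4, set()
    protocol = guid[15] if guid[15] >= 4 else 4
    if protocol < 7:
        return protocol, set()
    active = (guid[8] ^ 0xFF) & 0x3F          # invert, then drop the reserved bits
    return protocol, {name for name, bit in FEATURES.items() if active & bit}


guid = tag_guid(["superpeer", "unicode"])
print(read_guid(guid))
```

An untagged GUID from an old client is random in byte 16, so it can still look like a newer version by accident, which is why the proposal calls the detection "NO 100% guarantee".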
I've got a couple of questions: 1) Why don't you use HTTP 1.1 (it has a few advantages)? 2) Does the gnutella 0.4 spec say anything about the charset to be used for gnutella messages? I already searched for files with äöü... and also found them with LimeWire. 3) What is the English title of the Transformers episode with the hate dust? |
1) We do not need it? Actually the new v0.7 Gnutella specific methods look more like HTTP/1.1. 2) Protocol v0.4 mentions no character set, it only speaks about "ASCII encoded" strings (= wild wild west, every char above 127 is undefined so far). Protocol v0.7 defines ISO_LATIN 1, which includes US_ASCII as a subset (backwards compatible). |
3) Transformers Volume 7: The Return Of Optimus Prime |
Appendix F: LAN Auto-find & Proxy configuration How about an "auto-find" feature for servents running in a LAN? The idea: Multiple servents running in a LAN could easily find each other and connect automatically. Perhaps on a LAN party or in a home network, servents will find each other without running an additional hostcache or manual configuration (plug'n'play). Similar to multiplayer games where you click on "LAN games" and then you'll see all running servers, no need to configure IP/port manually. Implementation: Each servent sends a UDP broadcast [1] on startup to the LAN (only internal LAN devices, not to the internet) and every servent will answer with an UDP "pong". The binary data within those UDP datagrams should be identical with Gnutella's ping/pong descriptor (for reusing the existing methods/code!). There's more... supplementary idea 2: It may usefull to define a servent which is the link to another network segment or to the internet, a superpeer with proxy/tunneling ability. All clients in the LAN will auto find this servent (LAN UDP broadcast again) and then use this superpeer all together to connect to the internet, instead of each and everyone connect on their own. I think this last feature is very important for two reasons: a) Internet bandwith is used more efficient and b) a superpeer proxy could increase download succes for people behind a NAT router. A NAT router usually allows no incoming connections, peers behind two different NAT routed networks can not share files (also not via Gnutella's push, see here). Such hosts are the majority following Limewire's host statistics [2]. A superpeer proxy in the LAN could be the missing link and reroute/tunnel those requests to the correct servent = more shared files, higher download rate. Implementation: Similar to UDP broadcasts above, a proxy servent will set the equivalent GUID features bits (bSuperpeer, bProxyTunnel) before answering with an UDP "pong". 
Additional handshaking headers are needed to negotiate proxying and tunelling [3] between clients and superpeer, topic for further investigations. Notes: Do also cross check with the 'Your-IP' handshake header in Appendix A, this header is used to tell the local peer (or tunneling proxy peer) the external IP of your NAT router (IPs could be dynamic since most ppl do not have a static IP). Furthermore see Apendix B, the GUID tagging Byte 9 Bits 0-1 define the mentioned superpeer and proxy/tunelling ability. Result: We achieve a full automatic LAN client find + automatic proxy configuration (elimination of today existing bottlenecks with NAT routers, traffic minimization). [1] UDP Broadcasts - http://tangentsoft.net/wskfaq/interm...html#broadcast [2] Limewires's Gnutella Network Size - http://www.limewire.com/index.jsp/size [3] Gnutella Tunneling - http://groups.yahoo.com/group/the_gd..._discusion.txt (Yahoo account required) |
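A rough sketch of the LAN auto-find idea: the UDP datagram carries a normal Gnutella ping descriptor (the 23-byte v0.4 header with an empty payload), so existing ping/pong code could be reused. Assumptions are labeled in the code: the proposal names no UDP port, so 6346 is just a placeholder, and `build_ping` / `find_lan_servents` are invented names.

```python
import os
import socket
import struct

PING, PONG = 0x00, 0x01
DISCOVERY_PORT = 6346   # assumption: the proposal does not fix a port


def build_ping(ttl=1):
    """v0.4 descriptor header: GUID(16) type(1) TTL(1) hops(1) payload length(4, little-endian)."""
    guid = bytearray(os.urandom(16))
    guid[8] = 0xFF                       # GUID tagging default: no features
    return bytes(guid) + struct.pack("<BBBI", PING, ttl, 0, 0)


def find_lan_servents(port=DISCOVERY_PORT, timeout=2.0):
    """Broadcast one ping on the local network and collect the addresses that pong."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    peers = []
    try:
        sock.sendto(build_ping(), ("255.255.255.255", port))
        while True:
            data, addr = sock.recvfrom(1024)
            if len(data) >= 23 and data[16] == PONG:
                peers.append(addr)
    except socket.timeout:
        pass                             # nobody else answered within the timeout
    finally:
        sock.close()
    return peers
```

A proxy servent answering such a ping would additionally clear the superpeer and proxy/tunnel bits in byte 9 of its pong GUID, as described above.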
will never be accepted "* Take v0.6 handshaking and leave the 3rd step away (see [1]). (Revised, see rev 1 posting below) * Replace GIV with a real HTTP style PUT/POST header * Document GUID tagging * Rename Ultrapeers back to 'Superpeers' or 'Superservants' * Use ISO Latin 1 character set (ISO 8859-1) in search queries/queryhits/HTTP filenames. * Think about an Unicode addon, for world wide file trade (including Asia)! " The problem with this is that it's not very backward compatible. Trust me, there have been hundreds of these types of proposals, and none have made it. The reason 0.6 made it was that it was backward compatible. It only involved the handshaking mechanism, so a servent could have both 0.6 and 0.4 connections. |
Re: will never be accepted It is as backwards compatible to v0.4, as v0.6 is to v0.4. Protocol v0.7 proposal rev1 is full binary transparent with other/older clients and interacts with them without any problem. E.g. features like GUID tagging, ISO charset and optional UNICODE are full compatible even with v0.4, 0.5. 0.6 and 0.7 clients. To step further I think also features like hashs, metadata, tunneling and superpeers should be fixed part of a new v0.7 protocol (I prepared as much as possible for those proposals, see GUID tagging). See the v0.7 protocol as a possible replacement for v0.6. The intention/goal in my eyes is a more standardized and better documented protocol -> a step further to a Gnutella RFC. I did also describe a upwards compatibility to a theoretical coming v0.8 protocol (see Appendix D). |
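How a servant could keep serving old clients next to v0.7 ones, as a minimal sketch of the Appendix D rules. The connect lines are the ones quoted in the proposal ('GNUTELLA CONNECT/0.4', 'CONNECT REQUEST GNUTELLA/0.7'); the function names, the return strings, and the choice to reject v0.6 unless the vendor opts in are assumptions for illustration only.

```python
def classify_connect(first_line):
    """Return (style, version) for the first line of an incoming connection."""
    words = first_line.strip().split()
    if len(words) == 2 and words[0] == "GNUTELLA" and words[1].startswith("CONNECT/"):
        return "old-style", words[1].split("/", 1)[1]       # v0.4 / v0.6 form
    if (len(words) == 3 and words[:2] == ["CONNECT", "REQUEST"]
            and words[2].startswith("GNUTELLA/")):
        return "http-style", words[2].split("/", 1)[1]      # v0.7 proposal form
    return "unknown", None


def choose_reply(first_line, support_v06=False):
    """Pick the handshake to answer with; v0.6 support is a vendor decision."""
    style, version = classify_connect(first_line)
    if style == "old-style" and version == "0.4":
        return "v0.4 handshake"          # always answered the simple old way
    if style == "old-style" and version == "0.6":
        return "v0.6 handshake" if support_v06 else "reject"
    if style == "http-style":
        return "v0.7 handshake"          # assumption: newer versions also get v0.7 rules
    return "reject"


for line in ("GNUTELLA CONNECT/0.4", "GNUTELLA CONNECT/0.6", "CONNECT REQUEST GNUTELLA/0.7"):
    print(line, "->", choose_reply(line))
```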
Re. Appendix F: LAN Auto-find PS: Gnucleus already uses UDP in a way similar to what is suggested above. It is called 'Gnucleus LAN edition' and works without using an external Gnutella host cache. A quote from their homepage: "Gnucleus LAN edition is really working great at colleges around the world, if your college is blocking gnutella, I suggest setting up a private Gnucleus LAN, I can find just about anything here on my 200 person LAN. In Germany there's a college with over 600 people using Gnucleus LAN, and in Ontario I got an email from someone on a gnucleus LAN of over 1,200 users!" Swabby, some documentation or a comment would be nice. :) |
Quote:
It took over a year to get .6 on most clients, leave it alone, please. Do you want to program low level stuff or more features on the client side? Take existing code from gnuc and go from there, quit re-inventing the wheel and please use that brain of yours to make the user experience better (that's a compliment!) .6 works, you can send more info like you want to now in the headers and all this was talked about a year ago. Besides, keep it simple and don't send lots of useless junk in the headers. |
Hi anonymous, fearing improvements? Yeah, the best idea is I'll quit and go over to ...perhaps Freenet or eDonkey. Gnutella development is snail slow and some developers are astonishingly closed-minded for an open protocol, and the commercial influence is increasing. If you don't see the necessity to evolve Gnutella and push it to a new quality... I see it and think it's so desperately needed! For months I have been praying for superpeers, hashes, metadata, dynamic structure and improved routing of messages, for learning from other P2P systems... and especially for a common and well documented standard. I still have ideas, but I'm just a rabble-rouser? The best that can happen to Gnutella in my eyes is a finished Gnutella RFC and an end to homebrew additions and the v0.5/v0.6 chaos. The mentioned v0.6 protocol is just a few months old and a small step forward - right now just a new handshake and a construction site under heavy improvement (feel free to look behind the scenes and read the GDF message database). Sorry that I try to help, analyze weak spots and suggest solutions. :-) Greets, Moak PS: Nobody forces you to use UDP. And Gnucleus is not my style of coding; I prefer PEERanha (which I already support). |
Quote:
--Max |
Skip UNICODE Skip Latin1 and use UTF-8 instead. |
UTF-8 = compatible By using UTF-8 the protocol will stay compatible with current clients. UTF also is UNICODE (Unicode Transformation Format); it uses 1, 2 or 3 bytes to express a character. Null bytes do not occur. English is rendered using one byte, Russian or the special characters of German or French with 2 bytes, Chinese with 3 bytes. UTF-8 does entail higher traffic for Asian languages or other characters which need 3 bytes, but no pain no gain. And it will be compatible. If you switch over to Latin-1, then you'd get compatibility problems when later moving to UTF-8. So please, please do implement it now!!! You can catch a really worldwide user base with this! |
Re: UTF-8 = compatible Quote:
|
Broadcast and die! Quote:
No no no no! Fine point: DO NOT USE BROADCAST! Use multicast with a TTL of 1. It should work over any normal ethernet LAN with no worse effect than a broadcast, and on a smart LAN it will avoid bugging uninterested hosts. Furthermore, on more multicast-enabled networks, the TTL could be increased to span discovery outside of the local subnet. There is no reason to use plain broadcast anymore except laziness. |
Doesn't work. How many LANs have a working multicast tunnel for Gnutella... UDP is the most simple and best working alternative, other protocols use it too. If a network admin needs to block broadcasts he can do so anytime; there is no need for, or advantage from, multicast. |
Quote:
Broadcasts NEVER cross subnets. Multicasts don't cross subnets if you don't have a multicast router (which is common, I agree) or if you set your TTL to 1. The most common configuration today is a switched network that sniffs the IGMP membership messages and will only flood multicasts to interested hosts, thus reducing any potential negative impact on the network. Sure, the multicasts can't span subnets on most networks today, but if you would bother reading what I frigging wrote you'd see that I was saying that they are simply smarter than broadcasts, even in the TTL=1 case. Even in the oldest & crappiest network, the multicast traffic will still reach all the hosts it has to (in such a network it will be functionally the same as broadcasts).

The IETF generally considers the use of broadcast to be deprecated for any new protocol (look at OSPF: broadcast would work fine, but why send the packets to uninterested parties?).

Finally, most affordable switches don't give you the tools you need to block those broadcasts; you simply can't filter out all broadcasts, as you need ARP on most networks. ;)

If you want to be ignorant, that's fine, but please don't influence protocol design in areas you obviously don't understand. |
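For comparison with the broadcast variant, the multicast alternative argued for in this post mostly comes down to one socket option on the sending side and a group join on the listening side. A minimal sketch, assuming a made-up group address (239.255.0.1, from the administratively scoped range) and port, since nobody in the thread proposed concrete values:

```python
import socket
import struct

GROUP = "239.255.0.1"   # hypothetical group address, not part of any proposal
PORT = 6346             # hypothetical port


def make_sender(ttl=1):
    """UDP socket whose multicast packets stay on the local subnet when ttl is 1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def make_listener(group=GROUP, port=PORT):
    """UDP socket that joins the group, so IGMP-aware switches forward it the traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    membership = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock
```

Raising `ttl` above 1 is what would let discovery span subnets on networks that route multicast.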
A newcomer with attitude, plus unfriendly language. UDP broadcast is still the simplest and best working option for an auto-LAN-find. |
Yup... |
Quote:
|
UTF, UNICODE technical (long)

As to why the new protocol should extend to UNICODE, and why to implement this using UTF-8: UNICODE aspires to define all characters of all languages. Right now, an address space of 2 bytes (about 64,000 characters) has been defined to cover most languages. This is being extended to 4 bytes, but let's keep it at 2 bytes for now. UTF (more correctly UTF-8) as well as UCS are ways to express the 2-byte number (I skip the 4-byte UNICODE) for a character. UCS simply is the number in 2 bytes, thus it may contain null bytes. Normally when talking about UNICODE, the UCS-2 (= 2 bytes) method of expressing UNICODE is being referred to. UTF or more correctly UTF-8 uses 1, 2 or 3 bytes to express the 2-byte number for a UNICODE character. Null bytes do not occur. This works as follows:

<table border=1 cols=4>
<tr><td>UNICODE character number range (in hex)</td><td>UTF byte 1 (in binary)</td><td>UTF byte 2</td><td>UTF byte 3</td></tr>
<tr><td>0000 - 007f</td><td>0xxxxxxx</td><td>(none)</td><td>(none)</td></tr>
<tr><td>0080 - 07ff</td><td>110xxxxx</td><td>10xxxxxx</td><td>(none)</td></tr>
<tr><td>0800 - ffff</td><td>1110xxxx</td><td>10xxxxxx</td><td>10xxxxxx</td></tr>
</table>

<i><font color=red>UTF can also have 4 bytes, and using the same scheme express a character number up to U+10ffff. That won't be relevant right now, but may be in future. Provisions should be taken for upward compatibility with possible 4-byte UTF code sequences.</font></i>

The first byte of a UTF sequence gives its length in the highest value bits up to the first 0-bit; the following 1 or 2 bytes are easily recognizable as belonging to a UTF sequence by their 2 highest value bits, having a value between 80 and BF. The bits here marked as 'x' give the number of the character in the UNICODE table. Thus, a UTF character of 1 byte length is exactly the same number as the corresponding ASCII character. However, a Latin-1 character will have a number beyond 7f. 
So it's not possible to say whether a single byte is a Latin-1 character or the start of a UTF sequence. In conclusion, extending the encoding scheme of the protocol from ASCII to UTF would leave current clients still working, as null bytes do not occur. Old clients of course would treat each byte of a UTF sequence as a separate character, leading to funny names in the search results. But you get that even now, and searches containing e.g. German special characters do not really work right now: these characters will normally just be ignored. Moving to UCS might make some old clients fail, as one character might contain a null byte. Compared to UCS, UTF for a single character either takes less space (for ASCII text), exactly the same space (for the special European characters and any characters up to 07ff UNICODE, for example Russian), or 1 byte more (most notably for Asian languages). As the bulk of the traffic very probably will remain ASCII for a long time from now, the increase in load from using UTF should be tolerable. You gain a worldwide audience, and you stay compatible with the current standard. Keep in mind that Latin-1 right now is neither standard nor does it work well. Lastly, if at some point in the future you desire an extension to cover UNICODE characters up to U+10ffff, then UTF-8 can still be used. Please have a look at the <a href=http://www.unicode.org/>UNICODE Consortium</a>. Demonstration pages for UNICODE (always UTF encoded) can be found anywhere on the web. One such is <a href=http://www.geocities.com/Tokyo/Pagoda/1675/unicode-page.html>here</a>. If you go for Latin-1, then you need a mechanism to identify the message as Latin-1 or as UNICODE. If you use UCS, then you probably cannot maintain downward compatibility. You will also get new problems when at some point in the future characters up to U+10ffff should be supported. |
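The byte layout described in the post above can be sketched in a few lines of Python. This is only an illustration of the bit scheme, not code from any Gnutella client; the function name `utf8_encode` is made up for this example, and the 4-byte branch follows the "same scheme" extension the post mentions.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one UNICODE character number using the bit layout from the table.

    Hypothetical helper for illustration; real code would simply call
    str.encode("utf-8").
    """
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("character number outside the UNICODE range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not valid characters")
    if code_point <= 0x7F:
        # 0xxxxxxx: identical to the ASCII byte
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (the 4-byte extension)
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def utf8_length(text: str) -> int:
    """Total number of bytes the text needs when encoded with utf8_encode."""
    return sum(len(utf8_encode(ord(ch))) for ch in text)
```

For the size comparison made in the post: `utf8_length("abc")` is 3 bytes against 6 in UCS-2, a Russian word takes 2 bytes per letter in both, and a character above 07ff takes 3 bytes against 2.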
Re: UTF, UNICODE technical (long) Quote:
|
Re: Re: UTF, UNICODE technical (long) Quote:
|
See GUID tagging. Stuff could either be Latin-1 (default) or UTF-8 (Unicode variation). Would that fit the international needs in your eyes? |
Yes :D. To my knowledge, this would make a comprehensive basis for internationalization. With a view to the future, it's best to provide for UTF-8 up to 4 bytes, even though currently only up to 3 will be used. Once the protocol has been defined that way, it's just up to the clients to fill it with life. Hopefully the cutting edge of clients will support entry and display of characters not in the system default codepage... from past experience, I'm worried about this :( . |
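A rough sketch of why the Latin-1/UTF-8 marker discussed above is needed: the same bytes read differently under each encoding. The function name `decode_search_text` and the `is_utf8` parameter are assumptions made for this example; how the flag would actually travel (for instance via GUID tagging) is not specified here.

```python
def decode_search_text(raw: bytes, is_utf8: bool) -> str:
    """Decode search text according to a (hypothetical) per-message flag.

    is_utf8=False is the Latin-1 default: every byte maps to one character,
    so decoding can never fail.
    is_utf8=True uses UTF-8, which already covers 4-byte sequences; malformed
    bytes are replaced with U+FFFD instead of raising an error.
    """
    if is_utf8:
        return raw.decode("utf-8", errors="replace")
    return raw.decode("latin-1")


# The same query bytes, interpreted both ways.
query_bytes = "M\u00fcller".encode("utf-8")   # "Müller" sent by a UTF-8 client
as_utf8 = decode_search_text(query_bytes, is_utf8=True)
as_latin1 = decode_search_text(query_bytes, is_utf8=False)
print(as_utf8)    # the intended name
print(as_latin1)  # two garbage characters where the umlaut should be
```

Without the flag, a receiver cannot tell which of the two readings the sender meant, which is the point made in the long UTF post.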
Oki, I added this to the v0.7 proposal. Thx for your help and Greets, Moak |
is there a link to the GNUXP protocol (v0.7 Peeranha)? |
I see no future for this 0.7 proposal

To me, this 0.7 proposal does not add anything, apart from breaking the existing 0.6 protocol and building a new one. Let me take a few examples to illustrate my point:

1. The 3-way handshake works. It is necessary for Ultrapeer negotiation with gentle redirection of an ultrapeer to leaf status. It is necessary for Gnet traffic compression negotiation. I understand it can be done with a 2x2-way handshake, but you criticize the 3-way as being complex, so a 4-way is even more complex. As to simply moving to a 2-way because it is simpler to implement, this is a valid point. However, given the need for 4-way exchanges sometimes, you have to handle exceptions anyway. So let's leave the handshaking as a 3-way process.

2. GUID tagging. This mixes a few concepts. You should have a look at my GGEP "Q" extension proposal, which I have posted on the GDF: it clearly separates attributes that make sense during a query from those that make sense during a reply. Moreover, the "Q" extension is far more extensible than the bits in the GUID. Finally, don't forget that the GUID is not sent in a query.

3. Renaming of Ultrapeers to something else. Well, I call them Ultranodes. I don't need a protocol 0.7 to call them the way I want. Everyone understands that Ultranodes and Ultrapeers are the same thing.

However, not everything you propose should be thrown away. It's just that the premises of your proposal are wrong, and you target your efforts on things that are superficial inconveniences (but would be a pain to back out) instead of moving forward and constructing. Live and let learn! Raphael |
Re: I see no future for this 0.7 proposal Quote:
- full HTTP-alike connections, to achieve an easy parser
- straight and simple connection scheme, easy and fast
- very flexible design for later needs, avoiding a 0.8 in the near future *g*
- don't hurt old v0.4 servants, let them still be operable

I won't argue anymore about which handshake is more complex and why I don't like the name ultrapeers or the concept behind it. AFAIK the GDF still has no proposal for LAN auto configuration, UDP autofind or internationalisation/UNICODE, and is the chat protocol documented by now? The v0.7 ideas are about half a year old; meanwhile I prefer GNUXP's concept (ask GodXblue for details). In the past I made suggestions and spent time improving things and bringing in new ideas from different file sharing systems, just take what you like. I found out other developers are not really interested in my experience or support (the friendly way of saying: I know I'm not welcome here). I would have appreciated it if Gnutella were more honest and more cooperative. Happy developing. :) |
Re: Gnutella Protocoll v0.7 Proposal Quote:
|
Re: Re: Gnutella Protocoll v0.7 Proposal Quote:
Good riddance. |
Quote:
http://mrgone4662.dns2go.com/forums/...s=&threadid=16 Morgwen |
no, it's a very early one |
Quote:
Morgwen |
Why did you call it "General Gnutella Development Discussion" when there is actually no development?? You complained that it took over 1 year for v0.6 to be spread out on the net? Why do you think the client developers implemented the ability of online updates in their programs? Even if a user isn't interested in the new version, he would probably click on "yes, update" just to get rid of the annoying message that a new version is available :) . And if you make banners telling about a new protocol version, "new, faster protocol v0.7 supported", it would work fine. (Faster sounds good for all the average AOL users :D , and it's a little bit faster anyway :eek: ) just my opinion.... nils.lw |
The real development of Gnutella is being discussed in the GDF, not here. This place here is for people who have problems with developing their client and need to ask questions. |
Hm, it's just because you said that you don't want to change/develop the protocol. |
Quote:
http://groups.yahoo.com/group/the_gdf/ |