| Should I be typing in Japanese? Hey I could be wrong. The error message that comes up says that a 2-character search will congest the network: it would search through far more items than it could possibly need, without necessarily finding all the desired sources before the search is exhausted. In other words, a lot of search energy would simply be wasted searching through unnecessary & unwanted files. One thing the p2p & particularly LW designers try to do is limit unnecessary traffic along the Gnutella network so it doesn't work at a snail's pace. Which would you prefer, LW to work as the tortoise or the hare? lol If you're referring to Asian characters such as those used in Chinese or Japanese then perhaps you should put in a request for a new feature in LW. Post here: New Feature Requests And explain your reasons in detail. I can see how it would be a nuisance. Particularly for some names or even simple/short titles. Last edited by Lord of the Rings : November 22nd, 2004 at 10:56 AM. |
| |||
| To the question before answering, thank you for! By any means, unless search can be written in 2 characters, there are also some which do not come out it is, but in that case it probably is how it should have done? |
| ||||
| The limit of 3 characters was designed at a time when only ASCII searches were reliable. But since we now support Unicode for handling any language, this rule should be rewritten so that it requires a minimum of 3 UTF-8 encoded bytes for a search. This won't change anything for ASCII searches: it will still be 3 characters. But for general European Latin/Greek searches it will mean that 2 characters will be enough if at least one is not ASCII (note however that searches ignore and drop accents, even if combining accents are still returned in the results). For Asian languages, 3 UTF-8 bytes will encode 1 ideograph or 1 Hiragana or Katakana. Maybe this limit of 3 bytes is too low. So as a prudent alternative, I would say that 3 ASCII-only characters or 4 bytes of UTF-8 encoding will be needed to perform a search (for European languages, this is 3 ASCII, or 2 ASCII and 1 extended character, or 2 extended characters; for Asian texts, this means a minimum of 2 ideographs or 2 hiragana/katakana, ignoring the combining voice or tone marks).
__________________ LimeWire is international. Help translate LimeWire to your own language. Visit: http://www.limewire.org/translate.shtml |
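For anyone who wants to check the byte counts in the post above, here is a minimal Python sketch. The function names `utf8_len` and `search_allowed` are made up for illustration and are not LimeWire code; `search_allowed` only models the "prudent alternative" (3 ASCII-only characters, otherwise at least 4 UTF-8 bytes) and ignores accent stripping and combining marks.

```python
def utf8_len(text: str) -> int:
    """Number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def search_allowed(query: str) -> bool:
    """Hypothetical check: 3 ASCII-only chars, otherwise at least 4 UTF-8 bytes."""
    if query.isascii():
        return len(query) >= 3
    return utf8_len(query) >= 4


# One sample character per range discussed in the post (escapes keep the file ASCII-safe).
samples = {
    "a": "a",                # ASCII letter
    "e-acute": "\u00e9",     # Latin letter with accent
    "alpha": "\u03b1",       # Greek small alpha
    "hiragana a": "\u3042",  # Japanese Hiragana
    "kanji": "\u6f22",       # Han ideograph
}
byte_lengths = {label: utf8_len(ch) for label, ch in samples.items()}
print(byte_lengths)

for query in ["ab", "abc", "a\u00e9", "ca\u00e9", "\u00e9\u00e8", "\u6f22", "\u6f22\u5b57"]:
    print(repr(query), utf8_len(query), search_allowed(query))
```

Under this variant a single ideograph (3 bytes) is rejected while two ideographs (6 bytes) pass, which matches the "minimum of 2 ideographs" reading in the post.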
| ||||
| Nihon-go desu ka. Warui kedo ore-tachi no nihon-go wa mada mada desu. [Japanese, is it? Sorry, but our Japanese still has a long way to go.] (I still can't read, not to mention write, kanji.) It is really funny to see how those translator applications work (or don't work).
__________________ iMac G4 OSX 10.3.9 RAM 256MB LW 4.10.5 Basic ADSL anything from 3 to 8Mbps/around 1024kbps "Raise your can of Beer on high And seal your fate forever Our best years have passed us by The Golden Age Of Leather" -Blue Öyster Cult- |
| ||||
| Although I can't read Japanese other than Hiragana and Katakana characters, for which an approximate phonetic transliteration to the Latin script is easy to perform (like you did), I can still recognize that "nihon-go" means "Japanese" (the language name). So I won't be helpful unless there's a translator for support questions in Japanese (even more difficult when Japanese users send us a question in Japanese using in their email some unknown variant of EUC/ISO-2022-JP, instead of the more widely portable Shift-JIS, or even Unicode UTF-8)... So I have a small support question in Japanese to which I can't reply. Here it is (sorry, this forum does not support Unicode characters other than in UTF-8 form, so you may need to explicitly select UTF-8 in your browser to see the characters correctly): 文字の部分が□□□になります なぜでしょうか? [Roughly: "The text parts show up as □□□. Why is that?"] The question comes with a screen snapshot of LimeWire in Japanese, where the title at the top of the search box is shown only as a string of square boxes. Apparently that user has problems in his font configuration for displaying Japanese, but I'm not sure how I can help, given that his display is not the one I get when testing LimeWire in Japanese, where I don't see these square boxes (which mean missing glyphs in the selected font). So if there are inaccuracies in the encoding of the Japanese translation of LimeWire, there's little I can do. (Some months ago, a Japanese student was working in the LimeWire offices in New York, and helped improve this translation and create the complete translation of the LimeWire web site in Japanese; he also worked with me to define the rules allowing better handling of Japanese in keyword searches.) Can someone come to the rescue? Aren't there any experienced Japanese users out there?
__________________ LimeWire is international. Help translate LimeWire to your own language. Visit: http://www.limewire.org/translate.shtml Last edited by verdyp : December 11th, 2004 at 12:01 PM. |
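The encoding trouble mentioned in this post (UTF-8 text read under a single-byte encoding, or Shift-JIS mail read as EUC-JP) can be reproduced with a few lines of Python. This is only an illustrative sketch: the sample string and variable names are chosen here, and Latin-1 stands in for whatever single-byte encoding a browser might pick.

```python
sample = "文字の部分"  # short Japanese sample, "the text part"

# UTF-8 bytes displayed as if they were Latin-1: the typical garbled look.
utf8_bytes = sample.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")
print(repr(garbled))

# As long as no byte was lost, the damage is reversible.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored)

# The same text in two legacy Japanese encodings gives different bytes,
# so guessing the wrong one also produces garbage.
sjis_bytes = sample.encode("shift_jis")
eucjp_bytes = sample.encode("euc_jp")
misread = sjis_bytes.decode("euc_jp", errors="replace")
print(len(sjis_bytes), len(eucjp_bytes), repr(misread))
```

Square boxes in a screenshot are a different problem (missing glyphs in the font), which this kind of round trip cannot fix.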
| ||||
| Chinese characters, more exactly Han ideographs, are used in Chinese most often to write one syllable, not to write words or concepts as the term "ideograph" would imply. Linguists prefer the term "sinograph" to designate these characters. What makes the Han script complex is the number of syllables that the script allows to be encoded, and the fact that the set of syllables in Chinese is extremely rich, with distinctive diphthongs, stress, tones, and multiple consonants... When you compare it to other syllabaries (like Hiragana or Katakana, also used in Japanese), the individual "letters" of that script become as expressive as 1 or 2 syllables in a Latin-based language. That's why most Chinese words are written with no more than 2 sinographs. This is why Han is not considered a syllabary, although it should be (with the exception of some historic and rarely used sinographs used to represent concepts, or some traditional sinographs which are widely used and frequent in Chinese texts, and represent a complete word or concept). The size of the extended Han syllabary is not a problem for LimeWire, which simply inherits the encoding efforts for Han performed in Unicode. In Unicode the most frequent sinographs are encoded in the "BMP" above the U+07FF code point limit, meaning that they are represented with 3 bytes in the UTF-8 encoding scheme. A search for these characters will be selective if there is a bit more than 1 common sinograph in the search string. The rule allowing searches with: - at least 3 ASCII-only chars, - or at least 2 chars if at least one is not ASCII, - or at least 4 bytes in the UTF-8 representation would work for Chinese, as well as for other languages. Many rarer Han sinographs are encoded outside the BMP in a supplementary "ideographic" plane (SIP). Within Java and in LimeWire, strings are encoded internally in UTF-16, where each of these supplementary characters is represented as a pair of "surrogates". But in UTF-8 they become 4 bytes. 
If those characters were present in a search string, each of them would be highly selective for searches. So a single character would be enough. So the proposed rule to allow searches would also work well for these extended sinographs in the SIP...
__________________ LimeWire is international. Help translate LimeWire to your own language. Visit: http://www.limewire.org/translate.shtml |
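The three-clause rule and the surrogate-pair remark in this post can be tried out with the following Python sketch. `meets_minimum` and `utf16_units` are hypothetical names used only here; the sample SIP character U+20000 is simply the first code point of that plane.

```python
def meets_minimum(query: str) -> bool:
    """Hypothetical check for the three-clause rule quoted in the post."""
    n_chars = len(query)                  # Python counts code points here
    n_bytes = len(query.encode("utf-8"))
    if query.isascii():
        return n_chars >= 3               # clause 1: at least 3 ASCII-only chars
    # clause 2: at least 2 chars with one non-ASCII; clause 3: at least 4 UTF-8 bytes
    return n_chars >= 2 or n_bytes >= 4


def utf16_units(text: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of text."""
    return len(text.encode("utf-16-be")) // 2


common = "\u6f22"      # a frequent BMP sinograph: 3 bytes in UTF-8
rare = "\U00020000"    # a sinograph from the supplementary ideographic plane

for label, text in [("common", common), ("two common", common + "\u5b57"), ("rare", rare)]:
    print(label, len(text.encode("utf-8")), utf16_units(text), meets_minimum(text))

print(rare.encode("utf-16-be").hex())  # the surrogate pair, as hex
```

One design point to keep in mind: in Java, `String.length()` counts UTF-16 code units, so a Java version of the "2 chars" clause would presumably need `codePointCount` to avoid counting one SIP character as two.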
| ||||
| That's very impressive about 中文 [Chinese] use in LW! I guess it depends upon which Japanese text you use. But their very rough equivalent of Han might succeed in finding some items in searches (depending on labelling & source, etc.) I don't know Japanese, as is obvious: 石灰ワイヤー [a literal rendering of "lime wire"] So it's a difficult task. |
| ||||
| You seem to have interesting and valuable knowledge about Asian scripts. If you have some programming experience, could you join our team of open-source contributors to help further improve the internationalization of LimeWire? Unfortunately, my knowledge of these scripts is only theoretical, based on the work done in the Unicode standard and related works such as ICU and the UniHan properties, but not based on linguistics and semantics. One thing for which we have no clue is the support of Thai (which unfortunately has a visual ordering in Unicode because of the support of the legacy national TIS-620 standard, instead of the logical one used in other scripts, and also because Thai, like many other Asian languages, does not use any space to separate words). In the past, I proposed to index Asian filenames by splitting them arbitrarily into units of 2 or 3 character positions, but the number of generated keywords would have been a bit too high. Suppose that the title "ABCDEFG" is present, where each letter is an ideograph, or a Hiragana or Katakana letter, or a Thai letter; the generated searchable and indexed keywords would have been: "AB", "BC", "CD", "DE", "EF", "FG" (if these two-letter "keywords" respect the minimum UTF-8 length restrictions given in my previous message), plus "ABC", "BCD", "CDE", "DEF", "EFG". Note that there may exist situations where a SINGLE character is a significant keyword. In LimeWire, we currently detect keyword separations with: - spaces and controls; - the general category of characters, so that punctuation or symbols become equivalent to spaces; - the script type of the character: a transition in Japanese between Hiragana or Katakana or Kanji or Latin implies a keyword separation. What we really need is a lexer. There are several open-source projects related to such lexical analysis of Asian texts (notably for implementing automatic translators, input method editors, or word processors). 
The problem is that they often depend on a local database that stores the lexical entities, or on long lists of lexical rules. Some projects do something else: lexical analysis is performed automatically, from an initially empty dictionary, by statistical analysis of frequent lexical radicals, so that frequently used prefixes and suffixes can be identified (this is also useful for non-Asian languages, like German, or for other Latin-written European or African languages like Hungarian or Berber). This is a research domain which is highly protected by many patents, notably those owned by famous dictionary publishers, or web search engines like Google... Documents on this subject that are freely available and allow royalty-free redistribution in open-source software are difficult to find... But I suppose that this has been studied for centuries in old books whose text is now in the public domain. My searches within public libraries like the BDF in France have not found anything significant (and getting copies of these documents is often complicated or even expensive, unless the books have been converted to digital formats available online). Also most of these books require at least a good knowledge of the referenced languages, something I don't have... It's probably easier for natives of countries speaking and writing those languages, which is why we call for contributions from open-sourcers...
__________________ LimeWire is international. Help translate LimeWire to your own language. Visit: http://www.limewire.org/translate.shtml |
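Here is a rough Python sketch of the two ideas in this post: splitting a filename into keywords on separators, punctuation/symbols, and script transitions, then generating the overlapping 2- and 3-character units from the "ABCDEFG" example. It is not LimeWire's implementation. `script_of`, `split_keywords`, and `ngrams` are hypothetical helpers, and guessing the script from the Unicode character name is a shortcut assumed for this sketch (digits and combining marks, for example, end up in an "OTHER" bucket).

```python
import unicodedata
from typing import List

SCRIPTS = ("HIRAGANA", "KATAKANA", "CJK", "THAI", "LATIN")


def script_of(ch: str) -> str:
    """Crude script guess from the Unicode character name (assumption of this sketch)."""
    name = unicodedata.name(ch, "")
    for script in SCRIPTS:
        if name.startswith(script):
            return script
    return "OTHER"


def split_keywords(text: str) -> List[str]:
    """Split on spaces/controls, punctuation/symbols, and script transitions."""
    keywords, current, current_script = [], "", None
    for ch in text:
        # General categories Z (spaces), C (controls), P (punctuation), S (symbols)
        # all act as separators.
        is_separator = unicodedata.category(ch)[0] in "ZCPS"
        script = None if is_separator else script_of(ch)
        if script != current_script and current:
            keywords.append(current)
            current = ""
        if script is not None:
            current += ch
        current_script = script
    if current:
        keywords.append(current)
    return keywords


def ngrams(run: str, sizes=(2, 3)) -> List[str]:
    """All overlapping units of the given sizes, shortest size first."""
    return [run[i:i + n] for n in sizes for i in range(len(run) - n + 1)]


print(split_keywords("東京タワー_photo のカメラ"))
print(ngrams("ABCDEFG"))
```

A run of k characters yields 2k - 3 units for sizes 2 and 3 (11 for "ABCDEFG"), which gives a feel for why the number of generated keywords was judged too high.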
| ||||
| Well I never studied Thai but I did study Khmer, though I wouldn't suppose you'd have too many people of that language who'd even use a computer, let alone LW. There are some similarities to Thai. But then there's a person here who is Thai. But none of the keyboards here are set up to use Thai. Simply using OSX Font Book to help with translations of text (or make such fonts available.) Likewise for Chinese. The other language I studied is Vietnamese but since that's almost Latin-based it is not quite as complex. Yes in Thai you have all the extra accents & what-have-you. In principle the same as Khmer. Lao also has some similarities to Thai but in a different way to Khmer. As far as programming goes, although I loved it at the time, I am now too far removed from it. I am no master of Java & only have a very basic & extremely limited knowledge of it. In my work I became too distracted with a locally designed language for work as a sys prog & ended up nowhere except in frustration. It was a rare language at the time so to speak. The co. was dissolved as have many over the years. I guess that's why I studied business & marketing so I would know what I was walking into. lol But that's not to say I am not keen to help! If I can I will. I just need some instructions preferably by pm. I wanted to say the above things 1st! Technical terms can be a stumbling block. Often they're based directly on English, spoken &/or written. |
| ||||
| In principle, Khmer will be less complex than Thai, because it was encoded in Unicode using the logical model, which makes full-text searches easier to implement. For LimeWire, it means that Khmer can be handled like other Indian/Brahmic scripts (we don't care if the visual order differs from the logical order, or if there exist input methods that use a visual order, given that the conversion of these texts to Unicode will use a logical order). But it's not true for Thai (nor for Lao, whose Unicode encoding follows the Thai visual-order model), because Thai is supported in Thailand by an old and widely used standard, TIS-620, created long ago in the 70s by IBM for the Thai government, which made it mandatory for the representation of Thai texts. Unicode then inherited this situation, because it wanted to keep roundtrip compatibility with the very large existing corpus of Thai texts in computers, files, databases, and input methods, long encoded with TIS-620 or one of its predecessors. Thai has always been encoded in visual order, where some letters that are logically after another one (in phonetics, structure, or collation) must be entered before it, because the text is written graphically in a single left-to-right direction (this visual ordering comes from the legacy limitations of font and display technologies in the 70s). India chose to preserve the logical ordering, which looks more like the way Indian users think about their language, and how they spell it orally (there have been some typewriters in India using the visual order, but these were considered difficult or illogical to use; for computers, the ISCII standard was created with the correct assumption that computers would perform the visual reordering themselves). Thai is complex because there does not exist a reliable algorithm to convert from the visual order (encoded) to the logical order (that would be useful for searches and collation). 
In practice, a Thai collation system comes with a large database containing most Thai words or radicals encoded in logical order. This database could be used to create useful lexical entities in LimeWire, but it is large, and may be subject to some copyright restrictions (I don't know exactly the status of the Thai database that comes with IBM's ICU, i.e. whether it can be redistributed freely, like ICU itself). If LimeWire incorporated this database for Thai users, maybe it should be an optional download, because of its size. As far as I know, no other Asian language needs such a database for collation, but such a database may be needed by a lexer to split sentences into keywords, due to the absence of mandatory spaces. But it's possible that these Asian users have learned to insert spaces or punctuation within their filenames to help the indexing of their files. We have no feedback about this from Chinese or Japanese users, so I don't know if their attempts to search for files in their language are successful or not. If not, maybe we should consider implementing at least some basic lexer like the one I described above (every 2 or 3 or 4 characters within a sequence of letters of the same Asian script).
__________________ LimeWire is international. Help translate LimeWire to your own language. Visit: http://www.limewire.org/translate.shtml |
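The difference between visual and logical storage order discussed in this post can be seen directly in the code points. The Python sketch below compares a Thai syllable with a Khmer one, both of which show the vowel to the left of the consonant on screen. The sample syllables and the `names` helper are chosen here for illustration only.

```python
import unicodedata

thai_ke = "\u0e40\u0e01"    # Thai: the vowel sign is stored first, as it is written
khmer_ke = "\u1780\u17c1"   # Khmer: the consonant is stored first, the vowel is drawn to its left


def names(text: str):
    """Unicode character names, in storage order."""
    return [unicodedata.name(ch) for ch in text]


print(names(thai_ke))
print(names(khmer_ke))

# A naive "starts with this consonant" test behaves differently in the two scripts.
print(thai_ke.startswith("\u0e01"))    # consonant KO KAI is not the first stored character
print(khmer_ke.startswith("\u1780"))   # consonant KA is the first stored character
```

This is why a plain prefix or keyword match tends to work on logically ordered text, while visually ordered text needs extra knowledge before matching or sorting.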
| ||||
| I may try to research around to see what I can find out (from those who search in their native language) & see what their responses are. But I won't reply in a hurry; it could take quite some time. Particularly this time of year (people on breaks & on holidays overseas, etc.) Thai font licensed! Well I guess that would explain the limitation of characters. Some of mine come freely from a university apparently. So how does Java handle these fonts? Seem to be fine from here. Every time I visit Hotmail it's in Thai text. Not all correctly displayed, though. How much do you think LW would pay me if I translated a version totally to Thai & had OSX type iTunes support. lol (ummm just a joke!) |
| ||||
| I won't answer you about payment. Someone from the LimeWire crew had better reply to you privately. I contribute to LimeWire as a free open-source contributor, managing most of the work with candidate translators, validating their input, testing it in LimeWire, helping LimeWire with beta tests, or helping with proposed optimizations or corrections. I'm not paid to do that, and I have no contractual obligation to LimeWire; I just need to respect its working etiquette. For these reasons, LimeWire has granted me limited access to their development platform, and I received a couple of checks and some goodies as a way to thank me for my past contributions. |