As noted on the Phoneme List, each language has its own set of phonemes, and even phonemes used by both languages may not be pronounced the same way. In fact, the central point in the debate is that language is the only real difference between the Vocaloids. The additional adjustments made for either engine are only a small factor, affecting the interface and the phonemes available for use.
While it is true that some pronunciation problems exist within the English Vocaloids (a few struggle to pronounce the letter 'r', for instance, and smooth blending of diphthongs is rare), the same kind of problems are found in the Japanese Vocaloids, with the Kagamine Vocaloids being among the best-known examples. It is also easier to pick out faults in a language you speak and understand than in one you do not. As the fandom often notes, this is one of the reasons many songs carry subtitles in the first place, since subtitles often help even Japanese listeners understand what Vocaloids are singing. In fact, both English and Japanese Vocaloids sing well in the language they are intended for, but due to the nature of any language there will always be limits on how closely a synthesizer can mimic it. The overall result also depends on the user working with the software.
There is also a myth amongst western Vocaloid fans that Japanese Vocaloids are better at singing in English than the actual English Vocaloids. This is not the case. To begin with, none of the Japanese Vocaloids sing in clear English in any of the songs where they are featured singing in English. They simply use the phonemes in their Japanese library that are similar to English ones, and even then there are gaps where the Japanese library does not cover all of the phonemes English needs. Japanese Vocaloids do not include all the sound samples vital for a perfect recreation of English and can therefore only partly recreate the language. Conversely, English Vocaloids can recreate Japanese, but they have the same problems. In both cases, some of the phonemes needed to fully recreate the other language are completely missing, the result will always carry a heavy native accent, and close editing will always be required.
English has more phonetic combinations than Japanese and is much more difficult to synthesize in that respect. English consonants often occur in clusters, whereas Japanese consonant sounds are always followed by an inseparable vowel, with the exception of ん. In addition, almost any consonant can end a syllable in English, while Japanese generally does not allow this. Consonant blending (the running together of adjacent consonants) is purposefully absent in Japanese Vocaloids because of the characteristic features of the Japanese language. It is commonly accepted as it stands in English Vocaloids, even though doing it truly well would require distinct phonemes for every possible consonant combination.
There are approximately 31 consonantal sounds in the English Vocaloids' phonetic system when aspirated consonants are counted, and about 36 in the Japanese system, including palatalized consonants. Japanese Vocaloids possess only the five pure vowels, and diphthongs do not need to be blended smoothly for pronunciation to be considered acceptable. Their English counterparts have a total of 20 vowel sounds if diphthongs are included. In order to create a perfect recreation of a language, the Vocaloid engine must include an extensive library of these sounds as samples. According to one estimate, Japanese requires 500 diphones per pitch, while English requires 2,500. A Japanese Vocaloid library ends up consisting of 1,500+ samples on average per Vocaloid, while an English Vocaloid library contains a greater array, approximately 4,300+ samples on average per Vocaloid. This is why English Vocaloids have a greater language capability than their Japanese cousins. However, they still have limits on their vocal capabilities just the same.
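The figures in the paragraph above can be compared with a little arithmetic. The following is only a rough sketch in Python: the numbers are the estimates quoted above, the function names are made up for illustration, and treating every library sample as a single diphone is a simplifying assumption rather than a description of how the libraries are actually built.

```python
# Estimates quoted in the text above; rough figures, not measurements.
DIPHONES_PER_PITCH = {"Japanese": 500, "English": 2500}
# "1,500+" and "4,300+" in the text, so these are treated as lower bounds.
AVERAGE_LIBRARY_SAMPLES = {"Japanese": 1500, "English": 4300}


def english_to_japanese_ratio(figures):
    """Return how many times larger the English figure is than the Japanese one."""
    return figures["English"] / figures["Japanese"]


def implied_pitch_coverage(language):
    """Hypothetical helper: average library size divided by diphones per pitch.

    Assumes (for illustration only) that every sample counts as one diphone.
    """
    return AVERAGE_LIBRARY_SAMPLES[language] / DIPHONES_PER_PITCH[language]


if __name__ == "__main__":
    print("Diphones per pitch, English vs Japanese:",
          english_to_japanese_ratio(DIPHONES_PER_PITCH))
    print("Average library size, English vs Japanese:",
          round(english_to_japanese_ratio(AVERAGE_LIBRARY_SAMPLES), 2))
    for language in ("Japanese", "English"):
        print(language, "implied pitches covered:",
              implied_pitch_coverage(language))
```

On these figures, English needs five times as many diphones per pitch as Japanese, yet the average English library is less than three times the size of the average Japanese one. Under the simplifying assumption above, the Japanese figure works out to about three pitches' worth of diphones and the English figure to fewer than two, which is consistent with the point that both kinds of library remain limited.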
Thus a good English voicebank requires a great deal of effort, and the sheer quantity of necessary samples makes it difficult to maintain quality. This is why good English UTAUs are so hard to find, as UTAU is intended for the Japanese language, and why so many overseas UTAUs restrict themselves to Japanese. Also note that Japanese uses pitch distinctions, sokuon, vowel length and yōon, which commonly change the meaning of a word entirely, and in some dialects final consonants and blending are present in certain sounds. These aspects are difficult for Vocaloids to synthesize, but an English speaker would not notice such problems when listening to a song in Japanese.
- See English language and History of the English Language for details on the English language
- See also Japanese for a history of Japan's language.
- By Vocaloid Wiki