Re: Saul's Phrase List
Rob Zook
Thu, 06 Nov 1997 15:04:03 -0600

At 02:36 PM 11/6/97 -0600, you wrote:
>At 01:24 PM 11/6/97 -0600, Rob wrote:
>
>>At 12:23 PM 11/6/97 -0600, Saul wrote:
>>
>>>I find it kind of spooky that the phoneme distribution between those
>>>phrases and the lexicon should be at all similar, because the phrases
>>>make use of forms of probably less than 20 different words. This says a
>>>lot, as I had suspected, about the skew of the current lexicon.
>>
>>Yes, the fact that the small sample's and the larger sample's
>>distributions came out the same indicates a very skewed sample. However,
>>any such distribution of the phonemic inventory of a language should have
>>some skews in favor of the sounds which make it distinctive.
>
>Of course. But when one vowel occurs almost twice as often as the next most
>numerous phoneme, while 2/5 of the vowel inventory doesn't occur at
>all...

Well, you're right, some of the sounds may stay pretty skewed with a
larger sample, but we need more phrases in a file. A list of words alone,
like in the dictionary, will not show much, because who knows from just
that how much any individual word gets used. That's why we need more
phrases built from the words in the dictionary. Naturally, some words
will find more use than others, and hopefully that will even out some of
the skewing. If not, we can always fudge the values any way we like, of
course.

>I think we can safely say that ST writers, when working with Vulcan,
>imagine a lot of "ar"s and "or"s -- and apostrophes!

Very few of the writers seem to have done their linguistics homework.

>>>We may need a temporary ban on /a/ and /r/ until things even out a bit.
>>
>>Well, let's wait a bit, make some more phrases, and see what the overall
>>sound seems like.
>
>I meant a ban on their inclusion in new words. Without new words, our
>phrases will tend to reflect the existing lexicon unless we are highly
>selective.

If we do the computer-generated vocabulary thing, we can have the end
result follow whatever sound distribution we like. I have just the
program to use. It's called Language Maker, and it can do exactly what we
need. All we need is the frequency distribution of phonemes and the
phonemic constraints, and poof! - we've got vocabulary.

Rob Z.
--------------------------------------------------------
Men are born ignorant, not stupid;
they are made stupid by education.
                -- Bertrand Russell
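
P.S. To make the "frequency distribution plus constraints" idea
concrete, here is a rough Python sketch of weighted word generation. It
is not how Language Maker works and does not use its interface; the
phoneme sets, the weights, the fixed onset-vowel-coda syllable shape,
and the "no doubled letters" constraint are all made-up placeholders,
not values measured from the lexicon. A weight of 0 stands in for the
proposed temporary ban on /a/ and /r/.

```python
import random

# Hypothetical weights (assumptions, NOT measured from the lexicon).
# A weight of 0 means the phoneme is banned from new words.
ONSETS = {"t": 4, "k": 3, "s": 3, "l": 2, "v": 2, "r": 0}
VOWELS = {"a": 0, "e": 3, "i": 3, "o": 2, "u": 2}
CODAS = {"": 5, "n": 2, "k": 1, "l": 1}  # "" means no coda


def pick(weights, rng):
    """Draw one symbol with probability proportional to its weight."""
    symbols = [s for s, w in weights.items() if w > 0]
    return rng.choices(symbols, weights=[weights[s] for s in symbols], k=1)[0]


def make_syllable(rng):
    """Build one onset-vowel-coda syllable (an assumed syllable shape)."""
    return pick(ONSETS, rng) + pick(VOWELS, rng) + pick(CODAS, rng)


def allowed(word):
    """Example constraint (an assumption): no doubled letters, e.g. 'kk'."""
    return all(a != b for a, b in zip(word, word[1:]))


def make_words(count, syllables=2, seed=None):
    """Generate `count` distinct words that pass the constraint check."""
    rng = random.Random(seed)
    words = set()
    while len(words) < count:
        word = "".join(make_syllable(rng) for _ in range(syllables))
        if allowed(word):
            words.add(word)
    return sorted(words)


if __name__ == "__main__":
    for word in make_words(10, seed=1997):
        print(word)
```

Changing the weights shifts the sound of the output, and lifting a ban is
just a matter of giving that phoneme a nonzero weight again.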