Ch.5  WORDS

Typically enough, before I can make sense talking about words, I first have to talk about numbers. Dipping into the fair use statutes in search of some authoritative estimate of the size of my subject matter, I discovered that The Compact Oxford English Dictionary, 2nd Edition (Oxford University Press, 1991) presently lists about 616,500 words. That really doesn't help much, because the esteemed Oxford editors leave out a lot of words. For example, there are around 1.4 million named species of insects. Some authorities estimate that the total vocabulary of the English language is in excess of three million words, but that only about 200,000 are in common use. They further estimate that the average vocabulary of an educated person is about 20,000 words, of which about 2,000 are used in a normal week's conversation.

It would seem, it seems to me, that we pretty much have enough words to go around. With such a stockpile of resources we should be able to communicate precisely with each other. Making oneself perfectly understood should simply be a matter of using just the right number of words in just the right order. Using Oxford as a guideline, even an educated person makes use of only about 3.2% of the words readily available (20,000 out of 616,500), so at first brush failures in communication would seem to be the result of not knowing enough words. Not to obdurately animadversion a reliquiae caballus, but we all know that is not the case. In fact, beyond a rather low threshold, the use of less common words, particularly words with more than two syllables, is socially taboo. Rather than being judged a skilled conversationalist, one is more likely to be judged a pretentious snob. [For a number of years, the accounts of which have not yet darkened the doors of this website, I worked for a government agency preparing reports that were issued directly to members of Congress. Our strict guidelines were to write to the tenth grade level.] Even in our most prestigious Ivy League universities, writing students are cautioned to use simple words and simple sentence constructions. Pray tell, then, what are we doing with all those words we are not allowed to use, and why the hell did we make them in the first place?

To get a handle on that, I need to use some language developed for numbers; specifically, statistical numbers. What I am going to try to do is to take a well-known concept and then modify it so that I have a comfortable metaphorical framework to pontificate within. I suspect that is a bit more circuitous than competent communication would have it, but please humor me. I am a veteran. A suitable expression for the primary thesis of my thinking would be the evolution of organizational structure. The nearest thing to science guiding my exploration would be chaos theory. And the nearest thing to a catch phrase/brochure/metaphorical gimmick would be the normal distribution. The normal distribution has an enormous, albeit cabalistic, influence on our human affairs. Guiding essentially [no I didn't count them] all our decisions is the concept of boundaries and a family of comparable elements within those boundaries, but to backtrack a bit...

In 1733 a fellow by the name of Abraham de Moivre developed a process to approximate binomial distributions. Word has it that his original paper was lost and not rediscovered until 1924, but then the paper appears to have been reprinted in 1738 in the second edition of his book The Doctrine of Chances, in the context of approximating certain binomial distributions for large n. De Moivre's computations were extended by a Mr. Laplace and published in the Analytical Theory of Probabilities (1812). Thereafter the result was called the Theorem of de Moivre-Laplace, although that term is not universally accepted. The ubiquitous value of the theorem in practical applications first came to light in the 19th century. The famous Belgian astronomer Adolphe Quetelet collected data on the chest measurements of Scottish soldiers and the heights of French soldiers. Remarkably, when Quetelet plotted both sets of measurements they tended to cluster in similar symmetrical shapes around a mean. This information proved to be invaluable, as one might imagine, to planning battlefield strategies. [The casual reader might note that the author has a tendency to describe certain events in terms that border on irreverence; perhaps even sarcasm. The more astute reader may reason that this character trait stems from an inferiority complex brought on by a hypercritical mother, and tolerate such digressions out of sympathy.]
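
For readers who would like to see what de Moivre's approximation amounts to, here is a minimal sketch. It is not his method or his notation, only a modern restatement under my own assumptions: 100 tosses of a fair coin (the numbers and the function names binomial_pmf and normal_density are mine, chosen for illustration). It sets the exact binomial probability of k heads beside the height of a bell curve with the same mean and standard deviation.

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_density(x, mean, sd):
    """Height of the bell curve with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * math.sqrt(2 * math.pi))

# Illustrative assumption: 100 tosses of a fair coin.
n, p = 100, 0.5
mean = n * p                      # 50 heads expected
sd = math.sqrt(n * p * (1 - p))   # standard deviation of the head count

for k in (40, 45, 50, 55, 60):
    exact = binomial_pmf(k, n, p)
    approx = normal_density(k, mean, sd)
    print(f"{k:3d} heads  exact {exact:.4f}  bell curve {approx:.4f}")
```

The two columns should agree closely, which is the whole point of the theorem: for large n the lumpy binomial can be replaced by a smooth curve that is far easier to compute by hand.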

As more and more data were collected about natural phenomena, researchers began to notice that the tendency of data fields to form more or less symmetrical distributions around a mean value was widespread. In fact it seemed to be true of almost everything that could be measured as discrete elements. Eventually this phenomenon came to be called the normal distribution. Once a set of random data was quantified and depicted in a histogram, the frequency of the elements recorded tended to fall most prominently in a narrow range of values, with greater or lesser values distributed on either side of the central value. A line extending from the top of each representation of measurements tended to form a curve with similar slopes on either side of the mean. This mathematical Wooden Trojan Horse [get it?] came to be called the bell-shaped curve. Suppose one were to measure the dingbat population on a 1000-acre chunk of plastic floating in the ocean by dividing the island into one-acre sections and then recording the number of dingbats in each section. Dividing the total number of dingbats by the number of sections, you discover that the average number of dingbats is 12 per section. The number of sections containing exactly 12 dingbats would be graphically represented as a column proceeding vertically from a base line. Then the number of sections containing 11 dingbats would be represented by another column, as would the number of sections containing 13 dingbats, and so on until all the dingbats are represented on the chart, the sections containing fewer than 12 dingbats on one side of the mean or central column, the sections containing more than 12 on the other side. The normal distribution research found that the curve of the chart would tend to look like this.
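
The dingbat census can be roughed out in a few lines. What follows is a sketch under assumptions of my own, not survey data: 12,000 imaginary dingbats each land on one of the 1000 sections purely at random, and the function name dingbat_histogram and the fixed seed are made up for the illustration. Counts generated this way are only approximately bell shaped (the right-hand side is a touch longer than the left), which is itself a fair preview of how ragged real data look.

```python
import random
from collections import Counter

def dingbat_histogram(sections=1000, dingbats=12000, seed=1):
    """Scatter dingbats at random over the sections, then tally how many
    sections hold 0, 1, 2, ... dingbats."""
    rng = random.Random(seed)            # fixed seed so every run gives the same chart
    per_section = [0] * sections
    for _ in range(dingbats):
        per_section[rng.randrange(sections)] += 1
    # Maps "dingbats found in a section" to "number of sections with that count".
    return Counter(per_section)

if __name__ == "__main__":
    freq = dingbat_histogram()
    for count in sorted(freq):
        # One '#' for every two sections keeps the chart narrow.
        print(f"{count:3d} | {'#' * (freq[count] // 2)}")
```

Turn the printout on its side and the columns pile up around 12 and thin out in both directions.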
The normal distribution expectation is that there would be approximately the same number of sections containing 11 dingbats as there would be sections containing 13 dingbats. In other words, the rate of deviation from the average would be the same for greater or lesser values. As the concept developed, the model baseline was divided into lengths called standard deviations. For convenience, the baseline charts typically show six standard deviations, three negative and three positive. Since the distribution curve does not actually intersect the base line, there is no place on the baseline beyond which a measured element is forbidden to reside, but elements very far away from the average have little hope of being published. To quote one of the most used phrases in all of TV advertising, "but wait": researchers were hardly through. They further found that for a given population 68% of the cases fall within one standard deviation of the mean, and 99.7% of the cases fall within three standard deviations of the mean. Working out the exact width of a standard deviation for a given chart requires substantial computation, but in general, the smaller the standard deviation the higher the apogee of the curve and the more precipitous the slopes, and the larger the standard deviation the more flattened out the bell appears.
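
The 68% and 99.7% figures need not be taken on faith. For an ideal normal distribution, the share of cases within k standard deviations of the mean is erf(k / sqrt(2)), where erf is the error function found in most math libraries. A minimal sketch follows; the function name share_within is my own.

```python
import math

def share_within(k):
    """Fraction of an ideal normal population lying within k standard
    deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviations(s) would read the same way; the loop prints:
# within 3 standard deviation(s): 99.7%
```

These percentages describe the idealized curve only. A real pile of measurements will match them just to the extent that it actually follows the bell.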

Somewhere along the line the evolution of the normal distribution reached a fork in the road. Rather than choose one road or the other the chutzpadik bell followed both roads. Only one fork is of value to us here, but let's follow the other path for a while because it leads to a very interesting observation from its most gifted explorers. As counting and chart depiction became more precise and widespread it became obvious that the symmetrical bell-shaped curve frequently looked a little ragged. Rather than discard the normal distribution concept a new craft was created---- statistics. One initial function of this new craft was to minimize the sample size needed in order to make valid statements about an entire population.

Statisticians began to devise esoteric and complex ways to redescribe the raw data so that it looked a little more "normal" when charted. To their credit, when one studies statistics the instructors invariably caution the class that statistical probabilities are only an approximation of the "real" population. As these esoteric formulas continued to evolve, the primary focus shifted from developing tools to assist in making decisions about populations to extending the boundaries of the equations as an end in itself. The most gifted of these explorers eventually enter a rare dimension called pure mathematics. I envy them, and traveled down this road for the sole purpose of saluting them. I would not demean their quest by attempting to describe their journey with my own words, but will instead cite the words of one of their own.

"The [mathematician] does not study [mathematics] because it is useful; he studies it because he delights in it, and he delights in it because it is beautiful.... Of course I do not here speak of that beauty that strikes the senses, the beauty of qualities and appearances; not that I undervalue such beauty, far from it, but it has nothing to do with science; I mean that profounder beauty which comes from the harmonious order of the parts, and which a pure intelligence can grasp."
Jules Henri Poincaré  (1854-1912)

Poincaré is considered by many to be the greatest mathematician of his age, and his perception of beauty in mathematics is an aspect still recognized. For example, Edward Witten, a mathematical physicist at the Institute for Advanced Study in Princeton, New Jersey, who is often considered to be Einstein's true successor, has said:

 " [T]he equations that really work in describing nature with the most generality and the greatest simplicity are very elegant and subtle.  It's the kind of beauty that might be hard to explain to a person from a different walk of life who doesn't deal with science or math professionally. But the beauty of Einstein's equations, for example, is just as real to anyone who's experienced it as the beauty of music."

I don't doubt that great mathematicians perceive a profound beauty in equations that would be completely indecipherable to most, but Monsieur Poincaré is dead wrong on two counts: one, when he states that everyday beauty has nothing to do with science, and two, when he implies that such vision is available only to those with "a pure intelligence." The distinguished and departed gentleman will receive his proper comeuppance a bit later, but for now, having paid our respects to our betters, we shall return to the normal distribution road.



Earlier, I used the analogy that the normal distribution was like a Trojan Horse. What I mean by that is that the concept has the appearance of being harmless, in fact quite helpful, and that is true in many respects. Few venture capitalists would invest money before taking a sample of their prospective customers and using statistics based on the normal distribution to determine their likelihood of success. Pollsters are constantly checking our pulse with survey questions. For example they might say "our survey shows that 67% of Americans approve the policy changes. The margin of error is plus or minus 3 percentage points." What that means on the mathematician's road is very different from what it means on the plebeian pathways. The assumption is that if every American were asked, no more than 70% and no fewer than 64% of us would approve of the policy changes. That assumption is treated like a true, irrefutable fact, and any action taken within the 6-point range of approval would not receive extensive scrutiny. The problem is that the existence of a real level of opinion is a myth. There is no such thing as a real population. [Readers might want to skip the rest of this paragraph because I am going to get pretty picky and/or petty, even for an INTP. (Myers-Briggs Character Types in case you are interested)] Suppose one wanted to know the average age of a person within a fifty mile radius of a given point. Here's how come there ain't no such thing. First off, you could never find the "given point". What given point does one start drawing the circle from? Well, says the generic liegeman, "might I suggest sire that you simply drop a dime on the ground and start your measurement from there." And from which part of the dime does one start measuring, the cross line in the letter A in America?-- which part of the line?-- which molecule, and of what?-- which atom?-- which electron?-- as a particle or a wave?-- what if there is not a particle at the bottom at all, but a string?-- which dimension? (And of course the same issue would apply to the outer edge of the circle.) Even if you could decide on the center and the edge, what about the people partially on the line? What about the person who might die just at the moment of the counting-- exactly when does death occur?-- or does life begin? Within which nanosecond does one perform the calculation? How could one report the average age before it would change?
The frequently subliminal but implicit assumption that there is a real distribution clouds our ability to fully understand anomalies, and it is there that strange attractors come into being, and why we are so adept at not seeing them until they effect unexpected outcomes.  When we see something that is different from our expectations based on our knowledge and beliefs (about the those) we almost always see it as a temporary, even accidental deviation from the norm.  In other words it is a glitch in the natural order and will soon right itself.   We are severely programmed to not see it as a harbinger of changing order.
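
For reference, here is roughly where the pollster's "plus or minus 3 percentage points" comes from. This is a sketch of the common textbook formula, not any particular polling firm's procedure. It assumes a simple random sample, the normal approximation, and the conventional 95% confidence level (z = 1.96). The sample size of 1000 is hypothetical, since the example above never says how many people were asked, and the function name margin_of_error is mine.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a sample proportion.
    z = 1.96 corresponds to the conventional 95% confidence level."""
    return z * math.sqrt(p * (1 - p) / n)

approve = 0.67       # share reported by the pollster in the example
sample_size = 1000   # hypothetical: the example does not give a sample size

moe = margin_of_error(approve, sample_size)
low, high = approve - moe, approve + moe
print(f"{approve:.0%} plus or minus {moe:.1%}: from {low:.1%} to {high:.1%}")
```

Note what the formula does and does not promise. On the mathematician's road the claim is only that intervals built this way would capture the hypothetical "true" figure about 95 times in 100. It says nothing about whether such a true figure exists, which is the very assumption under complaint here.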




When and how words came into being is the subject of much speculation and research. Perhaps sounds became words became languages over millions of years as we started to walk upright and gravity pulled our larynx down, allowing the vocal cords to develop. Maybe the language abilities resulted from mutations of innate protolinguistic genes, or were installed by some deity or other. Perhaps Webster discovered how to do time travel and went back in time to invent words so he would be able to find a publisher for his book.

Arbitrarily picking a point in time to write about the organization of words in our American English language, I decided on 1963. That period was not a unique milestone in the history of words; I chose it simply for editorial convenience. That was the first time I was actually exposed to the organizational aspects of words at the university level. [My personal background re: syntax is elucidated elsewhere in this cybernest.] At that point in time the organizational structure of words was well established. We had just passed out of the '50s, a time of widely shared values and opinions, and things seemed to be working just fine. In terms of my distribution metaphor, words (not just their frequency but their connotations and their perceived value) had been well established. A graphical representation of the words of that day would show a very distinct and dense area of accepted usage, the "mean" or area of the central standard deviation in the hypothetical graph. The rules and proper usage of words in this area were widely, almost universally, accepted. Language was so fully developed that textbooks were printed defining, in excruciating detail, proper use of words. Students in English class were subjected to one of the more horrific practices of the day-- diagramming sentences. See how this practice mutilated a cherished sentence.

Life in the center of words (by now promoted to language) was sternly controlled; innovation, or the even more dreaded colloquial use, was dealt with severely. The reason, so said the authorities in charge, was that all the rules had been established. Proper usage of words had reached its ultimate destiny. The final arbiters of the American English language had no sense of history or the future. It just didn't register that the rules of language should be fluid--they assumed once a dangling participle, always a dangling participle. Had the arbiters been transported back in time to the first schoolroom in the New World, they would have continued to mete out punishment to those pilgrim children who strayed from accepted use. They would not have missed a beat. How dare one state that the universal truths of language they were then enforcing were different from the ones in use in 1963! A new word wandering in from the outer regions of words had little chance of making it into the dictionary. We had god, apple pie, and the conjugation of verbs. Period.
Continued>
[Figure: "Dingbat Data". Vertical axis: Frequency. Horizontal axis: Dingbats Per Cube, 8 through 16, centered on 12.]