Lawyerdude’s Theory of Word Counting
“By their words ye shall know them” - J. Christ
Word Counting: How to quantitatively analyze writing for a signature
How to Analyze speech patterns - for deceit for example
by Attorney Douglas Palaschak. 25 April 2000.
This page is www.lawyerdude.netfirms.com/wordcounting.html
Related pages:
My Word Counter discussion group: http://groups.yahoo.com/group/Wordcounters/
Professor Catherine Ball’s Word Counter Program: http://www.georgetown.edu/faculty/ballc/webtools/web_freqs.html
Morgan’s Word Counter program: http://www.wordcounter.com/
Location of original file on my computer: File: backup/projects/essays/actually . . .3794
This page is mentioned on: http://www.lawyerdude.8m.com/mystory.html
Table of Contents:
Try it yourself - on the Magna Charta excerpt, for example, as shown below
What I discovered - half the words in any article are the same 17 words
The rank of words according to their frequency of use
What I theorized: The Signature theory
Measuring the natural frequency of word occurrence
How to analyze speech patterns - for deceit for example
Palaschak's theory of jumbled words
Palaschak's Theory of Control by Words
Most clear example: Application to the Unabomber case:
We can use this theory to find out who really wrote Supreme Court opinions
Lawyerdude’s Theory of Wordcounting
When I was in jail once I decided to test my theories of word frequency. We all know that some words appear more frequently than others. The words “the” and “and”, for example, occur more often than the word “encyclopedia”.
Glossary of words used herein
Frequency of occurrence: This is a ratio, always less than one, of the number of occurrences of the word divided by the total number of words in the passages analyzed. In the example below, the total number of words is 38. The frequency of occurrence of the word “the” is .132.
Divergence ratio: The ratio of the frequency of the word in the subject's writing to the frequency of the word in a much larger universe of writing - large enough to demonstrate a stable frequency of that word. In the example below, the subject used the word “amercements” once in 38 words. The frequency of occurrence is 1/38 = .026. If we were to analyze Reader's Digest, we would likely find some entire issues without the word “amercements” so that the frequency of occurrence in our universe of Reader's Digest land is actually something like 1/100,000 = .00001. The divergence ratio here = the ratio of the 2 frequencies = .026/.00001 = 2600. This passage uses the word 2600 times more frequently than the normal writer in our Reader's Digest universe.
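The two glossary definitions can be written down as a short Python sketch. The function names are my own labels, not part of the original analysis, and the 1/100,000 baseline is the guess from the glossary, not a measured value.

```python
def frequency(count, total_words):
    """Frequency of occurrence: occurrences divided by total words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return count / total_words


def divergence_ratio(subject_freq, baseline_freq):
    """Ratio of the subject's frequency to the baseline frequency."""
    if baseline_freq <= 0:
        raise ValueError("baseline frequency must be positive")
    return subject_freq / baseline_freq


subject = frequency(1, 38)         # "amercements" once in 38 words
baseline = frequency(1, 100_000)   # guessed Reader's Digest rate from the glossary
ratio = divergence_ratio(subject, baseline)
print(round(subject, 3), round(ratio))
```

Without rounding 1/38 to .026 first, the ratio comes out near 2632 instead of 2600. The difference is only rounding.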
My Methodology
I counted the words in a Reader's Digest article. Then I listed all the words in alphabetical order - and, of course, many words occurred more than once. For each word I divided the number of times it appeared by the total number of words in the article and that ratio was, of course, the frequency of occurrence of that particular word.
Try it yourself - on the Magna Charta excerpt, for example, as shown below
Let's use something from the Magna Charta as an example:
“A freeman shall not be amerced for a slight offense, except in accordance with the degree of the offense . . . and none of the aforesaid amercements shall be imposed except by the oath of honest men of the neighborhood.”
First, count the words. There are 38 words in this passage. Now list the words in alphabetical order. (WordPerfect can count the words and place them in alphabetical order for you.) Maybe you want to cross off each word as you count it - because there may be duplicates - and you need to count duplicates also.
Here they are listed alphabetically with the numbers appearing before letters:
20
a
a
accordance
aforesaid
amerced
amercements
and
be
be
by
degree
except
except
for
freeman
honest
imposed
in
men
neighborhood
none
not
oath
of
of
of
of
offense
offense
shall
shall
slight
the
the
the
the
the
with
Now compute for yourself the frequency of occurrence of the word “the” for example.
The word “the” occurred 5 times out of a total of 38 words. Thus the frequency of occurrence is 5/38 = .132, which is between 1/7 and 1/8. About one in every seven or eight words is the word “the” if all writing is like this. The problem with this small sample is that no word can have a frequency less than 1/38 - because we only had 38 words in the sample.
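For anyone without WordPerfect, the same count can be sketched in a few lines of Python. The tokenizing rule here (lowercase everything, keep runs of letters and apostrophes) is my assumption. The original count was done by hand.

```python
import re
from collections import Counter

PASSAGE = (
    "A freeman shall not be amerced for a slight offense, except in "
    "accordance with the degree of the offense . . . and none of the "
    "aforesaid amercements shall be imposed except by the oath of "
    "honest men of the neighborhood."
)


def count_words(text):
    # Lowercase, then keep runs of letters/apostrophes; punctuation is dropped.
    return Counter(re.findall(r"[a-z']+", text.lower()))


counts = count_words(PASSAGE)
total = sum(counts.values())

# Print the alphabetical list with each word's count and frequency.
for word in sorted(counts):
    print(f"{word:<12} {counts[word]}  {counts[word] / total:.3f}")
print("total words:", total)
```

This gives 38 words in total, with “the” counted 5 times.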
What I discovered - half the words in any article are the same 17 words
I discovered that if you remove the 17 to 35 (I forget the number now) most frequently occurring words in an article, then you have taken out half the words in a Reader's Digest caliber article. Of course you might expect some variation in the number of words that you would need to remove to take out half the words of the article - and that number would itself be a weak signature - but the better signature is described further below.
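Here is a rough Python sketch of that "half the words" measurement. The function name words_covering is hypothetical. The tallies are the ones counted in this article for the 38-word Magna Charta excerpt, not for a Reader's Digest article, so the result will not be 17.

```python
from collections import Counter

# Tallies for the 38-word Magna Charta excerpt, as counted in this article.
DOUBLES = ["shall", "be", "a", "except", "offense"]
SINGLES = ("accordance aforesaid amerced amercements and by degree for freeman "
           "honest imposed in men neighborhood none not oath slight with").split()

counts = Counter({"the": 5, "of": 4})
counts.update({w: 2 for w in DOUBLES})
counts.update({w: 1 for w in SINGLES})


def words_covering(counts, fraction=0.5):
    """Most frequent words whose combined count reaches `fraction` of all words."""
    total = sum(counts.values())
    covered, chosen = 0, []
    for word, n in counts.most_common():
        if covered >= fraction * total:
            break
        chosen.append(word)
        covered += n
    return chosen


top_half = words_covering(counts)
print(len(top_half), top_half)
```

On this tiny sample the 7 most frequent words account for 19 of the 38 words. A full article would need its own count.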
The rank of words according to their frequency of use
The word “the” would be at the top of the list as the most frequently used word.
Theory: How many words remain in the same rank across our Reader's Digest universe? That in itself is a measure, though something less than a signature. Try it yourself:
Example: The ranking of words in the example is as follows:
“the” occurred 5 times in the 38 word passage. Frequency = 5/38 = .132
“of” occurred 4 times. 4/38 = .105
“shall”, “be”, “a”, “except”, and “offense” each occurred twice. 2/38 = .053
The remaining words on the list occurred only once (1/38 = .026) and therefore are all tied for last place with no particular order. We need a bigger sample of words to clarify their ranking in terms of frequency.
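A ranking that handles ties might be sketched like this in Python. The function name rank_with_ties is hypothetical, and the sample tallies are only a subset of the excerpt's words (two of the once-only words stand in for the rest).

```python
from itertools import groupby


def rank_with_ties(counts):
    """Group words by count; words with equal counts share one rank."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    ranking = []
    grouped = groupby(ordered, key=lambda item: item[1])
    for rank, (n, group) in enumerate(grouped, start=1):
        ranking.append((rank, n, [word for word, _ in group]))
    return ranking


# Subset of the tallies from the Magna Charta excerpt.
sample = {"the": 5, "of": 4, "shall": 2, "be": 2, "a": 2,
          "except": 2, "offense": 2, "oath": 1, "men": 1}

ranking = rank_with_ties(sample)
for rank, n, words in ranking:
    print(rank, n, ", ".join(words))
```

Words inside a tied group are listed alphabetically only for readability. The order within a tie carries no meaning.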
Creating a data bank - or several - will stabilize the numbers and establish a societal frequency pattern.
As you put more and more sentences, and paragraphs, and articles, and magazines, and encyclopedias, and newspapers through the process you will see that the numbers stop changing. The frequency of the word “the” will likely be accurately determined - for the particular universe from which you select your documents.
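One way to watch the numbers settle is to keep a running total as each document is added. This is a sketch only. The three "documents" are toy strings I made up to show the mechanics, and they are far too short to show real stabilization.

```python
import re
from collections import Counter


def running_frequency(documents, word):
    """Yield the cumulative frequency of `word` after each document is added."""
    totals = Counter()
    for text in documents:
        totals.update(re.findall(r"[a-z']+", text.lower()))
        n = sum(totals.values())
        yield totals[word] / n if n else 0.0


# Toy documents (hypothetical, for illustration only).
docs = ["the cat sat", "on the mat the end", "a dog"]
history = list(running_frequency(docs, "the"))
print(history)
```

With a real data bank you would feed in whole articles and stop when successive values in the history differ by less than some tolerance you choose.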
What I theorized: The Signature theory
Each of us has certain words or phrases that we use more than other people use them. The essence of computing the signature data lies in comparing the frequency of these words in our own speech with the frequency of occurrence in normal speech. My theory then is based on statistics.
Measuring the natural frequency of word occurrence
My analysis can easily be done with WordPerfect or any computer program that can make a word list. With a little preliminary work, one can establish baseline data for each word. Each word will have a naturally occurring frequency in various types of writing. Reader's Digest or any other edited publication would seem to present a good baseline. One can establish the frequency of the 17 to 35 most frequent words by simply counting words in one Reader's Digest article. I did that. It works. And the variation from article to article can be analyzed statistically also.
The divergence ratio is explained in the glossary. The divergence ratio is an indicator of the degree of deviation from the norm in the use of a particular word or phrase. In Larry's case discussed herein, he uses the phrase “turn around and” excessively. The divergence ratio for this phrase will be high for Larry - and that becomes part of his signature.
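Counting a phrase instead of a single word takes one extra step: slide a window the length of the phrase across the word list. A minimal Python sketch follows. The function name is hypothetical and the sample sentence is invented for illustration. It is not a transcript of anything Larry said.

```python
import re


def phrase_frequency(text, phrase):
    """Occurrences of `phrase` divided by the total number of words in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    target = phrase.lower().split()
    k = len(target)
    if not words or k == 0:
        return 0.0
    hits = sum(1 for i in range(len(words) - k + 1) if words[i:i + k] == target)
    return hits / len(words)


# Invented sample sentence (13 words, phrase appears twice).
sample = "I'm gonna turn around and sell it, then turn around and buy another."
freq = phrase_frequency(sample, "turn around and")
print(round(freq, 3))
```

Dividing this number by the same phrase's frequency in the baseline universe gives the divergence ratio for the phrase.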
I published the results of my analysis when I got out of jail in 1993. They may or may not be on this computer. If not this computer, then my other computer in storage has it.
How to analyze speech patterns - for deceit for example
If we were to analyze Larry Federico's speech we would see that he uses the phrase “then turn around and” or “I'm gonna turn around and”. The essence is “turn around”. These are words manifesting his intent to deceive.
Basketball is a game based on deceit - as are most games - and most human behavior. That is why perjury is so rarely prosecuted - Fuhrman in the O.J. trial perjured himself and got off with probation and a fine.
Psychologists talk about the “switch” which is characteristic of devious people - and characteristic of our games - which manifest our social skills - the most prominent of which is deception. (Example: Basketball player fakes to the left and dribbles right. That is deception!!) Deception is a natural defense of animals. Possums pretend to be dead. Fish inflate themselves to appear dangerous. Palaschak's theory of deceit: It is more prominent in densely populated areas of crude ignorant people; in fact it becomes ingrained in society. Corollary: Saturation advertising - a hallmark of 20th century television and radio - is based on deceit; and advertising only works in densely populated pockets of crude ignorant people.
Larry is unintentionally telling us that his plan involves buying something from somebody and then selling it for more - by deceiving both the buyer and the seller. Larry is a used car salesman branching out into real estate.
Palaschak's theory of jumbled words
If you have enough verbiage about a subject, you can tell fundamentally what is being said even if the words are rearranged; the frequency of occurrence of words signals the message albeit not nearly as well as with ordered words.
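The counting half of this theory is easy to check: shuffling the words leaves every word count unchanged. The Python sketch below shows only that much. Whether a reader can still recover the message is the theory itself, and no code here tests it. The function name is hypothetical.

```python
import random
from collections import Counter


def jumble_words(text, seed=0):
    """Return the same words in a shuffled order (seeded so runs repeat)."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)


original = "a freeman shall not be amerced for a slight offense"
jumbled = jumble_words(original)
print(jumbled)
print(Counter(jumbled.split()) == Counter(original.split()))
```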
Related theory of jumbled letters: If the first and last letters of each word are left in place, people can unjumble the letters in between.
You can porve tihs throey to yruoself. You jsut now proevd it.
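To make your own jumbled sentences, here is a small Python sketch that shuffles only the interior letters of each word. The function name, the seed, and the rule of leaving words shorter than four letters alone are my assumptions.

```python
import random
import re


def jumble_interior(text, seed=0):
    """Shuffle the inside letters of each word; keep first and last in place."""
    rng = random.Random(seed)

    def scramble(match):
        word = match.group(0)
        if len(word) < 4:
            return word  # too short to have a shufflable interior
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]

    # Only runs of letters are touched; spaces and punctuation stay put.
    return re.sub(r"[A-Za-z]+", scramble, text)


original = "You can prove this theory to yourself."
jumbled = jumble_interior(original, seed=1)
print(jumbled)
```

A shuffle can by chance return a word unchanged, especially a short one, so some words may come out readable as-is.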
Palaschak's Theory of Control by Words
If we don't use the word “amercements” in Reader's Digest, then the readers of Reader's Digest will never ask about amercements. That's why some parents spell out the word “pregnant” instead of saying it. It avoids discomforting explanations - and dumbs down the population - at an early age. “By their words ye shall know them”
There is a universe of applications of word frequency theory!
Most clear example: Application to the Unabomber case: To prove that the Unabomber wrote the treatise one could compare it with his previous writings. Compute a divergence ratio for each of the most divergent words in the vocabulary of the treatise. Compute the divergence ratio for the most divergent words in the previous vocabulary of the Unabomber. Rank the words inversely according to frequency (rarest first) - and compare the ranked lists. The divergent words should be at the top of the list in both lists.
We can use this theory to find out who really wrote Supreme Court opinions - or lower court opinions - except that editors will remove the really telling words.
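The comparison of two ranked lists could be sketched in Python as follows. Several things here are my assumptions and not part of the theory as stated: the add-one adjustment that keeps words missing from the baseline from dividing by zero, the use of set overlap (Jaccard) as the comparison score, and all the names. The baseline tallies and both texts are toy data invented for illustration.

```python
import re
from collections import Counter


def divergence_ratios(text, baseline_counts, baseline_total):
    """Divergence ratio of every word in `text` against a baseline tally."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    ratios = {}
    for word, n in counts.items():
        # Assumption: add-one smoothing so unseen baseline words do not divide by zero.
        base = (baseline_counts.get(word, 0) + 1) / (baseline_total + 1)
        ratios[word] = (n / total) / base
    return ratios


def top_divergent(text, baseline_counts, baseline_total, k=3):
    """The k words with the highest divergence ratio (ties broken alphabetically)."""
    ratios = divergence_ratios(text, baseline_counts, baseline_total)
    ordered = sorted(ratios.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ordered[:k]]


def overlap(list_a, list_b):
    """Share of words common to both lists (Jaccard index, my choice of score)."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


# Toy baseline and texts (hypothetical).
baseline = {"the": 50, "of": 30, "and": 20}
treatise = "the oath of the freeman and the amercements"
earlier = "amercements of the freeman and the oath of honest men"

top_a = top_divergent(treatise, baseline, 100)
top_b = top_divergent(earlier, baseline, 100)
print(top_a, top_b, overlap(top_a, top_b))
```

A higher overlap between the two top lists is only suggestive. How much overlap would count as proof of authorship is not something this sketch can settle.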