Thursday, January 18, 2007

Why similarity is way beyond the similar, II: the poverty line

What defines a poor person? Is a poor person someone below the poverty line, someone whose income is less than a precise number? If you think so, as the US government does, you're putting in the same category homeless people from New Orleans and some Chicago Economics grad student who might work as a TA and still get a shiny golden Rolex from proud wealthy daddy for Christmas. Oh, and perhaps you're throwing out those above the poverty line whose expenses dwarf their means, such as garbage collectors in expensive New York City.

We must say Bravo, unfogged! Bravo!

Many English words (well, words in any natural language), particularly the kinds of words that are important in political arguments, have imprecise referents that shift with context. "Poor" and "poverty" refer generally to a state in which one's lack of economic resources is a problem, whether that specifically means Pat Moynihan's 'underclass', or a grad student living on ramen noodles, or even a house-poor family who overspent on a McMansion and are now, despite a decent income, fearing foreclosure. In some contexts, poverty refers to grinding hardship; in others, it can describe a state that doesn't involve a great deal of hardship at all. But these are all correct usages of the word 'poor' -- while it's not of infinite extension, it doesn't have a sharp edge. Given any situation, you can argue about whether 'poverty' is a valid description of it, but there are always going to be borderline cases where there isn't a solid yes or no answer.

In order to facilitate public policy analysis, on the other hand, the US Government has created a defined term -- 'the poverty line' -- which does have a sharp edge. If your income is three times the cost of an economy food budget in 1963 (adjusted for inflation) or below, you're below the poverty line; if not, you aren't. The poverty line is a precise measure, and it's necessary for some purposes, but that doesn't make it more accurate than the vague natural-language word 'poverty'. A grad student from a wealthy family with a lot of possessions and family assistance who's earning a below poverty level stipend for a year isn't poor, despite being under the poverty line; a family living in an area with high housing costs and making an income slightly over the poverty line is poor, despite not meeting the definition of the defined term. That's not a reason not to use the defined term, but it's important to remember that the defined term is a tool, rather than a reality; a public policy intended to address 'poverty' and directing its aid toward the temporarily low-income grad student (no implication that there aren't genuinely needy grad-students intended, of course) in preference to the genuinely needy family would be misdirected, even though the first is below the poverty line and the second isn't. For accuracy's sake, it's important to focus on the vague natural language word, which refers to some state of hardship due to lack of economic resources, and remember that the defined term is simply a tool for analysis.
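To make the "sharp edge" concrete, here is a minimal sketch of the defined term next to a rougher look at actual resources. Every number in it (the 1963 food budget, the inflation factor, the incomes, family support and housing costs) is a made-up illustration, and the `Household` fields and `resources_after_housing` helper are hypothetical names, not anything the government uses.

```python
from dataclasses import dataclass


@dataclass
class Household:
    name: str
    income: int          # yearly earned income
    family_support: int  # yearly help from relatives (hypothetical field)
    housing_cost: int    # yearly housing cost (hypothetical field)


def poverty_line(food_budget_1963: int, inflation_factor: int) -> int:
    """The defined term: three times an economy food budget, inflation-adjusted."""
    return 3 * food_budget_1963 * inflation_factor


def below_poverty_line(household: Household, line: int) -> bool:
    """The sharp edge: a single comparison against income, nothing else."""
    return household.income <= line


def resources_after_housing(household: Household) -> int:
    """A crude, illustrative stand-in for what the vague word 'poor' cares about."""
    return household.income + household.family_support - household.housing_cost


# All figures below are invented for illustration.
LINE = poverty_line(food_budget_1963=1000, inflation_factor=7)
grad_student = Household("grad student", income=18000, family_support=30000, housing_cost=9000)
family = Household("high-rent family", income=22000, family_support=0, housing_cost=20000)

for h in (grad_student, family):
    print(h.name, below_poverty_line(h, LINE), resources_after_housing(h))
```

With these invented figures the grad student falls under the line and the family does not, even though the family is left with far less to live on. The second measure is no more "true" than the first; it is just another tool.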

Similarly, in natural language 'the economy' describes the whole system of exchanges of goods and services that go on in our society -- it's incredibly complex, and certainly can't be reduced to one or two statistics. We're interested in the economy because it has all sorts of measurable characteristics that affect the welfare of people in our society. There's a defined term, 'Gross Domestic Product', whose size and growth are equated with the size and growth of the economy. This is precise, and it's not wrong if the reason you're discussing the economy is something that's going to be strongly affected by GDP, but its precision can make it terribly misleading when you're talking about the economy in terms of the economic welfare of individuals. If you find yourself thinking "Well, the economy is strong; even though wages are flat and income volatility is high, it's surprising that people aren't reacting positively to the economic good times" it's because your equation of 'the economy' with something that can be precisely defined, GDP, has left you with an inaccurate picture of what economic good times mean. The vague word is less likely to lead you astray than the defined term.

A lot of people have a tendency to privilege defined terms over naturally used words; if there's a tightly defined sense a word can be used in, they want to call usages that don't fit the tight definition as wrong, or improper. The problem is that you can't shake all of the connotations and baggage away from the natural word; when you say that 'poverty' means only 'the state of having an income below the federal poverty line', you implicitly state that someone who isn't poor by that definition isn't suffering from economic hardship -- while you can explicitly disavow the implication, it's still hard not to be affected by it. Better, and more accurate, to use words naturally, and save defined terms for the contexts where their precision is necessary.

[note: Free Exchange also has a say on the subject].

A small addendum: Regarding the cases where there is no sharp edge between "similar" categories, or English words for that matter, we should perhaps replace "many English words" with "all English words". After that, why not go on to "all human concepts"? Perhaps all human concepts are fluid and change according to context. The connotation explosion will allow us to see someone above the poverty line as a very needy, poor person. It will also allow us to see that many people below that line are not to be considered in any sense needy. Even the precisely defined concepts that are the glory of economists, even mathematically defined terms such as a number or a square, will dance the context dance.

A number is a number is a number, right? A square is a square is a square, right? Though these are precisely defined categories, our cognition tricks us into responding 'yes', while we should perhaps take another look.

In number theory a number is a number. But it can also encode a well-formed formula, or not. And those that actually do encode well-formed formulas can also refer to, or talk about, in a Gödelianesque way, the entire nature of mathematics. Numbers have more than a single connotation, and the chosen connotation will change according to the context in which the number is considered.
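Here is a minimal sketch of how one and the same integer can be read either as a plain number or as a formula, using a prime-exponent encoding in the spirit of Gödel numbering. The six-symbol alphabet and its codes are my own assumption for illustration, not Gödel's actual scheme, and the sketch does not check well-formedness.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


# Hypothetical alphabet: symbol i gets code i + 1.
SYMBOLS = ["0", "S", "=", "+", "(", ")"]
CODE = {sym: i + 1 for i, sym in enumerate(SYMBOLS)}


def encode(formula: str) -> int:
    """Map a symbol string to the product of prime(k) ** code(k-th symbol)."""
    number = 1
    for p, sym in zip(primes(), formula):
        number *= p ** CODE[sym]
    return number


def decode(number: int) -> str:
    """Read an integer back as a symbol string, or raise if it encodes none."""
    out = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        if not 1 <= exponent <= len(SYMBOLS):
            raise ValueError("this integer does not encode a symbol string")
        out.append(SYMBOLS[exponent - 1])
    return "".join(out)


print(encode("0=0"))  # 270
print(decode(270))    # 0=0
```

Under this encoding 270 is "just a number" in one context and the statement 0=0 in another, while an integer such as 5 encodes nothing at all and makes `decode` raise.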

The Russian scientist Michail Bongard brought forth a great challenge to computer scientists: find out what the difference is between six visual figures on the left side and six visual figures on the right side. Here are some Bongard problems. You can solve them in a minute or two.

The answers are in the first comment. Now, look at problem 91, and ask yourself: have I been consistent?

The answer is that you have not been consistent. A square is a square is a square? For that to happen, you would have to classify "three squares on the left side", versus "one square on the right side". Or perhaps you might want to think in terms of line segments: "twelve on the left", versus "four on the right". Those views lead to no correct answer.

If you solved it, you have been inconsistent; categorizing the "same" thing differently according to context.

Not to worry; wasn't it, after all, Ralph Waldo Emerson who said that "a foolish consistency is the hobgoblin of little minds"?

We have been pointing this out for a while now:
Linhares, A. (2000). A glimpse at the metaphysics of Bongard Problems. Artificial Intelligence 121, 251-270. Available HERE.

Why similarity is way beyond the "similar": the million-dollar comma

What is similarity? How do we categorize "similar" stuff? Seems pretty obvious, right? We see similar things, and they are properly categorized because, well, they're just similar! That seems correct until you look down at the fine print and try to see the intricate cognitive details involved. Our minds have concepts, and these concepts sing and dance according to the tune of the moment. Perhaps the brain's motto should be: our criterion for classification and categorization is simple: there are no criteria.

Take, for instance, the question "What is similarity?", and consider a text document. Let's think like a computer: if the Google search engine found two documents with thousands of words and a mere comma of difference between them, should Google classify the pages as similar? If you think so, look at it through a human's (or, perhaps better put, a lawyer's) eyes.

Here's a story from the New York Times:

The Comma That Costs 1 Million Dollars (Canadian)

OTTAWA, Oct. 24 — If there is a moral to the story about a contract dispute between Canadian companies, this is it: Pay attention in grammar class.

The dispute between Rogers Communications of Toronto, Canada’s largest cable television provider, and a telephone company in Atlantic Canada, Bell Aliant, is over the phone company’s attempt to cancel a contract governing Rogers’ use of telephone poles. But the argument turns on a single comma in the 14-page contract. The answer is worth 1 million Canadian dollars ($888,000).

Citing the “rules of punctuation,” Canada’s telecommunications regulator recently ruled that the comma allowed Bell Aliant to end its five-year agreement with Rogers at any time with notice.

Rogers argues that pole contracts run for five years and automatically renew for another five years, unless a telephone company cancels the agreement before the start of the final 12 months.

The dispute is over this sentence: “This agreement shall be effective from the date it is made and shall continue in force for a period of five (5) years from the date it is made, and thereafter for successive five (5) year terms, unless and until terminated by one year prior notice in writing by either party.”

Look at that last comma. How long should the contract last? Without the comma, it's pretty clear, right? It must last at least a full 5 years. Whether the bastard lawyers actually intended this is beside the point; the comma distorts meaning in a profound way. This distortion of meaning brought about by the slightest of cues is one of the phenomena that get our bloodstream turbocharged, for that is the basis of expert intuition. [I'm running a Ph.D. seminar entitled "Mathematical distortions of semantic space: theory and philosophy", and our topic of discussion turns out eventually to be this comma? Boy, am I going to succeed in Academia...]

Let's now get back to Google's way of looking at things. There are two 14-page documents; one has a single comma that the other lacks. Should Google classify them as "similar"? It seems obvious that it must: to Google's eyes, these are 99.9999% similar. After all, under what circumstances should the algorithms in a search engine perceive the semantic dangers that lie within a single comma, given thousands and thousands and thousands of exactly-matching-words-and-paragraphs documents?
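For the curious, here is a rough sketch of what a purely surface-level comparison sees. It uses Python's standard `difflib.SequenceMatcher` on word tokens, which is only my stand-in for "a machine's eyes" and certainly not how Google measures similarity. The 300 filler clauses are invented padding; only the disputed sentence comes from the story above.

```python
from difflib import SequenceMatcher

# The disputed sentence, as quoted in the article.
DISPUTED = (
    "This agreement shall be effective from the date it is made and shall "
    "continue in force for a period of five (5) years from the date it is "
    "made, and thereafter for successive five (5) year terms, unless and "
    "until terminated by one year prior notice in writing by either party."
)

# Invented filler standing in for the rest of a long contract.
filler = " ".join(
    f"Clause {i}: the parties agree to term {i}." for i in range(300)
)

with_comma = filler + " " + DISPUTED
without_comma = with_comma.replace("year terms,", "year terms")

# Compare word by word; autojunk is disabled so frequent words are not ignored.
matcher = SequenceMatcher(
    None, with_comma.split(), without_comma.split(), autojunk=False
)
similarity = matcher.ratio()

print(f"{similarity:.4%} similar")
```

The two texts differ in exactly one token, so the score lands a hair below 100%, and nothing in that number hints that the missing comma changes how long the contract runs.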

Neuroeconomics and connotation explosions

Oh, wouldn't it be great to see a neuroeconomics study of "tax cut" versus "tax relief"? Get a bunch of subjects to discuss "a tax cut program" and another, similar bunch of subjects to discuss "a tax relief program", everything else remaining constant, and watch how those brains respond to each scenario?

I'm more than willing to bet money that there's a large disparity in brain processing between these cases. I'd also bet that, as the topic becomes rather dull and people converse more and more about either a "tax cut" or a "tax relief", this disparity will diminish.

My final bet is that a pool of economics Ph.D.s should exhibit much less disparity between the framing of "cut" and "relief" than subjects with a Ph.D. in Greek Mythology.

Any takers?

Tuesday, January 16, 2007


Students of decision-making are well acquainted with the combinatorial explosion problem. It pervades a large number of disciplines, and NP-complete problems most probably will never have efficient algorithms; that's just a fact of life, it seems.
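A tiny sketch of what "explosion" means here, assuming the simplest possible case: a decision-maker who must pick the best ordering of n options by checking every ordering. The function name `orderings` is just an illustrative label.

```python
import math


def orderings(n_options: int) -> int:
    """Number of distinct orderings of n options, i.e. n factorial."""
    return math.factorial(n_options)


# The count of candidates to inspect grows far faster than the input does.
for n in (5, 10, 20):
    print(n, orderings(n))
```

Doubling the number of options from 10 to 20 does not double the work of brute force; it multiplies it by hundreds of billions.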

What should perhaps be pointed out as a problem of at least similar significance, one whose surface has barely been scratched, is what may be called the connotation-explosion problem. Let's look at three examples of this, in the hope that they will suffice to make the idea clearer.

Example 1. Think about a tax cut plan. Does that sound exciting? Does it move you? Do you immediately think that 'this is something that just must be fair'? Not necessarily, right? Now, following Lakoff, consider a tax relief plan.

re·lief - rɪˈlif, noun
1. alleviation, ease, or deliverance through the removal of pain, distress, oppression, etc.
2. a means or thing that relieves pain, distress, anxiety, etc.
3. money, food, or other help given to those in poverty or need.
4. something affording a pleasing change, as from monotony.
5. release from a post of duty, as by the arrival of a substitute or replacement.
6. the person or persons acting as replacement.
7. the rescue of a besieged town, fort, etc., from an attacking force.
8. the freeing of a closed space, as a tank or boiler, from more than a desirable amount of pressure or vacuum.

Ahh, a tax relief sounds different from a mere tax cut, doesn't it? The magnificent connotations attached to relief are too good not to alter our perception of a government policy. It brings a refreshing feeling to what could be a rather dull topic. It brings connotations associated with our bodily feelings to a political program. I am ready to bet that fewer people would have panic attacks if they were called an "unneeded adrenaline spike".

Connotations explode, and a new meaning arises. But didn't we all (especially the politicians amongst us) know that already? Of course, but now consider this:

Example 2. The Eliza effect. Joseph Weizenbaum, an MIT computer scientist, devised ages ago a curious program called ELIZA, which parodied a psychotherapist.

"ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine," Communications of the Association for Computing Machinery 9 (1966): 36-45.

What was astonishing about ELIZA is that it mostly worked by throwing back at people what they had said, in slightly modified form, yet people truly felt the thing resonated with their feelings. So there you have a (kind of) simple program, which holds no associations between child abuse and cruelty, spitting out statements like "I understand your feeling, and am sympathetic about it; please go on". These statements are loaded with meaning for people, but to ELIZA they were just as empty as an automated teller machine printing out "thank you for your business". The "ELIZA effect" is the tendency humans have to project meaning onto the machine.
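To see how little machinery it takes, here is a toy sketch of the "throw it back in slightly modified form" trick. It is not Weizenbaum's script: the three rules and the pronoun table below are invented for illustration, and the real ELIZA had a much richer set of keywords and transformations.

```python
import re

# Hypothetical pronoun swaps used to turn the speaker's words around.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your",
    "am": "are", "you": "I", "your": "my",
}

# Hypothetical rules: a pattern and a canned template. The last one is a catch-all.
RULES = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"i want (.*)", re.IGNORECASE),
     "What would it mean to you if you got {0}?"),
    (re.compile(r"(.*)"), "Please go on."),
]


def reflect(fragment: str) -> str:
    """Swap first- and second-person words, leaving everything else untouched."""
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())


def respond(utterance: str) -> str:
    """Return the template of the first matching rule, filled with reflected text."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.fullmatch(text)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."


print(respond("I feel nobody listens to me"))
print(respond("The weather is bad."))
```

The first reply reads as attentive, yet the program has done nothing beyond string matching and a pronoun swap; whatever sympathy is in it was supplied by the reader.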

Quite naturally, we have a bunch of modern versions, some of them not by Weizenbaum, but by Microsoft. With hindsight, and with the pride of not being taken in by a mere machine, here's a post worth quoting in full, from "De Bode Francisco Orelana" ('diz' obviously means 'says'):

ana carolina diz:
i want to leave
Encarta® Instant Answers diz:
Bye ana carolina, it was good chatting with you.
ana carolina diz:
i didn't leave yet!
Encarta® Instant Answers diz:
Well I'll be. So, is that a good thing?
ana carolina diz:
baaaaaaaad thing
Encarta® Instant Answers diz:
Oh? I'm sorry you're not feeling well.
ana carolina diz:
Encarta® Instant Answers diz:
Yes. That's right.
ana carolina diz:
may i call you dave?
Encarta® Instant Answers diz:
If you find a way to call me Dave, feel free to.
ana carolina diz:
hi dave
Encarta® Instant Answers diz:
Hi ana carolina. What's up?
ana carolina diz:
wanna marry me?
Encarta® Instant Answers diz:
Thanks, but I'll wait until I hear from people who already got to marry you.
ana carolina diz:
i'm single
Encarta® Instant Answers diz:
You are? Wow. Is it nice to be single?
ana carolina diz:
yes, but i wanna a rich jewish husband
Encarta® Instant Answers diz:
I leave it to humans to consider religious questions, I am only a robot.
ana carolina diz:
are you a sex machine?
Encarta® Instant Answers diz:

We should stop here with a brief announcement, as the ELIZA effect has quietly crept into economics.

Friday, January 5, 2007

Which mechanisms enable intuition?

In a previous post, we asked which mechanisms enable intuition in humans (and other animals).

When compared to humans, the Sphex does not seem to feel any discomfort and seems happy to be stuck in a rut. A human would most probably feel increasing tension when bugged by something annoying like that. The Sphex, by contrast, does not seem to have those funny feelings. How do these funny feelings (that something is wrong) work?

Perhaps it's time to point out experiments that shed some light on how our very subtle intuition is processed. Both concern unconscious learning.

Take, for instance, what I call the "Hefferline" experiment, for its first author:

Hefferline, RF, Keenan, B, & Harford, RA, (1959) "Escape and avoidance conditioning in human subjects without their observation of the response", SCIENCE 130: 1338-1339.

These guys were behaviorists studying conditioning; please leave your prejudices aside and consider this: they devised the following experimental setting:

(i) The desired conditioned response was a tiny thumb-twitch, so small as to be subliminal; that is, the subjects could control it, but only unconsciously;
(ii) This response was recorded by electrodes placed on the palmar base of the left thumb and on the medial edge of the left hand; other electrodes were also placed as dummies;
(iii) Subjects heard music through headphones, and there was an aversive noise that would be muted for 15 seconds whenever the response was obtained (or, if it was already muted at the time of the response, its return would be postponed for 15 more seconds).
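The contingency in (iii) can be sketched as a tiny simulation. This is my reading of the setup, not the authors' apparatus: I assume time moves in whole seconds, the noise starts switched on, and each twitch pushes the noise's return to 15 seconds after that twitch. The function name and parameters are hypothetical.

```python
def noisy_seconds(twitch_times, session_length, quiet_period=15):
    """Count the seconds of a session during which the aversive noise is heard.

    twitch_times: seconds at which the thumb-twitch was detected (assumed input).
    """
    twitches = set(twitch_times)
    quiet_until = 0  # assumption: the noise is on when the session starts
    noisy = 0
    for t in range(session_length):
        if t in twitches:
            # Escape (noise on) and avoidance (noise already off) are handled
            # the same way here: silence lasts until quiet_period after the twitch.
            quiet_until = t + quiet_period
        if t >= quiet_until:
            noisy += 1
    return noisy


print(noisy_seconds([], 60))                       # no twitches at all
print(noisy_seconds([0], 60))                      # a single twitch
print(noisy_seconds([0, 10, 20, 30, 40, 50], 60))  # a twitch every 10 seconds
```

A subject who twitches at least once every 15 seconds never hears the noise again, which is exactly the behavior the schedule rewards without ever announcing itself.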

There were four groups, each of which, of course, was told a different story. The group that's really interesting is group 1. These guys had recording and dummy electrodes. They were not informed that they had any control of the disturbing sound. What they thought was that the study was about "body tension of noise superimposed on music". Thus, they were simply told to "listen through and [oh I love this part] otherwise do nothing".

After some time in the experiment, Group 1 subjects started to develop the desired response. After the experiment was over, they found it amusing that the noise in the music had seemed to disappear. When told that they were, in fact, controlling the sound, and that their brains had made it go away, they simply could not believe any of that scientist crap.

At all times, our brains are capturing tiny cues from the environment and trying to do the best they can with that info, whether we like it or not. We don't even know or feel that that's going on. But it is, and this mechanism is the basis of human intuition.

The other experiment is generally called Damasio's Iowa gambling task:

A group of subjects is given 4 sets of cards, and $2000 in fake money. The subjects are told to "maximize profits, show me the money!", or something to that effect, and... with each card they take from a given set, they get a reward (some money) and sometimes a penalty (a fine to pay).

Two sets of cards are "good", and two are "bad", in the sense that the good ones provide something like $50 and low penalties, while the bad ones provide much more ($200) but sometimes carry steep penalties, such as $750 deducted from your total. The point is, when do people learn to separate the good decks from the bad ones?
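Why the "bad" decks are bad despite the bigger rewards is a matter of simple expected value. Here is a minimal sketch; the rewards and the $750 fine come from the description above, but the $50 fine on the good decks and both penalty probabilities are numbers I made up, so take the exact results as illustrative only.

```python
import random

# Rewards and the $750 fine follow the text; everything else is assumed.
DECKS = {
    "good": {"reward": 50, "penalty": 50, "penalty_prob": 0.5},
    "bad": {"reward": 200, "penalty": 750, "penalty_prob": 0.4},
}


def expected_value(deck) -> float:
    """Average net gain per card: the reward minus the probability-weighted fine."""
    return deck["reward"] - deck["penalty_prob"] * deck["penalty"]


def play(deck, n_cards: int, rng: random.Random) -> int:
    """Draw n_cards from one deck and return the net amount won or lost."""
    total = 0
    for _ in range(n_cards):
        total += deck["reward"]
        if rng.random() < deck["penalty_prob"]:
            total -= deck["penalty"]
    return total


rng = random.Random(0)  # seeded so that reruns give the same draws
for name, deck in DECKS.items():
    print(name, expected_value(deck), play(deck, 10, rng))
```

Under these assumed odds a good card is worth a small gain on average and a bad card a sizeable loss, but over the first ten cards a lucky run on a bad deck can easily look like the better deal, which is what makes the early stress response so interesting.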

Well, after 80 cards have been taken, people have an explanation, a conscious report of which cards they prefer (though it is more of a process, let's call this "event C").

After some 50 cards, people report having learned nothing yet, but they are strongly biased towards the good decks, with no awareness of the bias ("event B").

And here is the kicker: after a mere 10 cards, people cannot report anything concerning the good and the bad decks, but... they already generate a stress response to the bad cards ("event A").

Intuition concerns 'immediate knowledge'. The Iowa gambling task gives us clues as to how many iterations it takes to go from a skin stress response, to an unconscious bias, to a conscious report of what one is doing.

An amazing opportunity would be to provide a FARG architecture for this one. If anyone has the courage to do a thesis on this one, I'm totally in for it.

If you'd like, you can play with the Iowa gambling task here: SHOW ME THE MONEY!