Thursday, January 31, 2008

For Portuguese speakers

This blog is the funniest thing ever. If you go there, leave a comment for her.

Don't know about you damn Darwinists...

I don't know about you monkeyboy Darwinists, but I'm sure as hell submitting papers to The Journal of Creation!


Misclassifications of Adenine and Guanine: Serious fraud of scientific evidence?

Dr. A. Linhares
Reverend, Universal Life Church

Abstract. In this article we present conclusive data showing that DNA base pairings of nucleotides--most especially some subtle effects involving pyrimidines--explain both (i) why the overall length of a DNA double helix determines the strength of the association between the two strands of DNA and (ii) how Eve was encouraged by a snake to let Adam eat of The Forbidden Fruit. Moreover, by showing that Adenine and Guanine have been intentionally misnamed with an explicit agenda against a clearer comprehension of the events surrounding His death and resurrection (a mischievous fraud for which its proponents shall repent), we demonstrate that such entities should have been characterized with the letters E and V. With our more modern terminology, we have been able to uncover 100 Billion instances of the naturally occurring "EVE" sequence in the genome of the species in which no one is free from sin. Needless to mention, it is no coincidence that 100 Billion is exactly the estimated number of galaxies in the observable universe, and even if supermassive black holes are found at the center of galaxies--a speculative, yet potentially possible finding--that would not in any statistically significant sense bear any effect on the data concerning the fact that He called Abraham and his progeny to be the means for saving all of humanity, or related phenomena.


So long, you fools! --Alex

Wednesday, January 30, 2008

Open markets are oh-so-beautiful!

That's the reason communism failed; nothing beats open markets and their incentive systems. Full article from Slate.

Semantic web dreams and a strategy for Yahoo!



From a cognitive science perspective, the semantic web is still years and years and years away--at least a full decade. What I mean is the set of complex mechanisms involved in creating meaning, not the usual ridiculous hyperbole out there.

Consider, for example, the fact that when the TAM flight crashed in SP last July, the news pages were full of contextual ads urging readers to "Fly TAM".

Or maybe take a look at these contextual ads (hat tip to Digg):



The best ideas on this issue are Bob French's--though he doesn't address 'contextual ads' in particular, but rather the whole problem of meaning extraction from text databases that semantic web engineers keep falling into. This paper is one of the funniest, and most intelligent, things I've ever read.

The "long tail" will always be algorithmic. The "fat head" will always be mainstream. The "middle ground" will be social. This naturally suggest a strategy for Yahoo! (which TechCrunch says is failing--and it just might be).

Yahoo! isn't mainstream media, nor algorithmic (like Google). From this point of view, I think what they should do becomes clear: They should strive to dominate the middle space.

Yahoo! should go beyond del.icio.us and acquire digg. It should subordinate all of its strategy to having all content, including ads, brought up by social voting. If an ad is buried, let it go; just like every piece of content. In the short-run, most likely, only ads from Apple or Ron Paul will appear; in the long-run, only good, socially targeted content should arise.
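
To make the mechanics concrete, here is a minimal sketch, in Python, of the rule being proposed--every item, ad or not, lives or dies by votes alone. (The function and threshold are invented for illustration; this is obviously not anything Yahoo! runs.)

    def front_page(items, votes, bury_threshold=-5, top_n=10):
        # items: list of content ids, ads included; votes: id -> net vote count.
        # No editorial override, no paid placement: buried items simply vanish.
        surviving = [i for i in items if votes.get(i, 0) > bury_threshold]
        return sorted(surviving, key=lambda i: votes.get(i, 0), reverse=True)[:top_n]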

Meanwhile, algorithmic contextual ads will keep suggesting to stone people to death, to find debt, and to burn babies.

Monday, January 28, 2008

Some thoughts on Temperature

Intrinsic to FARG architectures is the notion of temperature. From John Rehling's thesis:

"All of the three modules and the Letter Spirit loop incorporate, each in its own way, temperature – a means of modulating the randomness of behavior – or something very much like it. The Letter Spirit loop, for instance, decreases the amount that style influences the Drafter's attempts at rendering a particular letter category when it has already undergone several attempts at drafting without the Examiner having recognized them as members of the intended letter category. It also turns up the amount of randomness in drafting as a run goes on, under the premise that novel letterforms are more useful at that point than good ones that had been drafted previously. In all of its guises, the use of temperature employs the principle encapsulated by the phrase "desperate times call for desperate measures"—a principle seen at work in many kinds of human behavior." (Rehling p.346-347)

While I do love the notion, I am not satisfied with current implementations.

We know the mind is massively distributed. Imagine, in the future, Fluid Concept architectures running in thousands of processors. It is with this in mind that your blogger asks you to follow his wild speculations.

Today, temperature is implemented as a central variable that the whole system depends upon. As an appetizer, ask yourself: Is this sustainable for a massively parallel future? Do our brains have a central floating-point number to assess the state of entropy in any given situation? That's highly unlikely, I think.

Now for the main course: how can we change this? How can we turn a single, central variable into massively distributed stuff? I'm going to speculate here, but I think that this can be implemented fairly easily, and perhaps by the end of 2008 we can have some running examples.

How does temperature work? It both measures the state of coherence (or, to flip it around, the state of disorder) and feeds that information back, so that the system may "behave" itself or go "balooney".

By behaving itself, I mean, of course, that the system should remain concentrated on the most promising ideas. It should ignore most possibilities. If temperature is low, that means that a coherent global view is coming into shape, and we don't want to mess it up by destroying work done--which inevitably happens if a "new" idea comes up.

By going balooney, I mean that the system has not yet figured out what to do. It has not yet built STM structures that represent, in any cogent way, what the situation is like. Any node in the slipnet is welcome to become as agitated as it wants.
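
To make the current scheme concrete, here is a minimal sketch of a centralized temperature, in Python--every name is invented for illustration, and this is not actual Copycat code:

    import random

    class Workspace:
        """Short-term memory: the structures built so far."""
        def __init__(self):
            self.structures = []          # strengths in [0, 1]

        def coherence(self):
            # Crude measure of global order: average structure strength.
            if not self.structures:
                return 0.0
            return sum(self.structures) / len(self.structures)

    def temperature(workspace):
        # THE central number the whole system depends upon:
        # high when little coherent structure exists, low otherwise.
        return 1.0 - workspace.coherence()

    def choose_codelet(codelets, workspace):
        # Feedback: high temperature flattens urgencies (balooney behavior);
        # low temperature sharpens them (behave: stick to promising ideas).
        t = max(temperature(workspace), 0.01)
        weights = [urgency ** (1.0 / t) for urgency, _action in codelets]
        return random.choices(codelets, weights=weights)[0]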

Now, how can that be massively distributed?

Here's a hunch.

What if, during a run, the system is able to track codelets that are destroying previously built structure? If the nodes that are creating and destroying stuff can be tracked by STM chunks, then what about creating temporary "inhibitor" links between the slipnet nodes responsible for the creation and destruction of the structure in question?

With these mechanisms in place, I think inhibition between competing worldviews could be made to increase gradually.

After some time, with high enough inhibition, we would have "clusters" of self-reinforcing nodes, and these clusters should inhibit other competing clusters. The emergent behavior should in principle be just like the impact of a lowered temperature. "New", unwanted ideas should rarely creep into the party.

If the system gets stuck on a snag, then it could try moving to another cluster. But how would it find out that it is stuck, without having temperature (a centralized variable) to tell it that "this is probably not the best idea concerning our current situation"?

The structures built, since they would be tracking (some) activity in the slipnet, might become more and more dissatisfied with the top-down "orders" they are receiving. At some point, they might send a "shut up" message, bringing the activation of such a node down to zero. This zeroed activation might also spread to the cluster's associated nodes, or it might even flow directly to the inhibited nodes. Perhaps the inhibited nodes (which were dying out) could be instantly activated by such a trigger. This could make the system jump rapidly between competing worldviews, rather like our perception of Necker cubes or the vases/faces illusion.
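
Here is that hunch as a rough Python sketch--pure speculation in code form, with all names invented, not a working FARG subsystem:

    from collections import defaultdict

    class SlipnetNode:
        def __init__(self, name):
            self.name = name
            self.activation = 0.5
            self.inhibits = defaultdict(float)   # rival node -> link weight

    def note_destruction(destroyer, victim, step=0.1):
        # A codelet spawned under `destroyer` just tore down structure
        # built under `victim`: grow a temporary inhibitory link both ways.
        destroyer.inhibits[victim] += step
        victim.inhibits[destroyer] += step

    def spread_inhibition(nodes, rate=0.05):
        # Each node is pushed down by its active rivals. Note that there
        # is no global temperature anywhere: order emerges from local links.
        for node in nodes:
            pressure = sum(w * rival.activation
                           for rival, w in node.inhibits.items())
            node.activation = max(0.0, node.activation - rate * pressure)

    def flip_worldview(stuck_cluster, nodes):
        # The "shut up" message: a dissatisfied structure silences its own
        # cluster, and the nodes that cluster was suppressing spring back,
        # rather like a Necker cube reversal.
        for node in stuck_cluster:
            node.activation = 0.0
        for node in nodes:
            if node not in stuck_cluster and any(rival in stuck_cluster
                                                 for rival in node.inhibits):
                node.activation = 1.0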

Of course, as Christian says, language compiles everything. It may be very tricky to get this working. I do, however, think that it would (coupled with other new ideas) enable massive parallelism (on the tech side), and, more than that, reflect the distributed nature of our own information-processing (on the scientific side). Perhaps this may be one of the right moves in turning that Rubik's cube.

(more than ever, comments very welcome)

The Mystical "AS-A" relationship

It's fair to say that object-oriented languages presuppose metaphysical realism; that is, a world composed of objects, properties, and relations, independent of any understanding. OO-goodness instantly provides the IS-A relationship (by inheritance) and the HAS-A relationship (an object with other objects as properties). It does not, however, provide the much-needed "AS-A" relationship. This is, to me, the single greatest reason for the failure of AI and Cognitive Science.
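
A toy illustration in Python of what OO languages do and do not give us--inheritance and composition are native, while each "AS-A" view must be hand-rolled, one class per perspective, all written in advance (the iPod example anticipates the list below; the length figure is made up):

    class Battery:
        pass

    class Device:
        pass

    class IPod(Device):                  # IS-A: fixed at design time
        def __init__(self):
            self.battery = Battery()     # HAS-A: composition

    # There is no native AS-A. To see an iPod *as a* unit of measure,
    # the best the language offers is a hand-written adapter:

    class IPodAsUnitOfMeasure:
        """One ad hoc 'view'; every new way of seeing needs another class."""
        def __init__(self, ipod, length_cm=10.4):    # made-up length
            self.length_cm = length_cm

        def how_many_to_span(self, distance_cm):
            return distance_cm / self.length_cm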

There are so many different examples of seeing one thing in terms of others; just use language and they will come up by the thousands. To take one example I've mentioned earlier:

Take the iPod mp3 player--it is a hit product that "saved Apple Computer". On the day the first iPod was announced, people did not have the category "iPod" in their minds, so they had to resort to previous categories... making all kinds of analogies you can imagine. In fact, I decided to collect some analogies on the iPod, to use someday in a paper in Marketing. Here are some [taken from real people debating on the web]:

THE iPod can be seen...
"as a mobile phone" [future prospect]
"as a canvas for expression"
"as a bootable drive"
"as a security threat"
[video iPod] "as a marketing tool"
"as a data repository"
"as a unit of measure" [believe it or not: in terms of money, height, and so on; e.g., how many iPods to circle the Earth?]
"as key for security"
"as a platform" [such as the Solaris platform]
"as a legislative force"
"as an Ebook"
"as a presentation device"
"as a business tool"
"as a phone phreaking device"
"as a portable DVD player"
"as a learning tool"

Crazy, right? But that's perception at its core... understanding something in terms of other, better-understood things.

I think we have nailed this one--though I can't disclose the solution until we have a full paper. I think we really nailed it. But a blog is not an outlet for this discussion--and it may have implications not only for cognitive models, but for programming languages as well. We are building a first implementation, and I hope to post a link to the papers as soon as they are born.

And the best part? It's not as complex as it seems. It's simple. Sometimes you have to go through an immense amount of complexity in order to find something that's crystal-clear and dead simple. It's simple, and it is beautiful.

The Starcat project: a thought

The Starcat project is an intriguing project being developed at San Diego State University. It's hard to distill exactly what they are doing, or how (most of the papers are short--though two are not--and they span numerous different application areas), but one idea strikes me as fantastic.

They have decoupled the slipnet (long-term memory), the workspace (short-term memory), and the coderack. They are using observer patterns, and there is a mediator between these subsystems that takes care of their interactions.
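
As a sketch of the pattern--my own reading, in Python, not Starcat's actual code--a mediator routes events, and no subsystem ever references another directly:

    class Mediator:
        """Routes events between subsystems that never see each other."""
        def __init__(self):
            self.observers = {}                  # event name -> callbacks

        def subscribe(self, event, callback):
            self.observers.setdefault(event, []).append(callback)

        def publish(self, event, payload):
            for callback in self.observers.get(event, []):
                callback(payload)

    class Slipnet:                               # long-term memory
        def __init__(self, mediator):
            mediator.subscribe("structure-built", self.on_structure_built)

        def on_structure_built(self, structure):
            pass   # adjust activations; knows nothing of Workspace internals

    class Workspace:                             # short-term memory
        def __init__(self, mediator):
            self.mediator = mediator

        def build(self, structure):
            self.mediator.publish("structure-built", structure)

    class Coderack:
        def __init__(self, mediator):
            mediator.subscribe("structure-built", self.on_structure_built)

        def on_structure_built(self, structure):
            pass   # post follow-up codelets

A change inside any one subsystem cannot break the others' code, and each can, in principle, run on its own processor, mediated by message passing.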

One thing that I have some objections to is the idea of having "codelet events carry[ing] the necessary instructions to drive the system" in each of its now-decoupled subsystems. I cannot argue vigorously against this yet, but I envision "general" codelets which will be applicable to any domain. Until we can get these general codelets running in a system, this stays a hunch. And even then, their idea may be complementary to ours (we just don't know yet). As strange as it may sound, I think that a fixed set of codelets should exist, whatever the domain. Of course, the accidental characteristics of each domain will still need to be programmed, but I feel that that responsibility won't be placed on the codelets.

And here's what I find fascinating: first, these subsystems can be implemented in parallel. Moreover, a change in the code of one of the subsystems should not introduce bugs (the cognitive model may go haywire, of course, but it should still work as a program). The subsystems are encapsulated, and you can study and manipulate them in isolation from each other. I find this idea fascinating, and kudos to their team!

Essence and accident

Quoting from The Success of Open Source, by Steven Weber, Harvard University Press, 2004, p. 57:

[Software engineer Fred Brooks once] wrote a paper entitled "No Silver Bullet: Essence and Accidents of Software Engineering".

Brooks uses Aristotelian language to separate two kinds of problems in software engineering. Essence is the difficulty inherent in the structure of the problem. Accident includes difficulties that in any particular setting go along with the production of software, or mistakes that happen but are not inherent to the nature of the task.

One of the goals of The Human Intuition Project is to build a robust framework on which to develop FARG architectures on a much larger and faster scale--or, to use the economist's favorite word, to increase productivity. Brooks's distinction is extremely important here. Each project has had its accidental characteristics: NUMBO deals with numbers and operations, Tabletop deals with objects on a dinner table, Phaeaco deals with Bongard problems, Copycat and Metacat deal with letter strings, Letter Spirit deals with font styles, Capyblanca deals with chess, and forthcoming projects handle music sequences, number sequences, and so forth.

All of these projects have a shared essence; a shared set of ideas employed to create the system. And all of them have accidental characteristics to deal with. If we are successful, then two things will happen: designers of new computational models will be able to concentrate attention on the accidental features of the problem; and, more importantly, we will be able to concentrate on the essential features of human cognition in separation from the accidental features of any particular domain. If we all agree on the same set of ideas, it should be a no-brainer to encapsulate them and keep them outside any particular domain and its accidental characteristics.
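
Here is a hypothetical Python sketch of what that encapsulation might look like--essence in a shared framework, accident in a thin domain subclass; all names are invented, and the real thing would of course be far richer:

    from abc import ABC, abstractmethod

    class Domain(ABC):
        """The accidental part: everything specific to one microdomain."""

        @abstractmethod
        def initial_chunks(self, problem):
            """Carve the raw input into the lowest-level chunks."""

        @abstractmethod
        def codelets(self):
            """Domain-specific perceptual agents."""

    class FargFramework:
        """The essential part, shared by NUMBO, Copycat, Capyblanca, ..."""

        def __init__(self, domain):
            self.domain = domain   # slipnet, coderack, temperature live here

        def run(self, problem):
            chunks = self.domain.initial_chunks(problem)
            # ... the shared main loop (codelets, temperature) goes here ...
            return chunks

    # A new model would then mean writing only a Domain subclass, e.g.:
    # FargFramework(NumboDomain()).run((95, [9, 11, 4]))   # hypothetical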

Monday, January 21, 2008

On the carving of waves

Legend has it that, during the first days of AI, Marvin Minsky was developing a machine to play ping-pong.  The greatest effort at that time, of course, was devoted to the "thinking" part--and if the story is true, then he assigned an undergraduate student to handle the "perception thing".


But of course, it didn't work.  Perception turned out to be far more difficult than it seems at first.  For us, perception is trivial; it is only when you start to deal with its mechanisms that the full extent of the problem arises.

Whenever we hear speech, we are carving waves of sound into discrete units (phonemes, then words, then phrases, etc.).  Whenever we read a webpage, or look outside the window, we are carving a massive bidimensional wave of colors into discrete objects.  It is hard to model the information-processing that takes place in our minds.  Roughly 60 years later, Microsoft, with all the cash in the world, still struggles with the problem.

But the carving of continuous entities into discrete ones is not the only scientific problem here.  Even discrete entities are chunked into ever higher levels--in myriads of different ways.  Consider, for instance, the copycat project, the letter spirit project, the tabletop project, or any other project from FARG. 

For example, in NUMBO, numbers are chunked by applying operations with bricks.  95=(9*11)-4.  In chess, related pieces are chunked into groups.  And even a letter is a chunk of different traces.  The objects we see are not "out there", outside of any understanding.  Objects are mental creations.  And these mental objects, these chunks, have an enormously recursive structure.  Some traces chunk into a letter, some letters chunk into a word, some words chunk into a sentence, and on and on it goes.  Meaning is created, somehow, during the process.

We are developing a theory of mental objects, a theory of chunks, which, we believe, will push FARG forward.  In this model, the crucial ingredients of chunks are relations (which create new chunks based on other chunks) and rules (which help in carving waves into the most basic chunks possible--probably corresponding to what happens in V1, an area of the brain related to vision).  We are still quite far from a working model, but here's a strategy to approach it: forget about the waves at the start, and deal with pre-built, low-level chunks.  Only after that is satisfactorily solved should rules be tackled.
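
As a tiny illustration of what such a theory must capture, here is a Python sketch (names invented) of recursive chunks, with relations building new chunks out of old ones:

    class Chunk:
        """A mental object: a primitive, or a grouping of other chunks."""
        def __init__(self, label, parts=(), relation=None):
            self.label = label         # e.g. "9", "99", "word: dog"
            self.parts = list(parts)   # the sub-chunks this one is built from
            self.relation = relation   # what binds the parts, e.g. "*", "-"

        def depth(self):
            # Recursion built in: traces -> letters -> words -> sentences.
            if not self.parts:
                return 0
            return 1 + max(part.depth() for part in self.parts)

    # NUMBO's 95 = (9*11) - 4 as a chunk of chunks:
    nine, eleven, four = Chunk("9"), Chunk("11"), Chunk("4")
    ninety_nine = Chunk("99", parts=[nine, eleven], relation="*")
    ninety_five = Chunk("95", parts=[ninety_nine, four], relation="-")
    assert ninety_five.depth() == 2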

If you think that that's an oversimplification--that dealing with discrete entities such as letters is a problem way beyond the carving of waves--look no further than the most unintentionally unfortunate company domain names ever.  Here are some:

1. A site called 'Who Represents', where you can find the name of the agent that represents a celebrity. Their domain name… wait for it… is
www.whorepresents.com

2. Experts Exchange, a knowledge base where programmers can exchange advice and views at
www.expertsexchange.com

3. Looking for a pen? Look no further than Pen Island at
www.penisland.net

4. Need a therapist? Try Therapist Finder at
www.therapistfinder.com

5. Then of course, there’s the Italian Power Generator company…
www.powergenitalia.com

6. And now, we have the Mole Station Native Nursery, based in New South Wales:

www.molestationnursery.com

Carving and chunking are deeply interrelated, both involved in the creation of mental objects.  Traditional computer science can't do the trick here.  Many traditional ideas must be reconsidered.  If Microsoft ever hires someone like Harry Foundalis, we'll know they're now serious about this thing.

OOPS. We've been buzzfeeded!

Here's what happens when Buzzfeed links to you.  Traffic chart from Google Analytics:


Sunday, January 13, 2008

Linhares, A., & Brum, P. (2007). Understanding our understanding of strategic scenarios. Cognitive Science, 31, 989-1007

Here's our new paper in Cognitive Science:

Alexandre Linhares & Paulo Brum, EBAPE/FGV, Rio de Janeiro, Brazil

There is a crucial debate concerning the nature of chess chunks: One current possibility states that chunks are built by encoding particular combinations of pieces-on-squares (POSs), and that chunks are formed mostly by "close" pieces (in a "Euclidean" sense). A complementary hypothesis is that chunks are encoded by abstract, semantic information. This article extends recent experiments and shows that chess players are able to perceive strong similarity between very different positions if the pieces retain the same abstract roles in both of them. This casts doubt on the idea that POS information is the key information encoded in chess chunks, and this article proposes, instead, that the key encoding involves the abstract roles that pieces (and sets of pieces) play--a theoretical standpoint in line with the research program in semantics that places analogy at the core of cognition.


The basic idea is this: We are showing, for the first time, that analogy-making pervades chess thinking, especially in the most abstract, middle-to-endgame play. If you'd like to take a look, feel free to drop us an email.

Tuesday, January 8, 2008

In search of productivity gains

It's January 2008, and we wish you all a great new year--but first and foremost we wish ourselves a great new year. We do have big plans for the year. One such goal is to improve productivity in developing FARG models by a large factor, perhaps a factor of 10. If Capyblanca took 5 years to develop, and it still has its rough edges, a major new rewrite based on the framework codebase should take (5 x 12) months / 10 = 6 months, which seems about right.

Note that there is only one problem here.

Like honest men, calm women, and fire-breathing dragons, that codebase does not exist.

But we have a vision for it, and the vision is beautiful. It's as elegant as Giselle on Armani. I think some words of explanation are needed here: Why is it beautiful? And why is that important?

It's closed for modification and open for extension. Which means two things. First, we'll have basic FARG functionality provided, for free, with no need to deal with its internals (slipnets, codelets, temperature, coderack--maybe even hedonic feedback regulation will make it to v. 1.0). Second, since our evolving system is based on the notion of connotations--with the idea of connotation explosion explored in full--all you should need to do to create a new FARG system is to develop the right connotations. For Copycat, that might be "Letter B", "First", "three", "Sameness group", "Successorship group", "opposite", and so on. Write some code for these, and the system should run, beautifully.
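
To give a feel for what "just develop the right connotations" might look like, here is a hypothetical Python sketch--none of this is our actual codebase, and every class name is merely illustrative:

    class Connotation:
        """Base class: the only thing a new FARG domain has to extend."""
        name = None

        def fits(self, chunk):
            raise NotImplementedError

    # Copycat-style connotations; writing these should be *all* of the
    # domain-specific work:

    class LetterB(Connotation):
        name = "letter B"
        def fits(self, chunk):
            return chunk == "b"

    class SamenessGroup(Connotation):
        name = "sameness group"
        def fits(self, chunk):
            return len(set(chunk)) == 1          # e.g. "aaa"

    class SuccessorshipGroup(Connotation):
        name = "successorship group"
        def fits(self, chunk):
            return all(ord(b) - ord(a) == 1
                       for a, b in zip(chunk, chunk[1:]))

    # The core engine (slipnet, coderack, temperature) never changes;
    # it would just receive a list of connotations:
    # engine = FargEngine([LetterB(), SamenessGroup(), SuccessorshipGroup()])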

If that ever happens, and the codebase is robust (i.e., it accepts new domains without having to rewrite its internals), then that codebase should be getting closer and closer to the truth--something like figure 49 of Harry's thesis. (Note, as an aside, that in the paper I've linked here, I think Harry underestimates Jeff Hawkins.)

But it is beautiful, also, because it employs recursion, symmetry, and polymorphism in new, very new, ways. I think even people interested in programming languages might be interested in seeing what we are coming up with. It has some repercussions in that arena. But that's only for when we have a good, solid working model. So our first task for this year is to delve into hedonic feedback and autoregulatory learning (i.e., autoprogramming), and a connotation-transfer-based form of programming. After we have some reports on these, we should be able to enjoy large gains in productivity.

And as any economist will tell you, a productivity gain is one of the greatest things you can ever achieve in your endeavors. Large productivity gains change everything. There were cars before Henry Ford, and there was such a thing as a "world-wide web" before Netscape. But those brought productivity gains, changing "the curve of the curve". This is what FARG needs. So here's some reasoning for optimism, in spite of the ugly economic downturns ahead:

  1. FARG changes everything; with it we have a scientific model, and an agenda, for understanding what understanding itself is all about.
  2. BUT FARG is hard to implement. Very hard to build. Nasty problems abound.
  3. A large productivity gain in FARG development could help bring massive change in this second point; and
  4. If that gain comes through a solid codebase, the tough constraints imposed on the codebase to reflect human cognition should, in the long run, provide a better perspective towards a full theory of human high-level cognition.
So there it is. Have a great 2008! As a general policy, it's always better to start a year with ambitious, optimistic dreams, and leave the screwing up for later, in the coming months.