Sunday, May 23, 2010

Probability 2 - Boy or Girl paradox

I continue to be amazed by the fact that apparently simple probability problems are often so difficult to resolve. It is particularly surprising in view of the fact that so much science seems to depend on probability these days: the particle physicists' results are small probabilistic anomalies in incredibly expensive experiments, quantum physicists say that nature is inherently probabilistic, the medicine we take is determined by 'random' trials, evolution apparently occurs by 'random' mutation, etc, etc. (I don't necessarily concur with all this, by the way.)

I have been thinking more about this fact since the blog post of Peter Cameron, which derived from another post by Alex Bellos. The particular problem they consider is this: if someone tells you that they have two children at least one of whom is a boy born on Tuesday, what is the probability that the second child is a boy?

Actually, the simpler version of this problem in which there is no mention of the day of birth (possibly introduced by Martin Gardner, who died on 22nd May) has been discussed and disputed in short and in long, in circles high and low, under the name "The Boy or Girl Paradox".

In short, my view is that the answer to this question is 1/2 (under the simplifying assumptions clearly intended; that is, that the birth rates of boys and girls are equal, etc).
However a closely related problem which I will explain would yield the answer 1/3, the answer preferred by mathematicians.

I think this simple problem benefits from the distinction between states and actions, the states being the quality of being a girl or a boy, and the actions being the declarations made. In the real world problem you need to consider the probability of the action as well as the probability of being in a state.

Let's now consider the problem in detail: the simplifying assumptions mentioned above mean that, given a family with two children, the following four cases are equally likely: BoyBoy, BoyGirl, GirlBoy, GirlGirl. In the case BoyBoy the likelihood of a declaration that there is a boy is 1; in the case GirlGirl the likelihood of a declaration that there is a girl is 1; but in the other two cases there is a half chance that the declaration is 'boy' and a half chance that the declaration is 'girl'. Now let's look at the total weight of a declaration 'boy': it is 1 + 1/2 + 1/2 = 2. The weight which corresponds to the case BoyBoy is 1. Hence the proportion of 'boy' declarations which correspond to the state two boys is 1/2.

Now a slightly different problem: suppose I interview people with two children, and ask if they have a boy in the family; if they answer yes, what is the probability that the second child is a boy?
Again the simplifying assumptions mentioned above mean that, given a family with two children, the following four cases are equally likely: BoyBoy, BoyGirl, GirlBoy, GirlGirl. In the case BoyBoy the likelihood that they respond 'yes' is 1; in the case GirlGirl the likelihood that they respond 'yes' is 0; in both the other two cases the likelihood of 'yes' is 1, not 1/2. Now let's look at the total weight of a response 'yes': it is 1 + 1 + 1 = 3. The weight which corresponds to the case BoyBoy is 1. Hence the proportion of 'yes' answers which correspond to the state two boys is 1/3.
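The two calculations, weights of declarations versus weights of 'yes' answers, can be checked by direct enumeration. A minimal sketch (the function and variable names are mine, chosen for the illustration):

```python
from fractions import Fraction

# Families with two children: each child is 'B' or 'G',
# all four combinations equally likely.
families = [('B', 'B'), ('B', 'G'), ('G', 'B'), ('G', 'G')]

# Protocol 1: a parent spontaneously declares the sex of one child.
# The weight of declaring 'boy' is 1 for BB, 1/2 for each mixed family, 0 for GG.
def declaration_weight(family):
    return Fraction(family.count('B'), 2)

total_declared = sum(declaration_weight(f) for f in families)   # 1 + 1/2 + 1/2
p_two_boys_given_declared_boy = declaration_weight(('B', 'B')) / total_declared
print(p_two_boys_given_declared_boy)

# Protocol 2: we ask "do you have a boy?" and the answer is 'yes'.
# The weight of 'yes' is 1 for any family containing a boy.
def yes_weight(family):
    return Fraction(1 if 'B' in family else 0)

total_yes = sum(yes_weight(f) for f in families)                # 1 + 1 + 1
p_two_boys_given_yes = yes_weight(('B', 'B')) / total_yes
print(p_two_boys_given_yes)
```

The first protocol prints 1/2 and the second 1/3: the state space is the same, only the weights of the actions differ.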

If there is also mention of the day of birth there are at least three interpretations:
1) a person declares that they have two children at least one a boy born on Tuesday;
2) when asked if they have a boy, the response is 'yes, and born on Tuesday';
3) when asked if they have a boy born on Tuesday, the response is 'yes'.

The probability of the second child being a boy is 1/2 in the first case, 1/3 in the second, and 13/27 in the third.
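The third interpretation can be checked by enumerating all 196 equally likely (sex, day) pairs for the two children; the encoding below (days as 0..6, with Tuesday arbitrarily as 2) is just my illustration:

```python
from fractions import Fraction
from itertools import product

# Each child is a (sex, day) pair: 14 equally likely possibilities,
# and the two children are independent, giving 14 * 14 = 196 families.
children = list(product('BG', range(7)))
families = list(product(children, children))

TUESDAY = 2

def tuesday_boy(child):
    return child == ('B', TUESDAY)

# Interpretation 3: asked "do you have a boy born on Tuesday?", answer 'yes'.
yes = [f for f in families if any(tuesday_boy(c) for c in f)]
both_boys = [f for f in yes if all(c[0] == 'B' for c in f)]

answer = Fraction(len(both_boys), len(yes))
print(len(yes), len(both_boys), answer)   # 27 families qualify, 13 have two boys
```

The enumeration yields 13/27, the mathematicians' answer for the third protocol.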

In this post I have tried (exaggeratedly) to argue in as simple language as possible, because a simple problem should require only the appropriate distinctions, not big machinery. However I have earlier proposed (with Sabadini and de Francesco Albasini) a mathematical context in which many of the perplexities are seen to arise from considering (normalized) probabilities rather than weights of actions. The precise mathematical point is that normalization does not behave well with respect to sequential operations (which include abstraction). Another simple perplexing example where abstraction does not work well with normalization is Simpson's paradox, the mathematical origin of which is the fact that in general a/b + c/d ≠ (a+c)/(b+d).

An extreme example of Simpson's paradox is the following:
Consider treatments A,B and diseases X,Y.
Treatment A with disease X on one person cures the person (100% cure rate - best possible)
Treatment B with disease X on 99 people cures 98 (worse than 100%).
So A has a better success rate than B with disease X.

Treatment A with disease Y on 99 people cures 1 person (this is better than 0% cure rate)
Treatment B with disease Y on one person fails to cure (0% cure rate - worst possible).
So A is better than B with disease Y.

In both diseases X and Y the treatment A is better than B.

However overall treatment A cures 2 people in 100, while B cures 98 in 100.
So is B really worse than A?
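The arithmetic of this extreme example can be verified with exact fractions; a minimal sketch:

```python
from fractions import Fraction

# Disease X: A cures 1 of 1, B cures 98 of 99.
# Disease Y: A cures 1 of 99, B cures 0 of 1.
a_x, b_x = Fraction(1, 1), Fraction(98, 99)
a_y, b_y = Fraction(1, 99), Fraction(0, 1)

assert a_x > b_x and a_y > b_y        # A wins on each disease separately

# Pooled over both diseases:
a_all = Fraction(1 + 1, 1 + 99)       # A: 2 cured out of 100
b_all = Fraction(98 + 0, 99 + 1)      # B: 98 cured out of 100

assert a_all < b_all                  # yet B wins overall
print(a_all, b_all)
```

The reversal happens exactly because cure rates do not pool like fractions add: 1/1 and 1/99 pool to 2/100, not to their sum.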


Friday, May 21, 2010

Como and Varese

For those who followed the disputes I reported about the computer science course in Como, I have some news. Next year there will be no first year course in Informatica at the University of Insubria in Como.
Since the 7th of May 2010 the computer science staff, previously part of the Faculty of Science in Como, have been transferred to the Science Faculty in Varese. We are also now members of the Department of Informatica and Communication in Varese.

The established "hard" sciences Mathematics, Physics and Chemistry have won. All that remains is for them to find some students.


Synthetic DNA

I must admit to being alarmed by the reported creation by J Craig Venter's research institute of a reproducing cell with synthesized DNA.


Wednesday, May 19, 2010


There are severe disadvantages to living in a country where the language is not your mother tongue. However there are also advantages. I am constantly learning things about English from my exposure to Italian.

One example today: the newspapers are talking about a coprifuoco (literally, 'cover fire') in the battle in Bangkok. At first I couldn't imagine what that meant. Then I guessed the meaning without being able to recall the English word (this happens too frequently these days). Finally I realized that coprifuoco was curfew, with an instant insight into the origin of the English word.

From the Online Etymological Dictionary:
curfew early 14c., from Anglo-Fr. coeverfu (late 13c.), from O.Fr. covrefeu, lit. "cover fire," from couvre, imper. of couvrir "to cover" + feu "fire." The medieval practice of ringing a bell at fixed time in the evening as an order to bank the hearths and prepare for sleep. The original purpose was to prevent conflagrations from untended fires. The modern extended sense of "periodic restriction of movement" had evolved by 1800s.


Monday, May 17, 2010


I notice that John Baez has realized a danger of the new scientific order of blog posts, blog comments, labs, mailing lists, expository-style papers, preprints in arxiv, and traditional scientific journal articles.

From a comment on the May 15, 2010 post of n-category cafe (my italics)
"This Week’s Finds and my seminar notes are packed with hunches of varying caliber. Sometimes you need to read between the lines a bit to see them — for example, if I say something ‘should’ be true, it means I believe it’s true but haven’t proved it. And sometimes, I’ve said something is true even though I haven’t proved it. By now I realize this is a bad habit… thanks to the following story.
Once I went to a talk where somebody said that for any ring R there’s a one-object tricategory Alg(R) consisting of R-algebras, bimodules and bimodule morphisms. I said “Really? Do you know if anyone has ever written that up?” And the speaker said “Sure! It’s in This Week’s Finds!” Which galled me, because while I knew it was true, I’d never seen a proof written up — and I realized then that by claiming it was true in This Week’s Finds, I’d reduced the chances of ever seeing a proof."
Another remark:
"The last page of my slides was supposed to summarize only the ideas that will surprise and thrill the audience, so I get a standing ovation."

I find the current state of n-category theory a confusing mixture of conjecture, unfinished definitions, unfinished proofs, but lots of exposition.

I am old-fashioned and believe that to maintain integrity we must insist that the science exists in the scientific journal articles. Credit should be assigned on the basis of published scientific articles.
I don't know what it means to have expository articles on a vaguely developed field.

Perhaps we will end up with Bogdanovs.


Thursday, May 06, 2010


I made the suggestion recently that the Kolmogorov presentation of probability was too abstract, and that this was the cause of elementary confusions like those common in the Monty Hall problem.

I must admit to being quite unsure of the foundations of probability theory. The fact that simply stated problems are easy to make errors about hints to me that the gap between theory and practice is too wide. I make some tentative remarks below, which are made more formal in a small paper we have just written.

1. First, I think probability should be about explicitly described systems. (Peter Cameron in his blog post describes two possible systems ("protocols") which might be behind a particular problem, yielding different results.) This would imply that one needs a notion of system. In this context probabilities are weightings of actions in a particular state of a system. Such a weighting represents information about the cause of actions. It is a kind of primitive dynamical information.

2. The Kolmogorov view is that independence is defined in terms of the probabilities of events. This is the attitude of observing behaviours rather than considering systems. For systems there is a clear operation of composition in parallel. Actions may appear to be independent behaviourally without actually being parallel.

3. The fact that the sum of all probabilities must be 1 is further evidence of a behavioural view. A system may have big reasons for making a choice, or small reasons, in both cases with, say, a half-and-half probability. This doesn't affect observed behaviours. However in composing systems the large reasons in one system may overwhelm the small reasons in another.

To make this more concrete: suppose I am in a situation where I am deciding whether to eat an apple or a pear (with equal interest), and suddenly I perceive that a car is driving straight towards me, so that with equal preference I must decide whether to run to the left or to the right to avoid an accident. This is a state in a composition of two systems. It is clear that the weight behind choosing an apple or a pear is overwhelmed by the weight behind jumping left or right.

At a system level weight is more fundamental than probability, which arises as a normalization of weight.

Update: I see now that Peter Cameron has come to a conclusion (which I quote in full, but with my italics; it refers to a particular calculation which you can see at his blog):
"Mathematically the conditional probability that someone has two boys, given that one of their two children is a boy born on Tuesday, is 13/27. If we started defining the probability of some event in terms of the algorithm that led to the statement being made, all our textbooks would need to be rewritten from the ground up. I think the best way to proceed is to say, we do know how to calculate probability (and how to interpret it), but this requires careful thought, and sometimes our intuition lets us down."

Update: I made a too brief comment on Peter Cameron's blog, namely that
"I agree with John Faben [another commenter] that the calculation of probability requires information about the algorithm involved."

It is somewhat difficult in blogs to follow which comments are later referred to, but it seems that Peter replied to my comment with the following:
"Sorry to have to disagree…

Any calculation in probability can be done unambiguously by the rules (based on Kolmogorov’s axioms) provided we specify carefully what is the “sample space” and what is the probability measure on the sample space. If you start bringing in other factors like the algorithm used to generate a statement then all you are doing is changing the measure. That is why I said in my example that I am a covert Bayesian. I happen to think that the probability measure that applies in a given situation depends on everything I know about the situation (which may include information about the algorithm used to generate some statement)."

He wrote more which I do not reproduce here.

I have been trying to understand why it is difficult to apply the existing theory (Kolmogorov) to apparently simple real world problems, a fact admitted by Peter. (There is a rumour that even Erdős had difficulty with Monty Hall.)

My suggested answer to this is that there is too great a gap between real world problems and the theory, and perhaps there could be a more detailed theory in between reducing the gap.

The more detailed theory would be prior to the construction of the probability space.
It would consist in making the real world problem more precise, by describing a mathematically formal system (algorithm, or protocol). This is not beyond the capacity of mathematics to do.

Peter seems to agree when he says "everything I know about the situation", which is another way of saying the precise system under consideration.

There is a sleight of hand in the statement "If you start bringing in other factors like the algorithm used to generate a statement then all you are doing is changing the measure". The algorithm is needed to determine the measure; different algorithms determine different measures.


Tuesday, May 04, 2010

The fragility of computer programs

For some time I have used mainly the Opera browser, which has about 2 percent of users on the web. I used it with Gmail as a POP server. About a month ago the mail server stopped working. I couldn't do anything with Gmail on Opera, so I started using Chrome more frequently, and had almost decided to dump Opera completely.
I did try to fix Opera in this period, searching for similar experiences on the web. Just today I found that a single flag in preferences:advanced:security:security_protocols, namely TLS 1.1 not being enabled, was causing the problem.

I have returned to using Opera as my main browser.


Monday, May 03, 2010

Bits and pieces

I have been quiet for a while as we have just had a month long visit from one of my sons and a granddaughter. Very enjoyable, interesting and distracting from normal activities. I want to make some brief comments which maybe I will enlarge upon later.

1. Why is probability so unintuitive? Peter Cameron has an interesting post about probability. I am particularly amused by his admission that after five years of teaching probability (something I have never done) he had a recurring nightmare that a student might ask him "What is probability?" and that he would be unable to answer.
He implicitly raises the question I began with, by describing a very simple problem the usual solution to which he finds unconvincing. Recently a whole book has been written about a famous simple question, the Monty Hall Problem.
I have always had similar difficulties about the meaning of probability theory, but recently in our work on algebras of processes I have come to the conclusion that one of the main reasons for the difficulty is that probability theory, and in particular the set-theoretical formulation of Kolmogorov, is too abstract. Probabilities should not be associated with events, but instead with transitions or processes. Conditional probability then arises from communication between processes. (Peter starts to describe protocols. In fact, he shows that different protocols for his problem yield different results.) I think the mathematical model needs to be richer. The lack of intuition is related to the fact that problems are presented in too abstract a context: ask an Italian about Monty Hall and he will imagine any kind of imbroglio. We wrote a little paper in this direction, and are currently writing another.

2. How can people write programs without having a precise semantics for the language? I have been working for some years on the theory of programming languages. I have the view that the design of a programming language should involve first having an idea of an algebra of systems, and that the language should arise from the notion of free algebra. This doesn't seem to be an idea shared by many. (As a crude example of the idea, the integers form a ring, and elements of a free ring Z[x,y,..] are programs for making calculations with numbers.)
It seems however that programmers don't need to know mathematics and can write programs that are more or less correct without really knowing what the language means. (I too write programs like that.) How is this possible?
To make a comparison, students have the greatest difficulty making proofs. Very few succeed.
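As a sketch of the free-ring idea in point 2 (the representation below is my own illustration, not a fixed design): an element of Z[x, y] is a term tree built from variables, integer constants and the ring operations, and "running" the program means evaluating the term in the ring of integers.

```python
from dataclasses import dataclass

# Terms of the free ring Z[x, y, ...] over a set of variables.
@dataclass
class Var:
    name: str

@dataclass
class Const:
    value: int

@dataclass
class Add:
    left: object
    right: object

@dataclass
class Mul:
    left: object
    right: object

def run(term, env):
    """Interpret a free-ring element in Z, given integer values for the variables."""
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, Const):
        return term.value
    if isinstance(term, Add):
        return run(term.left, env) + run(term.right, env)
    if isinstance(term, Mul):
        return run(term.left, env) * run(term.right, env)
    raise TypeError(f"not a term: {term!r}")

# The element x*x + 3*y is a little program for a calculation with numbers:
p = Add(Mul(Var('x'), Var('x')), Mul(Const(3), Var('y')))
print(run(p, {'x': 2, 'y': 5}))   # 2*2 + 3*5 = 19
```

The algebra of systems comes first (here, the ring of integers); the language is then the free algebra over it, and evaluation is the unique homomorphism determined by the values of the variables.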

3. Absurdities. I think there are some mad ideas being presented in science these days. Just two examples: multiverses, and Max Tegmark's classification of these; the aggressive insistence of David Deutsch on the Everett interpretation of quantum theory. More worrying is that both of these are being supported at the highest level of the scientific establishment. The Edge also seems to me to be full of dubious ideas.
I read in an article by Deutsch that people like me who express doubts about many worlds are like those who doubted the motion of the earth, on common sense grounds, at the time of Galileo. Looks like I am in bad company.
