Sleeping beauty problem IV
Previous post in this series
I have been waiting to see some conclusions coming from discussions of the problem at the Stubborn Mule blog, however the discussion seems to have petered out without a final statement. I admit I did not read the details of Giulio Katis's mathematical model, even though it was very closely related to the (sequential part of ) the compositional model of Markov chains that de Francesco Albasini, Sabadini and I developed in "The Compositional Construction of Markov Processes, Applied Categorical Structures 2011", "The Compositional Construction of Markov Processes II, RAIRO - Theor. Inf. and Applic., 2011", and in Luisa's PhD thesis at the University of Insubria, Como, 2011.
However I remain convinced that the matter is simple and I would like to make a couple of mathematical comments.
A Markov chain is usually described as a particular type of stochastic process. However a simpler starting point is to consider a finite set of states, \{ 0,1,2,\cdots k \} say,and for each pair of states (i,j) a probability a_{i,j} of\; the\; transition from i to j, which probabilities form a k\times k matrix A whose row sums are all 1, a stochastic matrix.
It is then easy to see that A^n also has row sum 1, and we may identify the the (i,j) element as the probability\; of\; an\; n\; step\; transition from i to j. This is a definition, not a theorem in this context. (It may be made a theorem about a stochastic process derived from A, but the (rather difficult) construction of a stochastic process from a matrix involves essentially using the matrix product.)
Notice however that there are other row-sum-1 matrices related to A, for example 1/2(A + A^2). What I suggested in the first post in this series is that the (i,j)th element of this matrix might be identified with the probability\; of\; a\; one-{\bf or}-two\; step\; transition from i to j. Again this is a possible definition, not a theorem. Another example; the matrix (I+A+A^2+\cdots +A^n)/(n+1) is a row-sum-1 matrix, which may be identified as the probability of a transition from i to j in from 0 to n steps.
Suppose now we accept the sense of these definitions, let us look at the transition system in the first post in this series, and the corresponding 5\times 5 matrix A. To apply the mathematics, I claim that the sleeping beauty needs to make two calculations: (i) what is the probability that she has passed from state 0 to 1 in one step? and (ii) what is the probability that she has passed from state 0 to state 3 or state 4 in one-or-two steps? The first step uses the matrix A and results in 1/2. For the second the matrix 1/2(A+A^2) yields a probability of 1/4 for the probability of passing from 0 to 3 in one or two steps, and 1/4 for passing from 0 to 4 in one or two steps, and hence a probability of 1/2 for passing from 0 to 3 or 4 in one or two steps.
Her answer to the question of whether the coin was heads should be 1/2.
Notice that the matrix (I+A+A^2+\cdots +A^n)/(n+1) tends to a limit as n tends to \infty (see [2] for a proof), the limit being a row-sum-1 matrix whose (i,j)th entry might be defined to be the probability of passing from i to j in any number of steps.
[1] Charles M. Grinstead and J. Laurie Snell. Introduction to Probability: Second Revised Edition. American Mathematical Society, 1997.
[2] Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM: Society for Industrial and Applied Mathematics, 2001.
[3] E. Seneta, Non-negative Matrices and Markov Chains, Springer Series in Statistics, 1973,
I have been waiting to see some conclusions coming from discussions of the problem at the Stubborn Mule blog, however the discussion seems to have petered out without a final statement. I admit I did not read the details of Giulio Katis's mathematical model, even though it was very closely related to the (sequential part of ) the compositional model of Markov chains that de Francesco Albasini, Sabadini and I developed in "The Compositional Construction of Markov Processes, Applied Categorical Structures 2011", "The Compositional Construction of Markov Processes II, RAIRO - Theor. Inf. and Applic., 2011", and in Luisa's PhD thesis at the University of Insubria, Como, 2011.
However I remain convinced that the matter is simple and I would like to make a couple of mathematical comments.
A Markov chain is usually described as a particular type of stochastic process. However a simpler starting point is to consider a finite set of states, \{ 0,1,2,\cdots k \} say,and for each pair of states (i,j) a probability a_{i,j} of\; the\; transition from i to j, which probabilities form a k\times k matrix A whose row sums are all 1, a stochastic matrix.
It is then easy to see that A^n also has row sum 1, and we may identify the the (i,j) element as the probability\; of\; an\; n\; step\; transition from i to j. This is a definition, not a theorem in this context. (It may be made a theorem about a stochastic process derived from A, but the (rather difficult) construction of a stochastic process from a matrix involves essentially using the matrix product.)
Notice however that there are other row-sum-1 matrices related to A, for example 1/2(A + A^2). What I suggested in the first post in this series is that the (i,j)th element of this matrix might be identified with the probability\; of\; a\; one-{\bf or}-two\; step\; transition from i to j. Again this is a possible definition, not a theorem. Another example; the matrix (I+A+A^2+\cdots +A^n)/(n+1) is a row-sum-1 matrix, which may be identified as the probability of a transition from i to j in from 0 to n steps.
Suppose now we accept the sense of these definitions, let us look at the transition system in the first post in this series, and the corresponding 5\times 5 matrix A. To apply the mathematics, I claim that the sleeping beauty needs to make two calculations: (i) what is the probability that she has passed from state 0 to 1 in one step? and (ii) what is the probability that she has passed from state 0 to state 3 or state 4 in one-or-two steps? The first step uses the matrix A and results in 1/2. For the second the matrix 1/2(A+A^2) yields a probability of 1/4 for the probability of passing from 0 to 3 in one or two steps, and 1/4 for passing from 0 to 4 in one or two steps, and hence a probability of 1/2 for passing from 0 to 3 or 4 in one or two steps.
Her answer to the question of whether the coin was heads should be 1/2.
Notice that the matrix (I+A+A^2+\cdots +A^n)/(n+1) tends to a limit as n tends to \infty (see [2] for a proof), the limit being a row-sum-1 matrix whose (i,j)th entry might be defined to be the probability of passing from i to j in any number of steps.
[1] Charles M. Grinstead and J. Laurie Snell. Introduction to Probability: Second Revised Edition. American Mathematical Society, 1997.
[2] Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM: Society for Industrial and Applied Mathematics, 2001.
[3] E. Seneta, Non-negative Matrices and Markov Chains, Springer Series in Statistics, 1973,
Labels: probability
1 Comments:
Bob,
Why do you think this problem (and all the variants it has spawned) has attracted so much attention? Do you think a specific calculation will make sense of the confusion that surrounds it?
It would seem to me that instead certain explicit general concepts need to be identified. One of these I believe is the notion of restricting an experiment to only being in a subset of its states. You can find a definition of this (along with other relevant concepts) in my note http://www.stubbornmule.net/extras/Restrictingexperimentstobeingonlyincertainstates.pdf - which is a complete rewrite of the note of mine that was originally published on Stubborn Mule (parts of my thinking have been clarified since then). I would be interested in your comments.
One last question: if you really think this problem is so simple, why have you written (to date) four blogs on it?
Regards from Down Under,
Giulio
Post a Comment
<< Home