Wednesday, August 26, 2009

The problem with process "algebras", 11 May 2009

(from my departmental pages)

Process algebras like CCS and CSP do not begin with a precise semantics. I have written elsewhere in detail about CCS, but let's look at an example from CSP. In Chapter 2 of his classic book Communicating Sequential Processes (Prentice Hall), C.A.R. Hoare describes the dining philosophers problem (i) as a circular arrangement of philosophers and forks, and (ii) as the algebraic expression (Philosophers)||(Forks). These hint at geometrically quite different systems.
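For concreteness, the second presentation is, roughly in the notation of the book, a single parallel composite
\[\mathrm{PHILOS}=\mathrm{PHIL}_0\|\cdots\|\mathrm{PHIL}_4,\qquad \mathrm{FORKS}=\mathrm{FORK}_0\|\cdots\|\mathrm{FORK}_4,\qquad \mathrm{COLLEGE}=\mathrm{PHILOS}\|\mathrm{FORKS}.\]
Nothing in the expression itself records which philosopher sits beside which fork; that adjacency survives only in the choice of event names shared between the components.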

The truth is that the first picture is not represented in CSP, and that the real idea behind CSP is that there is a broadcast of signals between the philosophers and the forks.

To make the point more crudely: instead of the philosophers communicating with adjacent forks, they must scramble in the middle of the table to find their forks. These are clearly two quite different systems.

The situation becomes worse when the process algebras are used to investigate timing and probability. The natural way of assigning durations or probabilities to processes depends very much on the precise system we have in mind, as is clear from the two different pictures described above.

The matter is serious. A theory of processes has much to add, for example, to probability theory. We must abandon these "algebras" based on broadcast!

The idea of producing a calculus before having a precise model goes back to the lambda calculus of Alonzo Church, which was invented around 1928 but did not have a model until Dana Scott produced one on 14 November 1969.


Graphical languages, 27 April 2009

These are some remarks I put on my department page on 27 April 2009, after talking (and writing) to Peter Selinger. I see he has just put a version of his survey of graphical languages on the arXiv.

In recent times the use of graphical languages (or string diagrams) for proving algebraic facts in monoidal categories has become common.

In the following I may seem to be belabouring a simple point but I don't know how to be clear without repeating the matter several times. I have a mathematical suggestion at the end.

I think it is important to distinguish heuristic arguments from proofs. I have used geometric heuristics in algebra ever since my thesis in 1970, where I used pasting diagrams in 2-categories.

However, proofs traditionally consist of discrete arguments using axioms. I don't think one can be said to have a proof if it is not clear how to write down a discrete argument.

To be more precise: if the proof consists in saying that two finite graphs are isomorphic, this is clearly checkable by discrete steps. If, however, the proof consists in saying that there is a continuous map, the continuous map must be explicitly described before you have a proof.

For this reason I don't believe that people actually use the theorems of Street and Joyal - they never check the conditions of the theorems.

This is not a criticism of their theorems - but I believe that people are really using other theorems, which may be easily deducible from Joyal-Street, but which involve no topology, only finite graphs (or variations).

My point is: I think it could be valuable to understand what people really are using.

I have a partial suggestion.

Consider a graph G. Freely adding a symmetric monoidal category structure to G, together with a symmetric separable Frobenius algebra structure on each object, yields the full subcategory of Cospan(FiniteGraphs/G) whose objects are the discrete graphs (Rosebrugh, Sabadini, Walters, CT04). The free category on G is then a certain subcategory: this seems an absurdly complicated construction, but it explains why the arrows in the free category are graphs labelled in G.

The point is that instead of a graph G we could have considered a "tensor graph" G, whose edges have domains and codomains which are words in the objects, and performed the same free construction. This leads to Cospan(FiniteTensorGraphs/G), which I guess contains many free monoidal constructions (not all!). For those free constructions it does contain, it yields proofs that two expressions are equal which amount to checking isomorphisms of finite tensor graphs, instead of sequences of algebraic deductions.
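To indicate the kind of discrete checking involved, here is a minimal sketch in Python, for cospans of finite sets only (that is, of discrete graphs; the labelled and tensor cases add edge data but no new ideas). The representation and names are my own, not from the CT04 paper: composition is computed by a pushout, a quotient of the disjoint union of the two apexes, and equality of composites is decided by a finite normal-form comparison.

# Cospans of finite sets: a cospan A -f-> N <-g- B is a triple (f, g, n),
# where f and g are lists of integers in range(n) and n is the apex size.

from itertools import chain

def compose(c1, c2):
    """Compose A -> N1 <- B with B -> N2 <- C via the pushout of N1 <- B -> N2."""
    f1, g1, n1 = c1
    f2, g2, n2 = c2
    assert len(g1) == len(f2), "middle boundaries must agree"
    # Union-find on the disjoint union N1 + N2 (points of N2 shifted by n1).
    parent = list(range(n1 + n2))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Glue g1(b) in N1 to f2(b) in N2, for each point b of the middle boundary.
    for b in range(len(g1)):
        parent[find(g1[b])] = find(n1 + f2[b])
    # Renumber the equivalence classes to get the pushout apex.
    classes = {}
    def label(x):
        return classes.setdefault(find(x), len(classes))
    f = [label(x) for x in f1]          # A -> pushout
    g = [label(n1 + x) for x in g2]     # C -> pushout
    for x in range(n1 + n2):            # also count classes not hit by f or g
        label(x)
    return (f, g, len(classes))

def identity(n):
    return (list(range(n)), list(range(n)), n)

def normal_form(c):
    """Relabel apex points by first appearance along f then g; two cospans
    of finite sets are isomorphic iff their normal forms are equal."""
    f, g, n = c
    relabel = {}
    for x in chain(f, g):
        relabel.setdefault(x, len(relabel))
    return ([relabel[x] for x in f], [relabel[x] for x in g], n)

# Frobenius-style generators: multiplication 2 -> 1, comultiplication 1 -> 2.
mu = ([0, 0], [0], 1)
delta = ([0], [0, 0], 1)

lhs = compose(mu, delta)                      # the composite 2 -> 2
rhs = compose(lhs, identity(2))               # the same composite, built differently
assert normal_form(lhs) == normal_form(rhs)   # a finite, mechanical check

The assertion at the end is exactly the sort of "proof by checking an isomorphism of finite graphs" referred to above.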


Sunday, August 09, 2009

Mathematical formulae in blogs

I just discovered how to put LaTeX into my blog:


\[\int_{-\infty}^{\infty} e^{-x^2}dx.\]


To see how, go here.

Update: I see from the comment below that I can use XY-pic.
\[\usepackage[all]{xy}\color{Red}\xymatrix@C+2em@R+2em{
Q \ar@/_10pt/[ddr]_{q_1} \ar@/^10pt/[drr]^{q_2} \ar@{-->}[dr]|*+<3pt,3pt>{\scriptstyle u} & & \\
& P \ar[d]_(0.4){p_1} \ar[r]^(0.4){p_2} & Y \ar[d]^{g}\\
& X \ar[r]_{f} & Z
}
\]


Sunday, August 02, 2009

Chaitin's Constant

(with Nicoletta Sabadini).

I have been reading about Chaitin's work, beginning with this document.

Various comments as to the significance of the work seem to me rather exaggerated.
Consider, for example, the following regarding Chaitin's constant:
"$\Omega$ = concentrated creativity? Each bit of $\Omega$ = one bit of creativity? Can human intellectual progress be measured by the number of bits of $\Omega$ that we know, or are currently capable of knowing, as a function of time?"

My main objection is that the various concepts discussed depend on the computing language involved, but that reference to this dependence is quickly omitted.

For example, the definition of a random finite sequence (due also to Kolmogorov, and perhaps Solomonoff) does not allow one to decide whether any particular sequence is random. Dropping the language dependence from the definition is imprecise.
("Definition of Randomness R1: A random finite binary string is one that cannot be compressed into a program smaller than itself, that is, that is not the unique output of a program without any input, a program whose size in bits is smaller than the size in bits of its output.")
Given a language in which the shortest legal program has one billion bits, any sequence of less than one billion bits is incompressible, and hence random.

Now let's look at Chaitin's constant $\Omega$.
The omission of the language dependence from $\Omega$ is equally imprecise.

What is Chaitin's constant? The definition is a little complicated, which I think adds to its fascination. A very clear description is given in his 1975 paper.

First, Chaitin's constant is not a constant but depends on a particular interpreter M of a particular Turing-complete computing language L in which programs and data are bit strings: the interpreter acts on bit strings consisting of a program and data concatenated, and, if it terminates, outputs a data bitstring.

There is now an important assumption: Chaitin considers a class of programs P which compute just the partial recursive functions on bitstrings whose domains are prefix-free sets of bitstrings. He shows that there is a universal such machine M: it is defined on a prefix-free set PD of program-data bitstrings pd, and calculates the value of the program p on the data d. Then Chaitin's "constant" for this language and interpreter is the number
\[\Omega_{L,M}=\sum_{pd\in PD}2^{-\mathrm{length}(pd)}.\]
The prefix-freeness tells you (this is Kraft's inequality) that this series converges to a real number $\Omega_{L,M} < 1$, Chaitin's constant.

Chaitin shows that there exist such interpreters M of Turing-complete languages L with the prefix-free property in which the program bitstrings are of the form $1^k0$.

For the above kind of interpreter, and a language for which all programs with at most one million and one bits diverge on any input, Chaitin's constant has its first million bits $0$. If instead the first million programs all halt on empty input, then the first million bits of Chaitin's constant are $1$.

Let's now look at a simpler analogue $K$ of Chaitin's constant.
The languages L I have in mind are those Turing-complete languages with programs encoded (via Gödel numbering, for example) as a natural number written in unary and terminated by a $0$; data is also written in unary terminated by a $0$. An interpreter M first checks that the string has the form just described, and cycles infinitely if not. Then the interpreter decodes the program number as a program, using the zero to detect the end of the program, and applies it to the data number. If the program number does not correspond to a legitimate program, the interpreter cycles infinitely. The set of all program-data strings is prefix-free, and hence so is the set of those on which the interpreter halts.
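As a sketch, such an interpreter might look as follows in Python (the names are mine, and the Gödel decoding decode_program is deliberately left as a stub, since any standard numbering of programs would do):

def interpret(bits):
    """The interpreter M on a bit string: expects the form 1^s 0 1^t 0
    (program number s, data number t) and diverges on any other input."""
    parts = bits.split("0")
    if len(parts) != 3 or parts[2] != "" or set(bits) - {"0", "1"}:
        while True:                    # malformed input: cycle infinitely
            pass
    s, t = len(parts[0]), len(parts[1])
    program = decode_program(s)        # hypothetical Goedel decoding, a stub
    if program is None:                # s is not a legal program: cycle
        while True:
            pass
    return program(t)                  # may itself diverge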

The strings on which M halts are all of the form
\[(1^s)0(1^t)0=11111...101111...10;\]
that is, s 1's followed by a zero, and then t 1's followed by a zero. We can immediately see in this case that the constant $K$ (defined in the same way as $\Omega$ but for this interpreter M) satisfies $K<1$: the number of sequences of this form of length $k$ is $k-1$, so that
\[K\le\sum_{k=2}^{\infty}(k-1)2^{-k}=1,\]
and the inequality is strict because the interpreter diverges on some of these strings.
Let's look a bit more closely at the numbers $K$ which occur in our class of examples. The contribution to $K_{L,M}$ of program-data strings of length $>n$ is easy to estimate: there are $k-1$ strings of this form of length $k$, each contributing at most $2^{-k}$, and
\[\sum_{k=n+1}^{\infty}(k-1)2^{-k}=(n+1)/2^{n}.\]
So, for example, the program-data strings of length $>1000$ contribute at most $1001/2^{1000} < 1/2^{990}$ to the constant $K$. This means that the first 990 digits of the constant are determined by the halting or not of the interpreter on program-data strings of length at most 1000. But the coding of programs may be chosen so that all programs of length at most 1000 halt on any input; or alternatively all diverge on any input. This means that with an appropriate choice of L, M the number $K_{L,M}$ can have first 990 digits 1; for a different choice it will have first 990 digits 0; in fact any sequence of 990 bits can occur. There is nothing special about 990: there are constants $K_{L,M}$ in our class which have first n digits 0 (or, if you wish, first n digits 1) for any fixed n.
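As a small check, the tail identity above can be verified numerically (a sketch; exact rational arithmetic avoids any floating-point doubt):

from fractions import Fraction

def tail(n, terms=500):
    """Partial sum of sum_{k=n+1}^{infinity} (k-1) * 2^(-k)."""
    return sum(Fraction(k - 1, 2 ** k) for k in range(n + 1, n + 1 + terms))

for n in (1, 5, 10, 100):
    exact = Fraction(n + 1, 2 ** n)
    # The partial sums approach (n+1)/2^n from below.
    assert 0 <= exact - tail(n) < Fraction(1, 2 ** (n + 400))
    print(n, float(exact))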

A much simpler constant. Consider now a language L and interpreter M, not prefix-free, for which the program and data are encoded as a single number, using Gödel numbering plus the Cantor coding of pairs of numbers as single numbers. So the strings we are considering now are just strings of 1's. The interpreter first decodes the string into two numbers using Cantor, then decodes the first number into a program using Gödel (checking whether the program is syntactically correct, and cycling otherwise). Finally the interpreter applies the program to the second number, sometimes halting and producing a string, sometimes diverging.
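For concreteness, the Cantor coding and its inverse are given by standard formulas:

from math import isqrt

def cantor_pair(x, y):
    """A bijection N x N -> N."""
    return (x + y) * (x + y + 1) // 2 + y

def cantor_unpair(n):
    """The inverse of cantor_pair."""
    w = (isqrt(8 * n + 1) - 1) // 2    # the largest w with w(w+1)/2 <= n
    y = n - w * (w + 1) // 2
    return w - y, y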

Define the Turing constant $T_{L,M}$ to be the sum, over all natural numbers $n$ for which the interpreter halts on the string of $n$ 1's, of $2^{-n}$; this is clearly a real number $<1$. Notice that each $n$ contributes $2^{-n}$ at most once, so the $n$-th binary digit of $T_{L,M}$ records precisely whether the interpreter halts on the string of $n$ 1's. If instead of unary we had allowed program-data strings in binary we would have found the problem that many strings have the same length; but this could also be avoided, by having the interpreter cycle infinitely on every string containing a $0$.
Knowing the Turing constant would solve many problems of mathematics - it contains the information of which theorems are provable in a formal system (program the search for deductions); it resolves Goldbach's conjecture (program a search for counterexamples - if the program terminates Goldbach is false, if not Goldbach is true). Etc., etc. These are some of the advertised properties of Chaitin's constant.
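For instance, here is a sketch (my own illustration) of the Goldbach searcher: it halts if and only if the conjecture is false, so a single bit of $T$, at the index coding this program paired with, say, data $0$, settles the conjecture.

def is_prime(m):
    """Trial division; enough for a sketch."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def goldbach_search():
    """Halts (returning a counterexample) iff Goldbach's conjecture is false."""
    n = 4
    while True:
        if not any(is_prime(p) and is_prime(n - p) for p in range(2, n - 1)):
            return n               # an even number that is not a sum of two primes
        n += 2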
