## Best guess BG

The "Best guess" method works out, for each cell, the probability of each candidate being the solution. We start by finding the possible permutations for the cell's intersecting row and column. From the permutations we count how many times each candidate can occur in the cell. The probability for a candidate is the ratio of its occurrences to the total occurrences of all candidates for the cell. See Figure 5.

Figure 5. A Best guess example: the program has suggested that candidate 7 is the best guess for a cell and hence that candidates 3,4,5,6 can be removed.
```
In Figure 5 the permutations for the column are:

[8, 5, 7],
[9, 4, 7],
[9, 5, 6]

Only these permutations sum to a total of 20.

For the row they are:

[5, 1, 6, 3],
[5, 6, 1, 3],
[7, 1, 3, 4],
[7, 1, 4, 3],
[7, 3, 1, 4],
[7, 4, 1, 3]

(Take the program's word for it!)
```
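As a rough illustration of how such permutations might be found, here is a minimal Python sketch. It is not the program's actual code: the function name `run_permutations` and the per-cell candidate lists are assumptions, chosen only so that the result matches the column permutations listed above.

```python
from itertools import product


def run_permutations(cell_candidates, total):
    """Return every way of placing one candidate in each cell so that
    the digits are all different and add up to the clue total."""
    return [
        list(combo)
        for combo in product(*cell_candidates)
        if len(set(combo)) == len(combo) and sum(combo) == total
    ]


# Hypothetical candidate lists for the three cells of the column
# (assumed for illustration; the real ones are in the figure).
column_candidates = [[8, 9], [4, 5], [3, 4, 5, 6, 7]]
column_perms = run_permutations(column_candidates, 20)
print(column_perms)  # [[8, 5, 7], [9, 4, 7], [9, 5, 6]]
```

Brute force over the candidate lists is fine here because a run has at most nine cells and the candidate lists are usually short.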

For the column counts we are interested in the third element of each permutation, i.e. 7, 7, 6, and for the row counts in the first element, i.e. 5, 5, 7, 7, 7, 7. These are the positions the cell occupies in its column and row.
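The counting step could be sketched as follows. The helper name `position_counts` is hypothetical; the permutations are the ones listed above, and the position indexes (third in the column, first in the row) are taken from the example.

```python
from collections import Counter

column_perms = [[8, 5, 7], [9, 4, 7], [9, 5, 6]]
row_perms = [
    [5, 1, 6, 3], [5, 6, 1, 3], [7, 1, 3, 4],
    [7, 1, 4, 3], [7, 3, 1, 4], [7, 4, 1, 3],
]


def position_counts(perms, index):
    """Count how often each digit 1..9 appears at `index` in the permutations."""
    counts = Counter(perm[index] for perm in perms)
    # A Counter returns 0 for digits that never appear.
    return [counts[digit] for digit in range(1, 10)]


counts_down = position_counts(column_perms, 2)   # third cell of the column
counts_across = position_counts(row_perms, 0)    # first cell of the row
print(counts_across)  # [0, 0, 0, 0, 2, 0, 4, 0, 0]
print(counts_down)    # [0, 0, 0, 0, 0, 1, 2, 0, 0]
```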

It is important to realise that for a candidate to be the solution for a cell it must occur in that cell in both the row permutations and the column permutations: if it does not appear in the row permutations it cannot be the solution for the cell, and likewise for the column permutations. The calculation enforces this naturally: we multiply the row and column counts for each candidate, so a count of zero in either gives a product of zero.

```
candidates      1  2  3  4  5  6  7  8  9
counts across  [0, 0, 0, 0, 2, 0, 4, 0, 0]
counts down    [0, 0, 0, 0, 0, 1, 2, 0, 0]
count products [0, 0, 0, 0, 0, 0, 8, 0, 0]
total           8
highest probability = 8/8 (1.0) for candidate 7.

```
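The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch of the calculation as described here, not the program's own code; the function name `best_guess` and its return shape are assumptions.

```python
def best_guess(counts_across, counts_down):
    """Multiply row and column counts per candidate and return
    (best candidate, its probability, list of count products)."""
    products = [a * d for a, d in zip(counts_across, counts_down)]
    total = sum(products)
    if total == 0:
        # No candidate occurs in both the row and the column permutations.
        return None
    best_index = max(range(len(products)), key=lambda i: products[i])
    return best_index + 1, products[best_index] / total, products


counts_across = [0, 0, 0, 0, 2, 0, 4, 0, 0]
counts_down = [0, 0, 0, 0, 0, 1, 2, 0, 0]
candidate, probability, products = best_guess(counts_across, counts_down)
print(products)                # [0, 0, 0, 0, 0, 0, 8, 0, 0]
print(candidate, probability)  # 7 1.0
```

Candidates 5 and 6 each appear in only one of the two runs, so their products are zero and they drop out, leaving 7 with the whole of the total.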

The Best guess hint method is only applied in two cases: A) if the user clicks on the "BG" button in the Toolbar; B) if the user clicks on the "?" button in the Toolbar and no other hint can be found. It is a last resort, and rightly so: even though it finds the most probable result for the whole of the remaining grid, it can still be wrong. In a brief analysis, guesses with probabilities as high as 75% were found to be wrong.

A little reflection on the assumptions behind the probability calculation reveals why. The probabilities do not depend only on the intersecting row and column; they also depend on all the other intersections or, more obviously, on all the other cells. The row and column used in the calculation are only a crude approximation to that influence.

Given this, it could be argued that the method should not be included. If you feel this way, don't use it. None of the built-in puzzles require it, but for some of the ones from the Guardian a single "Best guess" has got me through an impasse and allowed me to complete the puzzle. And, as seen in the above example, we can get probabilities of 100% and, given the nature of the calculation, all such cases are correct.