If you do the constraint propagation right, the number of trials and errors will be very small (and it will be only one trial and no errors for the easy ones).
I'd be very interested to see what an algorithm that solves any valid sudoku puzzle looks like. (apart from pathologic solutions like querying a database of all valid 9x9 sudoku combinations).
Edit: using the symmetries described in the same Wikipedia article, the size of the database could be reduced significantly to roughly 5*10^9, which is quite manageable.
The code in the post is based on depth-first search, which basically is a trial and error approach. A well implemented and effective one thanks to constraint propagation, but it is still trial and error. And there are cases, in which backtracking (trial and error) is necessary - at least one of them is documented in the post (the puzzle "hard1" that took 188 seconds).
I'd be very interested to see what an algorithm that solves any valid sudoku puzzle looks like. (apart from pathologic solutions like querying a database of all valid 9x9 sudoku combinations).