What skills do we need to learn to get into such a position, apart from statistics and programming? How much effort would go into learning them?
If you had most of these things I think you would have a good shot at a compbio position:
- statistics, probability, and especially probabilistic inference
- *nix/GNU tools
- multiple scripting languages (Ruby, Python, Perl, Bash)
- at least one data-oriented language (R, Octave)
- understanding of molecular biology (read Molecular Biology of the Cell)
- applying machine learning tools to new problems
- understanding the major high-throughput biological technologies and the kinds of data they produce, along with the current tools used for processing the data
You could pick up all of that in a year of intense self-study, or less if you already have some of those skills.
This is something I would really love to do. Where do you work (in academia, I presume)?
I have the programming background, and a bit of the bio background... but I am weak on statistics. How much statistics and probability theory would I need (beyond a basic first-year college level)?
It's hard to say exactly, but if you can work through all the problems in Barber's Bayesian Reasoning and Machine Learning, along with those in a standard 'frequentist' stats text, you'd be well placed to get started.
It's unfortunate that biological statisticians have hijacked the term 'computational biology'. There's still a lot of computer science to be done in the area, particularly in genome assembly, with new sequencing technologies appearing every few years.
Certainly new algorithms and data structures are going to be crucial (e.g. Bowtie), but statistical analysis and scoring, even in a heuristic way, will always be an essential component, because the work is mostly about evaluating hypotheses from evidence.
I should have added data structures and algorithms to my list. De novo assembly, alignment, phylogeny, and pretty much all sequence work rely heavily on advances in maths and comp sci.