You are underestimating the necessary design tension. If you expose hints, then you've constrained how your optimizer works because it has to work with those hints. (Oracle, for example, has been fighting this battle for ages.) Once you've built an optimizer, people come to depend on it so it is critical that you get it right.

They will need an optimizer eventually, and they know it. But I'm glad that they are not implementing one before they are ready.

Secondly, your selectivity estimation point can go either way, and on the whole I don't like it. I've personally experienced the situation where a database recomputes statistics, the CBO decides that a new query plan is needed for a common query, it makes a poor choice, and the first any human hears about it is when the site is down.

The problem is that, for the application, the risk profile of trying to be smart here is completely backwards. In general, as long as a query is not a bottleneck, I don't care about making it faster. Oh, you made it 2x faster? I didn't mind before and I'm unlikely to even know that you did so. But if you JUST ONCE switch to a bad plan on production without warning, your users WILL notice, they WILL care, and they WILL NOT be happy.

As a developer, I don't care that you make the right choice 95% of the time. I want you to make a predictable choice. Because if you're making a bad choice in development, I've got a window of opportunity to notice and do something about it where nobody cares. But if you randomly make changes on production, every single mistake counts.

Oh, but you say that this just means that you need stored query plans? I agree, and this is an example of why the behavior of the optimizer has to be thought through very carefully before you just throw something out there. Otherwise people come to depend on it, and then you realize that you have put up barriers to thinking of it the way you want to think of it.



This accords with our experience with large Oracle databases as well.

We've had a few panics caused by a DBA updating table statistics while trying to optimize some query. This would occasionally cause sudden, massive changes in the way that other, unrelated queries were executed; those queries would not finish and would sometimes bring down the database.

These experiences forced us to change our procedures and keep the test database data closely in sync with the production data, so we'd know how newly gathered statistics would affect queries. The database is large enough that keeping the test environment a full and recent copy of production is pretty painful. Oracle has since introduced features in 11g that allow pinning query plans, though we've yet to try these.


I wish that databases did "speculative" queries.

In the simplest form, you could say, "I have 2 reasonable plans, let's try A, and if it takes longer than time X, then start B in parallel and go with whatever finishes first."
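
For illustration, here is a minimal sketch of that racing behavior, assuming hypothetical plan_a and plan_b callables that each run the same query under a different plan:

    import concurrent.futures as cf

    def run_speculative(plan_a, plan_b, patience_s):
        # plan_a / plan_b are hypothetical zero-argument callables that each
        # execute the same query under a different plan and return the rows.
        pool = cf.ThreadPoolExecutor(max_workers=2)
        fut_a = pool.submit(plan_a)
        done, _ = cf.wait([fut_a], timeout=patience_s)
        if not done:
            # Plan A is taking too long: start plan B and take whichever
            # finishes first. The loser keeps running to completion, since a
            # future that has already started cannot be cancelled.
            fut_b = pool.submit(plan_b)
            done, _ = cf.wait([fut_a, fut_b], return_when=cf.FIRST_COMPLETED)
        result = next(iter(done)).result()
        pool.shutdown(wait=False)  # don't block on the losing plan
        return result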

You could ramp up the idea to handle changing query plans based on updated statistics by sending some fraction of queries to one plan, and some to another. Then keep stats on how that worked out for you.
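
A rough sketch of how that fraction-based routing might be tracked (hypothetical names, not any particular product's mechanism):

    import random

    class PlanRouter:
        # Hypothetical router: send a small fraction of a query's traffic to a
        # candidate plan, record how each plan performs on real traffic, and
        # only consider promoting the candidate once the evidence is in.
        def __init__(self, explore_fraction=0.05):
            self.explore_fraction = explore_fraction
            self.latencies_ms = {"current": [], "candidate": []}

        def choose(self):
            # Most queries stay on the proven plan; a small slice explores.
            return "candidate" if random.random() < self.explore_fraction else "current"

        def record(self, plan, elapsed_ms):
            self.latencies_ms[plan].append(elapsed_ms)

        def candidate_looks_better(self, min_samples=100):
            cur, cand = self.latencies_ms["current"], self.latencies_ms["candidate"]
            if len(cand) < min_samples or not cur:
                return False  # not enough evidence yet
            return sum(cand) / len(cand) < sum(cur) / len(cur)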

Basically never simply flip the switch in production to try a new, unproven query.

Incidentally, 12c advertises that it will actually collect statistics during query execution and, based on those, will validate the query plan. If that works, then this problem should get better. But of course that comes with overhead, and it is likely to be a complex piece of code, so you tell me whether you trust them to get it right right away.


I'm sure you'll appreciate the irony of this, but what you described is pretty much the exact mechanism of the MongoDB query optimizer [0].

[0] - http://docs.mongodb.org/manual/core/read-operations/#query-o...


A cost-based optimizer also needs a way to handle the combinatorial explosion of possible plans. For MongoDB, maybe they can be exhaustive or use a simple strategy. But for a SQL DBMS, the search space is way too huge, and you need a way to navigate it intelligently.
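
To put rough numbers on that explosion (standard counting, not specific to any one DBMS): with n tables to join, there are n! left-deep join orders and (2(n-1))!/(n-1)! binary join trees overall, before even considering access paths or join methods.

    from math import factorial

    def left_deep_orders(n):
        # One left-deep plan per permutation of the n tables.
        return factorial(n)

    def all_join_trees(n):
        # All binary join trees over n tables: n! leaf orderings times the
        # Catalan number of tree shapes, which simplifies to (2(n-1))!/(n-1)!.
        return factorial(2 * (n - 1)) // factorial(n - 1)

    for n in (5, 10, 15):
        print(n, left_deep_orders(n), all_join_trees(n))
    # At 10 tables that is already ~3.6 million left-deep orders and
    # ~1.8e10 join trees.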

This "try the plan out" idea has come up many times, and it's a reasonable idea, but it's nothing new and not a complete solution. You need a way to evaluate whether a plan has a reasonable chance of being the best one before the whole plan is even constructed, much less executed.

Combinatorial explosion is also one of the reasons that humans generally aren't great at coming up with non-trivial plans, either.

A good optimizer is a combination of many good ideas, so I don't mean to be discouraging; it's just that one idea is not the full story.

If you are worried about the risk of a changing plan, you are between a rock and a hard place. If you lock the plan down (or hard-code it in the application), and the data characteristics change, that could be disastrous. If you don't lock it down, that's a leap of faith that the optimizer will adapt gracefully.


I understand the issues you've brought up in your post, as well as the traditional methods of pruning the search space of query plans, such as restricting the search to left-deep join trees and using histograms to estimate selectivity.
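
As a rough illustration of the histogram part (a simplified, hypothetical structure, not any particular DBMS's implementation):

    class EquiWidthHistogram:
        # A single-column histogram: bucket boundaries plus a row count per
        # bucket, roughly what an optimizer keeps in its statistics.
        def __init__(self, boundaries, counts):
            self.boundaries = boundaries  # e.g. [0, 10, 20, 30, 40]
            self.counts = counts          # rows per bucket, len == len(boundaries) - 1
            self.total = sum(counts)

        def selectivity_lt(self, value):
            # Estimated fraction of rows with column < value, assuming rows
            # are spread uniformly within each bucket.
            rows = 0.0
            for i, count in enumerate(self.counts):
                lo, hi = self.boundaries[i], self.boundaries[i + 1]
                if value >= hi:
                    rows += count  # whole bucket qualifies
                elif value > lo:
                    rows += count * (value - lo) / (hi - lo)  # partial bucket
            return rows / self.total

    h = EquiWidthHistogram([0, 10, 20, 30, 40], [100, 400, 400, 100])
    print(h.selectivity_lt(25))  # ~0.7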

My top-level point was that there is no way humans will come up with the optimal query plan by hand, and, like you said, even if they do, the data will change.

As an aside, the reason Mongo can use such a simple but effective strategy is that it doesn't support joins, which makes calculating the optimal plan much, much easier. RethinkDB does support them, however, and as such a CBO is that much more important.


I am not disputing the value of a cost based optimizer.

I'm just pointing out that if you have an existing plan that currently seems to be working, the default should not be to switch it without testing that the switch didn't hurt anything. Instead, you need to use care.


The ironies are indeed rich. Thanks for that!


You can accomplish something pretty similar to this with Oracle plan management: http://docs.oracle.com/cd/B28359_01/server.111/b28274/optpla... You can set this up to pin plans, but have it let you know when it thinks it has a better plan to try out.


Yup. They have been forced into this kind of complexity because the simple, general solution that they think should work, and which does work the vast majority of the time, also routinely causes pants-on-fire emergencies.


DB/2 has done that for ages.


I frequently see apps with thousands of distinct query signatures. Having a developer manually choose indexes, join ordering, aggregation method (just to name a few) for every single query, and then select multiple plans because input parameters absolutely do result in scenarios where plans can be 10,000x off in performance: well, that's just untenable.

Not trying to claim that CBOs are the panacea here, but let's be realistic. Having developers manually plan every single query is not the right choice.


There is a logical fallacy in what you've said. It absolutely is true that there are times when input parameters can make a 10,000x difference in performance. It is also absolutely true that applications can have thousands of distinct query signatures.

What DOES NOT follow is that most of those distinct query signatures are very important to the application. A few are. But most are not. However, volunteering the wrong plan for one of them can take the application down.

> Having developers manually plan every single query is not the right choice.

I agree. However, the many applications successfully built on MySQL demonstrate that for a lot of people, crappy but consistent planning is good enough. At least then people can know where the problems are, and fix them.



