Alternative take: there isn't that much low-hanging fruit there.
Hear me out.
"To the person who only has a hammer, everything looks like a nail."
The data in front of you is the data you want to analyze, but it doesn't follow that it's the data you ought to analyze. I predict that most of the data you look at will turn up nothing: the null hypothesis will not be rejected in the vast majority of cases.
I think we -- machine learning learners -- have a fantasy that the signal is lurking in there, and that if we just employ that one very clever technique it will emerge. Sure, random forests failed, neural nets failed, and the SVR failed, but if I reduce the step size, plug the output of the SVR into the net, and change the kernel...
Let me give an example: suppose you want to predict the movement of the stock market using the movement of the stars. Adding more information on the stars, and throwing more techniques at it, may feel like progress, but it isn't.
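To make that concrete, here's a minimal sketch (the simulated data and the scikit-learn model are my assumptions, not part of the example): regress "stock returns" on "star positions" that are independent by construction, and compare the in-sample and out-of-sample fit.

```python
# Sketch of the astrology example: both series are generated independently here,
# so any "signal" a model finds in-sample is noise by construction.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_days = 2000
star_features = rng.normal(size=(n_days, 50))   # stand-in for "movement of the stars"
stock_returns = rng.normal(size=n_days)         # stand-in for market returns, independent of the stars

X_train, X_test, y_train, y_test = train_test_split(
    star_features, stock_returns, test_size=0.5, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
print("in-sample R^2:     ", model.score(X_train, y_train))  # mildly positive: the model memorizes noise
print("out-of-sample R^2: ", model.score(X_test, y_test))    # ~0 or negative: no real signal to extract
```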
Conversely, even a single piece of simple information that requires minimal analysis (this company's sales are way up and no one but you knows it) would be very useful for making that prediction.
The first data set is rich, but simply doesn't have the required signal. The second is simple, but has the required signal. The data that is widely available is unlikely to have unextracted signal left in it.
I've been selling good data in a particular industry for three years. In this industry at least, the so-called "low-hanging fruit" only seems low-hanging until you realize that the people who could benefit most from the data are the ones who are mentally lazy and least likely to adopt it. Selling data has the same problems as selling any other product, and may be even harder, because you need to 1) acquire the data and 2) build tools that reliably solve difficult problems using huge amounts of noisy information...
Isn't there utility in accepting the null hypothesis? Knowing that there is no signal in the data is almost as valuable as knowing the opposite, i.e., it tells you where not to look for information.
I think your example is really justifying a "machine learner" that has some domain expertise and doesn't blindly apply algorithms to some array of numbers.
I think his argument is that some null hypotheses can be accepted out of hand, and that people are wasting time and effort obtaining evidence that, if they had better priors, would be multiplied by 0.0000000000001 to end up with an insignificant posterior. That's what the astrology example indicates.
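A quick sketch of that arithmetic in odds form (the 1e-13 prior and the 100:1 likelihood ratio are illustrative numbers, not anything from the comment):

```python
# Even strong-looking evidence (a 100:1 likelihood ratio in favour of a signal)
# barely moves a prior that is effectively zero.
prior_prob = 1e-13                               # assumed prior probability that this dataset holds a tradable signal
prior_odds = prior_prob / (1 - prior_prob)
likelihood_ratio = 100                           # how much likelier the evidence is under "signal" than "no signal"
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)
print(posterior_prob)                            # ~1e-11: still an insignificant posterior
```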
The effort to evaluate the null hypothesis can be costly. In the competitive environment found in most hedge funds, how do you justify allocating resources to accepting the null hypothesis?
As in, if you worked at a data acquisition desk, and spent a quarter churning through terabytes of null hypothesis data, what's your attribution to the fund's performance?
Accepting the null hypothesis has utility only if you have some reason to believe it would not be accepted.
Accepting it per se has no particular value. You could generate several random datasets, and accept/reject the null hypothesis between them ad infinitum.
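A quick sketch of that point (pure-noise data and a plain t-test are my assumptions here): compare pairs of random datasets and count how often the null hypothesis gets rejected anyway.

```python
# Test pairs of samples drawn from the same distribution, so the null hypothesis
# is true by construction, and count the spurious rejections.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
rejections = 0
n_trials = 10_000

for _ in range(n_trials):
    a = rng.normal(size=100)
    b = rng.normal(size=100)
    _, p = ttest_ind(a, b)
    if p < 0.05:
        rejections += 1

print(rejections / n_trials)   # ~0.05: rejections happen at the false-positive rate, nothing more
```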
To put it another way, it's only interesting if it's surprising.
Bingo. You nailed it. I work in finance. Developed markets have efficient, highly liquid stock markets, and the reality is that there are a lot of people competing for the same profits. When there are that many players, if there's a profit to be had from a dataset you can buy from a vendor, chances are one of your many competitors already bought it and found it. This is why we now say don't try to beat the market: you likely can't, and mostly you just need to get lucky by holding the right position when an unforeseen event occurs. There are too many variables at play that we just don't understand. Most firms are buying these datasets to stay relevant, but they really make no difference in their actual investing strategies.
This is where you might use something like a genetic algorithm to learn which data to use for a particular prediction. Good AI won't use all the data; it will trim it down to the signal.
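A hedged sketch of what that could look like: a toy genetic algorithm that evolves binary feature masks and scores them by cross-validated fit. The toy data, the ridge model, and all the GA settings are assumptions for illustration, not a claim about what "good AI" actually does.

```python
# Genetic-algorithm feature selection: evolve binary masks over the columns and
# keep the ones whose cross-validated score is highest.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# toy data: only the first 3 of 20 columns actually carry signal
X = rng.normal(size=(500, 20))
y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500)

def fitness(mask):
    if not mask.any():
        return -np.inf
    return cross_val_score(Ridge(), X[:, mask], y, cv=5).mean()

pop = rng.random((30, X.shape[1])) < 0.5              # initial population of random feature masks
for generation in range(40):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]           # keep the 10 best masks
    children = []
    while len(children) < len(pop):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random(X.shape[1]) < 0.5, a, b)  # uniform crossover
        child ^= rng.random(X.shape[1]) < 0.02                # occasional bit-flip mutation
        children.append(child)
    pop = np.array(children)

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected columns:", np.flatnonzero(best))      # ideally {0, 1, 2}: the informative ones
```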
I read a neat criticism of AI techniques. The author pointed out that humans can pick out a strong signal as well as or better than AI, and that humans could also pick out signal from an array of weak sources. AI would identify that case with fewer weak signals required, but it was hard to trust because it was sometimes wrong.
I wish I could remember the source. I’m sure it was an article here a few years ago. I want to say it was medical diagnosis based on charts.
Anyway, the point was that there is a very narrow valley where AI is useful beyond an expert. That valley is expensive to explore. And there might not be anything there.