I'm not the greatest Musk fan but IMHO his approach to charge those who benefits...

anigbrowl · on March 10, 2023

Academics don't resell the data to others. In fact, their existing agreements with Twitter require their published datasets (for reproducibility) to be anonymized precisely to ensure they don't become a commercial goldmine.

Given that most of the article is about the pricing tiers for academic use, based on marketing communications to universities, your comment seems strangely indifferent to the context of this news. These proposed costs are unaffordable precisely because academics are not running a business around the data. If the article were about enterprise data sales, your point would make sense.

mrtksn · on March 10, 2023

Academic work has all kinds of costs, why should data be free of charge?

optionalsquid · on March 10, 2023

As you say, academics are pretty used to paying for access to data, services, material, etc., but $42k per month for limited access to Twitter sounds more like a "fuck off" price than anything else.

viscanti · on March 10, 2023

How many researchers can and will pay $42k per month for access? What's the market size here? Is this anything more than a drop in the bucket for Twitter?

mrtksn · on March 11, 2023

I don't know, but it's theirs to sell, and it's up to them to find the right price.

> Is this anything more than a drop in the bucket for Twitter?

I don't know. Twitter was selling office furniture and cutting back on the food on campus. So maybe it's not just a drop in the bucket after all?

goosedragons · on March 10, 2023

There's a pretty huge gap between $0 a month and $42,000. Even a year of data would require a pretty huge grant.

isubasinghe · on March 11, 2023

Well I guarantee you the research project I worked on is now gonna get shut down because of this.

Academic work has costs, but nothing like 42k per month; if anything, that was the entire budget allocation for one RA's salary for 80% of the year.

The project I am on now has only been allocated 250k for multiple people (3 RAs) working on it for the year.

anigbrowl · on March 10, 2023

Sorry, no time for goalpost-chasing today.

andrejguran · on March 10, 2023

It doesn't have to be free, but with every increase there will be fewer research projects that can afford to pay for the data, and with the proposed pricing of $500,000 for 0.3% of tweets it seems that no one will be willing to pay the price.

briandear · on March 10, 2023

Except if I want to buy some piece of academic research, they sell articles for as high as $60. Why aren’t academics complaining about the absurd costs for the public accessing their information?

anigbrowl · on March 11, 2023

Academics don't make money from academic publishing. In fact, they often have to pay exorbitant review fees to journals. There have been many, many HN threads about this part of the publishing industry.

jwestbury · on March 13, 2023

> Why aren’t academics complaining about the absurd costs for the public accessing their information?

They are. Loudly and consistently. Academics largely abhor the academic publishing industry, but feel trapped by it.

gammalost · on March 10, 2023

Most academics (at least in Sweden) get a lot of their articles from their university library. Sci-Hub is also an option if needed. If those options aren't possible, then they either ask the library to buy the article or buy it themselves. Besides, it is way less than $43k.

Even then, a lot of people are against the high cost.

isubasinghe · on March 11, 2023

Just email the academic and ask for the paper, it's the journals that are ripping everyone off.

mftrhu · on March 11, 2023

... they are.

notafraudster · on March 10, 2023

They do? Every single academic in the country supports open access and lobbies their institution to pay for the costs of open access. Every researcher will send you a copy of their article if you are paywalled and want to read it. And you, like all academics, know about Sci-Hub, so you should do what most academics do and use Sci-Hub to pirate the article to begin with.

favaq · on March 10, 2023

I don't see anything positive coming out of academia having access to the Twitter firehose.

makestuff · on March 10, 2023

When I was in college we used it to try and train a sentiment analysis model, since they are notoriously bad at detecting sarcasm and Twitter was full of sarcasm.

We also used the API to try and determine the most impacted areas after a natural disaster. Basically it would use the model we trained to try and read tweets of people that needed help or people tweeting about severe damage and group them by their coordinates.

The first one I agree isn’t really positive since it is just using other people’s data to train a model, but the second one could’ve been a useful tool to help EMS during a natural disaster.
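The disaster-mapping idea described above (flag tweets that look like calls for help, then group them by coordinates) can be roughly sketched as follows. This is only an illustration under loud assumptions: the keyword list stands in for the trained model, and the tweet dictionaries, field names, and 0.1-degree grid size are all hypothetical rather than anything from the original project or the Twitter API.

```python
from collections import Counter

# Hypothetical keyword list standing in for the trained classifier.
DISTRESS_TERMS = ("need help", "trapped", "flooded", "collapsed")


def looks_like_distress(text: str) -> bool:
    """Crude stand-in for a model: match any distress phrase."""
    lowered = text.lower()
    return any(term in lowered for term in DISTRESS_TERMS)


def grid_cell(lat: float, lon: float, size_deg: float = 0.1) -> tuple[int, int]:
    """Bucket a coordinate into a square grid cell of size_deg degrees."""
    # Floor division puts nearby coordinates into the same cell.
    return (int(lat // size_deg), int(lon // size_deg))


def hotspots(tweets, size_deg: float = 0.1):
    """Count distress-like tweets per grid cell, busiest cell first."""
    counts = Counter(
        grid_cell(t["lat"], t["lon"], size_deg)
        for t in tweets
        if looks_like_distress(t["text"])
    )
    return counts.most_common()
```

A real system would need a far better classifier (sarcasm alone would defeat keyword matching) and proper geographic clustering, but the shape is the same: filter, bucket by location, rank.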

isubasinghe · on March 11, 2023

Sigh, this is just straight-up wrong. I was an RA who worked on real-time social media analytics software. We were able to pick up on things like likely COVID infection sites, etc.

mike_hearn · on March 11, 2023

Try searching Google Scholar for "social bot", or to save time, just read this paper:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3814191

Academia has flooded the literature with >10,000 research papers based on the Twitter API feed. Virtually none of it is reproducible, it's frequently based on circular logic, the methodologies are unscientific and the conclusions are usually deeply partisan, but it nonetheless gets amplified by the media as "proof" of various false claims.

Count me in the camp of people who are happy Musk is doing this. I've been writing for years about the plague of "social bot" research coming out of academia that's based on the Twitter API:

https://blog.plan99.net/fake-science-part-ii-bots-that-are-n...

https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f...

Maybe your specific work on COVID was good, but it was certainly drowned out by the work that was sharply net negative for both society and science. Academic institutions were clearly never going to get the problem under control, so booting them out whilst allowing search engines and the like to continue accessing the feed seems like a good solution.

anigbrowl · on March 11, 2023

This is absurd; you're throwing the baby out with the bathwater. Certainly, it is easy to find social science papers with terrible methodology that use the Twitter API, or that build on the sand of papers with terrible methodology.

But you conclude from that that all academic use of the Twitter API is garbage, which is nonsensical, and that preventing academics from studying Twitter at scale is the ideal solution. Your hyperbolic language (here and in your two Medium articles, which I read thoroughly, along with the SSRN paper you cited*) does nothing for your own credibility.

The main 'methodology' of the SSRN paper is combing through other papers' datasets, contacting some of the identified 'bot' accounts, and establishing that they're operated by real people; the accounts were misidentified as bots when in reality the account operators were just aggressively quote-tweeting by using copy & paste to spread (eg) political or QAnon messages 200 times an hour. The authors point out that by really making an effort, Twitter users can tweet spam up to 25 times a minute, with no bots in sight! While the authors are quite correct to point out that people can be misidentified as bots, this completely ignores the fact of the unwanted spamming behavior. Pointing out the scientific flaws of 'tools' like Botometer is wholly valid, but efforts to research and develop tools for bot identification are a response to the fact of systematic information pollution, and most papers that try to address this issue are careful to offer caveats and qualifications about the limitations of their methods. It is not the fault of academics if media pundits over-simplify the fruits of their research.

Here are some examples of high quality research using data from Twitter:

https://www.researchgate.net/publication/336638958_Ephemeral...

https://www.researchgate.net/publication/334816353_Political...

https://www.researchgate.net/publication/361949311_QAnon_Pro...

mike_hearn · on March 12, 2023

OK, here's a review of your papers.

1. Ephemeral Astroturfing Attacks: The Case of Fake Twitter Trends

A good start! It makes relatively limited claims (they aren't trying to assert whole elections are being distorted by Twitter bots) and is indeed higher quality than the ones I've been citing. It actually makes its data available, which is a step forward. But it's had limited impact (28 citations), and it's also not particularly useful. All they're doing is revealing that there is ordinary spam, hijacking and SEO on Turkish Twitter, which was never in doubt. All social media sites have these problems and the authors were tipped off by some amateur third party that highlights these campaigns. Most of what they find is plain commercial spam, there's also some politics in there related to local Turkish issues like cab drivers protesting against Uber but there's no evidence presented that this is actually having a real impact on politics.

The main question here is why are universities spending grant money on subsidizing Twitter? The only people who can do anything with this paper are Twitter's spam team, there isn't generalizable new scientific knowledge coming out of it.

2. Political Astroturfing on Twitter: How to Coordinate a Disinformation Campaign

This one starts with a big claim, so it can at least say it's doing important research. But I really wonder why you suggested it because it actually agrees with us and even destroys the underlying premise of the entire field! A pretty useful paper that might be worth citing in future articles on the topic, in fact.

Firstly, their conclusion is that "if even a powerful and well-financed organization like the South Korean secret service cannot instigate a successful disinformation campaign, then this may be more difficult than often assumed in public debates". In other words, the supposed problem motivating this entire field of >10,000 papers doesn't actually exist: even government agencies fail to have impact when they try to sway opinions with Twitter.

Secondly, they accept that our criticisms of the field are correct. "We argue that past research’s predominant focus on automated accounts, famously known as “social bots” ... misses its target since reports on recent astroturfing campaigns suggest that they are often at least partially run by actual humans" and "Because a ground truth is rarely available, systematic research into astroturfing campaigns is lacking".

They also dunk on ML models on page three, and admit that "these studies still largely focus on anecdotes and lack a theory-driven framework" i.e. are more like blog posts than scientific research. These were all points being made by Gallwitz, Kreil and myself years ago.

The paper does have issues! Still, they should get some cred for being honest about their findings, albeit on the penultimate page of a 25 page study. The first sentence of the paper is phrased in a misleading way: they assert that astroturfing on Twitter has the potential to influence politics, but their conclusion is that it actually doesn't. That's a problem that you see a lot when reading papers in some fields.

Paper 3. QAnon Propaganda on Twitter as Information Warfare.

Note that this paper also isn't about bots. It's a complaint about the behavior of real American people. Where is the actual science? Why are you picking this as an example of high quality research? It's not only blatantly partisan, reading more like a Guardian op-ed than a research paper, it starts by citing paper (2), the one that wrecks the whole premise of the field! They are happy to cite it as evidence that they should look for astroturfing instead of bots, but forget to mention that it shows that even an intelligence agency was unable to have any impact on politics by running Twitter campaigns. Yet that doesn't stop them asserting that their line of research is important due to the "innovative misuse of social media towards undermining democratic processes by promotion of magical thinking".

This sort of problem is rampant in published research. I've seen it so often that a paper cites another paper which directly undermines the conclusions of the first, yet the authors don't address or even mention it. This sort of thing is just deceptive. If they want to cite paper (2) then they need to tackle its conclusion.

The rest of it is just US Democratic Russiagate talking points. Getting into the accuracy of that is a book-sized job and not about science, so I won't do that here; there are many such debates on the internet.

So that's your three papers. One is OK but not very valuable, one ends up (unintentionally?) wrecking the premise of the other ~10,000+ papers and one isn't even scientific research. It's unclear how they were picked but if these are really the best examples of high quality research from the field then, indeed, who really cares if Musk cuts it all off.

mike_hearn · on March 12, 2023

I'll try and find time to look at the papers you cite as high quality later today.

> you conclude from that that all academic use of the Twitter API is garbage, which is nonsensical

"All" no, a vast amount of it, yes. Is it nonsensical? Twitter themselves concluded this exact same thing even before Musk, both in public blog posts and internal emails (see the Twitter Files for examples).

But we don't really need to cite Twitter as an authority here. Just try to answer this question: what mechanisms exist that are stopping bad science outside the field of social bot research, and why have those mechanisms failed within it? It can't be peer review, university hiring committees and so on because those are all existing within social studies as well.

> Your hyperbolic language ...

What language do you think is hyperbolic, exactly, and why?

> Pointing out the scientific flaws of 'tools' like Botometer is wholly valid, but the effort to research and develop tools for bot identification are a response to the fact of systematic information pollution

This is exactly the sort of problem I'm talking about: this justification is circular. We do bad bot research because we know there are bots, we know there are bots because we do bad bot research. If there were actually big problems with social bots then it would be easy to find them and research them; we wouldn't see this situation where basically all papers are seeing patterns in noise.

Botometer is a good example of that. You admit that it's "scientifically flawed" but with respect, that language is not "hyperbolic" enough. It's not merely flawed, it's outright useless. It had an FP rate of 50% when tested against a known human dataset. Yet the Botometer paper has been cited over 900 times now (up from ~700 when I previously wrote about it). When exactly does the rest of the world get to call time on this bad behavior by the academy? These people are changing the opinions of world leaders on the back of misinformation, the exact problem they claim to be fighting.

> It is not the fault of academics if media pundits over-simplify the fruits of their research.

It wasn't media pundits that made academics cite the Botometer paper over 900 times, or write outright deceptive papers like the one I reviewed. The problem here is academia and the institutions need to start taking responsibility for it. Otherwise you're going to get situations like this one: academia will just get cut off from data. People don't have time to try and figure out which little subsections of the academy are following the rules to separate them from the rest.

2h · on March 10, 2023

Why can't universities fund it with their billion-dollar endowments?

asutekku · on March 10, 2023

There are literally only a couple of US universities that have that; for smaller universities, 42k a month for a researcher or two doesn't make any financial sense at all. This price is basically just a huge gatekeeper to prevent most people from using it.

kenjackson · on March 10, 2023

Not many universities have billion dollar endowments.

SteveGerencser · on March 11, 2023

Indiana University, a Midwest US state school, has a $3+ billion endowment behind it and ranks 16th right now. University of Texas is #1 at $42B and Berkeley is 20th at $2.6B. These are just state schools. Stanford is at $38B and Yale is at $42B. There is plenty of money out there in the university endowments. They just need to spend it on things like research and professors rather than golf courses and sports stadiums.

kenjackson · on March 11, 2023

There are ~6500 universities/colleges in the US. Probably around 1,000 academic schools. How many have $1B plus endowments? 150? So maybe around 15%.

https://www.univstats.com/corestats/

harvey9 · on March 10, 2023

I think Twitter will sell your attention in addition to other revenue streams, given the chance.

addisonl · on March 10, 2023

Exactly, your attention is always going to be sold to the highest bidder—that won't change.

Now you just get worse 3rd party apps and integrations. Interesting to see the attempt to spin this as a positive.

lowercased · on March 10, 2023

If the integrations and overall experience are worse... won't that mean there's less attention (and possibly less quality attention) to be sold?

code_runner · on March 10, 2023

They’ll chase anything that makes money but if the ONLY source of money is our attention etc… that’s a worse spot right?

harvey9 · on March 10, 2023

It is for Twitter. Our attention is a synonym for showing ads. If advertisers step away because they got nervous about the behavior of the new owner then it's better for Twitter to have other sources of income than not.

croes · on March 10, 2023

And if you collect data, pay for that too.

When does Twitter start paying its users who produce the data?

MuffinFlavored · on March 10, 2023

> When does Twitter start paying its users who produce the data?

Why do those users choose to produce data for Twitter/on Twitter for free?

wvenable · on March 10, 2023

Because it's free to do so.

MuffinFlavored · on March 10, 2023

Then you can't really complain about not getting paid for it, can you?

wvenable · on March 10, 2023

Nope. But if Twitter starts making one side of this situation more financial, then people are going to feel differently about this relationship.

mrtksn · on March 10, 2023

The free users remain the product though, don't we? We are the reach and the mined, so the company can sell that, but at least maybe there's a chance of not being interrupted.

Ideally, everyone would pay to use the service and nothing would be mined for manipulation but that world is hard to imagine in 2023.

code_runner · on March 10, 2023

Like it or not, the service being available is the payment. People clearly already want to use it.

mesozoic · on March 10, 2023

When they have a viable alternative option that competes with Twitter where they can get similar influence, which is what most of them want.

codetrotter · on March 10, 2023

> When does Twitter start paying its users who produce the data?

Never.

Having a business in the capitalist system is about maximising profits.

Musk spent ~$44bn USD or so to buy Twitter (and tried to back out of the deal too). Do you really think Twitter is gonna fairly compensate any of the users any time soon?

You’d be better off migrating to Mastodon. Maybe some instance in that ecosystem will figure out how to use crypto for good, and to compensate its content creators.

aliswe · on March 10, 2023

> Do you really think Twitter is gonna fairly compensate any of the users any time soon?

Yes, if it makes business sense to do so. Like it does for content creators.

_boffin_ · on March 10, 2023

Lol what? Can you please explain your thinking on this unless you were just trying to be funny.

cmh89 · on March 10, 2023

>I'm not the greatest Musk fan but IMHO his approach to charge those who benefits from Twitter is spot on and I'm actually rooting for him to be able to find a viable business model which does not rely on selling my attention to highest bidder.

If there is money to be made, he's not going to pass it up. Why not charge and sell your attention to the highest bidder at the same time? It's the literal cable model and it's proven to work for 50 years.

piqi · on March 10, 2023

> ...find a viable business model which does not rely on selling my attention to highest bidder.

They'll do both.

duxup · on March 10, 2023

Who makes money off twitter?

Maybe a few apps, but Twitter seems to prefer they not exist.

Anyone else really make any money?

Twitter is a weird place. I’m not sure how to make money off of it, or who does that would care to pay enough for the service.

polishdude20 · on March 10, 2023

Except Twitter won't pay the people who have created the data in the first place. So it's stopping short of actually paying its dues.

RobRivera · on March 10, 2023

I don't expect to be paid for my content creation on Facebook or Instagram.

YouTube, for sure.

I think it is fair to say this is a moving topic.

nitwit005 · on March 10, 2023

The difficulty is, people can still scrape the data. That data scraping is likely to cost Twitter more than the API did, as they have to serve up the full page.

Yes, you can try to block people doing that, but historically people haven't succeeded.

WXLCKNO · on March 10, 2023

I, for one, will scrape Twitter relentlessly.

poolopolopolo · on March 10, 2023

Scraping data at scale is much harder than you are making it sound, especially if the company is trying hard to prevent it. Much cheaper just to pay for the API.

nitwit005 · on March 10, 2023

LinkedIn tries hard to prevent scraping, but there are third parties doing it, and then re-selling it. Each user is presumably paying a fraction of what the scraping cost.

giancarlostoro · on March 10, 2023

Fully agree, I just wish I knew who these people are, because clearly he has looked at some data that suggests they'll pay up, or his hosting costs will be significantly lowered.

code_runner · on March 10, 2023

I’m not convinced Elon uses a lot of data in decisions like these. This might be just an arbitrary number to start negotiations from.

MBCook · on March 10, 2023

If you drive everyone important off for good with a ridiculous price before “the market adjusts” (Musk changes his mind), you’ve done permanent damage to your operation.

8note · on March 10, 2023

The reach and data aren't Twitter the business's to exchange, though; they belong to the Twitter community of users.

precompute · on March 10, 2023

It's expensive because the real customers are now out in the open: governments. Endless coffers.

1270018080 · on March 10, 2023

They could make even more money if they charged $84k per month! Genius business model.

randlet · on March 10, 2023

Not sure if it's true in this case, but in many cases charging double and halving your customer base is a win, I think.

jeffbee · on March 10, 2023

[flagged]

Xeoncross · on March 10, 2023

Yeah, but that had a different top-level comment

dymk · on March 10, 2023

Where?

vkou · on March 11, 2023

https://news.ycombinator.com/item?id=17570029

dymk · on March 11, 2023

ctrl+f "fascist" "fascism" 0 results

bigbillheck · on March 10, 2023

That wouldn't be the weed number tho.

kube-system · on March 10, 2023

That depends on the demand elasticity.
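To make the elasticity point concrete, here is a small sketch assuming a constant-elasticity demand curve, where multiplying the price by some factor multiplies the number of customers by that factor raised to the power of minus the elasticity. The model and the function name are illustrative assumptions, not anything Twitter has published.

```python
def revenue_ratio(price_multiplier: float, elasticity: float) -> float:
    """New revenue divided by old revenue after a price change.

    Assumes constant-elasticity demand (a simplifying assumption):
    quantity scales by price_multiplier ** (-elasticity).
    """
    quantity_multiplier = price_multiplier ** (-elasticity)
    return price_multiplier * quantity_multiplier


# Doubling the price with elasticity 1 halves the customers: revenue is flat.
flat = revenue_ratio(2.0, 1.0)
# With inelastic demand (elasticity below 1), doubling the price raises revenue.
gain = revenue_ratio(2.0, 0.5)
# With elastic demand (elasticity above 1), doubling the price loses revenue.
loss = revenue_ratio(2.0, 2.0)
```

Under this model, "charge double and halve your customer base" leaves revenue unchanged, so it is only a win if serving fewer customers also cuts costs; whether going from $42k to $84k helps depends on which side of elasticity 1 the buyers sit.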

Avshalom · on March 10, 2023

You're opposed to selling your attention to the highest bidder, but you think it's a good business model to sell all of "your data" to any "market rate" bidder?

You know those are just the same thing, right?