Hacker News

I'm not the greatest Musk fan but IMHO his approach of charging those who benefit from Twitter is spot on, and I'm actually rooting for him to find a viable business model which does not rely on selling my attention to the highest bidder.

If you are going to influence people, pay for the reach, and if you are going to mine data, pay for the data. I guess the exact pricing can be adjusted according to market needs, but I agree with the paid access approach.



Academics don't resell the data to others. In fact, their existing agreements with Twitter require their published datasets (for reproducibility) to be anonymized precisely to ensure they don't become a commercial goldmine.

Given that most of the article is about the pricing tiers for academic use, based on marketing communications to universities, your comment seems strangely indifferent to the context of this news. These proposed costs are unaffordable precisely because academics are not running a business around the data. If the article were about enterprise data sales, your point would make sense.


Academic work has all kinds of costs, why should data be free of charge?


As you say, academics are pretty used to paying for access to data, services, material, etc., but $42k per month for limited access to Twitter sounds more like a "fuck off" price than anything else.


How many researchers can and will pay $42k per month for access? What's the market size here? Is this anything more than a drop in the bucket for Twitter?


I don't know but it's theirs to sell and find the right price.

> Is this anything more than a drop in the bucket for Twitter?

I don't know. Twitter was selling office furniture and cutting back on the food on campus. So maybe it's not just a drop in the bucket after all?


There's a pretty huge gap between $0 a month and $42,000. Even a year of data would require a pretty huge grant.


Well I guarantee you the research project I worked on is now gonna get shut down because of this.

Academic work has costs, but nothing like 42k per month; if anything, that was the entire budget allocation for one RA's salary for 80% of the year.

The project I am on now has only been allocated 250k for the year, for multiple people (3 RAs) working on it.
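A rough back-of-envelope sketch of the gap, using only the two figures quoted in this comment. The variable names are made up for illustration, and it assumes both amounts are in the same currency, which the comment does not state.

```python
# Back-of-envelope comparison; both figures are the ones quoted in the thread.
MONTHLY_API_PRICE = 42_000   # quoted price per month
PROJECT_BUDGET = 250_000     # quoted yearly budget for the whole project (3 RAs)

annual_api_cost = MONTHLY_API_PRICE * 12
months_affordable = PROJECT_BUDGET / MONTHLY_API_PRICE
budget_multiple = annual_api_cost / PROJECT_BUDGET

print(f"Annual API cost: {annual_api_cost:,}")
print(f"Months of access the entire budget would buy: {months_affordable:.1f}")
print(f"Annual API cost as a multiple of the budget: {budget_multiple:.2f}x")
```

Under those assumptions, a year of access costs roughly twice the whole project budget before anyone is paid.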


Sorry, no time for goalpost-chasing today.


It doesn't have to be free, but with every increase there will be less research that can afford to pay for the data, and with the proposed pricing of $500,000 for 0.3% of tweets it seems that no one will be willing to pay the price.


Except if I want to buy some piece of academic research, they sell articles for as high as $60. Why aren’t academics complaining about the absurd costs for the public accessing their information?


Academics don't make money from academic publishing. In fact, they often have to pay exorbitant review fees to journals. There have been many, many HN threads about this part of the publishing industry.


> Why aren’t academics complaining about the absurd costs for the public accessing their information?

They are. Loudly and consistently. Academics largely abhor the academic publishing industry, but feel trapped by it.


Most academics (at least in Sweden) get a lot of their articles from their university library. Sci-Hub is also an option if needed. If those options aren't possible, they either ask the library to buy the article or buy it themselves. Besides, it is way less than $43k.

Even then, a lot of people are against the high cost.


Just email the academic and ask for the paper, it's the journals that are ripping everyone off.


... they are.


They do? Every single academic in the country supports open-access, lobbies their institutions to pay for the costs of open access. Every researcher will send you a copy of their article if you are paywalled and want to read it. And you, like all academics, know about Sci-Hub, so you should do what most academics do and use Sci-Hub to pirate the article to begin with.


I don't see anything positive coming out of academia having access to the Twitter firehose.


When I was in college we used it to try and train a sentiment analysis model, since they are notoriously bad at detecting sarcasm and Twitter was full of sarcasm.

We also used the API to try and determine the most impacted areas after a natural disaster. Basically it would use the model we trained to try and read tweets of people that needed help or people tweeting about severe damage and group them by their coordinates.

The first one I agree isn’t really positive since it is just using other people’s data to train a model, but the second one could’ve been a useful tool to help EMS during a natural disaster.
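A minimal sketch of the grouping step described above, not the commenter's actual code. The `Tweet` record, the `needs_help` flag (standing in for the trained model's prediction) and the 0.1-degree grid size are all assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass
class Tweet:
    text: str
    lat: float
    lon: float
    needs_help: bool  # hypothetical: stands in for the trained model's output


def grid_cell(lat: float, lon: float, cell_deg: float = 0.1) -> tuple[int, int]:
    """Map a coordinate to an integer grid cell cell_deg degrees on a side."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def most_impacted_cells(tweets, cell_deg=0.1, top_n=3):
    """Count flagged tweets per grid cell and return the busiest cells first."""
    counts = Counter(
        grid_cell(t.lat, t.lon, cell_deg) for t in tweets if t.needs_help
    )
    return counts.most_common(top_n)


# Made-up sample data, only to show the call.
sample = [
    Tweet("roof collapsed, need help", 29.76, -95.37, True),
    Tweet("street flooded, we are trapped", 29.74, -95.36, True),
    Tweet("nice weather today", 29.75, -95.35, False),
    Tweet("power lines down", 30.27, -97.74, True),
]
print(most_impacted_cells(sample))
```

A fixed grid is the simplest possible choice here; a real tool would more likely use a proper clustering method and would have to cope with the fact that most tweets carry no coordinates at all.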


Sigh, this is just straight-up wrong. I was an RA who worked on real-time social media analytics software. We were able to pick up on things like likely COVID infection sites, etc.


Try searching Google Scholar for "social bot", or to save time, just read this paper:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3814191

Academia has flooded the literature with >10,000 research papers based on the Twitter API feed. Virtually none of it is reproducible, it's frequently based on circular logic, the methodologies are unscientific and the conclusions are usually deeply partisan, but it nonetheless gets amplified by the media as "proof" of various false claims.

Count me in the camp of people who are happy Musk is doing this. I've been writing for years about the plague of "social bot" research coming out of academia that's based on the Twitter API:

https://blog.plan99.net/fake-science-part-ii-bots-that-are-n...

https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f...

Maybe your specific work on COVID was good, but it was certainly drowned out by the work that was sharply net negative for both society and science. Academic institutions were clearly never going to get the problem under control, so booting them out whilst allowing search engines and the like to continue accessing the feed seems like a good solution.


This is absurd; you're throwing the baby out with the bathwater. Certainly, it is easy to find social science papers with terrible methodology that use the Twitter API, or that build on the sand of papers with terrible methodology.

But you conclude from that that all academic use of the Twitter API is garbage, which is nonsensical, and that preventing academics from studying Twitter at scale is the ideal solution. Your hyperbolic language (here and in your two Medium articles, which I read thoroughly, along with the SSRN paper you cited*) does nothing for your own credibility.

The main 'methodology' of the SSRN paper is combing through other papers' datasets, contacting some of the identified 'bot' accounts, and establishing that they're operated by real people; the accounts were misidentified as bots when in reality the account operators were just aggressively quote-tweeting by using copy & paste to spread (e.g.) political or QAnon messages 200 times an hour. The authors point out that by really making an effort, Twitter users can tweet spam up to 25 times a minute, with no bots in sight! While the authors are quite correct to point out that people can be misidentified as bots, this completely ignores the fact of the unwanted spamming behavior. Pointing out the scientific flaws of 'tools' like Botometer is wholly valid, but the efforts to research and develop tools for bot identification are a response to the fact of systematic information pollution, and most papers that try to address this issue are careful to offer caveats and qualifications about the limitations of their methods. It is not the fault of academics if media pundits over-simplify the fruits of their research.

Here are some examples of high quality research using data from Twitter:

https://www.researchgate.net/publication/336638958_Ephemeral...

https://www.researchgate.net/publication/334816353_Political...

https://www.researchgate.net/publication/361949311_QAnon_Pro...


OK, here's a review of your papers.

1. Ephemeral Astroturfing Attacks: The Case of Fake Twitter Trends

A good start! It makes relatively limited claims (they aren't trying to assert whole elections are being distorted by Twitter bots) and is indeed higher quality than the ones I've been citing. It actually makes its data available, which is a step forward. But it's had limited impact (28 citations), and it's also not particularly useful. All they're doing is revealing that there is ordinary spam, hijacking and SEO on Turkish Twitter, which was never in doubt. All social media sites have these problems and the authors were tipped off by some amateur third party that highlights these campaigns. Most of what they find is plain commercial spam. There's also some politics in there related to local Turkish issues, like cab drivers protesting against Uber, but there's no evidence presented that this is actually having a real impact on politics.

The main question here is why are universities spending grant money on subsidizing Twitter? The only people who can do anything with this paper are Twitter's spam team; there isn't generalizable new scientific knowledge coming out of it.

2. Political Astroturfing on Twitter: How to Coordinate a Disinformation Campaign

This one starts with a big claim, so it can at least say it's doing important research. But I really wonder why you suggested it because it actually agrees with us and even destroys the underlying premise of the entire field! A pretty useful paper that might be worth citing in future articles on the topic, in fact.

Firstly, their conclusion is that "if even a powerful and well-financed organization like the South Korean secret service cannot instigate a successful disinformation campaign, then this may be more difficult than often assumed in public debates". In other words, the supposed problem motivating this entire field of >10,000 papers doesn't actually exist: even government agencies fail to have impact when they try to sway opinions with Twitter.

Secondly, they accept that our criticisms of the field are correct. "We argue that past research’s predominant focus on automated accounts, famously known as “social bots” ... misses its target since reports on recent astroturfing campaigns suggest that they are often at least partially run by actual humans" and "Because a ground truth is rarely available, systematic research into astroturfing campaigns is lacking".

They also dunk on ML models on page three, and admit that "these studies still largely focus on anecdotes and lack a theory-driven framework" i.e. are more like blog posts than scientific research. These were all points being made by Gallwitz, Kreil and myself years ago.

The paper does have issues! Still, they should get some cred for being honest about their findings, albeit on the penultimate page of a 25 page study. The first sentence of the paper is phrased in a misleading way: they assert that astroturfing on Twitter has the potential to influence politics, but their conclusion is that it actually doesn't. That's a problem that you see a lot when reading papers in some fields.

3. QAnon Propaganda on Twitter as Information Warfare

Note that this paper also isn't about bots. It's a complaint about the behavior of real American people. Where is the actual science? Why are you picking this as an example of high quality research? It's not only blatantly partisan, reading more like a Guardian op-ed than a research paper, it starts by citing paper (2), the one that wrecks the whole premise of the field! They are happy to cite it as evidence that they should look for astroturfing instead of bots, but forget to mention that it shows that even an intelligence agency was unable to have any impact on politics by running Twitter campaigns. Yet that doesn't stop them asserting that their line of research is important due to the "innovative misuse of social media towards undermining democratic processes by promotion of magical thinking".

This sort of problem is rampant in published research. I've seen it so often that a paper cites another paper which directly undermines the conclusions of the first, yet the authors don't address or even mention it. This sort of thing is just deceptive. If they want to cite paper (2) then they need to tackle its conclusion.

The rest of it is just US Democratic Russiagate talking points. Getting into the accuracy of that is a book-sized job and not about science, so I won't do that here; there are many such debates on the internet.

So that's your three papers. One is OK but not very valuable, one ends up (unintentionally?) wrecking the premise of the other ~10,000+ papers and one isn't even scientific research. It's unclear how they were picked but if these are really the best examples of high quality research from the field then, indeed, who really cares if Musk cuts it all off.


I'll try and find time to look at the papers you cite as high quality later today.

> you conclude from that that all academic use of the Twitter API is garbage, which is nonsensical

"All" no, a vast amount of it, yes. Is it nonsensical? Twitter themselves concluded this exact same thing even before Musk, both in public blog posts and internal emails (see the Twitter Files for examples).

But we don't really need to cite Twitter as an authority here. Just try to answer this question: what mechanisms exist that are stopping bad science outside the field of social bot research, and why have those mechanisms failed within it? It can't be peer review, university hiring committees and so on because those are all existing within social studies as well.

> Your hyperbolic language ...

What language do you think is hyperbolic, exactly, and why?

> Pointing out the scientific flaws of 'tools' like Botometer is wholly valid, but the effort to research and develop tools for bot identification are a response to the fact of systematic information pollution

This is exactly the sort of problem I'm talking about: this justification is circular. We do bad bot research because we know there are bots, we know there are bots because we do bad bot research. If there were actually big problems with social bots then it would be easy to find them and research them; we wouldn't see this situation where basically all papers are seeing patterns in noise.

Botometer is a good example of that. You admit that it's "scientifically flawed" but with respect, that language is not "hyperbolic" enough. It's not merely flawed, it's outright useless. It had an FP rate of 50% when tested against a known human dataset. Yet the Botometer paper has been cited over 900 times now (up from ~700 when I previously wrote about it). When exactly does the rest of the world get to call time on this bad behavior by the academy? These people are changing the opinions of world leaders on the back of misinformation, the exact problem they claim to be fighting.
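To make the false-positive point concrete, here is a hedged sketch of the base-rate arithmetic. Only the 50% false-positive rate comes from the comment above; the bot prevalence values and the perfect true-positive rate are assumptions chosen for illustration (and the latter is the most generous case for the tool), not measured figures.

```python
def precision(prevalence: float, true_positive_rate: float,
              false_positive_rate: float) -> float:
    """Share of flagged accounts that really are bots (Bayes' rule)."""
    true_pos = prevalence * true_positive_rate
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


FP_RATE = 0.5  # the figure quoted in the comment

# Hypothetical prevalence values; the real share of bots is not given here.
for assumed_prevalence in (0.01, 0.05, 0.20):
    p = precision(assumed_prevalence, true_positive_rate=1.0,
                  false_positive_rate=FP_RATE)
    print(f"bots = {assumed_prevalence:.0%} of accounts -> "
          f"{p:.1%} of flagged accounts are bots")
```

Even if the classifier caught every real bot, flagging half of all humans means most flagged accounts would be human under any of these assumed prevalences.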

> It is not the fault of academics if media pundits over-simplify the fruits of their research.

It wasn't media pundits that made academics cite the Botometer paper over 900 times, or write outright deceptive papers like the one I reviewed. The problem here is academia and the institutions need to start taking responsibility for it. Otherwise you're going to get situations like this one: academia will just get cut off from data. People don't have time to try and figure out which little subsections of the academy are following the rules to separate them from the rest.


Why can't universities fund it with their billion dollar endowments?


There are literally only a couple of US universities that have that; for smaller universities, 42k a month for a research project or two doesn't make any financial sense at all. This price is basically just a huge gate to keep most people from using it.


Not many universities have billion dollar endowments.


Indiana University, a Midwest US state school, has a $3B+ endowment behind it and ranks 16th right now. University of Texas is #1 at $42B and Berkeley is 20th at $2.6B. These are just state schools. Stanford is at $38B and Yale is at $42B. There is plenty of money out there in the university endowments. They just need to spend it on things like research and professors rather than golf courses and sports stadiums.


There are ~6500 universities/colleges in the US. Probably around 1,000 academic schools. How many have $1B plus endowments? 150? So maybe around 15%.

https://www.univstats.com/corestats/


I think Twitter will sell your attention in addition to other revenue streams, given the chance.


Exactly, your attention is always going to be sold to the highest bidder—that won't change.

Now you just get worse 3rd party apps and integrations. Interesting to see the attempt to spin this as a positive.


If the integrations and overall experience are worse... won't that mean there's less attention (and possibly lower-quality attention) to be sold?


They’ll chase anything that makes money, but if the ONLY source of money is our attention etc… that’s a worse spot, right?


It is for Twitter. Our attention is a synonym for showing ads. If advertisers step away because they got nervous about the behavior of the new owner then it's better for Twitter to have other sources of income than not.


And if you collect data, pay for that too.

When does Twitter start paying its users who produce the data?


> When does Twitter start paying its users who produce the data?

Why do those users choose to produce data for Twitter/on Twitter for free?


Because it's free to do so.


Then you can't really complain about not getting paid for it, can you?


Nope. But if Twitter starts making one side of this situation more financial, then people are going to feel differently about this relationship.


The free users remain the product though, don't we? We are the reach and the mined so the company can sell that, but at least maybe there's a chance of not being interrupted.

Ideally, everyone would pay to use the service and nothing would be mined for manipulation but that world is hard to imagine in 2023.


Like it or not, the service being available is the payment. People clearly already want to use it.


When they have a viable alternative option that competes with Twitter, where they can get similar influence, which is what most of them want.


> When does Twitter start paying its users who produce the data?

Never.

Having a business in the capitalist system is about maximising profits.

Musk spent ~$44bn USD or so to buy Twitter (and tried to back out of the deal too). Do you really think Twitter is gonna fairly compensate any of the users any time soon?

You’d be better off migrating to Mastodon. Maybe some instance in that ecosystem will figure out how to use crypto for good, and to compensate its content creators.


> Do you really think Twitter is gonna fairly compensate any of the users any time soon?

Yes, if it makes business sense to do so. Like it does for content creators.


Lol what? Can you please explain your thinking on this, unless you were just trying to be funny?


> I'm not the greatest Musk fan but IMHO his approach of charging those who benefit from Twitter is spot on, and I'm actually rooting for him to find a viable business model which does not rely on selling my attention to the highest bidder.

If there is money to be made, he's not going to pass it up. Why not charge and sell your attention to the highest bidder at the same time? It's the literal cable model and it's proven to work for 50 years.


> ...find a viable business model which does not rely on selling my attention to highest bidder.

They'll do both.


Who makes money off Twitter?

Maybe a few apps, but Twitter seems to prefer they not exist.

Does anyone else really make any money?

Twitter is a weird place. I’m not sure how to make money off of it, or who does that would care to pay enough for the service.


Except Twitter won't pay the people who have created the data in the first place. So it's stopping short of actually paying your dues.


I don't expect to be paid for my content creation on Facebook or Instagram.

YT, for sure.

I think it is fair to say this is a moving topic.


The difficulty is, people can still scrape the data. That data scraping is likely to cost Twitter more than the API did, as they have to serve up the full page.

Yes, you can try to block people doing that, but historically people haven't succeeded.


I, for one, will scrape Twitter relentlessly.


Scraping data at scale is much harder than you are making it sound, especially if the company is trying hard to prevent it. Much cheaper just to pay for the API.


LinkedIn tries hard to prevent scraping, but there are third parties doing it, and then re-selling it. Each user is presumably paying a fraction of what the scraping cost.


Fully agree, I just wish I knew who these people are, because clearly he has looked at some data that suggests they'll pay up, or his hosting costs will be significantly lowered.


I’m not convinced Elon uses a lot of data in decisions like these. This might be just an arbitrary number to start negotiations from.


If you drive everyone important off for good with a ridiculous price before “the market adjusts” (Musk changes his mind), you’ve done permanent damage to your operation.


The reach and data aren't Twitter the business's to exchange though; they belong to the Twitter community of users.


It's expensive because the real customers are now out in the open: governments. Endless coffers.


They could make even more money if they charged $84k per month! Genius business model.


Not sure if it's true in this case but in many cases charging double and halving your customer base is a win I think.
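A toy sketch of that reasoning. Only the $42k price is from the thread; the customer count and the per-customer serving cost are hypothetical numbers picked to show the mechanics, and `profit` is an illustrative helper, not anyone's real pricing model.

```python
def profit(price: float, customers: int, cost_per_customer: float) -> float:
    """Profit = revenue minus the cost of serving each customer."""
    return customers * (price - cost_per_customer)


BASE_PRICE = 42_000        # monthly price quoted in the thread
BASE_CUSTOMERS = 100       # hypothetical
COST_PER_CUSTOMER = 5_000  # hypothetical monthly cost of serving one customer

before = profit(BASE_PRICE, BASE_CUSTOMERS, COST_PER_CUSTOMER)

# Double the price and lose exactly half the customers: revenue is unchanged,
# but there are fewer customers to serve, so profit goes up.
halved = profit(2 * BASE_PRICE, BASE_CUSTOMERS // 2, COST_PER_CUSTOMER)

# If demand is more elastic and three quarters of the customers leave,
# the same price increase loses money.
quartered = profit(2 * BASE_PRICE, BASE_CUSTOMERS // 4, COST_PER_CUSTOMER)

print(f"before: {before:,}  halved: {halved:,}  quartered: {quartered:,}")
```

So whether doubling is a win depends entirely on how many customers actually leave, which nobody in this thread knows.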


[flagged]


Yeah, but that had a different top-level comment


Where?



ctrl+f "fascist" "fascism" 0 results


That wouldn't be the weed number tho.


That depends on the demand elasticity.


You're opposed to selling your attention to the highest bidder, but you think it's a good business model to sell all of "your data" to any "market rate" bidder?

You know those are just the same thing, right?



