I agree. I never understood LeCun's statement that we need to pivot toward the visual aspects of things because the bitrate of text is low while visual input through the eye is high.
Text and language contain structured information and encode a lot of real-world complexity (or rather, they "model" it).
Not saying we won't pivot to visual data or world simulations, but he was clearly not the type of person to compete with other LLM research labs, nor did he propose any alternative that could be used to create something interesting for end-users.
Text and language contain only approximate information, filtered through human eyes and brains. Also, animals don't have language, yet they show quite advanced capabilities compared to what we can currently do in robotics. And if you do enough mindfulness you can dissociate cognition/consciousness from language. I think we are lured by how important language is for us humans, but intuitively it's obvious to me that language (and LLMs) is only a subcomponent, or even irrelevant for, say, self-driving or robotics.
Seems like that "approximation" is perfectly sufficient for just about any task.
That whole take about language being basically useless without a human mind to back it lost its legs in 2022.
In the meanwhile, what do those "world model" AIs do? Video generation? Meta didn't release anything like that. Robotics, self-driving? Also basically nothing from Meta there.
In the meanwhile, other companies are perfectly content with bolting multimodal transformers together for robotics tasks - Gemini Robotics being a research example, and the modern Tesla FSD stack a production-grade one. Gemini even uses a language transformer as a key part of its stack.
The issue is context. Trying to make an AI assistant with text-only inputs is doable but limiting. You need to know the _context_ of all the data, and without visual input most of it is useless.
For example "Where is the other half of this" is almost impossible to solve unless you have an idea of what "this" is.
But to do that you need cameras, and to use cameras you need position, object, and people tracking. And that is a hard problem that's not solved.
The hypothesis is that "world models" solve that with an implicit understanding of the world and the objects in context.
If LeCun's research had made Meta a powerhouse of video generation or general-purpose robotics - the two promising directions that benefit from working with visual I/O and world modeling as LeCun sees it - it would have been a justified detour.
Ehh, pretty sad there's almost no information on FACEIT anti-cheat. One of the most impactful out there. Wonder if it's just the invasiveness that separates it.
Valve can't replicate even part of it, while CS2 game modes are flooded with cheaters. Most people who chase competitiveness (which CS used to be all about – now it's also skins) just install FACEIT directly and ignore 90% of built-in game content.
Maybe Valve just doesn't want to make the game more difficult to install and sacrifice several % of their user base.
There's a number of good reasons not to make everyone run a kernel level anti-cheat. Linux (and therefore SteamOS) compatibility is a big one.
I think the status quo where anyone on any platform can access the vanilla game -- where cheaters may not even be a huge problem depending on one's skill rating -- and the most competitively-minded players have the choice to play on FACEIT, works pretty fine.
I do wonder what the 90% of built-in game content you're referring to actually is.
Valve's approach was to avoid the cat-and-mouse game, knowing it doesn't lead anywhere. You can always cheat using DMA, or by reading the monitor with another computer that simulates a hardware mouse to get aimbot abilities.
They wanted to use machine learning to detect, flag and ban suspicious behaviour.
This didn't work out, and I'm not sure they're still trying, but there are a few conference talks about it.
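The general shape of the idea is simple, even if Valve's actual system (their VACnet GDC talk described deep learning over replay data) is far more involved. A toy sketch in Python - every feature and threshold here is invented for illustration, none of it is how VACnet actually works:

```python
# Toy illustration of behavior-based cheat flagging -- NOT Valve's actual
# system; the features and thresholds below are made up for illustration.
import numpy as np

def suspicion_score(snap_angles_deg: np.ndarray,
                    reaction_times_ms: np.ndarray) -> float:
    """Fraction of kills that combine a near-instant large flick with a
    reaction time below what's plausible for a human."""
    superhuman_reaction = reaction_times_ms < 120   # ms; assumed cutoff
    large_flick = snap_angles_deg > 40              # degrees; assumed cutoff
    return float(np.mean(superhuman_reaction & large_flick))

# Flag for human review rather than auto-banning, since lucky flicks and
# smurfs make false positives inevitable with a heuristic this crude.
angles = np.array([5.0, 52.0, 47.0, 3.0, 61.0])
reactions = np.array([240.0, 95.0, 88.0, 310.0, 101.0])
if suspicion_score(angles, reactions) > 0.5:
    print("queue replay for review")
```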
They did try some stuff but got pushback from the Reddit community for being too invasive. Not that it really matters for something already running on your PC.
To be fair in the specific case of CS2, the normal modes without FACEIT are really barely playable. Most games are just a massive loss or win, depending on who has the suspiciously good player with 100 hours in their team.
Most FPS games are like this once you get a high enough rating :/
It also doesn't help that most streamers have soft aim lock, so that's what everyone thinks is normal.
I swear FPS games have been in their steroid-era-baseball phase for years, and it'll be interesting to see if it ever comes out.
There's also a financial incentive not to reveal that 25% of the player base is cheating: both the immediate loss of player base, and the inability to simultaneously prove it's happening to all the competitors too.
"You can actually read up on how these things work."
While you can definitely read about how some parts of a very complex neural network function, it's very challenging to understand the underlying patterns.
That's why even the people who invented components of these networks still invest in areas like mechanistic interpretability, trying to develop a model of how these systems actually operate. See https://www.transformer-circuits.pub/2022/mech-interp-essay (Chris Olah)
That is a strength, not a weakness. It's valuable to see why people, even those with whom we disagree, think the way they do. There's already far too much of a tendency to expel heretics in today's society, so the fact that Lex just patiently listens to people is a breath of fresh air.
How? It's fine to have on people with all different viewpoints, including awful ones, but I think pushing back when they're on some bullshit is good and necessary. Otherwise you're just uncritically spreading fake junk to a huge audience, which leads to more people believing in fake junk.
The trouble is self-styled "both sides" types believe that since they take the both sides approach, they have insulated themselves from the kinds of politicization that compromises the extremes. But the manner in which you position yourself relative to those extremes is every bit as politicized and every bit as liable to the same cognitive biases and rationalizations.
Misinformed climate skeptics often regard themselves in this way, as not taking one side or the other on global warming. They mistakenly believe that this orientation has elevated them above equivalently offensive extremes, but in truth they have compromised their own media literacy by orienting themselves in that manner.
There are numerous instances of this all over the political spectrum: Cornel West talking to left-wing academics, in left-wing academic language, about how "nobody" thinks Obama is truly left-wing. Journalists during the Iraq war took a both-sides approach that cashed out as extremely hawkish and apologetic in defense of the war.
The Lex Fridman version is a "centrist" in a specific kind of media environment that lends disproportionate visibility to its own set of boutique topics: optimism about technology and trends, especially around AI and crypto, plus some libertarian-leaning politics surrounding it, which at its periphery is disproportionately saturated by right-wing memeing and politics. So it's a form of centrism that sits at the center of a world as described by those things. But for him and his viewers it's a perfectly neutral state of nature, free of any adornment of ideology.
I felt that way until he had Carlson on. Carlson is a grade-A TV talking-head grifter who just spins up sensationalist narratives to drive views. No background, no expertise, just a guy who mastered which buttons to push to get average Joes raging.
Lex says he wants open honest conversation, but Carlson was just doing the same stunningly dishonest grift he does every time he has a mic in front of him. So dumb.
I do have a few gripes though, which might just be personal preference. A lot of the time the language used by both the host and the guests is unnecessarily obtuse. Also, the host is biased toward being optimistic about LLMs leading to AGI, so he doesn't probe guests deeply enough about that, beyond asking something along the lines of "Do you think next-token prediction is enough for AGI?". Most of his guests are biased economically or academically to answer yes, and this is then taken as the premise of the discussion that follows.
Having said that, I do agree that it is much better and deeper than other podcasts about AI.
There's a difference between being a good chat-show/podcast host and being a journalist holding someone's feet to the fire!
Dwarkesh is excellent at what he does - lots of research beforehand (which is how he lands these great guests), but then lets the guest do most of the talking, and encourages them to expand on what they are saying.
If you criticize the guest or give them too much pushback, they're going to clam up and you won't get the best out of them.
I decided to listen to a Dwarkesh episode as a result of this thread. I chose the Eliezer Yudkowsky episode. After 90 minutes, Dwarkesh is raising one of the same 3 objections for the n-teenth time, instead of leading the conversation in an interesting direction. If his other AI episodes are in the vein other comments describe, then this does seem to be plain old AGI optimism bias rather than some special interview technique. In addition, he's very ill-prepared, in that he doesn't seem to have attempted to understand the reasons some people have for believing AGI to be a threat.
On the other hand, Yudkowsky was a terrible guest, in terms of his public speaking skills. He came across as combative. His answers were terse and he spent little time on background information or otherwise making an effort to explain his reasoning in a way more digestible for a general audience.
I think with any talk show it mostly comes down to how interesting the guests are. I kind of agree with you that Dwarkesh's steering of the conversation isn't the best, but he seems to put his guests at ease and maybe they are more forthcoming as a result. He is also obviously smart, and it seems that encourages his guests to feel compelled to give deeper/more insightful/technical answers than if they had been, say, talking to some clueless journalist. This was notable in his interview with Ilya Sutskever, who otherwise seems to talk down to his interviewers.
The main strength of Dwarkesh is the caliber of guests he is able to attract, especially for being so new to the game. Apparently he'll research a potential guest for a couple of weeks before cold-emailing them with some of his researched questions and asking if they'll come on his podcast, and he gets a very high acceptance rate since the guests appreciate the questions and the effort he has put in (e.g. maybe Zuck enjoying being asked about Augustus, and not just about typical Facebook fare).
If you were inclined to give him another try, then I'd recommend the Richard Rhodes or Dario Amodei episodes, not because of any great Dwarkesh interviewing skills, but because of what the guests have to say. If you are a techie then the Sholto + Bricken one is also good - for same reason.
As far as AI optimism, I gather Dwarkesh has moved to SF, so maybe that goes with the territory (and some of his friends - like Sholto + Bricken - being in the AGI field). While arguably being a bit too deferential, he did at least give some pushback to Zuck on AI safety issues, such as Meta's apparent lack of any "safe scaling" tests, and questioning how Zuck's "increased AI safety via democratization" applies to bio threats (how is putting the capability to build bioweapons in the hands of a bad actor mitigated by others having AI too?).
I haven't listened to Dwarkesh, but I take the complaint to mean that he doesn't probe his guests in interesting ways, not so much that he doesn't criticize his guests. If you aren't guiding the conversation into interesting corners then that seems like a problem.
He does a lot of research before his interviews, so he comes with a lot of good questions, but then mostly lets the guests talk. He does have some impromptu follow-ups, but mostly tries to come back to his prepared questions.
I struggle to blame people for speaking in whatever way is most natural to them, when they're answering hard questions off the cuff. "I apologize for such a long letter - I didn't have time to write a short one."
I think AGI is less a "generation" problem and more a "context retrieval" problem. I am an outsider looking in to the field, though, so I might be completely wrong.
I don't know Dwarkesh but I despise Lex Fridman. I don't know how a man that lacks the barest modicum of charisma has propelled himself to helming a high-profile, successful podcast. It's not like he tends to express interesting or original thoughts to make up for his paucity of presence. It's bizarre.
Maybe I'll check out Dwarkesh, but even seeing him mentioned in the same breath as Fridman gives me pause ...
I mostly agree with you. I listened to Fridman primarily because of the high profile AI/tech people he got to interview. Even though Lex was a terrible interviewer, his guests were amazing.
Dwarkesh has recently reached the level where he's also interviewing these high profile AI/tech people, but it's so much more enjoyable to listen to, because he is such a better interviewer and skips all the nonsense questions about "what is love?" or getting into politics.
The question you should ask is: why are high-profile guests willing to talk to Lex Fridman but not others?
The short answer, imho, is trust. No one gets turned into an embarrassing soundbite talking to Lex. He doesn't try to ask gotcha questions for clickbait articles. Generally speaking, "the press" are not your friend and they will twist your words. You have to walk on eggshells.
Lex doesn't need to express original ideas. He needs to get his guests to open up and share their unique perspectives and thoughts. He's been extremely successful in this.
An alternative question is why hasn't someone more charismatic taken off in this space? I'm not sure! Who knows, there might be some lizard brain secret sauce behind the "flat" podcast host.
Yes, of course. His guests love being able to come on and present their view with very little critical analysis of what they are saying. It is fantastic PR for them.
Interviewers shouldn't be aggressive, antagonistic or clickbaity but they should put opposing views to their guests so that the guest can respond. Testing ideas like this is a fundamental way of learning and establishing an understanding of a topic.
My earlier comparison was basically saying now that high-profile guests are talking to a much better interviewer (Dwarkesh), we no longer have to rely on Lex as the only podcast with long-form interviews of these guests.
I would have thought folks on HN wouldn't care about superfluous stuff like "charisma" and would like a monotone, calm, robot-like man who for 95% of the podcast just lets his guest speak and every now and then asks a follow-up/probing question. I thought Lex was pretty good at just going with the flow of the conversation and not sticking too much to the script.
I have never listened to Dwarkesh but I will give him a go. One thing I was a little put off by, just skimming through this episode with Zuck, is that he's doing ad-reads in the middle, which Lex doesn't do.
I'll agree that "interesting thoughts" may be up to interpretation, but imma fight you on the charisma thing. I looked up "flat affect" in the dictionary and there were no words, only a full-page headshot of Lex Fridman.
I'm simply pointing out the answer to your "I don't understand why people like him" question. If you can't understand why people don't share your hatred for something, then odds are that the disconnect is because they don't share your reasons for hating it.
Yeah, I'm a big fan of Lex because I think he is really good at building connections, staying intellectually curious, and helping people open up, but he is absolutely not big on charisma! I don't know if he normally talks so flat or not, but on the podcast I don't think he could be more flat if he tried. He's also not great at asking questions, at least not spontaneously. He seems really good at preparation though.
I listen to Lex relatively often. I think he often has enough specialized knowledge to keep up at least somewhat with guests. His most recent interview of the Egyptian comedian (not a funny interview) on Palestine was really profound, as in one of the best podcasts I’ve ever listened to.
Early on, when I discovered him, I got really fed up with him. Like his first interview with Mark Zuckerberg, where he asks him multiple times to basically say his life is worthless, his huge simping for Elon Musk, asking empty questions repeatedly, and being jealous of Mr Beast.
But yeah, for whatever reason lately I've dug his podcast a lot. Those less-good interviews were from a couple of years ago. Though I wish he didn't obsess so much over Twitter.
I've been using pyenv for several years now, but for some reason the basic commands and overall integration don't feel as smooth as Node's nvm. I wonder if that's because Python setup is technically harder than Node's.
I think the idea is that once a certain piece of tech gets regulated, the giants will use it as an opportunity to push for some replacement that they get to influence.
Regulation "failures" are mostly about the wording, the how; less about what and why.
I guess rather than regulating cookies, they meant to regulate tracking. Or maybe even regulate targeting, rather than tracking.
The cookie banners are mostly a tragedy; everyone agrees that modal, blocking cookie banners were never the intention. But the giants definitely had something to gain by suggesting they "were forced" to harass visitors.
And the forced harassment is very likely illegal, the law is pretty clear that refusing/removing consent should be as easy as giving consent. Very few sites have both "refuse all" and "accept all" on the same pop-up, even worse are the "legitimate interest" ones which hide a second layer of refusal under that tab.
I hope the EU cracks down on it at some point. The harassment is very much a strategy to manufacture public discontent with the regulation, it works (as you can see in multiple replies any time this topic shows up on HN), and the strategy is illegal.
> I guess rather than regulating cookies, they meant to regulate tracking. Or maybe even regulate targeting, rather than tracking.
When I see statements like this, I wonder: have people ever read anything besides what the industry feeds them? Or the echo chambers of HN and twitter?
Love the idea; however, it's not clear whether I will get access to a large collection of components for building such workflows, or what is currently possible. Would be nice to get this info before proceeding with auth.
Theoretically, most OpenCV-type image pre/post-processing stuff is available as blocks, and all the major multimodal + diffusion AI blocks are also available. A sampling of what we've recently added:
AI Blocks:
- Multimodal LLM (GPT4v)
- Remove objects in Images
- AI Upscale 4x
- Prompted Segmentation (SAM w/ text prompting)
Editing Blocks:
- Change format
- Rotate
- Invert Color
- Blur
- Resize
- Mask to Alpha
If we've missed something please let us know, we just went through a big exercise in making sure we can quickly add new blocks.
It's a combination of things. The idea is that you can build workflows that chain functionality from AI models as well as lower-level image processing tasks. For the lower-level tasks we use the usual suspects - PIL, ImageMagick, OpenCV etc.
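To make "chaining" concrete, here's a rough sketch of the idea in Python - the helper functions and names are hypothetical stand-ins for illustration, not our real block API:

```python
# Minimal sketch of chaining image "blocks"; these helpers are
# hypothetical stand-ins, not the product's actual API.
import cv2
import numpy as np
from PIL import Image

def rotate(img: Image.Image, degrees: float) -> Image.Image:
    return img.rotate(degrees, expand=True)

def blur(img: Image.Image, ksize: int = 5) -> Image.Image:
    # Hop to OpenCV (numpy array) for the filtering, then back to PIL.
    return Image.fromarray(cv2.GaussianBlur(np.array(img), (ksize, ksize), 0))

def resize(img: Image.Image, w: int, h: int) -> Image.Image:
    return img.resize((w, h))

def run_workflow(img: Image.Image, blocks) -> Image.Image:
    # A workflow is just a list of callables applied in order.
    for block in blocks:
        img = block(img)
    return img

out = run_workflow(
    Image.open("input.png"),
    [lambda im: rotate(im, 90), blur, lambda im: resize(im, 512, 512)],
)
out.save("output.png")
```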