I fought with Tesseract for quite a while. It's good if high accuracy doesn't matter. For transcribing a book from clean, consistent, non-skewed scans it's fine, and an LLM might even be able to clean up the output. But for legal or accounting data from hand-scanned documents, the error rate made it untenable. Even clean scanned documents of the same category have all sorts of density and skew anomalies that get misinterpreted. You'll pull your hair out trying to account for edge cases and never get the results you need, even with numerous adjustments and model retraining on errors.
Flash 2.5 or 3 with thinking gave the best results.
Thanks. I was surprised that Tesseract recognized poorly scanned magazines, and with a Python library I was able to transcribe a two-column layout with almost no errors.
Tesseract is a cheap solution as it doesn’t touch any LLM.
For invoices, Gemini Flash is really good, for sure, and you receive “sorted” data as well. So definitely thumbs up. I use it for transcription of difficult magazine layouts.
I think that for legally problematic uses like this, since companies don't like to share financial data with Google, it is better to use a local model.
Russia firmly in that second tier along with better behaved peers that have brighter demographic futures and an actual economy, like India, Indonesia and Brazil.
The fundamental question that needs answering is: should we actually prevent minors below the age of X from accessing social media site Y? Is the harm done significant enough to warrant providing parents with a technical solution that gives them control over which sites their X-aged child signs up for, and one that actually works? Obviously pinky-swear "over 13?" checkboxes don't work, so this currently does not exist.
You can work through robustness issues like the one you bring up (photo uploading may not be a good method), we can discuss privacy trade-offs like adults without pretending this is the first time we legitimately need to make a privacy-functionality or privacy-societal need trade-off, etc. Heck, you can come up with various methods where not much privacy needs trading off, something pseudonymous and/or cryptographic and/or legislated OS-level device flags checked on signup and login.
But it makes no sense to jump to the minutiae without addressing the fundamental question.
> The fundamental question that needs answering is: should we actually prevent minors below the age of X from accessing social media site Y?
I suspect if you ask Hacker News commenters if we should put up any obstacles to accessing social media sites for anyone, a lot of people will tell you yes. The details don't matter. Bashing "social media" is popular here and anything that makes it harder for other people to use is viewed as a good thing.
What I've found to be more enlightening is to ask people if they'd be willing to accept the same limitations on Hacker News: Would they submit to ID review to prove they aren't a minor just to comment here? Or upvote? Or even access the algorithmic feed of user-generated content and comments? There's a lot of insistence that Hacker News would get an exception or doesn't count as social media under their ideal law, but in practice a site this large with user-generated content would likely need to adhere to the same laws.
So a better question might be: Would you be willing to submit to ID verification for the sites you participate in, as a fundamentally good thing for protecting minors from bad content on the internet?
You can look at all manner of posts here on HN that explain exactly how you should do age verification without uploading IDs or giving central authority to some untrustworthy entity.
The fact that neither the governments proposing these laws nor the social media sites want to implement them those ways tells you that what these entities want isn't "verification" but "control".
> You can look at all manner of posts here on HN that explain exactly how you should do age verification without uploading IDs or giving central authority to some untrustworthy entity.
That's not how ID verification works. The ID verification requirements are about associating the person logging in with the specific ID.
So kids borrow their parents' ID while they're not looking, complete the registration process that reveals nothing, then they're good forever.
Or in the scenario where nothing at all is revealed about the ID and there is no central authority managing rate limiting, all it takes is for a single ID to be compromised and then everyone can use it to authenticate everywhere forever.
That's why all of the age verification proposals are basically ID verification proposals. All of these anonymous crypto suggestions wouldn't satisfy those requirements.
> Would you be willing to submit to ID verification for the sites you participate in, as a fundamentally good thing for protecting minors from bad content on the internet?
The friction would be sufficient to give up. Arguably no loss to me and certainly none to the internet.
This is what has happened already: I am not giving my ID to some shitty online provider. If I lose more sites, so be it.
I would rather parenting be the responsibility of parents and I resent the selfish individuals who wilfully burden others with the various costs associated with their demands for safety from their own choices over taking responsibility for themselves. No impact to others is too great for those who insist anything they don’t wish to be exposed to is dealt with at the societal level.
If an at-risk child's parent is unwilling to do what they believe is the right thing by their child, then they have failed the child and need to get a grip. Confiscate the device, change the wifi password, or sleep with the router under your pillow if you have to. It's really not that hard.
> Would you be willing to submit to ID verification for the sites you participate
I would not. Because there are better options out there if the objective is purely age verification that's as rigorous as the status quo for buying alcohol or cigarettes.
> The fundamental question that needs answering is: should we actually prevent minors below the age of X from accessing social media site Y?
This is only an interesting question if we can prevent it. We couldn't prevent minors from smoking, and that was in a world where you had to physically walk into a store to buy cigarettes. The internet is even more anonymous, remote-controlled, and wild-west. What makes us think we can actually effectively age-gate the Internet, where even Nobody Knows You're A Dog (1993)[1]?
I'd argue that the reduction of underage smoking has much more to do with things like social acceptability and education about the dangers of smoking, and not about physical controls on the distribution of and access to cigarettes. There also appears to be a recent trend of younger people not drinking alcohol to the extent that my generation and Boomers did, which is wonderful, but probably has nothing to do with physical access to beer.
This is the right way to reduce childhood social media use: Make it socially disgusting, and make it widely known to be dangerous.
The real solution, IMO, is a second internet. Domain names will be whitelisted, not blacklisted, and you must submit an application to some body or something.
I agree. There were attempts to do something like this with porn sites via the .xxx TLD I believe, but that inverts the problem. Don't force the public to go to a dark alley for their guilty pleasures. Instead, the sites that want to target kids need to be allowlisted. That is much more practical and palatable.
Yeah.. the opposition was just a bad take IMO... "but it will create a virtual red light district" which is EXACTLY what you want online, unlike a physical city, you aren't going to accidentally take a wrong turn, and if you're blocking *.xxx then it's even easier to avoid.
Then require all nudity to be on a .edu, .art or .xxx, problem mostly solved.
Who decides where the art/erotica border is? There is plenty of content that would straddle that border. I have seen art that could legitimately be called pornographic, and pornography I would describe as art. Who decides? And then you have prudes in Florida, Texas, and other red states who are trying to remove anything like that from a .edu, would happily ban .xxx entirely, and would find any .art suspect and probably ban it too.
I don't see why phones can't come with a browser that does this. Parents could curate a whitelist like people curate playlists, and share it, and the browser would honor that.
Combined with some blacklisted apps (e.g., all other browsers), this would be a passable opt-in solution. I'm sure there's either a subscription or a small incentive for someone to build this that hopefully isn't "Scam children".
It's not like kids are using PCs, and if they use someone else's phone, that's at least a severely limiting factor.
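To make the shared-whitelist idea concrete, here is a minimal sketch of the check such a browser might run before loading a page. Everything in it is an assumption for illustration: the function name `is_allowed`, the rule that a listed domain also covers its subdomains, and the example domains. It is not how any existing parental-control feature works.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowlist: set[str]) -> bool:
    """Return True if the URL's host, or one of its parent domains, is listed.

    Hypothetical helper: the subdomain rule is an assumption of this sketch.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        # No parseable host (e.g. a bare string without a scheme): deny by default.
        return False
    labels = host.split(".")
    # Check the full host first, then each parent: a.b.example.org -> b.example.org -> example.org -> org
    return any(".".join(labels[i:]) in allowlist for i in range(len(labels)))


# A list shared by another parent, merged with a family's own additions.
shared_list = {"wikipedia.org", "example.org"}
own_list = {"kids.example.com"}
allowlist = shared_list | own_list

print(is_allowed("https://en.wikipedia.org/wiki/Cat", allowlist))
print(is_allowed("https://notwikipedia.org/", allowlist))
```

Because the lists are plain sets of domain names, sharing one "like a playlist" is just a set union. Deny-by-default on anything unparseable keeps the sketch on the safe side.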
They do, don’t they? Apple devices have had a robust whitelisting/blacklisting feature for at least a couple of years. I use it to block websites and apps to lessen my phone addiction. I’m sure Android offers similar features.
It's never been about porn. By marking certain parts of the internet "adult-only" you imply that the rest is "family-friendly" and parents can feel less bad about themselves leaving their children with iPads rather than actually parenting them, which is exactly what Big Tech wants for obvious reasons. If I had a child I'd rather have it watch porn than Cocomelon, which has been scientifically developed so that it turns your child's brain into seedless raspberry jam. Yet nobody's talking about the dangers of that, because everyone's occupied with <gasp> titties.
Don't worry, most likely your children will come across the normal sorts of bad people - cheating partners, bullying peers, abusive bosses, rude customers, lying beggars, maybe robbers and thieves. It's fortunately unlikely they'll meet a guy who is outspoken about his opinion that scientifically capturing people's attention to get them addicted to screens is morally much worse than showing them "penis into vagina episode 74786". We don't want their innocent minds to be poisoned with ideas that question the status quo.
Honestly if internet porn were "penis into vagina episode 74786" I'd have no problem with my kids who are old enough to desire it, to watch it. The problem is that all internet porn I've seen demonstrates undesirable behaviours and attitudes towards sex and towards their partners. Hitting, degradation, homosexuality, sex between family members, harmful body modifications, verbal abuse, etc, are on the front page of every porn site I've looked at. I honestly do not understand how this is supposed to be stimulating.
I have no problem with my kids watching a couple progress from kissing to foreplay to passion, if those kids already have the hormonal desires to experience these acts. But contemporary websites teach that violence is an integral part of sex - and I do not want my children learning this.
Nice job of sidestepping the "fundamental question" of whether that can be done and what damage it would do. You do not get to answer the question as you posed it in a vacuum.
It's not a "robustness issue". Nobody has proposed anything that works at all.
But to answer your "fundamental question", no. Age gating is dumb. Giving parents total control is also dumb.
If they are persistent enough, no. But then everyone knows it's not going to stop every child in every situation. It sets a precedent for what society thinks is a sensible limit, though, and society raises children, not just individual families or parents.
Do we want kids becoming alcoholics?
Do we want them turning up drunk to school and disrupting classes?
Do we want to give parents trying to do the right thing some backup? So they know that when their kid is alone they can expect that other adults set a similar example.
Sure, you can't stop a kid determined to consume alcohol. But I think the societal norm is an overall good thing.
The same should be applied to the online space, since kids spend more and more time there. Porn, social media, gambling, etc. should be just as much of a concern as alcohol.
We can't prevent all children from getting beer, but we can prevent most of them without compromising any adult's privacy. And everyone is ok with that state of affairs and the trade-offs. No one's calling for internet-connected beer cans that make you take a selfie before you can open them.
> we can prevent most of them without compromising any adult's privacy
But we don't. Even with in person age/ID checks the clerk will often enter some of that data into the store's system and then who knows what happens with it.
> the clerk will often enter some of that data into the store's system
I've only seen them enter the date of birth. No identifying information. If they record the ID itself I'd recommend going to a different store. Or ideally, writing your legislators to have the practice banned.
Depending on the size of the town, date of birth could be used to severely narrow down and target a specific person.
If one suspects a partner of buying alcohol and could convince or coerce the clerk, or even just peek in the book, and see the partner's date of birth written there, then that is good enough proof for many people and many purposes.
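A rough back-of-the-envelope sketch of the town-size point above. It assumes dates of birth are spread uniformly over an 80-year span, which is a simplification and not real demographic data, and the function name `expected_dob_matches` is made up for illustration.

```python
def expected_dob_matches(population: int, lifespan_years: float = 80.0) -> float:
    """Expected number of residents sharing one exact date of birth.

    Assumes births are uniform over `lifespan_years` (a simplifying assumption).
    """
    distinct_dates = lifespan_years * 365.25  # about 29,220 possible dates
    return population / distinct_dates


for town in (5_000, 50_000, 1_000_000):
    print(f"{town:>9,} residents -> about {expected_dob_matches(town):.2f} people per date of birth")
```

Under these assumptions, a town of a few thousand has well under one person per date of birth on average, so a recorded date is close to a unique identifier there, while in a city of a million it still only narrows things down to a few dozen people.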
Is there actually a difference between transactions between humans in meatspace (getting a government ID, then using it at a store) and age estimation algorithms?
EFF explains a few differences between showing your ID in person and verifying your age online [1]. With respect to transmission, storage, and sharing of user data by the verifier/website, the risks of age estimation overlap with those of age verification.
Along with the diary, tax records, cellphone and family photos were stolen from someone's home, then sold for $40,000 to a far-right activist / centrist paragon of journalism James O'Keefe (whichever you prefer). Said paragon was alleged to have paid these (eventually convicted so I'm allowed to say) criminals more money to steal more stuff from this home.
While the warrant's probable cause section was redacted (maybe inappropriately), the facts of the case are still that the person being raided was alleged to have actually participated in an ongoing conspiracy to commit theft and transporting stolen property across state lines.
It's not about "hating the western way of life" or any such silliness. They can hate whatever they want within their internationally recognized borders.
War is best prevented by robust deterrents. When it comes to belligerent fascist regimes who want to see how far you can be pushed, not responding to provocations and aggression forcefully makes larger-scale war more likely in the future.
Depends on what you mean by cognition, but as you yourself said, BOLD may be correlated with certain kinds of long(er)-term activity, and that in itself is very useful if interpreted carefully. No one claims to detect single "thoughts" or anything of the sort, at least I haven't seen anything so shameless.
Well, a lot of task fMRI designs are pretty shameless and clearly haven't taken the temporal resolution issues seriously, at least when it comes to interpreting their findings in discussions (i.e. claiming that certain regions being involved must mean certain kinds of cognition, e.g. "thoughts", are involved too). And there have definitely been a few papers trying to show they can e.g. reconstruct the image ("thought") in a person's mind from the fMRI signal.
But I don't think we are really disagreeing on anything major here. I do think there is likely some useful potential locked away in carefully designed resting-state fMRI studies, probably especially for certain chronic and/or persistent systemic cognitive things like e.g. ADHD, autism, or, perhaps more fruitfully, it might just help with more basic understanding of things like sleep. But, I also won't be holding my breath for anything major coming out of fMRI anytime soon.
It is especially unforgivable that the title of the news release itself is about "40 percent of MRI signals". What, as in all MRI, not just fMRI? Hopefully an honest typo and not just the result of ignorance.