Yeah, the semantic web really hacked the brains of academic-facing bureaucrats. It fell into this giant gap between what administrators don't know about business and what they don't know about technology... a gap big enough to shove every utopian idea about "an effortlessly integrated, data-driven society" into.
There's no such thing as a "right" way to represent any given data stream, just ways that are more or less suitable to specific tasks and interests. That's why HTML failed as a descriptive language (and has become a fine-grained formatting language), and it's why the semantic web was DOA.
I think HTML and the web failed in general. Modern HTML is really nothing more than div tags everywhere, with a handful of span tags. We went from abusing tables to abusing the entire document. We, in effect, eliminated all semantic meaning from a document by making everything generic tag soup.
The DOM + JS has largely supplanted HTML as the source of a web page, especially when using tools such as React or Angular.
In terms of vision, the rise of native phone apps and the fact that every major site has a mobile version and a separate desktop version really highlights how the web failed.
I do node/React dev for a living. I'll be the first to admit this pile of hacks is total garbage. Mobile web is almost unusable. I hate it. I hate the sites I work on. Their UX is horrid. Native apps are so far superior that they make the web look like an embarrassing relic. But web development pays the bills and keeps the lights on.
> Modern HTML is really nothing more than div tags everywhere, with a handful of span tags. We went from abusing tables to abusing the entire document.
True, but there is some reaction to this. On one hand there are people like yourself who just build the document from data and JavaScript, using lots of build tools, frameworks and ever more specialist technologies.
On the other hand there are people wanting to get back to a pure document that needs next to no JavaScript, using HTML5 built-ins for forms, CSS Grid for layout, no polyfills for legacy browsers and no frameworks. This approach doesn't get awarded exciting buzzwords, but 'intrinsic web design' is happening.
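As a rough sketch of what that no-JS approach looks like in practice (the form fields and grid values here are just invented examples):

    <!-- Built-in validation: the browser enforces `required` and
         the email format with zero JavaScript -->
    <form method="post" action="/subscribe">
      <input type="email" name="email" required placeholder="you@example.com">
      <button>Subscribe</button>
    </form>

    <style>
      /* CSS Grid layout, no framework: center a readable column */
      body {
        display: grid;
        grid-template-columns: 1fr minmax(auto, 60ch) 1fr;
      }
      body > * { grid-column: 2; }
    </style>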
When I watch conference presentations it seems to me that there are two groups of people: those to whom the ever-more-complicated appeals, and those to whom the make-it-simple appeals.
The 'make it simple' approach currently does not fit with existing workflows. Maybe it works for startups, but not for most web agencies, where a good decade has now been spent nesting divs in more divs and making a big mess of un-maintainable 'write only' code, with 97%-unused stylesheets that have got to the stage where nobody knows what anything does; they just add new hacks to it.
With where we are with HTML5 it should be easy to mark up your document semantically. However, Google have kind of given up on that: if you want to put some semantic goodness in there, you add some JSON-LD on the end rather than putting property attributes throughout the document. It is as if Google would prefer the document to be doubled up: once with trillions of divs for some bad CSS, and then done again to be machine readable.
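For illustration, this is roughly what that doubling-up looks like; the JSON-LD block restates, in machine-readable form, what the markup above it already says (the schema.org vocabulary is real, the article details are made up):

    <div class="post">
      <div class="title">Whither the Semantic Web?</div>
      <div class="byline">by A. Author</div>
    </div>

    <!-- ...and the same facts again, for machines -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "Whither the Semantic Web?",
      "author": { "@type": "Person", "name": "A. Author" }
    }
    </script>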
Regarding mobile, 'progressive web apps' are widely supported and have removed the need for custom mobile applications. This is progress.
I'm working on a project with a React frontend. I don't think I've ever tried it on my phone. It probably works, but the site is really meant for the desktop and I don't know if it makes sense on a small display.
However... I'm writing this on my phone, so HN is part of the mobile web and it works well. I read the post on twobithistory.org on my phone and that also works well. I doubt that they have an app, and even if they did, why should I install it, and what would happen when I follow a link from HN to them? Would I get the mobile site, or would the app catch the link and open itself?
I don't even have the apps of the news sites I read most. Reading them on the phone with Firefox and uBlock is good enough. Their apps probably contain more spyware than the adblocked sites.
So the mobile web did not completely fail. It's still what's on the screen of my phone for about 50% of the time: since the last charge, 3h 55m of screen time, 57m used by phone calls, 1h 54m Firefox, 15m WhatsApp, etc.
I've been recently wondering if there's another, better way. The big usability win of the web is that you can run applications without installing anything. Is there a way we could build a new platform that would get us the advantages of the web without all the awful cruft?
I'm imagining starting with WebAssembly for sandboxing. We could then expose to WebAssembly a useful set of API primitives from the underlying OS for text boxes, widgets and so on.
Apps would live in a heavily sandboxed container and because of that they could be launched by going to the right URL in a special browser. We could use the same security model as phone apps - apps have a special place to store their own files. They have some network access, and can access the user's data through explicit per-capability security requests.
That would allow a good, secure webapp style experience for users. But the apps themselves would feel native (since they could use native UX primitives, and they would have native performance).
Developers could write code in any language that can compile to webassembly. We could make a bundler that produced normal applications (by compiling the app out of the sandbox). Or we could run the application in a normal web browser if we wanted, by backing the UX primitives with DOM calls, passed through to WASM.
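Here's a minimal sketch of that idea using the standard WebAssembly JS API; the `ui` import namespace and its two functions are purely hypothetical stand-ins for the kind of primitives described above:

    // Host side: hand a small set of "native" UI primitives to a
    // sandboxed WASM app via its import object.
    // (Assumes an ES module context for top-level await.)
    const memory = new WebAssembly.Memory({ initial: 1 });

    const imports = {
      env: { memory },
      ui: {
        // Hypothetical primitive: create a text box, return a handle.
        // A native shell would call the OS toolkit here; a browser
        // fallback could create a DOM <input> instead.
        create_textbox(x: number, y: number, w: number, h: number): number {
          console.log(`textbox at (${x},${y}), size ${w}x${h}`);
          return 1; // fake widget handle
        },
        // Hypothetical primitive: read a UTF-8 string out of the
        // module's linear memory and set the window title.
        set_title(ptr: number, len: number): void {
          const bytes = new Uint8Array(memory.buffer, ptr, len);
          console.log("title:", new TextDecoder().decode(bytes));
        },
      },
    };

    // The sandboxed app is just a URL, like a web page.
    const { instance } = await WebAssembly.instantiateStreaming(
      fetch("https://example.com/app.wasm"), imports);
    (instance.exports.main as () => void)();

The point being that the same import object could be backed by Win32/GTK calls in a native shell, or by DOM nodes when running inside an ordinary browser.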
The original usability win of the web was that anyone could put up some text with pictures and links and anyone else could see it. It was proto-Facebook. I think of the evolution of the web as a sequence of technologically trivial but socially innovative changes in format: from static pages to timestamped blog posts, aggregating multiple blogs in a feed, allowing comments and so on. That's the vision laid out in Clay Shirky's writings about social software, as true today as ever before. And unlike the "semantic web", that vision naturally leads to a wealth of semantic information (like friend graphs) which is tremendously useful.
Hopefully that explains why "API primitives exposed to webassembly" feels to me like thinking about the web from the wrong end. The social end is what makes the web tick. It could be built with tinkertoys for all I care.
What is the essential difference between the success of browsers and the failure of X-Windows and Java applets?
For my money it's that Java applets and X-Windows didn't have a distribution mechanism and security model. They simply didn't do anything I couldn't already do with desktop apps and HTML.
Also, frankly, they were kind of slow and not very good. I think that's the biggest problem with this sort of idea: the breadth of surface area for GUI toolkits is crazy huge. Building something that works well, and works cross-platform, is a seriously huge amount of work.
Discovery. This is the essential difference. And this is mostly based on semantic features of html.
The www has 3 main ways of discovery that alternative technologies didn't offer: 1) search, leading you to the correct info within the site instead of just to a landing page; 2) overview pages, short summaries with links to the actual info (Google News, etc.); 3) deep hyperlinks that everyone can easily discover (browser URL) and provide elsewhere (email, Facebook posts, Twitter posts, etc.).
The first one is very much based on the semantic qualities of HTML, where Google can crawl a page and make some educated guess about what the page is about.
The biggest problem with mobile apps is that discovery is completely channeled through commercial app stores.
I would like to see an alternative web tech stack that doesn't skip the discovery part. WebAssembly drawing to a canvas, for example, is completely useless to a search crawler.
> What is the essential difference between the success of browsers and the failure of X-Windows and Java applets?
Timeline is one of the key ones. According to Chrome's task manager (because browsers need task managers now), the page I'm typing this reply on, containing a text area and your comment, is consuming 30MB of RAM. Back when Java applets were getting their reputation for being slow, I would have been lucky to have a computer with 32MB of RAM; 8 and 16MB were still common at that time. Now, there were some other things that made applets awful, but if they were introduced today they wouldn't seem nearly as bad as we remember, and on one of those same computers this page would be the clunky one.
As for X-Windows, it was never really a contender because there was no MS compatibility, but the potential was there.
How are you crossing the bridge from WebAssembly to having access to the native UX primitives? Are you directly making C calls into native libraries like Win32?
When people say this, it feels like we're in different universes, or at least looking for different things.
I main Linux on all of my machines. Most of the native apps on it have terrible UX. Even big apps: on touch screens, Firefox doesn't support two-finger scrolling. Chrome won't snap to the side of a desktop. Neither will bring up a virtual keyboard if I click on the URL bar.
The majority of the native apps I use don't support fractional scaling; apps like Unity3d are unusable on 13-inch screens and there's no way for me to zoom in or out on them. Even system dialogs suffer from this problem sometimes. It's like nobody on native ever learned what an 'em' unit is, and they're still stuck in 1990 calculating pixel positions.
To contrast, most of the websites I use, even the badly designed ones, will work on smallish screens or can be individually zoomed in and out. My keyboard shortcuts work pretty much the same across every site (aside from the rare exception that tries to be all fancy and implement its own). If they break, it's not rare for me to be able to open up an inspector and add one or two CSS rules that fix the problem.
Reading Hacker News, I sometimes wonder if I'm just browsing/using entirely different sites/apps than everybody else is. I don't understand how my experience is so different.
Regarding semantic HTML, I generally don't have too much of a problem there either. I don't think semantic HTML is hard to write - I use it on every single one of my sites. If you're using React and it can't be used without spitting divs all over the place, maybe the solution is just to stop using React? Modern HTML is only going to look like div soup if you fill it with divs.
I mean, I can build you a horrible SQL database that requires 30 joins on every data call, but that doesn't necessarily mean that SQL is bad. It means that auto-generating SQL tables based on a bunch of cobbled-together frameworks and user-scripts is bad. Treat your application's DOM like you would a schema, and put some thought into it. That will also solve a great deal of the responsive design problems on mobile that people are talking about, because light DOM structures are more flexible than heavy ones.
It's important to distinguish between the semantic web and semantic HTML. They are different things.
The criticisms this article levels at the semantic web are pretty much straight on as far as I can tell.
Semantic HTML is pretty straightforward though: it's using HTML to describe content, rather than purely for layout. Some sites do it better than others, but it's certainly not dead or abnormal, and many static HTML generators are... decent-ish. Semantic HTML means using stuff like article tags and sections, and using actual links instead of just putting a click handler on a div. The stuff that makes it easy to parse and understand a web page.
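A tiny before-and-after sketch of what that means (class names invented; the same visual result is achievable with CSS either way):

    <!-- Click-handler soup: opaque to crawlers, reader mode,
         and screen readers -->
    <div class="nav-item" onclick="location.href='/archive'">Archive</div>

    <!-- Semantic version: a real link inside real structure -->
    <nav>
      <a href="/archive">Archive</a>
    </nav>
    <article>
      <h1>Whither the Semantic Web?</h1>
      <section>...</section>
    </article>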
It's very useful. Semantic HTML is the reason that sites like Pocket work, and it's the reason why reader mode in Firefox/Safari works. It's the reason why screenreaders on the web work so much better than on native apps (at least as far as my experience on Linux has gone; maybe other people have better apps than me :)). It also (in my opinion) makes styling easier, because light, descriptive DOM structures tend to be easier to manipulate in CSS than large ones.
The semantic web, to the extent that it's well-defined at all, is more about the metadata associated with a webpage. Very different concepts.
It's a double-edged sword. I get why some apps say, "I want to handle everything myself": then you don't have to debug which versions of a framework you're compatible with, and you don't have to deal with these massive layers of abstraction. I hate working with frameworks; if I were building a Linux app I would be very tempted to just directly call into X or Wayland.
On the other hand, the last time I launched Braid on Linux, I had to manually change my resolution back afterwards and it removed my desktop background.
And I just felt like, "I'm sure there was a really good, sensible reason for whatever hack this game relied on when it originally launched on Linux. But... come on, if you had used some common framework, for all of the terrible problems that might have brought, when I launched it years later it would have at least full-screened properly."
So I dunno. The number of really big Linux apps that end up using their own custom display code is surprising to me. Even Emacs isn't fully using GTK. I assume developers of those apps are smart, so I assume there must be a good reason for it.
Nobody is stopping people from using the relatively new article or header tags. There is not really an inherent advantage in using divs, except maybe that they are bare-bones. Apart from that, there are data attributes, which are actually used on real-world websites for annotating text. Indeed they go particularly well with span tags.
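A small sketch of that span-plus-data-attribute pattern (the attribute names here are invented; `data-*` itself is standard HTML5):

    <p>
      As <span data-entity="person" data-ref="tim-berners-lee">Berners-Lee</span>
      argued in <span data-entity="work" data-year="2001">the 2001 paper</span>,
      decentralization requires compromises.
    </p>

Scripts and crawlers can then read those annotations through the standard dataset API (e.g. span.dataset.entity).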
People forget that all these fancy frameworks produce actual HTML5 DOMs; who cares whether those are static or dynamic. If someone wants to write a semantic web parser/crawler, it's a great idea, but it probably shouldn't be done using wget. :-)
'There's no such thing as a "right" way to represent any given data stream, just ways that are more or less suitable to specific tasks and interests.'
My core objection to "the Semantic Web" is the non-existence of "the Semantic". There is no way you can get everyone to agree upon a universal "semantic", and if you can, which you can't, you can't get people to accurately use it, and if you can, which you can't, you can't prevent people from gaming it into uselessness. But it all starts from the fact that the universal semantic doesn't even exist.
Somewhere there's a great in-depth series of blog posts from someone describing just trying to get libraries to agree upon, and correctly use, an author field or set of author fields. This is a near-ideal use case, because you have trained individuals with no motivation to insert bad data into the system for aggrandizement or ad revenue. And even for just that one field, it's a staggeringly hard problem. Expecting "the web" to do any better was always a pipe dream. Can't dredge it up, though.
(To the couple of other replies about how "it's happening, because it's happening in [medical] or [science]", well, no, that's not it. That's a smaller problem. The Semantic Web (TM) would at most use those as components, but nobody would consider that The Semantic Web (TM), at least in its original incarnation. Yes, smaller problems are easier to solve; that does not make the largest version of the problem feasible.)
I don't think a "universal semantic" was ever a design goal of the semantic web. What's needed is not one semantic, but the ability to map between competing/complementary semantics. Which is still a hard problem, to be sure, but which admits varying degrees of partial progress.
"I don't think a "universal semantic" was ever a design goal of the semantic web."
And I'm pretty sure it was the whole point. Nobody would ever have written as many reams of marketing material if the pitch was "Hey, someday, you'll be able to reach out to the web, and with specialized software for your particular domain you can access specialized web sites with specialized tags that give you access to specialized data sets that can be fed to your specialized artificial intelligence engines!"
Because that pitch is basically a "yawn, yeah, duuuuuh", and dozens of examples could have been produced even ten years ago. The whole point was to have this interconnected web of everything linking to everything, and that's what's not possible.
These two visions you present lie on extreme ends of a continuum. They're both complete strawmen. The folks behind semantic web were aware of both of them, and were careful not to let their work be pigeonholed into either one.
Consider these passages, from "The Semantic Web," Tim Berners-Lee et al, Scientific American, May 2001:
"Like the Internet, the Semantic Web will be as decentralized as possible. Such Web-like systems generate a lot of excitement at every level, from major corporation to individual user, and provide benefits that are hard or impossible to predict in advance. Decentralization requires compromises: the Web had to throw away the ideal of total consistency of all of its interconnections, ushering in the infamous message 'Error 404: Not Found' but allowing unchecked exponential growth."
"Semantic Web researchers... accept that paradoxes and unanswerable questions are a price that must be paid to achieve versatility. We make the language for the rules as expressive as needed to allow the Web to reason as widely as desired. This philosophy is similar to that of the conventional Web: early in the Web's development, detractors pointed out that it could never be a well-organized library; without a central database and tree structure, one would never be sure of finding everything. They were right. But the expressive power of the system made vast amounts of information available, and search engines... now produce remarkably complete indices of a lot of the material out there. The challenge of the Semantic Web, therefore, is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web."
> still a hard problem, to be sure, but which admits varying degrees of partial progress.
I think one of the big problems with the semantic web was that it turned "varying degrees of partial progress" into "multiple competing approaches", each of which wanted to detract from the others.
My impression, back then, was that The Semantic Web would be the sort of thing that you could create a real, general, AI upon. But populating the SW accurately, and maintaining it, was far too large a task for a small group of people and required too much coordination for a large group of people. You'd need a real, general, AI to manage it. So you can't create it without the AI, and you don't need it if you've already got the AI.
> Somewhere there's a great in-depth series of blog posts from someone who describes just trying to get libraries to agree upon and correctly use an author field or set of author fields for libraries.
That sounds interesting! Don't suppose you remember any more terms that might enable a Google search to find it?
The entire design of the "linked data" ecosystem is based on the idea that, as you point out, "There is no way you can get everyone to agree upon a universal "semantic", and if you can, which you can't, you can't get people to accurately use it, ..." etc.
In all fairness, it was also promoted by legitimate academics including Tim Berners-Lee. I actually saw him give a talk about it a number of years back for his Draper Prize speech.
I remember his AAAI '07 talk where he laid into the audience for changing and reusing URLs. It made me really want to see this world he was imagining where millions of people using the web every which way agree to universally abide by a rule that makes their life a lot harder but makes it easier to reason about algorithms on the Web.
Btw, I don't think this or the failure of the semantic web reflects badly on TBL. He is in a rare class of folks, along with Stallman, who can leave a bigger dent in the world while missing their ideals by a mile than most of us could if we got everything we wanted.
Oh, I'm certainly not going to be very critical of Sir Tim! Along with the obvious reasons, I believe he coined the term "read/write web" and was a strong advocate of users being creators as well as consumers. And TBH, while it's fairly obvious today that classification and discovery have to happen largely organically, if only because of the scale of the Web, thinking in terms of formal schemes is a pretty natural bias for someone of TBL's background to have.
My main point was that this wasn't just some fantasy of out-to-lunch bureaucrats.