February 2 2006
Recently, I've been looking for good case studies of the practical benefits of RDF adoption, and while I've found a few interesting things, I've been noticing a severe dissonance between the volume of useful information out there and the volume of hype, bombast, needless jargon, and misplaced criticism.
One significant aspect of this RDF dissonance is that many people tend not to see beyond its expression in XML. I'm starting to realise how important it is to think of it as a graph format, first and foremost, and not to get too caught up in general syntactic woes. Unfortunately for application/rdf+xml pushers, the problems with namespaces cross-cut this syntactic-semantic divide, but there are always alternatives. The main (interrelated) issues with RDF that I see are much more tricky to resolve than mere quirks of XML adoption:
The semantic web community tends to emphasise the importance of well-defined meaning first and foremost, which can be somewhat at odds with the large-scale behaviour of web users. Tagging is probably so popular because it is a syntactic rather than semantic way of describing things, and thus places no ontological burden on users or developers. This surely leads to accumulating mountains of ill-defined mess, but at the same time, it broadens the reach of participation on the web, and extends the concept of publishing and architecture to encompass organic and reflexive editing and community growth.
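To make the syntactic versus semantic contrast concrete, here is a minimal sketch in Python. Every URI in it, including the `example.org` vocabulary, is invented for illustration and does not refer to any real ontology; the two helper functions are likewise hypothetical.

```python
# A tag is just a string attached to a resource; nothing says what it means.
tags = {
    "http://example.org/photos/1": {"jaguar"},
    "http://example.org/reviews/7": {"jaguar"},
}

# An RDF-style statement is a (subject, predicate, object) triple whose
# terms are URIs. All URIs below are made up for this sketch.
triples = {
    ("http://example.org/photos/1",
     "http://example.org/vocab#depicts",
     "http://example.org/animals#Jaguar"),
    ("http://example.org/reviews/7",
     "http://example.org/vocab#reviews",
     "http://example.org/cars#Jaguar"),
}


def resources_tagged(tag):
    """Return every resource carrying the given tag string."""
    return sorted(r for r, ts in tags.items() if tag in ts)


def subjects_about(obj):
    """Return every subject with a statement pointing at the given object URI."""
    return sorted(s for s, _p, o in triples if o == obj)
```

Asking for `resources_tagged("jaguar")` returns both the photo and the car review, because the tag carries no meaning beyond its spelling. Asking for `subjects_about("http://example.org/animals#Jaguar")` returns only the photo, but someone had to choose and agree on those URIs first, which is exactly the ontological burden that tagging avoids.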
A crossover between the abstract semantic model of RDF and the concrete visible semantics of HTML is emerging with the work being done on GRDDL. The clunkiness of the acronym is probably the only thing holding this approach back - it directly addresses the major problem of non-visible metadata mentioned above, by treating the resource description as an extraction from a well-understood HTML document source.
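The idea of treating the description as something extracted from the page can be sketched roughly as follows. This is not GRDDL itself: GRDDL has the document point at a transformation (typically XSLT) that produces the RDF, whereas this sketch swaps that for a few lines of Python using the standard library's `html.parser`. The class name, the `extract` helper, the sample page and its URI are all assumptions made for illustration; the only borrowed convention is the common practice of writing Dublin Core fields as `<meta name="DC.*">` tags.

```python
from html.parser import HTMLParser

# Dublin Core element namespace, used here to build predicate URIs.
DC = "http://purl.org/dc/elements/1.1/"


class MetaExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from <meta name="DC.*"> tags."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        # Only metadata following the "DC." naming convention becomes a triple.
        if name.startswith("DC.") and content is not None:
            self.triples.append((self.page_uri, DC + name[3:], content))


def extract(page_uri, html):
    """Parse an HTML string and return the triples found in its meta tags."""
    parser = MetaExtractor(page_uri)
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical page used only to exercise the sketch.
SAMPLE = """<html><head>
<title>A note on RDF</title>
<meta name="DC.title" content="A note on RDF">
<meta name="DC.date" content="2006-02-02">
<meta name="viewport" content="width=device-width">
</head><body><p>Visible text.</p></body></html>"""

extracted = extract("http://example.org/notes/rdf", SAMPLE)
```

The publisher keeps writing ordinary HTML, and the graph is derived from it afterwards rather than maintained as a separate, invisible file. The `viewport` tag is ignored because it does not follow the naming convention the extractor looks for.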
From my limited understanding, the power of RDF to disambiguate is not clear in many situations, and it is obviously better suited to massively distributed application contexts where the value of organic approaches can be limited by some of the inherent instabilities in social software. These contexts are perhaps where resource descriptions could be of some benefit without needing to throw in heavyweight natural language processing and machine learning.
I still haven't fully made up my mind on much of this. I can see some significant forces preventing the wider adoption of RDF models by publishers, but I can also see that the graph model itself is so well optimized for the bigger picture of the web. It would be great to find a unifying vision of RDF and HTML technologies that doesn't just blindly evangelize the semantic web and ignore the messy realities of natural language. At this stage, however, I'm just not sure where to look.