Microformats vs. RDF: How Microformats Relate to the Semantic Web
Published 8 months ago by James Simmons
Update: Joe from the Squio blog has posted a response to this entry.
Microformats are a wildly popular set of formats for embedding metadata within normal XHTML. The primary advantage Microformats offer over RDF (including its embedded serializations) is that you can embed metadata directly in the XHTML, reducing the amount of markup you need to write (e.g. you don't have to write XHTML and additional RDF). Many people have contended that Microformats are a possible replacement for RDF; however, Microformats were not designed to cover the same scope as RDF. While both Microformats and RDF make it possible to store data about data, they simply do not work to solve the same set of problems.
A quick comparison
I don't blame the Microformats people for this confusion over what Microformats are or are not. Rather, I blame the sensationalists and know-nots that tend to jump on any new standard, format, or design pattern. Directly on the Microformats about page you are told what Microformats are and are not.
What Microformats were not intended to be:
- A new language
- Infinitely extensible and open-ended
- An attempt to get everyone to change their behavior and rewrite their tools
- A whole new approach that throws away what already works today
- A panacea for all taxonomies, ontologies, and other such abstractions
- Defining the whole world, or even just boiling the ocean
- Any of the above
There you have it, clearly stated and all. I would guess that most of the arguments made by pro-RDF people are extinguished after reading that unordered list. However, some people still believe that we can create the Semantic Web with Microformats.
What RDF allows (and Microformats lack):
- Resources are represented as URIs, allowing you to access metadata remotely
- Infinitely extensible and open-ended design
- A powerful Ontology language (OWL) that is built upon it
- The ability to utilize, share, and extend any number of vocabularies
- No reliance on pre-defined "formats" (i.e. not limited by the types of data that can be encoded)
As you can see there are a few things we can do with RDF that cannot be done with Microformats. The Semantic Web relies on the things I've listed above. These are the clear-cut reasons why Microformats will not be part of the W3C's Semantic Web vision.
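As a rough illustration of the first and fourth points in that list, here is a minimal Python sketch of three statements that identify resources by URI and mix two vocabularies (Dublin Core and FOAF) without any pre-defined format. The example.org URIs, the Literal helper, and the to_ntriples function are assumptions made up for this sketch; the output follows the basic N-Triples line shape, not a full implementation of that syntax.

```python
# Namespace URIs for two widely used RDF vocabularies.
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"


class Literal(str):
    """Marks a plain string value, as opposed to a URI (hypothetical helper)."""


# Hypothetical resource URIs, invented for this illustration.
post = "http://example.org/blog/microformats-vs-rdf"
author = "http://example.org/people/james#me"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (post, DC + "title", Literal("Microformats vs. RDF")),
    (post, DC + "creator", author),
    (author, FOAF + "name", Literal("James Simmons")),
]


def term(t):
    """Render one term: literals in double quotes, URIs in angle brackets."""
    if isinstance(t, Literal):
        escaped = t.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{t}>"


def to_ntriples(statements):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in statements)


print(to_ntriples(triples))
```

Because the object of the second statement is itself a URI, the third statement can say more about that same resource, and a different vocabulary can be used to do it.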
Persisting the data within Microformats
Another issue I've thought about is how we are to persist the data we glean from Microformats. How do you usefully store Microformat metadata (beyond leaving it in its XHTML form)? The information stored in Microformats eventually comes out in triple form, one way or the other. Take a look at this example:
<span class="tel"><span class="type">home</span>:<span class="value">+1.415.555.1212</span></span>
What information can be gleaned from this example? Well, the home telephone number (of an unknown person or entity, in this example) is +1.415.555.1212. In the end we are still getting the subject-predicate-object form. In this case the subject would be the owner of that number, the predicate would be "home," and the object is the telephone number itself.
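To make that reading concrete, here is a minimal Python sketch that pulls the type and value out of the markup above and returns them as a (subject, predicate, object) tuple. It uses only the standard library's html.parser. The TelParser class, the tel_to_triple function, and the "unknown-owner" subject placeholder are assumptions for illustration; this is not a complete hCard parser.

```python
from html.parser import HTMLParser


class TelParser(HTMLParser):
    """Collect the type and value of a single "tel" property (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.stack = []    # class names of the currently open <span> elements
        self.fields = {}   # filled with the text of the "type" and "value" spans

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.stack.append(dict(attrs).get("class", ""))

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Keep only text that sits directly inside a "type" or "value" span.
        if self.stack and self.stack[-1] in ("type", "value"):
            self.fields[self.stack[-1]] = data.strip()


def tel_to_triple(html, subject):
    """Return (subject, predicate, object) for one tel snippet."""
    parser = TelParser()
    parser.feed(html)
    parser.close()
    return (subject, parser.fields["type"], parser.fields["value"])


snippet = ('<span class="tel"><span class="type">home</span>:'
           '<span class="value">+1.415.555.1212</span></span>')

# The owner is unknown in the example, so a placeholder stands in for the subject.
triple = tel_to_triple(snippet, "unknown-owner")
print(triple)
```

The colon between the two inner spans is ignored because it sits directly inside the outer "tel" span rather than inside "type" or "value".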
So really, we will likely require triple storage for either RDF or Microformats. In all honesty, I don't know of any Microformat-stores. If you know of some, I would like to know if they are any different from a normal triple-store.
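For readers who have not met the term, the core idea of a triple-store can be sketched in a few lines of Python: keep a set of (subject, predicate, object) tuples and look them up by pattern, with None acting as a wildcard. The TripleStore class, its method names, and the sample data are assumptions for this sketch; real triple-stores add indexing, persistence, and a query language such as SPARQL.

```python
class TripleStore:
    """A toy in-memory store of (subject, predicate, object) tuples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


store = TripleStore()
# Hypothetical data: the same shape of statement the tel example produces.
store.add("person:1", "home", "+1.415.555.1212")
store.add("person:1", "name", "Example Person")
store.add("person:2", "home", "+1.415.555.0000")

home_numbers = store.match(predicate="home")   # every home number, any owner
about_person_1 = store.match(subject="person:1")  # everything known about one subject
print(home_numbers)
print(about_person_1)
```

Nothing in the store cares whether a triple originally came from RDF or from a parsed Microformat, which is the point of the paragraph above.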
Microformats have a place and a purpose
At this point I'd like to say that Microformats do have a number of qualities that RDF (although not necessarily all serializations) does not accommodate, at least not in the same way:
- Designed for humans first, machines second
- Modularity / embeddability
- Enables and encourages decentralized development, content, services
- Design principles for formats
- Adapted to current behaviors and usage patterns
- Highly correlated with semantic XHTML
I've stated before that I believe Microformats will help bring about the Semantic Web by introducing "metadata sprinkling" (the act of including metadata in otherwise "normal" data) to more people. They allow for simple metadata embeddability and do not affect how an XHTML document validates. This is the kind of approach that will help normal users come closer to understanding the Semantic Web vision.
Conclusions
To me, Microformats are to RDF as HTML 5 is to XHTML; on the surface they both appear to be a solution to the same problem, but the former misses the point as to why the latter was created. On the very same about page I cited earlier there is a bullet point that suggests that Microformats will be part of the semantic web (note the lowercase letters, implying a semantic web, not the one envisioned by the W3C). I find that all competing Semantic Web development paths fall short of creating an entirely linked Semantic Web, the kind of Semantic Web that gives us a platform to stand on above the Web document layer. Microformats have their place, just not as a replacement for RDF.
Posted by Yihong Ding on October 17, 2007 at 4:02pm
James,
A nice brief comparison between Microformat and RDF. An addition to your arguments is that RDF is built on top of a well conducted mathematical foundation. As a result, it is mathematically sound to derive a mechanism to reason any RDF file. So we have query languages such as SPARQL.
Microformat, on the other hand, is closer to a set of well conducted building blocks. They are easy to be adopted by ordinary users and they do solve many regular problems. But they are short of extensibility and they are less of resiliency on mathematical variations. From the beauty of mathematics, RDF is elegant but Microformat is a little bit clumsy.
-- Yihong
Posted by James on October 17, 2007 at 4:08pm
Great points, Yihong. That's an angle I didn't look at this argument from!
Posted by Mark Murphy on October 17, 2007 at 6:32pm
OK, time to vent. Forgive the length, but you're hitting a nerve...
I've tried sporadically over the past two years to get into the whole RDF vision of the Semantic Web, because I agree that additional semantics layered on the Web is A Good Thing. Now, I like to think of myself as a pretty savvy developer -- I've used over a dozen programming languages, have written applications in everything from the TRS-80 to Linux, etc. Yet every time I try to wrap my head around the practical implementation of RDF and related technologies, I get a headache.
My personal belief is that the problem is this: you're all speaking Greek.
RDF came out of the ivory tower, as you pointed out in an earlier blog post. The problem is, I have yet to find the Rosetta Stone that translates ivory tower-speak into ordinary-Joe-speak.
For example, in this blog post, you use:
-- "Ontology language"
-- "triple form"
-- "subject-predicate-object form"
-- "triple-store"
Those terms are ones I only see in other ivory-tower areas, not in the world of the rank-and-file developers. This doesn't mean they're bad terms per se, just that they're going to make lots of people's eyes glaze over before they just turn back to microformats to be done with the matter. For example, if a "triple-store" is just a database, why not call it that? If it's not just a database, why not? If it can't be just a database, what is the "big win" for using RDF that justifies a whole new architecture for front-end *and* back-end vs. just microformat-style embedded semantics? And please don't tell me it's "purity of mathematics"...
I go over to Amazon.com, search for books with RDF in the title, and get a whopping half-dozen entries, only one of which was written as recently as last year. The only one that doesn't itself seem to be an ivory-tower book, _Practical RDF_, is over four years old and has a middling rating. Moreover, Jack Herrington's review suggests it's not really practical at all, and I have a lot of faith in Mr. Herrington's opinion as to what is and is not practical.
Or, look at your recent list of 30+ Semantic Web resources. The most practical-sounding entry ("Putting RDF to Work") is from 2000. The second-most practical-sounding entry ("The Bottoms-Up RDF Tutorial") thinks "I could be described as apple/lime/pineapple and you as tangerine/strawberry/kiwi" is somehow practical. If this is putting the RDF community's best foot forward, you need more feet.
If you want to make RDF take off, here are my suggestions:
1. Create RDF equivalents to the major full-content microformats (hResume, hReview, hCard, etc. -- not hAtom or the rel-* patterns)
2. Create an RDF equivalent to the Operator plugin for Firefox, so things encoded in Web pages in the format from #1 can be actually used by somebody vs. just taking up space
3. Write *the* practical guide to RDF, focusing on #1 and #2, plus some back-end services that can leverage SPARQL and the like (RDF plugin for Rails? triple-store wrapper for MySQL?)
4. Publicize the heck out of #1, #2, and #3
In other words, ya gotta give us something to believe in. There may be other ways of accomplishing #1 and #2 than what I have listed, but you really need a solid #3, ideally something that can become a talisman, like the "pickaxe book" is for Rails.
Focusing on "the beauty of mathematics" is fine perhaps for purity of the RDF world, but it ain't gonna do jack for getting it adopted by anyone. It's more likely that POSH (plain ol' semantic HTML) will get sufficient traction to make it "good enough" and keep RDF and the Semantic Web in the background. If RDF really is "all that", the RDF community needs to help bridge the gap between RDF-as-ivory-tower-construct and microformats-as-something-factory-joes-can-use.
I apologize for the rant, but I hate to see a technology as promising as RDF get lost because it can't be described in practical terms.
Posted by Mark Murphy on October 17, 2007 at 6:38pm
As a quick followup, I just ran into this blog post that spins a similar tale:
http://www.madetostick.com/blog/2007/10/17/digital-signal-processing-made-to-stick/
I heartily recommend _Made to Stick_ in general. This professor's description of how he took the dry mathematics behind digital signal processing and stripped it down to the concrete core is just the sort of thing RDF/Semantic Web needs. Of course, the professor is a bit windy -- the key portion begins at paragraph #5 of his quoted section.
Posted by Yihong Ding on October 17, 2007 at 8:44pm
Mark,
I fully understand what you said. Mathematics is somehow annoying. Although it is elegant, mathematics is not easy to understand unless somebody can explain it well. Unfortunately, however, the mathematical foundation of RDF is a little bit too complicated. And this is exactly why Microformat has so many fans. People do want to have the Semantic Web. The main obstacle to realizing the Semantic Web is that up to the present people still don't know how to do it. As you said, Mark, the ivory-tower people were too proud to bow down to the ordinaries. This was a pity.
But things have started to change. Now more and more academic researchers are beginning to develop more user-friendly tools for users or developers who do not know much about RDF. For example, in our research lab we are building Semantic Web applications that encapsulate all the RDF expressions under the cover and provide users or external developers with an easy-to-understand interface. So long as users know the very basics of RDF, i.e. the RDF triple subject-predicate-object, they can start to practice Semantic Web. Many other labs have started to build similar tools too. So, you see, things are getting better.
I believe that in the future nobody really needs to know the details of RDF unless they are professional back-end developers. For ordinary users or front-end developers, they will have higher level specifications and can totally avoid the annoying math foundations at all.
So, take it easy. ;-)
-- Yihong
Posted by Joe on October 18, 2007 at 6:39am
While no competitor for RDF in any way, microformats can be a stepping stone to the "semantic web for the masses".
I was starting to type my thoughts here, but it grew too lengthy so I posted a blog instead:
http://squio.nl/blog/2007/10/18/from-microformats-to-rdf/
Posted by VidiMonkey on October 18, 2007 at 7:59am
A semantic web would first require some sort of unity format, but the problem is that the format needs to be extensive enough to allow for expansion while still being simple enough that end users with little to no programming background can understand it. I'm not sure this can be achieved, especially since a lot of companies are vying for their own patents and are unwilling to cooperate.
Posted by russ on October 18, 2007 at 9:46am
Personally I don't see the point in yet more formats, or focusing on them when plain old XML/json works just fine.
Good points, nice article, thanks.
Posted by Kelly on October 18, 2007 at 10:11am
The thing I really hate about Microformats - the fact it uses the class attribute. This to me is an ugly hack. I shouldn't be forced to use certain class names to add semantic meaning to my HTML. I love the XHTML 2 idea of adding in the "role" attribute.
I agree RDF is best.
Good article.
Posted by Rob on October 18, 2007 at 11:53am
You didn't write it, but the Digg description says "Plain and simple reasons as to why Microformats will never replace RDF or be part of the Semantic Web." [emphasis mine]
That I object to. GRDDL is a great way to make use of microformats, and bring them into the semantic web, and IMHO will be the technology that enables the semantic web's success.
Posted by Eric Monse on October 18, 2007 at 12:57pm
You really can't blame microformats people for the confusion.
Posted by James on October 19, 2007 at 2:05pm
@Mark:
Hi Mark, thanks for taking the time to write such a long response. As you said, RDF did come from the ivory towers, but that isn't a bad thing. In fact, the W3C is the best organization for tackling a vision as serious and intricate as the Semantic Web. Ontology languages and triple-stores are still very new terms to most developers, just as HTML and later CSS would have been to developers in the late 1980s.
I don't feel that the terms are to blame. The reason you don't see these terms being used by your average developer is because they are still too "new." These terms represent standards and concepts on the bleeding edge of Web evolution. Therefore it's natural to continue to see a trend of developers putting-off learning them, until a later date when they are practical and can be used daily.
To answer your question, a triple-store is another kind of database. Much like there is a difference between a column store and an RDBMS, a triple-store is intended to store information in a different manner (optimized for triples) and therefore has its own unique name. Architectures change, and we shouldn't resist change simply because it means having to learn something new.
I hate when people fight against something simply because it requires more learning. If you've hit the wall with the number of languages you can learn, please don't try to force us to only advance as far as the lowest common denominator chooses to go.
I cannot speak for Semantic Web authors. I think the reason we are not seeing more books being published is because we're neither full-force with the development of the Semantic Web, nor has it reached its peak in popularity, nor have there been many changes in the RDF specification for years. Currently we are tackling many of the higher-level issues with the Semantic Web.
I thank you for your suggestions; I consider them valuable insight from the Microformats community. I hope to see more Semantic Web developers take the paths you described.
As the Semantic Web continues to gain traction and as we slowly introduce more developers to RDF, OWL, SPARQL, etc. we will see people realize the difference between "semantic XHTML" and the benefits a true Semantic Web brings.
Posted by James on October 19, 2007 at 2:06pm
@Joe
I agree, I've written in the past that I think Microformats will be a stepping stone towards the Semantic Web. Glad to see we agree :)
Posted by James on October 19, 2007 at 2:11pm
@VidiMonkey
Not all serializations of RDF are complex. RDF/XML seems to be the one people have the most trouble with, but there are others such as Turtle and Notation3 that are easier to grasp.
Posted by James on October 19, 2007 at 2:16pm
@russ
XML and JSON are not 1:1 comparisons to RDF or Microformats. RDF has a serialization built atop XML, and JSON has nothing to do with what we've been talking about.
Posted by James on October 19, 2007 at 2:18pm
@Kelly
I agree with you that using the class attribute is lame. I don't like it because it can interfere with your design if the class in question is already used on your site.
Posted by James on October 19, 2007 at 2:21pm
@Rob
GRDDL is an excellent way to "harvest" the information from Microformats. This is another reason why I think Microformats will help bring about the Semantic Web; by getting users used to the idea of "metadata sprinkling" and also by providing information that can be stored in RDF at a later time with little trouble.
Posted by Luigi Montanez on October 19, 2007 at 8:50pm
I completely agree with Mark here. I'm a web developer and I really want to see a semantic web, but I have absolutely no clue why the W3C's Semantic Web exists, or why it takes RDF, OWL, and SPARQL to get there.
It's about working solutions. For example, there's been a huge push lately for portable social networking, or social internetworking, and the solutions proposed are very practicable. They rely on OpenID, Microformats, and (more recently) OAuth.
RDF was invented in the 90's. OWL has been around since early 2004. SPARQL is really the only truly new technology. But all those technologies mentioned above that are being proposed for portable social networking didn't rely on the W3C to hash things out for years, and from my view they're infinitely more useful because of it.
Web developers aren't adopting the W3C's Semantic Web technologies because they're too lazy to learn something new. Since the OWL was finalized in 2004, Ruby on Rails developers have taught themselves the language of Ruby, the conventions of Rails, RESTful application design, and many also have learned about Microformats and OpenID. It's all about practical use and helping us to build better web apps for our users. The W3C's Semantic Web just doesn't offer that to us.
Posted by James on October 20, 2007 at 8:45am
@Luigi
Luigi: "They rely on OpenID, Microformats, and (more recently) OAuth."
You forgot that social network portability and the social graph rely on FOAF (Friend of a Friend), an RDF vocabulary. Don't believe me? I copied and pasted a few quotes directly from SixApart's now famous "We are Opening the Social Graph" article:
"So we've created an experimental demo based upon open technologies OpenID, the Microformats hCard and XFN, and FOAF that allow you to see your entire network of relationships in one place" (emphasis mine)
"This is made possible through the combination of technologies like XFN and FOAF, which together can describe who you know and how you know them. TypePad, LiveJournal and Vox produce FOAF (and soon XFN) automatically, and Movable Type has always had this capability." (emphasis mine)
"Finally, if you manage a social networking service, we strongly encourage you to embrace OpenID, hCard XFN, FOAF and the other open standards around data portability" (emphasis mine)
It seems as though they really consider FOAF necessary for this to work. They seem like they've probably thought about it more than you have.
Luigi: "But all those technologies mentioned above that are being proposed for portable social networking didn't rely on the W3C to hash things out for years, and from my view they're infinitely more useful because of it."
Microformats are not a technology, and you didn't mention anything that the W3C didn't have a hand in creating. Microformats are a design pattern at best. They use XHTML, a W3C standard, and a select few keywords in the class attribute to half-assed string together "semantics."
Please, keep continuing to try to confuse would-be RDF developers so that you can snatch an audience for your weak alternative to RDF. It's only setting us back further. You can see there is something greater than Microformats for what you're trying to get Microformats to do. Why can't you just learn it? It's really not that hard. I'll assume you already know XML syntax; is the learning curve that great?
Luigi: "Web developers aren't adopting the W3C's Semantic Web technologies because they're too lazy to learn something new"
Next sentence:
Luigi: "...Ruby on Rails developers have taught themselves the language of Ruby, the conventions of Rails, RESTful application design, and many also have learned about Microformats and OpenID"
So Web developers are too lazy to learn RDF, but are not too lazy to learn Ruby, Ruby on Rails, RESTful application design, Microformats, and OpenID?
Luigi: "It's all about practical use and helping us to build better web apps for our users. The W3C's Semantic Web just doesn't offer that to us."
Honestly, I don't think you'd know the difference if it hit you in the face.
Posted by Luigi Montanez on October 20, 2007 at 10:40am
I'm not sure why you feel the need to personally attack me. Yes, you're right about 6A using FOAF, and I hope that 6A does use it successfully to open up the Social Graph as they're trying to do, so that web developers can see the practical benefit of RDF.
You missed my point about technologies web developers have chosen to learn. It's about practical use, not laziness or some illogical unwillingness to learn it. The W3C's Semantic Web technologies are just not practical to implement. We don't see their immediate benefit right now. We don't see what the benefit is by dealing with RDF, OWL, and SPARQL. Any material online about it is mired in a thick swamp of jargon that's impossible to wade through.
In other words, it's an evangelism and marketing problem, and insulting the intelligence and learning ability of developers who are trying to understand the W3C's Semantic Web is not a very good way to evangelize it. I want to learn about its benefits and I want to understand why it's a good technology to use. Accusing me of being dense and trying to confuse my fellow developers isn't exactly welcoming.
Posted by Yihong Ding on October 21, 2007 at 12:57pm
@ Luigi,
Please don't take anything personally. Discussion so far is healthy and helps everybody understand more about the strengths and shortcomings of current technologies. I don't think anyone is personally attacking anybody else, and please don't think of arguments in this way.
Besides, I agree with many of your viewpoints that the W3C falls short in explaining their technologies in plain words. This is a problem causing the slow adoption of the Semantic Web in reality.
In fact, not only you but also professional semantic web developers themselves may not even understand what these W3C technologies really mean. For example, the most recent Twine is advertised by Radar Networks to be a Semantic-Web product or a Web-3.0 product. But on a careful look at its beta, Twine indeed fails to reflect the spirit of the next-generation web. The spirit of Twine still stays with the philosophy of Web 2.0, though it uses the W3C technologies of the Semantic Web.
You see, technologies do not mean everything. People must first have vision and then they can really leverage the technologies to a new level. If you'd be interested, I would recommend my study of Twine to you. You can find it at:
http://yihongs-research.blogspot.com/2007/10/twine-first-impression.html
cheers, and take it easy. Let's keep a healthy and constructive discussion environment here at SemanticFocus.
-- Yihong
Posted by James on October 21, 2007 at 1:41pm
@Luigi:
Sorry if you took what I said personally, I'm not trying to lash out at anyone, despite the fact that I can sometimes get a little heated.
@Yihong:
I agree that the Semantic Web vision and its associated technologies are not the easiest to digest. We should focus more effort on lowering the barrier of entry for normal users to get acquainted with Semantic Web standards.
Posted by Dave Kor on October 22, 2007 at 1:42am
The main problem I have with RDF is that it is in the position SGML was in about two decades ago. SGML ended up being replaced by a much simpler, less elegant HTML. Furthermore, for all its mathematical soundness RDF fails at representing semantics simply because of the disambiguation problem. Anyone who has done any research on Word Sense Disambiguation would tell you that semantics as a whole IS ambiguous and highly contextual. It is part of the reason why successful WSD systems have mainly been statistical learning methods. I'm afraid a formal, mathematical RDF would simply never be able to solve this issue and thus make it very difficult for the semantic web to become a reality.
Posted by James on October 22, 2007 at 8:50am
@Dave:
..two decades later we're trying really hard to clean-up after our laziness.
Posted by fauigerzigerk on November 1, 2007 at 2:02am
My complaint is that OWL is not powerful enough to express simple everyday knowledge. For instance, it doesn't let me define the class of all phones that are more expensive than the iPhone, or the class of people who earn more than the average income, or the class of first-born children. These are things I can easily do with any relational DBMS by defining a view. As far as I'm aware, I cannot define OWL classes using SPARQL. So one very important application of RDF technologies, publishing databases on the web, basically falls flat. Yes, I can still publish the data or even a SPARQL query interface, but I cannot publish all the knowledge contained in my relational schema as an ontology.
Posted by nick on December 30, 2007 at 9:22am
you're contrasting apples and oranges; why can't microformats and rdf just coexist in a sweet little symbiotic relationship; hmm; maybe they already do
Posted by know-not-sensationalist on December 30, 2007 at 9:30am
oh please don't snuff microformats yet, I'm still using them. oh please please don't take away my rel="tag"
Posted by James Simmons on February 21, 2008 at 8:41am
@nick,
Actually, that's about how I feel about the two.
Posted by Maros Ivanco on June 25, 2008 at 7:52am
@James: ... and without the "laziness", as you call it, there would be no clean up needed - there would be no web :-). For a technology, getting adopted is even more important than just being conceived. Furthermore, it is not possible to jump from the stone age to space flight.
RDF became a recommendation in 2004. The only widely deployed application in 2008? RSS. Something here really sucks. And no, I do not think it is the math education level of the developers. RDF IS very simple (forget the ivory tower-speak).
So, we have the very simple resource description technology (RDF), four years old, and it fails to get adopted to describe documents on the web. The question we should ask is not what RDF can or cannot do, what is its potential, or how it relates to other technologies. The question we should ask is why it has failed. Microformats are just living evidence of the failure.
I think the RDF folks should think more about constructive criticism like Mark Murphy's.