The journey from now to the Semantic Web is a long one. What the current version of the Web gives us is billions of documents totaling terabytes of data. This data is usually found within HTML pages composed mainly of non-validating markup and very little, if any, metadata.
While there are billions of documents on the Web that contain no metadata whatsoever, there is one shining star of hope: Natural Language Processing. NLP can be used to sift through the "garbage" data to extract coherent statements about the information held within.
Natural Language Processing and the Semantic Web
This entry is a response to "I will never support the Semantic Web" by Brian of d'bug.
I'm getting tired of reading about how the Semantic Web is some kind of pipe dream that will never be realized. The Semantic Web is completely and entirely within our technological reach. People may have been given the impression that we cannot create the Semantic Web because of its complexity, the number of years it has been in development, or even the unanswered questions that still exist for certain problems we will face. These are valid reasons to doubt our progress, but progress is certainly what we are making.
Some People Will Never Support the Semantic Web
Natural Language Processing is very important to the Semantic Web. Development of language processing algorithms will increase as better and smarter NLP agents are used to scour silos of raw textual data for semantic meaning. The addition of NLP Web services to the Web will give rise to new and innovative mashups. An example mashup could be a service that uses a language processing agent to read a news article about the Apple iPhone and:
Future value paradigms of the Semantic Web
Last night I had an interesting conversation with an online acquaintance about the Semantic Web. I was surprised to find that the mere mention of the name "Semantic Web" sent him into a five-minute rant about how much he disliked everything to do with it. His biggest qualm was with what he considered to be the empty promises made by promoters and supporters of the Semantic Web. One example promise was that the Web would be transformed into an artificial intelligence that would think and act independently from humans.
Misconceptions of the Semantic Web versus reality
A lot of you emailed me asking where to find more videos, so I'm delivering the goods. I've expanded the previous list from a paltry 17 to a remarkable 302, and I've included podcasts this time! There were so many videos I had to break them up into different categories for easier skimming. There are no duplicates; however, I did place some videos into more than one category when I felt it was appropriate. This list is monstrous. Enjoy.
302 Semantic Web Videos and Podcasts!
I like to consider myself fair and balanced when speaking about most topics. To educate the uneducated and to balance things out a bit, I have compiled a list of 5 problems we will likely run into when we reach the Semantic Web. Each problem is a side effect of advances in technology, rushes to fill new niches, or both of those combined with the desire to make a quick dollar.
5 Problems of the Semantic Web
I have 3 interesting links that you need to check out. The first two are products for discovering and storing metadata, natural language processing, and much more. The third link goes to a post on the Geospatial Semantic Web Blog which gives us an update on Metalink's ability to map its descriptions into RDF.
Mini-roundup of links I think you need to check out
For just about every area of research there exist documents online describing background information or techniques to accomplish a task in that domain. These documents are often referred to as white papers, provided their content is technical or research-oriented. The information held within white papers is essentially accessible by humans only, because machines are not able to read and comprehend text in the same way humans can. If machines were able to read white papers and extract information in the same way humans can, we would be able to store each fact and piece of knowledge from the documents. This method of indexing would facilitate much more detailed searches, allowing users to search by topic, theory, conclusion, methods, citations, references, etc.
Extracting Information from White Paper Text
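As a rough illustration of the indexing idea in the white paper entry above, here is a minimal Python sketch. It assumes, purely for the example, that papers are plain text with numbered section headings such as "2. Methods" and bracketed numeric citations such as [3]. The function names and sample papers are hypothetical, and real white papers would need far more robust language processing than two regular expressions.

```python
import re
from collections import defaultdict

# Assumed conventions (not from any real corpus): numbered headings
# like "2. Methods" and bracketed numeric citations like "[3]".
HEADING = re.compile(r"^\d+\.[ \t]+(.+)$", re.MULTILINE)
CITATION = re.compile(r"\[(\d+)\]")


def extract_facts(text):
    """Pull simple, searchable facts out of white-paper text."""
    return {
        "sections": [h.strip().lower() for h in HEADING.findall(text)],
        "citations": sorted({int(n) for n in CITATION.findall(text)}),
    }


def build_index(papers):
    """Map each (field, value) pair to the ids of papers containing it."""
    index = defaultdict(set)
    for paper_id, text in papers.items():
        for field, values in extract_facts(text).items():
            for value in values:
                index[(field, value)].add(paper_id)
    return index


# Hypothetical sample documents.
papers = {
    "paper-a": "1. Introduction\nPrior work [1] and [3].\n2. Methods\nWe follow [3].",
    "paper-b": "1. Methods\nSee [2].\n2. Conclusion\nDone.",
}
index = build_index(papers)

# Search by section name or by cited reference number.
papers_with_methods = index[("sections", "methods")]
papers_citing_3 = index[("citations", 3)]
```

Storing each extracted fact under a (field, value) key is what lets a search ask for "papers with a methods section" or "papers citing reference 3" rather than only matching keywords.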
It seems as though nothing short of a new buzzword can stop the burst of activity in the vertical search market, and who are we to complain? Vertical search engines differ from their horizontal brethren (which attempt to index the Web as a whole) by focusing on a single topic or niche and indexing only the information on the Web that relates to it. Often, a VSE can deliver results with much greater relevancy and accuracy than major horizontal players like Google, Yahoo, and Microsoft.
Vertical Search Engines are Domain Knowledge Silos
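To make the vertical versus horizontal distinction in the entry above concrete, here is a small Python sketch of a topical filter that decides which documents a vertical engine would index. The vocabulary, the threshold, and the scoring rule (the share of words that belong to the topic vocabulary) are assumptions chosen for illustration, not a description of how any real search engine works.

```python
import re


def topical_score(text, vocabulary):
    """Fraction of the words in `text` that appear in the topic vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in vocabulary)
    return hits / len(words)


def select_for_vertical(documents, vocabulary, threshold=0.2):
    """Return ids of documents that are on-topic enough to be indexed."""
    return [
        doc_id
        for doc_id, text in documents.items()
        if topical_score(text, vocabulary) >= threshold
    ]


# Hypothetical topic vocabulary and documents.
vocabulary = {"semantic", "rdf", "owl", "ontologies"}
documents = {
    "a": "Semantic web uses RDF and OWL ontologies",
    "b": "Cooking pasta with tomato sauce",
}
selected = select_for_vertical(documents, vocabulary)
```

A horizontal engine would index both documents. The vertical one keeps only those that clear the threshold, which is how it trades breadth for relevancy.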
Jim Rapoza at eWeek has an opinion about gaming the Semantic Web, regarding companies and developers that are using the Semantic Web label inappropriately. He makes a good point worth mentioning: when an innovative new idea comes along and gets popular enough, it is commonplace to see vendors and companies take some of the concepts and strategies of the idea and try to adapt them, but they are often not true to the idea's core principles (either purposely or accidentally).
Misrepresenting the Semantic Web
