54 results for information extraction

Semantically-Interlinked Online Communities (SIOC for short) is a framework aimed at connecting online communities and discussions from blogs, forums, content management systems, mailing lists, and more. In the current Web, communities such as forums and blogs are like islands: they contain valuable information but are not well connected or queryable. SIOC allows you to connect these sites and enables the extraction of semantic information from any number of discussion platforms.

Continue reading Connect Discussions Between Blogs, Forums, and more with SIOC

A lot of you emailed me asking where to find more videos, so I'm delivering the goods. I've expanded the previous list from a paltry 17 to a remarkable 302, and I've included podcasts this time! There were so many videos that I had to break them up into categories for easier skimming. There are no duplicates; however, I did place some videos into more than one category when I felt it was appropriate. This list is monstrous. Enjoy!

Continue reading 302 Semantic Web Videos and Podcasts!

For just about every area of research, there exist documents online describing background information or techniques to accomplish a task in that domain. These documents are often referred to as white papers, provided their content is technical or research-oriented. The information held within white papers is essentially accessible to humans only, because machines are not able to read and comprehend text in the same way humans can. If machines were able to read white papers and extract information the way humans do, we would be able to store each fact and piece of knowledge from the documents. This method of indexing would facilitate much more detailed searches, allowing users to search by topic, theory, conclusion, methods, citations, references, etc.

Continue reading Extracting Information from White Paper Text

Eleven months ago I posted a short entry that posed the question of whether the world needed a metadata extraction service. I stated that the service could quickly become the largest repository of metadata (in the form of named entities and facts) on the Web if it stored the resulting metadata from each request. Open Calais seems to me to be the "metadata extraction service" I had in mind; it is a Web service that allows you to automatically annotate content and extract information like facts and named entities (people, places, organizations, and much more) from unstructured text. If that weren't enough of a good thing, Open Calais returns the metadata in RDF.

Although the question of whether we need it still hasn't been answered, I believe this service could be a catalyst for change towards Semantic Web standards if it is integrated into (or used to create plugins for) the multitudes of open source blogs and other CMS software. Open Calais opens the door to the possibility of lowering the barrier enough for everyday users to publish semantic content.

The Seesaw Effect of Algorithms vs. Data

Over the years I've noticed that the importance of algorithms and data tends to shift back and forth, depending on which is hardest to duplicate at the time (often from a business perspective). This effect seems to be caused by the availability of, or demand for, one side increasing or decreasing, shifting the balance of importance to the other. At one point the world of software was dominated by the proprietary. The organization with the best software (backend, algorithms, etc.) was the dominant entity, and data (from, say, a Web 2.0 perspective) was generally not the focus. This may have been partly the result of a mindset formed during an era with very little storage space and before mass user activity on the Web.

Continue reading Algorithms vs. Data: The Seesaw Effect

The journey from now to the Semantic Web is a long one. What we have on our hands with the current version of the Web are billions of documents totaling terabytes of data. This data is usually found within HTML pages composed mainly of non-validating markup and very little, if any, metadata.

While there are billions of documents on the Web that contain no metadata whatsoever, there is one shining star of hope: Natural Language Processing. NLP can be used to sift through the "garbage" data to extract coherent statements about the information held within.

Continue reading Natural Language Processing and the Semantic Web

Natural Language Processing is very important to the Semantic Web. Development of language processing algorithms will increase as better and smarter NLP agents are used to scour silos of raw textual data for semantic meaning. The addition of NLP Web services to the Web will give rise to new and innovative mashups. An example mashup could be a service that uses a language processing agent to read a news article about the Apple iPhone and:

Continue reading Future value paradigms of the Semantic Web

The value of a dataset may be determined by any number of factors; however, it is generally agreed that the data's accuracy, how difficult it is to re-create, its source, and other important factors can affect its value. As technology evolves to allow easier access to the information we require, the value of a dataset may decrease over time.

Continue reading The value of current datasets in the Semantic Web

I've rounded up over 60 Semantic Web blogs for your reading and subscribing pleasure! These blogs are just a portion of the sources being indexed regularly by the soon-to-be-launched Planet Semantic Focus. Enjoy!

Continue reading 60+ Semantic Web Blogs (List)

This entry is a response to "I will never support the Semantic Web" by Brian of d'bug.

I'm getting tired of reading about how the Semantic Web is some kind of pipe dream that will never be realized. The Semantic Web is completely and entirely within our technological reach. People may have been given the impression that we cannot create the Semantic Web because of its complexity, the number of years it has been in development, or even the unanswered questions that still exist for certain problems we will face. These are valid reasons to doubt our progress, but progress is certainly what we are making.

Continue reading Some People Will Never Support the Semantic Web
