There's a lot of talk about new search engines and the promising technologies behind them. One technology that has fairly recently been applied to Web search is natural language processing (NLP). NLP allows search engines such as Hakia and Powerset to return results based on the query's meaning rather than relying on keyword distribution as a means of identifying relevant Web documents.

Continue reading Stochastic (statistical) search is on the way out

JUN 15th 2007

Semantic search has two legs

Published 1 year ago by Yihong Ding

The discussion of semantic search has gradually become popular. Not long ago, semantic search was thought to be little more than a dream. Today, optimistic researchers have started to believe it may be possible in the near future. Very recently at Read/WriteWeb, Dr. Riza C. Berkan, the CEO of Hakia (a company that describes itself as performing "semantic search"), posted an article about semantic search that attracted much attention. Although I agree with the post, here are more thoughts about semantic search.

Continue reading Semantic search has two legs

MAY 10th 2007

Right now there is more content being created than can be consumed. You might say "but all content gets consumed eventually, by someone." This is generally true and I completely agree. However, how much of that information do you consume yourself? I will assert that it is a very small slice of the pie. Even if you focus on a single topic, there are simply too many publications. Try searching "Semantic Web" on Technorati or Bloglines to see just what I mean. It's a never-ending flow of information. At the Web's current rate of expansion it will become harder and harder to keep up with it all.

Continue reading So much information, so little time for it all

It seems as though nothing short of a new buzzword can stop the burst of activity in the vertical search market, and who are we to complain? Vertical search engines differ from their horizontal brethren (who attempt to index the Web as a whole) by focusing on a single topic or niche about which to index information from the Web. Often, a VSE can deliver results with much greater relevancy and accuracy than major horizontal players like Google, Yahoo, and Microsoft.

Continue reading Vertical Search Engines are Domain Knowledge Silos

I have 3 interesting links that you need to check out. The first two are products for discovering and storing metadata, natural language processing, and many more things. The third link goes to a post on Geospatial Semantic Web Blog which gives us an update on Metalink's ability to map its descriptions into RDF.

Continue reading Mini-roundup of links I think you need to check out

MAR 11th 2007

For just about every area of research there exist documents online describing background information or techniques to accomplish a task in that domain. These documents are often referred to as white papers, provided their content has a technical or research orientation. The information held within white papers is essentially accessible only to humans, because machines are not able to read and comprehend text in the same way humans can. If machines were able to read white papers and extract information as humans do, we would be able to store each fact and piece of knowledge from the documents. This method of indexing would facilitate much more detailed searches, allowing users to search by topic, theory, conclusion, methods, citations, references, etc.

Continue reading Extracting Information from White Paper Text

I was reading a blog entry by Matt at PeerPressure that brings up a point worth sharing. One of the biggest problems supporters of the Semantic Web initially faced was, as Matt stated, the classic tech Catch-22. His explanation is:

Continue reading To reach the Semantic Web, must we already be there?

MAR 6th 2007

It isn't difficult to imagine that 10 or even 30 years from now, the Web will be a dramatically different place. If you look at how quickly we've progressed in the last decade you can see that technology has a way of developing quite rapidly. It has been my observation that Web technology, specifically in the area of Web standards, seems to have always moved more slowly than other areas of technology. This is due to the immaturity of the medium; the World Wide Web can still be considered in its infancy. Another contributing factor to slow progress has been the difficulty of getting browser vendors to cooperate with each other and follow standards properly.

Continue reading How long will the Web remain as we know it?

The other day I was thinking, wouldn't it be interesting to see a site come out that essentially acts as a broker or mirror of metadata from other sites? You could go to this site, enter a URL and have the metadata from that page presented to you in clean, crisp XML. It would be even better if this were turned into a Web service with an API that was free for anyone to use. I would imagine there would be quite a bit of mashup potential!

Continue reading Does the World Need a Metadata Extraction Service?
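
As a rough illustration of the idea in this excerpt, here is a minimal sketch of what such a service might do once it has fetched a page: pull out the title and `<meta>` tags and hand them back as XML. Everything here is an assumption made for illustration, not something from the original post: the names `MetaTagCollector` and `metadata_as_xml`, the shape of the XML output, and the choice to take the HTML as a string rather than fetching the URL.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class MetaTagCollector(HTMLParser):
    """Hypothetical helper: collects <title> text and <meta> name/content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = []  # list of (name, content) tuples, in document order
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Accept both name="..." and property="..." style meta tags.
            name = attrs.get("name") or attrs.get("property")
            content = attrs.get("content")
            if name and content is not None:
                self.meta.append((name, content))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def metadata_as_xml(url, html):
    """Return the page's title and meta tags as an XML string (assumed layout)."""
    collector = MetaTagCollector()
    collector.feed(html)
    root = ET.Element("metadata", {"source": url})
    ET.SubElement(root, "title").text = collector.title.strip()
    for name, content in collector.meta:
        ET.SubElement(root, "meta", {"name": name}).text = content
    return ET.tostring(root, encoding="unicode")
```

A real service would also need to fetch the page, handle character encodings, and decide which other kinds of embedded metadata to support; this sketch only covers the extraction-to-XML step.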

FEB 28th 2007

Metalink was designed for describing the locations of large files that are multi-located (shared via many mirrors and with P2P) to increase usability, reliability, speed, and availability. If a server goes down during a download, download programs can automatically switch to another mirror. Or segments can be downloaded from different places at the same time, automatically, which can make downloads much faster.

Continue reading Metalink combines FTP and HTTP with optional P2P
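
The mirror-switching behaviour described in this excerpt can be sketched in a few lines. This is a simplified illustration only: it does not parse the Metalink format, it retries the whole file rather than resuming mid-download, and it does not do segmented downloads. The function name `download_with_failover` and the caller-supplied `fetch` callable are assumptions made for the example.

```python
def download_with_failover(mirrors, fetch):
    """Try each mirror in order and return (mirror, data) from the first success.

    `fetch` is a caller-supplied function (hypothetical) that takes a mirror
    URL and returns the file contents, raising OSError on network failure.
    """
    errors = []
    for mirror in mirrors:
        try:
            return mirror, fetch(mirror)
        except OSError as exc:
            # Remember the failure and move on to the next mirror.
            errors.append((mirror, exc))
    raise OSError(f"all {len(mirrors)} mirrors failed: {errors}")
```

Passing the fetch function in as a parameter keeps the failover logic independent of the transport, so the same loop could sit in front of an HTTP or FTP client.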
