Feed: Comments for Semantic Search: The Myth and Reality - AggScore: 57.6




Semantic Search: The Myth and Reality

Date Published: May 29, 2008 - 3:15 pm

First!



Date Published: May 29, 2008 - 3:33 pm

Alex:

Excellent points. Thinking in terms of potential outlets for monetization (advertising and licensing), I believe there are three fundamental problems with how semantic search has been positioned:

a.) In the consumer search space, Google offers no ROI figure to consumers (as it does to advertisers). To the best of my knowledge, Google has never published precision or recall figures and probably never will. It is *discernibly* good enough, convenient, lightning fast, *comprehensive* (hence its name is now as much a verb as a noun), and free. Google doesn’t promise much. But what it promises, it does *very* well. And it does the job for free. Google's relevance was significantly better than that of its predecessors without requiring any change in behavior - this is the critical point. PageRank produced a very noticeable leap in quality *with the exact same user model as its predecessors*. It was classic "embrace and extend."

Promises are tempting but dangerous. Until there is another such leap (that the average Joe can notice *without* any change in searching behavior), Google will remain King.


b.) Taking ROI into account, most consumers (and business users) don’t care about the "how." They care about the "what" and the "why." Some IT managers care about the "how" either because they are paid to care (to provide needed expertise and due diligence for internal ops) or because they are enthusiasts that track the latest trends in technology. But at critical mass, most people don’t care.

Semantics represents the "how." The industry should, instead, focus on the "what" and the "why." Once there is a clear business case for the "what" and the "why," the market will determine the best "how" that meets the objectives of the "what" and the "why."

As an example, Google *sells* better targeting as the "what" and the "why." Search is the "how." Search is merely a means to an end - the end is Google's value proposition to advertisers. If semantic search represents a clear leap in terms of better targeted, more quantifiable advertising, advertisers will take notice. But again, the operative phrase is *targeting* (the "what") NOT semantics (the "how").


c.) Businesses care *primarily* about business processes, not enabling technologies. Business processes have ROIs, budgets, and buyers. What is the ROI of semantics? Pivoting the conversation this way is a non-starter for an enterprise buyer with highly competitive budget line items. Businesses do indeed buy infrastructure (e.g., storage, routers, etc.) but only because such infrastructure supports existing business processes which have (or *should* have) measurable or perceived ROIs. Semantics must either add clear value to existing business processes, facilitate the creation (yes, creation) of new business processes that might not be possible or practical absent semantics, or clearly constitute infrastructure that underlies enhanced or novel business applications.


I am confident though that this positioning problem will be addressed - in the near future. Any new technology goes through this phase. Eventually, there will be big winners, with a diverse range of business models.

Cheers.



Date Published: May 29, 2008 - 5:20 pm

Nosa,

Great insight and addition to the post - thank you!

Alex



Date Published: May 29, 2008 - 7:10 pm

The value in semantic search is simply about creating ontologies that allow for results refinement that can deliver a much higher degree of relevance than currently possible with query refinement. This can have huge value to users and advertisers.

I would also caution people to not see semantic search engines as destination sites the way people regard search engines. This is a whole new ball game.



Date Published: May 29, 2008 - 7:33 pm

Alex - one of the most thoughtful posts on semantic search I've read.
We'd be better off coming up with a different name for it than Semantic Search, since search tends to position it incorrectly to users who are accustomed to Google. More importantly, we all have to be careful not to overhype semantic search. I started in the text analytics space in 2000 with ClearForest and too many companies (including us) were overpromising and under-delivering.

For ClearForest and others, rather than search, we attempted to position it as "business intelligence for text", but that drove users to expect simple dashboards a la Business Objects, which is also not the right paradigm.

The analytics are what makes semantic analysis useful. So, tools like visualization that can surface relationships in the tagged content become ways to navigate large bodies of semantic knowledge. The more we think about it as an analysis tool, the more likely it will be that we come up with the appropriate problems for it to solve.

For those familiar with structured databases, I look at semantic analysis more like a SAS tool than a Business Objects dashboard. If you think about the data problems that SAS can solve, they're probably the equivalents of what semantic analysis can solve for text.



Date Published: May 29, 2008 - 8:34 pm

to really do the meaning thing, we will have to get away from the alphabet...

either ideograms, categorization by color, the five senses, in short, a different kind of coding system...

prior to that, i think this overall problem would be solved very differently if the starting place was, say, mandarin. a calligrapher can communicate many many layers of meaning with just four symbols, impossible with four roman letters.

and the learning curve for any new "system" will probably not be as long as feared, so simple a child can do it



Date Published: May 29, 2008 - 9:29 pm

The future of search will be a combination of semantic search, which is symbolic (computer manipulation of symbols or objects), and numeric-based search (manipulation of numbers, i.e., high-speed number-crunching), e.g., Google, LiveSearch, etc. The two will complement each other. Currently, numeric-based search still outperforms semantic search in terms of recall and precision.

The current Google PageRank algorithm only computes a 2D (rows and columns) frequency matrix of links (outward from and inward to a page), but multi-dimensional (greater than 2D, such as 3D, 4D or more) matrix analysis (called tensor calculus) is starting to appear from the community of data analysts, which I quoted in my message on the thread: sezwho acquires tejit semantic platform. I haven't seen any Tensor-PageRank yet, but it won't be long before it appears in the literature. However, the HITS algorithm (similar to PageRank) has been tensorised (3D matrix) as described in the abstract of the following paper:

Abstract The TOPHITS Model for Higher-Order Web Link Analysis

The third dimension in addition to the outward & inward links is the anchor text of the links between them. This is only the beginning of tensorisation of current algorithms. Imagine a search engine that is based on say 20 tensor dimensions?
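
As a rough illustration of the 2D versus 3D distinction described above, here is a minimal sketch in Python. It is not Google's implementation and it is not the TOPHITS method: `pagerank` is a textbook power iteration over a page-to-page link structure, and `link_tensor` only builds the sparse (source, target, anchor term) counts that a tensor method would start from. The function names, the damping value of 0.85, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration; links[src] is the list of pages src links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                # Split this page's rank evenly among its outward links.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[src] / n
        rank = new_rank
    return rank


def link_tensor(anchored_links):
    """Sparse 3D counts keyed by (source page, target page, anchor-text term)."""
    tensor = defaultdict(int)
    for src, tgt, anchor in anchored_links:
        for term in anchor.lower().split():
            tensor[(src, tgt, term)] += 1
    return dict(tensor)


# Hypothetical three-page web.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
tensor = link_tensor([("a", "b", "semantic search"), ("a", "b", "search engine")])
```

The 2D case sees only that page "a" links to page "b"; the tensor also records which words the link used, which is the extra dimension the comment refers to. Factorizing such a tensor is a separate step that this sketch does not attempt.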

To avoid the shortfall of numeric-based search in today's environment, one can use a guided search (interactive search), i.e., start with a narrow bag of words, then the engine will refine the search and narrow it down to the target the user wants. Such guided search is described in the following paper:

Interactive Search Grouping - Search result grouping using Independent Component Analysis

See, the thing about semantic search's advantage is that it relies on the user to give it a full natural language sentence, as in Alex's example: What actor starred in both Pulp Fiction and Saturday Night Fever? Most users don't like typing a query like this, because it is too long to type. However, if it is voice-enabled search, then long phrases are no problem. This is where numeric-based guided search comes into the picture. Users can type in a short phrase and from there, the engine guides the user through feedback and more queries.

Finally, here is a good video from Peter Norvig, director of research at Google, from his talk at the Future of Search meeting organised by Berkeley in 2007. Interested readers should watch the video, as he raised interesting things about the future of search for Google.
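
The guided-search loop described here (short query in, suggested refinements out) can be sketched in a few lines of Python. This is only a toy under stated assumptions: it suggests refinements by counting which other words co-occur with the query terms, which is a much simpler technique than the Independent Component Analysis grouping in the paper named above. The function name `suggest_refinements` and the sample documents are hypothetical.

```python
from collections import Counter


def suggest_refinements(documents, query, top_n=3):
    """Suggest extra terms that co-occur with every query term in matching documents."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for doc in documents:
        terms = set(doc.lower().split())
        if query_terms <= terms:
            # Document matches the query: count its other terms once each.
            counts.update(terms - query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


docs = ["jaguar car dealer", "jaguar car price", "jaguar cat habitat"]
first_round = suggest_refinements(docs, "jaguar", top_n=2)
second_round = suggest_refinements(docs, "jaguar car", top_n=2)
```

Each round takes the user's short phrase and offers terms that would narrow the result set, so the user never has to type a full natural language sentence.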

Future of Search - 2007 : Peter Norvig



Date Published: May 30, 2008 - 2:34 am

Alex,

There is a big assumption here... that Google is not semantic search.

How do we know that? How do you know where to place Google on the "matrix" above?

I have no doubt Google are doing whatever they can - using whatever technology delivers the goods - to produce the best search experience for its users.

Unless PROVEN wrong, I think we can't just call Google Search the "semantic outsider" which is what I hear in your article.

Can we prove they're not hiding a lot of PLSI or whatever under the hood???

-Alister



Date Published: May 30, 2008 - 4:27 am

Alex, great post that is really helpful in getting to grips with a complex subject. The main takeaway for me was the idea that copying the Google search box is a recipe for failure. There are plenty of great user interfaces that can be constructed around a structured query. Google will never do that; it would damage their brand. They will put in all the smarts (semantic or whatever) under the cover to keep that 85% growing to 90%. The only way I can see to make a dent in that is to fundamentally change the value proposition for publishers and advertisers. Not sure how to do that of course (if I were, I would be keeping quiet about it).



Date Published: May 30, 2008 - 6:17 am

Bernard,

Absolutely, you are right. The main takeaway is that the search box invites people to enter old-style queries, and that is not going to be impressive.

Alex



Date Published: May 30, 2008 - 6:54 am

@Alister,

Google is not a semantic search engine today, at least not in the same way that others are trying to be. The main algorithm is based on frequency analysis and PageRank.

There is also a lightweight semantic analysis - for example, when you search for books or movies Google knows about these types of objects. But it does not appear to be using deeper semantics. Nor does it need to, because in a head-to-head comparison there is no advantage for the types of searches that people perform.

Alex



Date Published: May 30, 2008 - 6:57 am

Alex - very nice overview of a confusing topic area.

I'm skeptical that semantic search will be anything more than a niche technology anytime soon, for one reason: Most searchers don't dig further than the top three search results.

If people are happy with what they see in Google's top three search results, they aren't going to use advanced search or semantic search.

The focus of these companies probably needs to be on finding profitable niches or on turning this into background technology.



Date Published: May 30, 2008 - 9:38 am

Alex - great article, and close to home for Snooth. We're a vertical search engine, and so have some semantic-ish search functionality, yet we also use the plain ol' search box, and find that most queries don't take into account the full potential of our search algorithms.

--Philip



Date Published: May 30, 2008 - 11:35 am

No amount of RDF will let your computer answer "What is the best vocation for me now?", I agree.

But it can get you most of the way there - it wouldn't take much for your computer to be able to answer "What jobs are available in my area that match my skills A, B and C, my interests X, Y and Z and pay at least $$$?"
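
To make the job question concrete, here is a minimal Python sketch of that kind of structured query. It uses plain records rather than RDF, and every name in it (`Job`, `matching_jobs`, the sample data) is hypothetical. It also assumes one particular reading of "match": the job requires only skills the person has, touches at least one of the person's interests, is in the person's area, and meets the pay floor.

```python
from dataclasses import dataclass


@dataclass
class Job:
    title: str
    area: str
    skills: frozenset  # skills the job requires
    topics: frozenset  # subject areas the job involves
    salary: int


def matching_jobs(jobs, area, skills, interests, min_salary):
    """Return titles of jobs that fit the area, skills, interests and pay floor."""
    skills, interests = set(skills), set(interests)
    return [
        job.title
        for job in jobs
        if job.area == area
        and job.skills <= skills       # every required skill is one we have
        and job.topics & interests     # at least one shared interest
        and job.salary >= min_salary
    ]


jobs = [
    Job("Search engineer", "Boston", frozenset({"python", "sql"}), frozenset({"search"}), 90000),
    Job("Data analyst", "Boston", frozenset({"sql"}), frozenset({"finance"}), 70000),
    Job("Ontology editor", "Austin", frozenset({"rdf"}), frozenset({"search"}), 80000),
]
result = matching_jobs(jobs, "Boston", {"python", "sql", "java"}, {"search", "wine"}, 80000)
```

The point of the comment holds in the sketch: once jobs, skills and interests are structured data, the question becomes a filter, whereas "What is the best vocation for me now?" has no such filter to write.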



Date Published: May 30, 2008 - 12:27 pm