Tuesday, October 9, 2007

Browsing or Searching

Today my cousin Martin sent me a link on how the Google search engine works. I have to say it is a very good paper, written by the founders of Google, Sergey Brin and Larry Page. Here is why I think the Google idea is so great:
  1. Sergey Brin and Larry Page identified the problem - It takes too much time to browse for information when the user really doesn't know where to look for it. I remember that in the mid-1990s I used to spend hours looking for HTML pages on the Yahoo! site.
  2. They identified the basic functional solution - Rather than spending hours browsing for information, users can query for results via keywords and get results within seconds.
  3. They identified the basic non-functional solution - The system gets a request, searches the HTML page indexes, which reside in a repository, and generates a result set of documents.
  4. They designed and developed the solution - They designed the Google search engine, which ranks results through various algorithms.
  5. They prototyped a solution
  6. Now it is a multibillion-dollar business
They identified a business problem, then designed and implemented a solution. Specifically, they identified the business problems associated with the way Yahoo!, Excite, AltaVista and others organized web pages. Yahoo! and the other first-generation sites should be credited for creating a directory of web pages. These first-generation sites gave most, if not all, web pages exposure to an inquisitive web user.
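
To make step 3 in the list above a little more concrete, here is a minimal sketch of a keyword index in Python. It is only an illustration of the general idea (an inverted index that maps words to pages), not Google's actual design. The page names, the sample text, and the function names build_index and search are all made up for this example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each keyword to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first keyword, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical repository of pages, invented for this example.
pages = {
    "a.html": "python tutorial for beginners",
    "b.html": "advanced python tips",
    "c.html": "cooking for beginners",
}
index = build_index(pages)
print(sorted(search(index, "python beginners")))  # prints ['a.html']
```

The index is built once, so each query only has to look up a few sets and intersect them instead of reading every page. That is roughly why a keyword query can come back in seconds while browsing takes hours.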

The question now to be answered: "What is next?"

Problems with Google's approach: Search results in Google are determined by Google's ranking algorithms, most notably PageRank. The same can be said of any other search engine, such as Convera or Autonomy, each of which relies on its own algorithms. The user of a search engine is at the mercy of that engine's search algorithms.
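
For readers who have not seen PageRank before, here is a rough sketch of the idea in Python: a page ranks higher when other highly ranked pages link to it. This is one common textbook formulation (iterating with a damping factor), not Google's production code. The tiny link graph is invented, and the sketch assumes every page that is linked to also appears as a key in the links dictionary.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page.

    links maps each page to the list of pages it links to.
    Assumption: every linked-to page is also a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web, invented for this example.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this small graph page "c" comes out on top because both other pages link to it. The user never sees or controls this calculation, which is exactly the point made above.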

Problems with Yahoo!'s directory approach: As Google recognized, when a user has to browse through pages to get to what he or she is looking for, it takes time (sometimes a lot of time). The benefit of the directory approach is that the user applies his or her own cognitive skills to decide which information is relevant and which is not.

The ideal solution is somewhere in between. Ideally, the user should get the information he or she wants in the shortest amount of time. Since each business domain and its processes can be so different, it would be ideal to map out the processes first and then identify the points in each process where the user needs to search or browse for information. Once those points have been identified, it would make sense to design a solution that is a hybrid of browsing and searching.
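
One possible shape for such a hybrid, sketched in Python: the user first browses to a category (using his or her own judgment), and the keyword search then runs only inside that category. This is just one way to read the idea, not a finished design. The categories, the pages, and the function name hybrid_search are all hypothetical.

```python
def hybrid_search(pages, category, query):
    """Return ids of pages in the chosen category that contain every keyword."""
    words = query.lower().split()
    matches = []
    for page_id, page in pages.items():
        # Browsing step: keep only pages in the category the user picked.
        if page["category"] != category:
            continue
        # Searching step: keep only pages containing all the keywords.
        text_words = set(page["text"].lower().split())
        if all(word in text_words for word in words):
            matches.append(page_id)
    return sorted(matches)


# Hypothetical directory of pages, invented for this example.
pages = {
    "a.html": {"category": "Programming", "text": "python tutorial for beginners"},
    "b.html": {"category": "Cooking", "text": "baking bread for beginners"},
    "c.html": {"category": "Programming", "text": "java tutorial"},
}
print(hybrid_search(pages, "Programming", "tutorial"))  # prints ['a.html', 'c.html']
```

With an empty query the function returns every page in the category, which is plain browsing. Adding keywords narrows the list, so the same function covers both ends of the range.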

In the future, I expect the Semantic Web and machine learning to be incorporated into web browsers or their equivalents, so that browsers render results according to each user's profile.
