Proposed Search Engine(s)
(2007 Jan 14 blog post)
A couple of years ago (2005 Mar), I tried to propose to Google a major enhancement to their search engine. I got an automated reply --- essentially a non-reply.
The image above indicates the suggestion --- a search-words distance-apart number that the user can specify. Many web pages are huge and contain sections on many different topics. If this suggestion were implemented, as outlined further below, it would drastically reduce the number of useless 'hits' from large pages [such as Google blogspot.com pages] in most of my web searches.
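To make the idea concrete, here is a minimal Python sketch of such a distance-apart check on a single page. It is only an illustration under my own assumptions, not how any real search engine does it: the function names (tokenize, smallest_span, page_matches) and the simple word-splitting rule are hypothetical. It finds the smallest run of consecutive words that contains every search word, and accepts the page only if that run is no longer than the number the user specified.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a list of words (a crude assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def smallest_span(tokens, search_words):
    """Return the size, in words, of the smallest window of tokens that
    contains every search word at least once, or None if a word is missing."""
    wanted = set(w.lower() for w in search_words)
    counts = {}   # occurrences of each wanted word inside the current window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok not in wanted:
            continue
        counts[tok] = counts.get(tok, 0) + 1
        # Shrink from the left while the window still holds every wanted word.
        while len(counts) == len(wanted):
            size = right - left + 1
            if best is None or size < best:
                best = size
            left_tok = tokens[left]
            if left_tok in counts:
                counts[left_tok] -= 1
                if counts[left_tok] == 0:
                    del counts[left_tok]
            left += 1
    return best

def page_matches(text, search_words, max_span):
    """True if all search words occur within max_span consecutive words."""
    span = smallest_span(tokenize(text), search_words)
    return span is not None and span <= max_span

# A long page on two unrelated topics, separated by 50 filler words.
page = "jazz guitar lessons " + "filler " * 50 + "piano tuning"
print(page_matches(page, ["guitar", "lessons"], 10))  # words sit side by side
print(page_matches(page, ["guitar", "piano"], 10))    # words are far apart
```

A page that merely mentions both words somewhere, in unrelated sections, is rejected by the second call, which is exactly the kind of useless hit the suggestion is meant to remove.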
I found what I thought was an appropriate email address --- email@example.com. But their reply said to "register" at a Google "posting" web site and submit the suggestion there. Interesting --- the email address firstname.lastname@example.org does not accept suggestions. As Spock would say, it is not logical.
I did not have the time or energy to go through their registration dance to post the suggestion. The dance: Get a userid and password ... and try to remember where I hid the information (so that I can follow up on responses to the posting), as I go through computer and mail system migrations ... along with potentially 50 other registrations, if I responded to every such command to "register". So I let the suggestion to Google go, for the time being.
I am still, years later, just as frustrated by the massive number of non-pertinent pages that I get --- on doing almost any multi-word search, with any search engine.
So I am posting the suggestion openly now --- hoping that ANY search engine organization will take up this challenge. Are you listening AltaVista, A9, AOL, Ask.com, Clusty, Exalead, Gigablast, Google, Lycos, MSN, WiseNut, Yahoo, and others? Readers, please alert them.
I plan to periodically mail this suggestion to Google and others. Hence I am formatting this page to support printout with appropriate page breaks and other formatting.
[Actually, there have been a couple of attempts at implementing an enhancement like this. One was made by an essentially one-person search-engine development operation in the Netherlands --- walhello.com (Web+valhalla+hello). It had neither a very big database of web documents to search nor the huge server farm of an organization like Google.
The other attempt was limited to two options --- a fixed word span of 16 words, OR no limit on word span (the current, lamentable state of affairs). This (preliminary?) attempt is by a major search engine organization in France, exalead.com.
With Exalead, you can use the word NEAR between words in a search query --- to do a "proximity search". "The NEAR operator finds documents where the query terms are within 16 words of each other."
Note that the French and Dutch are not willing to resign themselves to using Google for all their searches. They know they can do better.
Hopefully, these two, and other searcher development organizations, are still working on this feature.]
The image at the top of this page (for a hypothetical search engine called Hoogle) gives the gist of the suggestion in a readily assimilable visual form.
To give some details of the suggestion, here is the text of the original proposal that I e-mailed to email@example.com on March 13, 2005.
Subject: Suggestion for search feature to blow competitors away [2005 Mar]
Dear Google Developers,
In doing searches on multiple keywords, I am continually getting many pages that do not apply --- because they are long pages (like pages with hundreds of mail responses, or a lot of information on many different subjects).
Data gathering (word location) considerations:
Although the 4 bytes for each keyword might increase the size of Google's database(s) by about 20%, the pay-back would be well worth it.
Cheers, a constant Google user (still looking for a better search engine)
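As a small aside on the "4 bytes for each keyword" remark in the proposal above: one way to read it is that each recorded word position is stored as a 4-byte unsigned integer. The Python sketch below shows that reading only. The function names are hypothetical, and the sketch says nothing about whether the 20% estimate is accurate.

```python
import struct

def pack_positions(positions):
    """Store each word position as an unsigned 4-byte little-endian integer."""
    return struct.pack("<%dI" % len(positions), *positions)

def unpack_positions(blob):
    """Recover the list of word positions from the packed bytes."""
    count = len(blob) // 4
    return list(struct.unpack("<%dI" % count, blob))

# Assumed example: positions of one keyword within one long page.
positions = [3, 57, 1042, 70000]
blob = pack_positions(positions)
print(len(blob), "bytes for", len(positions), "positions")
print(unpack_positions(blob))
```

Four bytes allow positions up to about 4.29 billion, far more words than any single web page would contain.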
2013 UPDATE:
I recently (2013 April) bought a book called "9 Algorithms That Changed the Future" by John MacCormick. That book points out, in the first chapter, on web search algorithms, that the position of words within web pages IS SAVED and accessible to search engines like Google. So there is no reason why they could not provide the facility suggested here --- if not on the main search page, then via the 'Advanced Search' link.
That chapter even points out that search engines like Google use the 'near' capability very heavily for their own purposes. Why they do not make that ability available to users is puzzling --- especially when it could cut searches that return millions of pages down to thousands of pages instead. A situation devoutly to be wished --- especially as the databases of web pages explode in size.
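For readers who want to see how saved word positions make a 'near' search possible, here is a minimal Python sketch. It is my own illustration under simple assumptions, not the book's code or any search engine's actual method: the names build_positional_index, min_gap and near_search are hypothetical, and the index is just an in-memory dictionary that records, for each word, the positions where it occurs in each page.

```python
import re
from collections import defaultdict

def build_positional_index(pages):
    """Map each word to {page_id: [word positions in that page]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        for pos, word in enumerate(words):
            index[word].setdefault(page_id, []).append(pos)
    return index

def min_gap(positions_a, positions_b):
    """Smallest distance between two sorted lists of word positions."""
    i = j = 0
    best = None
    while i < len(positions_a) and j < len(positions_b):
        gap = abs(positions_a[i] - positions_b[j])
        if best is None or gap < best:
            best = gap
        # Advance whichever list is behind, as in a merge step.
        if positions_a[i] < positions_b[j]:
            i += 1
        else:
            j += 1
    return best

def near_search(index, word_a, word_b, max_distance):
    """Return the pages where word_a and word_b occur within max_distance words."""
    pages_a = index.get(word_a.lower(), {})
    pages_b = index.get(word_b.lower(), {})
    hits = []
    for page_id in sorted(pages_a.keys() & pages_b.keys()):
        if min_gap(pages_a[page_id], pages_b[page_id]) <= max_distance:
            hits.append(page_id)
    return hits

pages = {
    "short": "guitar lessons for beginners",
    "long": "guitar " + "filler " * 100 + "lessons",
}
index = build_positional_index(pages)
print(near_search(index, "guitar", "lessons", 16))   # user-chosen distance
print(near_search(index, "guitar", "lessons", 200))  # a much looser distance
```

The point of the sketch is that once positions are stored, the distance check is a cheap pass over two sorted lists per page, so letting the user choose max_distance adds very little work to the query.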
Here is a page of web searcher sites for reference.
Posted 2007 Jan 14.