
Search engine features and search techniques








When using a search engine, the hardest problem is coping with the huge number of results you get and deciding how much importance to give each of them.
    In fact, the efficiency of a search engine depends mainly on its ability to list the results of a search so that the most relevant ones rank highest.
    Its efficiency also shows in its capacity to interpret the query, which is a very difficult part of the job, since it is an automatic mechanism and not a human being endowed with intelligence.
    When we submit a keyword to a search engine, the poor server that has to do the job looks up every possible reference in its database and picks out all the occurrences that may be of interest to us.
    It then sorts them according to a criterion that depends on the ranking algorithm of the engine in question.
The main engines differ from one another in their criteria, which are worth knowing in order to take better advantage of each engine, possibly using several of them in different ways to obtain different results.
    Likewise, it helps to use advanced search methods more often, instead of simply typing in a series of words.
    This article takes a closer look at how search engines are used, and also describes how some meta-engines that are considered accurate and useful work.
    The basic logic common to all search engines is still the frequency with which the search terms occur in the metatags and in the page itself.
A 'metatag' is one of a series of web page descriptors such as the title, the description (which the browser does not display), the keywords and several other fields (e.g. author and language).
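
To make the idea of metatags more concrete, here is a minimal Python sketch that reads the title and the named meta fields of a page and counts how often a term occurs in each. The class and function names are hypothetical and the sample page is invented; real engines are far more elaborate than this.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the <title> text and the content of <meta name="..."> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # e.g. description, keywords, author, language
            self.fields[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] = self.fields.get("title", "") + data


def count_term(html, term):
    """Return how many times `term` occurs in each descriptor of the page."""
    parser = MetaTagParser()
    parser.feed(html)
    term = term.lower()
    return {name: value.lower().count(term) for name, value in parser.fields.items()}
```

Called on a page whose keywords field repeats the word 'software', `count_term(page, "software")` returns one count per descriptor, which is the raw material for a frequency-based criterion.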

GOOGLE.
SIGNIFICANCE CRITERIA.
    It is doubtless the engine that the reader of this article also uses most often. In fact, it is the most frequently visited search URL and it handles 90% of the queries made to all engines.
    The importance of results is based on an algorithm consisting of about a hundred parameters.
    However, the guidelines are well defined: the pages with the highest LINK POPULARITY are considered the most important, together with the pages showing an acceptable frequency of occurrence of the words searched for and those where the words appear close together.
    The first concept means that the more external links point to a page, the more significant that page is considered.
    The second concept establishes that if the searched words are repeated many times within the page, the topic of that page is likely to be the one you are looking for.
    The third concept states that searched words that are close to each other are more significant than words that occur in the page but are far apart.
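
As a rough illustration of how these three concepts could be combined, the following Python sketch scores a page from its inbound links, the frequency of the searched words and their proximity. The function names, the weights and the formula are all invented for this example; Google's actual algorithm is not public and, as noted above, uses many more parameters.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def min_distance(words, term_a, term_b):
    """Smallest distance in words between two terms, or None if one is missing."""
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)


def toy_score(page_text, inbound_links, query_terms):
    """Combine link popularity, term frequency and proximity (made-up weights)."""
    words = tokenize(page_text)
    terms = [t.lower() for t in query_terms]
    if not words:
        return 0.0
    # First concept: more external links pointing to the page, higher score.
    popularity = math.log1p(inbound_links)
    # Second concept: share of the page made up of the searched words.
    frequency = sum(words.count(t) for t in terms) / len(words)
    # Third concept: searched words close to each other count more.
    proximity = 0.0
    if len(terms) >= 2:
        distance = min_distance(words, terms[0], terms[1])
        if distance:  # skips None (term missing) and 0 (same term twice)
            proximity = 1.0 / distance
    return popularity + 10.0 * frequency + proximity
```

With equal link counts, a page where 'localizzazione' and 'software' are adjacent scores higher than one where they are several words apart.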

Advanced search criteria.
Strings and words:
These criteria enable the user to reduce the number of results considerably and to choose among them more accurately. If you want to look for one or more words occurring together, for example software and localizzazione:
software+localizzazione
Higher accuracy and selectivity can be obtained by quoting the phrase you are looking for:
"localizzazione software"
    In Boolean logic, these are AND-type searches, because you want all the required words to occur.
    There are two further Boolean criteria that can be useful:
- looking for a word or an alternative one (OR), so that you find all the pages containing the term 'software' or the term 'localizzazione'. In that case, you get the sum of the two criteria and therefore a higher number of results:
localizzazione OR software
- looking for pages not containing a certain term (exclusion); of course, a search carried out with this criterion alone would return a list so large as to be totally useless.
Instead, this criterion is much more effective when used in conjunction with one of the two criteria described above.
For example, you may want to search for pages containing the phrase "localizzazione software", but not the word "Microsoft".
Therefore, your search would be: https://www.google.com/search?as_q="localizzazione software" -Microsoft
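
The Boolean criteria above can be assembled mechanically. The Python sketch below is only an illustration: `build_query` is a hypothetical helper, and it assumes the usual convention that the minus sign of an exclusion is attached directly to the excluded word.

```python
def build_query(all_words=(), phrase=None, any_words=(), exclude=()):
    """Assemble a query string from the Boolean criteria described above."""
    parts = list(all_words)                        # AND: words listed together
    if phrase:
        parts.append(f'"{phrase}"')                # exact phrase in quotes
    if any_words:
        parts.append(" OR ".join(any_words))       # OR: either word
    parts.extend(f"-{word}" for word in exclude)   # exclusion: minus sign before the word
    return " ".join(parts)


# The example from the text: the phrase, but not the word Microsoft.
print(build_query(phrase="localizzazione software", exclude=["Microsoft"]))
```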

Notes:

Common terms such as articles and prepositions are not considered during the search;
if you want them to be part of the search criterion, you should add the '+' symbol. Ex: 'localizzazione+di+software'

Other criteria reducing the set of results are the following:

    Looking for documents restricted to one language. This is a risky criterion, as not every document declares its language in its metatags. On the other hand, any document that does declare its language in its metatags evidently attaches particular importance to that property, so this also works as a quality criterion.
The address would be:
https://www.google.com/search?as_q=localizzazione+software&lr=lang_it to look for the pages in Italian only.
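
An address of this kind can also be built programmatically. The sketch below uses Python's standard `urllib.parse.urlencode`; the function name is hypothetical, the `as_q` and `lr` parameters are the ones shown in the address above, and the two-letter value `lang_it` is an assumption about the language code the engine expects. The value is passed through unchanged, so a different code can be supplied.

```python
from urllib.parse import urlencode


def google_search_url(query, language=None):
    """Build a search address with an optional language restriction."""
    params = {"as_q": query}
    if language:
        params["lr"] = language   # e.g. "lang_it" (assumed code for Italian)
    # urlencode turns spaces into '+' and escapes reserved characters
    return "https://www.google.com/search?" + urlencode(params)


print(google_search_url("localizzazione software", "lang_it"))
```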

    Looking for documents in a specific file format or leaving out such documents from the search.
This criterion adds nothing to the relevance of the results, but it is useful for leaving out all the documents whose formats you cannot open or do not want to download.
If you want to leave out the .pdf's, the phrase to be submitted will be:
localizzazione software -filetype:pdf

    Looking for documents within a date range.
This criterion lets you establish 'a priori', i.e. from the beginning, whether your search should be limited in time, for example because you want only recent documents.
Using GOOGLE, the criterion is restricted to a few preset ranges (all, last 3 months, last 6 months, last year).

    Looking for terms in specific domains (or leaving those domains out).
For example, you may want to search for the pages (in the Italian version) dedicated to software localization within my website, so the phrase would be:
localizzazione software site:antotranslation.com
Using this criterion, the search is carried out within a single domain or a group of domains,
for example all the domains of the Italian hierarchy (.it).

    Looking for terms in specific areas of the document (or leaving them out).
You may want to retrieve only those documents containing the searched term in their title, text, URL or internal links.
Searching by title can be a significant criterion: a term that appears in the title is no doubt more important than one appearing only in the text, because the title provides the main definition of the document's contents.
    Note that the engine treats as titles the HTML 'H' (heading) tags and the phrases set in a font larger than the standard size.
To look for the phrase in the title:
allintitle: localizzazione software

To look for the phrase in the body:
allintext: localizzazione software

    Besides, the presence of the term in the web address (domain or page name) also indicates more accurately how important a topic is.
If a page is called 'localizzazionesoftware.htm', it is very likely to deal with software localization.
To look for the phrase in the web address:
allinurl: localizzazione software
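
The restriction operators shown in this section (`site:`, `-filetype:`, `allintitle:`, `allintext:`, `allinurl:`) can be added to a plain query with two small helpers. This is a sketch with hypothetical function names; it does not try to combine an `allin...` operator with the other restrictions, since whether an engine accepts such combinations is not covered here.

```python
AREA_OPERATORS = {
    "title": "allintitle:",
    "text": "allintext:",
    "url": "allinurl:",
}


def in_area(query, area):
    """Restrict the query to one area of the document: title, text or url."""
    return f"{AREA_OPERATORS[area]} {query}"


def restrict(query, site=None, exclude_filetype=None):
    """Limit the query to a domain and/or leave out a file format."""
    parts = [query]
    if site:
        parts.append(f"site:{site}")
    if exclude_filetype:
        parts.append(f"-filetype:{exclude_filetype}")
    return " ".join(parts)


print(in_area("localizzazione software", "title"))
print(restrict("localizzazione software", site="antotranslation.com"))
print(restrict("localizzazione software", exclude_filetype="pdf"))
```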

YAHOO
SIGNIFICANCE CRITERIA.
Yahoo takes into account the text in the page, the accuracy of the title and description, the address (URL), the source, the links contained in the page and those in other pages pointing to it, and other features of the website.

Advanced search criteria.
In Yahoo, the advanced search covers many of the criteria previously described for Google.
    The syntaxes for exact phrase, OR, AND and exclusion (leaving out) are essentially the same.
The presence of the word in the title
intitle:localizzazione+software
The presence of the word in the domain
inurl:localizzazione+soft
The presence of the exact phrase in the title
intitle:"localizzazione software"
OR
localizzazione OR software
Search in domain
http://it.search.yahoo.com/search?va=localizzazione+software&vs=www.antotranslation.com
Search for file type
http://it.search.yahoo.com/search?va=localizzazione+software&vf=pdf
Search for language
http://it.search.yahoo.com/search?va=localizzazione+software&vl=lang_it

ICEROCKET
Advanced search criteria.
Exact phrase:
"localizzazione software"
OR
localizzazione OR software
exclusion
-localizzazione -software
domain
localizzazione software site:antotranslation.com
News is divided into five categories, and the news search works well.

MSN
Advanced search criteria.
Exact phrase:
"localizzazione software"
OR
exclusion
-(localizzazione software)
domain
localizzazione software site:antotranslation.com
links to a domain
link:antotranslation.com
Country of origin
(loc:IT OR loc:AU)
language:
language:it
    A peculiarity of MSN Search is the possibility of tuning the ranking of results, either with three sliders in the visual mode of the advanced search or by setting values in the range 0..100 in the command string.
The criteria are the following:
exact match {mtch=50}
popularity index (link popularity) {popl=50}
page freshness index {frsh=50}
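
A command string with these three settings could be composed as in the Python sketch below. The function name is hypothetical, and so is the way the bracketed settings are appended after the search words: the text above only gives the `{mtch=50}` form and the 0..100 range.

```python
def msn_query(query, match=50, popularity=50, freshness=50):
    """Append the three ranking settings to a query (placement is an assumption)."""
    for name, value in (("match", match), ("popularity", popularity), ("freshness", freshness)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in the range 0..100, got {value}")
    # Doubled braces produce literal '{' and '}' in an f-string.
    return f"{query} {{mtch={match}}} {{popl={popularity}}} {{frsh={freshness}}}"


# Favour exact matches and give less weight to recently refreshed pages.
print(msn_query("localizzazione software", match=80, freshness=20))
```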

ALLTHEWEB
Advanced search criteria.
In ALLTHEWEB, the advanced search covers many of the criteria previously described for Google.
    The syntaxes for exact phrase, OR, AND and exclusion (leaving out) are essentially the same.
The presence of the word in the title
title:localizzazione+software
The presence of the word in the domain
url:localizzazione+soft
Search in a website
site:www.antotranslation.com
Search in a domain
domain:.it
Search for file type
http://it.search.yahoo.com/search?va=localizzazione+software&vf=pdf
Search for language
http://it.search.yahoo.com/search?va=localizzazione+software&vl=lang_it

HOTBOT
Advanced search criteria.
    Currently, HotBot has the most extensive advanced search system. It offers all the features already described for Google, as well as an unrestricted date filter (unlike Google and Yahoo), and the best selection of file formats for filtering, in both number and quality.
    The word filter is more detailed: you can combine the position of each term within the document with its individual inclusion or exclusion.
    For example, you can search for the word 'software' in the title and the word 'localizzazione' in the URL.
    Finally, HOTBOT lets you apply these criteria to a direct query of the GOOGLE database (the largest one) and of the ASK JEEVES database.

ALTAVISTA
Advanced search criteria.
    In ALTAVISTA, the advanced search covers many of the criteria previously described for Google and Yahoo.
    As in HotBot and ASK JEEVES, the date filter is much more flexible: you can enter an actual date, or define a range in years, months and weeks.
    Finally, advanced users can compose a SQL-style search string by combining the elements with Boolean logic.

TEOMA
SIGNIFICANCE CRITERIA.
    In Teoma, significance is defined as 'authority' and is very similar to 'link popularity' in Google; in addition, Teoma promises to exclude any links to spam websites.
    A distinctive feature of Teoma is the list of suggested terms shown along with the searched words.
    Another feature connected to the searched terms is the list of websites containing collections of related links. This is a powerful feature that enables the user to broaden the search in a very targeted way.

Advanced search criteria.
They are very similar to those used by HOTBOT; in addition, they implicitly handle both word plurals and derivatives.

GIGABLAST
Advanced search criteria.
All the criteria related to terminology, file type, presence of the terms in URLs and page name.
These are the syntaxes to be used:
suburl:
site:
url:
title:
ip: (if only the tcp/ip address is known and you want to display other information)
link: -link:(exclusion)
type:pdf type:doc type:xls type:ppt type:ps type:text
The results page also shows, as a percentage, how often words related to the search appear among the results obtained. These words in turn serve as suggestions of alternative terms.

ENTIREWEB
Advanced search criteria.
All the criteria related to terminology, language, geography, presence of the terms in URLs and page name.

LYCOS
    One of the features of Lycos is that the resources related to the search engine include one specialised in finding discussion venues related to the topic you are looking for (forums, mailing lists, etc.). The keyword-based news search engine is also very good.
Advanced search criteria.
    All the criteria related to terminology, language, date range, presence of the terms in URLs and page name.

META-ENGINES

MAMMA
Advanced search criteria.
    All the criteria related to terminology, language, geography, presence of the terms in URLs and page name.
This meta-engine enables the selection of which directories you will search in:
-Open Directory
-Looksmart Directory
-Business.com
-About.com
-Mamma's Collection
and which search engines:
-Teoma
-Google
-MSN
-Entireweb
-Gigablast

IXQUICK
    You can use natural language or complex Boolean searches supporting phrases, wildcards (meta-characters), excluded terms, mandatory terms, brackets and other modifiers such as NEAR (proximity), as the meta-engine knows which search engines can carry out complex searches.
Duplicates are removed, but they are counted in order to rank the result: if the same page is returned by more than one engine, it is given more importance.
    A wildcard stands in for any other character.
The NEAR command lets you require that one term appear close to another.
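
The handling of duplicates described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not Ixquick's actual method: the function name, the engine names and the tie-breaking rules are invented for the example.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge per-engine result lists, ranking first the pages found by more engines.

    `results_by_engine` maps an engine name to its ordered list of URLs.
    """
    votes = defaultdict(int)   # how many engines returned each URL
    best_rank = {}             # best position reached in any engine
    for urls in results_by_engine.values():
        # dict.fromkeys drops repeats within one engine, keeping the order
        for rank, url in enumerate(dict.fromkeys(urls)):
            votes[url] += 1
            best_rank[url] = min(rank, best_rank.get(url, rank))
    # Each URL appears once; more votes first, then best position, then name.
    return sorted(votes, key=lambda u: (-votes[u], best_rank[u], u))


results = {
    "engine_a": ["x.com", "y.com"],
    "engine_b": ["y.com", "z.com"],
    "engine_c": ["y.com", "x.com"],
}
print(merge_results(results))
```

Here `y.com` is listed first because all three engines returned it, while `z.com`, found by a single engine, comes last.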
These are the syntaxes to be used:
+title:
+domain:
host:
immagine:
image:
url:
link:
text:
related:
You can select the engines used according to the national version in use.
    In fact, this meta-engine uses a pool of search engines that includes, in addition to the most important ones, national-level engines as well.
You can phrase your query in conversational language, and it will be passed on to those search engines that accept that kind of search.

CLUSTY
    In the results window, Clusty presents a list of terms related to the context of the query. This enables the user to approach the original topic from an alternative angle.
Advanced search criteria.
All the criteria related to terminology, language, presence of the terms in URLs and domain. Syntax used:
domain:
host:
You can choose which sources to search among:
GigaBlast
MSN
Lycos
Looksmart
Wisenut
Open Directory
Overture

WEBCRAWLER
Advanced search criteria.
All the criteria related to terminology, language, date range, presence of the terms in URLs and domain.







Copyright © 2003-2024 by TranslationDirectory.com