Spamdexing

Spamdexing, or search engine spamming, is the practice of deliberately creating web pages so that search engines index them in a way that places a website or page closer to the top of the results, or that influences the category to which the page is assigned. Many designers of web pages try to get a good ranking in search engines and design their pages accordingly. The word is a portmanteau of spamming and indexing.

Spamdexing refers exclusively to practices that are dishonest and mislead search and indexing programs to give a page a ranking it does not deserve. "White hat" techniques for making a website indexable by search engines, without misleading the indexing process, are known as search engine optimization (SEO). SEO techniques do not involve deceit.

Search engine spammers, on the other hand, are generally aware that the content that they promote is not very useful or relevant to the ordinary internet surfer. Search engines use a variety of algorithms to determine relevancy ranking. Some of these include determining whether the search term appears in the META keywords tag, others whether the search term appears in the body text of a web page. A variety of techniques are used to spamdex (see below). Many search engines check for instances of spamdexing and will remove suspect pages from their indexes.

The rise of spamdexing in the mid-1990s made the leading search engines of the time less useful. Google's success at both producing better search results and combating keyword spamming, through its reputation-based PageRank link analysis system, helped it become the dominant search site late in the decade, a position it still holds. While it has not been rendered useless by spamdexing, Google has not been immune to more sophisticated methods either. Google bombing is another form of web vandalism, which involves creating pages that directly affect the rank of other sites.

Spamdexers may act as consultants, helping other web publishers drive up their sites' ranks using black-hat techniques. Alternatively, they may set up sites of their own that benefit from misleadingly high rankings, for instance by creating thousands or millions of landing pages containing links for which the spammer earns a commission whenever the user clicks.

Common spamdexing techniques can be classified into two broad classes: content spam and link spam.

Content Spam
----------------------------------------------------------------

These techniques involve altering the logical view that a search engine has over the page's contents. They all aim at variants of the vector space model for information retrieval on text collections.
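As a rough illustration of why the vector space model is open to manipulation, the sketch below scores pages against a query using raw term-frequency vectors and cosine similarity. This is a deliberately simplified variant (no stemming, no inverse document frequency weighting), and the function names and sample texts are made up for the example.

```python
import math
from collections import Counter

def tf_vector(text):
    """Return a term-frequency vector (word -> count) for a text."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical query and page texts, for illustration only.
query = tf_vector("cheap flights")
honest = tf_vector("we compare cheap flights from many airlines")
stuffed = tf_vector("cheap flights cheap flights cheap flights book now")

print(round(cosine_similarity(query, honest), 2))   # 0.53
print(round(cosine_similarity(query, stuffed), 2))  # 0.95
```

Under this naive scoring, the page that simply repeats the query terms scores far higher than the page with genuine content, which is the weakness that content spam targets.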

Hidden or invisible text

Disguising keywords and phrases by making them the same (or almost the same) color as the background, using a tiny font size, or hiding them within the HTML code, such as in "noframes" sections, ALT attributes and "noscript" sections. This makes a page appear relevant to a web crawler for terms that human visitors never see, so that it is more likely to be found. Example: A promoter of a Ponzi scheme wants to attract web surfers to a site where he advertises his scam. He places hidden text appropriate for a fan page of a popular music group on his page, hoping that the page will be listed as a fan site and receive many visits from music lovers. However, hidden text is not always spamdexing: it can also be used to enhance accessibility.
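One way a checker might flag the simplest cases is to look at inline styles for text whose color matches its background or whose font size is tiny. The sketch below assumes only inline `style` attributes are inspected; real pages also use stylesheets and scripts, which this ignores. The class and function names, and the list of "tiny" sizes, are hypothetical.

```python
from html.parser import HTMLParser

def parse_style(style):
    """Turn 'color: #fff; font-size: 1px' into a dict of lowercase properties."""
    props = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props

class HiddenTextFinder(HTMLParser):
    """Collects start tags whose inline style suggests invisible text."""

    def __init__(self):
        super().__init__()
        self.suspects = []  # list of (tag, reason) pairs

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None for valueless attributes.
        style = dict(attrs).get("style") or ""
        props = parse_style(style)
        color = props.get("color")
        if color is not None and color == props.get("background-color"):
            self.suspects.append((tag, "text color equals background"))
        # Assumed list of sizes treated as "tiny".
        if props.get("font-size") in ("0", "0px", "1px"):
            self.suspects.append((tag, "tiny font size"))

def find_hidden_text(html):
    """Return (tag, reason) pairs for elements that look hidden."""
    finder = HiddenTextFinder()
    finder.feed(html)
    return finder.suspects
```

A match is only a hint. As noted above, hidden text also has legitimate uses such as accessibility, so a flagged element would still need review.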

Keyword stuffing

This involves the insertion of hidden, random text on a webpage to raise the keyword density or ratio of keywords to other words on the page. Older versions of indexing programs simply counted how often a keyword appeared, and used that to determine relevance levels. Most modern search engines have the ability to analyze a page for keyword stuffing and determine whether the frequency is above a "normal" level.
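Keyword density as described above is simply the share of words on a page that match the keyword. A minimal sketch follows; the 15% cutoff is an arbitrary value chosen for the example, not a figure any search engine is known to use, and the function names are hypothetical.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def looks_stuffed(text, keyword, threshold=0.15):
    """Flag text whose keyword density exceeds an assumed 'normal' level."""
    return keyword_density(text, keyword) > threshold

print(keyword_density("Cheap pills cheap pills cheap pills", "cheap"))  # 0.5
print(looks_stuffed("Our pharmacy lists prices for common pills and vitamins", "pills"))  # False
```

A single fixed threshold is crude: what counts as a "normal" frequency depends on the word and on the length of the page.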

Meta tag stuffing

Repeating keywords in the Meta tags, and using keywords that are unrelated to the site's content.

Gateway or doorway pages

Creating low-quality web pages that contain very little content but are instead stuffed with very similar keywords and phrases. They are designed to rank highly within the search results. A doorway page will generally have "click here to enter" in the middle of it.

Scraper sites

Scraper sites, also known as Made for AdSense sites, are created using various programs designed to 'scrape' search engine results pages or other sources of content and create 'content' for a website. These types of websites are generally full of advertising, or redirect the user to other sites.

Link Spam
----------------------------------------------------------------

Link spam takes advantage of link-based ranking algorithms, such as Google's PageRank algorithm, which ranks a website higher the more it is linked to by other highly ranked websites. These techniques also aim at influencing other link-based ranking techniques such as the HITS algorithm.
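To make the idea concrete, here is a small sketch of the textbook power-iteration form of PageRank. It is not Google's production system, and the damping factor of 0.85, the iteration count, and the tiny example graphs are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank; links maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical miniature web in which nothing links to "target".
honest_web = {
    "news": ["shop", "blog"],
    "blog": ["news"],
    "shop": ["news"],
    "target": ["news"],
}
# The same web plus three made-up pages that exist only to link to "target".
spammed_web = dict(honest_web, farm1=["target"], farm2=["target"], farm3=["target"])

before = pagerank(honest_web)["target"]
after = pagerank(spammed_web)["target"]
print(after > before)  # True
```

Even though the added pages have no inbound links of their own, each passes a share of its baseline score to the target, which is why manufacturing inbound links pays off under this kind of scoring.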

Link farms

Creating tightly knit communities of pages that reference each other, also known humorously as mutual admiration societies.

Hidden links

Putting links where visitors will not see them in order to increase link popularity.

Sybil attack

This is the forging of multiple identities for malicious intent, named after Sybil, the pseudonym of Shirley Ardell Mason, a patient famously diagnosed with multiple personality disorder (now called dissociative identity disorder). A spammer may create multiple web sites at different domain names that all link to each other, such as fake blogs known as spam blogs.

Spam in blogs

This is the placing or solicitation of links randomly on other sites, with a desired keyword placed in the hyperlinked text of the inbound link. Guest books, forums, blogs and any site that accepts visitors' comments are particular targets and are often victims of drive-by spamming, where automated software creates nonsense posts with links that are usually irrelevant and unwanted.

Spam blogs (also known as splogs)

A spam blog, by contrast, is a fake blog created exclusively with the intent of spamming. Spam blogs are similar in nature to link farms.

Page hijacking

Referer log spamming

When someone reaches a web page (the referee) by following a link from another web page (the referer), the person's browser gives the referee the address of the referer. Some websites keep a referer log which shows which pages link to that site. By having a robot access many sites repeatedly, with a message or specific address given as the referer, the spammer makes that message or address appear in the referer logs of those sites. Since some search engines base the importance of a site on the number of different sites linking to it, referer-log spam may be used to increase the search engine rankings of the spammer's sites, by getting the publicly visible referer logs of many sites to link to them.
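The referer is supplied by the client, so the server records whatever it is told. The sketch below tallies referring hosts from access-log lines, assuming the widely used "combined" log layout in which the referer is the second-to-last quoted field. The regular expression, function name, and sample lines are made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed layout: ... "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')

def referer_hosts(log_lines):
    """Count how often each referring host appears in the log."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if not match:
            continue
        referer = match.group("referer")
        if referer and referer != "-":
            counts[urlparse(referer).netloc] += 1
    return counts

# Hypothetical log excerpt; the addresses and hosts are placeholders.
sample_log = [
    '203.0.113.5 - - [10/Oct/2006:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "http://spam.example/pills" "bot"',
    '203.0.113.5 - - [10/Oct/2006:13:55:37 +0000] "GET / HTTP/1.1" 200 2326 "http://spam.example/pills" "bot"',
    '198.51.100.7 - - [10/Oct/2006:14:01:02 +0000] "GET /a HTTP/1.1" 200 512 "http://friend.example/links" "browser"',
    '198.51.100.9 - - [10/Oct/2006:14:05:10 +0000] "GET /b HTTP/1.1" 200 128 "-" "browser"',
]
print(referer_hosts(sample_log).most_common())
```

Nothing in the log distinguishes the forged entries from the genuine one, which is why a published referer list can end up linking to a spammer's address.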

Buying expired domains

Some link spammers monitor DNS records for domains that will expire soon, then buy them when they expire and replace the pages with links to their pages.

Some of these techniques may be applied to create a Google bomb, that is, to cooperate with other users to boost the ranking of a particular page for a particular query.

Other Types of Spamdexing
----------------------------------------------------------------

Mirror websites

Hosting of multiple websites all with the same content but using different URLs. Some search engines give a higher rank to results where the keyword searched for appears in the URL.
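Exact mirrors are straightforward to spot in principle, because identical content produces an identical hash. The following is a minimal sketch that groups URLs by a fingerprint of their whitespace-normalized, lowercased content. It would miss mirrors that differ even slightly, and the function names and sample URLs are hypothetical.

```python
import hashlib
from collections import defaultdict

def content_fingerprint(html):
    """SHA-256 of the page text after lowercasing and collapsing whitespace."""
    normalized = " ".join(html.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_mirrors(pages):
    """Group URLs whose content is identical; pages maps URL -> HTML."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[content_fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

# Hypothetical crawl results.
pages = {
    "http://cheap-pills.example/": "<h1>Buy  pills</h1>",
    "http://pills-cheap.example/": "<h1>buy pills</h1>",
    "http://garden.example/": "<h1>Roses</h1>",
}
print(find_mirrors(pages))
```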

URL redirections

Taking the user to another page without his or her intervention, e.g. using META refresh tags, CGI scripts, Java, JavaScript, or server-side redirects.
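Of the redirect methods listed, the META refresh tag is the easiest to see in a page's source. The sketch below uses Python's standard `html.parser` to pull out refresh targets; it handles the common `content="N; url=..."` form only, and the class name and sample URL are hypothetical.

```python
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Collects (delay_seconds, target_url) pairs from META refresh tags."""

    def __init__(self):
        super().__init__()
        self.redirects = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        delay, _, rest = attr.get("content", "").partition(";")
        target = rest.strip()
        if target.lower().startswith("url="):
            target = target[4:].strip().strip("'\"")
        try:
            seconds = float(delay.strip())
        except ValueError:
            return  # Malformed delay: ignore the tag.
        self.redirects.append((seconds, target))

finder = MetaRefreshFinder()
finder.feed('<html><head><meta http-equiv="Refresh" '
            'content="0; URL=http://other.example/"></head></html>')
print(finder.redirects)  # [(0.0, 'http://other.example/')]
```

A zero-second delay to a different site is the pattern of interest here. Script-based and server-side redirects would need other checks.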

Cloaking

Cloaking refers to any of several means to serve up a different page to the search-engine spider than will be seen by human users. It can be an attempt to mislead search engines regarding the content on a particular web site. However, cloaking can also be used to ethically increase accessibility of a site to users with disabilities, or to provide human users with content that search engines aren't able to process or parse. It is also used to deliver content based on a user's location; Google itself uses IP delivery, a form of cloaking, to deliver results.

A form of this is 'code swapping', that is, optimizing a page for top ranking and then swapping another page in its place once a top ranking is achieved.
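A simple check for user-agent-based cloaking is to request the same URL twice, once presenting as a browser and once as a crawler, and compare the responses. The sketch below takes the fetching function as a parameter so it can run without network access. The user-agent strings, the 0.5 similarity threshold, and the stand-in fetchers are all assumptions for illustration, and it would not catch cloaking based on IP address.

```python
from difflib import SequenceMatcher

BROWSER_UA = "Mozilla/5.0 (hypothetical browser)"
CRAWLER_UA = "ExampleBot/1.0 (hypothetical crawler)"

def similarity(a, b):
    """Ratio in [0, 1] of how alike two strings are."""
    return SequenceMatcher(None, a, b).ratio()

def looks_cloaked(fetch, url, threshold=0.5):
    """True if the page served to a crawler differs sharply from the browser's.

    fetch is any callable taking (url, user_agent) and returning page text.
    """
    as_browser = fetch(url, BROWSER_UA)
    as_crawler = fetch(url, CRAWLER_UA)
    return similarity(as_browser, as_crawler) < threshold

def cloaking_fetch(url, user_agent):
    """Stand-in for a site that serves keyword bait to crawlers only."""
    if "Bot" in user_agent:
        return "free music lyrics concert tickets fan club " * 5
    return "Invest today for guaranteed returns!"

def honest_fetch(url, user_agent):
    """Stand-in for a site that serves everyone the same page."""
    return "Welcome to our shop"

print(looks_cloaked(cloaking_fetch, "http://site.example/"))  # True
print(looks_cloaked(honest_fetch, "http://site.example/"))    # False
```

Since cloaking also has legitimate uses, such as accessibility or location-based content, a large difference is a reason to look closer rather than proof of spamdexing.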

Some useful links for webmasters:

Google's Webmaster Guidelines page
Yahoo!'s Search Engine Indexing page
MSN Search's Site Owner page