<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-20123300</id><updated>2011-04-21T18:49:04.010-07:00</updated><title type='text'>web development, seo services company india</title><subtitle type='html'>website development company, seo services india, seo firm, web design firm doing seo as well.search engine optimization is a process of SEO development delhi</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-20123300.post-113533284314307651</id><published>2005-12-23T02:13:00.000-08:00</published><updated>2005-12-23T02:14:03.156-08:00</updated><title type='text'>Few Myths Explored</title><content type='html'>Google is the Web's most popular search engine, powering not only the popular Google.com Website, but also Yahoo! and AOL. 
Being listed in Google is very important, and being listed highly in Google can bring great benefit to your site. &lt;br /&gt;&lt;br /&gt;However, there are many myths about how Google works and, while fairly harmless in themselves, these myths tend to allow people to draw incorrect conclusions about how Google works. The purpose of this article is to correct the most popular Google myths.&lt;br /&gt;&lt;br /&gt;Myth #1: The Higher Your Google PageRank (PR), the Higher You'll be in the Search Results Listing&lt;br /&gt;&lt;br /&gt;This myth is frequent, and is the source of many complaints. People often notice that a site with a lower PageRank than theirs is listed above them, and get upset. While pages with a higher PageRank do tend to rank better, it is perfectly normal for a site to appear higher in the results listings even though it has a lower PageRank than competing pages.&lt;br /&gt;&lt;br /&gt;To explain this concept without going into too much technical detail, it is best to think of PageRank as being comprised of two different values. One value, which we'll call "General PageRank" is nothing more than the weighting given to the links on your page. This is also the value shown in the Google Toolbar. This value is used to calculate the weighting of the links leaving your page, not your search position.&lt;br /&gt;&lt;br /&gt;The other value we'll call "Specific PageRank." You see, if PageRank equated to search engine results rank then Yahoo, the site with the highest PR, would be listed #1 for every search result. Obviously, that wouldn't be useful, so what Google does is examine the context of your incoming links, and only those links that relate to the specific keyword being searched on will help you achieve a higher ranking for that keyword. 
It's very possible for a site with a lower PageRank to have more on-topic incoming links than a site with a higher PageRank, in which case the site with the lower PageRank will be listed above its competitor in the search results for that term.&lt;br /&gt;&lt;br /&gt;PageRank aside, there are also other factors that contribute to Google search results -- though PageRank remains the dominant one.&lt;br /&gt;&lt;br /&gt;Myth #2: The Google Toolbar will List Your Actual PageRank &lt;br /&gt;&lt;br /&gt;When Google created its toolbar, it was a boon for many Webmasters, as this was the first time we got to see any value related to our PageRank. However, the toolbar has also caused some confusion.&lt;br /&gt;&lt;br /&gt;The toolbar does not show your actual PageRank, only an approximation of it. It gives you an integer rank on a scale from 0 to 10. We do not know exactly what the various integers correspond to, but we're sure that the scale is similar to an exponential curve, with each new "plateau" being harder to reach than the last. I have personally done some research into this, and so far the results point to an exponential base of 4. So a PR of 6 is 4 times as difficult to attain as a PR of 5. &lt;br /&gt;&lt;br /&gt;The exponential base is important because it illustrates how broad a range of pages can be assigned a particular PR value. The difference between a high PR of 6 and a low PR of 6 could be hundreds or thousands of links. So if your PR as reported by the toolbar increases or drops, it's important to remember that it could be the result of a small change or a large change. Additionally, it's possible to lose or gain links and see no change in your reported PageRank.&lt;br /&gt;&lt;br /&gt;The other issue with the toolbar has to do with the fact that sometimes the PageRank it displays is only a guess. People will often notice pages on Geocities or another free hosting provider having a high PageRank. 
This is because when Google hasn't spidered a page, but has spidered the root domain, the toolbar will guess a PageRank based on the value of the root domain. Therefore, it's common to see pages on Geocities with a PR of 6 or 7. This guessed PageRank does not equate in any way to a high Google listing; in fact, in this case it indicates the opposite: that the page isn't even in Google. Once Google spiders the page, it will be assigned a more appropriate (and usually lower) PageRank.&lt;br /&gt;&lt;br /&gt;Myth #3: PageRank is a Value Based on the Number of Incoming Links to Your Site &lt;br /&gt;&lt;br /&gt;This myth is a frequent source of incorrect assumptions about Google. People will often see that a site with fewer incoming links than their own site has a higher PageRank, and assume that PageRank is not based on incoming links.&lt;br /&gt;&lt;br /&gt;The fact is that PageRank is based on incoming links, but not just on the number of them. Instead, PageRank is based on the value of your incoming links. To find the value of an incoming link, look at the PR of the source page and divide it by the number of links on that page. It's very possible to get a PR of 6 or 7 from only a handful of incoming links if your links are "weighty" enough.&lt;br /&gt;&lt;br /&gt;Also remember that for PageRank calculations every page is an island. Google does not calculate PageRank on a site-wide basis -- so internal links between your pages do count. 
This is very important, as instituting a proper structure for your internal links can drastically improve your rankings.&lt;br /&gt;&lt;br /&gt;Myth #4: Searching for Incoming Links on Google Using "link:" will Show you all Your Backwards Links&lt;br /&gt;&lt;br /&gt;As with Myth #3, people will sometimes look for backwards links to a site on Google and find none, yet the site has a PR listed and is in Google's cache, so they know that the toolbar isn't just guessing.&lt;br /&gt;&lt;br /&gt;The reason for this is that Google does not list all the links that it knows about, only those that contribute above a certain amount of PageRank. This is especially evident in a brand new site. By default, all pages in Google have a minimum PR. So even a page without any incoming links has a PR value, albeit a small one. If you have a brand new site with 20 or 30 pages, all of which Google has spidered, but you have no incoming links from other sites, then your pages will still have a PageRank resulting from these internal links. As your home page is likely linked to from every page on your site, it might even get a PageRank of up to 1 or 2 from all these little boosts. However, in this situation searching for incoming links will likely yield 0 results.&lt;br /&gt;&lt;br /&gt;You can also see this happening on pages that have been around for a while. For instance, this page has 0 incoming links listed in Google, yet it has a PageRank of 3. We can see that Google has spidered it by checking its cache, so the PageRank is not a guess. We also know that Google has spidered a page that links to it, again by checking the cache. Therefore, we can be sure that Google knows of at least 1 link to the page in question, both by its listed PR and by the fact that Google has spidered a page that links to it. &lt;br /&gt;&lt;br /&gt;However, if you look at the DMOZ.org page with the Google Toolbar installed, you'll notice the page has a PR of 0, which is very low. 
Furthermore, if you count the number of links on the page, you'll notice it has over 20. So you're dividing a very low PR among over 20 links. Thus each link carries very little weight, so Google doesn't list these links when you search for them. However, Google does count the links, which is why the page in question has a PR listed.&lt;br /&gt;&lt;br /&gt;It's very important to remember how Google lists incoming links. Often, people see their number of incoming links drop, and they think they have lost those links. In reality, the linking page could have lost some weight and, consequently, the links might have dropped below the value threshold that's required in order for links to be listed. Or the linking page could have added more links, causing each link's share of the weight to be lower, and again causing the link to drop below the value threshold. In either case the link is still counted; it just isn't listed.&lt;br /&gt;&lt;br /&gt;Why does Google do this? Perhaps the answer has to do with technical limitations. 
If the average number of links per page is 20 then Google would have to deal with over 60 billion links, which might create an index that was too large to be publicly searchable.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20123300-113533284314307651?l=kvcindia-websolutions.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/113533284314307651/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20123300&amp;postID=113533284314307651' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533284314307651'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533284314307651'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/2005/12/few-myths-explored.html' title='Few Myths Explored'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20123300.post-113533183806733029</id><published>2005-12-23T01:56:00.000-08:00</published><updated>2005-12-23T01:57:24.530-08:00</updated><title type='text'>More Spam Technique for google</title><content type='html'>Links Inside No Script Tags&lt;br /&gt;&lt;br /&gt;One top publishing site I recently discovered secretly interlinked its sites using the no script tag. 
Although I can't name the site, I can show you how the technique worked.&lt;br /&gt;&lt;br /&gt;Used legitimately, the no script tag provides spiderable links when a user's browser (or a search engine robot), has its JavaScript turned off. Anything that appears inside the no script tags is not visible on the Web page itself.&lt;br /&gt;&lt;br /&gt;To be used authentically, the no script tag must contain links that replicate those used within JavaScript code in the actual page.&lt;br /&gt;&lt;br /&gt;But in this case, the links went to sites that strategically collected PageRank. They were basically hidden, acting as underground network of links to support the publisher's rankings. This code appeared in almost all of the site's many domains -- and perhaps exists in the Websites of others, who may not even know it's there! Some of the pages only used a closing &lt;/NO SCRIPT) tag, which could also confuse search engines.&lt;br /&gt;&lt;br /&gt;&lt;SCRIPT LANGUAGE="javascript" SRC="http://www.spammersite1.com/counter.asp?ID=2667&amp;NoLink=1" TYPE="text/javascript"&gt;&lt;/SCRIPT&gt; &lt;br /&gt;         &lt;NOSCRIPT&gt;&lt;a href="http://www.spammersite3.com"&gt;new homes&lt;/a&gt; &lt;a href="http://www.spammersite3.com/popularkeywords.asp?Keyword=concrete+design"&gt;concrete  &lt;br /&gt;         design&lt;/a&gt; &lt;a href="http://www. spammersite3.com/popularkeywords.asp?Keyword=precast"&gt;precast&lt;/a&gt;  &lt;br /&gt;         &lt;a href="http://www. spammersite3.com/popularkeywords.asp?Keyword=mantel"&gt;mantel&lt;/a&gt;  &lt;br /&gt;&lt;a href="http://www.spammersite4.net/"&gt;home decorating&lt;/a&gt; &lt;br /&gt;&lt;a href="http://www.spammersite5.biz/"&gt;home  &lt;br /&gt;         improvement world&lt;/a&gt; &lt;a href="http://www.spammersite6.com"&gt;luxury homes&lt;/a&gt;  &lt;br /&gt;         &lt;/NOSCRIPT&gt;&lt;br /&gt;&lt;br /&gt;The complex code above was even loaded with keywords (using asp code). 
These keywords signaled to the Web server at the spam target site the type of dynamically generated page that should be served in response to the query. These tactics are not approved of if they're done deliberately to manipulate search rankings. Visitors to this site were totally oblivious to the devious intent of the site owner, and search engines were fooled as well.&lt;br /&gt;&lt;br /&gt;Non-Robot JavaScript Detectable Redirects&lt;br /&gt;The use of mouseover code like that shown below is quietly spreading across the Web:&lt;br /&gt;&lt;br /&gt;&lt;body onMouseOver="eval(unescape('%6C%6F%63%61%74%69%6F%6E%2E%686F%70%69%63%62%61%74%6F%6E%73%2E%6E%65%74%2F%27%3B'));"&lt;br /&gt;&lt;br /&gt;There have been rumors that Google is taking action against this tactic. In the cases I discovered, the JavaScript code automatically redirected the visitor to another page, but only upon the cursor being moved over the page itself. It was almost impossible for the user to avoid setting this code off.&lt;br /&gt;&lt;br /&gt;I found the code on a site ranked number one on Google for its primary keyword phrase. As search engine robots don't use a mouse, they're blind to the spamming activity. In this case, the tactic was combined with a server side redirect to another page, which was relevant only in some cases. The purpose of the redirect may have been part of bigger ploy to support another ranking strategy.&lt;br /&gt;&lt;br /&gt;Dynamic Real Time Page Generation&lt;br /&gt;&lt;br /&gt;It is possible for a Web server to produce and serve different, optimized pages according to the referrer of any page request.&lt;br /&gt;&lt;br /&gt;In theory, there is nothing wrong with serving a page that's customized to the circumstance in which it was requested -- indeed, many ad campaigns serve up different ads based on the type of banner that was clicked. 
Customized ads are seen as being far more effective and useful for users.&lt;br /&gt;&lt;br /&gt;With dynamic page spam, however, the site is loaded with hundreds of these phantom pages (dynamic URLs) that act as affiliate links to some other site. Search engines don't want affiliate links. In the case I found, all the links were credited to the site's backlink count.&lt;br /&gt;&lt;br /&gt;I don't think this is what search engines had in mind when they began to spider dynamic URLs -- they certainly don't want to allow affiliate link spam.&lt;br /&gt;&lt;br /&gt;Here's what the links typically look like:&lt;br /&gt;&lt;br /&gt;www.spammersite7.com/perl/click.pl?id=2068&amp;a=i&lt;br /&gt;&lt;br /&gt;When the robot follows the links, it receives a meta refresh that links to an error page called redirect.cfm. This page has links back to the home page, which are credited to the site's backlink count.&lt;br /&gt;&lt;br /&gt;&lt;meta http-equiv="REFRESH" content="2; URL=http:// www.spammersite7.com/redirect.cfm?url=spammersite7.com"&gt; &lt;br /&gt;&lt;br /&gt;&lt;/head&gt; &lt;br /&gt;&lt;body onLoad="document.form1.submit();" &gt; &lt;br /&gt;Please Wait... &lt;br /&gt;&lt;form name="form1" method="post" action="redirect.cfm"&gt; &lt;br /&gt;&lt;input type="hidden" name="url" value=" spammersite7.com "&gt; &lt;br /&gt;&lt;/form&gt;&lt;br /&gt;&lt;br /&gt;DHTML Layering and Hidden Text&lt;br /&gt;&lt;br /&gt;Using DHTML layering, spammers can hide layers of keywords beneath graphics. 
One layer covers the other visually, yet the text hidden on the lower layer is readable by the search engine robot -- another technique that search engines prohibit.&lt;br /&gt;&lt;br /&gt;HTML Hidden Table Cells&lt;br /&gt;&lt;br /&gt;The combined powers of CSS, HTML and the loose DTD &lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"&gt; allow the unscrupulous site owner to hide the content of table cells loaded with keywords and heading tags.&lt;br /&gt;&lt;br /&gt;CSS permits the flexible positioning of Web page elements; it's a style sheet language that search engines do not fully understand at present. In short, the search engine doesn't really know what's being displayed. This trickery can be specified in a separate CSS sheet (.css file), which a search engine may or may not index. This CSS style sheet file, however, does affect the display of content on the page.&lt;br /&gt;&lt;br /&gt;In this example, the CSS affects the display of the body of the Web page, whose width is set to 97%: &lt;br /&gt;&lt;br /&gt;{font-family: Arial, Helvetica, sans-serif; width:97%; font-size: 10pt; overflow: hidden; color: #000000; margin: 0px;}&lt;br /&gt;&lt;br /&gt;Within the regular code, .gif files can be placed in the page at a width of 150%, ensuring that part of the page is not seen. That extra 50% provides plenty of room for keywords stuffed into &lt;h1&gt; tags.&lt;br /&gt;&lt;br /&gt;Enormous Machine-Generated Websites&lt;br /&gt;&lt;br /&gt;Those Webmasters who are not adept at HTML, DHTML or CSS tricks may try something simpler. When there's not enough content to go around, they often try to stretch a minimal amount of content across thousands of pages. The pages are built with templates and the sentences within them are basically shuffled from one page to the next. 
Unique title tags are plugged into each page that's generated.&lt;br /&gt;&lt;br /&gt;This technique basically sees the same page repeated hundreds to thousands of times. It can even be done using a computer program that systematically stuffs the text sentences, paragraphs and headings, including keywords, into pages.&lt;br /&gt;&lt;br /&gt;This technique is most often used with ecommerce sites that have a limited range of products for sale. Often, the products are simply re-organized, or shuffled to create another page that appears to be unique. It's actually the same selection of products presented countless different ways.&lt;br /&gt;&lt;br /&gt;Link Spam&lt;br /&gt;&lt;br /&gt;To maximize Pagerank distribution throughout a Website, some spammers will fill a page with links to the point where it is just a links page, and every page links to every other page.&lt;br /&gt;&lt;br /&gt;Why do this? Well, by maximizing the number of links, the spammer more equally spreads PageRank throughout his or her site. When links from all those pages point to a single page on a keyword topic, the site can gain higher rankings for that phrase. &lt;br /&gt;&lt;br /&gt;Link exchanges are also considered link spam. The links are fabricated -- not a real reflection of personal choice. Most link exchanges are now being filtered out of search results; however, some links in link exchanges are still being recognized.&lt;br /&gt;&lt;br /&gt;This system allows the server to give the robot different content than that which is delivered to human visitors. And that means the search engine could be deceived.&lt;br /&gt;&lt;br /&gt;Invisible Text&lt;br /&gt;&lt;br /&gt;Invisible text is invisible because the font color is the same as the color of the background or background image.&lt;br /&gt;&lt;br /&gt;In one example I saw, a site used the font color "snow" to make the text white on a white background. 
The author also used this font tag in a way that caused it to overlap another tag, thereby confusing the search engine robot further.&lt;br /&gt;&lt;br /&gt;The example below uses a black .gif as the background to hide black text. It also has a DHTML layer directly above it, to further hide the text.&lt;br /&gt;&lt;br /&gt;&lt;body bgcolor="#000000"&gt; &lt;br /&gt;&lt;table width="14%" border="0" cellpadding="6" cellspacing="0" bgcolor="#FFFFFF"&gt; &lt;br /&gt; &lt;tr&gt; &lt;br /&gt;   &lt;td background="black.gif"&gt;&lt;font color="#000000"&gt;invisible text&lt;/font&gt;&lt;/td&gt; &lt;br /&gt; &lt;/tr&gt; &lt;br /&gt;&lt;/table&gt; &lt;br /&gt;&lt;div id="Layer1" style="position:absolute; width:200px; height:115px; z-index:1; left: 5px; top: 8px; background-image: url(black.gif); layer-background-image: url(black.gif); border: 1px none #000000;"&gt;&lt;/div&gt; &lt;br /&gt;&lt;/body&gt;&lt;br /&gt;&lt;br /&gt;A robot can't detect whether text in a DHTML layer is the same color as the background used in a layer below it. The layer can even be set off-screen so it is never visible to a person.&lt;br /&gt;&lt;br /&gt;Link Farms&lt;br /&gt;&lt;br /&gt;Link farms are still prevalent on the Web, even though search engines can detect their presence through link pattern recognition. Since link spamming is being done at a macro level, search engines must be able to view a large, sophisticated network of links and delete those that are machine-generated and not true, human-chosen links.&lt;br /&gt;&lt;br /&gt;The Hilltop algorithm is one filter that minimizes the advantage gained by hundreds of useless links.&lt;br /&gt;&lt;br /&gt;Spamming Penalties&lt;br /&gt;Each search engine has its own distinct prohibitions and related penalties. 
Each penalty is a response to the degree of threat that a given spamming technique represents to the search engine.&lt;br /&gt;&lt;br /&gt;Spammers may receive demerits, through which the ranking of their sites on a particular phrase might drop significantly. Alternatively, a zero PageRank penalty may be applied to a particular page, or whole sites may be banned if the search engine so chooses.&lt;br /&gt;&lt;br /&gt;Now that these techniques are widely known, I strongly advise you not to try them. The search engine engineers may be embarrassed that these tricks really do work, and will move swiftly to take action against spammers.&lt;br /&gt;&lt;br /&gt;Oh What a Wicked Web We Weave&lt;br /&gt;What's the final word on search engine spam? Well, that's between you and the search engines. Now that you know some of the popular spamming techniques in use, you'll at least know how to avoid using them. Once word gets out, the search engines will ban their usage.&lt;br /&gt;&lt;br /&gt;To avoid the problems created by spamming, choose an SEO that can achieve legitimate results. Don't ask for top ten guarantees when guarantees are considered wrong by the search engines. Hire an SEO that offers the full package of content creation and development. 
You'll get your money's worth, the search engines will get rich, useful content, and your site will attract targeted, qualified users.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20123300-113533183806733029?l=kvcindia-websolutions.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/113533183806733029/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20123300&amp;postID=113533183806733029' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533183806733029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533183806733029'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/2005/12/more-spam-technique-for-google.html' title='More Spam Technique for google'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20123300.post-113533163572826492</id><published>2005-12-23T01:50:00.000-08:00</published><updated>2005-12-23T01:54:17.330-08:00</updated><title type='text'>Latest Spam Technique for Google</title><content type='html'>&lt;a href="http://www.kvcindia.com"&gt;Search engine rankings &lt;/a&gt;are extremely competitive and Website owners are under pressure to do all they can to gain visibility in search results. 
&lt;br /&gt;&lt;br /&gt;Those pressures come from many quarters: there are branding restrictions, style guidelines, legal issues, navigation needs, sales conversion demands, site interaction demands and more. &lt;br /&gt;&lt;br /&gt;The fact remains, though, that search engines were designed for information purposes. This presents hurdles to businesses that try to exploit the search engines in order to attract users who seek information, then try to sell them something. To overcome these hurdles, many businesses use increasingly ruthless tactics -- tactics that lead them into dishonest territory -- to gain those higher search rankings.&lt;br /&gt;&lt;br /&gt;Exploiting the Engine&lt;br /&gt;The exploitation of search engines today is a serious issue, but, like it or not, most businesses see it as something that must be done -- an online business imperative. To exploit a search engine, however, most organizations must exploit a search engine optimization company. In these arrangements, exploitation, or the gaining of something for nothing, becomes the central theme for interaction between client and SEO provider. &lt;br /&gt;&lt;br /&gt;Thousands of SEO providers are now in business, and each ranking promiser is more famous than the next. For many of these service providers, quality is not an issue. What matters is making promises that beat the competition and win them the client. Faced with these enormous and often unreasonable pressures, ethical SEOs will withdraw from an optimization project. Unethical SEOs, however, will take on the project, saying, "No problem. I'll take care of it."&lt;br /&gt;&lt;br /&gt;"Taking care of" an impossible situation means spamming. The client's demand for the impossible and expectation of something for nothing push the SEO or Webmaster down that sorry path of search engine spamming. This approach involves the study and nurturing of a growing list of tricky spam techniques. 
&lt;br /&gt;&lt;br /&gt;The best way to defuse the issue is to bring these methods to light. If everyone knows about a spamming technique, it will cease to work. This is the way to defeat search engine spam, and is the purpose of this article.&lt;br /&gt;&lt;br /&gt;Who's Responsible?&lt;br /&gt;Search engines value popular, content-rich sites; however, many Website owners either can't or don't want to spend the money required to create that type of content and popularity. The needed resources, such as researchers, Web content developers, copywriters and skilled SEOs, aren't available, or are beyond the financial resources of the company.&lt;br /&gt;&lt;br /&gt;This is the something for nothing scenario that launches all spam projects.&lt;br /&gt;&lt;br /&gt;Traffic Power is an SEO provider that was made infamous by Google's taking action to ban the company and its clients from its index. A Google rep was quoted as saying "I believe that one SEO had convinced clients either to put spammy JavaScript mouseover redirects, doorway pages that link to other sites, or both on their clients' sites. That can lead to clients' sites being flagged as spam in addition to the doorway domains that the SEO set up." &lt;br /&gt;&lt;br /&gt;Now, it seems Traffic Power's clients are suing the company, but the damage is done. We still have to wonder who the guilty parties are.&lt;br /&gt;&lt;br /&gt;In reality, when a site utilizes spam tactics, it is the client who's ultimately responsible, not the SEO provider. The client has control over a Website and its deployment. When spamming occurs, the Website owner is solely responsible.&lt;br /&gt;&lt;br /&gt;The Lure of Top Listings&lt;br /&gt;It is well publicized by some shady operators that rankings are cheap and easy to get. That lie -- and the expectation it generates -- forces some SEOs to offer a guarantee of top 5 rankings. 
This, in turn, puts pressure on all SEO providers to provide similar guarantees.&lt;br /&gt;&lt;br /&gt;Besides angering search engine companies, such guarantees are misleading. Top rankings can't be put on a schedule like an advertising buy. &lt;a href="http://www.shanq.net"&gt;Search engine-organic results&lt;/a&gt; are not for sale, and it is this element of honesty that ensures their continued popularity: that which cannot be bought is trustworthy.&lt;br /&gt;&lt;br /&gt;When SEOs can't achieve rankings on schedule, they are forced to refund perhaps thousands of dollars. Since many are barely able to pay their bills, they can't afford to return that money. This sets the stage for SEO spamming. &lt;br /&gt;&lt;br /&gt;There are spammers who don't care one way or another -- they don't mind cheating as they have no sense of ethics. There are also large SEO companies that are tasked to create rankings for clients that just shouldn't be attempted. They want to automate the SEO process in order to increase revenues. Search engines, in contrast, want to rid their indexes of automated material of any kind.&lt;br /&gt;&lt;br /&gt;The Website owner's greed, combined with search engine spammer's opportunism, sets the stage for an unholy union. Here's just one example of a spamming site I've seen.&lt;br /&gt;&lt;br /&gt;Spammingsite1.com used several types of spam to achieve strong results:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mouse-activated redirects &lt;br /&gt;hidden table cells stuffed with keywords within &lt;h1&gt; tags &lt;br /&gt;links from contrived Websites &lt;br /&gt;&lt;br /&gt;The end users saw a different page than the search engine indexed. The search engine was tricked by these tactics, and, as is the case with all instances of spamming, lost control of the product it served to search users. &lt;br /&gt;&lt;br /&gt;Spammingsite1 was a leader in the search results -- but only because of spam. 
A check of the sites that linked to Spammingsite1 revealed a list of dubious-quality sites with which no legitimate site owner would have wanted to be associated. One of the sites was among the growing list of Open Directory copies -- sites that draw all their content from the Open Directory Project. Copies of Open Directory listings represent a huge problem for Google.&lt;br /&gt;&lt;br /&gt;The Perils of New Content Types&lt;br /&gt;As Google and Yahoo! venture into spidering new types of Web content, they run the risk of being tricked by the complexity of the code itself. Spammers succeed by staying ahead of the technical filtering capabilities of search engines. &lt;br /&gt;&lt;br /&gt;Search engines apply content filters as they spider sites, and afterward, in what's called post-processing. This sophisticated filtering is wonderful; however, it's also limited by the imagination, foresight and programming of the engineers. Spammers can trick the system by exploiting cracks in the filters. &lt;br /&gt;&lt;br /&gt;Sometimes innocent sites are penalized because they appear to have some characteristics of spamming. Is your site one of them? Why might a legitimate link to your site not be recognized? It probably looks like a paid link to the search engine. This is another huge problem for search engines: their filters are so complex that they become almost uncontrollable, and innocent sites are incorrectly penalized. &lt;br /&gt;&lt;br /&gt;Search engines can only see and know so much about any given Website and its owners. One SEO's content and links are another's spam, so it's difficult to make statements about who the spammers are. The problem is further complicated by the fact that search engines have different listing and content assessment guidelines. &lt;br /&gt;&lt;br /&gt;There are, of course, numerous tactics that are considered spam. 
Below are some of the most common spamming techniques being used right now -- tactics that should be avoided.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Publishing Empires &lt;br /&gt;Wikis &lt;br /&gt;Networked Blogs &lt;br /&gt;Forums &lt;br /&gt;Domain Spam &lt;br /&gt;Duplicate Domains &lt;br /&gt;Links inside No Script Tags &lt;br /&gt;JavaScript Redirects &lt;br /&gt;Dynamic Real Time Page Generation &lt;br /&gt;HTML invisible table cells &lt;br /&gt;DHTML layering and hidden text under layers &lt;br /&gt;Humungous machine-generated Web sites &lt;br /&gt;Link stuffing &lt;br /&gt;Invisible text &lt;br /&gt;Link Farms &lt;br /&gt;&lt;br /&gt;Let's discuss each of these in more detail.&lt;br /&gt;&lt;br /&gt;Publishing Empires&lt;br /&gt;&lt;br /&gt;When a publisher builds a vast array of interlinked Websites, it can generate high PageRank and subsequent rankings. This form of spam is difficult for a search engine to penalize, since the links are legitimate. Any single business entity has the right to interlink its own Websites. The company can create further overlap between the sites' content themes so that the links are truly valued by search engines.&lt;br /&gt;&lt;br /&gt;This kind of activity is exemplified by one of the Internet's largest publishers. The business has 120+ Web properties, all of which are carefully linked to the others. Perform a search on one of these sites, and you're virtually guaranteed to see one of the company's other Web properties in the search results.&lt;br /&gt;&lt;br /&gt;Many of the most successfully ranked sites use this system -- this form of spamming is extremely widespread. Perpetrators basically collect PageRank and link reputation within their network, then use it creatively to dominate the best keyword phrases. Search engines haven't found a way to stop this technique, but they'll have to. 
This form of spamming is a major threat to the quality of search results.&lt;br /&gt;&lt;br /&gt;Wikis&lt;br /&gt;&lt;br /&gt;Wikis are Web repositories to which anyone can post content. They can be a great way to present and edit ideas without close censorship, and have proven extremely successful for the creation, management, and maintenance of projects that require input from users around the globe.&lt;br /&gt;&lt;br /&gt;However, despite their considerable advantages, the often un-scrutinized nature of wikis makes them ripe for abuse. Like a link farm, a wiki's links are free for all. Ironically, the value of wikis is consistent with popularity-based search engines. Some of these wikis boast a very high PageRank, which can make the wiki an attractive place from which to gain a link to your site. But without close human control, users may simply add their links as a means to take advantage of the wiki's PR. Until another user of the wiki removes the link, the linked site enjoys the benefits of this unscrupulous activity. The search engine spammers have control.&lt;br /&gt;&lt;br /&gt;Networked Blogs&lt;br /&gt;&lt;br /&gt;Blogs can be a source of precise, up-to-date and technically detailed information, presented by specialists and experts. Blogs are thus very valuable to info-hungry searchers, and are extremely popular.&lt;br /&gt;&lt;br /&gt;However, some spammers start a blog and plug it full of garbage content, such as comments on what they thought at 5:15, along with a link or two and some keyword-rich text. Keyword-rich musings don't present real value to deceived searchers. Worse still, blogs often operate in a free-for-all link structure that further validates the linked sites in search engine indexes.&lt;br /&gt;&lt;br /&gt;Forums&lt;br /&gt;&lt;br /&gt;Like blogs, forums can be a rich source of relevant information. 
&lt;br /&gt;&lt;br /&gt;Unfortunately, some forum participants make comments in forums only in an effort to publish links back to their own sites. This may be acceptable if the user provides help or assistance to another forum member. Indeed, they should gain credit for that information, which they may have worked hard to discover.&lt;br /&gt;&lt;br /&gt;However, when the posts become excessive and consist solely of glib or irrelevant comments, then the value of the link, or indeed, the whole forum, can be called into question. Some forum owners start forums only in the hope that they will raise search engine rankings.&lt;br /&gt;&lt;br /&gt;Domain Spam&lt;br /&gt;&lt;br /&gt;Probably the most popular spam technique today involves creating and hosting a number of Websites. These sites rarely have any intrinsic value other than providing ranking support for the owner's main Website.&lt;br /&gt;&lt;br /&gt;I've had several former clients who had used this technique -- and had been penalized for it. After I got them to get rid of the duplicates completely, their rankings were repaired.&lt;br /&gt;&lt;br /&gt;Duplicate Domains&lt;br /&gt;&lt;br /&gt;Why can't Google detect two exact duplicate Websites that differ only in their domain names? Why would Google give these same sites first and second rank for the very same phrase? This happens all too frequently and is due to Google's preoccupation with linking between topically related sites. &lt;br /&gt;&lt;br /&gt;Domain spam is usually the result of a corporation's attempt to have Web sites for each of its company departments or subsidiaries. Those with many subsidiaries get a big boost from these domains. Realizing this, spammers are increasingly encouraging clients to have sites hosted on different IP addresses and even in different geographical locations.&lt;br /&gt;&lt;br /&gt;The link pattern detection used by Google has difficulty dealing with this practice, and is currently failing to cope with it. 
Google's new emphasis on authority sites actually makes this matter worse, as the authority can gain trust it really doesn't deserve.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20123300-113533163572826492?l=kvcindia-websolutions.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/113533163572826492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20123300&amp;postID=113533163572826492' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533163572826492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533163572826492'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/2005/12/latest-spam-technique-for-google.html' title='Latest Spam Technique for Google'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20123300.post-113533121397613791</id><published>2005-12-23T01:45:00.000-08:00</published><updated>2005-12-23T01:46:53.980-08:00</updated><title type='text'>prevent brute-force attacks against your login pages</title><content type='html'>If you're looking for a way to prevent brute-force attacks against your login pages, automated sign-ups through your registration forms, or automated spam in your blog commenting system, then look no further! 
In this article, I'll guide you through the basics of creating and integrating a security image like those found on the sign-up pages of many mainstream sites.&lt;br /&gt;&lt;br /&gt;A security image is a visual representation of a number of random characters that can be easily read by humans, but is difficult for a computer program to interpret. If you integrate such an image into your form, ask your site visitors to enter into a separate input box the letters they see in the image, and compare the two, you can easily distinguish humans from machines.&lt;br /&gt;&lt;br /&gt;Server Requirements&lt;br /&gt;This project requires that you have a Web server with PHP and the GD library installed, so make sure your hosting environment supports these packages if you want to try out this tutorial. The example we'll discuss writes GIF images, but you could easily substitute the relevant functions to output JPEG or PNG images instead, if that's all your hosting environment supports.&lt;br /&gt;&lt;br /&gt;Creating a Reusable Security Image Class&lt;br /&gt;To begin, we'll create a reusable PHP class that we can use to generate security images; we'll finish up with a simple login screen that demonstrates the use of this class.&lt;br /&gt;&lt;br /&gt;Our class will need to perform a number of tasks to generate a suitable security image. It will need to:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Create a blank image with a white background.&lt;br /&gt;&lt;br /&gt;Add some random background noise to the image. This will help to confuse automated processes that try to use character recognition algorithms to identify the characters within the image.&lt;br /&gt;&lt;br /&gt;Write out a specified number of random characters using random font selection for each character.&lt;br /&gt;&lt;br /&gt;Write the image to the user's browser. 
For added flexibility, our class will also provide the option to write the image to a file, although this isn't a core requirement of this example.&lt;br /&gt;&lt;br /&gt;Here's the skeleton of our security image class:&lt;br /&gt;&lt;br /&gt;class SecurityImage {&lt;br /&gt;var $oImage;&lt;br /&gt;var $iWidth;&lt;br /&gt;var $iHeight;&lt;br /&gt;var $iNumChars;&lt;br /&gt;var $iNumLines;&lt;br /&gt;var $iSpacing;&lt;br /&gt;var $sCode;&lt;br /&gt;&lt;br /&gt;function SecurityImage(&lt;br /&gt;$iWidth = 150,&lt;br /&gt;$iHeight = 30,&lt;br /&gt;$iNumChars = 5,&lt;br /&gt;$iNumLines = 30&lt;br /&gt;) {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;function DrawLines() {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;function GenerateCode() {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;function DrawCharacters() {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;function Create($sFilename = '') {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;function GetCode() {&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;Here, we've defined a few class properties; these will hold important information that we'll use when we generate our image. The class constructor will take four parameters, allowing us to change the appearance of the generated image. Specifically, we can choose the overall width and height of the image, the number of characters we want the image to display, and the number of background lines (or the amount of noise) to draw.&lt;br /&gt;&lt;br /&gt;Coding the Constructor&lt;br /&gt;The constructor will create the blank image, assign parameters to class properties and define the background colour of our image. 
Let's add that code now:&lt;br /&gt;&lt;br /&gt;function SecurityImage(&lt;br /&gt;$iWidth = 150,&lt;br /&gt;$iHeight = 30,&lt;br /&gt;$iNumChars = 5,&lt;br /&gt;$iNumLines = 30&lt;br /&gt;) {&lt;br /&gt;// get parameters&lt;br /&gt;$this-&gt;iWidth = $iWidth;&lt;br /&gt;$this-&gt;iHeight = $iHeight;&lt;br /&gt;$this-&gt;iNumChars = $iNumChars;&lt;br /&gt;$this-&gt;iNumLines = $iNumLines;&lt;br /&gt;&lt;br /&gt;// create new image&lt;br /&gt;$this-&gt;oImage = imagecreate($iWidth, $iHeight);&lt;br /&gt;&lt;br /&gt;// allocate white background colour&lt;br /&gt;imagecolorallocate($this-&gt;oImage, 255, 255, 255);&lt;br /&gt;&lt;br /&gt;// calculate spacing between characters based on width of image&lt;br /&gt;$this-&gt;iSpacing = (int)($this-&gt;iWidth / $this-&gt;iNumChars);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;The last line calculates a value for the spacing to include between characters, based on the number of characters it needs to generate, and the image width we specified.&lt;br /&gt;&lt;br /&gt;Making Some Noise&lt;br /&gt;Next, we'll add code to create background noise (the lines). We add the lines first, because we want the characters to sit above them in the image, ensuring that the characters can be read by human users.&lt;br /&gt;&lt;br /&gt;function DrawLines() {&lt;br /&gt;for ($i = 0; $i &lt; $this-&gt;iNumLines; $i++) {&lt;br /&gt;$iRandColour = rand(190, 250);&lt;br /&gt;$iLineColour = imagecolorallocate($this-&gt;oImage, $iRandColour, $iRandColour, $iRandColour);&lt;br /&gt;imageline($this-&gt;oImage, rand(0, $this-&gt;iWidth), rand(0, $this-&gt;iHeight), rand(0, $this-&gt;iWidth), rand(0, $this-&gt;iHeight), $iLineColour);&lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;As the code loops, it creates lines of random lengths in random positions. Each line is drawn in a random shade of grey selected from an RGB range. 
I have chosen a range of 190-250 to sufficiently obscure the characters from any automated process that might try to interpret them, while at the same time offering enough contrast to make them easily readable without squinting! You'll see below that we've chosen a darker grey scale range to write out the characters. You can increase or decrease the number of lines drawn -- and, therefore, the noise generated -- by passing a higher or lower value to the lines parameter of the class constructor.&lt;br /&gt;&lt;br /&gt;On to the Characters&lt;br /&gt;The code to generate the characters is split between two methods. The first, GenerateCode, generates the code, while the second, DrawCharacters, writes it to the image.&lt;br /&gt;&lt;br /&gt;function GenerateCode() {&lt;br /&gt;// reset code&lt;br /&gt;$this-&gt;sCode = '';&lt;br /&gt;&lt;br /&gt;// loop through and generate the code letter by letter&lt;br /&gt;for ($i = 0; $i &lt; $this-&gt;iNumChars; $i++) {&lt;br /&gt;// select random character and add to code string&lt;br /&gt;$this-&gt;sCode .= chr(rand(65, 90));&lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;The GenerateCode method starts by clearing the $this-&gt;sCode variable, to prevent any saved characters that were generated previously from confusing our current image generation work.&lt;br /&gt;&lt;br /&gt;Then, the method loops until it has selected the number of random characters requested in the class constructor. On each iteration of the loop, it appends the selected character to the $this-&gt;sCode variable.&lt;br /&gt;&lt;br /&gt;Upper case characters are picked from ASCII character codes in the range 65 to 90. We use the rand function to select a number within this range, then pass it to the chr function to convert it into a readable character for display.&lt;br /&gt;&lt;br /&gt;I've stuck to upper case characters because their ASCII codes are a continuous set of numbers; this makes it easier to pick a character at random. 
Including lower case characters or numbers would add an extra level of complexity to the character selection process. However, if you wanted to take this step, you could statically declare an array of characters to use, then choose a random number to act as an array index for character selection. The following code illustrates this point:&lt;br /&gt;&lt;br /&gt;// characters to use&lt;br /&gt;$aChars = array('A', 'B', 'C', '3', 'g');&lt;br /&gt;&lt;br /&gt;// get highest valid array index&lt;br /&gt;$iTotal = count($aChars) - 1;&lt;br /&gt;&lt;br /&gt;// get random index&lt;br /&gt;$iIndex = rand(0, $iTotal);&lt;br /&gt;&lt;br /&gt;// selected character&lt;br /&gt;$this-&gt;sCode .= $aChars[$iIndex];&lt;br /&gt;&lt;br /&gt;This code would replace the following line in the GenerateCode method:&lt;br /&gt;&lt;br /&gt;// select random character&lt;br /&gt;$this-&gt;sCode .= chr(rand(65, 90));&lt;br /&gt;&lt;br /&gt;Having generated this code, we call DrawCharacters to write the selected characters to the image.&lt;br /&gt;&lt;br /&gt;function DrawCharacters() {&lt;br /&gt;// loop through and write out selected number of characters&lt;br /&gt;for ($i = 0; $i &lt; strlen($this-&gt;sCode); $i++) {&lt;br /&gt;// select random font&lt;br /&gt;$iCurrentFont = rand(1, 5);&lt;br /&gt;&lt;br /&gt;// select random greyscale colour&lt;br /&gt;$iRandColour = rand(0, 128);&lt;br /&gt;$iTextColour = imagecolorallocate($this-&gt;oImage, $iRandColour, $iRandColour, $iRandColour);&lt;br /&gt;&lt;br /&gt;// write text to image&lt;br /&gt;imagestring($this-&gt;oImage, $iCurrentFont, $this-&gt;iSpacing / 3 + $i * $this-&gt;iSpacing, ($this-&gt;iHeight - imagefontheight($iCurrentFont)) / 2, $this-&gt;sCode[$i], $iTextColour);&lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;With each iteration of the loop, the DrawCharacters method selects a random font from the five built into the GD library. They're simply numbered 1 to 5. 
We could have used FreeType or TrueType fonts, but this would have meant that, for this example to work, we'd have to worry about the locations of the required font files. I'll leave adding support for those fonts as an exercise for you!&lt;br /&gt;&lt;br /&gt;In the next section, a random grey scale from an RGB range of 0 to 128 is selected. As previously mentioned, this is darker than the range we used to draw the background lines, ensuring that our image displays enough contrast to maintain its readability. You may want to tinker with these ranges to obtain what you feel is the best trade-off between obscurity and readability.&lt;br /&gt;&lt;br /&gt;Finally, the character is written to the image. You'll notice that we're treating the $this-&gt;sCode string as an array, which enables us to select each character in turn.&lt;br /&gt;&lt;br /&gt;The three methods we've created so far -- DrawLines, GenerateCode and DrawCharacters -- are private. Any code that calls the class won't call them directly, but will instead use them through a public wrapper method. 
In PHP 4, there's no way to mark a method as private, so we'll just have to hope that anyone who calls the class is well-behaved and uses the wrapper function we've provided.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20123300-113533121397613791?l=kvcindia-websolutions.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/113533121397613791/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20123300&amp;postID=113533121397613791' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533121397613791'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533121397613791'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/2005/12/prevent-brute-force-attacks-against.html' title='prevent brute-force attacks against your login pages'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20123300.post-113533066416996093</id><published>2005-12-23T00:55:00.000-08:00</published><updated>2005-12-23T01:44:24.790-08:00</updated><title type='text'>PHP mistakes leading to security breach</title><content type='html'>&lt;a class="glossary" title="PHP, or Hypertext Preprocessor, is an open source, server-side programming language." href="http://www.sitepoint.com/glossary.php?q=P#term_1"&gt;PHP&lt;/a&gt; is a terrific language for the rapid development of dynamic Websites. 
It also has many features that are friendly to beginning programmers, such as the fact that it doesn't require variable declarations. However, many of these features can lead a programmer inadvertently to allow security holes to creep into a Web application. The popular security mailing lists teem with notes of flaws identified in PHP applications, but PHP can be as secure as any other language once you understand the basic types of flaws PHP applications tend to exhibit.&lt;br /&gt;In this article, I'll detail many of the common PHP programming mistakes that can result in security holes. By showing you what not to do, and how each particular flaw can be exploited, I hope that you'll understand not just how to avoid these particular mistakes, but also why they result in security vulnerabilities. Understanding each possible flaw will help you avoid making the same mistakes in your PHP applications.&lt;br /&gt;Security is a process, not a product, and adopting a sound approach to security during the process of application development will allow you to produce tighter, more robust code.&lt;br /&gt;&lt;br /&gt;Unvalidated Input Errors&lt;br /&gt;One of -- if not the -- most common PHP security flaws is the unvalidated input error. User-provided data simply cannot be trusted. You should assume every one of your Web application users is malicious, since it's certain that some of them will be. 
Unvalidated or improperly validated input is the root cause of many of the exploits we'll discuss later in this article.&lt;br /&gt;As an example, you might write the following code to allow a user to view a calendar that displays a specified month by calling the &lt;a class="glossary" title="Multi-user, multitasking Operating System and set of specifications" href="http://www.sitepoint.com/glossary.php?q=U#term_22"&gt;UNIX&lt;/a&gt; cal command.&lt;br /&gt;$month = $_GET['month'];&lt;br /&gt;$year = $_GET['year'];&lt;br /&gt;exec("cal $month $year", $result);&lt;br /&gt;print "&amp;lt;pre&amp;gt;";&lt;br /&gt;foreach ($result as $r) { print "$r&amp;lt;br /&amp;gt;"; }&lt;br /&gt;print "&amp;lt;/pre&amp;gt;";&lt;br /&gt;This code has a gaping security hole, since the $_GET['month'] and $_GET['year'] variables are not validated in any way. The application works perfectly, as long as the specified month is a number between 1 and 12, and the year is provided as a proper four-digit year. However, a malicious user might append ";ls -la" to the year value and thereby see a listing of your Website's &lt;a class="glossary" title="HTML stands for HyperText Markup Language." href="http://www.sitepoint.com/glossary.php?q=H#term_75"&gt;html&lt;/a&gt; directory. An extremely malicious user could append ";rm -rf" to the year value and delete your entire Website!&lt;br /&gt;The proper way to correct this is to ensure that the input you receive from the user is what you expect it to be. Do not use &lt;a class="glossary" title="JavaScript is a Web scripting language most commonly used for client-side applications." 
href="http://www.sitepoint.com/glossary.php?q=J#term_9"&gt;JavaScript&lt;/a&gt; validation for this; such validation methods are easily bypassed by an attacker who creates their own form or disables JavaScript. You need to add PHP code to ensure that the month and year inputs are digits and only digits, as shown below.&lt;br /&gt;$month = $_GET['month'];&lt;br /&gt;$year = $_GET['year'];&lt;br /&gt;if (!preg_match("/^[0-9]{1,2}$/", $month)) die("Bad month, please re-enter.");&lt;br /&gt;if (!preg_match("/^[0-9]{4}$/", $year)) die("Bad year, please re-enter.");&lt;br /&gt;exec("cal $month $year", $result);&lt;br /&gt;print "&amp;lt;pre&amp;gt;";&lt;br /&gt;foreach ($result as $r) { print "$r&amp;lt;br /&amp;gt;"; }&lt;br /&gt;print "&amp;lt;/pre&amp;gt;";&lt;br /&gt;This code can safely be used without concern that a user could provide input that would compromise your application, or the server running it. Regular expressions are a great tool for input validation. They can be difficult to grasp, but are extremely useful in this type of situation.&lt;br /&gt;You should always validate your user-provided data by rejecting anything other than the expected data. Never use the approach that you'll accept anything except data you know to be harmful -- this is a common source of security flaws. Sometimes, malicious users can get around this methodology, for example, by including bad input but obscuring it with null characters. Such input would pass your checks, but could still have a harmful effect.&lt;br /&gt;You should be as restrictive as possible when you validate any input. If some characters don't need to be included, you should probably either strip them out, or reject the input completely.&lt;br /&gt;Access Control Flaws&lt;br /&gt;Another type of flaw that's not necessarily restricted to PHP applications, but is important nonetheless, is the access control type of vulnerability. 
This flaw rears its head when you have certain sections of your application that must be restricted to certain users, such as an administration page that allows configuration settings to be changed, or displays sensitive information.&lt;br /&gt;You should check the user's credentials upon every load of a restricted page of your PHP application. If you check the user's credentials on the index page only, a malicious user could directly enter a URL to a "deeper" page, which would bypass this credential checking process.&lt;br /&gt;It's also advisable to layer your security, for example, by restricting user access on the basis of the user's IP address as well as their user name, if possible. Placing your restricted pages in a separate directory that's protected by an &lt;a class="glossary" title="Apache is one of the world's most widely-used Web servers." href="http://www.sitepoint.com/glossary.php?q=A#term_19"&gt;Apache&lt;/a&gt; .htaccess file is also good practice.&lt;br /&gt;Place configuration files outside your Web-accessible directory. A configuration file can contain database passwords and other information that could be used by malicious users to penetrate or deface your site; never allow these files to be accessed by remote users. Use the PHP include function to include these files from a directory that's not Web-accessible, possibly including an .htaccess file containing "deny from all". Though this is redundant, layering security is a positive thing.&lt;br /&gt;For my PHP applications, I prefer a directory structure based on the sample below. All function libraries, classes and configuration files are stored in the includes directory. Always name these include files with a .php extension, so that even if all your protection is bypassed, the Web server will parse the PHP code, and will not display it to the user. 
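The www and admin directories are the only directories whose files can be accessed directly by a URL; the admin directory is protected by an .htaccess file that allows users entry only if they know a user name and password that's stored in the .htpasswd file in the root directory of the site.&lt;br /&gt;/home&lt;br /&gt;&amp;nbsp;&amp;nbsp;/httpd&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/www.example.com&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;.htpasswd&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/includes&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;cart.class.php&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;config.php&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/logs&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;access_log&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;error_log&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/www&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;index.php&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/admin&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;.htaccess&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;index.php&lt;br /&gt;You should set your Apache directory indexes to 'index.php', and keep an index.php file in every directory. Set it to redirect to your main page if the directory should not be browsable, such as an images directory or similar.&lt;br /&gt;Never, ever, make a backup of a PHP file in your Web-exposed directory by adding .bak or another extension to the filename. If you do this, the PHP code in the file will not be parsed by the Web server, and may be output as source to a user who stumbles upon a URL to the backup file. If that file contained passwords or other sensitive information, that information would be readable -- it could even end up being indexed by Google if the spider stumbled upon it! Renaming files to have a .bak.php extension is safer than tacking a .bak onto the .php extension, but the best solution is to use a source code version control system like CVS. CVS can be complicated to learn, but the time you spend will pay off in many ways. The system saves every version of each file in your project, which can be invaluable when changes are made that cause problems later.&lt;br /&gt;&lt;br /&gt;Session ID Protection&lt;br /&gt;Session ID hijacking can be a problem with PHP Websites. The PHP session tracking component uses a unique ID for each user's session, but if this ID is known to another user, that person can hijack the user's session and see information that should be confidential. 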
Session ID hijacking cannot be completely prevented; you should know the risks so you can mitigate them.&lt;br /&gt;For instance, even after a user has been validated and assigned a session ID, you should revalidate that user when he or she performs any highly sensitive actions, such as resetting passwords. Never allow a session-validated user to enter a new password without also entering their old password, for example. You should also avoid displaying truly sensitive data, such as credit card numbers, to a user who has only been validated by session ID.&lt;br /&gt;A user who creates a new session by logging in should be assigned a fresh session ID using the session_regenerate_id function. A hijacking user will try to set his session ID prior to login; this can be prevented if you regenerate the ID at login.&lt;br /&gt;If your site is handling critical information such as credit card numbers, always use an SSL secured connection. This will help reduce session hijacking vulnerabilities, since the session ID cannot be sniffed and easily hijacked.&lt;br /&gt;If your site is run on a shared Web server, be aware that any session variables can easily be viewed by any other users on the same server. Mitigate this vulnerability by storing all sensitive data in a database record that's keyed to the session ID rather than as a session variable. If you must store a password in a session variable, do not store the password in clear text; use the sha1() (PHP 4.3+) or md5() function to store the hash of the password instead.&lt;br /&gt;if ($_SESSION['password'] == $userpass) {&lt;br /&gt;// do sensitive things here&lt;br /&gt;}&lt;br /&gt;The above code is not secure, since the password is stored in plain text in a session variable. 
Instead, use code more like this:&lt;br /&gt;if ($_SESSION['sha1password'] == sha1($userpass)) {&lt;br /&gt;// do sensitive things here&lt;br /&gt;}&lt;br /&gt;The SHA-1 algorithm is not without its flaws, and further advances in computing power are making it possible to generate what are known as collisions (different strings with the same SHA-1 sum). Yet the above technique is still vastly superior to storing passwords in clear text. Use MD5 if you must -- since it's superior to a clear text-saved password -- but keep in mind that recent developments have made it possible to generate MD5 collisions in less than an hour on standard PC hardware. Ideally, one should use a function that implements SHA-256; such a function does not currently ship with PHP and must be found separately.&lt;br /&gt;&lt;br /&gt;Cross Site Scripting (XSS) Flaws&lt;br /&gt;Cross site scripting, or XSS, flaws are a subset of input validation flaws in which a malicious user embeds scripting commands -- usually JavaScript -- in data that is displayed to, and therefore executed in the browser of, another user.&lt;br /&gt;For example, if your application included a forum in which people could post messages to be read by other users, a malicious user could embed a &amp;lt;script&amp;gt; tag that would redirect the page to a site controlled by them, pass your &lt;a class="glossary" title="Cookies are pieces of data that a Website uses to &amp;quot;brand&amp;quot; users' computers in order to identify them individually." href="http://www.sitepoint.com/glossary.php?q=C#term_59"&gt;cookie&lt;/a&gt; and session information as GET variables to their page, then reload your page as though nothing had happened. The malicious user could thereby collect other users' cookie and session information, and use this data in a session hijacking or other attack on your site.&lt;br /&gt;To prevent this type of attack, you must perform user input validation by disallowing any &amp;lt;script&amp;gt; tags from being submitted to your forms. 
Always convert the &amp;lt; and &amp;gt; characters in user input that may be viewed by other users to &amp;amp;lt; and &amp;amp;gt;. PHP's htmlspecialchars() function performs this conversion, and also converts the ampersand and double quote. Additionally, it may be wise to convert parentheses and hash (#) characters to their HTML entity equivalents.&lt;br /&gt;&lt;a class="sublink" href="http://www.cgisecurity.com/articles/xss-faq.shtml"&gt;The Cross Site Scripting FAQ&lt;/a&gt; at &lt;a class="sublink" href="http://www.cgisecurity.com/"&gt;cgisecurity.com&lt;/a&gt; provides much more information and background on this type of flaw, and explains it well. I highly recommend reading and understanding it. XSS flaws can be difficult to spot and are among the easiest mistakes to make when programming a PHP application, as illustrated by the high number of XSS advisories issued on the popular security mailing lists.&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kvcindia-websolutions.blogspot.com/feeds/113533066416996093/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20123300&amp;postID=113533066416996093' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533066416996093'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20123300/posts/default/113533066416996093'/><link rel='alternate' type='text/html' href='http://kvcindia-websolutions.blogspot.com/2005/12/php-mistakes-leading-to-security.html' title='PHP mistakes leading to security breach'/><author><name>website-seo-services</name><uri>http://www.blogger.com/profile/11504393972691329912</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
