Two Simple Ways to Read Restricted Website Content
Have you ever used a search engine to find the solution to a problem, only to have the results point you to a site that forced you to register before you could see the content? This happened to me all the time before I found two simple ways to display that content without having to register at all.
Let me begin by explaining the why before I tell you the how. For a search engine to index a site’s content, it needs to be able to see that content. The site’s webmasters are eager to let the search engine see it, since they know it will drive additional visitors to their site. As a result, they have to find a way to show the content to the search engine while hiding it from the average user. Most of the time they do this by keying off of the browser’s User-Agent string. This creates a loophole for us to exploit: if Google is able to see the content, then so can we. Here are my two tricks to see the restricted content:
- Trick #1: Change the user agent! In Firefox, this can be done rather easily by entering “about:config” in the address bar. Right-click to open the context menu and select New->String. Enter the preference name “general.useragent.override”, then enter the user agent string that you want to use as its value. If you want to cloak yourself as Google and see what they see, try using the string “Googlebot/2.1 (+http://www.googlebot.com/bot.html)”. You can check the current value by entering “about:” in the address bar.
- Trick #2: Check Google’s cache! If you don’t want to go mucking with your browser’s preferences, there’s an even easier trick. Go to http://www.google.com and enter the restricted URL with “cache:” prepended to it. This pulls the site’s indexed content out of Google’s cache instead of from the site itself, bypassing the access restrictions entirely.
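For anyone who would rather script Trick #1 than edit browser preferences, here is a minimal sketch using only Python’s standard library. The helper names `build_request` and `fetch` are made up for illustration, the user agent string is the one quoted above, and there is no guarantee that any particular site keys off of this header.

```python
import urllib.request

# User agent string quoted in the post; a site may or may not honor it.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Return a Request for url that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Fetch url with the overridden user agent and return the body as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch("http://example.com/some-article")` (a placeholder URL) would return the page as served to that user agent. `build_request` is kept separate so the header can be inspected without touching the network.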
So, there you have it: two simple ways to read restricted website content without ever having to register an account with the site. Enjoy!
April 9th, 2009 at 9:15 am
What do you mean by prepend cache before it? Mind explaining it?
Thanks.
April 9th, 2009 at 11:05 am
Go to http://www.google.com and do a search for something. Notice that at the bottom of each search result there is a “Cached” link. Click it and you’ll see the version of that page that was stored the last time Google indexed it. You can also get there directly by modifying the search you start with. Say, for example, I thought there might be something hiding on webadminblog.com that only the Google bots could see. I would search in Google for “cache:http://www.webadminblog.com”. The end result is the same content you would get by searching for something and clicking the “Cached” link, but this takes you straight to the cached version without searching for it first. Note that there is “cache:” before the URL in my search term. This is what I meant by “prepend cache before it”.
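The same search can be put together as a URL in a few lines of Python. This is only a sketch: the function name `cache_search_url` is made up for illustration, and it assumes Google’s usual `/search?q=` query format.

```python
from urllib.parse import urlencode


def cache_search_url(page_url):
    """Build a Google search URL whose query is "cache:" plus page_url."""
    query = "cache:" + page_url
    # urlencode percent-escapes the ":" and "/" characters in the query.
    # The "/search?q=" path is an assumption about Google's URL scheme.
    return "http://www.google.com/search?" + urlencode({"q": query})


print(cache_search_url("http://www.webadminblog.com"))
```

Pasting the printed URL into the address bar is equivalent to typing “cache:http://www.webadminblog.com” into the search box.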
October 28th, 2009 at 8:58 am
Doesn’t work for me… mind helping me out?