Two Simple Ways to Read Restricted Website Content

Aug. 24, 2008 in Popular, Web Application Security

Have you ever used a search engine to find the solution to a problem, only to have the search bring you to a site that forces you to register in order to see the content? This happened to me all the time before I found two simple ways to display that content without having to register at all.

Let me begin by explaining the why before I tell you the how. In order for a search engine to index a site’s content, it needs to be able to see that content. The webmasters of that site are eager to let the search engine see the content, as they know it will drive additional visitors to their site. The end result is that they have to find a way for the search engine to see the content while at the same time hiding it from the average user. Most of the time they do this by keying off of the User-Agent header that the browser sends. This creates a loophole for us to exploit: if Google is able to see the content, then so can we. Here are my two tricks to see the restricted content:

Trick #1: Change the user agent! In Firefox, this can be done rather easily by entering “about:config” in the address bar. Click the right mouse button to get the context menu and then select New->String. Enter the preference name “general.useragent.override”. Now simply enter the user agent string that you want to use as its value. If you want to cloak yourself as Google and see what they see, try using the string “Googlebot/2.1 (+http://www.googlebot.com/bot.html)”. You can check the current value by just entering “about:” in the address bar.
Trick #2: Check Google’s cache! If you don’t want to go about mucking with your browser’s preferences, there’s an even easier trick for you to use. Go to http://www.google.com and enter the restricted URL, but prepend “cache:” before it. This allows you to pull the site’s indexed content out of Google’s cache instead of from the site itself, bypassing the access restrictions entirely.
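
For those who would rather script Trick #1 than edit browser preferences, here is a minimal Python sketch of the same idea. The looks_like_crawler function is a hypothetical stand-in for the kind of check a site might run (I don’t know what any particular site actually does), and the example.com URL is a placeholder. The request is only built, not sent.

```python
from urllib.request import Request

# User agent string quoted in Trick #1.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def looks_like_crawler(user_agent: str) -> bool:
    """Hypothetical server-side check that trusts whatever the client claims."""
    return "googlebot" in user_agent.lower()


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> Request:
    """Build (but do not send) a request that announces the given user agent."""
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder URL, used for illustration only.
req = build_request("http://www.example.com/restricted-article")

# urllib stores header names capitalized as "User-agent".
claimed = req.get_header("User-agent")
print(claimed)
print(looks_like_crawler(claimed))
```

Passing the request to urllib.request.urlopen would actually fetch the page. A site that verifies crawlers some other way, by IP address for example, would not be fooled by this.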

So, there you have it. Two simple ways to read restricted website content without ever having to register an account with the site. Enjoy!

Tags: access, agent, browser, cache, cloaking, content, engine, google, restricted, Search, user, website

3 Comments on “Two Simple Ways to Read Restricted Website Content”

NaDou
April 9th, 2009 at 9:15 am
What do you mean by prepend cache before it? Mind explaining it?

Thanks.

Josh
April 9th, 2009 at 11:05 am
Go to http://www.google.com and do a search for something. Notice that at the bottom of each search result there is a link for a “Cached” version. Click the link and you’ll see the version of that page that was cached the last time Google indexed it. You can also do this manually by modifying the search you use to begin with. Say, for example, I thought there might be something hiding on webadminblog.com that only the Google bots could see. I would search in Google for “cache:http://www.webadminblog.com”. The end result is the same content as if you searched for something and clicked the “Cached” link, but this allows you to go directly to the cached version without searching for it. Note that there is “cache:” before the URL in my search term. This is what I meant by “prepend cache before it”.
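
If it helps to see it spelled out, here is a small Python sketch that builds the search for you. The cache_search_url helper is a name made up for this example, and the “/search?q=” path is my assumption about how Google’s search URLs are laid out rather than anything documented here.

```python
from urllib.parse import urlencode


def cache_search_url(page_url: str) -> str:
    """Return a Google search URL whose query is 'cache:' followed by page_url."""
    query = "cache:" + page_url  # "prepend cache before it"
    # Assumed search endpoint; urlencode escapes the ':' and '/' characters.
    return "http://www.google.com/search?" + urlencode({"q": query})


print(cache_search_url("http://www.webadminblog.com"))
```

Pasting the printed address into the browser should be equivalent to typing the “cache:” search term into the search box by hand.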

Aerrow
October 28th, 2009 at 8:58 am
Doesn’t work for me… mind helping me out?

Welcome to WebAdminBlog!

This blog site is run by Josh Sokol, the Founder and CEO of SimpleRisk, a free tool for Governance, Risk Management, and Compliance. Josh is a former Web Admin and Information Security Program Owner of National Instruments.

Web Admin Blog

Real Web Admins. Real World Experience.