Google Dork Report

Google dorking uses specialized search queries and operators to find private information that is not intended for public viewing but has not been adequately protected, such as usernames, passwords, and sensitive documents. The Google Hacking Database provides categorized queries to uncover such information. While robots.txt files can help manage search engine access to websites, they have limitations and cannot fully prevent private pages from being indexed if those pages are linked from other sites. Other measures, such as password protection or a noindex directive, are needed to fully prevent disclosure of private information.


GOOGLE DORKING
• Google dorking, also known as Google hacking, can return information that is difficult to locate through simple search queries. This includes information that is not intended for public viewing but has not been adequately protected.

• A Google dork query, sometimes just referred to as a dork, is a search string that uses advanced search operators to find information that is not readily available on a website.

• As a passive attack method, Google dorking can return usernames and passwords, email lists, sensitive documents, personally identifiable financial information (PIFI), and website vulnerabilities. That information can be used for any number of illegal activities, including cyberterrorism, industrial espionage, identity theft, and cyberstalking.

Here are some basic and important dork operators:

S.No.  Operator    Description                                          Example
1      intitle:    finds strings in the title of a page                 intitle:"Your Text"
2      allintext:  finds all terms in the text of a page                allintext:"Contact"
3      inurl:      finds strings in the URL of a page                   inurl:"news.php?id="
4      site:       restricts a search to a particular site or domain    site:yeahhub.com "Keyword"
5      filetype:   finds specific file types (doc, pdf, mp3, etc.)      filetype:pdf "Cryptography"
6      link:       searches for all links to a site or URL              link:"example.com"
7      cache:      displays Google's cached copy of a page              cache:yeahhub.com
8      info:       displays summary information about a page            info:www.example.com
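
These operators can be combined in a single query to narrow results. A few illustrative combinations (example.com and the quoted keywords are placeholders, not real targets):

    site:example.com filetype:pdf "confidential"
    inurl:"login.php" intitle:"admin"
    site:example.com inurl:"news.php?id="

The first restricts the search to one domain and one file type; the second looks for admin login pages; the third hunts for URL patterns that are often associated with injectable parameters.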
• An effective way to work with dorks is to utilise the GHDB (Google Hacking Database).

• The Google Hacking Database (GHDB) is a categorized index of Internet search engine queries designed to uncover interesting, and usually sensitive, information made publicly available on the Internet. In most cases, this information was never meant to be made public, but due to any number of factors it was linked in a web document that was crawled by a search engine, which then followed that link and indexed the sensitive information.

Some GHDB categories include (sample queries follow the list):

• Files containing juicy info
• Pages containing login details
• Various online devices
• Web server detection
• Footholds
• Files containing usernames and passwords
• Vulnerable files
• Error messages
• Sensitive online shopping info
• Sensitive directories
• Network or vulnerability data
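
For illustration, here are a few widely circulated GHDB-style query patterns, shown only as examples of how the categories translate into dorks (running them against sites you do not own may be illegal):

    intitle:"index of" "parent directory"    (sensitive directories: exposed directory listings)
    filetype:log intext:password             (files containing juicy info: logs that may leak credentials)
    intitle:"login" inurl:admin              (pages containing login details: admin login portals)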

Now there is something called robots.txt, which is used as a safety measure to avoid leaking sensitive data to Google.
• A robots.txt file tells search engine crawlers which URLs the crawler can access on your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, block indexing with noindex or password-protect the page.
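
A minimal robots.txt sketch (the paths are hypothetical placeholders); the file must be served at the site root, e.g. https://example.com/robots.txt:

    User-agent: *
    Disallow: /private/
    Disallow: /search

    Sitemap: https://example.com/sitemap.xml

Note that Disallow only asks compliant crawlers not to fetch those paths; as discussed below, it does not keep the URLs out of the index.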

robots.txt effect on different file types:

Web page: You can use a robots.txt file for web pages (HTML, PDF, or other non-media formats that Google can read) to manage crawling traffic if you think your server will be overwhelmed by requests from Google's crawler, or to avoid crawling unimportant or similar pages on your site.

If your web page is blocked with a robots.txt file, its URL can still appear in search results, but the search result will not have a description. Image files, video files, PDFs, and other non-HTML files embedded in the blocked page will be excluded from crawling as well. If you see this search result for your page and want to fix it, remove the robots.txt entry blocking the page. If you want to hide the page completely from Search, use another method.

Media file: Use a robots.txt file to manage crawl traffic, and also to
prevent image, video, and audio files from appearing in Google
search results. This won't prevent other pages or users from linking to
your image, video, or audio file.

Resource file: You can use a robots.txt file to block resource files such as unimportant image, script, or style files, if you think that pages loaded without these resources will not be significantly affected by the loss. However, if the absence of these resources makes the page harder for Google's crawler to understand, don't block them, or else Google won't do a good job of analyzing pages that depend on those resources.
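
A sketch tying these cases together (all paths are hypothetical):

    # Keep images in /photos/ out of Google Images results
    User-agent: Googlebot-Image
    Disallow: /photos/

    # Block an unimportant resource directory for all crawlers,
    # but leave the CSS and scripts that pages need for rendering
    User-agent: *
    Disallow: /assets/drafts/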

Understand the limitations of a robots.txt file:

Before you create or edit a robots.txt file, you should know the limits of this URL blocking method. Depending on your goals and situation, you might want to consider other mechanisms to ensure your URLs are not findable on the web.
• robots.txt directives may not be supported by all search engines.
The instructions in robots.txt files cannot enforce crawler behavior on your site; it is up to the crawler to obey them. While Googlebot and other respectable web crawlers obey the instructions in a robots.txt file, other crawlers might not. Therefore, if you want to keep information secure from web crawlers, it is better to use other blocking methods, such as password-protecting private files on your server.
• Different crawlers interpret syntax differently.
Although respectable web crawlers follow the directives in a robots.txt file, each crawler might interpret the directives differently. You should know the proper syntax for addressing different web crawlers, as some might not understand certain instructions.
• A page that's disallowed in robots.txt can still be indexed if it is linked to from other sites.
While Google won't crawl or index the content blocked by a robots.txt file, it might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results. To properly prevent your URL from appearing in Google search results, password-protect the files on your server, use the noindex meta tag or response header, or remove the page entirely (examples follow below).
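
To illustrate the alternatives mentioned above: a noindex directive can be delivered either as an HTML meta tag in the page's <head> or as an HTTP response header:

    <meta name="robots" content="noindex">

    X-Robots-Tag: noindex

For noindex to take effect, the page must not be blocked by robots.txt, because the crawler has to fetch the page to see the directive. Password protection can be sketched roughly with Apache's .htaccess mechanism (file locations here are assumptions):

    AuthType Basic
    AuthName "Restricted"
    AuthUserFile /etc/apache2/.htpasswd
    Require valid-user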

References:
• https://www.exploit-db.com/google-hacking-database
• https://www.yeahhub.com/top-8-basic-google-search-dorks-live-examples/
• https://www.techtarget.com
• https://infosecwriteups.com/google-hacking-dorking-3a58757a9ae7
• https://en.wikipedia.org/wiki/Google_hacking
• https://developers.google.com/search/docs/advanced/robots/intro
• https://www.esds.co.in/blog/ghdb-google-hacking-database/
