10.1109 Iccic.2017.8524272
Abstract: Internet forensics has become an indispensable part of cyber forensics, owing to the rapid growth in the number of cybercrimes related to Internet usage. These crimes range from malware crimes to crimes related to the use of social media, banking transactions and other financial services. In these types of crimes, the browser files generated by the different web browsers should be analyzed. Among the artifacts left by web browsers, the most relevant in a forensic investigation is the cache file, as it stores forensically important information about frequently visited websites. Investigators can obtain a clear picture of visited websites, loaded pictures and other objects using the information stored in the cache files. This paper describes the structure of the cache files created by Google Chrome in detail. The results obtained in this way can provide forensically sound information in a cybercrime investigation. Advanced analysis of JavaScript and other objects obtained in this way can provide crucial evidence in proving different types of cybercrimes, including malware crimes.

Keywords: Browser Forensics, Google Chrome, Cache File, Cyber Forensics, Internet Forensics

I. INTRODUCTION
Cyber forensics is the identification, acquisition, authentication, analysis, presentation and preservation of cybercrime evidence [1]. New types of cybercrimes are evolving due to the rapid growth of information technology. Most of the reported crimes are online crimes, and thus Internet forensics is an unavoidable part of cybercrime investigation. Browser forensics is the main part of Internet forensics and deals with the acquisition and analysis of the files generated by web browsers. Web browsers keep information about visited sites, downloads, search history, cookies and cache in a predefined location on the suspect's machine. Forensically relevant information about the web browsing profile of the user can be obtained by analyzing these files. Many browsers are available; the main ones are Internet Explorer, Google Chrome and Mozilla Firefox. Since Google Chrome is one of the most popular web browsers, it is important to analyze the files left by this browser during a cyber forensics investigation. The importance of identifying Internet activities for computer forensics is explained in [2].

II. BROWSER FORENSICS
Browser forensics plays an important role in providing forensically relevant information in a cybercrime investigation. This is because web browsers create a number of files on the local system during Internet browsing. These files may hold history, cookies, cache, bookmarks and other forensically relevant information. Whenever a user makes a request in the web browser, the details of that access are added to the browser files. Once created, browser files can be accessed and analyzed to retrieve forensic evidence. Evidence of cybercrimes related to Internet usage, including social media, can therefore be identified through browser forensics. The involvement of a suspect in social media can be identified from the visited sites found in the browser files. This may provide initial hints that point to a terrorism-related cybercrime and help further investigation. Thus, this type of browser forensics may provide crucial evidence in a cybercrime investigation. The anti-forensics techniques that can be used to erase browser-related evidence are described in [3].

III. GOOGLE CHROME
A Windows computer stores the browser files created by Google Chrome in “OS Drive:\Users\Username\AppData\Local\Google\Chrome\User Data\Default\”. In addition to this default profile path, there may be other folders in “...Google\Chrome\User Data\”, one for each additional profile created by the user. The cached pages and images that a user accessed through Google Chrome are stored in a folder named “Cache” inside these profile paths. Google Chrome saves forensically relevant details in different files such as Bookmarks, Cookies, Current Session, Current Tabs, History, Last Session, Last Tabs, Network Action Predictor, Preferences, Visited Links and Top Sites. This information is stored in separate files created in the profile paths, as shown in Table I. Table II shows the forensically relevant artifacts that can be retrieved from these browser files.

The various artifacts in Google Chrome are described in [4]. The browser files created by Google Chrome are mostly saved in SQLite database format, but some of the information is stored in JSON and SNSS files. Among these, Cookies, Extension Cookies, Favicons, History, Login Data, Network Action Predictor and Web Data are in SQLite format. The Bookmarks file is stored in JSON format, and session and tab information is in SNSS format. The purpose of the browser cache is to avoid re-downloading information that has been viewed previously, so web browsers cache various types of objects on a system. From these cached files, a previously visited website can itself be reconstructed, as described in [5]. The next section describes the structure of the cache files created by Google Chrome in detail.
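As a minimal sketch of the file formats named in Section III, the Python fragment below guesses whether a profile file is SQLite, JSON or SNSS from its leading bytes. The 16-byte SQLite header string is documented by SQLite; the "SNSS" signature for session files and the helper names `classify_header` and `survey_profile` are assumptions made for illustration, not part of the paper's method.

```python
from pathlib import Path

# Documented 16-byte header string of every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"
# Assumption: Chrome session/tab files begin with the ASCII signature "SNSS".
SNSS_MAGIC = b"SNSS"


def classify_header(head: bytes) -> str:
    """Guess the storage format from the first bytes of a browser file."""
    if head.startswith(SQLITE_MAGIC):
        return "sqlite"
    if head.startswith(SNSS_MAGIC):
        return "snss"
    # JSON documents such as Bookmarks start with '{' (or '[') after whitespace.
    if head.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "unknown"


def survey_profile(profile_dir: str) -> dict:
    """Map each regular file in a profile folder to its guessed format."""
    formats = {}
    for path in sorted(Path(profile_dir).iterdir()):
        if path.is_file():
            with path.open("rb") as handle:
                formats[path.name] = classify_header(handle.read(16))
    return formats
```

An examiner would run `survey_profile` against a working copy of the profile folder, never the original evidence, and treat the result only as a hint about which parser to apply to each file.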
TABLE I. CHROME BROWSER FILES
Sl. No.   Artefacts                  Location
1         History                    History.sqlite
2         Cookies                    Cookies.sqlite
3         Login data                 Login Data.sqlite
4         Network action predictor   NetworkActionPredictor.sqlite
5         Top Sites                  TopSites.sqlite
6         Bookmarks                  Bookmarks.json
7         Search Keywords            History.sqlite
8         Downloads                  History.sqlite
9         Cache                      Cache\

IV. CACHE
A. Cache Structures
Caching is the process of keeping temporary storage for frequently accessed content of visited websites, such as HTML, images and JavaScript. This temporary storage avoids re-downloading objects from previously viewed websites. The different files stored in the “Cache” folder in the Google Chrome profile path are shown in Fig. 1. The cache folder usually contains at least five files: one index file and four data files [6]. The index file contains addresses that point to a data file holding cached items. The data files that store the cached data are named data_0, data_1, data_2 and data_3. The data files are also called block files because the cached data is allocated in fixed-size blocks. The largest item stored in a data file is 16 KB; if the content is larger, Chrome saves it in an external file. The external file name starts with “f_”. Fig. 1 shows a few external files.

The index file has three parts: a header, least recently used (LRU) data and a hash table. The header size is 256 bytes and the size of the LRU data is 112 bytes. The size of the hash table is given by the ‘table size’ member variable in the header. The LRU data is used for eviction [6], which lets the browser delete old entries when the space available for storing the cache fills up. The hash table is a bucket of cache addresses that point to the actual cached data inside the data files. A sample index file is shown in Fig. 2.

Fig. 2. Index File

The data files, or block files, store the information as blocks of a predefined size. If the cached information is smaller than 16 KB, it is saved to one of the data files according to the block size. The default data sizes of the files are shown in Table III. If the size is greater than 16 KB, the whole data to be cached is stored as a separate file named f_######, where ###### is a hexadecimal number that identifies the file, such as f_000001, f_000002, etc., as shown in Fig. 1.

TABLE III. DATA BLOCK AND SIZE
File     Block size   Maximum data size
data_0   36 B         -
data_1   256 B        1 KB
data_2   1 KB         4 KB
data_3   4 KB         16 KB
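The sizing rule of Table III can be illustrated with a short Python sketch. It is not Chrome's own allocation code: the function name `choose_storage` is hypothetical, data_0 is left out because Table III lists no data size for it, and treating a size exactly equal to a file's maximum as still fitting that file is an assumption.

```python
# (file name, block size in bytes, maximum data size in bytes) from Table III.
# data_0 is omitted because Table III gives no maximum data size for it.
BLOCK_FILES = [
    ("data_1", 256, 1024),
    ("data_2", 1024, 4096),
    ("data_3", 4096, 16384),
]


def choose_storage(size: int):
    """Return (store, blocks) for a cached item of `size` bytes.

    Assumption: a size equal to a file's maximum still goes to that file.
    Items larger than 16 KB go to an external f_###### file (blocks == 0).
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    for name, block_size, max_size in BLOCK_FILES:
        if size <= max_size:
            # Ceiling division: number of fixed-size blocks needed.
            blocks = max(1, -(-size // block_size))
            return name, blocks
    return "external", 0
```

For instance, `choose_storage(3122)` selects data_2 with four 1024-byte blocks, and `choose_storage(500)` selects data_1 with two 256-byte blocks.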
2017 IEEE International Conference on Computational Intelligence and Computing Research
The index file's hash table keeps a bucket of cache addresses in little-endian format. The number of entries in the bucket is at least 65536 (0x10000), but the actual number of entries is controlled by the ‘table size’ member in the header. Fig. 2 shows the table size and cache address fields in an index file. The cache address in the hash table is a 32-bit value that gives the location of the cached data, such as the URL of the cached item and its creation time. Table IV shows sample cache address values and their descriptions.

TABLE IV. EXAMPLES OF CACHE ADDRESS
Cache address   Description
0x00000000      Not initialized
0xA0015E27      BLOCK_256, data_1, block offset 0x5e4700
0x800061e1      External file f_0061e1

The 32-bit cache address has to be parsed to find the original location of the cached item. Parsing starts by converting the 32-bit value to a binary number. The MSB of this binary value is a flag and the following 3 bits determine the file type. The file type specifies where the cached data is stored: in a data file or in an external file. The file types and the corresponding cache files are shown in Table V. If the file type is zero, the data is stored in an external file and its filename is derived from the lower 28 bits of the cache address. An example of finding the filename from its cache address is shown in Table VI. Here the file number is “0000 0000 0000 0110 0001 1110 0001” (0x61E1), so the name of the file is ‘f_0061e1’.

TABLE V. FILE TYPE OF CACHED DATA
Binary   File type   Cache file   Block size
000      0           f_######     >16 KB
001      1           data_0       36 bytes
010      2           data_1       256 bytes
011      3           data_2       1024 bytes
100      4           data_3       4096 bytes

TABLE VI. EXTERNAL FILE f_0061e1
Initialized flag   File type   File number
1                  000         0000 0000 0000 0110 0001 1110 0001

If the file type is non-zero, the data is found in one of the four data files. In such cases, the lower 28 bits are parsed further to obtain the file number of the data file and the block number. This parsing is shown in Table VII. In the example given in the table, the block number is ‘0101 1110 0010 0111’, or 0x5E27. The block offset is calculated using (1), with the block size obtained from the file type value as shown in Table VIII: 8192 + 0x5E27 × 256 = 0x5E4700. Thus the cached item is located in the data_1 file at offset 0x5e4700.

TABLE VII.
Initialized flag   File type   Reserved   Block count   File number   Block number
1                  010         00         00            0000 0001     0101 1110 0010 0111

Block Offset = 8192 + (Block Number × Block Size)    (1)

TABLE VIII. BLOCK SIZE
File type   Block size (bytes)
Rankings    36
Block_256   256
Block_1k    1024
Block_4k    4096

B. Block Structure
Data or block files have two parts: a file header and an array of cached data blocks [7]. The file header is 8 KB long and is shown in Fig. 3. The data files store cached data as data blocks after the file header. The data block size in data_0, data_1, data_2 and data_3 is 36, 256, 1024 and 4096 bytes respectively. For example, to store 3122 bytes of data the browser needs four blocks of 1024 bytes, so it writes the content to the data_2 file. Likewise, to store 500 bytes it needs two blocks of 256 bytes, so it writes the content to the data_1 file.

C. Cache Entry
Each cache address in the index file points to a block in a data file. This block contains the name, creation time and length of the browsed URL. It also stores the addresses and sizes of the HTTP response and content. This block of data is called a cache entry. The HTTP response from the server is a combination of the HTTP content and its metadata. The metadata includes content type, cache-control, server name, server response, etc. The first four bytes of the cache entry are a hash of the URL and the following four bytes are the next cache address. A cache entry in a data file is shown in Fig. 4. If the ‘next cache address’ value is non-zero, this address should be parsed in turn to get another cache entry. The parsing process is the same as that for cache addresses in the index file, described in Section IV.A. This is repeated until the ‘next cache address’ becomes zero. The byte order of the data file is little-endian and the creation time is in microseconds since January 1, 1601 00:00:00 UTC.

The ‘key data size’ field in the cache entry structure represents the length of the requested URL. If the key is longer than the block of the data file can hold, the browser saves the URL in a separate location, which is recorded in the ‘long key data cache address’ field. The value of this field is zero if the URL is short; the URL is then stored in the same block. The response to the requested URL and its content are saved in data files or external files according to their size. The cache entry keeps the locations of the response and the content in a separate array of cache addresses called the data stream array. The ‘Array of Data stream cache’ field holds the cache addresses of the HTTP response and the content, and their sizes are given by the ‘Array of Data stream size’ field, as shown in Fig. 4. Table IX shows the structure of the cache entry.
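The cache address parsing of Section IV.A, which Section IV.C reuses for the ‘next cache address’ field, can be sketched in Python as follows. The bit widths follow Tables V to VII and the offset follows (1); the class and function names are hypothetical, and the two ‘block count’ bits are returned raw because their interpretation is not given in the text.

```python
from dataclasses import dataclass
from typing import Optional

HEADER_SIZE = 8192  # 8 KB block-file header, the constant in (1)
BLOCK_SIZES = {1: 36, 2: 256, 3: 1024, 4: 4096}  # file type -> bytes (Table VIII)


@dataclass
class CacheAddress:
    initialized: bool
    file_type: int
    file_name: Optional[str] = None
    block_count_bits: Optional[int] = None  # raw 2-bit field, left uninterpreted
    block_number: Optional[int] = None
    block_offset: Optional[int] = None


def parse_cache_address(value: int) -> CacheAddress:
    """Split a 32-bit cache address into the fields of Tables VI and VII."""
    initialized = bool((value >> 31) & 0x1)  # MSB: initialized flag
    file_type = (value >> 28) & 0x7          # next 3 bits: file type
    if not initialized:
        return CacheAddress(False, file_type)
    if file_type == 0:
        # External file: the lower 28 bits are the file number.
        file_number = value & 0x0FFFFFFF
        return CacheAddress(True, 0, f"f_{file_number:06x}")
    if file_type not in BLOCK_SIZES:
        raise ValueError(f"unsupported file type {file_type}")
    block_count_bits = (value >> 24) & 0x3   # bits after the 2 reserved bits
    file_number = (value >> 16) & 0xFF       # 8-bit data file number
    block_number = value & 0xFFFF            # 16-bit block number
    offset = HEADER_SIZE + block_number * BLOCK_SIZES[file_type]  # equation (1)
    return CacheAddress(True, file_type, f"data_{file_number}",
                        block_count_bits, block_number, offset)
```

Applied to the Table IV samples, 0xA0015E27 resolves to data_1 at offset 0x5E4700 and 0x800061e1 resolves to the external file f_0061e1.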
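Two details of Section IV.C lend themselves to a short Python sketch: following the ‘next cache address’ chain, and converting the creation time to a calendar date. Only the first eight bytes of the entry (URL hash, next cache address) are read, as stated in the text; the offsets of the remaining fields are not assumed here. The `load_block` callback, which should return the raw bytes of the block a cache address points to, is a hypothetical helper supplied by the caller.

```python
import struct
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterator, Tuple

# Epoch of the cache timestamps: January 1, 1601 00:00:00 UTC.
EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chrome_time_to_datetime(microseconds: int) -> datetime:
    """Convert microseconds since 1601-01-01 UTC to an aware datetime."""
    return EPOCH_1601 + timedelta(microseconds=microseconds)


def read_entry_prefix(block: bytes) -> Tuple[int, int]:
    """Return (url_hash, next_cache_address) from the first 8 bytes,
    both stored as little-endian unsigned 32-bit integers."""
    return struct.unpack_from("<II", block, 0)


def walk_chain(first_address: int,
               load_block: Callable[[int], bytes]) -> Iterator[Tuple[int, int]]:
    """Yield (cache_address, url_hash) for each entry in a chain,
    stopping when the next cache address is zero."""
    seen = set()  # guards against loops in a corrupted cache
    address = first_address
    while address != 0 and address not in seen:
        seen.add(address)
        url_hash, next_address = read_entry_prefix(load_block(address))
        yield address, url_hash
        address = next_address
```

In practice `load_block` would combine the address parsing of Section IV.A with a read from the matching data file; it is kept abstract here so that the chain logic stays independent of file access.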
This provides useful information in a cybercrime investigation. In addition, recently visited websites can be obtained through a time analysis. From the content of the JavaScript files obtained, the presence of malware can be identified through malware analysis. The pictures recovered from the cache file provide direct evidence in the cybercrime investigation. Advanced analysis of the cached content obtained in this way may provide more forensically sound evidence. The methodology described in this paper can be used for analyzing cache files in versions 1.0 and 2.0. Further research should be done to analyze the cache files in version 2.1 onwards.

TABLE X. CACHE CONTENTS
Content type                       Exist/Not exist
Logos                              Yes
Rotating and non-rotating images   Yes
Style sheets                       Yes
JavaScript files                   Yes
Media files                        Yes
Downloadable contents              Yes
HTML pages                         Yes
Banking information                No
User passwords                     No

VI. CONCLUSION
Cache file analysis done as described in this paper provides crucial information in a cybercrime investigation. In addition to analyzing the objects found in this analysis, the URLs obtained can be loaded online for further investigation. Advanced analysis of JavaScript and other objects obtained in this way provides crucial evidence in proving different types of cybercrimes, including malware-related crimes. This evidence may be crucial in supporting the investigation of various types of cybercrimes, including crimes related to the use of social media, banking transactions and other financial services. Thus this type of cache file analysis is important in cyber forensics.

REFERENCES
[1] S. Dija, T. R. Deepthi, C. Balan, and K. L. Thomas, "Towards retrieving live forensic artifacts in offline forensics," International Conference on Security in Computer Networks and Distributed Systems, SNDS 2012: Recent Trends in Computer Networks and Distributed Systems Security, pp. 225-233.
[2] Asaf Varol and Yesim Ülgen Sönmez, "The Importance of Web Activities for Computer Forensics," International Conference on Computer Science and Engineering (UBMK), pp. 66-71, 2017.
[3] G. Patel, "Anti-forensics techniques for browsing artifacts," 2014.
[4] D. Rathod, "Web Browser Forensics: Google Chrome," International Journal of Advanced Research in Computer Science, vol. 8, no. 7, pp. 896-899, Jul/Aug 2017.
[5] E. Schaap and I. Hoogendoorn, "Reconstructing web pages from browser cache," 2013.
[6] The Chromium Projects, "Disk Cache," https://www.chromium.org/developers/design-documents/network-stack/disk-cache.
[7] Web Browser Forensics-Part2, http://forensicinsight.org/wp-content/uploads/2012/03/INSIGHT_Web-Browser-Forensics-Part-II.pdf.
[8] MDN web docs, "HTTP headers," https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers.
[9] Wikipedia, "List of HTTP header fields," https://en.wikipedia.org/wiki/List_of_HTTP_header_fields.
[10] mkyong, "How to view HTTP headers in Google Chrome?," https://www.mkyong.com/computer-tips/how-to-view-http-headers-in-google-chrome/, January 22, 2016.