Information Retrieval Using Indexing
Introduction
Information retrieval today means retrieving information from huge databases that may hold terabytes of data or more. To find one piece of information in such a database, linear searching is not an option: it is far too slow for real-time applications. One solution is indexing.
We will discuss how indexing helps us retrieve data from huge databases. As data grows day by day, the index itself becomes large, so we will also discuss a recent technique for compressing the index itself. Finally, we will look at how indexing is used in search engines such as Google, AltaVista, and Excite to retrieve information.
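The difference between linear searching and an index lookup can be sketched as follows. This is a minimal illustration, assuming a tiny in-memory collection; the sample documents and the function names (`build_index`, `linear_search`, `index_search`) are hypothetical and not part of the presentation.

```python
from collections import defaultdict


def build_index(documents: dict[int, str]) -> dict[str, list[int]]:
    """Map each keyword to the sorted list of document numbers containing it."""
    index = defaultdict(set)
    for doc_number, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_number)
    return {word: sorted(doc_numbers) for word, doc_numbers in index.items()}


def linear_search(documents: dict[int, str], keyword: str) -> list[int]:
    """Scan every document; the cost grows with the size of the collection."""
    keyword = keyword.lower()
    return sorted(
        doc_number
        for doc_number, text in documents.items()
        if keyword in text.lower().split()
    )


def index_search(index: dict[str, list[int]], keyword: str) -> list[int]:
    """Look the keyword up directly in the index."""
    return index.get(keyword.lower(), [])


# Hypothetical sample collection: document number -> text.
documents = {
    1: "indexing speeds up retrieval",
    2: "linear search scans every record",
    3: "search engines rely on indexing",
}
index = build_index(documents)

print(linear_search(documents, "indexing"))
print(index_search(index, "indexing"))
```

Both calls return the same document numbers. The linear scan has to touch every document on every query, while the index is built once and then answers each query with a single dictionary lookup.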
Index representation
Size of index
Bits per keyword entry
where N is the number of records in the collection.
Total size = total number of bits per keyword entry.
Each search keyword has its own collection of document numbers, so the size of the index grows. The solution is to use a compression technique.
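As a rough illustration of how the index grows, the sketch below assumes that each document number is stored in a fixed-width field just wide enough to distinguish N records. That width is an assumption made for this example and not necessarily the formula used in this presentation; the keywords, document numbers, and function names are made up.

```python
def bits_per_doc_number(n_records: int) -> int:
    """Smallest fixed width (in bits) that can tell n_records documents apart."""
    # (n - 1).bit_length() is the number of bits needed for the values 0 .. n - 1.
    return max(1, (n_records - 1).bit_length())


def index_size_bits(postings: dict[str, list[int]], n_records: int) -> int:
    """Total bits spent on document numbers across all keyword entries."""
    width = bits_per_doc_number(n_records)
    return sum(len(doc_numbers) * width for doc_numbers in postings.values())


# Hypothetical index: keyword -> document numbers that contain it.
postings = {"index": [1, 4, 7], "search": [2, 4], "engine": [4]}

print(bits_per_doc_number(1_000_000))
print(index_size_bits(postings, 1_000_000))
```

Under this assumption every extra document number in any keyword entry costs the full field width, which is why an uncompressed index grows quickly with the collection.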
Compression technique
Steps:
1. Divide the document numbers into two parts:
   1) document numbers which are not repetitive, e.g. 24567
   2) document numbers which are repetitive, e.g. 222223
2. Use the compression technique on the repetitive numbers only.
3. First reduce the repetitive document number, e.g. 222222331 into 2B331.
4. Represent this document number in binary for storage according to table 6, e.g. 2B331:
   binary representation without table: 1101001111101101011111111011
   binary representation with table: 10 1011 0011 0011 0001
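The example in the steps above can be checked with a short sketch. Table 6 is not reproduced here, so the code assumes that each symbol of the reduced number is stored as a 4-bit code equal to its hexadecimal value, which agrees with the bit groups shown in step 4. The rule used in `is_repetitive` (a run of at least three identical adjacent digits) is also an assumption, chosen because the example leaves the pair "33" unreduced. All function names are hypothetical.

```python
from itertools import groupby


def longest_run(doc_number: int) -> int:
    """Length of the longest run of identical adjacent decimal digits."""
    return max(len(list(group)) for _, group in groupby(str(doc_number)))


def is_repetitive(doc_number: int, min_run: int = 3) -> bool:
    """Assumed partition rule for step 1: a run of min_run or more equal digits."""
    return longest_run(doc_number) >= min_run


def plain_binary(doc_number: int) -> str:
    """Ordinary binary form of the whole number (the 'without table' case)."""
    return format(doc_number, "b")


def table_binary(symbols: str) -> str:
    """Assumed table: one 4-bit code per symbol, read as a hexadecimal digit."""
    return " ".join(format(int(symbol, 16), "04b") for symbol in symbols)


print(is_repetitive(24567), is_repetitive(222223))

plain = plain_binary(222222331)
print(plain, len(plain))

print(table_binary("2B331"))
```

The plain binary form of 222222331 reproduces the 28-bit string given for the case without the table. The per-symbol form prints the first group as 0010, where the slide writes 10 with the leading zeros dropped, so the reduced number needs at most 20 bits instead of 28.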
Address table is divided into two compressible document numbers have different
Application of indexing
Search engines
Thank You