
Information Retrieval 1 Introduction To IR

This document provides an introduction to information retrieval, covering early developments in libraries and indexing, the key goal of retrieving relevant information while limiting non-relevant results, and how users search by translating information needs into queries or browse collections. It discusses how IR has grown in importance with the rise of the web and digital libraries, and the basic components of an IR system including indexing, querying, and ranking results by relevance.

Uploaded by

Vaibhav Khanna

Information Retrieval : 1

Introduction to IR

Prof Neeraj Bhargava


Vaibhav Khanna
Department of Computer Science
School of Engineering and Systems Sciences
Maharshi Dayanand Saraswati University Ajmer
Learning objectives of IR Series
• Introduction: Motivation, Basic concepts, past, present, and future, the retrieval process.
• Modeling: Introduction, A taxonomy of information retrieval models, retrieval: ad hoc and filtering, a formal
characterization of IR models, classic information retrieval, alternative set theoretic models, alternative
algebraic models, alternative probabilistic models, structured text retrieval models, models for browsing.
• Retrieval Evaluation: Introduction, retrieval performance evaluation, reference collections.
• Query Languages: Introduction, keyword-based querying, pattern matching, structural queries, query protocols.
• Query Operations: Introduction, user relevance feedback, automatic local analysis, automatic global analysis.
• Text and multimedia languages and Properties: Introduction, metadata, text, markup languages,
• Indexing and searching: Introduction; inverted files; other indices for text; Boolean queries; sequential
searching; pattern matching; structural queries; compression.
• Searching the Web: Introduction, challenges, characterizing the web, search engines, browsing, meta
searchers, finding the needle in the haystack, searching using hyperlinks.
Architecture of the IR System
Information Retrieval (IR)
• IR deals with the representation, storage, organization
of, and access to information items
• Types of information items: documents, Web pages,
online catalogs, structured records, multimedia objects
• Early goals of the IR area: indexing text and searching
for useful documents in a collection
• Nowadays, research in IR includes:
– Modeling, Web search, text classification, systems
architecture, user interfaces, data visualization, filtering and
languages.
Early Developments
• For more than 5,000 years, man has organized information for
later retrieval and searching
• This has been done by compiling, storing, organizing, and
indexing papyrus, hieroglyphics, and books
• For holding the various items, special purpose buildings called
libraries, or bibliothekes, are used
• The oldest known library was created in Ebla, in the Fertile
Crescent, between 3,000 and 2,500 BC
• Since the volume of information in libraries is always growing,
it is necessary to build specialized data structures for fast
search — the indexes
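As a minimal sketch of what such an index can look like, the Python below builds an inverted file (listed in the series outline under indexing and searching): each word maps to the set of documents that contain it, so a lookup avoids scanning every document. The function name build_index and the sample documents are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical three-document collection.
docs = {
    1: "papyrus scrolls in the library",
    2: "books and indexes in the library",
    3: "clay tablets",
}
index = build_index(docs)
print(sorted(index["library"]))  # [1, 2]
```

A real system would also normalize punctuation and word forms; this sketch only splits on whitespace.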
Libraries and Digital Libraries
• For centuries indexes have been created manually as sets of
categories, with labels associated with each category
• The advent of modern computers has allowed the construction of
large indexes automatically
• Libraries were among the first institutions to adopt IR systems for
retrieving information
• Initially, such systems consisted of an automation of existing
processes such as card catalog searching
• Increased search functionality was then added
• Ex: subject headings, keywords, query operators
• Nowadays, the focus has been on improved graphical interfaces,
electronic forms, hypertext features
IR at the Center of the Stage
• Until recently, IR was an area of interest restricted mainly to
librarians and information experts
• A single fact changed these perceptions—the introduction of
the Web, which has become the largest repository of
knowledge in human history
• Due to its enormous size, finding useful information on the
Web usually requires running a search
• And searching on the Web is all about IR and its technologies
• Thus, almost overnight, IR has gained a place with other
technologies at the center of the stage
The IR Problem
• The key goal of an IR system is to retrieve all
the items that are relevant to a user query,
while retrieving as few non-relevant items as
possible
• That is, the IR system must rank the
information items according to a degree of
relevance to the user query
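One way to picture ranking by degree of relevance is the toy Python sketch below. It assumes a deliberately simple relevance score, the number of distinct query words a document contains, which is an illustrative stand-in and not one of the IR models covered later in the series. The names score and rank and the sample documents are hypothetical.

```python
def score(query, text):
    """Count the distinct query words that occur in the text (a toy relevance score)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words)


def rank(query, docs):
    """Return ids of documents with a nonzero score, highest score first."""
    scored = [(score(query, text), doc_id) for doc_id, text in docs.items()]
    matching = [(s, d) for s, d in scored if s > 0]
    # Sort by descending score; ties are broken by document id.
    matching.sort(key=lambda pair: (-pair[0], pair[1]))
    return [d for _, d in matching]


# Hypothetical collection.
docs = {
    "d1": "formula 1 car racing results",
    "d2": "car repair manual",
    "d3": "history of libraries",
}
ranking = rank("car racing", docs)
print(ranking)  # ['d1', 'd2']
```

Here d1 matches both query words and is ranked first, d2 matches one, and d3 is left out as non-relevant.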
The User’s Task
• Consider a user who seeks information on a topic of their
interest
• This user first translates their information need into a query,
which requires specifying the words that compose the query
• In this case, we say that the user is searching or querying for
information of their interest
• Consider now a user who has an interest that is either poorly
defined or inherently broad
• For instance, the user has an interest in car racing and wants to
browse documents on Formula 1
• In this case, we say that the
user is browsing or navigating the documents of the collection
Information × Data Retrieval
• Data retrieval: the task of determining which
documents of a collection contain the keywords in the
user query
• Data retrieval system Ex: relational
databases
• Deals with data that has a well defined structure and
semantics
• A single erroneous object among a thousand retrieved
objects means total failure
• Data retrieval does not solve the problem of retrieving
information about a subject or topic
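The contrast can be sketched in Python under simple assumptions: data retrieval is modeled here as an exact test of whether a document contains every query keyword, with no ranking. The function name data_retrieval and the sample documents are hypothetical.

```python
def data_retrieval(keywords, docs):
    """Return ids of documents that contain every keyword (exact match, no ranking)."""
    wanted = {k.lower() for k in keywords}
    return [
        doc_id
        for doc_id, text in docs.items()
        if wanted <= set(text.lower().split())
    ]


# Hypothetical collection.
docs = {
    "d1": "formula 1 car racing results",
    "d2": "grand prix motorsport season review",
    "d3": "car repair manual",
}
exact = data_retrieval(["car", "racing"], docs)
print(exact)  # ['d1']
```

Document d2 is about the same topic but shares no keyword with the query, so exact matching never returns it. That gap is the one the last bullet describes.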
Assignments
• Briefly Discuss the Architecture of the IR
System
