Lec2 BooleanRetrieval 1

Uploaded by

Ishuraj chaudhary


Information Retrieval (CSD510)

Boolean Retrieval

Ayan Das
Classic IR models

Boolean model
Vector Space model
Probabilistic model

Boolean Retrieval (Ayan Das) Information Retrieval (CSD510) 2 / 60

Basic concepts (Terminology)

1 k_i is an index term.
2 d_j is a document.
3 t is the total number of index terms.
4 K = {k_1, k_2, ..., k_t} is the set of all index terms.
5 w_ij is the weight associated with (k_i, d_j); w_ij = 0 indicates the absence of k_i in d_j.
6 vec(d_j) = (w_1j, w_2j, ..., w_tj) is the weight vector holding the weights
associated with the index terms in d_j.
7 g_i(vec(d_j)) is a function returning the weight associated with (k_i, d_j).

Boolean model
Simple model based on set theory and Boolean algebra.
Documents are sets of terms.
Queries are Boolean expressions on terms.
Terms are either present or absent: w_ij ∈ {0, 1}.
Three connectives are used:
AND (∧): the intersection of two sets
OR (∨): the union of two sets
NOT (¬): set complement (AND NOT gives set difference)
Document: The set of words (index terms) present in a document;
each term is either present (1) or absent (0).
Query: A Boolean expression whose terms are index terms.
Operation: Boolean algebra over sets of terms and sets of
documents.
Relevant: A document is relevant to a query expression if it satisfies
the query expression.

Boolean Retrieval

Term-Document Matrix
Inverted Index


Example: Boolean retrieval

Document set: All plays of Shakespeare.

Query: BRUTUS AND CAESAR AND NOT CALPURNIA
Task: Find all Shakespeare’s plays that satisfy the query

A possible solution
A linear scan of documents (BRUTE FORCE).
1 grep for all plays containing the words BRUTUS and CAESAR.
2 From them, strip out all the plays containing the word CALPURNIA.
Cons
1 Slow for large data collection (e.g., the web, which contains billions or
trillions of words)
A better solution: Organize and index the documents in advance, in a
representation that enables more efficient search.

Term-Document Incidence Matrix

Two dimensional: Terms and documents

Matrix element (t, d) = 1 if term t appears in document d

Brutus AND Caesar AND NOT Calpurnia


Term-Document Incidence Matrix

Brutus 110100
Caesar 110111
Calpurnia 010000
Brutus AND Caesar 110100
NOT Calpurnia 101111
Brutus AND Caesar AND (NOT Calpurnia) 100100
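
The bitwise evaluation above can be reproduced with a minimal Python sketch. The three vectors are copied from the slide; the helper name `bits` and the six-bit mask are assumptions made for illustration (the mask keeps NOT from producing bits beyond the six documents).

```python
# Incidence vectors from the slide: one bit per play, leftmost bit = first play.
N_DOCS = 6
incidence = {
    "brutus": 0b110100,
    "caesar": 0b110111,
    "calpurnia": 0b010000,
}

# Mask 111111: Python integers are unbounded, so NOT must be clipped to N_DOCS bits.
ALL_DOCS = (1 << N_DOCS) - 1


def bits(vector: int) -> str:
    """Render a vector as a fixed-width bit string."""
    return format(vector, f"0{N_DOCS}b")


not_calpurnia = ~incidence["calpurnia"] & ALL_DOCS
answer = incidence["brutus"] & incidence["caesar"] & not_calpurnia

print(bits(not_calpurnia))  # 101111
print(bits(answer))         # 100100
```

The two 1 bits in the answer mark the plays that satisfy the query.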
Retrieval result

The incidence matrices are usually sparse.

Difficult to build for a very large document corpus.

Bigger collections

Consider N = 1 million documents, each with about 1000 words.

At an average of 6 bytes/word (including spaces and punctuation),
that is 6 GB of data in the documents.
Say there are M = 500K distinct terms among these.

Can’t build the matrix

A 500K x 1M matrix has half a trillion 0’s and 1’s.

But it has no more than one billion 1’s (1M documents x 1000 words each),
so the matrix is extremely sparse.
What’s a better representation?
Solution: record only the 1 positions, i.e., for each term, only the documents in which it appears.
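
The sizes quoted on these two slides follow from simple arithmetic. A small sketch, using only the figures given above (the variable names are illustrative):

```python
n_docs = 1_000_000       # N
words_per_doc = 1_000
bytes_per_word = 6
n_terms = 500_000        # M

corpus_bytes = n_docs * words_per_doc * bytes_per_word
matrix_cells = n_terms * n_docs
# Each word occurrence can set at most one cell to 1.
max_ones = n_docs * words_per_doc
density = max_ones / matrix_cells

print(corpus_bytes)  # 6000000000   (6 GB)
print(matrix_cells)  # 500000000000 (half a trillion)
print(max_ones)      # 1000000000   (one billion)
print(density)       # at most 0.2% of the cells are 1
```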


Inverted Index

[Figure: inverted index, with labels Dictionary, Postings list, Posting]

Building inverted index

Preprocessing
1 Collect documents to be indexed
2 Tokenize the text, turning each document into a list of tokens
3 Identify the index terms to form the vocabulary
4 Do linguistic pre-processing, producing a list of normalized tokens,
which are the indexing terms

Inverted index construction

1 Identify each document by a unique identifier (docID).
2 For each term t in the vocabulary
prepare a list of documents in which the term appears.
sort the list on the docIDs.
3 Postings lists can be implemented using either singly linked lists or
variable-length arrays

Inverted index
Consider the following documents
Doc 1: Breakthrough vaccine for Covid
Doc 2: New Covid vaccine
Doc 3: A new approach to vaccination against Covid
Doc 4: New hopes for Covid patients
Tokens: Breakthrough, vaccine, for, Covid, New, A, new, approach,
to, vaccination, against, hopes, patients
Case normalization: breakthrough, vaccine, for, covid, new, a,
approach, to, vaccination, against, hopes, patients
Stopword removal (a, for, to removed): breakthrough, vaccine, covid, new,
approach, vaccination, against, hopes, patients
Stemming (vaccine and vaccination both become vaccin): breakthrough,
vaccin, covid, new, approach, against, hope, patient
Index terms: breakthrough, vaccin, covid, new, approach, against,
hope, patient

Inverted index
(term, docID) pairs

Sort by docID          Sort by terms
term          docID    term          docID
breakthrough  1        against       3
vaccin        1        approach      3
covid         1        breakthrough  1
new           2        covid         1
covid         2        covid         2
vaccin        2        covid         3
new           3        covid         4
approach      3        hope          4
vaccin        3        new           2
against       3        new           3
covid         3        new           4
new           4        patient       4
hope          4        vaccin        1
covid         4        vaccin        2
patient       4        vaccin        3

Building Inverted Index
Multiple term entries in a single document are merged.
Split into Dictionary and Postings
Document frequency information is added to dictionary entries.
The sorted (term, docID) pairs are grouped into one postings list per term:

term          postings (docIDs)
against       3
approach      3
breakthrough  1
covid         1 2 3 4
hope          4
new           2 3 4
patient       4
vaccin        1 2 3
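
The construction steps can be sketched end to end in Python for the four Covid documents. This is a minimal illustration, not a production indexer: the stop list is the one from the example, and `TOY_STEMS` is a hypothetical lookup table standing in for a real stemmer, covering only the words this example needs.

```python
import re
from collections import defaultdict

DOCS = {
    1: "Breakthrough vaccine for Covid",
    2: "New Covid vaccine",
    3: "A new approach to vaccination against Covid",
    4: "New hopes for Covid patients",
}
STOP_WORDS = {"a", "for", "to"}
# Assumption: a tiny lookup table in place of a real stemming algorithm.
TOY_STEMS = {"vaccine": "vaccin", "vaccination": "vaccin",
             "hopes": "hope", "patients": "patient"}


def index_terms(text):
    """Tokenize, case-fold, drop stop words, then stem."""
    tokens = re.findall(r"[A-Za-z]+", text)
    folded = [t.lower() for t in tokens]
    kept = [t for t in folded if t not in STOP_WORDS]
    return [TOY_STEMS.get(t, t) for t in kept]


def build_index(docs):
    """Map each term to its postings list of docIDs, sorted by docID."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text):
            postings[term].add(doc_id)  # a set merges repeated occurrences
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


index = build_index(DOCS)
doc_freq = {term: len(ids) for term, ids in index.items()}
for term, ids in index.items():
    print(term, doc_freq[term], ids)
```

The dictionary here is an ordinary Python `dict`; the document frequency of a term is simply the length of its postings list.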


Boolean Retrieval

Processing Boolean queries

Term vocabulary and postings lists


Practical considerations

For a practical IR system handling a huge corpus

Postings lists will be stored on disk.
Ideally, retrieve (from disk) only those postings lists that are needed
to answer a query.


Processing Boolean Queries

Consider the query: Brutus AND Calpurnia

1 Locate Brutus in the Dictionary
2 Retrieve its postings
3 Locate Calpurnia in the Dictionary
4 Retrieve its postings
5 Intersect the two postings lists


Intersecting two postings lists (a “merge” algorithm)
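
A minimal sketch of the merge, assuming each postings list is a Python list of docIDs sorted in increasing order. The docIDs in the usage lines are illustrative only.

```python
def intersect(p1, p2):
    """Intersect two postings lists sorted by docID in O(len(p1) + len(p2))."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])  # docID present in both lists
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1                # advance the pointer at the smaller docID
        else:
            j += 1
    return answer


brutus = [1, 2, 4, 11, 31, 45, 173, 174]   # illustrative docIDs
calpurnia = [2, 31, 54, 101]               # illustrative docIDs
print(intersect(brutus, calpurnia))        # [2, 31]
```

Each step advances at least one pointer, which is why the merge is linear in the combined length of the two lists and why both lists must be sorted by docID.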


Query processing
Query: Brutus AND Calpurnia AND Caesar
For each of the n terms, get its postings, then AND them together.


Query optimization

Process terms in order of increasing document frequency:

start with the smallest set, then keep cutting further.

Execute the query as (Calpurnia AND Brutus) AND Caesar.

If the list lengths are x and y, the merge takes O(x+y) operations.
Crucial: postings must be sorted by docID.
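
The ordering heuristic can be sketched as follows, using postings-list length as the document frequency. The function names and the docIDs are assumptions for illustration.

```python
def intersect(p1, p2):
    """Linear merge of two postings lists sorted by docID."""
    answer, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer


def intersect_many(postings_lists):
    """AND several terms together, shortest postings list first."""
    if not postings_lists:
        return []
    ordered = sorted(postings_lists, key=len)  # increasing document frequency
    result = ordered[0]
    for postings in ordered[1:]:
        if not result:        # nothing left to match, stop early
            break
        result = intersect(result, postings)
    return result


brutus = [1, 2, 4, 11, 31, 45, 173, 174]   # illustrative docIDs
caesar = [1, 2, 4, 5, 6, 16, 57, 132]      # illustrative docIDs
calpurnia = [2, 31, 54, 101]               # illustrative docIDs
print(intersect_many([brutus, caesar, calpurnia]))  # [2]
```

Starting from the shortest list keeps every intermediate result no longer than that list, so later merges have less to scan.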


Query processing

(wind OR fire) AND (thunder OR lightning)

Get doc. frequencies for all terms.
Estimate the size of each OR by the sum of its document frequencies.
Process in increasing order of OR sizes.


Query processing

Given the following postings list sizes:

Recommend a query processing order for the following two queries
1 (tangerine OR trees) AND (marmalade OR skies) AND (kaleidoscope
OR eyes)
2 (tangerine AND (NOT trees)) AND (NOT marmalade)

Term Posting size

eyes 213312
kaleidoscope 87009
marmalade 107913
skies 271658
tangerine 46653
trees 316812


Query processing

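Estimated size of each OR (sum of the document frequencies):
(tangerine OR trees): 363,465
(marmalade OR skies): 379,571
(kaleidoscope OR eyes): 300,321

Query 1, processed in increasing order of estimated size:
((kaleidoscope OR eyes) AND (tangerine OR trees)) AND (marmalade OR skies)

Query 2:
(tangerine AND (NOT trees)) AND (NOT marmalade)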
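
The estimates for the first query can be checked with a short sketch. The posting sizes are copied from the table; the variable names are illustrative.

```python
DOC_FREQ = {
    "eyes": 213312,
    "kaleidoscope": 87009,
    "marmalade": 107913,
    "skies": 271658,
    "tangerine": 46653,
    "trees": 316812,
}

or_groups = [
    ("tangerine", "trees"),
    ("marmalade", "skies"),
    ("kaleidoscope", "eyes"),
]

# Upper bound on the size of each OR: the sum of its document frequencies.
estimates = {group: sum(DOC_FREQ[t] for t in group) for group in or_groups}
order = sorted(or_groups, key=estimates.get)

for group in order:
    print(" OR ".join(group), estimates[group])
```

The sum is only an upper bound on the size of the union, since a document containing both terms is counted twice.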


Limitations of Boolean model

Retrieval is based on binary decision criteria, with no notion of partial
matching
No ranking of the documents is provided (absence of a grading scale)
The information need has to be translated into a Boolean expression,
which most users find awkward
Binary term weights are extremely limited in expressiveness
and cannot capture relations among contextual words.

Lecture outline

1 Term vocabulary
2 Skip pointers
3 Phrase queries
4 Dictionary structures

Term Vocabulary and Postings List

Pre-processing to form the Term vocabulary

Documents
Tokenization
Indexing
Postings
Faster merges: skip lists
Positional postings and phrase queries


Document interpretation

Obtaining the character sequence in a document.

Choosing document features
We need to deal with the format and language of each document.
What format is it in? PDF, Word, Excel, HTML, etc.
What language is the document in?

Document processing steps for vocabulary generation

Tokenization
Stop words
Normalization
Stemming and Lemmatization


Tokenization

Token: An instance of a sequence of characters in some particular
document that are grouped together as a semantic unit for processing.
Type: The class of all tokens containing the same character
sequence.
Term: A type that is included in the IR system’s dictionary.
Tokenization separates a document into smaller units,
called tokens, possibly discarding certain characters such as punctuation.
Example of tokenization
Input: “Friends, Romans, Countrymen”
Output: Friends, Romans, Countrymen
Each such token is now a candidate for an index entry, after further
processing.
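
A naive tokenizer for the example above can be sketched with a regular expression. The rule used here (keep maximal runs of word characters, discard everything else) is an assumption chosen for simplicity, not a recommended tokenizer.

```python
import re


def tokenize(text):
    """Split text into maximal runs of letters, digits, and underscores."""
    return re.findall(r"\w+", text)


print(tokenize("Friends, Romans, Countrymen"))  # ['Friends', 'Romans', 'Countrymen']
print(tokenize("aren't"))                       # ['aren', 't']
```

The second call shows why such a simple rule is not enough: apostrophes, hyphens, and similar characters need explicit decisions.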


Issues in Tokenization
What are the correct tokens to use?
Mr. O’Neill thinks that the boys’ stories about Chile’s capital aren’t
amusing.

Hyphens
Hewlett-Packard
Hewlett and Packard as two tokens?
state-of-the-art
co-education
lowercase, lower-case, lower case
White Space
San Francisco: one token or two?
red herring: one token or two?
Issues in Tokenization

Different character sequences

email addresses (jblack@mail.yahoo.com)
Web URLs (http://stuff.big.com/new/specials.html)
numeric IP addresses (142.32.48.231)
package tracking numbers (1Z9999W99845399981)
Often have embedded spaces
Older IR systems may not index numbers
But numbers are often very useful:
looking up error codes/stack traces on the web
Will often index “meta-data” separately: creation date (e.g., the date
of an email), format, etc.

Tokenization
Tokenization: language issues
French
L’ensemble one token or two?
L ? L’ ? Le ?
German noun compounds are not segmented
Lebensversicherungsgesellschaftsangestellter
‘life insurance company employee’
Chinese and Japanese have no spaces between words
莎拉波娃现在居住在美国东南部的佛罗里达。
(‘Sharapova now lives in Florida, in the southeastern United States.’)
A unique tokenization is not always guaranteed
Arabic (or Hebrew) is written right to left, but with certain items like
numbers written left to right
Use rule-based or machine-learning-based compound splitters or word
segmentation tools to split long compound words and to tokenize languages
where explicit separators are not used to indicate word boundaries.

Stop words

Common words that appear to be of little value in helping select
documents matching a user’s need.
With a stop list, the most common words are excluded from the
dictionary entirely.
They have little semantic content
the, a, and, to, be
To build a stop list, sort the terms by collection frequency and take the most
frequent.
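
Choosing stop-word candidates by collection frequency can be sketched as below. The function name `build_stop_list` and the toy documents are hypothetical; in practice the resulting candidates are usually reviewed by hand.

```python
from collections import Counter


def build_stop_list(tokenized_docs, size):
    """Return the `size` terms with the highest collection frequency."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    return [term for term, _ in counts.most_common(size)]


# Hypothetical, already tokenized and case-folded documents.
docs = [
    ["to", "be", "or", "not", "to", "be"],
    ["the", "king", "of", "denmark"],
    ["to", "the", "king"],
]
print(build_stop_list(docs, 1))  # ['to']
```

Collection frequency counts every occurrence of a term across the collection, not the number of documents that contain it.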


Issues in removing stop words

Some special query types are disproportionately affected.

Phrase queries:
“King of Denmark”
“President of the United States”, President AND “United States”
Various song titles, etc.:
“Let it be”, “To be or not to be”
“Relational” queries:
“flights to London”: if to is removed, the query matches both “flights to London”
and “flights from London”
The trend has been from standard use of quite large stop lists (200–300 terms)
to very small stop lists (7–12 terms)

Token normalization

Token normalization is the process of canonicalizing tokens so that
matches occur despite superficial differences in the character
sequences of the tokens
e.g., match U.S.A. and USA
A term is a (normalized) word type, which is an entry in the IR
system dictionary
The most common approach is to implicitly create equivalence classes,
which are normally named after one member of the set.
deleting periods to form a term
U.S.A., USA → USA
deleting hyphens to form a term
anti-discriminatory, antidiscriminatory → antidiscriminatory

Token normalization
Alternatives to creating equivalence classes are
to maintain relations between unnormalized tokens.
to do asymmetric expansion.
Example: Microsoft Windows, Rear Window, glass window
Query term    Terms searched
window        window, windows
windows       Windows, windows, window
Windows       Windows

Maintain relations between unnormalized tokens

1 Index unnormalized tokens.
2 Maintain a query expansion list of multiple vocabulary entries to
consider for a certain query term.
3 A query term is then effectively a disjunction of several postings lists.

Asymmetric expansion
Perform the expansion during index construction, e.g., when the document
contains automobile, index it under car as well (and vice versa).
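
The query expansion list described above can be sketched as a lookup table in front of the index of unnormalized tokens. The expansion table follows the window/windows/Windows example; the postings lists are hypothetical.

```python
# Expansion table from the window / windows / Windows example.
EXPANSION = {
    "window": ["window", "windows"],
    "windows": ["Windows", "windows", "window"],
    "Windows": ["Windows"],
}

# Hypothetical postings for the unnormalized tokens.
POSTINGS = {
    "window": [2, 5],
    "windows": [3, 5],
    "Windows": [1, 3],
}


def search(term):
    """Answer a one-term query as the union (OR) of its expansions' postings."""
    docs = set()
    for variant in EXPANSION.get(term, [term]):
        docs.update(POSTINGS.get(variant, []))
    return sorted(docs)


print(search("window"))   # [2, 3, 5]
print(search("windows"))  # [1, 2, 3, 5]
print(search("Windows"))  # [1, 3]
```

The expansion is asymmetric: the capitalized query `Windows` matches only itself, while the lowercase forms expand to more variants.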
Token normalization

Accents and diacritics: naïve vs. naive; peña (a cliff) vs. pena (sorrow).

Case folding and truecasing
Case folding: reduce all letters to lower case
Truecasing: the simplest heuristic is to convert to lowercase
words at the beginning of a sentence
all words in text that is all uppercase or in which most or all words are
capitalized
exception: words capitalized in mid-sentence are left capitalized


Text normalization

Handling synonyms and homonyms

e.g., by hand-constructed equivalence classes
car = automobile; color = colour
We can rewrite to form equivalence-class terms
When the document contains automobile, index it under
car-automobile (and vice-versa)
Or we can expand a query
When the query contains automobile, look for car as well
Spelling mistakes
One approach is Soundex, which forms equivalence classes of words
based on phonetic heuristics


Stemming and Lemmatization

To reduce inflectional forms and sometimes derivationally related
forms of a word to a common base form.
Example:
am, is, are → be
car, cars, car’s, cars’ → car
the boy’s cars are different colors → the boy car be different color


Stemming and Lemmatization

Stemming refers to a crude heuristic process that chops off the ends
of words and removes the derivational affixes.
It commonly collapses derivationally related words
Lemmatization refers to doing things properly with the use of a
vocabulary and morphological analysis of words, normally aiming to
remove inflectional endings only and to return the base or dictionary
form of a word, which is known as the lemma.
It only collapses the different inflectional forms into the corresponding
root forms.


Stemming

Reduce terms to their common basic form before indexing.

“Stemming” suggests crude affix chopping
language dependent
e.g., automate(s), automatic, automation all reduced to automat


Porter’s Stemmer
The most common algorithm for stemming English.
Results suggest it’s at least as good as other stemming options
Algorithm
1 5 phases of reductions
2 phases applied sequentially
3 each phase has various conventions to select rules
4 sample convention: Of the rules in a compound command, select the
one that applies to the longest suffix.
Phase 1
Rule         Example
SSES → SS    caresses → caress
IES → I      ponies → poni
SS → SS      caress → caress
S →          cats → cat
Phase 2
Loosely checks the number of syllables to decide whether a suffix is really
a suffix or part of the stem of the word.
e.g., replacement → replac, but NOT cement → c
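
The four Phase 1 rules, together with the longest-suffix convention, can be sketched as follows. This covers only the rules listed on the slide, not the full Porter algorithm; the function name `phase_1` is an assumption.

```python
# (suffix, replacement) pairs from the slide, in lower case.
RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]


def phase_1(word):
    """Apply the rule whose suffix is the longest one matching `word`."""
    matching = [(suffix, repl) for suffix, repl in RULES if word.endswith(suffix)]
    if not matching:
        return word
    suffix, repl = max(matching, key=lambda rule: len(rule[0]))
    return word[: len(word) - len(suffix)] + repl


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", phase_1(w))
# caresses -> caress
# ponies -> poni
# caress -> caress
# cats -> cat
```

The SS → SS rule looks like a no-op, but under the longest-suffix convention it stops the S → rule from turning caress into cares.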
Lemmatizer

Tool from Natural Language Processing

Does full morphological analysis to accurately identify the lemma for
each word.
Full morphological analysis
is usually a more time-consuming and elaborate process.
produces at most very modest benefits for retrieval.

Faster postings list access

If the lengths of the postings lists are m and n, then the intersection operation
takes O(m + n) time.
The speed of intersection may be increased by using skip pointers
Skip pointers are shortcuts to bypass parts of postings lists that will
not appear in the search result

Skip pointers

Points to consider
Where to place the skip pointers?
How to do efficient merging using skip pointers?


Skip pointers


Where to place the skip pointers?

More skips → shorter skip spans

1 more likely to skip
2 increased number of skip comparison operations.
3 more successful skips
Fewer skips → longer skip spans
1 fewer pointer comparisons
2 fewer successful skips
Simple heuristic: for postings of length P, use √P evenly-spaced
skip pointers.
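
One way to sketch intersection with skip pointers is to store postings in arrays and treat every index that is a multiple of the skip span as carrying a pointer to the index one span ahead. The span of about √P follows the heuristic above; the array layout and the helper names are assumptions.

```python
import math


def skip_span(postings):
    """Spacing that yields roughly sqrt(P) evenly spaced skip pointers."""
    return max(1, math.isqrt(len(postings)))


def advance(postings, pos, span, target):
    """Move past postings[pos], which is smaller than target.

    Skip pointers sit at indices that are multiples of span and are
    followed only while the skipped-to docID does not exceed target.
    """
    if pos % span == 0:
        jumped = False
        while pos + span < len(postings) and postings[pos + span] <= target:
            pos += span
            jumped = True
        if jumped:
            return pos
    return pos + 1


def intersect_with_skips(p1, p2):
    s1, s2 = skip_span(p1), skip_span(p2)
    answer, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i = advance(p1, i, s1, p2[j])
        else:
            j = advance(p2, j, s2, p1[i])
    return answer


p1 = [2, 4, 8, 16, 19, 23, 28, 43]        # illustrative docIDs
p2 = [1, 2, 3, 5, 8, 41, 51, 60, 71]      # illustrative docIDs
print(intersect_with_skips(p1, p2))       # [2, 8]
```

A skip is taken only when the docID it lands on is still no larger than the docID being sought, so no match can be jumped over.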


Phrase queries

Consider the query “Stanford University” as a phrase

If it is processed as stanford AND university, the following documents are false positives
1 “I went to university at Stanford”
2 “The inventor Stanford Ovshinsky never went to university”
Postings lists that record only which documents contain each individual term
are not sufficient to handle such queries.
Approaches for phrase queries
1 Biword Indexes
2 Positional Indexes

Biword Indexes

Index every consecutive pair of terms in the text as a phrase

Text: “Friends, Romans, Countrymen”
Pairs of consecutive words indexed as dictionary terms
friends romans
romans countrymen
For longer phrase queries, consecutive word pairs are ANDed
1 Query: “Friends, Romans, Countrymen”
(friends romans) AND (romans countrymen)
2 Query: “stanford university palo alto”
(stanford university) AND (university palo) AND (palo alto)
Disadvantage: false positives. All the biwords may occur in a retrieved
document without appearing consecutively as the full phrase.
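
A biword index and the false-positive problem can be sketched as follows. The three documents are hypothetical and assumed to be already tokenized and normalized; document 2 is built to contain every biword of the query without containing the phrase.

```python
from collections import defaultdict


def biwords(terms):
    """Every pair of consecutive terms, joined into one dictionary entry."""
    return [f"{a} {b}" for a, b in zip(terms, terms[1:])]


def build_biword_index(docs):
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for biword in biwords(terms):
            index[biword].add(doc_id)
    return index


def phrase_candidates(index, query_terms):
    """AND together the postings of the query's biwords."""
    sets = [index.get(biword, set()) for biword in biwords(query_terms)]
    return sorted(set.intersection(*sets)) if sets else []


docs = {  # hypothetical documents
    1: ["stanford", "university", "palo", "alto"],
    2: ["palo", "alto", "stanford", "university", "palo"],
    3: ["stanford", "university"],
}
index = build_biword_index(docs)
query = ["stanford", "university", "palo", "alto"]
print(phrase_candidates(index, query))  # [1, 2]
```

Document 2 is returned even though the four-word phrase never occurs in it, which is the false positive described above.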


Extended biwords

Nouns and noun groups (N) are usually more significant in queries
than words with other parts of speech (X).
For any string of terms of the form N X* N, the word pair
formed by the two nouns is an extended biword,
which is indexed in the dictionary.

Example:
cost  overruns  on  a  power  plant
N     N         X   X  N      N

Extended biwords
1 cost overruns
2 overruns power
3 power plant
Query: (cost overruns) AND (overruns power) AND (power plant)

Positional indexes

Store in each posting the positions where the term appears in the
document.

Format:
<term, # docs containing term;
doc1: freq. of the term; pos1, pos2, ···;
doc2: freq. of the term; pos1, pos2, ···;
etc.>

Example:
to, 993427:
(1, 6: (7, 18, 33, 72, 86, 231);
2, 5: (1, 17, 74, 222, 255);
4, 5: (8, 16, 190, 429, 433);
5, 2: (363, 367);
7, 3: (13, 23, 191); ···)
be, 178239:
(1, 2: (17, 25);
4, 5: (17, 191, 291, 430, 434);
5, 3: (14, 19, 101); ···)

Proximity intersection

Query: to be or not to be
Start from the postings lists of the terms in increasing order of
document frequency.
Consider to and be
1 Find the documents containing both terms
2 Look for positions in the lists where be occurs at a token
position one greater than an occurrence of to
3 Look for another occurrence of both words at token positions 4 higher than
the first occurrence
to: <···; 4: <···, 429, 433>; ···>
be: <···; 4: <···, 430, 434>; ···>
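
Steps 1 to 3 can be sketched on the positional postings shown for to and be (only the listed documents, since the lists are truncated in the example). The dictionary layout `{docID: [positions]}` and the function name `adjacent` are assumptions; a full phrase match would also check or and not.

```python
# Positional postings from the example: docID -> sorted token positions.
TO = {1: [7, 18, 33, 72, 86, 231], 2: [1, 17, 74, 222, 255],
      4: [8, 16, 190, 429, 433], 5: [363, 367], 7: [13, 23, 191]}
BE = {1: [17, 25], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]}


def adjacent(first, second, offset=1):
    """Positions of `first` that are followed by `second` exactly `offset` later."""
    hits = {}
    for doc_id in sorted(first.keys() & second.keys()):  # step 1: shared docs
        later = set(second[doc_id])
        starts = [p for p in first[doc_id] if p + offset in later]  # step 2
        if starts:
            hits[doc_id] = starts
    return hits


to_be = adjacent(TO, BE)
print(to_be)  # {4: [16, 190, 429, 433]}

# Step 3: a second "to be" starting 4 positions after the first one.
repeated = {doc: [p for p in starts if p + 4 in starts]
            for doc, starts in to_be.items()}
print(repeated)  # {4: [429]}
```

Only document 4 has be directly after to, and only the occurrence at position 429 is followed by a second "to be" four tokens later (433, 434).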


Dictionaries

Dictionary data structures

Tolerant retrieval
1 Wild-card queries
2 Spelling correction
3 Phonetic correction
Develop techniques that are robust to typographical errors in the
query, as well as alternative spellings.


Search structures for dictionaries

The dictionary data structure stores the term vocabulary, document
frequency, and pointers to each postings list.
Explore the data structures for the dictionary.
[Figure: inverted index, with labels Dictionary, Postings list, Posting]

A simple dictionary
An array of structures

term        document frequency   pointer to postings list
a           656,265              →
aardvark    65                   →
···         ···                  ···
zulu        221                  →
char[20]    int                  Postings
20 bytes    4/8 bytes            4/8 bytes

Storage and retrieval are not efficient
Points to be considered:
1 Number of terms in the dictionary
2 Whether the keys remain static or change dynamically
3 The relative frequencies with which various keys will be accessed
Two choices
1 Hashtables
2 Trees

Hashing

Query terms (keys) are mapped to integers from a space big enough to
avoid collisions.
Collision resolution is done by auxiliary structures
O(1) search complexity

Cons
1 Minor variants may be mapped to distant integers (color/colour).
2 No prefix search (free/freely/freedom)
3 An expanding vocabulary may necessitate redesigning the hash function.

[Figure: query dictionary keys k1, k2, k3 pass through a hash function to positions p1, p2, p3 in a table of terms t1 ··· t1024, with collision resolution]

Binary trees
[Figure: binary tree over the dictionary; the root splits into a-m and n-z, which split into a-hu, hy-m and n-sh, si-z]
Search time is O(log M) if the tree is balanced
Allows prefix search
A tree is balanced if, at each node, the depths of the left and right
subtrees differ by at most 1.
Insertion and deletion unbalance a tree
Costly rebalancing step required to maintain balance
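
Prefix search works on any structure that keeps the terms in sorted order. The sketch below uses a sorted Python list with binary search as a stand-in for a balanced search tree (an assumption made to keep the example short); the vocabulary is hypothetical, and the upper bound assumes no term contains the character U+FFFF.

```python
import bisect


def prefix_search(sorted_terms, prefix):
    """All terms starting with `prefix`, found with two binary searches."""
    lo = bisect.bisect_left(sorted_terms, prefix)
    # Assumption: "\uffff" sorts after any character used in the vocabulary.
    hi = bisect.bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]


vocabulary = sorted(["zulu", "freedom", "aardvark", "free", "fresh", "freely"])
print(prefix_search(vocabulary, "free"))  # ['free', 'freedom', 'freely']
```

Terms sharing a prefix are contiguous in sorted order, which is what a hash table cannot offer: free, freely, and freedom hash to unrelated positions.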
B-tree

To mitigate the rebalancing problem, B-trees may be used

Each internal node of a B-tree has a variable number of children within a
fixed range.
Each branch under an internal node represents a test for a range of
character sequences.
