Custom Search Engine Project Presentation

The document describes a custom search engine project built using Elasticsearch. It allows users to gather data from web crawlers and analyze information collected from different sources. The search engine architecture involves users passing search queries to Elasticsearch, while crawlers concurrently populate Elasticsearch with data. Tools used include ReactJS, NodeJS, Material UI and Elasticsearch. LinkedIn and search client functionality are demonstrated.

Custom Search Engine
By
Mehakdeep Singh Chhina (202193857)

Supervisor:
Thumeera R. Wanasinghe

ENGI 981B -001


Introduction

● Powerful custom search engine based on Elasticsearch
● Lets users gather data using web crawlers
● A place to analyze data collected from different sources
● Can behave as a personal Google
Motivation

● Software to handle data spread across different places on the web
● Multiple use cases: collecting data in one place, competitor analysis
● Previous work experience
Architecture

● The user passes a search query to get results
● Elasticsearch acts as a repository
● Crawlers are responsible for populating Elasticsearch concurrently
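
The two paths above (crawlers writing, users querying) can be sketched as plain request bodies for Elasticsearch's REST API. This is only an illustration under stated assumptions: the index name `posts` and the fields `company`, `text`, and `url` are hypothetical, since the slides do not describe the index layout.

```javascript
// Hypothetical index name; the presentation does not name one.
const INDEX = "posts";

// Build the NDJSON body for Elasticsearch's _bulk endpoint:
// one action line followed by one document line per crawled post.
function buildBulkBody(docs) {
  const lines = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: INDEX } }));
    lines.push(JSON.stringify(doc));
  }
  // The bulk API requires the body to end with a newline.
  return lines.join("\n") + "\n";
}

// Build a _search request body for a user's free-text query.
// Assumes the post content lives in a field called "text".
function buildSearchBody(queryText) {
  return { query: { match: { text: queryText } } };
}

const bulk = buildBulkBody([
  { company: "Acme", text: "We are hiring", url: "https://example.com/1" },
  { company: "Acme", text: "New product launch", url: "https://example.com/2" },
]);
console.log(bulk);
console.log(JSON.stringify(buildSearchBody("hiring")));
```

A crawler would POST the first body to `/_bulk`, while the search client would POST the second to `/posts/_search`; because both are independent HTTP requests, indexing and searching can proceed at the same time.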
Architecture: Flow Chart
Tools

● ReactJS
● NodeJS
● Material UI
● Elasticsearch
LinkedIn Crawler

Enter the URL and the name of the company to see in the filters.
LinkedIn Crawler

Wait for the crawler to finish crawling the posts, then check the results on the search page.
Search Client

● Search the data you have crawled
● Use different keywords and filters for company-specific data
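
Combining keywords with a company filter maps naturally onto an Elasticsearch `bool` query. The sketch below is an assumption about how such a request might look, not the project's actual code: the field names `text` and `company` are hypothetical, and `company` is assumed to be mapped as a `keyword` field so that an exact `term` filter works.

```javascript
// Build a search body that scores posts by keyword relevance and,
// optionally, restricts results to a single company.
function buildCompanySearch(keywords, company) {
  const body = {
    query: {
      bool: {
        // "must" clauses contribute to the relevance score.
        must: [{ match: { text: keywords } }],
        // "filter" clauses only include or exclude; they do not affect scoring.
        filter: [],
      },
    },
  };
  if (company) {
    body.query.bool.filter.push({ term: { company: company } });
  }
  return body;
}

console.log(JSON.stringify(buildCompanySearch("sustainability report", "Acme"), null, 2));
```

Putting the company in `filter` rather than `must` keeps it out of the relevance score, so ranking depends only on how well the keywords match.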
Results

● Custom search product with multiple use cases for different organisations
● ESG, data analysis, customer service
● Web crawlers and developer APIs are used to crawl data from different platforms
● Child processes in Node are used so the software doesn’t get blocked while crawling
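
The child-process point can be illustrated with Node's built-in `child_process` module. This is a minimal sketch, not the project's implementation: the inline `CRAWLER_SCRIPT` is a hypothetical stand-in for a real crawler, and passing the URL through a `CRAWL_URL` environment variable is an assumption made for the example.

```javascript
const { spawn } = require("child_process");

// Stand-in for a real crawler script: it just echoes the URL it was given.
const CRAWLER_SCRIPT =
  "console.log(JSON.stringify({ url: process.env.CRAWL_URL, posts: [] }))";

// Run the crawl in a separate Node process so the parent's event loop
// stays free to serve search requests in the meantime.
function runCrawler(url) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", CRAWLER_SCRIPT], {
      env: { ...process.env, CRAWL_URL: url },
    });
    let output = "";
    child.stdout.on("data", (chunk) => {
      output += chunk;
    });
    child.on("error", reject);
    child.on("close", (code) => {
      if (code === 0) resolve(JSON.parse(output));
      else reject(new Error(`crawler exited with code ${code}`));
    });
  });
}

runCrawler("https://example.com/company").then((result) => {
  console.log("crawl finished:", result);
});
console.log("parent is free to handle other work while the crawl runs");
```

Because the crawl runs in its own process, a slow or CPU-heavy page scrape cannot stall the server that answers search queries.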
Challenges

● Making crawlers scroll the page and get the required posts
● Setting up Elasticsearch
● Showing images and videos related to posts in search results
Missing

● Additional web crawlers, especially ones that use developer APIs and OAuth
● Downloadable search interface that users can add to their own website
Future Work

● Machine learning algorithms can be added to Elasticsearch
● Generate tags and user-specific search results
● Search settings can be provided so users can improve the ranking of results, use stop words, etc.
● Open APIs can be provided using OAuth for other developers to use the search services.
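
As one possible shape for user-provided search settings, Elasticsearch lets an index define a custom analyzer with a `stop` token filter. The sketch below builds such an index definition from a user-supplied stop word list; the names `user_stop`, `user_text`, and the `text` field are hypothetical and chosen only for illustration.

```javascript
// Build index settings and mappings that apply a user-chosen stop word list
// to the (hypothetical) "text" field at index and search time.
function buildIndexDefinition(stopwords) {
  return {
    settings: {
      analysis: {
        filter: {
          // Token filter that drops the user's stop words.
          user_stop: { type: "stop", stopwords: stopwords },
        },
        analyzer: {
          user_text: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "user_stop"],
          },
        },
      },
    },
    mappings: {
      properties: {
        text: { type: "text", analyzer: "user_text" },
      },
    },
  };
}

console.log(JSON.stringify(buildIndexDefinition(["the", "and", "of"]), null, 2));
```

The resulting object would be sent as the body of an index-creation request; changing the stop word list later generally means reindexing, since analysis happens when documents are indexed.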
Thank You!
