Review OSINT Tool For Social Engineering
*CORRESPONDENCE
Martina Nobili
m.nobili@unicampus.it
RECEIVED 19 February 2023
ACCEPTED 31 July 2023
PUBLISHED 01 September 2023
CITATION
Nobili M (2023) Review OSINT tool for social engineering. Front. Big Data 6:1169636. doi: 10.3389/fdata.2023.1169636
COPYRIGHT
© 2023 Nobili. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

In recent years, we have observed an increase in cyber threats, especially social engineering attacks. By social engineering, we mean a set of techniques and tools to collect information about a person or target to extort sensitive information. Such information might be used for (industrial) espionage, to blackmail the user, or may represent the starting point to perform malicious cyber attacks against the individual or, more often, against the organization they work for. The human factor is often the most vulnerable element in the security of any system, and the mass of information we disseminate online largely facilitates social engineering activities. To prevent and mitigate social engineering attacks, Open Source INTelligence (OSINT) techniques and tools can be used to evaluate the level of exposure of an individual or an organization. OSINT is the collection of information through open sources, that is, sources not protected by copyright or privacy restrictions. The article reviews the main OSINT tools for countering and preventing social engineering attacks. Specifically, it presents the different tools, dividing them according to the specific information they allow one to track (e-mail, social profiles, phone numbers, etc.).

KEYWORDS
open source intelligence, OSINT, social engineering, cyber security, cyber threats
1. Introduction
In recent years, we have seen increased access to online resources and the development of more and more services on the network. This phenomenon, fostered by the COVID-19 pandemic, has increased both data production and the exposure of each of us on the network. Added to this is an intensifying use of social networks, often also related to work activities. All this has simplified our lives, making many activities in different contexts more flexible, avoiding a lot of physical travel, and allowing more cost-effective management of our time. At the same time, however, it exposes us to greater dangers and to risks of a different nature. The number of cyber-attacks is increasing rapidly. In 2021 alone, the average number of cyber-attacks and data breaches increased by 15.1% from the previous year (Forbes, 2023). The data stolen through these attacks is increasingly sensitive and puts the security of individuals and organizations at risk. Moreover, attack techniques have become more and more complex and structured. At the same time, several “user-friendly” tools, often free, are available on the internet (e.g., the SET tool in Kali Linux, 2023). As a result, an increasing number of cyber attacks target small enterprises and professional firms with fewer defensive capabilities (Kaspersky, 2023). The main strategies of these attacks are phishing and setting up fake websites to steal the data of users who try to use them. In particular, the most used attack vector is ransomware, which generally exploits social engineering strategies to deliver malware. Social engineering means collecting information about a person’s behavior to extort sensitive information and perform malicious actions against the individual or the organization for which they work. Such information might be used for (industrial) espionage, sold on the dark web, or used to blackmail the user. In addition, it might represent the starting point to perform malicious cyber actions
against the individual or, more often, against the organization for which he or she works. This stolen information can lead to the execution of a sophisticated cyber-attack.

Such an attack greatly impacts the company in terms of cost and reputation. On average, a data breach costs approximately 4.35 million dollars worldwide, a figure that rises to 9.44 million dollars in the United States. In particular, the economic impact is more severe in the healthcare sector, costing approximately 10 million dollars per attack (IBM, 2022). Moreover, these figures do not consider the impact of these types of attacks on citizens’ lives. An example of these attacks and their consequences for citizens is the attack on the Lazio region healthcare system in Italy in August 2020. The attack resulted in a halt of the administration of the patient care system as well as a slowdown in the vaccination campaign that was taking place (Regione Lazio, 2020).

There are different types of social engineering attacks, which can be classified taking into account various aspects (Salahdine and Kaabouch, 2019). Nevertheless, they have a common pattern of execution that starts with the acquisition of meaningful information about the victim (target) and the subsequent connection with it. Hence, it is crucial for the attacker to collect relevant information about the victim in order to know personal details, passions, lifestyles, and whatever is useful to foster a “link” with them to carry out the attack. The nature of the information of interest to the attacker depends on the type of attack being carried out. The information may concern the individual, the organization of the company in which they operate, or other elements relating to the work or private context that can be exploited, for instance, to construct a targeted spear phishing attack. For an attacker, there are different strategies to perform such a task. Moreover, the channels used to conduct a social engineering action can include emails, instant messages, social network messages, and telephones (Krombholz et al., 2015).

One approach increasingly used by attackers is to search inside the so-called open sources, exploiting Open Source INTelligence (OSINT) (Ariu et al., 2017). OSINT can be defined as “the intelligence discipline that pertains to intelligence produced from publicly available information that is collected, exploited, and disseminated promptly to an appropriate audience to address a specific intelligence and information requirement” (USA Headquarters Department of the Army, 2012). Thus, it represents the discipline of intelligence gathering on data sources not covered by privacy or copyright, exploiting techniques able to retrieve information “left behind,” either voluntarily or through carelessness, by a user on the internet or social media.

OSINT has been used since the early 20th century, relying on research from traditional sources such as newspapers. Since the birth of the Internet, this discipline has experienced ever greater development and growth (Hassan and Hijazi, 2018). This spread and growth are closely linked to the increased availability of data on the net, which is very often freely accessible.

The literature contains several examples of attacks based on OSINT techniques to steal sensitive data. Khanna et al. (2016) analyze how personal information can be extracted and elicited through this methodology. Using the Maltego tool (Maltego, 2023) and its extensions allows the collection of data inherent to specific targets, such as email, social profiles, profiles linked to the specific email address, and phone number. Possible countermeasures to this type of attack are highlighted in the study. Another study, Uehara et al. (2019), shows how, starting from an email sent to a subject X, numerous data related to the subject can be inferred. To this end, they integrate the results of Maltego with a Twitter search performed with a specific tool, Tinfoleak (Tinfoleak, 2023). Related to the use of Twitter, another example of using this source is given by Hoppa et al. (2019). In this study, a pipeline is developed for the automated collection of data, represented by tweets. Another threat comes from password theft, as passwords are still one of the most vulnerable elements in the security system because they are closely related to the human factor. The study by Kanta et al. (2020) shows how using OSINT techniques can speed up the collection of information on subjects. In this case, it was carried out in a “positive” way, to provide additional support to police investigations. However, this does not exclude the malicious use of the same techniques, thus threatening an organization’s security and integrity.

OSINT can also be used to collect files protected by cryptographic algorithms. Using several techniques (see for example Mozaffari-Kermani and Reyhani-Masoleh, 2011a,b), a malicious attacker can crack the protection of inadequate algorithms or keys and, consequently, disclose the information stored in the files. In this way, the attacker can acquire more sensitive information. Notice that while the simple collection of files and the exploitation of their meta-data can be considered an open-source activity, cracking such files violates the OSINT framework: it cannot even be considered a “gray source” activity, and it violates user rights and the law in several countries. However, this cannot be considered a barrier for a malicious actor; hence, it is mandatory to protect freely available files with strong cryptographic algorithms (Mozaffari-Kermani and Reyhani-Masoleh, 2009).

Notice that OSINT can also be used to counter social engineering attacks. Indeed, understanding what kind of data is exposed on the network is extremely relevant to design appropriate awareness campaigns (Assenza et al., 2020). Hence, there are several examples of integrating the OSINT methodology into the organization’s security system, particularly in the cybersecurity field. Lande and Shnurko-Tabakova (2019) analyzed the integration of OSINT within a cyber defense system and highlighted the key benefits of this practice for an organization, such as cost reduction, effectiveness, and data volume. The article highlights methodology and techniques to integrate existing resources through OSINT. Hayes and Cappa (2018) presented an example of a risk assessment conducted on a target company using only OSINT sources and tools. They identified vulnerabilities present in the corporate network. Moreover, in their investigation, they were able to identify the personal information and opinions of the employees. The study shows how critical it is to adopt specific security policies to avoid disseminating potentially sensitive information, and the need to activate proactive OSINT-based initiatives to reduce the risk of accidental information leakage. There are also attempts to integrate OSINT techniques in security standards, as suggested in AlKilani and Qusef (2021), where OSINT techniques are used to assess companies’ compliance with ISO 27001.

1.1. Contribution

This study reviews available OSINT tools for performing social engineering activities. We considered tools for the collection of information, analyzing some of the most popular and widely
used tools, presenting their features and limitations. We provide an overview of which kind of information and/or vulnerabilities can be collected using such methods, to offer an instrument to understand the level of exposure and, consequently, define adequate protection initiatives. Consequently, we mainly focus on data relevant to arranging social engineering attacks, such as email, username, phone number, and social profile.

The article is organized as follows. Section 2 presents the OSINT methodology. Section 3 illustrates the different social engineering tools divided into five application groups based on the type of information elicited using the specific tool. Section 4 presents an experimental validation of the tools. Section 5 collects some considerations and possible future studies, while Section 6 reports the study results and relevant conclusions.

2. OSINT methodology

As discussed above, in the last years, an exponential increase of data available on the web has been observed. Hence, in parallel, the relevance of methodologies and tools able to help users retrieve valuable information from this huge amount of data has also grown. In this context, OSINT represents an effective methodology to search, collect, analyze, purge data, and potentially exploit relevant information. When doing an OSINT investigation, there is a risk of finding too much inaccurate information, which generates noise rather than the correct result. Because of these “false positives,” and because searches are often carried out quickly, the results can be affected by bias and confusion.

For this reason, it is fundamental to define a correct plan on how to proceed in an OSINT investigation. The first step is the search for information. It is generally based on keyword searches and the analysis of images. Then one has to perform correlation among data to refine the information. Finally, such information needs to be exploited to perform the intended task.

A process is then needed to manage the OSINT investigation. We can define a four-step process:

1. Direction: The first step of the process. It is based on the determination of information needs. It consists of defining objectives, appropriate sources, and choice of time frames. This is the least automated phase and largely depends on the researcher’s experience. Notice that this can be considered the most critical phase because any mistake in the definition of the relevant questions or deficiencies in the data source may produce dramatic consequences in terms of output quality.
2. Collection: It constitutes the second phase of the activity. In this step, data are collected from the identified sources. This is generally performed using bots or scrapers.
3. Elaboration: Intelligence analysis is performed on the collected data. Such tasks are devoted to integrating the information provided by the different sources and to identifying data incoherence and lack of information. This process is generally divided into two sub-stages: aggregation and evaluation. In the first sub-stage, data are grouped into interrelated information. In the second sub-stage, such information is evaluated in terms of two basic parameters: the source’s reliability and the news’s truthfulness. Notice that the reliability of a source does not automatically make the news true. In addition, the fact that a source has provided true news does not make it reliable.
4. Exploitation: Here, we consider the “use” of the information to perform the intended task. If the processing has not been completed in order to avoid delays, this should be indicated. In terms of appropriateness, the result must respond to the user’s requests and be accessible and understandable to them.

The definition of these steps appears to be common to the different investigations. It may occur in various forms depending on the contexts and the analyst’s choices, but the nature of the steps is common and available in the literature (Hwang et al., 2022). The methodology is adapted to the needs and demands of the research, as in Lee and Shon (2016), where a framework for information gathering in critical infrastructure is presented. Another method to integrate open sources is presented in Pastor-Galindo et al. (2020). In this case, the OSINT cycle is integrated with the DML model representing abstraction levels in cyber attack detection.

The general framework can be tailored to the aim of this article, considering how different resources can be used to collect information with the OSINT methodology.

1. Direction: In this phase, we should consider and analyze the investigation’s aim and the type of resources. We can consider the search engine, email, username, phone number, and social network.
2. Collection: In this phase, we use the selected sources to obtain the raw data about the target of the investigation.
3. Elaboration: After collecting the data, it is important to analyze the output of the previous step to obtain valuable information. To this end, the data are generally aggregated and combined to elicit information. Techniques to analyze the data are not discussed in this article.
4. Exploitation: In this phase, it is essential to produce a report or a document where the investigation results are presented.

Figure 1 presents the different phases of the information search.

FIGURE 1
Summary of OSINT cycle to collect information.

As mentioned earlier, the study focuses on information gathering. To carry out this phase, it is essential to define the steps to be performed and followed. It is possible to define the structured information collection flow:

1. Target. The first aspect is to identify the target of the attack, typically a company. After that, it is necessary to search for the information related to it. In this phase, the collection of information is performed on different types of information sources, but it starts with a search engine. The aim of this step is to collect data about the targets, their structure, and their characteristics.
2. Company. After researching possible targets and identifying a company of interest, information about it is sought in order to perform the attack. Information is first sought on search engines even if preliminary information has already been identified. At this stage, it is important to identify the company structure, how it works, and how it communicates.
3. Employees. Starting from the company’s information, an attempt is made to identify the employees. Depending on the specific attack strategy, it may be of interest to identify specific roles inside the organization, e.g., chief financial officer, IT manager, and legal counsel. In this activity, different types of tools could be helpful, e.g., email identification and social networks.
4. Employee information. After an employee has been identified, the search for information about them begins. All possible information about them and their lives is sought. One tries to identify and understand their interests and relationships to generate a targeted attack on them.

From the structured collection of this type of information, an attack toward a possible target can be defined and carried out. Hence, this information and structure should be protected to prevent social engineering attacks. In Figure 2, the main information to be collected is summarized and divided into the macro-groups highlighted above.

FIGURE 2
Different collection steps of information to perform an attack.

3. Review of the OSINT tool

This section analyzes some of the most common and effective OSINT tools to collect personal information. We focus on tools able to support the performance of social engineering attacks and, conversely, helpful to understand and control the information exposed on the Internet to identify potential vulnerabilities to the security of a company or an individual. We divided this section into five subsections, each focused on a class of tools: search engines and tools designed to collect information about emails, usernames, phone numbers, and social networks.

Exploiting the OSINT tools described in this section allows one to elaborate and collect the information illustrated in Section 2. The order in which they are presented follows the structure of the research, starting from the elements that are easier to acquire and that are then used as a starting point for further refinement and enrichment of information about the target.

3.1. Search engine

One of the most useful and used OSINT tools, and the very first starting point for any information collection campaign, is the search engine. Search engines are used by everyone every day to perform textual queries on the web. However, these tools generally provide many answers, which are not always completely reliable. This represents the main weakness of this class of tools. Hence, it is important to carry out targeted research exploiting the features of the tools to refine, circumscribe, and unbias the search results.

Google and Bing are the most widely used search engines globally. They are certainly the best-performing tools in search depth and the number of indexed sites. Both have filters that allow refinement and more precise and targeted searching by extracting only the information of interest. In this way, the amount of information to be verified is limited, and more precise data are selected. Table 1 shows a list of these filters. Nevertheless, both Google and Bing are companies that track the searches and activities the users perform to sell the data. For this reason, both have shortcomings from a privacy perspective.

A lesser-known search engine is the Russian Yandex, very popular in Eastern Europe. Its features do not differ from those of Google and Bing, and it also offers operators that limit language, date, and type of file searched. The main feature of this search engine is that it performs very well in image searches.

Moreover, an interesting search engine is DuckDuckGo, as this engine does not collect or share users’ activities and personal information. Therefore, it can be used to keep the user’s privacy protected. A peculiar characteristic of DuckDuckGo is the possibility to use the bang (!) operator to limit the search to a specific source (e.g., !tw limits the search to Twitter).

3.2. Email

Nowadays, each company provides an individual email address to each employee. Many people have more than one email address, not just for their job but also for their personal life. A personal email address is very often associated with online services. Protecting the
TABLE 1
Operator | Description | Search engine
AND | Returns results that contain both search terms | Google, Bing
OR | Returns pages that contain at least one of the keywords entered | Google, Bing
Filetype | Narrows down the results to a certain file type, e.g., pdf, doc, and docx | Google, Bing
() | Groups words or search operators to control how the search is done | Google
Intitle/Allintitle | Returns only pages that have one/all the words specified in the title | Google, Bing
Inurl/Allinurl | Returns only pages that have one/all the words specified in the URL | Google
Intext/Allintext | Returns only pages that have one/all the words specified in the text | Google
Ext | Returns only Web pages with the specified filename extension | Bing
Inanchor/Inbody | Return Web pages that contain the term specified in the metadata | Bing
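
The operators in Table 1 can be combined into a single query string, for example when an organization audits which of its own documents are indexed. The following Python sketch is only illustrative: the function names `quote` and `build_query` and their parameters are hypothetical, the use of double quotes for exact phrases is an assumption not listed in Table 1, and the exact behavior of each operator depends on the search engine.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes (assumed phrase syntax)."""
    return f'"{term}"' if " " in term else term


def build_query(all_terms=(), any_terms=(), filetype=None, intitle=None):
    """Compose a search string from a subset of the Table 1 operators."""
    parts = []
    if all_terms:
        # AND: every term must be present.
        parts.append(" AND ".join(quote(t) for t in all_terms))
    if any_terms:
        # OR inside (): at least one term must be present.
        # Table 1 lists the grouping operator () for Google only.
        parts.append("(" + " OR ".join(quote(t) for t in any_terms) + ")")
    if filetype:
        # Filetype: restrict results to one file type, e.g. pdf.
        parts.append(f"filetype:{filetype}")
    if intitle:
        # Intitle: the word or phrase must appear in the page title.
        parts.append(f"intitle:{quote(intitle)}")
    return " ".join(parts)


if __name__ == "__main__":
    # "Example Corp" is a placeholder organization name.
    print(build_query(all_terms=["Example Corp", "report"],
                      any_terms=["contact", "staff"],
                      filetype="pdf"))
```

Keeping the query construction in one function makes it easy to repeat the same exposure check periodically and to compare the results over time.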
work email address and the personal one is important because they are largely used for massive cyber-attacks and phishing campaigns. Moreover, if the email of the target is known, it is possible to derive connected profiles and, consequently, personal information about them. People use the same email to register on several websites and online services and to participate in e-promotions, often using the same password as in their business email. Unfortunately, the level of security of these websites and services is not as high as that of business accounts, and it is relatively easy to collect passwords from them. For this reason, emails represent the first step in searching for elements to arrange a social engineering attack. Hence, monitoring how employees spread their email on the Internet is a cornerstone element to be considered in any cyber security strategy.

There are two classes of search tools: those that look at the domains associated with a company and those that check known addresses. We present examples for both categories, starting with tools to obtain domain emails.

Hunter (2023) is a web application that allows one to search for the email addresses of a given company. By checking predefined combinations of company addresses, it searches for online correspondence from them. It also allows testing and verifying emails. Through the browser extension, it allows employees’ emails to be viewed from the website and thus the names of employees to be obtained.

Emailformat (2023) has a list of domains and allows one to search for a specific one and verify its email address composition. It then has a list of email addresses representative of the searched domain with possible verification found on the web. This provides a list of possible employees and places where they used that email. It works very well for US domains but also finds matches for domains from other countries.

We now move to tools that verify an email address and obtain information related to it. Haveibeenpwned (2023) was born as a web application to verify whether an email address was compromised in a data breach. Moreover, the tool also provides other information that could have been stolen during the data breach.

Epieos (2023) is a tool to verify email addresses. Furthermore, it analyzes how the email is used and shows potential profiles connected to the email. These profiles represent social services or applications where the target has used the same email in the enrollment process. In this way, it is possible to acquire information about the target’s interests, communities, and activities. In addition, this tool can return information about the contact, i.e., name, surname, and username.

Email Reputation (2023) is a web tool to verify the reputation of an e-mail address. The tool searches the web for any profile or service that uses the e-mail. It returns the verification of the email, with a grade of accuracy, and the associated profile or service. It does not return the specific profile but reports the existence of a profile on that social platform.

In Table 2, we report a summary of the tools proposed.

TABLE 2
No. | Tool | Description | Reference
3 | Haveibeenpwned | Online application to verify if an email was compromised in a data breach | Haveibeenpwned, 2023
5 | Email Reputation | Web application to verify the reputation of an e-mail address | Email Reputation, 2023

3.3. Username

One aspect that is generally underestimated but is very relevant in gathering information about a target is the search for usernames. Usernames are names associated with profiles, often related to the target’s characteristics or passions. The username alone often does not provide information that can immediately be used to construct a social engineering attack. However, it allows one to determine profiles on social networks or other platforms and, from them, discover unknown email addresses of the target. Investigation of usernames can be performed starting from a tentative username to find possible matches. Moreover, it is also possible to start with a username associated with a profile on a specific domain and look for other profiles with the same username.

The first tool for usernames that we analyze is UserSearch. UserSearch (2023) permits searching different profile types starting from a username. It was specifically designed to perform research on social networks, but it can also be used for specific websites and applications. It also has an extension to perform searches on email addresses. It returns the profiles associated with the username found on a given platform.

Knowem (2023) has a similar functionality to the previous one. It tests a given username on different types of platforms. In this case, however, it only returns information on whether the name has been used on a given platform, not about the associated profile. Unlike the previous tool, KnowEm performs the search on a greater number and variety of platforms.

Another tool for analyzing usernames is NameVine. Namevine (2023) is a tool to analyze usernames on a limited number of platforms. It provides results on whether a match exists on one or more of the analyzed platforms and provides the link to the profile found.

A slightly different tool is Leakcheck (2023). This tool allows one to check whether a given username is present within a data breach. It returns as a result the domains that were breached and the date when this occurred. The results highlight the presence of data profiles and the actual use of the username.

The last tool for this section is WhatsMyName. WhatsMyName (2023) is an online tool that allows one to search for a given username on over 500 online platforms. It returns the matches it finds, indicating the type of platform and the link to the identified profile.

In Table 3, we report a summary of the tools proposed.

TABLE 3
No. | Tool | Description | Reference
2 | KnowEm | Checks on different platforms whether a given username has been used | Knowem, 2023
3 | NameVine | Tool to analyze the presence of social network profiles from a username | Namevine, 2023
5 | WhatsMyName | Web application that permits searching for a username in different domains | WhatsMyName, 2023

3.4. Phone number

Private or business phone numbers are largely exploited to perform social engineering attacks. Notice that, in several countries, phone numbers are considered sensitive data. It is possible to obtain different types of information about the target from the phone number. Moreover, if one knows the target’s phone number, it can be used to directly carry out an attack, sending fake messages containing malicious links or malware.

As for the previous case, there are different classes of tools for phone numbers, either to retrieve the number in larger data searches or to retrieve data obtainable from a given phone number. In the first case, the phone number is the object of the research; in the second case, the attacker has discovered the phone number from a different source and wants to associate it with a target to acquire more information.
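
The first class of search, where the phone number is itself the object of the research, can also be run defensively on text that an organization already publishes, such as its own web pages. The sketch below is a minimal illustration under stated assumptions: the names `find_exposed_contacts` and `normalize_phone` are hypothetical, the regular expressions are deliberately simple and will miss or over-match edge cases, and the sample data are fictional.

```python
import re

# Simplified patterns (assumptions): readable rather than exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def normalize_phone(raw: str) -> str:
    """Keep only the digits and a leading '+' so duplicates collapse."""
    digits = re.sub(r"\D", "", raw)
    return ("+" if raw.strip().startswith("+") else "") + digits


def find_exposed_contacts(text: str) -> dict:
    """Return the e-mail addresses and phone numbers found in `text`."""
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    phones = sorted({normalize_phone(m) for m in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}


if __name__ == "__main__":
    # Fictional page content used only for illustration.
    page = ("Contact Jane at Jane.Doe@example.com or +39 06 1234 5678. "
            "Fax: (555) 010-9999.")
    print(find_exposed_contacts(page))
```

Normalizing the numbers before comparing them matters because the same number is often published in several formats, which would otherwise hide how widely it is exposed.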
An example of a tool to elicit the phone number of a target starting from the target’s email is Email2phonenumber (2023). It is a Python OSINT tool that permits obtaining the phone number of a target just by having their email. It scrapes different platforms, searching for the phone numbers associated with the email.

Syncme (2023) is a tool that allows one to search for a phone number and obtain information about the owner. Specifically, the free version only shows the location, the name of the subject, and a possible photo. The paid version allows obtaining more information, such as the photo of the phone number’s owner from social networks and a report of their past activities. This type of information could be very relevant to design sophisticated social engineering attacks.

Another tool that allows obtaining information about a phone number is Phone Validator. Phone Validator (2023) allows one to search for the phone number of a target and find information about the last location, the type of the number, and the phone company. The paid version also provides information about the owner. A limitation of this tool is that it is only usable with North American numbers.

Moriarty Project (2023) is a Python tool that allows searching phone numbers. It permits searching for different aspects: the owner of the number, whether the number has a spam risk, possible links connected with the number, and possible social platforms or profiles connected with it.

True Caller (2023) is an OSINT tool to identify whose telephone number it is, whether it is in the name of an individual or a number linked to a company. In addition, this tool makes it possible to search for numbers from different countries.

In Table 4, we report a summary of these tools.

TABLE 4
No. | Tool | Description | Reference
2 | Syncme | Web application that permits obtaining the name and photo of the subject from a phone number | Syncme, 2023
3 | Phone Validator | Tool to obtain information about the phone number | Phone Validator, 2023
5 | True Caller | Verifies the real user of a phone number | True Caller, 2023

3.5. Social network

Most employees use social networks, even during business time, leaving social media footprints, i.e., traces of the daily activities performed by the user on a social platform. This information can be

There are many social networks, each with its own characteristics and peculiarities. Consequently, it is useful to have both tools able to search information on several social networks at the same time (i.e., cross-media search) and tools designed to extract information from specific platforms, such as Facebook, LinkedIn, and Twitter.

First, we analyze tools to perform cross-media investigations. Social Searcher (2023) is a tool that permits obtaining the social profile from the username or name of a subject. This tool investigates different sources, e.g., Facebook and Instagram. It provides a list of the possible profiles of the subject on each social media platform.

A similar tool is Webmii (2023). Webmii returns the social profiles associated with a name. In addition, it associates a relative score with the profile, representing the reliability of the result. It associates with the result the sources of the profile, i.e., web and social. It also provides username discovery on the social network. An interesting feature is that it provides information about the people connected on social media with the target. Lastly, it is associated with the Google search engine.

Tools that analyze different social networks provide a broader overview of the analysis. However, they may generate many false positives, i.e., profiles that are not referable to the target. This forces the user to perform further analysis to check the quality of the results. To partially overcome such limits, it is possible to use tools tailored to searching single social networks. In particular, we will look at some of the most widely used social media, i.e., LinkedIn, Instagram, and Twitter. Let us start with LinkedIn, a professional social network. It allows one to obtain a wealth of professional information about the target and the work environment and company in which he or she works.

RocketReach (2023) is a web application that, given the name of a target (either an individual or a company), allows extracting information about the target from LinkedIn. Starting from the target’s name, it gives back the associated profile and possible contact information. Moreover, it verifies the existence of such information on the web. At the same time, starting the
used by the attacker to perform social engineering attacks. Indeed, research with a company name, it is possible to obtain global
from the analysis of the social media footprint, it is possible to information about it, i.e., headquarters address, website, and area
understand the habits, the common activity, and the interests of the of expertise. In addition, employee information and profile are
target. There are different services and tools to collect information shown. In this way, it is possible to obtain the email address and
about a target from a social network. It is important to underline information about the people that work in the organization.
that a company should adopt specific policies regarding the use of We now look at Instagram, a social network from the Meta
social media by its employees to prevent the spread of sensible data group, where users share photos, videos and activities, often
on social platforms. indicating their location.
Pikuki (2023) is a tool that allows one to see Instagram posts without having an account on the social network. It allows searching for a profile without knowing the username associated with it: it is enough to enter a first and last name. The system is not always up to date, so sometimes it shows posts that have been deleted by the user, and this could provide interesting data. It should be noted that the system allows viewing only posts from public profiles, while for private profiles, the tool is able to provide just the profile photo and the associated username.

Another tool with similar functionality is Pixwox (2023). It allows viewing Instagram profiles without having a profile on the platform. The main difference with respect to Pikuki is the possibility of downloading the profile photo of any profile, even private ones. This aspect is very useful when searching for a target because it allows us to extend the search activities to images as well. Moreover, it allows viewing the stories saved on the profile as well as downloading the posts.

Also from the Meta group is Facebook, one of the most popular and widely used social networks. Over the years, its popularity and target audience have changed a lot, but it remains a daily diary of many users' activities and thoughts. As a result of some user privacy issues, however, it has undergone many restrictions that led to the shutdown of many OSINT tools designed to perform analysis on this social media. However, accessing its data through some alternative techniques is still possible.

In particular, it is possible to take advantage of some tools that are not really meant for OSINT use but that allow viewing web pages and Facebook profiles without logging into the social network or having a Facebook profile. Indeed, one can search for the user's profile by taking advantage of the operators in the search engine section. Once the profile is found, one can use the mobile-friendly test mode, which checks how a web page is displayed on a mobile device. This is a feature usually used by developers when building websites or applications. Once it tests the page, it generates the HTML code that describes it. Copying this code into any code viewing tool will display the page. Obviously, with this type of search, navigation on the user's profile is limited, and it is up to the analyst to highlight the information present.

Now, we turn to Twitter, one of the most widely used and popular social networks. Twitter is based on writing short texts expressing one's thoughts on news facts, events, passions, etc.

To analyze Twitter, one can refer to Truth Nest. Truth Nest (2023) is a tool that, starting from the username of a Twitter profile, finds information about it. As with Instagram, the tool enables us to analyze and see some tweets without logging in to the platform. In this case, it is necessary to subscribe to the service. The information it returns is varied. First, it provides preliminary information about the profile, such as the name, when and where it was created, and the description that the user has entered. In addition, it provides an overview of the activities performed by the profile and the most popular posts it has made. An interesting feature is the possibility of obtaining information about the profile's network, both the people who are followed and those who follow it. Finally, it returns information on how to interact with the profile, such as topics it has talked about. All statistics are collected in a PDF file that can be downloaded.

Continuing with TikTok, the platform is a Chinese social network that is becoming increasingly popular among younger people. It is based on making short videos on different themes.

UrleBird (2023) is an OSINT tool that permits viewing TikTok profiles and videos without an account on the social network. It is similar to the tools presented above for Instagram. The search is possible both by username and by hashtag. The search by hashtag could be very helpful when the username is not known but the activity or the subject of the channel is. Once a profile is found, it is possible to see the profile photo and the description in addition to the shared videos.

Finally, let us turn to messaging applications, i.e., WhatsApp and Telegram, which can be considered social networks and allow us to obtain significant information.

WATools (2023) is a tool to track WhatsApp activities. It permits monitoring access to the application and the duration of its use. It is possible to activate a function that sends a notification when a contact is online and to analyze when a person is connected to the platform. It also allows you to view and download the profile photo associated with the phone number. It can be very useful in verifying an identified phone number and continuing an image search.

Another messaging platform that is becoming increasingly popular is Telegram. Telegram is not just a standard messaging platform, where you communicate between known phone numbers, but it also allows you to create channels to discuss topics of interest. You can also interact with chatbots. Telegago (2023) is a tool that allows investigating inside Telegram. The search is performed by keywords and returns different types of results. First, it provides an overview of the results associated with the topic entered as a keyword, and then it shows the public channels that deal with that topic or have talked about it in posted messages. It also allows seeing contacts involved with that topic, voice chats, and bots. It provides very powerful data if you want to analyze a particular phenomenon or establish a relationship with the target subject of the attack.

In Table 5, we report a summary of the tools proposed.

3.6. Collective tools

In addition to the tools presented so far, there are instruments that allow one to search and analyze different kinds of information from different types of data. These types of tools make it possible to collect large amounts of data of different types within a single search.

Maltego (2023) is one of the most well-known and widely used OSINT tools. The system collects and links data from different sources and reports them within a single dashboard as a graph. The system is based on two concepts: entities and transformations. Entities are represented as nodes in a graph. Investigations begin with one or more entities, on which transformations are performed to explore the relationships between these entities and other yet unknown information. Entities can be of different types: emails, phone numbers, people, in-directories, web domains, etc. Transformations, on the other hand, are pieces of code that, when executed, generate information based on information we already have. Transformations look up information about an entity in the
2 Webmii Tool to obtain information about social profiles starting from the name of a subject Webmii, 2023
3 RocketReach Web application to search for a person or company to obtain the email address and additional information RocketReach, 2023
4 Pikuki Tool to search and visualize Instagram profiles without an account Pikuki, 2023
5 Pixwox Tool to search and visualize Instagram profiles without an account Pixwox, 2023
6 Truth Nest Tool to analyze Twitter profiles from a username Truth Nest, 2023
7 UrleBird Tool to search and visualize TikTok profiles without an account UrleBird, 2023
The advantage of this type of tool is the possibility of collecting data with one tool, without the need to switch between different sources. The disadvantage is that it can make the results confusing: these tools produce many results that need to be verified. Because of their characteristics, tools of this type are not comparable with the others.

4. Experimentation

3 s***********y@******.com NF/P P
4 e*******h@*****.com P P/P
5 a**a@*****.com P P
6 g**************o@**************.com NF NF
7 f************o@**************.com NF NF
8 c*************c@******.com NF NF
9 m************r@************.com NF P
their capability to acquire a profile starting from a given username. Notice that a username can be used multiple times by the same user or by different users, and on different platforms. Matches were sought among the results proposed by the same tool and by comparing them with the results presented by the other tools examined. In the second phase, a precise match was sought with the target under examination.

The different tools responded well to the tests performed, reporting a nearly 100 percent positive result rate when comparing the results between the different proposed tools. The major limitation at this stage is that one is not researching a specific person but testing the accuracy of a product, so it cannot be ruled out that, in a search for a specific target, the tool would have a high rate of false positives. This is because the tools presented different matches on different platforms when tested with generic names. Certainly, good reliability of the products emerges, but this must always be accompanied by human analysis to verify the correlation between the results. The presented profiles can be an additional step in information collection, but false positives must be eliminated.

An exception in this discussion is LeakCheck, a tool that verifies the presence of a username within a data breach. The platform had many matches with dated attacks and with profiles from platforms less used in recent years, which may be a limitation in the use of the tool.

The summarized results can be seen in Table 7.

4.3. Phone number

For the phone number, we used a phone number generator available online to generate 10 different numbers to test. The different tools responded to the test with interesting results. For the tool whose geographic coverage was limited to one country, we tested only the numbers of that country. This is the case of Phone Validator, which responds with a 10% error rate but covers only the United States; it reported the geographic area and location of each number identified and, if known, also the number characteristics. For the other tools, numbers from other countries were used in the tests, and good results were obtained. Sync.me obtained an error rate of 30%, showing a preference for finding U.S. phone numbers. In contrast, searches with the Python tool Moriarty Project on numbers from different countries produced only one error, i.e., an error rate of 10%. The two tools provide almost the same information about the phone number's owner. No tool was able to obtain additional information about the phone numbers. Sync.me, however, showed the associated user's name, while the Moriarty Project and Phone Validator tools provided information on the type of number found and the associated operator.

In the second step, the tools were also tested with the phone numbers found during the information search carried out with the other tools. This was done using the same procedure as in a classic OSINT survey. As such, it was possible to get a broader and clearer view of the effectiveness of these tools. During the other phases of the research, three different phone numbers were found, from three different countries, and they were all tested with the different tools. Sync.me and True Caller showed the best results by associating the numbers with the searched user. The Moriarty Project tool was unable to find any of the three numbers. Since Phone Validator is available only for U.S. numbers, it could be tested with just the single number from this country, and it found a correct match.

The Email2phonenumber tool did not match any phone number associated with the emails.

The results are reported in Table 8.

4.4. Social network

For social network validation, as in a classic OSINT search, we started from the results obtained in the previous stages, i.e., names and usernames.

First, the tools that simultaneously search and analyze multiple social networks, Social Searcher and Webmii, were analyzed. The first tool has a low success rate, failing to find a match for as many as six of the ten profiles tried. Of the remainder, the success rate is 50%. Webmii, on the other hand, presents more satisfactory results, giving no match in only one case, with a success rate of 70% and three cases of false positives.

We then moved on to analyze specialized tools. We started with RocketReach, which analyzes LinkedIn profiles. This tool is one of the best performing among those tested, having found 80% of the profiles with a 100% success rate. This tool identifies usernames that are used later for other analyses.

Two Instagram-specific tools, Pikuki and Pixwox, were tested, presenting the same results. In this case, the profiles identified are only 40%, with a 50% rate of false positives. This depends on several factors, such as the presence of homonyms. Only one case, however, yielded no matches. It is difficult to understand whether there is a shortcoming in the tool or whether the searched individuals do not have a profile on the social network in question.

Analyzing the Truth Nest tool, which specifically searches for Twitter profiles, many profiles were absent from the platform. In this case, we can have more confidence that these profiles do not exist, as there were no hits on other tools indicating the presence of Twitter profiles. Twitter is also a less-used social network, used primarily for business communication. Of the matched results, there is a 70% positive rate.

Finally, the tool that searches for and displays TikTok profiles was analyzed. In this case, the validation is biased, as the searched profiles are not the platform's target audience and most likely do not have a profile on the platform. In this case, the false positive rate found is 70%, with only one case found to be positive and two cases for which no match was found on the platform.

All the results are presented in Table 9. It gives a summary view of the performance of the different social network survey tools, allows them to be compared with each other, and also highlights how these results are influenced by the subjects being researched. In fact, it is evident that some subjects are more prone to activity on social networks, with profiles found on almost all platforms, compared to others with almost no profile.

5. Discussion and future studies

The article shows several tools for carrying out a social engineering attack. All these tools can be used both to prepare for an attack and to defend against it. It is important to know them in order to understand one's vulnerabilities and try to protect oneself from possible malicious attacks. It is possible to use the tools presented to perform an assessment of the exposed data and prevent it from remaining so in order to protect yourself from potential threats.

2 m*****e P FP P P P
3 k*****a P/FP FP P P P

2 +1 4********4 P P P P
3 +1 9********0 P P P P
4 +1 8********5 P P P NF
5 +1 5********8 NF FP P /
7 +39 3********8 P / P P
8 +44 1*******2 P / P P
9 +46 1********1 NF / P P
10 +1 1********5 P/FP P P P
P is a positive result, NF is not found, and FP is a false positive.

2 a*************z@******.com NF P P P P NF FP
3 s***********y@******.com FP P P FP FP P NF
4 e*******h@*****.com FP FP P FP FP NF FP
5 a**a@*****.com P FP P FP FP FP FP
6 g**************o@**************.com NF FP P FP FP P FP
7 f************o@**************.com NF P P P P P P
8 c*************c@******.com NF P NF P P NF NF
9 m************r@************.com NF P P P P P FP
10 o***********f@************.com NF NF NF NF NF NF NF
P is a positive result, NF is not found, and FP is a false positive.
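The outcome codes used in the result tables (P, NF, FP) lend themselves to a simple tally. The following is a minimal Python sketch, not the procedure used in the study. It takes rows 2 to 10 of the last results table as they appear in the text (row 1 and the column headers are not available here, so columns are referred to by position only) and counts, per column, how many targets were not found and what share of the returned profiles were positive. The names `ROWS`, `column`, and `tally` are illustrative assumptions.

```python
from collections import Counter

# Rows 2-10 of the last results table, outcome cells only.
# Row 1 and the column headers are not available in the text.
ROWS = [
    "NF P P P P NF FP",
    "FP P P FP FP P NF",
    "FP FP P FP FP NF FP",
    "P FP P FP FP FP FP",
    "NF FP P FP FP P FP",
    "NF P P P P P P",
    "NF P NF P P NF NF",
    "NF P P P P P FP",
    "NF NF NF NF NF NF NF",
]


def column(rows, index):
    """Return the outcome codes found at one column position."""
    return [row.split()[index] for row in rows]


def tally(codes):
    """Count outcomes for one column of P / NF / FP codes."""
    counts = Counter(codes)
    found = counts["P"] + counts["FP"]  # a profile was returned
    return {
        "tested": len(codes),
        "not_found": counts["NF"],
        "found": found,
        # Share of returned profiles that were true positives.
        "positive_share": counts["P"] / found if found else None,
    }


summary = [tally(column(ROWS, i)) for i in range(7)]
```

Computed this way, the third column has two NF cells and no FP cells over these nine rows, while the first column has six NF cells. Cells with mixed codes such as P/FP, or cells marked /, as in the other tables, would need an explicit counting rule before being tallied.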
Clearly, it is not possible to identify a tool that provides error-free results. These tools help in the search for information, but they are not infallible, and it is always necessary to verify the information obtained. Several tools have been presented that allow us to obtain different types of data, but it is necessary to combine the different information obtained in order to get an overview of the subject being researched. An overview of the different proposed tools and their error rate and success in correctly identifying the target is given in Table 10. As mentioned earlier, it is not possible to find matches for all the proposed tools, and it should be kept in mind that the Maltego and Lampyre tools are not comparable. In addition, tools that provide the structure of emails are also comparable with other results. The most effective tools seem to be those for detecting and identifying usernames. It must be kept in mind that the usernames associated with a user must be verified and are not always used on other platforms as well. The research done and the data required are closely interconnected, and therefore, there is a need to combine the information obtained and the tools.

Author contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References
AlKilani, H., and Qusef, A. (2021). “Osint techniques integration with risk assessment iso/iec 27001,” in International Conference on Data Science, E-Learning and Information Systems 2021, 82–86. doi: 10.1145/3460620.3460736
Ariu, D., Frumento, E., and Fumera, G. (2017). “Social engineering 2.0: a foundational work,” in Proceedings of the Computing Frontiers Conference, 319–325. doi: 10.1145/3075564.3076260
Assenza, G., Chittaro, A., De Maggio, M. C., Mastrapasqua, M., and Setola, R. (2020). A review of methods for evaluating security awareness initiatives. Eur. J. Security Res. 5, 259–287. doi: 10.1007/s41125-019-00052-x
Email Reputation. (2023). Available online at: https://emailrep.io (accessed January, 2023).
Email2phonenumber. (2023). Available online at: https://github.com/martinvigo/email2phonenumber (accessed January, 2023).
Email-Format. (2023). Available online at: https://www.email-format.com (accessed January, 2023).
Epieos. (2023). Available online at: https://epieos.com/ (accessed January, 2023).
Forbes. (2023). Alarming Cyber Statistics for Mid-Year 2022 That You Need to Know. Available online at: https://www.forbes.com/sites/chuckbrooks/2022/06/03/alarming-cyber-statisticsfor-mid-year-2022-that-you-need-to-know/?sh=440e40d07864 (accessed January, 2023).
Hassan, N. A., and Hijazi, R. (2018). “The evolution of open source intelligence,” in Open Source Intelligence Methods and Tools, Cham: Springer, 1–20. doi: 10.1007/978-1-4842-3213-2_1
Haveibeenpwned. (2023). Available online at: https://haveibeenpwned.com (accessed January, 2023).
Hayes, D. R., and Cappa, F. (2018). Open-source intelligence for risk assessment. Bus. Horiz. 61, 689–697. doi: 10.1016/j.bushor.2018.02.001
Hoppa, M. A., Debb, S. M., Hsieh, G., and KCa, B. (2019). Twitterosint: automated open source intelligence collection, analysis & visualization tool. Ann. Rev. Cyberther. Telemed. 2019, 121.
Hunter. (2023). Available online at: https://hunter.io (accessed January, 2023).
Hwang, Y.-W., Lee, I.-Y., Kim, H., Lee, H., and Kim, D. (2022). Current status and security trend of osint. Wireless Commun. Mobile Compu. 2022, 1290129. doi: 10.1155/2022/1290129
IBM Cost of a Data Breach (2022). Available online at: https://www.ibm.com/reports/data-breach (accessed January, 2023).
Jalali, A., Azarderakhsh, R., Kermani, M. M., and Jao, D. (2019). “Towards optimized and constant-time csidh on embedded devices,” in Constructive Side-Channel Analysis and Secure Design: 10th International Workshop, COSADE 2019, Darmstadt, Germany, April 3-5, 2019, Proceedings 10. Cham: Springer, 215–231.
Kali Linux (2023). Kali Linux Tools. Available online at: https://www.kali.org/tools/ (accessed January, 2023).
Kanta, A., Coisel, I., and Scanlon, M. (2020). A survey exploring open source intelligence for smarter password cracking. Forensic Sci. Int. 35, 301075. doi: 10.1016/j.fsidi.2020.301075
Kaspersky. (2023). The Year Of Social Distancing Or Social Engineering? Phishing Goes Targeted And Diversifies During COVID-19 Outbreak. Available online at: https://www.kaspersky.com/about/pres-releases/2020the-year-of-social-distancing-or-social-engineerin (accessed January, 2023).
Khanna, P., Zavarsky, P., and Lindskog, D. (2016). Experimental analysis of tools used for doxing and proposed new transforms to help organizations protect against doxing attacks. Procedia Comput. Sci. 94, 459–464. doi: 10.1016/j.procs.2016.08.071
Knowem. (2023). Available online at: https://knowem.com/ (accessed January, 2023).
Krombholz, K., Hobel, H., Huber, M., and Weippl, E. (2015). Advanced social engineering attacks. J. Inform. Security Appl. 22, 113–122. doi: 10.1016/j.jisa.2014.09.005
Lampyre. (2023). Available online at: https://lampyre.io (accessed June, 2023).
Lande, D., and Shnurko-Tabakova, E. (2019). Osint as a part of cyber defense system. Theoret. Appl. Cybersecu. 1, 1. doi: 10.20535/tacs.2664-29132019.1.169091
Leakcheck. (2023). Available online at: https://leakcheck.io (accessed January, 2023).
Lee, S., and Shon, T. (2016). “Open source intelligence base cyber threat inspection framework for critical infrastructures,” in 2016 Future Technologies Conference (FTC). San Francisco, IEEE, 1030–1033.
Maltego. (2023). Available online at: https://www.maltego.com (accessed January, 2023).
Moriarty Project. (2023). Available online at: https://github.com/AzizKpln/MoriartyProject (accessed January, 2023).
Mozaffari-Kermani, M., and Reyhani-Masoleh, A. (2009). “A low-cost s-box for the advanced encryption standard using normal basis,” in 2009 IEEE International Conference on Electro/Information Technology. Windsor, ON: IEEE, 52–55.
Mozaffari-Kermani, M., and Reyhani-Masoleh, A. (2011a). “A high-performance fault diagnosis approach for the aes subbytes utilizing mixed bases,” in 2011 Workshop On Fault Diagnosis And Tolerance In Cryptography. Nara: IEEE, 80–87.
Mozaffari-Kermani, M., and Reyhani-Masoleh, A. (2011b). “Reliable hardware architectures for the third-round sha-3 finalist grostl benchmarked on fpga platform,” in 2011 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems. Vancouver, BC: IEEE, 325–331.
Namevine. (2023). Available online at: https://namevine.com/ (accessed January, 2023).
Pastor-Galindo, J., Nespoli, P., Mármol, F. G., and Pérez, G. M. (2020). The not yet exploited goldmine of osint: opportunities, open challenges and future trends. IEEE Access 8, 10282–10304. doi: 10.1109/ACCESS.2020.2965257
Phone Validator. (2023). Available online at: https://www.phonevalidator.com (accessed January, 2023).
Pikuki. (2023). Available online at: https://www.picuki.com/ (accessed January, 2023).
Pixwox. (2023). Available online at: https://www.picnob.com (accessed January, 2023).
Regione Lazio (2020). Attacco hacker ai sistemi informatici della Regione Lazio. Available online at: https://www.regione.lazio.it/notizie/attacco-hacker (accessed January, 2023).
RocketReach. (2023). Available online at: https://rocketreach.co/ (accessed January, 2023).
Salahdine, F., and Kaabouch, N. (2019). Social engineering attacks: a survey. Future Int. 11, 89. doi: 10.3390/fi11040089
Social Searcher. (2023). Available online at: https://www.social-searcher.com/ (accessed January, 2023).
Syncme. (2023). Available online at: https://sync.me/ (accessed January, 2023).
Telegago. (2023). Available online at: https://cse.google.com/cse?cx=006368593537057042503:efxu7xprihg#gsc.tab=0 (accessed January, 2023).
Tinfoleak. (2023). Available online at: https://tinfoleak.com (accessed January, 2023).
True Caller. (2023). Available online at: https://www.truecaller.com/ (accessed January, 2023).
Truth Nest. (2023). Available online at: https://www.truthnest.com (accessed January, 2023).
Uehara, K., Mukaiyama, K., Fujita, M., Nishikawa, H., Yamamoto, T., Kawauchi, K., et al. (2019). “Basic study on targeted e-mail attack method using osint,” in International Conference on Advanced Information Networking and Applications. Cham: Springer, 1329–1341.
UrleBird. (2023). Available online at: https://urlebird.com (accessed January, 2023).
USA Headquarters Department of the Army (2012). Open-Source Intelligence. Fort Eustis, VA: Army Techniques Publication.
UserSearch. (2023). Available online at: https://usersearch.org (accessed January, 2023).
WATools. (2023). Available online at: https://watools.io (accessed January, 2023).
Webmii. (2023). Available online at: https://webmii.com (accessed January, 2023).
WhatsMyName. (2023). Available online at: https://whatsmyname.app (accessed January, 2023).