No Need To Hack: When It's Leaking
GITHUB HEALTHCARE LEAKS
PROTECTED HEALTH INFORMATION ON THE PUBLIC WEB
A collaborative report by
Jelle Ursem & DataBreaches.net
August, 2020
EXECUTIVE SUMMARY
CONTENTS
EXECUTIVE SUMMARY
MISUSE OF GITHUB
The Uber and Lynda Breaches
GnosticPlayers
Ransom Attempts
ShinyHunters
No Need to Hack When It’s Leaking
FINDINGS
1. Xybion
2. MedPro Billing
3. Texas Physician House Calls
Malware
4. VirMedica
5. MaineCare
6. Waystar
7. Shields Health Care Group
8. AccQData
9. ‘Unnamed Entity’
CONCLUSION
ABOUT US
OVERVIEW AND METHODS
It took Ursem less than 10 minutes to find that yes, medical data had
been exposed on GitHub, and a lot of it. Ursem, who unabashedly
claims to be ‘the lamest hacker you know’, uses variations on simple
search phrases like ‘companyname password’ (or in this case,
‘medicaid password FTP’) to quickly find potentially vulnerable
hard-coded login usernames and passwords for systems. You don’t even
need to be a nerd to be able to do this, he notes: literally anyone
could do the same.
[Sidebar: For those who are not familiar with GitHub and would like an
overview, see Brown K: What Is GitHub, and What Is It Used For?
November 13, 2019. Retrieved July 12, 2020.]
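
The search technique described above also works defensively. The following is a minimal sketch, not Ursem's actual tooling, of how a firm might generate the same kind of search phrases for its own names and domains. The keyword list and the function names are assumptions made for illustration. The URL targets GitHub's REST code-search endpoint, which requires an authenticated request that this sketch deliberately does not send.

```python
from urllib.parse import urlencode

# Hypothetical keyword list; extend it with terms that matter to your systems.
CREDENTIAL_KEYWORDS = ("password", "password FTP", "secret", "api_key")


def build_search_phrases(names, keywords=CREDENTIAL_KEYWORDS):
    """Pair every firm or domain name with every credential-related keyword."""
    return [f"{name} {keyword}" for name in names for keyword in keywords]


def code_search_url(phrase):
    """Build a GitHub code-search API URL for one phrase.

    Only the URL is constructed here; sending the request requires an
    authenticated token and is left to the reader.
    """
    return "https://api.github.com/search/code?" + urlencode({"q": phrase})
```

Calling `build_search_phrases` with a firm's name and domain name(s) yields a list of phrases that can also simply be typed into GitHub's search box by hand, which is all the approach described in this section requires.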
Once logged in to a Microsoft Office365 or Google G Suite environment,
Ursem is often able to see everything an employee sees: contracts,
user data, internal agendas, internal documents, emails, address books,
team chats, and more.
In the past year and a half, Ursem has contacted more than 400 entities
to alert them to leaks. Without going into much detail, these entities
include some of the biggest brands and companies in the world ranging
from Fortune 500, publicly traded companies to private sector, from
banking to entertainment, and from commercial to governmental.
For the findings reported in this paper, Ursem quickly found plenty of
exposed protected health information, but it would take him months
— and in some cases, the assistance of DataBreaches.net — to get
some of these leaks secured.
Spoiler alert:
You can think of that part as ‘The Good, the Bad, and the Downright
Infuriating’.
MISUSE OF GITHUB
Rather than just making general claims that bad things can happen,
we thought it might motivate some entities more if we tell you about
a few actual cases involving criminals finding and misusing leaked
credentials.
GNOSTICPLAYERS
In 2019, prolific threat actors known as GnosticPlayers would also
capitalize on the ability to gain information and access via GitHub. More
than three dozen firms were hacked because the threat actors were
able to brute force access to repositories and then use that access to
search for employee credentials to valuable databases. In many cases,
entities had no idea they had been hacked until the media contacted
them to ask about their data being up for sale on ‘dark web’ forums
and marketplaces.
[Sidebar: Uber paid a $100,000 ransom that they tried to cover up by
having the threat actors sign non-disclosure agreements. Eventually,
it all came out and Uber was fined $148 million for covering up the
breach that compromised the personal information of 57 million riders
and hundreds of thousands of drivers. Chappell, Bill: Uber Pays $148
Million Over Yearlong Cover-Up Of Data Breach. Sept. 27, 2018.]
RANSOM ATTEMPTS
Glover, Mereacre, and GnosticPlayers are not the only examples of
threat actors misusing GitHub. In 2019, we also saw a rash of attacks on
repositories similar to what we had seen with misconfigured MongoDB
installations years earlier: attackers gained access to repositories,
wiped out their data, and then demanded ransom to restore the data.
In other recent cases, databases have just been wiped out maliciously
and replaced with a “meow”. Almost 4,000 such databases were wiped
out in quick succession in July, 2020.
[Sidebar: Cimpanu, Catalin: A hacker is wiping Git repositories and
asking for a ransom. May 3, 2019. Retrieved July 12, 2020.]
[Sidebar: Franceschi-Bicchierai, Lorenzo: Someone Is Hacking GitHub
Repositories and Holding Code Ransom. May 19, 2019. Retrieved July 12,
2020.]
SHINYHUNTERS
In May, 2020, threat actors calling themselves ‘ShinyHunters’ emerged
to offer what they claimed was 500 GB of data from Microsoft’s own
private repositories. Their description and claims were not totally
accurate, and investigation by Microsoft revealed that for the most part,
the data they had acquired were sample projects and code snippets
that had been intended for publication anyway. But ShinyHunters had
accessed Microsoft private repositories, and that was enough to help
them start to establish a reputation that the press would pay attention
to. They subsequently disclosed numerous other hacks and data dumps.
[Sidebar: Abrams, Lawrence: Microsoft’s GitHub account hacked, private
repositories stolen. May 6, 2020. Retrieved July 12, 2020.]
FINDINGS
“GitHub search is the most dangerous hacking tool out there”
— Jelle Ursem
In this section, we describe nine leaks discovered on GitHub by Ursem.
[Sidebar: We originally intended to report on four leaks, but every
time Ursem went on GitHub, he seemed to find more leaks. We finally
called a halt to his searching for now so that we could get this
report out.]
1. XYBION
Xybion is a software, services and consulting company with a presence
in workplace health issues. In February, 2020, Ursem discovered
that one of their developers had left some code in a public repository
that provided a system user's username and password. That code, in
conjunction with other exposed code, gave Ursem full access to one
of Xybion’s billing backoffices, including data on almost 7,000 patients
and more than 11,000 health insurance claims. The data seemed to
have been publicly available since October 31, 2018.
Ursem first tried to contact the company himself. When they did not
respond to his notification, and concerned that PHI was involved, Ursem
gave the information to a well-known IT news reporter. In June 2020,
when Ursem discovered that the data were still vulnerable to access, he
contacted DataBreaches.net for notification help. When DataBreaches.net
was also unable to get a response from Xybion, Dissent called one
of Xybion's clients to alert them to the situation and to suggest that
they contact Xybion to urge them to call DataBreaches.net. The next
day, Xybion called DataBreaches.net and the data was secured.
Nor do we know if they have notified any of the patients whose PHI was
exposed and/or if they have notified the U.S. Department of Health &
Human Services or any state regulators.
[Figure: Xybion’s Billing Dashboard, showing how much money flows
through the system, amount of paid and rejected insurance claims,
billing sources and providers.]
2. MEDPRO BILLING
The earliest exposed files on the SFTP server appeared to date from
2015. The GitHub repositories have been online since 2016.
Email requests concerning named patients' PHI were not the only con-
tents of the exposed mail account, however. The firm's email account
appeared to have been compromised by spammers.
“It appears that this system was set up once, the application
started running, and then this mailbox was probably never
looked into again. There should have been a lot of monitoring
for a system that processes PHI. Instead, there seemed to be
none,” Ursem tells DataBreaches.net.
[Figure: An example of French spam sent out via medprosystems.net]
To this day, however, MedPro has not contacted Ursem to get more
information about their security issues. One of their clients did call
DataBreaches.net back to thank us, and to say that when they called
MedPro about the concerns, MedPro tried to say that the leak was the
client’s fault. To be clear: this leak was the result of the developer’s
public repository exposing credentials that permitted access to some
of MedPro’s sensitive information. What Ursem found was in no way
the fault of MedPro’s clients, although we hope that in the future they
will remember this incident and audit any vendor’s or business asso-
ciate’s security a bit more.
3. TEXAS PHYSICIAN HOUSE CALLS
[Figure: Highly redacted screenshot of a patient list Ursem was able to access.]
[Figure: Risecorp.com HTML contained (an invalid) link to a gambling site]
But even more concerning was some malware Ursem found: malware
that, due to poor incident response, is still active on one of their
live servers.
MALWARE
Ursem also discovered that they had unwillingly and most likely
unknowingly integrated and uploaded malware into their codebase at
two spots. “ALL of the client data that has lived on this server
since at least June 13, 2017 should be considered compromised”,
Ursem asserted, adding, “How can you NOT detect you've been
hacked for 3 whole years?”
[Figure: Windows Defender recognized the malware, which left us
wondering: Are the developers running any antivirus at all?]
4. VIRMEDICA
Ursem first discovered their leak in February. He noticed that the code
had been up since January 2018 and attempted to notify them by phone
and via their website ‘contact us’ form. He received no response at
the time, and after alerting an IT journalist to the leak in case that
journalist wanted to pursue it, Ursem did not attempt to notify VirMedica
again until June, when he realized that protected health information
was still exposed.
[Sidebar: This is a useful example of why entities should respond to
notifications, however cautiously. As dedicated as Ursem is to
responsible disclosure, he was busy with other things and after they
failed to respond to his first notification attempt, he didn’t even
think about them again until months later. How many threat actors
might have found the leak and exploited it during that time?]
FTP login credentials were publicly available, as were large .csv and
Excel files. Headers from some of the files indicated that they contained
numerous types of protected health information, including demographic
information on patients, their diagnoses, health insurance information,
provider information, and insurance subscriber information. Other files
were in .pdf format and also contained insurance information.
“including but not limited to Patient Name (First, and Last); DOB;
Medication; Primary Diagnosis; CPT Codes; Health Plan Policy
Numbers, and Medicare Beneficiary Identifiers (in some instances).
There were no Social Security Numbers in the SSN field. Many other
fields found in the data set were not routinely populated or were
not populated at all. Data accessed by the security researcher
also included de-identified test data that appeared identifiable.”
5. MAINECARE
On June 19, while looking into another leak, Ursem discovered a leak
involving MaineCare, a program that is state- and federally-funded to
provide healthcare coverage to Maine residents. Discovering that the
leak exposed approximately 75,000 individuals' personal and protected
health information and also gave him unexpected administrator rights
to their entire website, Ursem immediately called the state to alert them.
While the state and its contractor secured the data and systems,
we note that almost two months later, there is still no contact
information on MaineCare’s site that indicates how to responsibly
disclose security concerns.
6. WAYSTAR
As we completed the review of the firm’s findings, the Waystar
team agreed we should connect with you to describe the steps we
had taken and engage with you to ensure we have remediated
correctly the items you also may have knowledge of.”
As they would learn when they finally connected with Ursem, there was
a lot more they needed to do. And what was this bit about credentials
circulating on the ‘dark web’? We were not even aware of those.
Waystar’s incident is a cautionary tale about the need to investi-
gate attempts at responsible disclosure. If we had given up trying
to alert them to their problems, the consequences to Waystar, its
customers, and its patients might have been severe.
7. SHIELDS HEALTH CARE GROUP
“I'm sure you hear this all time, but we truly take security very
seriously at this company. We appreciate you tipping us off, it was
a good wake up call. Not only do we need to worry about making
sure we are following best practices and keeping up with industry
standards, but we also have to make sure that our vendors are
doing the same. Especially when patient data is involved.”
8. ACCQDATA
Before we had figured out that the repositories Ursem had found
probably belonged to AccQData, Ursem had contacted Availity by
phone on August 7 to alert them to the exposure. The first-line support
employee from Availity handled the report professionally, gathered all
the information that was important, and promised to relay the infor-
mation to the security team, thereby providing a text-book example
of how it should be done.
This was not the first threat Dissent had received in the process of
trying to alert entities to their data leaks, and while we have no fear
of the FBI, we do not take kindly to people defaming us — especially
after we went out of our way to help them. So Dissent sent an email
to AccQData.
9. ‘UNNAMED ENTITY’
[At the time of publication, one of the leaks has yet to be secured.
For that reason, we are not naming the entity, although we describe
the incident.]
While digging further through the repository, Ursem also found that the
developer had included a .sql file with the full name, address, birthdate
and other identifying properties of 40 children that may have been the
initial sample of pediatric patients registered into the system.
appeared that in 2015, someone used the login credentials, retrieved
the source code, sql database structure, and data and posted it all on
GitHub. The director had reached out to GitHub to ask them to delete
the data, and when we last heard from him, was awaiting their response.
THE ‘TYPHOID MARY OF DATA LEAKS’
How much damage can one developer do? A lot — if he is good at getting
himself hired and keeps repeating the same mistakes as he goes from
employer to employer without ever deleting his old or no-longer-needed
repositories. In Ursem’s research, he came across a few names who
were ‘repeat offenders’ when it came to exposing access credentials
in public repositories. We will only tell you about one — the one we
think of as the ‘Typhoid Mary of Data Leaks’.
To give you a preview of where this is going: when Ursem was trying to
wrap his head around the scope of the MedPro billing leak mentioned
earlier in this report, he found so many things wrong that it was hard
to know where to start reporting on it. It seemed that if there was any
way this developer could do something wrong or mess something up,
he would. And he seemed to be surprisingly unaware that everything
he was doing was visible to others. Even after GitHub hit him with a
DMCA takedown request for an ebook he improperly shared back
in 2018, he continued to expose everything. If that takedown notice
wasn’t a wake-up call that others could see all his work, we don’t know
what would be.
- Storing 800 MB SQL backups with PHI from MedPro’s clients’
  patients on GitHub
- Wrongful attribution of a client that wasn’t even a client, leading
  Ursem and DataBreaches.net on a wild goose chase tracking down
  an innocent party, and making the non-client look like they
  foolishly hired a developer who would expose their login credentials
  publicly
- Exposing access to the telephone central system for a large entity
  in debt collection
- Exposing an estimated 200,000+ PII and PHI records (he had
  non-PHI disasters as well as PHI disasters)
- Uploading credentials for a web application error tracker that also
  logged PII
- Exposing credentials that lead to highly sensitive records for people
  with a history of substance abuse
- Apparently not adding sensitive systems to audit and monitoring
  controls
Even though we have not been in direct contact with this person, we
know that the message of his wrongdoings has sunk in with either him
or his current employer now as he made all his repositories private on
June 22, 2020. Sadly, some of them had already been cloned by third
parties and remain publicly available as silent witnesses to the havoc he
caused. Thankfully the cloned repositories do not include PHI and the
passwords inside them have been changed. Ursem and DataBreaches.net
are still trying to get the last of the fallout of his actions plugged
and are trying to make sure that the companies affected are aware of
the fact that their data has been out in the open for sometimes years
on end.
KEY FINDINGS AND RECOMMENDATIONS
FINDINGS
Of the nine entities in this report, three were health care providers, one
was a health plan, and the remainder were business associates or in
third-party relationships. All three healthcare providers informed
us that the developers in their case were contractors or employees of
their business associates.
We note that even had they used private repositories, threat actors can
credential stuff or brute force their way into private repositories where
they can then search for access credentials for corporate accounts.
At least three of the nine entities intentionally did not respond to early
notification attempts and would later claim that they had been fearful the
notifications were a social engineering attack. Their failure to respond
left PHI exposed even longer and they risked never finding out about
their leaks had we given up after the first or second attempt. None of
the three apparently made any attempt to google or investigate either
Ursem or DataBreaches.net to determine if we might be legitimate.
Those who have not grown up in the U.S. may not fully appreciate our
laws or the language of their instructions concerning keeping medical
and personal data highly secured and private. We are not suggesting
laziness or callous disregard of privacy or security on any developers’
part. There may be a language problem or lack of training in the need
to keep personal and sensitive information secure and confidential.
RECOMMENDATIONS
Security is never perfect, but entities can improve their security posture
and incident response.
- Provide a way for researchers to responsibly disclose security
  incidents: Create a 'disclosure@yourcompany' e-mail address on
  your Contact / About us webpage that's monitored by your CISO,
  CSO or CTO, or MSP/IT provider.
- Train employees and especially your first-line support and social
  media team on procedures for escalating notifications they receive.
- Teach them how to avoid a phishing attempt, but make sure they
  always escalate it to have it investigated by someone who has the
  skills to determine credibility of the communication.
- Regularly search GitHub for your firm's name and domain name(s).
  Even if you do not use a developer, one of your business associates
  or vendors might.
- Regularly force password changes and do not allow password
  reuse. If you have objections to this recommendation, at least
  rotate passwords used by former employees after they leave.
- Lock down connections by IP address. Is there really a need for
  your webservice or secure FTP site to communicate with the whole
  world? Or do you just want to open the door for a couple specific
  people?
- Use 2FA or MFA for every third party service you use that supports
  it.
- Make sure to enforce admin approval of devices used for MFA and
  that every account gets logged in to at least once when enabling
  it. Ursem has encountered more than one case where he would
  have been able to establish his own phone number as the MFA
  authenticator.
- Require developers to use private repositories; prohibit public
  repositories. Recognize, however, that attackers may attempt to
  brute force or credential stuff to gain access to private repositories.
  So:
  - Never allow developers to embed passwords or authentication
    tokens in code repositories;
  - Prohibit the use of real (production) data in GitHub repositories;
    and
  - When developers terminate, ensure that their repository(ies)
    is/are deleted.
- Vet your business associates. Do they hold themselves to any
  information security standards like ISO-27001?
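
As one illustration of the recommendation to lock down connections by IP address, the following minimal sketch checks a client address against an allowlist using Python's standard `ipaddress` module. The networks shown are reserved documentation ranges, and the names `ALLOWED_NETWORKS` and `is_allowed` are hypothetical. In practice this rule usually belongs in a firewall or in the SFTP server's own configuration rather than in application code.

```python
import ipaddress

# Hypothetical allowlist: one office network and one partner host.
# Both ranges are reserved for documentation and stand in for real ones.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True only when client_ip parses and lies inside an allowed network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # Malformed input is denied by default.
    return any(address in network for network in networks)
```

The sketch denies by default: anything that is not explicitly listed, including input that does not parse as an address, is refused.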
Even if you do all of the above, incidents may occur, so make sure
you have a way for people to contact you to alert you. And then be
prepared to respond promptly and appropriately.
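
The recommendation never to embed passwords or tokens in code repositories can be backed by an automated check before code is pushed. What follows is a rough sketch under stated assumptions: the two regular expressions are illustrative and will miss many credential formats, the function names are hypothetical, and established secret-scanning tools are far more thorough.

```python
import re
from pathlib import Path

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = [
    # Assignments such as password = "..." or api_key: '...'
    re.compile(r"(?i)\b(password|passwd|pwd|secret|api_?key|token)\b\s*[:=]\s*['\"][^'\"]{4,}['\"]"),
    # FTP/SFTP URLs that carry user:password@ inline
    re.compile(r"(?i)\b(ftp|sftp)://[^\s:/@]+:[^\s@]+@"),
]


def find_secrets(text):
    """Return (line_number, line) pairs for lines that look like hard-coded credentials."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def scan_tree(root):
    """Scan every readable file under root, skipping anything inside .git."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and ".git" not in path.parts:
            try:
                hits = find_secrets(path.read_text(errors="ignore"))
            except OSError:
                continue  # Unreadable files are skipped, not treated as clean.
            if hits:
                findings[str(path)] = hits
    return findings
```

Running `scan_tree` over a working copy returns a mapping from file path to suspicious lines. A result with no findings means only that these two patterns found nothing, not that the repository is free of credentials.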
DON’T SHOOT THE MESSENGER
But what all entities need to know is that when some people irrespon-
sibly threaten those trying to alert them to a problem, they discourage
others from trying to be helpful. Threats do have a chilling effect for
many people. While there are those who might want to see us be
understanding of people accusing us falsely because they are upset
or misunderstood, we prefer to take a firm line that shooting the mes-
senger is not acceptable, period.
If you want researchers to let you know when there is a problem that
you need to address, then reinforce responsible disclosure. Most
researchers we know do not expect any bug bounty, reward, mention
on your website, or even swag (although Ursem assures DataBreaches.
net that most researchers would definitely appreciate it). But don’t
underestimate the value of a thank-you note. Even those of us who
are not marketing our services still like hearing that our volunteer
efforts to help secure data are appreciated.
CONCLUSION
In this paper, we presented nine examples involving leaks of PHI, but the
problems we describe and the recommendations apply to other sectors
as well. While it took Ursem only minutes for each case to find a total of
a couple hundred thousand PHI records, it took a lot of time — in many
cases, months — to get entities to respond to attempts to responsibly
notify them of the leaks. Not one of these entities had a clearly posted
means to contact them to report a data security concern. We tried phone,
e-mail, Facebook, Twitter, LinkedIn, and even reached out to associates
or clients if need be to relay the notification. Most people will not persist
like that, and shouldn't need to. Regardless of what sector you work in,
facilitating responsible disclosure is critical to your security.
There are undoubtedly many more leaks that can be found on GitHub, and
we know that at least some threat actors are already using GitHub as a
way to find login credentials in repositories. It is no longer sufficient to just
search Google or Shodan or BinaryEdge for your firm's data or to search for
signs of your firm’s data on the dark web. You also need to search GitHub.
ABOUT US
JELLE URSEM
is a Developer / Devops engineer by day and ethical security researcher
by night. Hailing from The Netherlands, he cheerfully disregards the
yellow tape US researchers have to work around to not get sued and
opts to directly inform companies that are at risk of being hacked or
extorted. Over the past 1.5 years, he has accumulated over 400, mostly
privately handled, responsible disclosures to his name. To contact Jelle
about his research or this report, send an e-mail to jelle[at]esctunes.
com. You can follow Ursem on Twitter @SchizoDuckie.
DATABREACHES.NET
is a non-commercial blog created in 2008 to report on leaks and data
breaches. While much of the site’s more than 27,000 posts represent
news aggregation, the site also includes original reporting and com-
mentary by ‘Dissent Doe, PhD’, a licensed mental health professional.
Dissent’s reporting and watchdog complaints with federal agencies
have resulted in enforcement action in a number of cases. Dissent can
be reached via e-mail to breaches[at]databreaches[dot]net. You can
follow Dissent on Twitter @PogoWasRight.