One of the biggest intelligence failures in American history came in August 1978, when the CIA concluded that “Iran is not in a revolutionary or even pre-revolutionary situation.” The Islamic Revolution followed in 1979. The agency lacked the right intelligence. Meanwhile, in universities and research facilities around the globe, tech’s best and brightest were working on the Internet, which now captivates, captures, and connects us in some shape or form. Fast forward to today: corporate and information security teams should not suffer from the same lack of intelligence.
Open-source intelligence (OSINT) is the collection and analysis of publicly available data and the extraction of knowledge from it. Intelligence agencies such as the CIA have long relied on it to stay informed on foreign Points of Interest. Working with open-source data on Points of Interest (POI) is a constant in security teams’ strategies to identify threats and manage risk to their organizations. If your organization needs information about a POI, such as a person, organization, domain, or location, you will need the right OSINT techniques, sources, and tools to make effective use of it. This is how security teams develop better strategies for managing risk to their assets.
In this blog post series, we will look at a number of different types of POI, including People, Organizations, and Domains. Our focus today: People. The simple techniques we will cover illustrate how anyone, including bad actors, can take advantage of open-source intelligence. In large-scale security operations, these same OSINT techniques can be automated and streamlined to speed up POI investigations using specialized platforms like Media Sonar.
POI Workflows for People Investigations
Step 1: Start with what you know
Investigations are the transition between “what we know” and “what we want to know.” Our Point of Interest in this case is a Person of Interest, and what you know might include a real name, a username, or an email address.
If you already know the real identity of the person, you can use their first and last name as keywords to start your search. Keep in mind that searching for a person by their real name can deliver a lot of false positives. Pairing the real name with additional information, such as a location, cuts down on irrelevant results.
The easiest way to identify a person’s username is to search with the information you already have, such as their real name or email address, and look for the handles that appear in the results. The username might be a combination of their first and last name, or it might be consistent with a domain name they own. The type of username a person uses can vary depending on how easily they want to be located and identified.
As with the real name, searching by username can bring up a lot of false positives. You might try pairing the username with whatever you know of the real name, such as the first name, the last name, or both.
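One way to work from a real name toward likely usernames is to generate candidate handles and then search for each one. The sketch below is illustrative only: the function name `candidate_usernames` and the handful of patterns it produces are assumptions, not an exhaustive or authoritative list of how people build usernames.

```python
def candidate_usernames(first, last):
    """Return common handle patterns built from a first and last name."""
    f, l = first.strip().lower(), last.strip().lower()
    if not f or not l:
        # Both parts are needed to build the combined patterns below.
        return []
    patterns = [
        f + l,        # first name followed by last name
        f + "." + l,  # separated by a dot
        f + "_" + l,  # separated by an underscore
        f[0] + l,     # first initial plus last name
        f + l[0],     # first name plus last initial
        l + f,        # last name followed by first name
    ]
    # Preserve order while dropping any duplicates.
    return list(dict.fromkeys(patterns))


if __name__ == "__main__":
    for handle in candidate_usernames("Jane", "Smith"):
        print(handle)
```

Each candidate still needs to be verified against other details you know about the person, since common names produce handles that many different people use.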
Consider yourself lucky if you have an email address to search with. Unlike a real name or a username, an email address is unique to that person and will not deliver the same level of false positives. You are far more likely to land on what you need if you have their email address. As with usernames, a person can have more than one email address, and some people have many.
Step 2: Search available data sources
There are countless legal and public data sources available for open-source data collection. Finding the right data from the right sources can be time-consuming and is largely a manual process, but even the most basic tools are better than nothing.
A common search engine can be used to initiate an investigation and to discover data or new sources. Google is the most popular. Google’s advanced search queries are often referred to as “Google dorks,” and they are used to return precise results for data collection. An entire collection of these queries has been catalogued in the Google Hacking Database, and they are used to conduct OSINT searches in both security operations and hacking operations.
Here are a few notable ones to get started:
Searching for a real name:
- Search string: “jane smith” site:twitter.com Expected Result: Find exact matches for that name on Twitter.
- Search string: “jane” “smith” -site:twitter.com Expected Result: Find pages that contain both the first and last name, in any combination, and exclude Twitter from the results.
Searching for a username:
- Search string: inurl:janesmith site:twitter.com Expected Result: Find URLs on Twitter that contain “janesmith”.
- Search string: allinurl:jane smith ny site:twitter.com Expected Result: Find Twitter URLs that contain the words “jane”, “smith”, and “ny”. Note: Similar to inurl but supports multiple words.
Searching for an email address:
- Search string: “@example.com” site:example.com Expected Result: Find email addresses published on a given domain.
- Search string: HR “email” site:example.com filetype:csv | filetype:xls | filetype:xlsx Expected Result: Find HR contact lists on a given domain.
- Search string: site:example.com intext:@gmail.com filetype:xls Expected Result: Find spreadsheets on a given domain that contain Gmail addresses.
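When the same searches are run for many names, it can help to assemble the search strings in code rather than by hand. The following is a minimal sketch, assuming Python and using only the operators listed above; the helper names `build_dork` and `search_url` are hypothetical, and the code uses straight double quotes for exact-match terms.

```python
from urllib.parse import quote_plus


def build_dork(terms, site=None, exclude_site=None, inurl=None, filetypes=()):
    """Assemble a search string from exact-match terms and operators."""
    parts = ['"' + term + '"' for term in terms]
    if inurl:
        parts.append("inurl:" + inurl)
    if site:
        parts.append("site:" + site)
    if exclude_site:
        parts.append("-site:" + exclude_site)
    if filetypes:
        # Group the alternatives so OR applies only to the file types.
        alternatives = " OR ".join("filetype:" + ft for ft in filetypes)
        parts.append("(" + alternatives + ")")
    return " ".join(parts)


def search_url(query):
    """URL-encode a query so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    query = build_dork(["jane smith"], site="twitter.com")
    print(query)
    print(search_url(query))
```

The sketch only builds strings and links for an analyst to open. It does not send any requests itself.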
Ultimately, though, search engines only let you search the Surface Web, which is often estimated to make up about 4% of the Internet, and they do not include Deep and Dark Web data.
Beyond Google, other search engines can return different results. DuckDuckGo, a popular search engine in the OSINT community, can be queried in a similar way.
People Search Sites
Search engines are built to return universal results. POI information can be located with them, but it is tricky and requires some digging and reading. People search websites, for example spokeo.com and beenverified.com, can also be used to quickly search for people using a real name, username, email, or phone number. These websites allow individuals to opt out and have their listed information removed, but other sites will often show the same information, having obtained it from the same data set. It might be necessary to search multiple sites to get all the data you need about a person. Keep in mind that people change locations and phone numbers all the time, an email address might be out of date, and the information could have been manipulated, either deliberately or systematically. Nonetheless, people search sites really help speed up the search for a POI.
When data breaches occur, the exposed information can also be a source of POI intelligence. Sites like haveibeenpwned.com strip out the passwords and associate each email address with the breaches it was involved in. This allows the security analyst or investigator to identify how many times a POI’s email was included in a breach, how many times their passwords were captured, and the types of services and platforms that person is using or has used in the past. This can be an important indicator of the digital neighborhood the person frequents, and if a person’s information turns up multiple times, their identity may be vulnerable.
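Once breach results for an email address have been collected, the counts described above are simple to compute. The sketch below is a minimal illustration in Python: the record layout (`service`, `year`, `exposed`) and the sample entries are hypothetical and do not reflect the output format of any particular breach lookup site.

```python
from collections import Counter

# Hypothetical records for illustration only; not real breach data.
SAMPLE_BREACHES = [
    {"service": "ExampleForum", "year": 2016, "exposed": ["email", "password"]},
    {"service": "ExampleShop", "year": 2019, "exposed": ["email", "address"]},
    {"service": "ExampleChat", "year": 2021, "exposed": ["email", "password"]},
]


def summarize_breaches(records):
    """Summarize how often an address appears and what was exposed."""
    exposed = Counter(item for record in records for item in record["exposed"])
    return {
        "breach_count": len(records),
        "password_exposures": exposed["password"],
        "services": sorted(record["service"] for record in records),
    }


if __name__ == "__main__":
    print(summarize_breaches(SAMPLE_BREACHES))
```

The list of services is the part that hints at the person’s digital neighborhood, while the counts give a rough sense of how exposed the identity is.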
Understanding the digital neighborhoods your POI frequents is important. Social activity is helpful because it comes with context and points to their behaviors, viewpoints, and activities. Understanding the behavior of the POI requires special insight, because not everything will be as it appears. On the other hand, if a POI is deemed a potential threat, their social activity might be a good sign of when they are active and when a threat might escalate. If your investigation needs to go beyond mere identification, then activity across social platforms, discussion sites, blogs, and forums will weigh heavily in it.
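To get a rough sense of when a POI is active, the timestamps of collected public posts can be grouped by hour. This is a minimal sketch, assuming Python and ISO 8601 timestamps that all share one time zone; the function name `active_hours` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def active_hours(timestamps, top=3):
    """Return the most common posting hours (0-23) from ISO 8601 timestamps."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [hour for hour, _count in hours.most_common(top)]


if __name__ == "__main__":
    # Hypothetical post times for illustration only.
    sample = [
        "2023-03-01T22:15:00",
        "2023-03-02T22:40:00",
        "2023-03-02T23:05:00",
        "2023-03-04T09:30:00",
        "2023-03-05T22:10:00",
    ]
    print(active_hours(sample, top=1))
```

A pattern like this is only a starting point. A small sample, scheduled posts, or mixed time zones can all make the picture misleading.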
Step 3: Capture and analyze the data
How you capture and preserve your POI data will affect how it can be used later. If it will be used to make business decisions or in a court setting, it will be held to a higher standard than information used internally by a team. Tracking back after the fact can be nearly impossible, especially if something has been deleted at the source. Capture the data (a screen capture can work) and record where it was found, along with the date and time it was saved. You might also want to capture the steps you took to find the information. To avoid headaches later, find out before getting started exactly how you need to use the information now and how it might need to be used in the future. Needless to say, scaling up can present challenges. Media Sonar takes the work out of these steps by automating them with a case management approach to your OSINT data collection and capturing your investigative workflows.
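The capture details described above (what was saved, where it was found, when, and how you got there) can be recorded in a small structured record. The sketch below is one possible approach in Python, not a legal or forensic standard; the `EvidenceRecord` class, its field names, and the `record_capture` helper are assumptions made for illustration. The SHA-256 hash is added so that a saved capture can later be shown to be unchanged.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EvidenceRecord:
    source_url: str       # where the item was found
    content_sha256: str   # fingerprint of the captured bytes
    captured_at: str      # UTC time of capture, ISO 8601
    steps: list = field(default_factory=list)  # how the item was found


def record_capture(source_url, content, steps=None):
    """Hash the captured bytes and note where and when they were saved."""
    return EvidenceRecord(
        source_url=source_url,
        content_sha256=hashlib.sha256(content).hexdigest(),
        captured_at=datetime.now(timezone.utc).isoformat(),
        steps=list(steps or []),
    )


if __name__ == "__main__":
    # Placeholder bytes stand in for a saved screen capture.
    record = record_capture(
        "https://example.com/profile",
        b"placeholder screenshot bytes",
        steps=["Searched the real name with a location", "Opened the profile link"],
    )
    print(record)
```

In practice, `content` would be the bytes read from the saved screen capture file, and the record would be stored alongside it.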
At this point in an investigation, you might have a combination of screen captures, notes, and images, maybe more, and it’s time to find order in it. You hopefully have an idea of how the data is connected, but without visualizing it you cannot be sure, and separate pieces are difficult to analyze in isolation. A mind map can be helpful: sketch one out if necessary to plot the key data, where you found it, and how it is related. This type of visualization will help you analyze, correlate, and validate your data. It promotes analytical thinking and can help counter the bias of focusing too much on one piece of data over others.
These steps are where manual OSINT techniques generally fall short. While it is possible to do this manually, it requires considerable effort and a high level of analytical thinking. Given the speed and level of resources most organizations operate with, it is not a feasible long-term solution. Therein lies the value of Media Sonar. These connections are plotted as you conduct your investigation and captured by the Pathfinder feature. At this point, all you need to do is analyze the data.
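The same relationships that a mind map shows on paper can be held in a simple graph, which makes it easy to ask which findings are actually connected. This is a minimal sketch in Python using a plain adjacency structure and a breadth-first search; the function names and the sample links are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Build an undirected graph from (item_a, item_b) pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_to(graph, start):
    """Return every item reachable from start, using breadth-first search."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}


if __name__ == "__main__":
    # Hypothetical findings for illustration only.
    links = [
        ("jane smith", "janesmith"),
        ("janesmith", "jane@example.com"),
        ("jsmith_ny", "other@example.org"),
    ]
    print(connected_to(build_graph(links), "jane smith"))
```

Items that never connect back to the starting point, such as the second username in the sample, are candidates for further validation before they go into a report.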
Step 4: Generate a POI report
The Internet is the go-to source for information about people. It can be time-consuming to sift through the Internet to identify and capture the digital footprint of a person, but public Internet data is still the easiest way to passively research and vet third parties, new employees, and bad actors. Depending on the purpose of your investigation, not all information will be particularly valuable. Generally, you will need to include the personal information you were able to obtain, such as your POI’s email address, real name, username, age, CV, phone number, location, education, and career.
Here are some additional findings you might need to include in your report:
- Economic situation
- Relationships – family, friends, contacts
- Tendency toward crime
- De-anonymization and sock puppets
- Activity on the web
- Places lived, places visited
- Network information
It is often at this stage that any gaps become most apparent. It might be necessary at this point to check the data, check your assumptions, and search anew. You may have missed capturing that one thing, or forgotten to write down the source of a piece of information. This is the problem Media Sonar aims to solve with our platform. By automating the steps of a POI investigation, we can effectively take care of the mundane tasks of capturing each little detail. When it is time to generate a report, our corporate and information security customers are ahead of the game.
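A simple completeness check before writing the report can surface these gaps early. The sketch below is a minimal illustration in Python; the required fields (`value`, `source`, `captured_at`), the `find_gaps` helper, and the sample findings are all hypothetical choices, not a prescribed report format.

```python
REQUIRED_FIELDS = ("value", "source", "captured_at")


def find_gaps(findings):
    """Return (label, missing_fields) for findings that lack required details."""
    gaps = []
    for label, details in findings.items():
        # A field counts as missing if it is absent, None, or empty.
        missing = [name for name in REQUIRED_FIELDS if not details.get(name)]
        if missing:
            gaps.append((label, missing))
    return gaps


if __name__ == "__main__":
    # Hypothetical findings for illustration only.
    findings = {
        "email": {
            "value": "jane@example.com",
            "source": "https://example.com/staff",
            "captured_at": "2023-03-01T10:00:00+00:00",
        },
        "username": {"value": "janesmith", "source": "", "captured_at": None},
    }
    for label, missing in find_gaps(findings):
        print(label, "is missing:", ", ".join(missing))
```

Any finding flagged this way is a prompt to go back, confirm the source, and capture it properly before the report is finalized.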
Automating POI Investigations
Capturing the digital footprint of a POI manually, as described in this blog post, is time-consuming and not always effective. Automated tools can help make connections between the data that might be missed otherwise. Plus, there is a steep learning curve when it comes to open-source intelligence and considerable experience is often required.
The Media Sonar platform bundles together the tools and access to datasets in one place to help automate POI workflows. With best-in-class digital footprint features and advanced search functions, queries and filters, access to data sources across social, Deep and Dark Web, and specialized OSINT checks, the Media Sonar platform is developed to help you investigate POI in corporate and information security environments.
Don’t miss Part 2 of our OSINT Techniques for Security POI series, where our focus will be on Organizations. Learn concepts and techniques for POI investigations, focusing on vetting and monitoring third parties such as partners, customers, and vendors.