Pandas Series.str.findall() Method



The Series.str.findall() method in Python Pandas is used to find all occurrences of a pattern or regular expression within each string in the Series or Index. This method is equivalent to applying re.findall() to all elements in the Series/Index.

The method returns a Series or Index of lists, where each list contains all non-overlapping matches of the pattern found in the corresponding string. It is useful for extracting every occurrence of a pattern from each string in a Pandas Series, an Index, or a DataFrame column.
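The equivalence with re.findall() and the meaning of "non-overlapping" can be checked with a small sketch. The sample strings below are made up for illustration; the string 'aaa' contains two overlapping occurrences of 'aa', but only one is reported because scanning resumes after the end of each match.

```python
import re
import pandas as pd

# Illustrative data, not taken from the examples in this article
s = pd.Series(['aaaa', 'aaa'])

# Vectorized call through the .str accessor
via_pandas = s.str.findall('aa').tolist()

# The same work done element by element with the standard library
via_re = [re.findall('aa', text) for text in s]

print(via_pandas)
print(via_re)
```

Both lists should be identical, since the pandas method applies the same matching to every element.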

Syntax

Following is the syntax of the Pandas Series.str.findall() method −

Series.str.findall(pat, flags=0)

Parameters

The Series.str.findall() method accepts the following parameters −

  • pat − A string representing the pattern or regular expression to be searched for.

  • flags − An optional integer, default 0 (no flags). Accepts flags from the re module, such as re.IGNORECASE, that modify the pattern matching behavior.
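As a minimal sketch of the flags parameter, the following compares a case-sensitive search with one that passes re.IGNORECASE. The input strings are chosen only for illustration.

```python
import re
import pandas as pd

# Illustrative data with mixed upper and lower case
s = pd.Series(['Tutorials', 'articles', 'TesT'])

# Default behavior: only lowercase 't' matches
case_sensitive = s.str.findall('t')

# With re.IGNORECASE, both 'T' and 't' match
case_insensitive = s.str.findall('t', flags=re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```

Several flags can be combined with the bitwise OR operator, for example re.IGNORECASE | re.MULTILINE.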

Return Value

The Series.str.findall() method returns a Series or Index of lists of strings. Each list contains all non-overlapping matches of the pattern or regular expression found in the corresponding string. If no matches are found, an empty list is returned for those elements.
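Because the method follows re.findall(), the shape of each list item depends on the capture groups in the pattern: with one group the list holds the group's text, and with several groups it holds tuples. The sketch below illustrates this on made-up data, and also includes a missing value, which is expected to stay missing in the result rather than become an empty list.

```python
import pandas as pd

# Illustrative data; the last element is a missing value
s = pd.Series(['a1 b2', 'c3', None])

# No capture group: each item is the whole match
no_group = s.str.findall(r'[a-z]\d')

# One capture group: each item is the text of that group
one_group = s.str.findall(r'[a-z](\d)')

# Two capture groups: each item is a tuple of the groups
two_groups = s.str.findall(r'([a-z])(\d)')

print(no_group)
print(one_group)
print(two_groups)
```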

Example 1

This example demonstrates finding all occurrences of the substring 't' in each string element in a Series.

import pandas as pd

# Create a Series of strings
s = pd.Series(['tutorials', 'articles', 'Examples'])

# Find all occurrences of the substring 't' in each string
result = s.str.findall('t')

print("Input Series:")
print(s)
print("\nOccurrences of 't':")
print(result)

When we run the above code, it produces the following output −

Input Series:
0    tutorials
1     articles
2     Examples
dtype: object

Occurrences of 't':
0    [t, t]
1       [t]
2        []
dtype: object

An empty list [] indicates that there are no occurrences of the pattern in the element.
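Since empty lists mark elements without a match, the length of each list can be used to count matches or to filter rows. The following is a small sketch of that idea using the same Series; the variable names are only illustrative.

```python
import pandas as pd

s = pd.Series(['tutorials', 'articles', 'Examples'])

matches = s.str.findall('t')

# .str.len() also works on lists and gives the number of matches per element
match_counts = matches.str.len()

# Boolean mask that is True where at least one match was found
has_match = match_counts > 0

print(match_counts)
print(s[has_match])
```

When only the count is needed, Series.str.count('t') gives the same numbers without building the lists.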

Example 2

This example demonstrates finding all occurrences of a pattern using a regular expression. Here, we look for all substrings starting with 't' followed by any character.

import pandas as pd

# Create a Series of strings
s = pd.Series(['tutorials', 'testing', 'test cases'])

# Find all substrings starting with 't' followed by any character
result = s.str.findall(r't.')

print("Input Series:")
print(s)
print("\nOccurrences of pattern 't.':")
print(result)

When we run the above code, it produces the following output −

Input Series:
0     tutorials
1       testing
2    test cases
dtype: object

Occurrences of pattern 't.':
0    [tu, to]
1    [te, ti]
2    [te, t ]
dtype: object

The output shows one list per input string. Each list item is a two-character substring made up of 't' and the character that follows it; in the last row, that character is a space.
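If one match per row is easier to work with than a list per row, the result can be flattened with Series.explode(). This is an optional follow-up step, sketched here on the same data; the index of each output row points back to the string the match came from.

```python
import pandas as pd

s = pd.Series(['tutorials', 'testing', 'test cases'])

# One row per match; the original index label is repeated for each match
flat = s.str.findall(r't.').explode()

print(flat)
```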

Example 3

This example demonstrates applying the Series.str.findall() method to a DataFrame column. We search each email address in the 'Email' column for the domain 'tutorialspoint.com'.

import pandas as pd

# Create a DataFrame 
df = pd.DataFrame({
    'Email': ['user1@example.com', 'info@tutorialspoint.com', 'contact@website.org']
})

# Find all occurrences of the literal text 'tutorialspoint.com' in the 'Email' column
# (the dot is escaped so that it does not act as a regex wildcard)
result = df['Email'].str.findall(r'tutorialspoint\.com')

print("Input DataFrame:")
print(df)
print("\nOccurrences of 'tutorialspoint.com':")
print(result)

When we run the above code, it produces the following output −

Input DataFrame:
                     Email
0        user1@example.com
1  info@tutorialspoint.com
2      contact@website.org

Occurrences of 'tutorialspoint.com':
0                      []
1    [tutorialspoint.com]
2                      []
Name: Email, dtype: object

The output shows that the pattern 'tutorialspoint.com' is found in the second email address only.
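Note that pat is always interpreted as a regular expression, so a '.' in the pattern matches any character unless it is escaped. The sketch below shows the difference using re.escape() from the standard library; the second address is a made-up string included only to expose the wildcard behavior.

```python
import re
import pandas as pd

# The second value is an artificial string, used only to show the difference
emails = pd.Series(['info@tutorialspoint.com', 'info@tutorialspointXcom'])

# Unescaped: '.' matches any character, so 'X' is accepted too
unescaped = emails.str.findall('tutorialspoint.com')

# Escaped: re.escape() makes every character in the pattern literal
escaped = emails.str.findall(re.escape('tutorialspoint.com'))

print(unescaped)
print(escaped)
```

Escaping by hand, as in r'tutorialspoint\.com', has the same effect for this pattern.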
