Functions
Text
REPLACE
How to Use Excel's REPLACE, SUBSITUTE Function in Pandas
There are several formulas in Excel designed to help you replace characters in a string.
The REPLACE function in Excel replaces characters in a string by location.
The SUBSTITUTE function in Excel replaces chracters in string by finding matching substrings.
Use Mito's REPLACE, SUBSITUTE function
Mito is an open source library that lets you write Excel formulas in Python. Either write the formula directly in Python or use the REPLACE, SUBSITUTE formula in the Mito Spreadsheet and generate the equivalent Python code automatically.
Mito's REPLACE, SUBSITUTE function works exactly like it does in Excel. That means you don't need worry about managing data types, handling errors, or the edge case differences between Excel and Python formulas.
Install Mito to start using Excel formulas in Python.
# Import the mitosheet Excel functions
from mitosheet.public.v3 import *;
# Use Mito's REPLACE, SUBSITUTE function
# Note: No need to convert the column to a string first
# because Mito's SUBSTITUTE formula does so automatically
df['column'] = SUBSITUTE(df['column'], 'old_text', 'new_text')
Implementing the replace function in Pandas#
Pandas offers the replace method which is versatile and can be used for a wide range of replacements. Here are some common implementations:
Replacing characters from the front of a string#
Excel's REPLACE function allows you to replace a substring by specifying the starting position and the number of characters to be replaced.
In pandas, you can achieve this using the string slicing syntax. For example, to replace the first three characters of a string with 'ABC' in Excel, you'd use the formula =REPLACE(A1, 1, 3, "ABC"). In Pandas you'd use:
# Replace the first three characters with 'ABC'
df['column_name'] = 'ABC' + df['column_name'].str[3:]
Replacing characters from the end of a string #
To replace the last three characters from the end of a string in Excel, you'd use the formula =REPLACE(A1, LEN(A1) - 2, 3, "ABC"). In Pandas you'd use:
# Replace the last three characters with 'ABC'
df['column_name'] = df['column_name'].str[:-3] + 'ABC'
Replacing characters from the middle of a string #
To replace characters from the middle of a string in Excel, you'd use the formula =REPLACE(A1, 3, 3, "ABC"). In Pandas you'd use:
# Replace three characters starting from the 3rd position with 'ABC'
df['column_name'] = df['column_name'].str[:2] + 'ABC' + df['column_name'].str[5:]
Substituting specific values in a column#
To replace specific values in a column in Excel, you'd use the 'Find & Replace' feature or the SUBSTITUTE formula instead of the REPLACE functionality.
Instead of replacing characters by location, the SUBSTITUTE formula replaces characters by finding matching substrings. To do the same in Pandas, you can use the replace method.
# Replace old_text with new_text
df['column_name'] = df['column_name'].replace(old_text, new_text)
Substitute multiple values in a column using a mapping#
To replace multiple values, you can pass a dictionary to the replace method. The keys of the dictionary will be replaced by the corresponding values.
For example, to replace 'NYC' with 'New York City' and 'LA' with 'Los Angeles', you could use the following code:
# Replace multiple values using a dictionary
df['column_name'] = df['column_name'].replace({
'NYC': 'New York City',
'LA': 'Los Angeles'
})
Common mistakes when using REPLACE, SUBSITUTE in Python#
The replace method in pandas is powerful, but certain pitfalls might produce unexpected results. Here are some of the common mistakes and how to address them.
Incorrect data types#
The string slicing method that we use above only works on string columns. If you try to use it on a numeric column, you'll get an error.
To avoid this, you can convert the column to a string using the astype method.
# Convert a column to a string before replacing characters
df['column_name'] = df['column_name'].astype(str).str[3:]
Handling missing values#
If you having missing values in your column, the string slicing method will return an error. To avoid this, you can use the fillna method to replace missing values with an empty string.
# Replace missing values with an empty string
df['column_name'] = df['column_name'].fillna('').str[3:]
Replacing by value#
The string slicing method is useful when you want to replace a specific number of characters from the front, end or middle of a string. However, if you want to replace a specific value, you can use the replace method directly.
To learn more about the replace method, refer to the SUBSITUTE page.
Understanding the replace Formula in Excel#
In Excel, you can use the REPLACE function to replace specific characters in a string by location. For example, to replace the first three characters of a string with 'ABC', you'd use the formula =REPLACE(A1, 1, 3, 'ABC').
REPLACE, SUBSITUTE Excel Syntax
Parameter | Description | Data Type |
---|---|---|
old_text | The original text or the text which contains the characters you want to replace. | string |
start_num | The starting position of the character in old_text you want to replace. The first position in text is 1. | number |
num_chars | The number of characters in old_text that you want to replace. If omitted, it will replace all characters from start_num to the end of old_text. | number |
new_text | The text string that will replace the characters in old_text. | string |
Examples
Formula | Description | Result |
---|---|---|
=REPLACE("Hello World", 7, 5, "Pandas") | Replaces the text 'World' with 'Pandas' starting at the 7th position. | Hello Pandas |
=REPLACE("123456", 2, 3, "ABC") | Replaces three characters starting from the 2nd position with 'ABC'. | 1ABC56 |
=REPLACE("ABCDE", 1, 0, "X") | Inserts 'X' at the 1st position without replacing any characters. | XABCDE |
=REPLACE("ABCDE", 3, 2, "XY") | Replaces two characters starting from the 3rd position with 'XY'. | ABXYE |
Don't re-invent the wheel. Use Excel formulas in Python.
Install MitoDon't want to re-implement Excel's functionality in Python?
Edit a spreadsheet.
Generate Python.
Mito is the easiest way to write Excel formulas in Python. Every edit you make in the Mito spreadsheet is automatically converted to Python code.
View all 100+ transformations →