Functions
Math
SUMIF
How to Use Excel's SUMIF Function in Pandas
Excel's SUMIF function is commonly used to sum the values of numeric cells that meet some predefined conditions. Often, its used to group and sum values based on a single condition, but it can also be used to sum values based on multiple conditions.
This page provides several common examples of how to use replicate the SUMIF function in Python with pandas.
Implementing the Sum with Condition function in Pandas#
To use SUMIF in Python, you can use conditional expressions, like 'df['A'] > 5', combined with the `sum` method to sum the values that match the condition.
Summing Based on a Single Condition#
In pandas, you can sum the values of a column based on a condition from another column using a simple comparison and the `sum` method.
For example, the Excel formula =SUMIF(A:A, ">5", B:B) counts the values in column B where the corresponding values in column A are greater than 5. In pandas, you can implement the same functionality with the following code:
# Sum the values from 'B' where values in 'A' are greater than 5
total_sum = df.loc[df['A'] > 5, 'B'].sum()
Creating a table using SUMIF#
You can also use the SUMIF function in Excel to create a table of values based on a condition.
For example, let's say that you have a spreadsheet with a list of remaining cars for sale at your car dealership. You want to find the value of the unsold inventory by car manufacturer. You could use the SUMIF function to create a new table that has two columns: Manufacturer and Inventory Value. The Manufacturer column would contain the unique manufacturer names, and the Inventory Value column would contain the sum of the remaining inventory for each manufacturer.
You can replicate this same behavior in Python by building a pivot table that groups the data by manufacturer and then sums the inventory values in each bucket.
# Create sample dataframe
df1 = pd.DataFrame({
'Manufacturer': ['Honda', 'Honda', 'Honda', 'Tesla', 'Tesla'],
'Color': ['White', 'White', 'Black', 'Blue', 'White'],
'Price': [35000, 29999, 42000, 45000, 135000],
})
# Create a new table that has two columns: Manufacturer and Price
df2 = df1.pivot_table(index=['Manufacturer'], aggfunc={'Price': 'sum'})
# Rename the Price column to Inventory Value
df2 = df2.rename(columns={'Price': 'Inventory Value'})
# Reset the index
df2 = df2.reset_index()
Creating a table using SUMIF with condition#
Let's say that instead of just wanting to group the data by manufacturer, you are now interested in findinding the inventory value of white cars only.
You can implement this in Python by applying a filter before creating the pivot table.
# Create sample dataframe
df1 = pd.DataFrame({
'Manufacturer': ['Honda', 'Honda', 'Honda', 'Tesla', 'Tesla'],
'Color': ['White', 'White', 'Black', 'Blue', 'White'],
'Price': [35000, 29999, 42000, 45000, 135000],
})
# Create a copy of the data first to preserve the original data
temp_df = df1.copy()
# Filter the data to only include rows with white cars
temp_df = temp_df[temp_df['Color'] == 'White']
# Create a new table that has two columns: Manufacturer and Price
df2 = temp_df.pivot_table(index=['Manufacturer'], aggfunc={'Price': 'sum'})
# Rename the Price column to Inventory Value
df2 = df2.rename(columns={'Price': 'Inventory Value'})
# Reset the index
df2 = df2.reset_index()
Common mistakes when using SUMIF in Python#
While using pandas to replicate the SUMIF function
Not resetting the index#
When you create a pivot table in pandas, the index of the resulting dataframe is the column that you grouped by. In the example above, the index of df2 is the Manufacturer column.
To make it easier to work with the data, you may want to reset the index so that the Manufacturer column becomes a regular column in the dataframe. You can do this by calling the `reset_index` method on the dataframe.
# Reset the index
df2 = df2.reset_index()
Understanding the Sum with Condition Formula in Excel#
The SUMIF function in Excel takes three arguments: a range of cells you want to evaluate against the criteria, the criterion to apply before calculating the sum, and the range of values you want to sum
=SUMIF(range, criteria, sum_range)
SUMIF Excel Syntax
Parameter | Description | Data Type |
---|---|---|
range | The range of cells you want to check against the criteria. | range |
criteria | The condition that determines which cells to sum. | conditional |
sum_range | The range of cells you want to sum if the criteria is met. | range |
Examples
Formula | Description | Result |
---|---|---|
=SUMIF(A1:A10, ">5", B1:B10) | Sums the values in range B1:B10 where the corresponding cells in range A1:A10 are greater than 5. | Sum of cells in B1:B10 where A1:A10 > 5 |
=SUMIF(A1:A10, "banana", B1:B10) | Sums the values in range B1:B10 where the corresponding cells in range A1:A10 are the word 'banana'. | Sum of cells in B1:B10 where A1:A10 is 'banana' |
=SUMIF(A1:A10, "*banana*", B1:B10) | Sums the values in range B1:B10 where the corresponding cells in range A1:A10 contain the word 'banana' anywhere in the text. | Sum of cells in B1:B10 with 'banana' in A1:A10 text |
Don't re-invent the wheel. Use Excel formulas in Python.
Install MitoDon't want to re-implement Excel's functionality in Python?
Edit a spreadsheet.
Generate Python.
Mito is the easiest way to write Excel formulas in Python. Every edit you make in the Mito spreadsheet is automatically converted to Python code.
View all 100+ transformations →