Advanced Filter - 26 Ways
Advanced Filter - 26 Ways
Before diving into the different ways Excel provides of filtering data we will look at the sample dataset
that we will use for the rest of the chapter. The data set is a modified version of a table from the
Northwind database that Microsoft has used for several years in its examples. The modifications to the
data specifically make the already fairly rich dataset more useful in understanding Excel’s filter
capabilities. Also, the modifications replace the long string “Northwind Traders” that starts every
product name with the shorter “NT.” A sample of the first few rows of data together with all the column
headers is in Figure 1.
The table provides information about the status of each item in each order placed by customers. When
a customer places an order, it is given a unique Order ID. Each product is uniquely identified
by Product. If an order includes multiple products, there is one row in the table for each combination of
Product & Order ID. For each order-product combination, the table tracks the order date ( Order Date),
the product ship date (Shipped Date), which will be empty if the product is not yet shipped, the
customer name (Customer), the quantity ordered (Quantity), and the unit price (Unit Price).
Figure 1
We will explore a range of management questions that we can answer with Advanced Filter in support
of business operations.
Advanced Filter
Advanced Filter provides many features for data-based filtering that are not available in AutoFilter.
Because Advanced Filter works with the data and not the formatting, the color-related capabilities of
Autofilter are outside the scope of an advanced filter. At the same time, Advanced Filter delivers to the
Excel consumer a very powerful filter and extract tool. For those who have used SQL (or have heard of
SQL but have been uncertain about learning it), Advanced Filter represents an easy and Excel-friendly
way to tap into some of the power of SQL.
To access the Advanced Filter dialog box (Figure 2) select any cell in the source range and click the Data
tab | Sort & Filter group | Advanced button. This dialog box is deceptively simple given that it actually
packs a lot of capability.
Figure 2
Use the choices under Action to either “Filter the list, in-place” (very much like what AutoFilter does) or
to extract a Copy to another location of the selected data. I find the latter (i.e., copy to another
location) to be the more useful feature since it leaves the original data untouched and all the rows
remain visible. It also makes it possible to simultaneously view the results of different queries and to
save the results of different queries for future review.
The List range field specifies the Excel range containing the source data. In our example it is Sheet1!
A1:G60.
The Criteria range is optional. This might seem strange since shouldn’t a criteria be required to filter
data? But, it turns out that leaving this out can be a powerful mechanism. Also, when specified, and
unlike AutoFilter, the criteria are entered in a worksheet range. This, particularly when combined with
copying the result to another location, allows one to “save” both the criteria and the corresponding
result for future review. It also allows one to ask extremely sophisticated questions, since the content of
the criteria range can be any Excel formula.
When the selected action is Copy to another location, the Copy to field allows one to specify where Excel
should save the results. Use this field to specify which columns to include in the output.
Finally, the Unique records only checkbox indicates that the result should have no duplicate results,
where duplicates are defined as having the same value in every output column.
Filter in-place or Copy to another location
As mentioned above, I am not a proponent of “Filter the list, in-place.” I have a strong preference for
the alternative, Copy to another location’ and it is how advanced filter is used in this document.
However, if one does filter a data source in-place the way to remove the advanced filter is to click
the Data tab | Sort & Filter group | Clear button.
There are two ways to specify the criteria for filtering or extracting data. This section addresses the one
that is probably easier to understand. Duplicate the names of the columns of the source data in a single
row in some empty range. Then, in the row under it, enter the different conditions. Conditions entered
in the same row are ‘AND’ed together while conditions on separate rows are ‘OR’ed. This is possibly
easier to understand with a few examples.
This requires specifying a single value for the Product column as shown in Figure 3. The dialog box also
shows that the result will be copied to the range starting with J8.
Figure 3
The result, shown in Figure 4, lists all the product orders that relate to NT Dried Plums.
Figure 4
List all the orders for NT Dried Plums where the quantity is greater than 15.
The requirement translates to the criteria: Product=”NT Dried Plums” and Quantity >30. Since this is an
AND condition, the two individual conditions will be on the same row. Also, it is enough to include only
those column headers that contain a condition as in Figure 5.
Figure 5
The result is in Figure 6.
Figure 6
List all the orders for NT Dried Plums where the quantity is greater than 15 and less than or equal to 30.
In this case, the conditions are Product=”NT Dried Plums” and Quantity > 30 and Quantity <=30.
Remember that ANDed conditions have to be on the same row and there are two conditions that
involve Quantity! Luckily, Advanced Filter allows columns to be duplicated as in Figure 7.
Figure 7
List the Order ID, Order Date, and Company Name for all orders for NT Dried Plums where the quantity
is greater than 15 and less than or equal to 30.
This filter criteria is the same as in the previous example except that the output list should include only a
subset of 3 specific columns. To accomplish this the Copy to field should specify only the headers of the
desired columns as in Figure 8. If the header names do not match those of the source, Excel will raise an
error.
Figure 8
The last few examples have shown how to specify various elements of the Advanced Filter dialog box.
Subsequent examples will focus on the conditions in the cells and may skip showing the contents of the
Advanced Filter dialog box.
Since the criteria involves an OR condition, the 2 values should be on different rows as in Figure 9.
Figure 9
List the Product, Order ID, and Quantity for all orders for quantities greater than 10 and for either Dried
Plums or Dried Pears.
Remember that AND conditions go in a single row and OR conditions go in separate rows. In this case,
Excel will evaluate each row independentaly and then check the OR part. So, the correct way to write
the logical requirement is (Quantity > 10 and Product=”NT Dried Plums”) or (Quantity >
10 and Product=”NT Dried Pears”) as in Figure 11
Figure 11
While previous searches also involved text fields what was not evident was that Excel actually matches
values that start with the specified text. This becomes more obvious when the search is for “NT
Chocolate,” which shows results for both NT Chocolate and NT Chocolate Biscuits Mix.
Figure 12
The way to search for an exact match is to enter the search string as a formula as shown in Figure 13,
which shows the formula entered in J80.
Figure 13
For all tea related products, list the product, the order id and the order date.
Tea related products are defined as having “tea” somewhere in the product name. Advanced Filter
works with wildcard characters such as * and ?. To search for any occurrence of tea in the name of a
product, use *tea*. [As in other context and other languages, the ? character allows a single character
to be anything.]
Figure 14
Example 10: Text search with 2 wildcards
List the product, the order id and the order date for all products with names like t-any character-a.
The required condition includes a wildcard pattern, i.e., Product=*t?a*. The result includes, in additio to
the various tea products, NT Almonds, in which the T-space-A combination satisfies the t?a pattern.
Figure 15
This is similar to the earlier conditions and it involves simply typing in the characters >6/1/2012. Excel
will correctly interpret the content of the cell as a date (see Figure 16).
Figure 16
This is also similar to the previous example of 2 conditions on the quantity ordered (Example 3). Simply
enter the conditions in the 2 cells as shown in Figure 17.
Figure 17
List all products that remain unshipped. Ensure that the output is verifiable.
An unshipped product will have an empty Shipped Date. Adapt the earlier suggestion for finding an
exact match (Example 11) except this time specify nothing in the search expression as in Figure 18. Also,
to ensure a verifiable output, include the Shipped Date column. Obviously, it should contain only empty
cells.
Figure 18
The way to enter a formula as the filter criteria is similar to and yet somewhat different from the above
examples. The important items to remember are:
1. In the criteria range the row containing the column headers must be blank.
2. Relative cell references to the data source must refer to the 1 st data row in the source range.
3. The formula must yield a TRUE or FALSE result. That is how Excel decides whether to include a
particular row in the result or not.
List all products that remain unshipped. Ensure that the output is verifiable.
This is the same as Example 12 but now instead of a “trick” to specify an empty cell, the formula is
straightforward and easy to understand (Figure 19). As mentioned earlier, the criteria range must have
a blank first row and the formula must refer to the 1 st row with data.
Figure 19
List all product orders that shipped more than 7 days after the order was place.
The condition requires checking if the difference between the product order date and shipped date is
more than 7 days. [This specifically excludes unshipped orders.] Figure 21 shows both the criteria and
the result.
Figure 21
List all product orders that remain unshipped or were shipped more than 7 days after the order was
placed.
This essentially is Example 14 combined with Example 15. The formula is shown in Figure 22.
Figure 22
As of July 6, 2012, list all product orders that were ordered more than 7 days ago and remain unshipped.
This is similar to the previous example except that it involves an ‘and’ condition as well as a specific date
in the formula (rather than comparing two dates that are already in the table). The formula and the
associated result are in Figure 23. As should be expected the result should match Example 14 with the
exception of the last 2 orders (order 79 and order 80), which were placed on June 30, 2012, i.e., 6 days
before July 6, 2012.
Figure 23
Example 18: A calculated field as a condition – II
This requires calculating the revenue from the existing columns. The revenue is the quantity (column F)
multiplied by the unit price (column G), which makes the problem similar to Example 15 (which required
calculating the difference between two columns). The criteria and the result are in Figure 24.
Figure 24
List all product orders where the entire order generated revenues in excess of 2,000.
This is similar to the previous example except that now the decision is based on the entire order. An
order that includes multiple products will have one row for each product in the source range.
Consequently, the filter criteria must sum the revenue for each product in the order corresponding the
each row. In the formula below, the first argument of the SUMPRODUCT is --($B$2:$B$60=B2). The part
in parenthesis generates a list of 59 TRUE or FALSE values, one for each row in the source data, based on
whether the row is part of the current order or not. The double minus signs convert the TRUE/FALSE
values in 1/0 values. The other arguments to the SUMPRODUCT include the quantity and the unit price.
Consequently, Excel will SUM the PRODUCT of the 1s or 0s with the quantity and the unit price. This will
yield the total revenue for the current order.
As one might expect, it includes all of the results from the previous example and also a few more. None
of the individual products in Order 46 met the 2,000 threshold but as a complete order it exceeds 2,000
in revenue. One of the products in Order 58 was included in the previous result and as a result the
entire order exceeds the 2,000 threshold. Consequently, the other smaller product order in Order 58 is
now included in the result. For the criteria and the result see Figure 25.
Figure 25
Example 20: Compare the row ‘value’ relative to the overall data set
List all product orders where the quantity ordered is more than 50% greater than the average quantity.
In this example, the comparison will be between each row of data and a statistical measure applied to
the entire data set. What is required is to test whether the quantity is more than 1.5 * average quantity
ordered per product. This translates to the formula and result in Figure 26.
Figure 26
Advanced Filter criteria must be a regular formula and not an array formula
Array formulas, for those who know about them, represent a powerful computation capability within
Excel. Unfortunately, the Advanced Filter criteria range cannot contain such a formula. If it does, Excel
will return no results. Consequently, it becomes necessary to find a workaround, when possible, to the
use of an array formula in a filter criteria formula.
List all product orders where the revenue is more than 50% greater than the average revenue.
This looks very much like Example 20 but it requires the use of a calculated field. The most obvious way
might be the formula =(F2*G2)>1.5*AVERAGE($F$2:$F$60*$G$2:$G$60). This will yield a #VALUE! error
since it must be entered as an array formula.
After than is done, and Advanced Filter is given this criteria range, the result will be no rows in the
output.
Obviously, not every array formula has an equivalent non-array formula and in those cases Advanced
Filter will not provide the desired functionality.
The one element of the Advanced Filter dialog box that has remained unused has been the Unique
records only checkbox. Select it to ensure that the result contains only unique records, where
uniqueness means that no 2 rows have the same value in all the output columns.
This enables a set of useful results that identify unique values in various columns in the source data. For
example, a list of every product that has been ordered at least once or a list of orders and their order
dates.
It is also possible to specify a criteria together with the unique records only option to get concise results
that satisfy the desired condition.
This is probably the easiest way to create a list of unique items in any column. When the Advanced
Filter output range specifies a single column header, Excel will create a unique list of all the items in that
column. This is an excellent example of the benefit of leaving empty the filter criteria field (see Figure
28).
Figure 28
Create a unique list of dates when orders were placed and the corresponding orders.
The ability to create a unique list extends to multiple columns. When more than 1 column is involved,
uniqueness is defined as having a different value when all columns are taken into account.
Consequently, the same value may appear multiple times in the one column (as the date in Figure 29)
but the values across all the columns together will be unique.
Figure 29
List all customers affected by unshipped orders placed earlier than 7 days prior to July 6, 2012.
This is similar to Example 17 except that the result should be a unique list of affected customers as in
Figure 30.
Figure 30
Some of the formulas used as filter criteria referred to the entire column of data using absolute
references (e.g., $G$2:$G$60). Obviously, these formulas would require revision if the source data
range were to expand.
There are two ways to create a reference that dynamically adjusts itself. One is the use an Excel table.
The other is to use a named formula that creates a dynamic range reference. Given how flexible Excel
tables are, we will look at only that way of creating a dynamic reference. To convert the same data set
into an Excel table, select any cell in the data source, then select the Insert tab | Tables group | Table
button. In the example workbook, the TableFilter worksheet contains the source in table form. The first
few rows in the table look like Figure 31.
Figure 31
List all product orders where the entire order generated revenues in excess of 2,000.
This is the same problem as Example 19. As always, it’s easiest to create the formula using the mouse to
click, point, and drag as appropriate. Selecting all the data in a column will result in Excel converting the
range reference into a table reference. Note the difference in the dialog box source data field as well as
in the formula in the criteria cell (Figure 32). The result is identical to that in Example 19.
Figure 32
Figure 33
Working with the source list in another workbook is similar. Start with the destination worksheet in the
destination workbook. The rest of the mechanics of creating the criteria formula and the references in
the dialog box are the same. Obviously, the references are longer because, now, they also have to
include the workbook name as in Figure 34.
Figure 34
One final note on working with another workbook: Even when copying the data to another location, the
source workbook must be open for Advanced Filter to work.
Summary
This document provides what is hopefully a comprehensive review of the Advanced Filter
capability in Excel 2007 and above versions.