How to Filter and Remove Columns in KNIME

May 01, 2021

In today’s post we’re going to go over how to filter out and remove columns from your data tables. Filtering columns is a fundamental skill for business analysts, data analysts, data scientists and everyone in between. Whether you need to drop a column that’s no longer needed or one that never really held any useful detail, today we’ll learn how to filter columns out of your data set using the KNIME Analytics Platform.

Let’s get started!

The Data Set We Will Filter Columns From

If you’ve followed our video or blog post on how to unpivot data, then you might have noticed that the row IDs were unpivoted as well and ended up in a RowIDs column. We don’t really have any valuable data in that column, so we’ll use that table and filter out the redundant RowIDs column.

Figure: The data table we’re working with in KNIME
Figure: The RowIDs column we will filter out in KNIME

Find the Column Filter Node

To filter out columns from data tables in KNIME, we’ll use the Column Filter node. If you type Column Filter in the node repository, you should see the node pop up. Drag and drop the node onto your workflow and connect it to the data table that you want to filter columns from.

Figure: The Column Filter node in the KNIME node repository

Configuring the Column Filter Node in KNIME

Once you’ve got the node connected, double-click it to open its configuration. The configuration screen for the Column Filter node should look like the one below.

Figure: Configuring the Column Filter node in the KNIME Analytics Platform

The menu is relatively straightforward for today’s goal: filtering out the RowIDs column. Columns in the green (Include) section will be included in the output data table, while columns in the red (Exclude) section will be filtered out of it.

For our goal we simply need to double-click the RowIDs column so that it moves from the green section to the red section. Once that’s done, we can click Apply, then OK, and execute the node. The final output table should look like this:

Figure: How to filter columns in KNIME

That’s all there is to it! We’ve successfully filtered out a column from our data table.
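If it helps to think about the include/exclude lists in code terms, here is a minimal sketch of the same idea in plain Python. It is only an analogy, not how KNIME implements the node: the `filter_columns` helper is hypothetical, and the `ColumnNames` / `ColumnValues` columns and their values are made up for illustration.

```python
# Toy stand-in for the unpivoted table. The ColumnNames/ColumnValues
# columns and all values are assumptions made up for this example.
table = [
    {"RowIDs": "Row0", "ColumnNames": "Q1", "ColumnValues": 10},
    {"RowIDs": "Row1", "ColumnNames": "Q2", "ColumnValues": 12},
]


def filter_columns(rows, exclude):
    """Return new rows without the excluded columns (the 'red' list)."""
    excluded = set(exclude)
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in rows
    ]


# Move RowIDs to the exclude list; every other column stays included.
filtered = filter_columns(table, exclude=["RowIDs"])
print(list(filtered[0].keys()))  # ['ColumnNames', 'ColumnValues']
```

Just like the node, the sketch leaves the input table untouched and hands back a new table with the excluded column removed.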

While we used KNIME’s Column Filter node in a rather straightforward situation, the node does offer additional flexibility for less straightforward ones. For those cases, you might have noticed the three selection options (radio buttons) above the red and green panels. I’ve highlighted them in yellow in the screenshot below.

Figure: KNIME column filtering with regex and by column type

These options allow us to filter columns through manual selection, through wildcard/regex matching, and through column type criteria. Manual selection is the straightforward one we just used. The other two are much more useful for complex column filtering. Some complex filtering scenarios include:

  • when we’re looking to filter multiple similarly named columns all in one go. For example, when a loop appends new columns to your data set, we can use the wildcard/regex option to filter out the unwanted iteration columns.

  • when we only want to keep a specific type of column in our data set. For example, if we only want to keep numeric columns because we want to run a PCA, then the type selection option works best.
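To make those two scenarios a bit more concrete, here is a rough sketch of the same logic in plain Python. Treat it as an analogy rather than a description of what the node does internally: the `drop_matching` and `keep_numeric` helpers, the column names (including the "(Iter #n)" suffix) and the values are all assumptions made up for illustration.

```python
import re
from numbers import Number

# Hypothetical table: a loop appended "score (Iter #n)" columns.
# Column names and values are made up for this example.
table = [
    {"customer": "A", "score": 0.5, "score (Iter #1)": 0.4,
     "score (Iter #2)": 0.6, "visits": 3},
    {"customer": "B", "score": 0.7, "score (Iter #1)": 0.8,
     "score (Iter #2)": 0.9, "visits": 5},
]


def drop_matching(rows, pattern):
    """Drop every column whose full name matches the regex pattern."""
    regex = re.compile(pattern)
    return [
        {name: value for name, value in row.items() if not regex.fullmatch(name)}
        for row in rows
    ]


def keep_numeric(rows):
    """Keep only the columns whose values are numeric in every row."""
    numeric = [
        name for name in rows[0]
        if all(
            isinstance(row[name], Number) and not isinstance(row[name], bool)
            for row in rows
        )
    ]
    return [{name: row[name] for name in numeric} for row in rows]


# Scenario 1: remove all similarly named iteration columns in one go.
no_iterations = drop_matching(table, r".*\(Iter #\d+\)")
print(list(no_iterations[0]))  # ['customer', 'score', 'visits']

# Scenario 2: keep only numeric columns, e.g. before running a PCA.
numeric_only = keep_numeric(no_iterations)
print(list(numeric_only[0]))  # ['score', 'visits']
```

The takeaway is that one pattern or one type rule replaces a long manual selection, and it keeps working when the number of matching columns changes.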

Stay tuned for more detail on these two other filtering options and how to use them in your work. I’m working on another post about them, and I’ll link it here once it’s live!

I hope this post helped you learn how to filter columns from your data in KNIME. As always, if you have any questions or need anything clarified, don’t hesitate to reach me via DM on Twitter (@cest_nick). Don’t forget to share this post with any friends or colleagues who might find it helpful!

-Nick
