# Numpy Unique, Explained

This article is originally published at https://www.sharpsightlabs.com

This tutorial will explain how to use the Numpy unique function.

It will explain what the np.unique function does, how the syntax works, and it will show you clear examples.

## A Quick Introduction to Numpy Unique

The Numpy unique function is pretty straightforward: it identifies the unique values in a Numpy array.

So let’s say we have a Numpy array with repeated values. If we apply the np.unique function to this array, it will output the unique values.

Additionally, the Numpy unique function can:

- identify the unique *rows* of a Numpy array
- identify the unique *columns* of a Numpy array
- compute the *number* of occurrences of the unique values
- identify the *index* of the first occurrence of the unique values

So the Numpy unique function identifies unique values, rows, and columns, but can also identify some other information about those unique values.

## The syntax of np.unique

Now that I’ve briefly explained what the Numpy unique function does, let’s take a look at the syntax.

#### A quick note

In the syntax explanation here, and in the examples section below, I’m going to assume that you’ve imported Numpy with the following code:

```python
import numpy as np
```

This is the common convention for importing Numpy. It’s important though, because the exact form of the syntax will depend on how we import Numpy.

### np.unique syntax

The syntax is mostly straightforward.

We typically call the function as `np.unique()`, assuming that we’ve imported Numpy with the alias `np`.

Inside the parentheses, the first argument to the function will be the name of the array that you want to operate on.

In generic syntax such as `np.unique(arr)`, this is called `arr`, but here, you’ll actually use the name of your array. So if your array is called `my_array`, you’ll use the code `np.unique(my_array)`.

This input array is required.
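As a minimal sketch of the basic call, assuming a hypothetical array named `my_array` (the name and its values are illustrative only):

```python
import numpy as np

# Hypothetical input array with repeated values
my_array = np.array([3, 1, 3, 2, 1])

# The input array is the only required argument
unique_values = np.unique(my_array)
print(unique_values)
```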

Additionally though, there is a set of optional parameters that you can use to modify the behavior of the function.

### The parameters of np.unique

In this tutorial, we’ll look at four optional parameters of the np.unique function:

- `return_index`
- `return_counts`
- `axis`
- `return_inverse`

Let’s look at each of those.

`return_index` (optional)

When `return_index = True`, np.unique will also return the index of the first occurrence of each unique value.

This parameter is optional.

By default, this is set to `return_index = False`.

`return_counts` (optional)

When `return_counts = True`, np.unique will also return the number of times each unique value occurs in the input array.

This parameter is optional.

By default, this is set to `return_counts = False`.

`axis` (optional)

The `axis` parameter enables you to specify a direction along which to use the np.unique function.

If set to `axis = None`, the input array will be flattened before applying np.unique.

To learn more about the different axes (i.e., the “directions” along a Numpy array), you can read our tutorial about Numpy axes.

This parameter is optional.

By default, this is set to `axis = None`.
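To illustrate the flattening behavior, here is a small sketch that compares the default `axis = None` with `axis = 0`. The array name `grid` and its values are assumptions made up for this illustration.

```python
import numpy as np

# Hypothetical 2D array, used only for illustration
grid = np.array([[1, 2],
                 [1, 2],
                 [3, 1]])

# axis=None (the default): the array is flattened first,
# so the result contains unique individual values
flat_unique = np.unique(grid)

# axis=0: each row is treated as one item,
# so the result contains unique rows
unique_rows = np.unique(grid, axis=0)

print(flat_unique)
print(unique_rows)
```

With the default, the result is 1-dimensional even though the input is 2D. With `axis = 0`, the result keeps the row structure.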

`return_inverse` (optional)

If `return_inverse = True`, np.unique will also return an array of indices into the unique array, one for each element of the input array. These index values can be used to reconstruct the original array.

This parameter is optional.

By default, this is set to `return_inverse = False`.
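Since the examples below don’t cover `return_inverse`, here is a short sketch of how the inverse indices might be used to rebuild the input. The array `values` is a hypothetical 1D example, not part of the tutorial’s dataset.

```python
import numpy as np

# Hypothetical 1D input, used only for illustration
values = np.array([5, 5, 1, 4, 1])

unique_values, inverse = np.unique(values, return_inverse=True)

# Each entry of `inverse` is the position in `unique_values`
# of the corresponding element of `values`
reconstructed = unique_values[inverse]

print(unique_values)
print(inverse)
print(reconstructed)
```

Indexing the unique array with the inverse indices gives back an array equal to the original input.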

## Examples of how to use Numpy Unique

Now that we’ve looked at the syntax of the np.unique function, let’s look at some examples.

**Examples:**

- Get unique values from a 1D Numpy array
- Identify index of first occurrence of unique values
- Get the counts of each unique value
- Get the unique rows and columns

#### Run this code first

Before you run any of these examples, you need to run some code to import Numpy and to create a dataset.

##### Import Numpy

To import Numpy, run this code:

```python
import numpy as np
```

This will enable us to call Numpy functions with the prefix `np`.

##### Create Dataset

Now we’ll create a Numpy array.

Here, we’ll use the np.array function to create a 1-dimensional array.

```python
array_with_duplicates = np.array([5,5,1,5,4,5,1,5,3,5,1,3])
```

As you can see, the array has several duplicated values.

### EXAMPLE 1: Get unique values from a 1D Numpy array

First, let’s get the unique values from our 1D array, `array_with_duplicates`.

```python
# GET UNIQUE VALUES
np.unique(array_with_duplicates)
```

OUT:

```
array([1, 3, 4, 5])
```

##### Explanation

This is pretty simple.

The input array, `array_with_duplicates`, has the values `1`, `3`, `4`, and `5`, but they are duplicated and organized in random order.

When we apply the `np.unique()` function, the output is a Numpy array of the unique values. These unique values are sorted in ascending order.

### EXAMPLE 2: Identify index of first occurrence of unique values

Next, we’re going to get the unique values *and also* get the index of the first occurrence of each unique value.

To do this, we’ll use the `return_index` parameter.

```python
# GET UNIQUE VALUES, WITH INDEX OF FIRST OCCURRENCE
unique_values, first_occurrence_index = np.unique(array_with_duplicates, return_index = True)
```

Next, let’s print each of these output arrays.

```python
print('These are the unique values:')
print(unique_values)
print('These are the indexes of the first occurrence:')
print(first_occurrence_index)
```

OUT:

```
These are the unique values:
[1 3 4 5]
These are the indexes of the first occurrence:
[2 8 4 0]
```

##### Explanation

Here, we used `np.unique()` on our input array, and we set the parameter `return_index = True`.

This caused `np.unique()` to output two Numpy arrays:

- one array with the unique values (`unique_values`)
- another array with the index of the first occurrence of every unique value (`first_occurrence_index`)

Just remember: when you set `return_index = True`, `np.unique()` will output two arrays!
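One possible use of these indexes, sketched here as an illustration rather than as part of the original example, is to list the unique values in the order they first appear instead of in sorted order. The variable name `order_of_appearance` is an assumption introduced for this sketch.

```python
import numpy as np

array_with_duplicates = np.array([5,5,1,5,4,5,1,5,3,5,1,3])

unique_values, first_occurrence_index = np.unique(array_with_duplicates, return_index = True)

# Sorting the first-occurrence indexes and using them to index the
# original array gives the unique values in order of first appearance
order_of_appearance = array_with_duplicates[np.sort(first_occurrence_index)]

print(order_of_appearance)  # [5 1 4 3]
```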

### EXAMPLE 3: Get the counts of each unique value

Now, we’ll get the unique values *and* get the count of the number of occurrences of each unique value.

To do this, we’ll use the `return_counts` parameter.

```python
# GET UNIQUE VALUES, WITH COUNTS
unique_values, value_count = np.unique(array_with_duplicates, return_counts = True)
```

Next, let’s print each of these output arrays.

```python
print('These are the unique values:')
print(unique_values)
print('These are the counts of the unique values:')
print(value_count)
```

OUT:

```
These are the unique values:
[1 3 4 5]
These are the counts of the unique values:
[3 2 1 6]
```

##### Explanation

Here, we used `np.unique()` on our input array, and we set the parameter `return_counts = True`.

This caused `np.unique()` to output two Numpy arrays:

- one array with the unique values (`unique_values`)
- another array with the count of the number of occurrences of every unique value (`value_count`)

Again, when you set `return_counts = True`, `np.unique()` will output two arrays!
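Because the two output arrays line up position by position, they can be paired together. The following is a small sketch of that idea. The names `counts_by_value` and `most_common` are assumptions introduced here for illustration.

```python
import numpy as np

array_with_duplicates = np.array([5,5,1,5,4,5,1,5,3,5,1,3])

unique_values, value_count = np.unique(array_with_duplicates, return_counts = True)

# Pair each unique value with its count in a plain Python dictionary
counts_by_value = {int(v): int(c) for v, c in zip(unique_values, value_count)}

# The most frequent value sits at the position of the largest count
most_common = unique_values[np.argmax(value_count)]

print(counts_by_value)
print(most_common)
```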

### EXAMPLE 4: Get the unique rows and columns

Finally, let’s identify the unique rows and the unique columns of an array.

To do this, we’ll use the `axis` parameter.

##### Create 2D Array

To run this example, we first need to create a 2-dimensional array. So here, we’ll create a 2D array using the Numpy array function.

```python
dupe_array_2d = np.array([[1,2,1],[2,2,2],[1,2,1]])
```

And now, let’s look at it with a print statement:

```python
print(dupe_array_2d)
```

OUT:

```
[[1 2 1]
 [2 2 2]
 [1 2 1]]
```

So the array, `dupe_array_2d`, is a two-dimensional array with 3 rows and 3 columns.

If you look carefully, you’ll notice that the 1st and 3rd rows are the same. The 1st and 3rd columns are also the same.

##### Get unique rows and columns

Now that we have our array, let’s get the unique rows and unique columns.

To get the unique rows, we set `axis = 0`, and to get the unique columns, we set `axis = 1`.

```python
# GET UNIQUE ROWS
print('Unique rows:')
print(np.unique(dupe_array_2d, axis = 0))

# GET UNIQUE COLUMNS
print('Unique columns:')
print(np.unique(dupe_array_2d, axis = 1))
```

OUT:

```
Unique rows:
[[1 2 1]
 [2 2 2]]
Unique columns:
[[1 2]
 [2 2]
 [1 2]]
```

##### Explanation

This is somewhat straightforward, if you understand how axes work.

For a 2D array, axis-0 points downward and axis-1 points horizontally.

So when we set `axis = 0`, np.unique operates downward in the axis-0 direction. This causes it to identify the unique rows.

Similarly, when we set `axis = 1`, np.unique operates horizontally in the axis-1 direction. This causes it to identify the unique columns.

This is fairly simple once you understand how Numpy axes work. Having said that, many people are confused by Numpy axes. If you need help understanding how axes work, read our explanation of Numpy array axes.
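The `axis` parameter can also be combined with the other parameters. As a hedged sketch that goes slightly beyond the example above, here is how you might count how many times each unique row occurs. The names `unique_rows` and `row_counts` are assumptions introduced for this illustration.

```python
import numpy as np

dupe_array_2d = np.array([[1,2,1],[2,2,2],[1,2,1]])

# Combine axis = 0 with return_counts to count duplicate rows
unique_rows, row_counts = np.unique(dupe_array_2d, axis = 0, return_counts = True)

print(unique_rows)
print(row_counts)
```

Here, the row `[1 2 1]` appears twice in the input and the row `[2 2 2]` appears once.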

##### Leave your other questions in the comments below

Do you have other questions about the Numpy unique function?

If so, leave your questions in the comments section at the bottom of the page.

## For more Python data science tutorials, sign up for our email list

This tutorial should have given you a good understanding of the Numpy unique function.

But to learn data science in Python, you’ll need to learn a lot more about Numpy. In fact, you’ll need to learn about Pandas, and several other data science topics.

So if you want to learn Python data science, you should sign up for our FREE email list.

When you sign up, you’ll get free tutorials on:

- NumPy
- Pandas
- Base Python
- Scikit learn
- Machine learning
- Deep learning
- … and more.

We publish new tutorials every week, and when you sign up for our free email list, these tutorials will be delivered directly to your inbox.

The post Numpy Unique, Explained appeared first on Sharp Sight.
