Pandas DataFrame duplicated() Method

Check which rows are duplicates and which are not:

import pandas as pd

data = {
  "name": ["Sally", "Mary", "John", "Mary"],
  "age": [50, 40, 30, 40]
}

df = pd.DataFrame(data)

# The last row repeats ("Mary", 40), so only that row is marked True
s = df.duplicated()

print(s)

Definition and Usage

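The duplicated() method returns a Series of True and False values, one per row, that shows whether each row is a duplicate of another row in the DataFrame. By default, the first occurrence of a repeated row is marked False and every later occurrence is marked True.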

Use the subset parameter to specify which columns should be considered when looking for duplicates. If it is omitted, all columns are compared.
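As a minimal sketch of how subset changes the result, the example below compares the same DataFrame on all columns and on the "name" column only. The data is illustrative and differs slightly from the example above (the second "Mary" has a different age), so the two calls give different answers.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 45],
})

# All columns are compared: no two rows match completely.
all_columns = df.duplicated()

# Only the "name" column is compared: the second "Mary" is a duplicate.
name_only = df.duplicated(subset=["name"])

print(all_columns.tolist())  # [False, False, False, False]
print(name_only.tolist())    # [False, False, False, True]
```

A single column label can also be passed as a plain string, for example subset="name".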

dataframe.duplicated(subset, keep)

Both parameters are optional and can be passed as keyword arguments.

Parameter	Value	Description
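subset	column label(s)	Optional. A String, or a list, containing the columns to compare when looking for duplicates. By default, all columns are used
keep	`'first' 'last' False`	Optional, default 'first'. Specifies which occurrence is NOT marked as a duplicate. 'first' marks all duplicates True except the first occurrence, 'last' marks all True except the last occurrence. If False, ALL duplicates are marked True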
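The three keep settings can be compared side by side. This is a small sketch using the same data as the first example, where the rows at positions 1 and 3 are both ("Mary", 40); the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

# 'first' (the default): the first "Mary" row is kept as the original.
first = df.duplicated(keep="first")

# 'last': the last "Mary" row is kept as the original.
last = df.duplicated(keep="last")

# False: every row that has a duplicate is marked.
every = df.duplicated(keep=False)

print(first.tolist())  # [False, False, False, True]
print(last.tolist())   # [False, True, False, False]
print(every.tolist())  # [False, True, False, True]
```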

The method returns a Series with a boolean value for each row in the DataFrame. The Series has the same index as the DataFrame.
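Because the result is a boolean Series, it can be used as a mask to select or count rows. The sketch below shows one possible way to do that, again with the data from the first example; the names mask, duplicates and unique_rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

mask = df.duplicated()

# Rows marked True: the repeated ("Mary", 40) row at index 3.
duplicates = df[mask]

# Rows marked False: the same rows that drop_duplicates() keeps by default.
unique_rows = df[~mask]

# True counts as 1, so the sum is the number of duplicate rows.
count = int(mask.sum())

print(duplicates)
print(unique_rows)
print(count)  # 1
```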