September 30, 2024

Remove Duplicates from Array

Problem Statement

Given an array of integers, we have to remove duplicates from array such that each element in the array appears only once (all the values in the array should become unique).

Example

Input : [1,2,2,3,4,4,4,5,5]

Output : [1,2,3,4,5]

Explanation

As we can see the frequency of all the elements

Element	Frequency
1	1
2	2
3	1
4	3
5	2

After removing the duplicates, the frequency of all the elements became 1, so all the duplicate elements have been removed.

Element	Frequency
1	1
2	1
3	1
4	1
5	1

So the output is [1,2,3,4,5]

Constraints

1<=N<=10⁶
1<=A[i]<=10⁵

Approach 1 – Using Extra Space

This is the Brute-Force method to remove duplicates from array. In this method, we will be using extra space.

Algorithm:

Sort the array
Create a resultant array named res, to store the modified array
Run a loop from i=0 to i=n-2
if a[i]!=a[i+1], then push a[i] into res array
add the a[n-1] element into res array
Finally return the resultant array

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

void removeDuplicates(vector<int> a,int n){
    vector<int> res;
    sort(a.begin(),a.end());
    for(int i=0;i<n-1;i++){
        if(a[i]!=a[i+1])
            res.push_back(a[i]);
    }
    res.push_back(a[n-1]);
    for(auto x:res)
        cout<<x<<" ";
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    removeDuplicates(a,a.size());
}

Java Implementation

public class Main {
  
    public static void removeDuplicates(int a[], int n)
    {
        int[] res = new int[n];
        int j = 0;
        Arrays.sort(a);
        for (int i = 0; i < n - 1; i++) {
            if (a[i] != a[i + 1]) {
                res[j++] = a[i];
            }
        }  
        res[j++] = a[n - 1];
        for (int i = 0; i < j; i++) {
            System.out.print(res[i]+" ");
        }
    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    a.sort()
    res = list(range(n))
    j = 0;
    for i in range(0, n-1):
        if a[i] != a[i+1]:
            res[j] = a[i]
            j += 1
    res[j] = a[n-1]
    j += 1
    for i in range(0, j):
        a[i] = res[i]
    for i in range(0, j):
        print(a[i],end=" ")
a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

In the first step, we are sorting the array which takes $O(NlogN)$ time
Then we are running the loop from 1 to n-1 which takes $O(N)$ time

So the total worst case time complexity for this approach to remove duplicates from array is $O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach we are using the resultant array for storing the unique elements, so it uses $O(N)$ auxiliary space to remove duplicates from array.

Time Complexity: $O(NlogN)$

Auxilary Space: $O(N)$

Approach 2 – Using Constant Extra space

In this method, we will not use any auxiliary space, we will just modify the given array such that the k elements from the starting are the unique elements, and the rest are duplicates and we will return the value of k. So in this way we can remove duplicates from array.

Algorithm:

Sort the array
We know that the first element will always be unique, so we maintain a pointer named k, which will start from 1 and keep the track of the unique elements.
Run a loop from i=0 to i=n-2
If a[i]!=a[i+1], then a[k]=a[i] and k++
else continue;
add the a[n-1] element into res array
Finally return the value of k, which is now the total number of unique elements in the array, so we will print the elements present in the index from i=0 to i=k-1

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

int removeDuplicates(vector<int> &a,int n){
    int k=0;
    sort(a.begin(),a.end());
    for(int i=0;i<n-1;i++){
        if(a[i]!=a[i+1]){
            a[k]=a[i];
            k++;
        }
    }
    a[k]=a[n-1];
    k++;
    return k;
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    int k = removeDuplicates(a,a.size());
    for(int i=0;i<k;i++)
        cout<<a[i]<<" ";
}

Java Implementation

public class Main {

    public static void removeDuplicates(int a[], int n)
    {
        int j = 0;
        Arrays.sort(a);
        for (int i = 0; i < n - 1; i++) {
            if (a[i] != a[i + 1]) {
                a[j++] = a[i];
            }
        }  
        a[j++] = a[n - 1];
        for (int i = 0; i < j; i++) {
            System.out.print(a[i]+" ");
        }
    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    a.sort()
    j = 0;
    for i in range(0, n-1):
        if a[i] != a[i+1]:
            a[j] = a[i]
            j += 1
    a[j] = a[n-1]
    j += 1
    for i in range(0, j):
        print(a[i],end=" ")
a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

In the first step, we are sorting the array which takes $O(NlogN)$ time
Then we are running the loop from 0 to n-2 which takes $O(N)$ time

So the total worst case time complexity for this approach to remove duplicates from array is $O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach we are using only one variable i.e. k, so ultimately we are using only constant extra space which is $O(1)$

Time Complexity: $O(NlogN)$

Auxilary Space: $O(1)$

Approach 3 – Using a Set

In this method, we are going to use the Set Data structure to remove duplicates from array. Finally, we will modify the given array such that the k elements from the starting are the unique elements, and we will return k which is the total number of unique elements.

Note: Set is a data structure that stores only one occurrence of each element inserted into it.

Algorithm:

Create a set named s
Insert all the elements of the array into the set
Maintain a pointer named k and initialize it with 0
Start traversing the set and modify the array as a[k]=x and k++, where x is the element in the set
Finally return the value of k, which is now the total number of unique elements in the array, so we will print the elements present in the index from i=0 to i=k-1

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

int removeDuplicates(vector<int> &a,int n){
    int k=0;
    set<int> s;
    for(int i=0;i<n;i++)
        s.insert(a[i]);
    for(auto x:s){
        a[k]=x;
        k++;
    }
    return k;
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    int k = removeDuplicates(a,a.size());
    for(int i=0;i<k;i++)
        cout<<a[i]<<" ";
}

Java Implementation

import java.util.*;
public class Main {

    public static void removeDuplicates(int a[], int n)
    {
        Set<Integer> hash_set = new HashSet<Integer>();
        for (int i = 0; i<n; i++) {
            hash_set.add(a[i]);
        }  
        System.out.print(hash_set);

    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    hashset=set()
    for i in range(0, n):
        hashset.add(a[i])
    print(hashset)

a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

In the first step, we are inserting all the elements of the array into the set by running a loop from i=0 to i=n-1 which will take $O(N)$ time.
Internally, the set uses hashing and takes $O(NlogN)$ time for sorting the elements.
Finally, we are traversing the set which will take $O(N)$ time in the worst case (if every element is unique)

So the total worst case time complexity for this approach to remove duplicates from array is $O(N)+O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach, we are using a set and we are inserting all the elements of the array into the set. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(NlogN)$

Auxilary Space: $O(N)$

Approach 4 – Using indexOf() and filter() methods in Javascript

In this approach, we will see how we can remove duplicates from array by using indexOf() and filter() methods in JavaScript.

Note: The indexOf() method returns the position of the first occurrence of an element in the array.

Note: The filter() method returns a new array that is filled with all the elements that passed the test provided by the function.

Algorithm

Create a new array named result
Use filter() method to include only elements whose indexes match their indexOf() values
Print the result array

Javascript Implementation

const a = [1,2,2,3,4,4,4,5,];
const result = a.filter((x, index) => {
    return a.indexOf(x) === index;
});
console.log(result);

Complexity Analysis

Time Complexity Analysis

In the first step, we are traversing the array which will take $O(N)$ time.
Inside the loop, we are using the indexOf() method, which will take $O(N)$ time.

So the total worst-case time complexity for this approach to remove duplicates from array is $O(N^2)$

Auxilary Space Analysis
In this approach, we are using the resultant array to store the result. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(N^2)$

Auxilary Space: $O(N)$

Approach 5 – Using forEach() and include() methods in Javascript

In this approach, we will see how we can remove duplicates from array by using forEach() and include() methods in JavaScript.

Note: The forEach() method calls the function for traversing each element in the array.

Note: The include() method returns true if a particular element is present in the array, else returns false.

Algorithm

Create a new array named result
Traverse the given array by using the forEach() method and use the include() method on the resultant array which returns true if an element is present in an array, otherwise returns false
Print the result array

Javascript Implementation

const a = [1,2,2,3,4,4,4,5,];
const result = [];
a.forEach((x) => {
    if (!result.includes(x)) {
        result.push(x);
    }
});
console.log(result);

Complexity Analysis

Time Complexity Analysis

In the first step, we are traversing the array using forEach loop which will take $O(N)$ time.
Inside the loop we are using the includes() method, which will take $O(N)$ time.

So the total worst-case time complexity for this approach to remove duplicates from array is $O(N^2)$

Auxilary Space Analysis
In this approach, we are using the resultant array to store the result. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(N^2)$

Auxilary Space: $O(N)$

Conclusion

In this quick tutorial, we have discussed 5 different approaches to remove duplicates from array, of which 2 are specific to Javascript language

In Approach 1, we sorted the array and used $O(N)$ extra space
In Approach 2, we sorted the array but we used only $O(1)$ extra space
In Approach 3, we used Set data structure, and used $O(N)$ extra space
In Approach 4, we learned how to remove duplicates in Javascript by using indexOf() and filter() methods
In Approach 5, we learned how to remove duplicates in Javascript by using forEach() and include() methods

Frequently asked questions (FAQs)

Q: How can we remove duplicates from an unsorted array?

A: We can use a hashmap that maintains the frequency of each element, and then by letting only 1 occurrence of each element into the array, we can remove duplicates from array.

Q: What can be the best time complexity for removing the duplicates?

A: By using an unordered set, we can remove duplicates from array in only O(N) time complexity and O(N) auxiliary space.

Author

Abhinav Kaushik

View all posts

Remove Duplicates from Array

Problem Statement

Example

Explanation

Constraints

Approach 1 – Using Extra Space

Algorithm:

Complexity Analysis

Approach 2 – Using Constant Extra space

Algorithm:

Complexity Analysis

Approach 3 – Using a Set

Algorithm:

Complexity Analysis

Approach 4 – Using indexOf() and filter() methods in Javascript

Algorithm

Complexity Analysis

Approach 5 – Using forEach() and include() methods in Javascript

Algorithm

Complexity Analysis

Conclusion

Frequently asked questions (FAQs)

Author

Latest articles

Largest Number Formed from an Array

Fundamentals of Enhanced Interior Gateway Routing Protocol

What is Network Hub?

Largest Number Formed from an Array

Fundamentals of Enhanced Interior Gateway Routing Protocol

What is Network Hub?

Remove Duplicates from Array

Problem Statement

Example

Explanation

Constraints

Approach 1 – Using Extra Space

Algorithm:

Complexity Analysis

Approach 2 – Using Constant Extra space

Algorithm:

Complexity Analysis

Approach 3 – Using a Set

Algorithm:

Complexity Analysis

Approach 4 – Using indexOf() and filter() methods in Javascript

Algorithm

Complexity Analysis

Approach 5 – Using forEach() and include() methods in Javascript

Algorithm

Complexity Analysis

Conclusion

Frequently asked questions (FAQs)

Author

Latest articles

Largest Number Formed from an Array

Fundamentals of Enhanced Interior Gateway Routing Protocol

What is Network Hub?

Featured articles

Largest Number Formed from an Array

Fundamentals of Enhanced Interior Gateway Routing Protocol

What is Network Hub?