Abhinav Kaushik

Remove Duplicates from Array

Problem Statement

Given an array of integers, we have to remove duplicates from array such that each element in the array appears only once (all the values in the array should become unique).

Example

Input : [1,2,2,3,4,4,4,5,5]

Output : [1,2,3,4,5]

Explanation

As we can see the frequency of all the elements

ElementFrequency
11
22
31
43
52

After removing the duplicates, the frequency of all the elements became 1, so all the duplicate elements have been removed.

ElementFrequency
11
21
31
41
51

So the output is [1,2,3,4,5]

Constraints

1<=N<=106
1<=A[i]<=105

Approach 1 – Using Extra Space

This is the Brute-Force method to remove duplicates from array. In this method, we will be using extra space.

Algorithm:

  • Sort the array
  • Create a resultant array named res, to store the modified array
  • Run a loop from i=0 to i=n-2
  • if a[i]!=a[i+1], then push a[i] into res array
  • add the a[n-1] element into res array
  • Finally return the resultant array

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

void removeDuplicates(vector<int> a,int n){
    vector<int> res;
    sort(a.begin(),a.end());
    for(int i=0;i<n-1;i++){
        if(a[i]!=a[i+1])
            res.push_back(a[i]);
    }
    res.push_back(a[n-1]);
    for(auto x:res)
        cout<<x<<" ";
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    removeDuplicates(a,a.size());
}

Java Implementation

public class Main {
  
    public static void removeDuplicates(int a[], int n)
    {
        int[] res = new int[n];
        int j = 0;
        Arrays.sort(a);
        for (int i = 0; i < n - 1; i++) {
            if (a[i] != a[i + 1]) {
                res[j++] = a[i];
            }
        }  
        res[j++] = a[n - 1];
        for (int i = 0; i < j; i++) {
            System.out.print(res[i]+" ");
        }
    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    a.sort()
    res = list(range(n))
    j = 0;
    for i in range(0, n-1):
        if a[i] != a[i+1]:
            res[j] = a[i]
            j += 1
    res[j] = a[n-1]
    j += 1
    for i in range(0, j):
        a[i] = res[i]
    for i in range(0, j):
        print(a[i],end=" ")
a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

  • In the first step, we are sorting the array which takes $O(NlogN)$ time
  • Then we are running the loop from 1 to n-1 which takes $O(N)$ time

So the total worst case time complexity for this approach to remove duplicates from array is $O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach we are using the resultant array for storing the unique elements, so it uses $O(N)$ auxiliary space to remove duplicates from array.

Time Complexity: $O(NlogN)$

Auxilary Space: $O(N)$

Approach 2 – Using Constant Extra space

In this method, we will not use any auxiliary space, we will just modify the given array such that the k elements from the starting are the unique elements, and the rest are duplicates and we will return the value of k. So in this way we can remove duplicates from array.

Algorithm:

  • Sort the array
  • We know that the first element will always be unique, so we maintain a pointer named k, which will start from 1 and keep the track of the unique elements.
  • Run a loop from i=0 to i=n-2
  • If a[i]!=a[i+1], then a[k]=a[i] and k++
  • else continue;
  • add the a[n-1] element into res array
  • Finally return the value of k, which is now the total number of unique elements in the array, so we will print the elements present in the index from i=0 to i=k-1

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

int removeDuplicates(vector<int> &a,int n){
    int k=0;
    sort(a.begin(),a.end());
    for(int i=0;i<n-1;i++){
        if(a[i]!=a[i+1]){
            a[k]=a[i];
            k++;
        }
    }
    a[k]=a[n-1];
    k++;
    return k;
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    int k = removeDuplicates(a,a.size());
    for(int i=0;i<k;i++)
        cout<<a[i]<<" ";
}

Java Implementation

public class Main {

    public static void removeDuplicates(int a[], int n)
    {
        int j = 0;
        Arrays.sort(a);
        for (int i = 0; i < n - 1; i++) {
            if (a[i] != a[i + 1]) {
                a[j++] = a[i];
            }
        }  
        a[j++] = a[n - 1];
        for (int i = 0; i < j; i++) {
            System.out.print(a[i]+" ");
        }
    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    a.sort()
    j = 0;
    for i in range(0, n-1):
        if a[i] != a[i+1]:
            a[j] = a[i]
            j += 1
    a[j] = a[n-1]
    j += 1
    for i in range(0, j):
        print(a[i],end=" ")
a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

  • In the first step, we are sorting the array which takes $O(NlogN)$ time
  • Then we are running the loop from 0 to n-2 which takes $O(N)$ time

So the total worst case time complexity for this approach to remove duplicates from array is $O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach we are using only one variable i.e. k, so ultimately we are using only constant extra space which is $O(1)$

Time Complexity: $O(NlogN)$

Auxilary Space: $O(1)$

Approach 3 – Using a Set

In this method, we are going to use the Set Data structure to remove duplicates from array. Finally, we will modify the given array such that the k elements from the starting are the unique elements, and we will return k which is the total number of unique elements.

Note: Set is a data structure that stores only one occurrence of each element inserted into it.

Algorithm:

  • Create a set named s
  • Insert all the elements of the array into the set
  • Maintain a pointer named k and initialize it with 0
  • Start traversing the set and modify the array as a[k]=x and k++, where x is the element in the set
  • Finally return the value of k, which is now the total number of unique elements in the array, so we will print the elements present in the index from i=0 to i=k-1

C++ Implementation

#include<bits/stdc++.h>
using namespace std;

int removeDuplicates(vector<int> &a,int n){
    int k=0;
    set<int> s;
    for(int i=0;i<n;i++)
        s.insert(a[i]);
    for(auto x:s){
        a[k]=x;
        k++;
    }
    return k;
}

int main(){
    vector<int> a{1,2,1,2,3,4,5,6,4,4,5,5};
    int k = removeDuplicates(a,a.size());
    for(int i=0;i<k;i++)
        cout<<a[i]<<" ";
}

Java Implementation

import java.util.*;
public class Main {

    public static void removeDuplicates(int a[], int n)
    {
        Set<Integer> hash_set = new HashSet<Integer>();
        for (int i = 0; i<n; i++) {
            hash_set.add(a[i]);
        }  
        System.out.print(hash_set);

    }
    public static void main(String[] args)
    {
        int a[] = {1,2,1,2,3,4,5,6,4,4,5,5};
        int n = a.length;
        removeDuplicates(a, n);
    }
}

Python Implementation

def removeDuplicates(a, n):
    hashset=set()
    for i in range(0, n):
        hashset.add(a[i])
    print(hashset)

a = [1,2,1,2,3,4,5,6,4,4,5,5]
n = len(a)
removeDuplicates(a, n)

Complexity Analysis

Time Complexity Analysis

  • In the first step, we are inserting all the elements of the array into the set by running a loop from i=0 to i=n-1 which will take $O(N)$ time.
  • Internally, the set uses hashing and takes $O(NlogN)$ time for sorting the elements.
  • Finally, we are traversing the set which will take $O(N)$ time in the worst case (if every element is unique)

So the total worst case time complexity for this approach to remove duplicates from array is $O(N)+O(NlogN)+O(N) = O(NlogN)$

Auxilary Space Analysis
In this approach, we are using a set and we are inserting all the elements of the array into the set. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(NlogN)$

Auxilary Space: $O(N)$

Approach 4 – Using indexOf() and filter() methods in Javascript

In this approach, we will see how we can remove duplicates from array by using indexOf() and filter() methods in JavaScript.

Note: The indexOf() method returns the position of the first occurrence of an element in the array.

Note: The filter() method returns a new array that is filled with all the elements that passed the test provided by the function.

Algorithm

  • Create a new array named result
  • Use filter() method to include only elements whose indexes match their indexOf() values
  • Print the result array

Javascript Implementation

const a = [1,2,2,3,4,4,4,5,];
const result = a.filter((x, index) => {
    return a.indexOf(x) === index;
});
console.log(result);

Complexity Analysis

Time Complexity Analysis

  • In the first step, we are traversing the array which will take $O(N)$ time.
  • Inside the loop, we are using the indexOf() method, which will take $O(N)$ time.

So the total worst-case time complexity for this approach to remove duplicates from array is $O(N^2)$

Auxilary Space Analysis
In this approach, we are using the resultant array to store the result. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(N^2)$

Auxilary Space: $O(N)$

Approach 5 – Using forEach() and include() methods in Javascript

In this approach, we will see how we can remove duplicates from array by using forEach() and include() methods in JavaScript.

Note: The forEach() method calls the function for traversing each element in the array.

Note: The include() method returns true if a particular element is present in the array, else returns false.

Algorithm

  • Create a new array named result
  • Traverse the given array by using the forEach() method and use the include() method on the resultant array which returns true if an element is present in an array, otherwise returns false
  • Print the result array

Javascript Implementation

const a = [1,2,2,3,4,4,4,5,];
const result = [];
a.forEach((x) => {
    if (!result.includes(x)) {
        result.push(x);
    }
});
console.log(result);

Complexity Analysis

Time Complexity Analysis

  • In the first step, we are traversing the array using forEach loop which will take $O(N)$ time.
  • Inside the loop we are using the includes() method, which will take $O(N)$ time.

So the total worst-case time complexity for this approach to remove duplicates from array is $O(N^2)$

Auxilary Space Analysis
In this approach, we are using the resultant array to store the result. So we are using $O(N)$ auxiliary space in this approach.

Time Complexity: $O(N^2)$

Auxilary Space: $O(N)$

Conclusion

In this quick tutorial, we have discussed 5 different approaches to remove duplicates from array, of which 2 are specific to Javascript language

  • In Approach 1, we sorted the array and used $O(N)$ extra space
  • In Approach 2, we sorted the array but we used only $O(1)$ extra space
  • In Approach 3, we used Set data structure, and used $O(N)$ extra space
  • In Approach 4, we learned how to remove duplicates in Javascript by using indexOf() and filter() methods
  • In Approach 5, we learned how to remove duplicates in Javascript by using forEach() and include() methods

Frequently asked questions (FAQs)

Q: How can we remove duplicates from an unsorted array?

A: We can use a hashmap that maintains the frequency of each element, and then by letting only 1 occurrence of each element into the array, we can remove duplicates from array.

Q: What can be the best time complexity for removing the duplicates?

A: By using an unordered set, we can remove duplicates from array in only O(N) time complexity and O(N) auxiliary space.

Author