Ananya Dixit

Digit DP

Digit Dp as the name suggests, is a technique of dynamic programming wherein we use the digits of the numbers to reach the final solution. This technique solves those problems which concern the digits between two specified integers. To find the solution, we try building all the integers between the two given integers and compute the solution on the way.

Scope of the Article

  • This article discusses Digit DP, its key concept, time complexity, and implementation in Data Structures.
  • This article shows the implementation of Digit DP in C++.

Takeaways

The states of DP are the subproblems to the more significant problem. Each state is a subset of the input data used to build the result for the whole problem input.

What is Digit DP?

We’ve all read and heard about Dynamic Programming and how it makes use of repeating subproblems and memoization to solve a complete larger problem. Digit DP is one such technique that does the same, but in Digit DP, as the name suggests, we play with the digits of a number. This means to reach the final answer, we make use of the digits of a number.

Digit DP is very useful in solving problems that concern a range of numbers, like finding the sum of digits between two numbers $a$ and $b$. Or find how many times a particular digit occurs in the numbers in the range $[a,b]$. These are the situations where digit dp comes in handy. And it is clearly visible by the problem statements that these questions have something to do with the digits of the numbers involved.

Let us find out more about how digit dp works and what is the main idea behind it.

Key Concept

Digit DP is based on the idea of using digits to get to the final answer. Because of this reason, digit dp is used in questions that concern the digits of numbers, maybe how many times a digit occurs in a range or the sum of all odd digits in a range.

The main concept by which digit dp can solve such questions is that we consider numbers as a sequence of digits, and try to build numbers within the given range digit by digit as we go, and compute the answer also digit by digit.

Let us take an example to understand more clearly how exactly digit dp works.

Problem Statement:

Find the total number of times the digit $d$ occurs in the numbers in the range $a$ to $b$, where $a<b$.

For example, let us take the case where a = 5 and b = 10, and d = 1, the answer will be 1 as there is only one digit (10) which has the digit 1.

But how can we solve this using digit dp? We can do so by building a number digit by digit so that it is inside the given range and on the way count the number of times we set a digit as 1. But how can we do this? Let’s start with a base case, where in the problem statement, a = 0.

Finding Solution for range $0$ to $b$

To find a solution, we treat our number as a string, basically a string of digits. We need to build numbers digit by digit until they are inside the range given to us in the question. As a = 0, the string is empty in the start, and we can add digits to it.

So, we start adding digits from left to right, so at each point of time, we have a position where we can enter a digit to form a new number, and then recurse for the next position. Now, we need to look at the choices that we have for a particular position. We cannot just add any digit from 0 to 9, because we need to make sure the resulting number satisfies the range.

Let us take an example to understand this better. Let a = 0 and b = 83625. Suppose until now our number string that we are building by adding digits till now is "8362_". In the next recursion, we have to add another digit to the right of 2, but can we add any digit from 0 to 9?

Adding any digit > 5 will result in the number becoming greater than $b$ and thus going out of range. This means that for this particular position we can add only digits from 0 to 5.

On the other hand if the sequence we had built uptil now was "734__", we have no restrictions on any of the places. We can put any digit from 0 to 9 in the remaining places on the right. This is because the first digit '7' itself ensures that the complete number will always be less than $b$ no matter what digits occur to the right of it.

This observation is very important in reaching the digit dp approach. Clearly we have some positions where we need to add digits and build a number inside the given range. The number of positions will depend upon $b$, and will be equal to the number of digits $b$ has.

Now, we have some positions where we need to add digits. We can start from the leftmost position and keep adding digits. At the same time, we will also need some information as to which digits can be added to this position depending upon the previous digits as we saw in the example. How to know this?

One possible way is to keep track of the whole sequence built uptil now, and then check by placing all digits from 0 to 9, and only go ahead with the ones that are less than $b$, and discard the rest. But there is a simpler approach.

We saw using the example, that we have a restriction on the digits we can add only when the sequence built uptil now is the same as the number. If we have added even one number which is less than its corresponding digit of $b$ in the sequence we have made till now, then we have no restrictions.

This follows from the last example, if the sequence built until now is "83___" we cannot put any digit greater than 6 in the next position, but if our sequence was "82___" we have no restrictions, as the digit 2 in the second place ensures that our number will always be smaller than $b$.

This can easily be conveyed using a bool flag variable as a parameter to our recursive function. If this flag variable is set to true, it means we have a restriction and we can add digits from 0 to the digit in $b$ at the corresponding position else the resulting number will go out of range, and if the flag variable is 0, we have no restrictions. Whenever we add a digit to our sequence which is equal to the corresponding digit in $b$, we have to set the flag to 1 and if we are adding a digit smaller than the corresponding digit in $b$, we set the flag to 0 and recurse for the next position.

The initial problem can be solved very easily now because whenever we add the digit $d$ to the sequence we are building, we can add 1 to the answer, and at the end, we will be left with the total number of times the digit $d$ occurs in the range 0 to $a$.

Finding Solution For Given Range

We learned about solving from the range $0$ to some positive integer $b$, but what about the range $a$ to $b$? We already know the solution for $0$ to $b$ and we can also calculate the solution for $0$ to $a-1$ by the same method.

Then, the solution for aa to bb will simply be

solution(a,b) = solution(0,b) - solution(0,a-1)

Why is this correct? Because in subtracting solution(0,a-1) from solution(0,b) we make sure all numbers less than a get excluded, thus giving us the soltuion for a to b.

States of DP Problem and Solution

So far we have only looked at how we can solve a problem using digit dp, let us now discuss the particulars of the solution.

We know that every DP solution has states, which refers to the total number of parameters required to uniquely define each state in the problem. In this case, we have three states in the DP solution. One state is the position parameter pospos which refers to the position where a new digit is to be inserted and it can take the maximum value of the number of digits in b.

The second state is the bool flagflag parameter that we defined earlier which denotes whether we have a restriction on the digits that can be chosen for the current position or not. The third state will be the parameter cntcnt which denotes the number of digits dd we have included till now.

Hence, our memoisation dp table will look like dp[pos][flag][cnt]. Let us look at the complete solution:

#include <bits/stdc++.h>
using namespace std;

// dp table to store answers
int memo[18][19][2];

// function to count the number of times digit d occurs from 0 to a 
// where a is represented as the vector digits 
int dp(vector<int>& digits, int d, int pos, int cnt, int flag){
    
    // base case
    if(pos>=digits.size()){
        // cnt is the number of times digit d occured 
        return cnt;
    }
    // if this state has already been calculated return it
    if(memo[pos][cnt][flag]!=-1){
        return memo[pos][cnt][flag];
    }
    
    // this variable denotes the upper bound 
    // on the digits that can be chosen for this position
    int limit = 9;
    // if flag is 1, that means we will have a restriction
    // on the digits that can be chosen, it cannot be greater than
    // digits[pos] else we have no limits
    if(flag==1){
        limit = digits[pos];
    }
    
    // answer variable
    int answer = 0;
    
    // for loop over all the digits that can be considered for this position
    for(int i=0;i<=limit;i++){
        
        // in the next iteration we need to decide the value of new flag
        int new_flag = flag;
        
        // if flag is 1 and the current digit we are choosing
        // is equal to digits[pos], then we need to set flag to 1
        if(flag==1 && i==digits[pos]){
            new_flag = 1;
        }
        // else we have no restrictions in the next iteration
        else{
            new_flag = 0;
        }
        
        // variable for the new digit count
        int new_cnt = cnt;
        
        // if the current digit being selected is d then increase new_cnt
        if(i==d){
            new_cnt++;
        }
        
        // recurse for next iteration and add to answer
        answer += dp(digits,d,pos+1,new_cnt,new_flag);
        
    }
    
    // store this answer in the dp table and return it
    return memo[pos][cnt][flag] = answer;
    
}
int main() {
	
	// declare variables
	int a,b,d;
	
	cin>>a>>b>>d;
	
	// vector to store the digits in a and b
	vector<int> a_digits, b_digits;
	
	// compute digits for a and b 
	a--;
	while(a>0){
	    a_digits.push_back(a%10);
	    a /= 10;
	}
	
	reverse(a_digits.begin(),a_digits.end());

	while(b>0){
	    b_digits.push_back(b%10);
	    b /= 10;
	}
	
	reverse(b_digits.begin(),b_digits.end());
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for a
	int answer_a = dp(a_digits,d,0,0,1);
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for b
	int answer_b = dp(b_digits,d,0,0,1);
	
	// final answer
	int answer = answer_b - answer_a;
	
	cout<<answer<<endl;
	
	return 0;
}

Input

10 15
5

Output

1

In this solution, the dp() function calculates the number of times the digit dd occurs in the numbers from 0 to the number represented by the digits array. We calculate this value for a−1 and b and subtract them to get the answer for a to b.

Time and Space complexity

The time complexity for this solution is O(10*pos*cnt*flag). The maximum value for $pos$ can be 18. This is because we can have integers upto the range of $10^{18}$, so this means that atmost we will have 18 places to fill digits in and hence the value of $pos$ can go upto 18. Similarly, $cnt$ can be 18 and $flag$ can be 0 or 1. So, the time complexity becomes almost constant! The space complexity is O(pos*cnt*flag).

Special Case: Digit d = 0

If you run this solution for digit $d$ = 0, then you will probably get a wrong answer. Let’s take a look:

Input

1 11 0

Output

12

Clearly, the output should have been 1 as 10is the only digit that contains 0 between 1 and 11, but we got 11 extra counts added to the answer. This is because according to our algorithm whenever a 0 is added to the building sequence, we increase the count. But this means that when we are building numbers between 1 to 9 we are adding extra zeroes that are not needed. These numbers are being represented as 01, 02, 03, and so on in the sequence that we built, and the zeroes in the front don’t need to be counted as they do not have any significance. If there is a way by which we can know if the number built till now is non-zero or not, then we will know when to count 0 as a digit: count 0 as a digit when the sequence built until now is non-zero and not otherwise.

This can be easily achieved by another bool variable $is_empty$. If this bool variable is true it means that the sequence built so far is empty, and if we are adding a 0 to the sequence we should not count it as a digit as it is at the front of the sequence and if this variable is false, it means our sequence is non-empty and we must count 0 as a digit. The rest of the code implementation remains the same. Let us take a look at the code.

#include <bits/stdc++.h>
using namespace std;

// dp table to store answers
int memo[18][19][2][2];

// function to count the number of times digit d occurs from 0 to a 
// where a is represented as the vector digits 
int dp(vector<int>& digits, int d, int pos, int cnt, int is_empty, int flag){
    
    // base case
    if(pos>=digits.size()){
        // cnt is the number of times digit d occured 
        return cnt;
    }
    // if this state has already been calculated return it
    if(memo[pos][cnt][flag][is_empty]!=-1){
        return memo[pos][cnt][flag][is_empty];
    }
    
    // this variable denotes the upper bound 
    // on the digits that can be chosen for this position
    int limit = 9;
    // if flag is 1, that means we will have a restriction
    // on the digits that can be chosen, it cannot be greater than
    // digits[pos] else we have no limits
    if(flag==1){
        limit = digits[pos];
    }
    
    // answer variable
    int answer = 0;
    
    // for loop over all the digits that can be considered for this position
    for(int i=0;i<=limit;i++){
        
        // in the next iteration we need to decide the value of new flag
        int new_flag = flag;
        
        // if flag is 1 and the current digit we are choosing
        // is equal to digits[pos], then we need to set flag to 1
        if(flag==1 && i==digits[pos]){
            new_flag = 1;
        }
        // else we have no restrictions in the next iteration
        else{
            new_flag = 0;
        }
        
        // variable for the new digit count
        int new_cnt = cnt;
        
        // variable for new is empty variable
        int new_is_empty = 0;
        
        if(i==0 && is_empty==1){
            new_is_empty = 1;
        }
        
        // if the current digit being selected is d  and sequence is not empty then increase new_cnt
        if(i==d && is_empty==0){
            new_cnt++;
        }
        
        // recurse for next iteration and add to answer
        answer += dp(digits,d,pos+1,new_cnt, new_is_empty, new_flag);
        
    }
    
    // store this answer in the dp table and return it
    return memo[pos][cnt][flag][is_empty] = answer;
    
}
int main() {
	
	// declare variables
	int a,b,d;
	
	cin>>a>>b>>d;
	
	// vector to store the digits in a and b
	vector<int> a_digits, b_digits;
	
    int x = a;
	// compute digits for a and b 
	if(a!=0){
	    a--;
	}
	while(a>0){
	    a_digits.push_back(a%10);
	    a /= 10;
	}
	
	reverse(a_digits.begin(),a_digits.end());

	while(b>0){
	    b_digits.push_back(b%10);
	    b /= 10;
	}
	
	reverse(b_digits.begin(),b_digits.end());
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for a
	int answer_a = dp(a_digits,d,0,0,1,1);
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for b
	int answer_b = dp(b_digits,d,0,0,1,1);
	
	// final answer
	int answer = answer_b - answer_a;
	
	// if a was 0, we did not count it at the start, so increase the answer here
	if(d==0 && x==0){
	    answer++;
	}
	
	cout<<answer<<endl;
	
	return 0;
}

Intput

1 12 0

Output

1

In this code we have also included a special case when a = 0. In such a case, subtracting 1 from aa will lead to a negative value, hence we keep aa as it is, and add one to the answer if our digit d = 0.

Including just one bool variable made this implementation very easy. The time complexity for this solution will be O(10*pos*cnt*isempty*flag). The maximum value for $pos$ can be 18.

This is because we can have integers upto the range of $10^{18}$, so this means that atmost we will have 18 places to fill digits in and hence the value of $pos$ can go upto 18. For $cnt$, maximum value can be 18 and $flag$ and $isempty$ can be 0 or 1. So, the time complexity becomes almost constant! The space complexity is O(pos*cnt*isempty*flag).

Example

Let us look at another example. Given two integers $a$ and $b$, find the sum of all digits of the integers that occur between $a$ and $b$. We can use digit dp to find a solution. The only difference from the above mentioned solution will be the variable cnt which will be changed to sum instead and we will add the value of each digit to this variable when we include it.

#include <bits/stdc++.h>
using namespace std;

// dp table to store answers
int memo[18][180][2];

// function to count the number of times digit d occurs from 0 to a 
// where a is represented as the vector digits 
int dp(vector<int>& digits, int pos, int sum, int flag){
    
    // base case
    if(pos>=digits.size()){
        // sum is the total sum of all digits in this number
        return sum;
    }
    // if this state has already been calculated return it
    if(memo[pos][sum][flag]!=-1){
        return memo[pos][sum][flag];
    }
    
    // this variable denotes the upper bound 
    // on the digits that can be chosen for this position
    int limit = 9;
    // if flag is 1, that means we will have a restriction
    // on the digits that can be chosen, it cannot be greater than
    // digits[pos] else we have no limits
    if(flag==1){
        limit = digits[pos];
    }
    
    // answer variable
    int answer = 0;
    
    // for loop over all the digits that can be considered for this position
    for(int i=0;i<=limit;i++){
        
        // in the next iteration we need to decide the value of new flag
        int new_flag = flag;
        
        // if flag is 1 and the current digit we are choosing
        // is equal to digits[pos], then we need to set flag to 1
        if(flag==1 && i==digits[pos]){
            new_flag = 1;
        }
        // else we have no restrictions in the next iteration
        else{
            new_flag = 0;
        }
        
        // recurse for next iteration by adding i to sum and add to answer
        answer += dp(digits,pos+1,sum+i,new_flag);
        
    }
    
    // store this answer in the dp table and return it
    return memo[pos][sum][flag] = answer;
    
}
int main() {
	
	// declare variables
	int a,b;
	
	cin>>a>>b;
	
	// vector to store the digits in a and b
	vector<int> a_digits, b_digits;
	
	// compute digits for a and b 
	a--;
	while(a>0){
	    a_digits.push_back(a%10);
	    a /= 10;
	}
	
	reverse(a_digits.begin(),a_digits.end());

	while(b>0){
	    b_digits.push_back(b%10);
	    b /= 10;
	}
	
	reverse(b_digits.begin(),b_digits.end());
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for a
	int answer_a = dp(a_digits,0,0,1);
	
	// initialize memo with -1
	memset(memo,-1,sizeof(memo));
	
	// compute answer for b
	int answer_b = dp(b_digits,0,0,1);
	
	// final answer
	int answer = answer_b - answer_a;
	
	cout<<answer<<endl;
	
	return 0;
}

Input

1 10

Output

46

Conclusion

  • Digit DP is very useful in solving problems that concern a range of numbers.
  • We consider numbers as a sequence of digits, and try to build numbers within the given range digit by digit as we go, and compute the answer also digit by digit.
  • A general Digit DP solution has three states:
    • $pos$: position parameter which refers to the position where a new digit is to be inserted.
    • The time complexity for digit dp is $O(10poscnt*flag)$. The maximum value for $pos$ can be 18, for $cnt$ can be 18 and $flag$ can be 0 or 1.
    • The space complexity is $O(poscntflag)$.

Author