Hinal Vithlani

String in Data Structure

A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding.

Scope

  • This article explains how strings work.
  • It covers the common operations performed on strings.
  • It shows how strings are used in C/C++, Java, and Python.
  • It walks through how strings are implemented.

Takeaways

Library Functions

  • strcat
  • strncat
  • strlen
  • strcpy
  • strncpy
  • strcmp
  • strncmp
  • memset
  • strtok

Introduction

Strings are everywhere. As soon as you log in to your device, you enter a string: your password. In fact, this article itself is a collection of strings. Whenever you come across textual data, it is almost always made up of strings.

In programming, a string is used as a data type just like int or float; the difference is that a string holds textual data. It can consist of letters, digits, spaces, and special characters. String literals are often enclosed in double quotation marks ("This is a String"). A string can be thought of as a thread connecting its letters one after another.

Many operations are performed on strings. These include reversing a string, finding a pattern in a string, and counting occurrences of a character in a string.
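
The operations named above can be sketched in a few lines. The snippet below uses Python, one of the languages covered later in this article; the sample text and the variable names are illustrative assumptions, not part of any particular program.

```python
# Illustrative sample text (an assumption chosen for this sketch).
text = "data structures and algorithms"

# Reversing a string: a slice with step -1 walks the characters backwards.
reversed_text = text[::-1]

# Finding a pattern: find() returns the index of the first match, or -1.
position = text.find("struct")

# Counting occurrences of a character.
a_count = text.count("a")

print(reversed_text)  # smhtirogla dna serutcurts atad
print(position)       # 5
print(a_count)        # 4
```

Each of these operations returns a new value and leaves the original string untouched.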

Library Functions for String Manipulation

C provides predefined library functions, declared in the <string.h> header, that are created with the specific purpose of handling strings. These are as follows:

1. strcat

This function performs concatenation, which means it combines two strings. A copy of the source string is appended to the end of the destination string, overwriting the destination's null terminator. The function always returns a pointer to the destination string; it has no way to report failure, so the destination array must be large enough to hold the combined string plus the terminating null character.

    Syntax:

    char *strcat(char *destination, const char *source) ;
    

    The first argument is the destination string. The second argument is the source string.

    Code:

    #include <stdio.h>
    #include <string.h>

    int main() {
        char str1[50] = "Scaler ";
        char str2[10] = "Academy";
        strcat(str1, str2);
        printf("%s", str1);
        return 0;
    }
    

    Output:

    Scaler Academy  
    

    Explanation:

    First, we defined str1 and str2, and then concatenated the two. str1 is declared with 50 bytes, which is large enough to accommodate the concatenated string.

    2. strncat

    The name of this function looks similar to the previous one because it is also used for concatenation. The difference is that we use it when we want to append at most N characters of one string to another.

      One rule associated with this function is that the destination array must have enough free space for the appended characters and the terminating null character. At most N characters from the source string are appended to the end of the destination string, and a null terminator is always added after them. The function returns a pointer to the destination string; it does not report failure.

      Syntax:

      char *strncat(char *destination, const char *source, size_t N) ;
      

      The first argument is the destination string. The second argument is the source string.

      The third argument is the number of characters that are appended.

      Code:

      #include <stdio.h>
      #include <string.h>

      int main() {
          char str1[50] = "Scaler ";
          char str2[10] = "Academy";
          strncat(str1, str2, 2);
          printf("%s", str1);
          return 0;
      }
      

      Output:

      Scaler Ac  
      

      Explanation:

      First we define str1 and then str2. Then we append only the first two characters of str2, i.e. "Ac", to str1, i.e. "Scaler ", to form "Scaler Ac".

      3. strlen

      This function returns the length of a string, excluding the null character at the end. That is, it counts the characters that come before the terminator.

        Syntax:

        size_t strlen(const char *string);
        

        The argument is the string whose length needs to be found.

        Code:

        #include <stdio.h>
        #include <string.h>

        int main()
        {
            char str1[20] = "Scaler";
            printf("%zu", strlen(str1));
            return 0;
        }
        

        Output:

        6
        

        Explanation:

        First, str1 is defined and then its length is found using strlen. "Scaler" has 6 characters, so 6 is printed. %zu is the format specifier for values of type size_t.

        4. strcpy

        strcpy copies the source string, including the terminating null character, into the destination. The return value of this function is a pointer to the destination string.

          Syntax:

          char *strcpy(char *destination, const char *source) ;
          

          The first argument is the destination string. The second argument is the source string.

          Code:

          #include <stdio.h>
          #include <string.h>

          int main() {
              char str1[50] = "Scaler Academy";
              char str2[50];
              strcpy(str2, str1);
              printf("%s", str2);
              return 0;
          }
          

          Output:

          Scaler Academy  
          

          Explanation:

          First, str1 and str2 are defined. Since str1 is copied into str2, make sure str2 is large enough to hold str1's characters plus the terminating null character.

          5. strncpy

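          strncpy is similar to strcpy, but it copies at most N characters. If the source string has N or more characters, no null terminator is written to the destination, so you must add one yourself.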

            Syntax:

            char *strncpy(char *destination, const char *source, size_t N) ;
            
            • The first argument is the destination string.
            • The second argument is the source string.
            • The third argument is the number of characters that are to be copied.

            Code:

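            #include <stdio.h>
            #include <string.h>

            int main() {
                char str1[50] = "Scaler Academy";
                char str2[50];
                strncpy(str2, str1, 6);
                str2[6] = '\0';  // strncpy did not copy a terminator, so add one
                printf("%s", str2);
                return 0;
            }


            Output:

            Scaler  


            Explanation:

            The first 6 characters of str1, i.e. "Scaler", are copied to str2. Because str1 is longer than 6 characters, strncpy does not write a null terminator, so one is added manually before str2 is printed.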

            6. strcmp

            This function compares two strings character by character. It returns a value less than zero if the first string is less than the second, a value greater than zero if the first string is greater than the second, and 0 if the strings are equal.

              [Figure: strcmp for strings]

              Syntax:

              int strcmp(const char *str1, const char *str2) ;
              

              The first argument is the first string, str1. The second argument is the second string, str2.

              Code:

                      #include <stdio.h>
                      #include <string.h>
                      int main() {
                        char str1[] = "a";
                        char str2[] = "b";
                        int res;
                        res = strcmp(str1, str2);
                        printf("%d\n", res);
                        return 0;
                      }
              

              Output:

              -1 
              

              Explanation:

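              The character a comes before b, so str1 is less than str2 and the result is negative. The C standard only guarantees a negative value; -1 is what this program prints on many implementations.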

              7. strncmp

              strncmp is similar to strcmp, but it compares at most the first N characters of the two strings.

                Syntax:

                int strncmp(const char *first, const char *second, size_t N) ;
                

                The first argument is the first string. The second argument is the second string. The third argument is the maximum number of characters that will be compared.

                Code:

                        #include <stdio.h>
                        #include <string.h>
                        int main() {
                          char str1[] = "abc";
                          char str2[] = "acb";
                          int res;
                          res = strncmp(str1, str2,1);
                          printf("%d\n", res);
                          res = strncmp(str1, str2,2);
                          printf("%d\n", res);
                          return 0;
                        }
                

                Output:

                0  
                -1  
                

                Explanation:

                In the first call, only 1 character is compared, and since both strings start with a, the output is 0. In the second call, 2 characters are compared; b comes before c, so the first string is smaller than the second and the output is negative (-1 here).

                8. memset

                To fill a block of memory, such as a string buffer, with null characters or any other byte value, use memset.

                  Syntax:

                  void *memset(void *destination, int c, size_t N) ;
                  

                  The first argument is the address of the memory to be filled. The second argument is the value to be filled, which is converted to unsigned char. The third argument is the number of bytes to be filled, starting from the destination address.

                  Code:

                          #include <stdio.h>
                          #include <string.h>
                          int main() {
                            char str1[50]="Hello";
                            char ch='.';
                            memset(str1+5,ch,sizeof(char));
                            printf("%s", str1);
                            return 0;
                          }
                  

                  Output:

                  Hello.  
                  

                  Explanation:

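                  str1+5 points to index 5 of the string, i.e. the position right after 'o', which holds the null terminator. memset overwrites that one byte with '.'. The string is still terminated because the unused bytes of str1[50] were initialised to zero.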

                  9. strtok

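                  To split a string into tokens, use the strtok function. Its prototype is char *strtok(char *str, const char *delimiters); the second argument lists the characters that separate tokens. The first call passes the string to be split, and each later call passes NULL to retrieve the next token from the same string. strtok returns NULL when no tokens remain. Note that it modifies the original string by writing null characters over the delimiters.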

                    Arrays of String

                    An Array of String is an array that stores a fixed number of values that are of the String data type.

                    Array of String in C/C++:

                    Syntax:

                    char strarr[m][n]
                    

                    Explanation:

                    m denotes the number of strings that can be stored in the array and n denotes the size of each string, including the terminating null character, so each string can hold at most n-1 characters.

                    Code

                    char strarr[2][5] = {"Code", "Word"};

                    Array of String in Python:
                    

                    Syntax:

                    strarr = ["string1", "string2"]


                    Explanation:

                    This is done using Python's list data structure. Note that square brackets create a list; curly braces would create a set instead.

                    Code:

                    strarr = ["Code", "Word"]
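
As a small sketch of how such a list behaves, the snippet below reuses the "Code" and "Word" values from the example; the variable names first and lengths and the extra "Text" element are assumptions added for illustration.

```python
# A list of strings, using the same sample values as the example above.
strarr = ["Code", "Word"]

first = strarr[0]                    # indexing starts at 0
lengths = [len(s) for s in strarr]   # length of every string in the list
strarr.append("Text")                # a list can grow after it is created

print(first)    # Code
print(lengths)  # [4, 4]
print(strarr)   # ['Code', 'Word', 'Text']
```

Unlike the fixed-size C array, the list has no preset limit on the number of strings or on the length of each one.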
                    

                    Array of String in Java:

                    Syntax:

                    String[] strarr = {"string1", "string2"};  
                    

                    Explanation:

                    The index starts from 0.

                    Code:

                    String[] strarr = {"Code", "Word"};
                    

                    Passing String to Functions

                    In general, to pass a string to a function we enter it as a parameter of the function.

                    Passing a String to a function in C/C++:

                    Syntax:

                    functionName(string);  
                    

                    Code:

                        // Declaring a String
                            char str[] = "Hello";  // size 6: five characters plus '\0'
                    
                        // Passing string to Function
                            func(str);
                    

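                    In the above code, we first declared the string and then passed it to a function named func. In C, the array name decays to a pointer to its first character, so the function receives the address of the string rather than a copy.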

                    Passing a String to a function in Python:

                    Syntax:

                    functionName(string)  
                    

                    Code:

                        # Declaring a String
                        s = 'Hello'

                        # Passing string to function
                        func(s)
                    

                    In the above code, we first declared the string and then passed it to a function named func.
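
One consequence worth seeing in code is that a Python function cannot change the caller's string, because strings are immutable. The sketch below assumes a hypothetical function named shout; it is not part of any library.

```python
def shout(s):
    # upper() builds a new string; rebinding the parameter s
    # does not affect the variable the caller passed in.
    s = s.upper() + "!"
    return s

greeting = "Hello"
result = shout(greeting)

print(greeting)  # Hello
print(result)    # HELLO!
```

To get the modified text back, the function has to return it, as shout does here.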

                    Passing a String to a function in Java:

                    Syntax:

                    functionName(string);  
                    

                    Code:

                        //Declaring a String
                        String s="Hello";
                    
                        //Passing String to function
                        func(s);
                    

                    In the above code, we first declared the string and then passed it to a function named func.


                    String in C/C++

                    In C programming, a string is a sequence of characters that ends with the null character \0. When a sequence of characters is wrapped in double quotation marks, the compiler automatically appends a null character \0 to the end.

                    Declaring a String:

                    Syntax:

                    char str[5];
                    

                    The index starts from 0 and goes up to 4, giving 5 locations. Since one location is needed for the null character, str can hold a string of at most 4 characters.

                    [Figure: String Declaration in C]

                    Initializing a String:

                    There are 4 ways to do this:

                    i) Without specifying the size, using a string literal:

                    Syntax:

                    char c[] = "abcd";


                    ii) Without specifying the size, using a character list:

                    Syntax:

                    char c[] = {'a', 'b', 'c', 'd', '\0'};  


                    iii) Specifying the size, using a string literal:

                    Syntax:

                    char c[50] = "abcd";  


                    iv) Specifying the size, using a character list:

                    Syntax:

                    char c[5] = {'a', 'b', 'c', 'd', '\0'};  
                    

                    Reading string from user


                    i) scanf() is used to read a word from the user. With the %s format, it stops at the first whitespace character.

                    Syntax:

                    int scanf(const char *format, ...);


                    Code:

                        #include <stdio.h>
                        int main()
                        {
                            char strarr[5];
                            scanf("%4s", strarr);  // read at most 4 characters, leaving room for '\0'
                            return 0;
                        }


                    In the above code, we gave the format as %4s, which means a string of at most 4 characters, and strarr as the address where the input will be stored.

                    ii) fgets() is used to read a line of text from the user.

                    Syntax:

                    	char* fgets(char* string, int num, FILE* stream);  
                    

                    char* string is a pointer to the array into which the characters read are stored. num is the size of that array: fgets reads at most num-1 characters, stops after a newline if it finds one, and then adds a terminating null character. FILE* stream is a pointer to the input stream to read from.

                    Code:

                        #include <stdio.h>
                        int main()
                        {
                            char strarr[5];
                            fgets(strarr, sizeof(strarr), stdin);  // read string
                            return 0;
                        }


                    In the above code, strarr is the array that receives the input, sizeof(strarr) is the size of the array (so at most 4 characters are read), and since the input is taken from standard input, stdin is supplied as the third parameter.


                    String Manipulation

                    Individual characters of a string can be accessed and manipulated using pointers.

                    Syntax for declaring a pointer:

                    char *string;
                    

                    Example Code:

                        #include <stdio.h>
                        int main(void) {
                          char strarr[] = "Coding with Strings";
                    
                          printf("%c", *strarr);     // Output: C
                          printf("%c", *(strarr+1));   // Output: o
                          printf("%c", *(strarr+7));   // Output: w
                    
                          char *strarrPtr;
                    
                          strarrPtr = strarr;
                          printf("%c", *strarrPtr);     // Output: C
                          printf("%c", *(strarrPtr+1));   // Output: o
                          printf("%c", *(strarrPtr+7));   // Output: w
                        }
                    

                    In the above code, when *strarr is printed, the first character of the string "Coding with Strings", i.e. C, is printed. When *(strarr+1) is printed, the character 1 position away from the first character is printed, i.e. o, and so on. The parentheses matter: *strarr+1 would instead add 1 to the character value of C. This is how pointers are used to access the characters of a string.

                    String Functions

                    • strlen() – returns length of string
                    • strcpy() – copies one string to another
                    • strcmp() – compares two strings
                    • strcat() – concatenates two strings

                    String in Java

                    In Java, a string is an object that is internally backed by an array holding its characters. Strings in Java are immutable, i.e. their contents cannot be changed after creation. This means that any operation that appears to modify a string actually creates a new string.

                    Declaring a String:

                    1. Using the string literal

                    Syntax:

                    String str = "Java String";
                    

                    When a string literal is used, the JVM looks in the String Constant Pool, a special area of heap memory that stores the string literals of the program. If an equal string is already present, the variable simply refers to the existing object; otherwise, a new object is created in the pool.

                    For example, we do the following:

                    String str1 = "Hello";
                    String str2 = "Hello";

                    Both of these are the same, so creating two separate objects will be a waste of space, hence they can simply point to the same object. This is what is being done by the JVM.

                    [Figure: Declaring a String]

                    In the example above, we can see that both the strings are pointing to the same object in the String Constant Pool.

                    2. Using the new operator

                    Syntax:

                    String str = new String("Java String");
                    

                    Here, the literal "Java String" is stored in the string constant pool, and the JVM additionally creates a new String object in normal heap memory, even if an equal string already exists. Hence, no optimization is done in this case. The variable str refers to the object in the heap.

                    Reading String from User

                    • Using Scanner Class

                    The Scanner class, which is part of the java.util package, is used to gather user input. The next() and nextLine() methods can be used for taking a String as input; this is done by creating an object of the class and calling either of the two methods, as shown below.

                    Method        Description
                    nextLine()    Reads the rest of the current line as a String
                    next()        Reads the next word (up to whitespace) as a String

                    Difference between next() and nextLine()?

                    next() takes only a word as input whereas nextLine() takes the entire line.

                    Syntax:

                    Scanner sc=new Scanner(System.in);
                    String a=sc.nextLine();  
                    

                    Code:

                    import java.util.Scanner;  // This is done to import the Scanner class
                    
                        public class Main {
                          public static void main(String[] args) {
                            Scanner sc = new Scanner(System.in);  // Create an object of Scanner class
                            System.out.println("Enter input");
                    
                            String in = sc.nextLine();  // To read user input
                            System.out.println("Input is: " + in);  // To print given user input
                          }
                        }
                    

                    In the above code, an object of scanner class is created and it is used to read and print String input.

                    • Using BufferedReader

                    The BufferedReader class reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.

                    Constructors:

                    i) BufferedReader(Reader in) – Creates a buffering character-input stream with a default-sized input buffer

                    ii) BufferedReader(Reader in, int size) – Creates a buffering character-input stream with the provided input buffer size.

                    Methods:

                    i) void close() – To close the stream

                    ii) void mark(int readAhead) – To mark the current position in the stream. The readAhead parameter specifies the maximum number of characters that can be read while preserving the mark.

                    iii) boolean markSupported() – If stream supports mark, returns true, else, returns false.

                    iv) int read() – To read a single character

                    v) int read(char[] destBuff, int offset, int length) – Reads characters into a portion of an array and returns the number of characters read, or -1 if the stream has reached its end. The parameter destBuff specifies the destination buffer, the parameter offset specifies the offset at which to start storing characters and the parameter length specifies the maximum number of characters to be read.

                    vi) String readLine() – Reads a line of text and returns it, or null if the end of the stream has been reached.

                    vii) boolean ready() – Returns true if the stream is ready to be read, else returns false.

                    viii) void reset() – The stream is reset to the most recent mark.

                    ix) long skip(long num) – Skips specified number of characters

                    Syntax:

                    BufferedReader br=new BufferedReader(new InputStreamReader(System.in));  
                    String a=br.readLine();  
                    

                    Code:

                        import java.io.*;
                        public class Main {
                           public static void main(String args[]) throws IOException {
                              BufferedReader br =new BufferedReader(new InputStreamReader(System.in));
                              System.out.println("Enter your name: ");
                              String name = br.readLine();
                              System.out.println(name);
                            }
                        }
                    

                    Output:

                    Enter your name:  
                    abcd  
                    abcd  
                    

                    In the above code, an object of the BufferedReader class is created and used to read and print String input.

                    String Methods

                    i) char charAt(int index) – returns the character present at the specified index in a string.

                    ii) int length() – returns length of the string

                    iii) String substring(int beginningIndex) – returns the substring starting from beginningIndex till the end of the string

                    iv) String substring(int beginningIndex, int endingIndex) – returns the substring starting from beginningIndex up to, but not including, endingIndex

                    v) boolean contains(CharSequence str) – returns true if str is a part of the string else, returns false

                    vi) boolean equals(Object str) – checks if str is equal to the string and returns true if it is else, returns false

                    vii) boolean isEmpty() – returns true if string is empty else, returns false

                    viii) boolean equalsIgnoreCase(String str) – compares the two strings, ignoring differences in case, and returns true if they are equal

                    ix) String concat(String str) – concatenates the two strings and returns the final, concatenated string

                    x) String replace(char oldChar, char newChar) – replaces all occurrences of oldChar with newChar and returns the resulting string

                    xi) String[] split(String regex) – returns an array of strings obtained by splitting the string around matches of the given regex

                    xii) String[] split(String regex, int limit) – returns an array of strings obtained by splitting the string according to the given regex and limit

                    xiii) int indexOf(String substr) – returns the index of the first occurrence of the given substring, or -1 if it is not found

                    xiv) int indexOf(String substr, int startIndex) – returns the index of the first occurrence of the given substring at or after startIndex, or -1 if it is not found

                    xv) String toLowerCase() – returns the string in lowercase

                    xvi) String toUpperCase() – returns the string in uppercase

                    xvii) String trim() – removes whitespace from the beginning and the end

                    xviii) static String valueOf(int value) – converts the given value to a string

                    String in Python

                    In Python, strings are immutable sequences of Unicode characters. Python has no separate character data type, so a single character is simply a string of length one. Square brackets can be used to access the string's elements.
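
A minimal sketch of accessing elements with square brackets follows. It assumes Python 3, and the sample values and variable names are illustrative.

```python
s = "Python"

first = s[0]      # 'P'   (indexing starts at 0)
last = s[-1]      # 'n'   (negative indices count from the end)
middle = s[1:4]   # 'yth' (slice from index 1 up to, not including, 4)

# A single character is just a string of length 1.
print(type(first).__name__, len(first))  # str 1

# Strings cannot be modified in place: item assignment raises TypeError.
try:
    s[0] = "J"
except TypeError:
    changed = "J" + s[1:]   # build a new string instead
print(changed)              # Jython

# In Python 3, text (str) and raw bytes are separate types;
# encode() turns text into bytes using a chosen encoding.
encoded = "\u00e9".encode("utf-8")       # the single character é
print(len("\u00e9"), len(encoded))       # 1 2
```

The last two lines show that one character can occupy more than one byte once it is encoded.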

                    Declaring a String:

                    A string in Python can be declared with single quotes ('...'), double quotes ("...") or triple quotes ('''...''').

                    Syntax:

                    s = 'Enter String'  
                    

                    Code:

                        s1 = 'Hello from single quotes'
                        s2="Hello from double quotes"
                        s3='''Hello from triple quotes'''
                        print(s1)
                        print(s2)
                        print(s3)
                    

                    Output:

                    Hello from single quotes  
                    Hello from double quotes  
                    Hello from triple quotes  
                    

                    Reading String from User

                    How a string is read from the user depends on the Python version:

                    i) input(): In Python 3, this function reads a line typed by the user and always returns it as a string; convert it explicitly, e.g. with int(), if a number is needed. In Python 2, input() instead evaluated the typed text as a Python expression, so it could return a number or a list, and it raised a syntax error or an exception if the text was not a valid expression.

                    Syntax:

                    s = input("Enter input: ")  
                    print(s)  
                    

                    Output:

                    Enter input: Hello  
                    Hello  
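
Because input() in Python 3 hands back text, numeric input needs an explicit conversion. The sketch below assumes a hypothetical helper named read_int; its reader parameter is an illustrative device that lets the example run without a keyboard.

```python
def read_int(prompt, reader=input):
    """Read a line with `reader` and convert it to int (hypothetical helper)."""
    raw = reader(prompt)      # in Python 3, input() always returns a str
    return int(raw.strip())   # int() raises ValueError for non-numeric text

# Simulate the user typing " 42 " instead of waiting for real input.
value = read_int("Enter a number: ", reader=lambda _prompt: " 42 ")
print(value + 1)  # 43
```

In an interactive program, calling read_int("Enter a number: ") with the default reader would prompt the user as usual.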
                    

                    ii) raw_input() (Python 2 only):

                    This function takes what is typed by the user and returns it as a string, which is then stored in the variable we want to keep it in. In Python 3, raw_input() was renamed to input().

                    Syntax:

                    s = raw_input("Enter input: ")  
                    print s  
                    

                    Output:

                    Enter input: Hello  
                    Hello 
                    

                    String Methods:

                    i) capitalize() – Capitalizes first letter of string

                    ii) casefold() – Converts the string to a lowercase form suited to case-insensitive comparison

                    iii) center() – Returns a centered string

                    iv) count() – Returns the amount of occurrences of a substring in the string

                    v) encode() – Returns an encoded version of the given string

                    vi) endswith() – If string ends with a specified value then returns true, else false

                    vii) find() – Returns the index of the first occurrence of a specified substring, or -1 if it is not found

                    viii) rfind() – Returns the index of the last occurrence of a substring, or -1 if it is not found

                    ix) index() – Like find(), but raises ValueError if the substring is not found

                    x) rindex() – Like rfind(), but raises ValueError if the substring is not found

                    xi) isalnum() – If all characters in the string are alphanumeric then it returns true, else false

                    xii) isalpha() – If all characters in the string are alphabets then it returns true, else false

                    xiii) isascii() – If all characters in the string are ascii characters then it returns true, else false

                    xiv) isdecimal() – If all characters in the string are decimal then it returns true, else false

                    xv) isdigit() – If all characters in the string are digits then it returns true, else false

                    xvi) isspace() – If all characters in the string are whitespaces then it returns true, else false

                    xvii) isnumeric() – If all characters in the string are numeric then it returns true, else false

                    xviii) islower() – If all characters in the string are in lowercase then it returns true, else false

                    xix) isupper() – If all characters in the string are in uppercase then it returns true, else false

                    xx) join() – Joins the elements of an iterable into one string, using the string as the separator

                    xxi) lower() – Converts all characters of a string to lowercase

                    xxii) lstrip() – Removes leading whitespace (or the given characters) from the left

                    xxiii) rstrip() – Removes trailing whitespace (or the given characters) from the right

                    xxiv) partition() – Returns a tuple where the string is partitioned into three parts

                    xxv) replace() – Replaces a given substring with another specified substring and returns the string

                    xxvi) format() – Formats specified values in a string

                    xxvii) startswith() – If string starts with a specified value it returns true, else false

                    xxviii) strip() – Returns a copy of the string with leading and trailing whitespace removed

                    xxix) swapcase() – Swaps the case of every character, i.e. lowercase characters become uppercase and vice versa

                    xxx) upper() – Converts all characters of a string to uppercase
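
A few of the methods listed above can be tried together in one short sketch. The sample string and variable names are assumptions chosen for illustration.

```python
s = "  Hello, World  "

trimmed = s.strip()                    # 'Hello, World'
print(trimmed.lower())                 # hello, world
print(trimmed.upper())                 # HELLO, WORLD
print(trimmed.swapcase())              # hELLO, wORLD
print(trimmed.count("l"))              # 3
print(trimmed.find("World"))           # 7
print(trimmed.find("xyz"))             # -1 (find() reports "not found" as -1)
print(trimmed.replace("World", "C"))   # Hello, C
print(trimmed.startswith("Hello"))     # True
print(trimmed.partition(", "))         # ('Hello', ', ', 'World')
print("-".join(["a", "b", "c"]))       # a-b-c
print("{} has {} characters".format(trimmed, len(trimmed)))  # Hello, World has 12 characters
```

Every call returns a new value; s and trimmed themselves are never modified.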


                    Conclusion

                    • Strings are a very important data structure. In placement interviews, a variety of important coding questions revolve around strings.
                    • A few of them include checking whether a string is a palindrome, checking for anagrams, finding the longest common prefix, and finding the longest substring without repeating characters.
                    • To be able to solve these, it helps to know the string methods provided by your favourite language, as these will aid you in coding the solution.
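
As a rough sketch of how built-in string features help with such questions, three of the problems named above are solved below in Python. The function names are hypothetical, and these are simple approaches rather than the only or the fastest ones.

```python
def is_palindrome(s):
    # A palindrome reads the same forwards and backwards.
    return s == s[::-1]

def are_anagrams(a, b):
    # Two strings are anagrams if they contain the same characters
    # with the same counts; sorting makes that comparison direct.
    return sorted(a) == sorted(b)

def longest_common_prefix(words):
    if not words:
        return ""
    prefix = words[0]
    for word in words[1:]:
        # Shrink the candidate until the current word starts with it.
        while not word.startswith(prefix):
            prefix = prefix[:-1]
    return prefix

print(is_palindrome("level"))                               # True
print(are_anagrams("listen", "silent"))                     # True
print(longest_common_prefix(["flower", "flow", "flight"]))  # fl
```

Slicing, sorted() and startswith() do most of the work here, which is why knowing the available methods pays off.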
