Strings Fundamentals in Java

Thu 06 March 2014 | tags: java

Overview

The String class represents character strings. All string literals in Java programmes, such as "abc", are implemented as instance of the String class.

String Immutability

String objects are immutable which means once they are created, then you cannot modify them. But they can be shared.

For example:

/*
 * str1 and str2 point to the same string object "abc" in the heap.
 */

String str1 = "abc";
String str2 = "abc";

This diagram illustrates the code above.

Strings immutability

However please pay attention to this: new String(str_val) will create a new String object in the heap whatever if the same string object exists in the heap.

For example:

/*
 * There is to create two distinct string object in the heap.
 * So a and b separately point to the different string object
 * But the two string object has the same value "xyz".
 *
 */
String a = new String("xyz");
String b = new String("xyz");

This diagram illustrates the code above close.

Strings construction

We can test if a and b are the same string object. Code as follows:

public class StringTest {

    public static void main(String[] args) {

        // Create three string objects with the same value
        String foo = "xyz";
        String a = new String("xyz");
        String b = new String("xyz");

        // Compare the references and values
        System.out.println("foo == a: " + (foo == a) + " // reference comparison");
        System.out.println("Are the values of foo and a equal? "  + foo.equals(a) + " // value comparison");
        System.out.println("a == b: " + (a ==b ) + " // reference comparison");
        System.out.println("Are the values of a and b equal? " + a.equals(b) + " // value comparison");

    }

 }

Here are output results:

foo == a: false // reference comparison
Are the values of foo and a equal? true // value comparison
a == b: false // reference comparison
Are the values of a and b equal? true // value comparison

So double quotes and constructor to create string objects have different underlying details.

String Interning

Double quotes and constructor have different underlying details in creating string objects. Let's see something inside.

  1. Double quotes
/*
 * Obviously, x == y is true because x and y are referring to the same string
 * literal in the method area. They have the same memory reference.
 */
String x = "abc";
String y = "abc";

We know x and y point to the same string object "abc". That is to say, the same string literal is created more than once, only one copy of each distinct string value is stored. This is called string interning. The distinct values are stored in string intern pool. In java there is String.intern() method. All compile-time constant strings in Java are automatically interned using this method.

  1. Constructor
/*
 * Obviously, x1 == y1 is false because x1 and y1 are referring to two different
 * string object in the heap. Different objects have different memory reference.
 *
 */
String x1 = new String("abc");
String y1 = new String("abc");

We know new String(strVal) will create a new string object in the heap. But do we have any method to avoid different string object when we want to do this? Just use String.intern().

String x1 = new String("abc").intern();
// Won't create new string object
String y1 = new String("abc").intern();

Interview Questions

1. Implement an algorithm to determin if a string has all unique characters.

What if you can not to use additional data structures?

/*
 * Assume char set is ASCII (If not, we need to increase the storage size.
 * The rest of the logic would be the same). This is a great thing to point
 * to your interviewer!
 *
 * ======================
 * Time complexity: O(n)
 * Space complexity: O(1)
 * ======================
 * n is the length of string
 *
 */

public static boolean isUniqueChars(String string) {

    boolean[] char_set = new boolean[256];

    for (int i = 0; i < string.length(); i++) {
        int val = string.charAt(i);
        if (char_set[val]) return false;
        char_set[val] = true;
    }

    return true;
}
  • Solution 2

[To be figured out, please be patient]

2. Find the first non-repeated character in a string

import java.util.Map;
import java.util.LinkedHashMap;
import java.util.Map.Entry;

public class NonRepeatingChar {

    /**
     * This method to implement find first non-repeated character algorithm.
     *
     * =======================
     * Time complexity: O(2*N)
     * Space complexity: O(N)
     * =======================
     *
     * Steps:
     * 1) Convert string to char array and loop through it to build a hash
     *    map with chars and their counts.
     * 2) Iterate the hash map just built to find the entry with value 1,
     *    which is the first non-repeated char.
     *
     * @param s
     *        The string
     *
     * @return The first non-repeating char
     *
     * @throws RuntimeException
     *         if any non-repeating char doesn't find
     *
     */
    public static char findFirstNonRepeatingChar(String s) {

        Map<Character, Integer> charCnts = new LinkedHashMap<>(s.length());

        // Scan the string and store character count
        for(char c: s.toCharArray()) {
            charCnts.put(c, charCnts.containsKey(c) ? charCnts.get(c) + 1 : 1);
            }

        // Iterate through LinkedHashMap to find the entry with value 1
        for(Entry<Character, Integer> entry: charCnts.entrySet()) {
            if(entry.getValue() == 1) {
                return entry.getKey();
            }
        }

        throw new RuntimeException("Didn't find any non-repeating char!");
    }
}
  • Solution 2

    To cut down one loop in Solution 1 to find the char in one pass, we need two storages to hold repeating chars and non-repeating chars separately. This is time vs space trade-off.

import java.util.List;
import java.util.ArrayList;
import java.util.Set;
import java.util.HashSet;

public class NonRepeatChar {

    /**
     * This method implement find the first non-repeating char in a string.
     *
     * ======================
     * Time complexity: O(N)
     * Space complexity: O(N)
     * ======================
     *
     * @param s
     *        The string
     *
     * @return The first non-repeating char
     *
     * @throws RuntimeException
     *         if any non-repeating char doesn't find
     *
     */
    public static char findFirstNonRepeatingChar(String s) {

        Set<Character> repChars = new HashSet<>();
        List<Character> nonRepChars = new ArrayList<>();

        for(int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if(repChars.contains(c)) continue;

            if(nonRepChars.contains(c)) {
                nonRepChars.remove((Character) c);
                repChars.add(c);
            } else {
                nonRepChars.add(c);
            }
        }

        if(!nonRepChars.isEmpty()) {
            return nonRepChars.get(0);
        } else {
            throw new RunTimeException("Don't find any non-repeating char!");
        }
    }
}
  • Solution 3

    Use HashMap to store character counts, then loop through the original string to find the first char whose counts is 1.

import java.util.Map;
import java.util.LinkedHashMap;

public class NonRepeatChar {

    /**
     * This method implement find the first non-repeating char in a string.
     *
     * ======================
     * Time complexity: O(N)
     * Space complexity: O(N)
     * ======================
     *
     * @param s
     *        The string [shouldn't be null]
     *
     * @return The first non-repeating char
     *
     * @throws RuntimeException
     *         if any non-repeating char doesn't find
     *
     */
    public static char findFirstNonRepeatingChar(String s) {

        Map<Character, Integer> charCnts = new HashMap<>();

        for(int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            charCnts.put(ch, charCnts.containsKey(ch) ? charCnts.get(ch) + 1 : 1);
        }

        for(char ch: s.toCharArray()) {
            if(charCnts.get(ch) == 1) {
                return ch;
            }
        }

        throw new RuntimeException("Didn't find any non-repeating char!");

    }
}

blogroll

social