1

题目要求

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]

You should return the indices: [0,9].
(order does not matter).

输入一个字符串s和一个字符串数组words,其中words中的每个word的长度都相等。在字符串中找到所有子字符串的起始下标,只要该子字符串满足words中所有单词的连接结果(顺序无关)

思路一:map中存储word和对应的下标(无法解决重复问题)

在思路一中,我利用HashMap<String,Integer>来存储对应的word所在的最近的下标。同时利用start来存储当前结果值所在的起始下标。这样做是为了满足只对字符串进行一次遍历就可以得出所有的结果值。但是当输入的words中存在重复值时,该方法就会出现问题。因为该方法利用当前word的下标和start进行比较之后,判断是否要移动start的位置。如果word的下标大于start说明该值重复,则将start移到word下标的下一位。然而,一旦出现重复值后,例如输入的words为["good","bad","good"],则无法判断当前重复值是否应当在结果集中。
代码如下:

    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<Integer>();
        if(words.length == 0){
            return result;
        }
        int wordLength = words[0].length();
        int allWordsLength = words.length;
        Map<String, Integer> wordMap = new HashMap<String, Integer>();
        
        for(int j = 0 ; j<words.length ; j++){
            wordMap.put(words[j], -1);
        }
        
        int start = 0;
        for(int i = 0 ; i<s.length()-wordLength ;){
            String current = s.substring(i, i+wordLength);
            if(wordMap.containsKey(current)){
                int key = wordMap.get(current);
                //出现重复情况
                if(key>=start){
                    start = key + wordLength;
                //长度等于所有长度
                }else if(i-start == wordLength*(allWordsLength-1)){
                    result.add(start);
                    start+=wordLength;
                }    
                wordMap.replace(current, i);
                i+=wordLength;
                continue;
            }
            i++;
            start = i;
        }
        return result;
    }

思路二 :map中存储word和重复数量

既然words中会有重复值,就想到利用map中来存储word和其数量来判断当前下标下的子数组中是否包含了map中所有的元素。如果map中的元素都被清空,则代表该子数组符合要求,即将起始下标添加进入结果集。
该方法有一个缺陷在于,每一次移动起始下标,都要重新初始化一个map的副本。而在很多情况下,该副本可能根本就没有发生改变。这样的内存利用率太低了,影响程序的效率。

    public List<Integer> findSubstring2(String s, String[] words) {
        List<Integer> result = new ArrayList<Integer>();
        if(words.length == 0){
            return result;
        }
        int wordLength = words[0].length();
        int allWordsLength = words.length;
        
        
        Map<String, Integer> wordMap = new HashMap<String, Integer>();
        for(int j = 0 ; j<words.length ; j++){
            wordMap.put(words[j], wordMap.containsKey(words[j]) ? wordMap.get(words[j])+1 : 1);
        }
        
        for(int start = 0 ; start<=s.length()-wordLength*allWordsLength ; start++){
            //副本
            Map<String, Integer> copy = new HashMap<String, Integer>(wordMap);
            for(int i=start ; i<start+wordLength*allWordsLength ; i+=wordLength){
                String current = s.substring(i, i+wordLength);
                if(copy.containsKey(current)){
                    int key = copy.get(current);
                    if(key==1){
                        copy.remove(current);
                    }else{
                        copy.put(current, key-1);
                    }
                    if(copy.isEmpty()){
                        result.add(start);
                        //及时跳出循环,否则可能造成超时问题
                        break;
                    }
                }else{
                    break;
                }    
            }
        }
        return result;
    }

思路三 :minimum-window-substring

该思路是我借鉴了一个solution中的回答。minimum-window-substring是指,在寻找到所有元素都被包含进去的最小子数组中,判断是否满足目标要求。利用左右指针来限定最小子数组的范围,即窗口大小。同时左右指针每次都按照固定长度右移,以便寻找到下一个最小子数组。具体情况请参考代码中的注释。

    public List<Integer> findSubstring3(String s, String[] words) {
        //N为字符串长度
        int N = s.length();
        //结果集,长度必定不超过字符串长度
        List<Integer> indexes = new ArrayList<Integer>(s.length());
        if (words.length == 0) {
            return indexes;
        }
        //M为单个单词的长度
        int M = words[0].length();
        //如果所有单词连接之后的长度超过字符串长度,则返回空结果集
        if (N < M * words.length) {
            return indexes;
        }
        //last 字符串中最后一个可以作为单词起始点的下标
        int last = N - M + 1;
        
        //map存储word和其在table[0]中对应的下标
        Map<String, Integer> mapping = new HashMap<String, Integer>(words.length);
        //table[0]存储每个word出现的真实次数,table[1]存储每个word目前为止出现的统计次数
        int [][] table = new int[2][words.length];
        //failures存储words中不重复值的总数,例如["good","bad","good","bad"],failures=2
        int failures = 0, index = 0;
        for (int i = 0; i < words.length; ++i) {
            Integer mapped = mapping.get(words[i]);
            if (mapped == null) {
                ++failures;
                mapping.put(words[i], index);
                mapped = index++;
            }
            ++table[0][mapped];
        }
        
        //遍历字符串s,判断字符串当前下标后是否存在words中的单词,如果存在,则填入table中的下标,不存在则为-1
        int [] smapping = new int[last];
        for (int i = 0; i < last; ++i) {
            String section = s.substring(i, i + M);
            Integer mapped = mapping.get(section);
            if (mapped == null) {
                smapping[i] = -1;
            } else {
                smapping[i] = mapped;
            }
        }
        
        //fix the number of linear scans
        for (int i = 0; i < M; ++i) {
            //reset scan variables
            int currentFailures = failures; //number of current mismatches
            int left = i, right = i;
            Arrays.fill(table[1], 0);
            //here, simple solve the minimum-window-substring problem
            //保证右节点不超出边界
            while (right < last) {
                //保证左右节点之间的子字符串能够包含所有的word值
                while (currentFailures > 0 && right < last) {
                    int target = smapping[right];
                    if (target != -1 && ++table[1][target] == table[0][target]) {
                        --currentFailures;
                    }
                    right += M;
                }
                while (currentFailures == 0 && left < right) {
                    int target = smapping[left];
                    if (target != -1 && --table[1][target] == table[0][target] - 1) {
                        int length = right - left;
                        //instead of checking every window, we know exactly the length we want
                        if ((length / M) ==  words.length) {
                            indexes.add(left);
                        }
                        ++currentFailures;
                    }
                    left += M;
                }
            }
            
        }
        return indexes;
    }

clipboard.png
想要了解更多开发技术,面试教程以及互联网公司内推,欢迎关注我的微信公众号!将会不定期的发放福利哦~


raledong
2.7k 声望2k 粉丝

心怀远方,负重前行