【leetcode】30. Substring with Concatenation of All Words 精确匹配的子串集

1. 题目

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]

You should return the indices: [0,9].
(order does not matter).

2. 思路

没有特别的好方法，相当于死扛计算的。
注意：words里可能有重复。
注意：“ab”，“ba”, "aa", "bb"这种，彼此交叉的。

遍历起点，每次找到一个匹配点，然后验证是否符合需求。然后+1从下一个点开始找下一个起点。
起点当前的点的第一个串是其中一个。
验证方法是，看是否能匹配到words.size个。特殊处理words里重复的情况，通过计数处理。
为每一个起点开始的匹配，为word命名一个version，记录下version。重复出现version时进行count比较。count超了就失败。

3. 代码

耗时：849ms

class Solution {
public:
    vector<int> findSubstring(string& s, vector<string>& words) {
        //cout << "s=" << s << endl;
        vector<int> ret;
        if (words.size() == 0) {
            return ret;
        }
        unordered_map<string, int> map;
        unordered_map<string, int> freq;
        for (int i = 0; i < words.size(); i++) {
            map[words[i]] = 0;
            if (freq.find(words[i]) != freq.end()) {
                freq[words[i]]++;
            } else {
                freq[words[i]] = 1;
            }
        }
        int version = 0;
        
        size_t i = 0;
        size_t one_len = words[0].length();
        size_t num = words.size();
        size_t total_len = one_len * num;
        while (i < s.length() - total_len + 1) {
            i = findFirstOf(s, i, words);
            if (i == string::npos) {
                return ret;
            }
            string s1 = s.substr(i, one_len);
            version++;
            
            unordered_map<string, int> cnt;
            cnt[s1] = 1;
            
            //cout << "i=" << i << " version=" << version << " s1=" << s1 << endl;
            int count = 0;
            int k = i;
            map[s1] = version;
            //cout << "count=" << count << " k=" << k << " str=" << s1 << " it=" << version << endl;
            k += one_len;
            count++;
            while (count < num) {
                string s1 = s.substr(k, one_len);
                auto it = map.find(s1);
                //cout << "count=" << count << " k=" << k << " str=" << s1 << " it=" << (it == map.end() ? -1 : it->second) << endl;
                if (it == map.end()) {
                    break;
                }
                if (it->second == version) {
                    cnt[s1]++;
                    if (cnt[s1] > freq[s1]) {
                        break;
                    }
                } else {
                    cnt[s1] = 1;
                }
                map[s1] = version;
                k += one_len;
                count++;
            }
            if (count == num) {
                ret.push_back(i);
                //cout << "find idx: " << i << endl;
            }
            i++;
        }
        return ret;
    }
    
    size_t findFirstOf(string&s, size_t b, vector<string>& words) {
        size_t min = s.length();
        for (int i = 0; i < words.size(); i++) {
            size_t cur = s.find(words[i], b);
            //cout << "firstOf: " << words[i] << " cur=" << cur << " s=" << s.substr(b) << endl;
            if (cur == string::npos) {
                return string::npos;
            }
            if (cur < min) {
                min = cur;
            }
        }
        if (min == s.length()) {
            return string::npos;
        } else {
            return min;
        }
    }
};

【leetcode】30. Substring with Concatenation of All Words 精确匹配的子串集

1. 题目

2. 思路

3. 代码

knzeus

引用和评论

求解平方根

2025年医疗大模型各医疗场景赋能实践研究报告130+份汇总解读|附PDF下载

C++ 中 VS 项目引入公共配置文件

通过performance面板验证浏览器资源加载与渲染机制

Visual Studio Code (VS Code) – C/C++ 入门

入选ICLR 2025，MIT/UC伯克利/哈佛/斯坦福等提出DRAKES算法，突破生物序列设计瓶颈

30分钟内输出结果，新加坡国立大学/MIT等基于SVM构建微生物污染检测模型