
I have a string like this:

 "CA: ABCD\nCB: ABFG\nCC: AFBV\nCD: 4567"

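Here ": " separates each key from its value, and \n separates the pairs. I want to add the key-value pairs to a map in C++.

Is there an efficient way to do this, with optimization in mind?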

Originally posted by Viking; translation licensed under CC BY-SA 4.0.

Tags: c++, dictionary


2 Answers


Community wiki, posted 2022-11-08

✓ Accepted answer

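Well, I have two methods here. The first is the simple, obvious one that I use all the time (performance is rarely an issue). The second is likely more efficient, though I have not done any formal timings.

In my informal tests, the second method was about 3 times faster.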

#include <map>
#include <string>
#include <sstream>
#include <iostream>

std::map<std::string, std::string> mappify1(std::string const& s)
{
    std::map<std::string, std::string> m;

    std::string key, val;
    std::istringstream iss(s);

    // Read the key up to ':', skip the whitespace after it, then read the value up to '\n'.
    while(std::getline(std::getline(iss, key, ':') >> std::ws, val))
        m[key] = val;

    return m;
}

std::map<std::string, std::string> mappify2(std::string const& s)
{
    std::map<std::string, std::string> m;

    std::string::size_type key_pos = 0;
    std::string::size_type key_end;
    std::string::size_type val_pos;
    std::string::size_type val_end;

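    // Each pass handles one pair: the key ends at ':', the value runs to '\n' or the end of the string.
    while((key_end = s.find(':', key_pos)) != std::string::npos)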
    {
        if((val_pos = s.find_first_not_of(": ", key_end)) == std::string::npos)
            break;

        val_end = s.find('\n', val_pos);
        m.emplace(s.substr(key_pos, key_end - key_pos), s.substr(val_pos, val_end - val_pos));

        key_pos = val_end;
        if(key_pos != std::string::npos)
            ++key_pos;
    }

    return m;
}

int main()
{
    std::string s = "CA: ABCD\nCB: ABFG\nCC: AFBV\nCD: 4567";

    std::cout << "mappify1: " << '\n';

    auto m = mappify1(s);
    for(auto const& p: m)
        std::cout << '{' << p.first << " => " << p.second << '}' << '\n';

    std::cout << "mappify2: " << '\n';

    m = mappify2(s);
    for(auto const& p: m)
        std::cout << '{' << p.first << " => " << p.second << '}' << '\n';
}

Output:

mappify1:
{CA => ABCD}
{CB => ABFG}
{CC => AFBV}
{CD => 4567}
mappify2:
{CA => ABCD}
{CB => ABFG}
{CC => AFBV}
{CD => 4567}
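
The 3x figure comes from informal tests, so it is worth measuring on your own input. The following is only a rough sketch of how that could be done with std::chrono, assuming C++11 or later. The names parse_with_stream, parse_with_find, and average_microseconds are made up for this sketch; the two parsers simply repeat the logic of mappify1 and mappify2 so that the block compiles on its own.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

using string_map = std::map<std::string, std::string>;

// Stream-based parser, same logic as mappify1.
string_map parse_with_stream(std::string const& s)
{
    string_map m;
    std::string key, val;
    std::istringstream iss(s);
    while (std::getline(std::getline(iss, key, ':') >> std::ws, val))
        m[key] = val;
    return m;
}

// find/substr-based parser, same logic as mappify2.
string_map parse_with_find(std::string const& s)
{
    string_map m;
    std::string::size_type key_pos = 0, key_end, val_pos, val_end;
    while ((key_end = s.find(':', key_pos)) != std::string::npos)
    {
        if ((val_pos = s.find_first_not_of(": ", key_end)) == std::string::npos)
            break;
        val_end = s.find('\n', val_pos);
        m.emplace(s.substr(key_pos, key_end - key_pos),
                  s.substr(val_pos, val_end - val_pos));
        key_pos = val_end;
        if (key_pos != std::string::npos)
            ++key_pos;
    }
    return m;
}

// Hypothetical helper: average wall-clock time per call, in microseconds.
// Returns 0 if no pairs were parsed at all.
template <typename Parser>
double average_microseconds(Parser parse, std::string const& input, int runs)
{
    assert(runs > 0);
    std::size_t sink = 0; // use every result so the calls have an observable effect
    auto const start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        sink += parse(input).size();
    auto const stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::micro> const elapsed = stop - start;
    return sink ? elapsed.count() / runs : 0.0;
}
```

Calling average_microseconds(parse_with_stream, s, 100000) and then the same with parse_with_find, in an optimized build, gives two numbers to compare. The ratio will depend on the compiler, the flags, and the size of the input.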

Originally posted by Galik; translation licensed under CC BY-SA 3.0.

Community wiki, posted 2022-11-08

This format is called "Tag-Value".

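The most performance-critical place where this encoding is used in industry is probably the financial FIX protocol ('=' as the key-value separator and '\001' as the entry delimiter). So if you are on x86 hardware, your best bet is to search for "SSE4 FIX protocol parser github" and reuse the open-sourced findings of HFT shops.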
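
To make the FIX-style layout concrete, here is a minimal scalar sketch that splits such a message into key/value views. It assumes C++17 (std::string_view); the function name split_fields and the alias field_t are invented for illustration. It is not a complete FIX parser (no checksum or length validation) and it is not the SSE4-vectorized kind of parser mentioned above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string_view>
#include <utility>
#include <vector>

using field_t = std::pair<std::string_view, std::string_view>;

// Split "tag<sep>value<dlm>tag<sep>value<dlm>..." into views over `message`.
// The views own no memory, so the underlying buffer must outlive the result.
std::vector<field_t> split_fields(std::string_view message,
                                  char sep = '=', char dlm = '\001')
{
    assert(sep != dlm);
    std::vector<field_t> fields;
    while (!message.empty())
    {
        auto const dlm_pos = message.find(dlm);          // npos if the last entry is unterminated
        auto const entry = message.substr(0, dlm_pos);   // one "tag=value" entry
        auto const sep_pos = entry.find(sep);
        if (sep_pos == std::string_view::npos)
            throw std::runtime_error("entry without key-value separator");
        fields.emplace_back(entry.substr(0, sep_pos), entry.substr(sep_pos + 1));
        if (dlm_pos == std::string_view::npos)
            break;
        message.remove_prefix(dlm_pos + 1);              // continue after the delimiter
    }
    return fields;
}
```

For the message "8=FIX.4.2\00135=D\00155=MSFT\001" this yields three fields with keys 8, 35, and 55. Passing sep = ':' and dlm = '\n' applies the same loop to the format in the question, although the space after the colon then stays in the value.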

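If you would rather leave the vectorization to the compiler and can spare a few nanoseconds for readability, the most elegant solution is to store the result as a std::string (the data) plus a boost::container::flat_map<boost::string_ref, boost::string_ref> (the view). How you parse is a matter of taste: a while loop or strtok is easiest for the compiler to handle, while a Boost.Spirit-based parser is easiest for a human (one familiar with Boost.Spirit) to read.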

C++ for-loop based solution

#include <boost/container/flat_map.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/range/iterator_range_io.hpp>

#include <algorithm>  // std::find
#include <iostream>
#include <stdexcept>  // std::runtime_error
#include <string>

// g++ -std=c++1z ~/aaa.cc
int main()
{
    using range_t = boost::iterator_range<std::string::const_iterator>;
    using map_t = boost::container::flat_map<range_t, range_t>;

    char const sep = ':';
    char const dlm = '\n';

    // this part can be reused for parsing multiple records
    map_t result;
    result.reserve(1024);

    std::string const input {"hello:world\n bye: world"};

    // this part is per-line/per-record
    result.clear();
    for (auto _beg = begin(input), _end = end(input), it = _beg; it != _end;)
    {
        auto sep_it = std::find(it, _end, sep);
        if (sep_it != _end)
        {
            auto dlm_it = std::find(sep_it + 1, _end, dlm);
            result.emplace(range_t {it, sep_it}, range_t {sep_it + 1, dlm_it});
            it = dlm_it + (dlm_it != _end);
        }
        else throw std::runtime_error("cannot parse");
    }

    for (auto& x: result)
        std::cout << x.first << " => " << x.second << '\n';

    return 0;
}
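
If Boost is not available, the same "owned data plus views" idea can be approximated with the standard library alone. The sketch below is an assumption-laden variant, not the code from this answer: it swaps boost::string_ref for std::string_view (C++17) and the flat_map for a sorted std::vector searched with std::lower_bound. The class name TagValueRecord and its members are hypothetical. Unlike the loop above, it trims spaces around keys and values and silently skips lines that have no separator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical class: owns the text and keeps sorted key/value views into it.
class TagValueRecord
{
    using field_t = std::pair<std::string_view, std::string_view>;

public:
    explicit TagValueRecord(std::string text, char sep = ':', char dlm = '\n')
        : data_(std::move(text))
    {
        assert(sep != dlm);
        std::string_view rest = data_;
        while (!rest.empty())
        {
            auto const dlm_pos = rest.find(dlm);
            auto const entry = rest.substr(0, dlm_pos);
            auto const sep_pos = entry.find(sep);
            if (sep_pos != std::string_view::npos) // lines without a separator are skipped
                fields_.emplace_back(trim(entry.substr(0, sep_pos)),
                                     trim(entry.substr(sep_pos + 1)));
            if (dlm_pos == std::string_view::npos)
                break;
            rest.remove_prefix(dlm_pos + 1);
        }
        // Sort once by key so lookups can use binary search, as in a flat map.
        std::stable_sort(fields_.begin(), fields_.end(),
            [](field_t const& a, field_t const& b) { return a.first < b.first; });
    }

    // The views point into data_, so copying or moving would leave them dangling.
    TagValueRecord(TagValueRecord const&) = delete;
    TagValueRecord& operator=(TagValueRecord const&) = delete;

    // Returns the value of the first entry with this key, if any.
    std::optional<std::string_view> find(std::string_view key) const
    {
        auto const it = std::lower_bound(fields_.begin(), fields_.end(), key,
            [](field_t const& f, std::string_view k) { return f.first < k; });
        if (it == fields_.end() || it->first != key)
            return std::nullopt;
        return it->second;
    }

    std::size_t size() const { return fields_.size(); }

private:
    // Strip leading and trailing spaces from a view.
    static std::string_view trim(std::string_view v)
    {
        auto const first = v.find_first_not_of(' ');
        if (first == std::string_view::npos)
            return {};
        auto const last = v.find_last_not_of(' ');
        return v.substr(first, last - first + 1);
    }

    std::string data_;            // owned text
    std::vector<field_t> fields_; // views into data_, sorted by key
};
```

Constructing TagValueRecord rec("CA: ABCD\nCB: ABFG") and calling rec.find("CB") returns a view of "ABFG" without copying the value. The trade-off is that the object cannot be copied or moved, because the views are tied to the address of the owned string.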

Originally posted by bobah; translation licensed under CC BY-SA 3.0.
