Latest articles from 个人技术分享 (Personal Tech Sharing) on SegmentFault
2018-04-02T09:58:24+08:00
https://segmentfault.com/feeds/blogs
https://creativecommons.org/licenses/by-nc-nd/4.0/
Tuning RocksDB under BlueStore and testing Ceph 4K randwrite performance
https://segmentfault.com/a/1190000014127340
2018-04-02T09:58:24+08:00
2018-04-02T09:58:24+08:00
yinminggang
https://segmentfault.com/u/yinminggang
0
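<blockquote>This post tunes the bluestore_rocksdb parameters and uses fio to measure Ceph random-write performance, with the goal of improving it.<br>In the <a>previous post</a>, gdbprof was used to profile 4K randwrite in a Ceph environment, and it showed that the RocksDB threads consume a large share of resources. BlueStore, Ceph's storage backend, keeps its metadata in RocksDB, so this post adjusts the RocksDB parameters that Ceph exposes to see whether they yield a tuning gain.</blockquote>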
<h3>Quick check that the cluster is healthy</h3>
<p><strong>$ ceph osd tree</strong><br><img src="/img/bV7rbw?w=934&h=182" alt="Image description" title="Image description"><br><strong>$ ceph -s</strong><br><img src="/img/bV7rbO?w=674&h=480" alt="Image description" title="Image description"></p>
<h3>RocksDB parameters exposed by Ceph</h3>
<p><img src="/img/bV7rdA?w=862&h=108" alt="Image description" title="Image description"></p>
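<p>The parameters are explained below:<br>
"bluestore_rocksdb_options": "compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152"<br>
<code>compression=kNoCompression</code>: data is not compressed. When compression is enabled, the data blocks and index blocks of each SST file are compressed individually; the default is Snappy.<br>
<code>write_buffer_size=268435456(2^28)</code>: the maximum size of a single memtable (256 MiB here). Writes go into the memtable first, which is why they are fast. Once the memtable is full, RocksDB marks it immutable and directs new writes to a fresh memtable, while the immutable memtable waits to be flushed to level 0.<br>
<code>max_write_buffer_number=4</code>: the maximum number of memtables, active and immutable combined. If the active memtable is full and the total number of memtables has reached max_write_buffer_number, RocksDB stalls writes. The usual cause is that writes arrive faster than flushes can complete.<br>
<code>min_write_buffer_number_to_merge=1</code>: the minimum number of memtables that must be merged before flushing to level 0. For example, with min_write_buffer_number_to_merge=2, RocksDB waits until at least two immutable memtables exist, merges them, and then flushes the result to level 0. The benefit of merging is that if a key was modified in several memtables, those modifications collapse into a single one. Setting this value too high hurts read performance, because Get has to search every memtable to check whether the key exists.<br>
<code>compaction_readahead_size=2097152(2^21)</code>: the readahead size (2 MiB here). During compaction, RocksDB issues larger reads.<br>
<code>writable_file_max_buffer_size=0</code>: the maximum write buffer size for writable files.</p>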
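<p>To make the memtable arithmetic above concrete, here is a minimal Python sketch that parses the option string and multiplies write_buffer_size by max_write_buffer_number. The helper names are hypothetical, and the result is only a rough upper bound for one memtable set; it ignores multiple column families and any other RocksDB memory use.</p>

```python
def parse_rocksdb_options(option_string):
    """Split 'k1=v1,k2=v2' into a dict; purely numeric values become int."""
    options = {}
    for item in option_string.split(","):
        key, _, value = item.strip().partition("=")
        options[key] = int(value) if value.isdigit() else value
    return options


def memtable_budget_bytes(options):
    """Rough upper bound: size of one memtable times the allowed memtable count."""
    return options["write_buffer_size"] * options["max_write_buffer_number"]


# The default option string quoted in this post.
DEFAULT_OPTIONS = (
    "compression=kNoCompression,max_write_buffer_number=4,"
    "min_write_buffer_number_to_merge=1,recycle_log_file_num=4,"
    "write_buffer_size=268435456,writable_file_max_buffer_size=0,"
    "compaction_readahead_size=2097152"
)

if __name__ == "__main__":
    opts = parse_rocksdb_options(DEFAULT_OPTIONS)
    print(memtable_budget_bytes(opts) // 2**20, "MiB")
```

<p>With the defaults this comes to 4 memtables of 256 MiB each, so any increase to either parameter should be weighed against the memory available to each OSD.</p>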
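<h3>Changing the RocksDB parameters</h3>
<p>The setup consists of a server machine, server_host, and a client machine, client_host.<br>On server_host, edit the bluestore rocksdb entry in /etc/ceph/ceph.conf:<br><strong>$ vim /etc/ceph/ceph.conf</strong><br>Add the following settings:</p>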
<pre><code>[osd]
bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=8,min_write_buffer_number_to_merge=4,recycle_log_file_num=4,write_buffer_size=356870912,writable_file_max_buffer_size=0,compaction_readahead_size=8388608
[osd.0]
[osd.1]
[osd.2]</code></pre>
<p>Save and quit with <strong>:wq</strong>.</p>
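<p>Because the option string is long, it is easy to lose track of what was actually changed. The following sketch (the function names are hypothetical, not part of Ceph) compares the default string with the one written to ceph.conf above and lists only the keys whose values differ.</p>

```python
def parse_options(option_string):
    """Turn 'k1=v1,k2=v2' into a dict of strings."""
    return dict(item.split("=", 1) for item in option_string.split(","))


def changed_options(before, after):
    """Return {key: (old, new)} for every key whose value differs."""
    old, new = parse_options(before), parse_options(after)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Default string and the tuned string, both copied from this post.
BEFORE = (
    "compression=kNoCompression,max_write_buffer_number=4,"
    "min_write_buffer_number_to_merge=1,recycle_log_file_num=4,"
    "write_buffer_size=268435456,writable_file_max_buffer_size=0,"
    "compaction_readahead_size=2097152"
)
AFTER = (
    "compression=kNoCompression,max_write_buffer_number=8,"
    "min_write_buffer_number_to_merge=4,recycle_log_file_num=4,"
    "write_buffer_size=356870912,writable_file_max_buffer_size=0,"
    "compaction_readahead_size=8388608"
)

if __name__ == "__main__":
    for key, (old, new) in changed_options(BEFORE, AFTER).items():
        print(f"{key}: {old} -> {new}")
```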
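<h3>Restart the OSDs</h3>
<p><strong>systemctl restart ceph-osd@0.service</strong><br><strong>systemctl restart ceph-osd@1.service</strong><br><strong>systemctl restart ceph-osd@2.service</strong><br><code>Note</code>: restart the OSDs one at a time rather than all three in quick succession. Wait until one OSD has restarted and is running normally (check with <strong>$ ceph osd tree</strong>) before restarting the next; otherwise the cluster can easily go down.</p>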
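<p>The one-at-a-time restart can be scripted. The sketch below is an illustration under assumptions, not a tested operational tool: it assumes <code>ceph osd tree -f json</code> returns a "nodes" list whose OSD entries carry "id", "type" and "status" fields, and it treats status "up" as "running normally", which is a coarse check (the monitors may still report a just-restarted OSD as up for a short while, and <code>ceph -s</code> gives a fuller health picture). The function names and the polling defaults are hypothetical.</p>

```python
import json
import subprocess
import time


def osd_is_up(tree_json, osd_id):
    """Return True if the given OSD id is reported 'up' in the osd tree JSON."""
    tree = json.loads(tree_json)
    for node in tree.get("nodes", []):
        if node.get("type") == "osd" and node.get("id") == osd_id:
            return node.get("status") == "up"
    return False


def run_command(argv):
    """Run a command and return its stdout as text (raises on non-zero exit)."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def rolling_restart(osd_ids, run=run_command, sleep=time.sleep,
                    poll_seconds=5, max_polls=60):
    """Restart OSDs one by one, waiting for each to report 'up' before moving on."""
    for osd_id in osd_ids:
        run(["systemctl", "restart", f"ceph-osd@{osd_id}.service"])
        for _ in range(max_polls):
            sleep(poll_seconds)  # give the daemon time before each check
            if osd_is_up(run(["ceph", "osd", "tree", "-f", "json"]), osd_id):
                break
        else:
            # Stop instead of restarting further OSDs on an unhealthy cluster.
            raise RuntimeError(f"osd.{osd_id} did not come back up; stopping")
```

<p>On the server host, <code>rolling_restart([0, 1, 2])</code> would walk through the three OSDs used in this post. The <code>run</code> and <code>sleep</code> parameters exist only so the logic can be exercised without touching a real cluster.</p>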
<h3>Check that the RocksDB parameters changed</h3>
<p><strong>ceph daemon osd.0 config show | grep bluestore_rocksdb</strong><br><strong>ceph daemon osd.1 config show | grep bluestore_rocksdb</strong><br><strong>ceph daemon osd.2 config show | grep bluestore_rocksdb</strong></p>
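<p>Instead of reading the grep output by eye, the check can be automated. This sketch assumes the full output of <code>ceph daemon osd.N config show</code> (without the grep) is a JSON object containing a "bluestore_rocksdb_options" key, as the quoted line earlier in this post suggests; the helper names are hypothetical.</p>

```python
import json


def rocksdb_options_from_config(config_show_json):
    """Extract the bluestore_rocksdb_options string from `config show` JSON."""
    return json.loads(config_show_json)["bluestore_rocksdb_options"]


def options_applied(config_show_json, expected):
    """True if the running OSD carries exactly the expected key=value pairs.

    The comparison ignores the order of the comma-separated pairs.
    """
    def as_set(text):
        return {part.strip() for part in text.split(",")}

    return as_set(rocksdb_options_from_config(config_show_json)) == as_set(expected)
```

<p>For example, the JSON captured from each of osd.0, osd.1 and osd.2 can be passed in together with the string that was written to ceph.conf; a False result means that OSD is still running with the old options.</p>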
<h3>Performance test with FIO</h3>
<p>(1) Create an image first; the 4K randwrite test will run against it</p>
<pre><code>$ rbd create --pool ymg --image img01 --size 40G</code></pre>
<p>(2) Fill the image</p>
<pre><code>$ fio -direct=1 -iodepth=256 -ioengine=rbd -pool=ymg -rbdname=img01 -rw=write -bs=1M -size=40G -ramp_time=5 -group_reporting -name=full-fill</code></pre>
<p>(3) The randwrite command</p>
<pre><code>$ fio -direct=1 -iodepth=256 -ioengine=rbd -pool=ymg -rbdname=img01 -rw=randwrite -bs=4K -runtime=300 -numjobs=1 -ramp_time=5 -group_reporting -name=parameter1</code></pre>
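<p>When the same test is repeated for several parameter sets, a fio job file can be easier to keep consistent than a long command line. The sketch below generates a job file that mirrors the randwrite command above; the generator function is hypothetical, and the option names are simply the ones used on the command line, written in fio's key=value job-file form.</p>

```python
def fio_job_file(name, **params):
    """Render one fio job section; a value of True becomes a bare flag line."""
    lines = [f"[{name}]"]
    for key, value in params.items():
        lines.append(key if value is True else f"{key}={value}")
    return "\n".join(lines) + "\n"


# Same settings as the 4K randwrite command in this post.
RANDWRITE_JOB = fio_job_file(
    "parameter1",
    direct=1,
    iodepth=256,
    ioengine="rbd",
    pool="ymg",
    rbdname="img01",
    rw="randwrite",
    bs="4K",
    runtime=300,
    numjobs=1,
    ramp_time=5,
    group_reporting=True,
)

if __name__ == "__main__":
    print(RANDWRITE_JOB, end="")
```

<p>Saving the output to a file such as parameter1.fio (a hypothetical name) and running <code>fio parameter1.fio</code> should then be equivalent to the command-line form; only the job name needs to change between experiments.</p>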
<h3>Comparison experiments and results</h3>
<h5>[Parameter0 experiment: original parameters]</h5>
<p>compression=kNoCompression<br>max_write_buffer_number=4<br>min_write_buffer_number_to_merge=1<br>recycle_log_file_num=4<br>write_buffer_size=268435456<br>writable_file_max_buffer_size=0<br>compaction_readahead_size=2097152<br><strong>Time window:</strong> 16:57:04~17:02:04<br><strong>Results:</strong><br><img src="/img/bV7rjE?w=864&h=506" alt="Image description" title="Image description"><br><img src="/img/bV7rjZ?w=708&h=618" alt="Image description" title="Image description"><br><img src="/img/bV7rkf?w=764&h=650" alt="Image description" title="Image description"><br><img src="/img/bV7rkn?w=737&h=650" alt="Image description" title="Image description"><br><img src="/img/bV7rkr?w=864&h=604" alt="Image description" title="Image description"></p>