Abstract
A fixed-point arithmetic implementation for the STM32 and any other MCU that has no FPU.
For Your Information
- Motive
On an STM32F10x MCU, the lack of an FPU makes floating-point computation too slow for time-sensitive processing, for example an interrupt handler that must finish its job within milliseconds. Assembly language is a workable option but not the best one, because it means learning another language and the result is less portable. I am not sure whether others have already built this wheel, but I like to reinvent it, for no other reason than that I can.
- Compiler
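The implementation was written in C++ with templates, although the listing below is a plain, non-template class that needs C++11 or later for its alias declarations. Most current embedded GCC toolchains should work well.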
- MCU Architecture
The implementation is designed for a 32-bit MCU. It also works on a 16-bit MCU.
- Limitation of Precision and Range
To represent a real number, the implementation splits a 32-bit or 64-bit integer into an integer part and a fraction part, so both the precision and the range of representable values are limited. In the 32-bit representation, the integer part uses 21 bits and the fraction uses 11 bits. In the 64-bit representation the counts are doubled to 42 and 22. (The listing below sets kFracBits to 18.) Furthermore, overflow is not handled; that is your responsibility. Keep this in mind when using the code.
NOTE
A 32-bit fixed-point number turned out not to be enough, so the code now uses a 64-bit integer as the internal representation.
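To make the trade-off between range and precision concrete, here is a minimal sketch that computes the step size and the largest value for the two layouts named above. FxLayout, kLayout32 and kLayout64 are hypothetical names used only for illustration; they are not part of fxfloat.hpp, and the sketch assumes the sign bit is counted among the integer bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of fxfloat.hpp: describes a signed
// fixed-point layout of `total_bits` bits, `frac_bits` of them fractional.
struct FxLayout {
    unsigned total_bits;
    unsigned frac_bits;

    // Smallest step between two adjacent values: 2^-frac_bits.
    double resolution() const {
        assert(frac_bits < total_bits && total_bits <= 64);
        return 1.0 / static_cast<double>(std::uint64_t{1} << frac_bits);
    }

    // Largest value: (2^(total_bits - 1) - 1) / 2^frac_bits.
    double max_value() const {
        const std::uint64_t max_raw = (std::uint64_t{1} << (total_bits - 1)) - 1;
        return static_cast<double>(max_raw) * resolution();
    }
};

// The two layouts named in the text (sign bit assumed to be part of the
// integer bits).
const FxLayout kLayout32{32, 11};
const FxLayout kLayout64{64, 22};
```

Under these assumptions the 32-bit layout has a step of 2^-11 (about 0.00049) and tops out just below 1,048,576, while the 64-bit layout has a step of 2^-22 and reaches roughly 2.2e12.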
- Understanding The Fixed Point
I assume you already know how fixed-point arithmetic works. If you don't, please read up on it first. It's very simple.
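For readers who want a quick refresher, the core idea can be sketched on plain integers. This is an illustration only, not the code from the listing: the 11 fraction bits and the names kSketchFrac, kSketchOne, fxFromDouble, fxToDouble, fxAdd, fxMul and fxDiv are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative format with 11 fraction bits (an assumption). A real value v
// is stored as the integer v * 2^11, truncated toward zero.
constexpr int kSketchFrac = 11;
constexpr std::int32_t kSketchOne = std::int32_t{1} << kSketchFrac;  // raw 1.0

constexpr std::int32_t fxFromDouble(double v) {
    return static_cast<std::int32_t>(v * kSketchOne);
}
constexpr double fxToDouble(std::int32_t raw) {
    return static_cast<double>(raw) / kSketchOne;
}

// Addition works directly on raw values because both share the same scale.
constexpr std::int32_t fxAdd(std::int32_t a, std::int32_t b) { return a + b; }

// A product of two raw values carries the scale twice, so divide it out once.
// The intermediate is widened to 64 bits so the product itself cannot overflow.
constexpr std::int32_t fxMul(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * b) / kSketchOne);
}

// Pre-scale the dividend so that the quotient keeps the scale.
inline std::int32_t fxDiv(std::int32_t a, std::int32_t b) {
    assert(b != 0);
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * kSketchOne) / b);
}
```

With this format 1.5 is stored as 3072 and 2.25 as 4608; their product comes back as exactly 3.375, while 1.5 / 2.25 is truncated to the nearest representable value below 2/3.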
- Liability
The code is free for anyone to use. In no event shall I be liable for any situation arising in any way out of the use of the code. Use it at your own risk.
Code
Enough talk, here is the code.
/*
* fxfloat.hpp
*
* Created on: Mar 4, 2019
* Author: igame
*/
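#pragma once
#include <stddef.h> // size_t
#include <stdint.h> // int64_t, uint32_t, uint64_t
#include <string.h> // memcpy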
namespace igame {
class FxFloat {
public:
using data_t = int64_t;
using self_t = FxFloat;
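// Layout constants. static constexpr keeps them out of every instance.
static constexpr size_t kBits = sizeof(data_t) * 8;
static constexpr uint32_t kSignBit = 1;
static constexpr uint32_t kFracBits = 18;
static constexpr int64_t kScale = (int64_t)1 << kFracBits;
static constexpr uint64_t kFracMask =
((uint64_t)0xFFFFFFFFFFFFFFFF >> (kBits - kFracBits));
// Saturation values returned on division by zero.
static constexpr int64_t kMax = 0x7FFFFFFFFFFFFFFF / 2;
// The original 0xFFFFFFFFFFFFFFFF / 2 is a positive number; kMin must be negative.
static constexpr int64_t kMin = -kMax;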
private:
data_t m_data;
public:
FxFloat() : m_data(0) {}
FxFloat(const self_t& x) : m_data(x.m_data) {}
FxFloat(const float& x) { this->m_data = toVal(x); }
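// Raw fixed-point value.
data_t data() const { return m_data; }
void set(const data_t x) { m_data = x; }
// Integer part and raw fraction bits.
data_t iData() const { return m_data >> kFracBits; }
data_t fData() const { return m_data & kFracMask; }
/// convert back to float
operator float() const { return toFloat(); }
// Convert a float to the raw fixed-point value, truncating toward zero.
data_t toVal(const float& x) const {
data_t i = (data_t)x;
data_t f = (data_t)((x - i) * kScale);
// Multiply instead of shifting: left-shifting a negative value is undefined before C++20.
data_t res = i * kScale + f;
return res;
}
float toFloat() const { return (float)this->m_data / kScale; }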
const self_t& operator=(const self_t& right) {
if (this != &right) {
m_data = right.m_data;
}
return *this;
}
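// Binary operators take const references so that temporaries, such as the
// result of a * b, can be used as operands.
self_t operator+(const self_t& right) const {
self_t res;
res.set(m_data + right.m_data);
return res;
}
self_t operator-(const self_t& right) const {
self_t res;
res.set(m_data - right.m_data);
return res;
}
self_t operator*(const self_t& right) const {
self_t res;
// The raw product must fit in 64 bits; overflow is not checked.
auto x = m_data * right.m_data;
res.set(x >> kFracBits);
return res;
}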
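self_t operator/(const self_t& right) const {
self_t res;
if (right.m_data == 0) {
// Division by zero saturates by sign; 0 / 0 yields 0.
if (m_data < 0) {
res.set(kMin);
} else if (m_data > 0) {
res.set(kMax);
}
return res;
}
// Pre-scale the dividend so the quotient keeps kFracBits fraction bits.
// Multiply rather than shift: left-shifting a negative value is undefined before C++20.
res.set((m_data * kScale) / right.m_data);
return res;
}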
self_t& operator+=(const self_t& right) {
m_data = m_data + right.m_data;
return *this;
}
self_t& operator-=(const self_t& right) {
m_data = m_data - right.m_data;
return *this;
}
// A const reference also binds temporaries, so no separate rvalue overload is needed.
self_t& operator*=(const self_t& right) {
m_data = (m_data * right.m_data) >> kFracBits;
return *this;
}
self_t& operator/=(const self_t& right) {
if (right.m_data == 0) {
// Same saturation rule as operator/.
if (m_data < 0) {
m_data = kMin;
} else if (m_data > 0) {
m_data = kMax;
}
return *this;
}
m_data = (m_data * kScale) / right.m_data;
return *this;
}
};
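// Square root via the fast inverse square root method from Wikipedia:
// approximate 1/sqrt(x) in float, refine it with one Newton step, then invert.
// Needs <string.h> for memcpy and <stdint.h> for uint32_t.
inline FxFloat FxSqrt(const FxFloat& fv) {
FxFloat y(fv);
FxFloat x2(fv);
FxFloat threehalfs(1.5f);
const uint32_t magic_num = 0x5f3759df;
x2 *= FxFloat(0.5f);
float temp = y.toFloat();
// Copy the float's bits with memcpy. A pointer cast breaks strict aliasing,
// and the original 64-bit read ran past the 4-byte float.
uint32_t i;
memcpy(&i, &temp, sizeof(i));
i = magic_num - (i >> 1);
memcpy(&temp, &i, sizeof(temp));
y = temp;
// One Newton step, y = y * (1.5 - x2 * y * y), written with named
// intermediates so that every operand is an lvalue.
FxFloat t = x2 * y;
t *= y;
FxFloat u = threehalfs - t;
y *= u;
return FxFloat(1.0f) / y;
} // fn FxSqrt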
} // namespace igame
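FxSqrt above still converts to float and back, which costs soft-float operations on an MCU without an FPU. As a possible alternative, and explicitly not the author's method, the sketch below uses the classic digit-by-digit (binary) integer square root on the raw value. The names isqrt64, fxSqrtRaw and kSketchFracBits are hypothetical; kSketchFracBits is assumed to match kFracBits in the listing. The identity used is sqrt(v) * 2^F = sqrt(raw * 2^F), where raw = v * 2^F.

```cpp
#include <cassert>
#include <cstdint>

// Floor of the square root of n, computed bit by bit with integer operations only.
inline std::uint64_t isqrt64(std::uint64_t n) {
    std::uint64_t root = 0;
    std::uint64_t bit = std::uint64_t{1} << 62;  // highest power of four in 64 bits
    while (bit > n) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

// Assumed to match kFracBits in the listing.
constexpr unsigned kSketchFracBits = 18;

// Square root of a non-negative raw fixed-point value, returned as a raw value.
inline std::int64_t fxSqrtRaw(std::int64_t raw) {
    assert(raw >= 0);
    // raw << kSketchFracBits must still fit in 63 bits.
    assert(raw < (std::int64_t{1} << (63 - kSketchFracBits)));
    return static_cast<std::int64_t>(
        isqrt64(static_cast<std::uint64_t>(raw) << kSketchFracBits));
}
```

A raw result could be stored with FxFloat::set, for example by calling fxSqrtRaw on the value returned by data(). The result is truncated rather than rounded, and the usable input range shrinks by kSketchFracBits bits because of the pre-shift.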
Result
Benchmarked with the fast inverse square root "black magic" algorithm in a debug build, FxFloat is only slightly faster than software floating point: 5229 ms versus 5247 ms for 10,000 loops.
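The post does not show how the timing was taken. A comparison of this kind might be measured on a hosted build with something like the sketch below; timeLoopMs is a hypothetical helper, and on the MCU itself a hardware timer would have to stand in for std::chrono.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical harness, not from the original post: calls `body` `loops`
// times and returns the elapsed wall-clock time in milliseconds.
template <typename Body>
double timeLoopMs(Body body, std::uint32_t loops) {
    assert(loops > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::uint32_t n = 0; n < loops; ++n) {
        body(n);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

The body would call the fixed-point routine in one run and the float routine in another, writing each result to a volatile variable so the compiler cannot drop the work in an optimized build.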