I need to be able to use floating-point arithmetic under my dev environment in C (CPU: ~12 MHz Motorola 68000). The standard library is not present, meaning it is a bare-bones C and no - it isn't gcc due to several other issues
I tried getting the SoftFloat library to compile and one other 68k-specific FP library (name of which escapes me at this moment), but their dependencies cannot be resolved for this particular platform - mostly due to libc deficiencies. I spent about 8 hrs trying to overcome the linking issues, until I got to a point where I know I can't get further.
However, it took mere half an hour to come up with and implement the following set of functions that emulate floating-point functionality sufficiently for my needs.
The basic idea is that both fractional and non-fractional part are 16-bit integers, thus there is no bit manipulation. The nonfractional part has a range of [-32767, 32767] and the fractional part [-0.9999, +0.9999] - which gives us 4 digits of precision (good enough for my floating-point needs - albeit wasteful).
It looks to me, like this could be used to make a faster, smaller - just 2 Bytes-big - alternative version of a float with ranges [-99, +99] and [-0.9, +0.9]
The question here is, what other techniques - other than IEEE - are there to make an implementation of basic floating-point functionality (+ - * /
) using fixed-point functionality?
Later on, I will need some basic trigonometry, but there are lots of resources on net for that.
- Since the HW has 2 MBs of RAM, I don't really care if I can save 2 bytes per soft-float (say - by reserving 9 vs 7 bits in an int). Thus - 4 bytes are good enough.
- Also, from brief looking at the 68k instruction manual (and the cycle costs of each instruction), I made few early observations:
- bit-shifting is slow, and unless performance is of critical importance (not the case here), I'd prefer easy debugging of my soft-float library versus 5-cycles-faster code. Besides, since this is C and not 68k ASM, it is obvious that speed is not a critical factor.
- 8-bit operands are as slow as 16-bit (give or take a cycle in most cases), thus it looks like it does not make much sense to compress floats for the sake of performance.
What improvements / approaches would you propose to implement floating-point in C using fixed-point without any dependency on other library/code?
Perhaps it would be possible to use a different approach and do the operations on frac & non-frac parts at the same time?
Here is the code (tested only using the calculator), please ignore the C++ - like declaration and initialization in the middle of functions (I will reformat that to C-style later):
inline int Pad (int f) // Pad the fractional part to 4 digits
{
if (f < 10) return f*1000;
else if (f < 100) return f*100;
else if (f < 1000) return f*10;
else return f;
}
// We assume fractional parts are padded to full 4 digits
inline void Add (int & b1, int & f1, int b2, int f2)
{
b1 += b2;
f1 +=f2;
if (f1 > 9999) { b1++; f1 -=10000; }
else if (f1 < -9999) { b1--; f1 +=10000; }
f1 = Pad (f1);
}
inline void Sub (int & b1, int & f1, int b2, int f2)
{
// 123.1652 - 18.9752 = 104.1900
b1 -= b2; // 105
f1 -= f2; // -8100
if (f1 < 0) { b1--; f1 +=10000; }
f1 = Pad (f1);
}
// ToDo: Implement a multiplication by float
inline void Mul (int & b1, int & f1, int num)
{
// 123.9876 * 251 = 31120.8876
b1 *=num; // 30873
long q = f1*num; //2478876
int add = q/10000; // 247
b1+=add; // 31120
f1 = q-(add*10000);//8876
f1 = Pad (f1);
}
// ToDo: Implement a division by float
inline void Div (int & b1, int & f1, int num)
{
// 123.9876 / 25 = 4.959504
int b2 = b1/num; // 4
long q = b1 - (b2*num); // 23
f1 = ((q*10000) + f1) / num; // (23000+9876) / 25 = 9595
b1 = b2;
f1 = Pad (f1);
}
You are thinking in the wrong base for a simple fixed point implementation. It is much easier if you use the bits for the decimal place. e.g. using 16 bits for the integer part and 16 bits for the decimal part (range -32767/32767, precision of 1/2^16 which is a lot higher precision than you have).
The best part is that addition and subtraction are simple (just add the two parts together). Multiplication is a little bit trickier: you have to be aware of overflow and so it helps to do the multiplication in 64 bit. You also have to shift the result after the multiplication (by however many bits are in your decimal).
Division is similar.
Someone might have implemented almost exactly what you need (or that can be hacked to make it work) that's called libfixmath