So my FPGA GPU has come on in leaps and bounds, and now I'm building an FPU into it as well, to speed up the old Mandelbrot drawing and as a precursor to 3D graphics.
Part of testing the FPU is getting it to work with BBCBASIC, which uses floating-point maths for its core arithmetic and so should benefit from some hardware acceleration. Not being an assembly expert or a maths professor, I'm having a little trouble switching the code around in BBCBASIC's fpp.z80 module to use the hardware FPU instead of its own built-in software floating-point routines. The hardware multiply returns values that are exactly twice as large (x2) as they should be, and I don't have the assembly-fu to work out why.
If there's anybody on this forum who knows enough about BBCBASIC or Z80 assembly in general to spot the mistake, I'd be really pleased to hear from you!
I've attached the modified fpp.z80 module for information - lines 423-520 are where all the action is at.
My FPU takes two 32-bit values, which the Z80 writes to 8 I/O ports. I/O ports 224-227 hold the first value (A) in Little-Endian format, and ports 228-231 hold the second value (B), also Little-Endian. The product is calculated instantaneously (as far as the Z80 is concerned, even at 8 MHz) and the Z80 can read it back in the same Little-Endian format from I/O ports 232-235 (port 232 is read first to streamline the assembly slightly).
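To make the port map above concrete, here's a rough Python model of it. This is only a sketch: `FpuModel`, `out`, `inp` and `fpu_multiply` are made-up names for illustration, the model works directly on IEEE-754 single-precision values, and it ignores the format translation the real FPU does. Results too large for single precision would raise an error here rather than behave like the hardware.

```python
import struct

# Base ports taken from the description above.
PORT_A, PORT_B, PORT_RESULT = 224, 228, 232

class FpuModel:
    """Toy stand-in for the hardware multiplier (hypothetical, not the FPGA logic)."""

    def __init__(self):
        # Ports 224-231: operand A (4 bytes) followed by operand B (4 bytes).
        self.regs = bytearray(8)

    def out(self, port, value):
        """Model a Z80 OUT to one of the operand ports."""
        self.regs[port - PORT_A] = value & 0xFF

    def inp(self, port):
        """Model a Z80 IN from one of the result ports (232-235)."""
        a, b = struct.unpack("<ff", bytes(self.regs))
        result = struct.pack("<f", a * b)  # little-endian single-precision product
        return result[port - PORT_RESULT]

def fpu_multiply(fpu, a, b):
    """Write both operands byte by byte, then read the product back, low byte first."""
    for i, byte in enumerate(struct.pack("<f", a)):
        fpu.out(PORT_A + i, byte)
    for i, byte in enumerate(struct.pack("<f", b)):
        fpu.out(PORT_B + i, byte)
    raw = bytes(fpu.inp(PORT_RESULT + i) for i in range(4))
    return struct.unpack("<f", raw)[0]
```

For example, `fpu_multiply(FpuModel(), 1.5, 2.0)` goes through all twelve port accesses and comes back with the product as a Python float.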
The FPU translates the floating-point number format automatically: it works with IEEE-754 single-precision floats, whereas BBCBASIC works with 40-bit 8080 Altair BASIC-format floats. I'm aware of that difference and have dealt with it already (it only involves moving the SIGN bit around and dropping the last byte).
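As a rough illustration of that translation, here's a Python sketch of the bit shuffling. It assumes the 40-bit value is a 32-bit mantissa whose top bit holds the sign (in place of an implied leading 1) plus an 8-bit exponent byte. The function names and the `bias_adjust` parameter are hypothetical: `bias_adjust` stands for whatever difference there may be between the two formats' exponent conventions, which I haven't confirmed, and zero is not handled.

```python
import struct

def bbc_to_ieee_bits(mantissa, exponent, bias_adjust):
    """Repack an assumed 40-bit layout into IEEE-754 single-precision bits.

    mantissa:    32-bit int, bit 31 = sign (stands in for the implied leading 1).
    exponent:    8-bit biased exponent byte.
    bias_adjust: ASSUMPTION, amount subtracted from the exponent byte to get
                 the IEEE exponent field; depends on the source format's bias.
    """
    sign = (mantissa >> 31) & 1
    fraction = (mantissa >> 8) & 0x7FFFFF   # drop the low byte, keep 23 fraction bits
    ieee_exp = (exponent - bias_adjust) & 0xFF
    return (sign << 31) | (ieee_exp << 23) | fraction

def bits_to_float(bits):
    """Interpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

With mantissa `0x20000000` and exponent byte `0x82`, `bias_adjust=1` decodes to 5.0 and `bias_adjust=0` decodes to 10.0, so the exponent convention alone changes the decoded value by a factor of 2.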
I'd just assumed that the BBCBASIC floating-point multiplication code I've commented out in the fpp.z80 module returned the same value that the GPU FPU does, but apparently it doesn't. It looks like there's an additional x2 multiplication going on somewhere after FMUL: is called in the BBCBASIC code, and I just can't work out where. I don't want to have to add a 'divide by 2' to the FPU output in the GPU, as that seems like a real fudge fix.
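A clean factor of 2 is what an exponent that is off by one looks like, so that's one thing worth ruling out. The Python sketch below is only a hypothetical model, not a diagnosis: `bump_exponent` and `multiply_with_skew` are made-up names, and the assumption is that both operands reach the multiplier with their exponent field off by the same amount while the result has that offset undone only once.

```python
import struct

def bump_exponent(x, delta):
    """Return x with its IEEE-754 single-precision exponent field changed by delta."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exp = ((bits >> 23) & 0xFF) + delta
    bits = (bits & ~(0xFF << 23)) | ((exp & 0xFF) << 23)
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def multiply_with_skew(a, b, skew):
    """Hypothetical model: each operand's exponent is off by `skew` going in,
    and the product's exponent is corrected by `skew` once coming out."""
    product = bump_exponent(a, skew) * bump_exponent(b, skew)
    return bump_exponent(product, -skew)
```

In this model `multiply_with_skew(3.0, 1.5, 0)` gives 4.5, while a skew of 1 gives 9.0: each operand is seen as twice its size, the product is four times too big, and undoing the offset once leaves it twice too big.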
Anyone have any ideas?