
Floating Point to string/ASCII conversion

Ruud

First some background: for years I have been busy developing my own OS, Pascal compiler and Basic interpreter. These are very slow projects. To handle numbers, I need to be able to handle Floating Point. Converting a string/ASCII to single or double precision is no problem. But the other way around surely is. If I look on Google, all answers I find point to built-in functions and not to how it is done at a low level. So I started to create my own method. At the moment I'm using Pascal, with the intention to translate it into assembly later on.

For single precision I use brute force: I created a table with records of nine words filled with the hex values of 1 ... 10E+43 for the integer part, and one table for 0.1 ... 10E-20 for the fraction part. OK, the latter is too small, but I will expand it. That works fine for single precision, but for double precision, which supports values from 10E-308 ... 10E+308, that won't work.

So I looked at the sources of other programs to see how they do it. The first problem: finding well-documented sources. One of the best I found was the source for BASIC V2 of the Commodore 64. What the C64 does with fractions is quite simple:
1. Subtract 0.1 from the fraction as many times as possible; the count is your first digit.
2. Multiply the remainder by 10.
3. Subtract 0.1 from the remainder as many times as possible; the count is your next digit.
4. If the result is not zero, go to step 2.

It does the same with the integer part, but using 1,000,000,000 as the starting value.
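In Pascal terms, that fraction loop boils down to something like the sketch below. This is only my own illustration using an ordinary Real, not the actual BASIC V2 ROM code:

```pascal
program DigitLoopDemo;

{ Sketch of the repeated-subtraction digit loop described above.
  Illustration only, not the actual BASIC V2 ROM code; it uses an
  ordinary Real instead of the C64's 40-bit format. }

procedure FractionToDigits(F: Real; MaxDigits: Integer);
var
  Digit, I: Integer;
begin
  Write('0.');
  for I := 1 to MaxDigits do
  begin
    Digit := 0;
    while F >= 0.1 do            { subtract 0.1 as often as possible }
    begin
      F := F - 0.1;
      Inc(Digit);
    end;
    Write(Digit);
    if F <= 0 then Break;        { nothing left: stop }
    F := F * 10;                 { shift the next digit up }
  end;
  WriteLn;
end;

begin
  FractionToDigits(0.625, 9);    { prints the digits of 0.625, modulo binary rounding }
end.
```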

But this won't work for double precision, I think. I would need over sixty words to represent 10E+308, and even more for the fraction part. So the next idea was subtracting the Floating Point version of 10E+308 from the given number. But.... the Floating Point version of 10E+308 is not exactly 10E+308; there will always be a difference. Is this difference acceptable?

At first I didn't understand why the C64 used the value 1,000,000,000 as its maximum, until I realized that the FP version of 1,000,000,000 is exactly 1,000,000,000. FYI: the C64 uses 40 bits; the single-precision version of 1,000,000,000 is not the same, and I still have to find out what the maximum value is that it can handle.

So my questions:
- Can someone point me to a site that goes deeper into this material and/or gives practical ideas/sources?
- Does anybody have other (better) ideas on how to handle this?

Thank you very much in advance!
 
I found several resources with implementations and paper citations by Googling "fp to string conversion algorithm". These include:

A StackOverflow post with lots of references and links. I think following links and citations from this page will be your best bet!

Someone's implementation of someone else's algorithm (scroll down to the bottom of the page). I suspect the previous link is still probably the most conventional way to go.

Someone's YouTube video presenting their method. Didn't watch it :)

The best answer for you may depend on what your target platform is. The C64/6510 has no hardware integer multiply/divide, which may help explain their choice of algorithm. Do you have it?
 
I don't know if this one suits you, but I thought it was very clever:


Here's a general rundown of several methods, including Ryu:


There's also this:


On a little MPU like a 6502, it might be better to avoid the binary-decimal issue altogether and keep your floating point in BCD. A number of early spreadsheets did this, mostly to keep money decimal. Yes, there's an IEEE standard for that:

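Roughly, the pay-off in Pascal terms looks like this; the record layout is only a sketch of the idea, not the actual IEEE 754-2008 decimal interchange encoding:

```pascal
program DecFloatDemo;

{ Rough sketch: a decimal float kept as plain decimal digits.
  The layout is an illustration only, not the IEEE 754-2008
  decimal interchange encoding. }
type
  TDecFloat = record
    Negative: Boolean;
    Exponent: Integer;                 { decimal exponent            }
    Digits:   array[0..9] of Byte;     { one decimal digit per byte  }
  end;

{ Printing needs no binary-to-decimal conversion at all: the digits
  are already decimal, so we just emit them. }
procedure PrintDecFloat(const D: TDecFloat);
var
  I: Integer;
begin
  if D.Negative then Write('-');
  Write(D.Digits[0], '.');
  for I := 1 to High(D.Digits) do
    Write(D.Digits[I]);
  WriteLn('E', D.Exponent);
end;

var
  X: TDecFloat;
begin
  { 3.141592653E0, stored digit by digit }
  X.Negative := False;
  X.Exponent := 0;
  X.Digits[0] := 3; X.Digits[1] := 1; X.Digits[2] := 4; X.Digits[3] := 1;
  X.Digits[4] := 5; X.Digits[5] := 9; X.Digits[6] := 2; X.Digits[7] := 6;
  X.Digits[8] := 5; X.Digits[9] := 3;
  PrintDecFloat(X);                    { prints 3.141592653E0 }
end.
```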
 
Hello Stepleton and Chuck,

Not being a native English speaker, I didn't think of the word "algorithm". Thank you very much!
 
You're welcome. I thought "algorithm" was Arabic... :)

On the subject of decimal floating point, I recall reading about the following representation in an issue of the HP Journal, many years ago. Basically, instead of encoding the entire mantissa/significand in straight BCD digits, you might take groups of, say, 10 bits and restrict the binary value to 0-999. Conversion of small binary numbers to decimal is easier than with large binary values and you get a bit more bang for the bit than straight packed BCD. Call it a "hybrid" representation.

In other words, a straight BCD representation of 999999 takes 3 bytes or 24 bits, but a hybrid representation takes only 20 bits (two groups of 10 bits).
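A quick Pascal sketch of the packing and unpacking (my own illustration, not the HP Journal code):

```pascal
program HybridPackDemo;

{ Pack three decimal digits (0..999) into one 10-bit group and back.
  My own illustration of the idea, not the HP Journal code. }

function PackGroup(Hundreds, Tens, Ones: Integer): Word;
begin
  PackGroup := Hundreds * 100 + Tens * 10 + Ones;   { 0..999 fits in 10 bits }
end;

procedure UnpackGroup(Group: Word; var Hundreds, Tens, Ones: Integer);
begin
  Hundreds := Group div 100;
  Tens     := (Group div 10) mod 10;
  Ones     := Group mod 10;
end;

var
  G: Word;
  H, T, O: Integer;
begin
  G := PackGroup(9, 9, 9);             { 999 -> $3E7, ten bits }
  UnpackGroup(G, H, T, O);
  WriteLn('group = ', G, '  digits = ', H, T, O);
  { Two such groups (20 bits) cover 000000..999999, versus 24 bits
    for straight packed BCD. }
end.
```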
 
I'm busy reading the various PDFs, but the YouTube presentation by Ulf Adams was rather boring: I'm sorry, but the man is not a good speaker.

I could use your point of view on another idea, one triggered by the C64 using 40 bits. First: is there any reason why I should follow the double-precision standard of 1 sign bit, 11 exponent bits and 52 mantissa bits? I prefer precision over being able to handle huge numbers, and I think the range 10E-38 ... 10E+38 is big enough. (OK, why does the slogan "640 KB is more than enough!" keep popping up in my head?) I can use 64 bits, but then as: 1 sign bit, 8 exponent bits and 55 mantissa bits. The main advantage for me: I can handle that with the method I already have, without having to fall back on FP divisions or multiplications.
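To make the proposed split concrete, here is a small Pascal sketch of unpacking such a 1/8/55 layout; the field positions and the bias of 127 are just my assumptions, not a standard:

```pascal
program Split64Demo;

{ Sketch of unpacking a hypothetical 1/8/55 format from 64 bits.
  Field positions and the bias of 127 are my own assumptions. }

procedure SplitExtSingle(Bits: QWord; var Sign: Integer;
                         var Exponent: Integer; var Mantissa: QWord);
begin
  Sign     := Integer((Bits shr 63) and 1);
  Exponent := Integer((Bits shr 55) and $FF) - 127;     { assumed bias 127 }
  Mantissa := Bits and $7FFFFFFFFFFFFF;                 { low 55 bits      }
end;

var
  S, E: Integer;
  M: QWord;
begin
  { Example: sign 0, biased exponent 128 (i.e. 1), mantissa all ones }
  SplitExtSingle((QWord(128) shl 55) or $7FFFFFFFFFFFFF, S, E, M);
  WriteLn('sign=', S, ' exp=', E, ' mantissa=', M);
end.
```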

The best answer for you may depend on what your target platform is. The C64/6510 has no hardware integer multiply/divide, which may help explain their choice of algorithm. Do you have it?
The 6510 indeed doesn't have multiply or divide. But more important, and I almost forgot, the 6510 is an 8-bit CPU. The C64 at least proved that it can handle 40-bit FP numbers. So if my 64-bit version is too big, I can always fall back to 40 bits as well. In most cases downgrading is easier than upgrading.
 
I wrote the floating point math pack for Sorcim SuperCalc (and Pascal/M) way back in the day, initially for x80, then x86 with 8087 option.

The format for the 8080 was 1 byte for the exponent and 8 for the mantissa. This could be varied at assembly time.

You might want to survey vintage floating point formats on various mainframes. The S/360 one was perhaps the worst of the lot; normalized to a nibble boundary, not a bit boundary.
 
I read the documents and, to my surprise, one of them advised using tables to calculate the FP numbers and to convert them back to ASCII again. And that is what I was already doing! The C64, for example, only uses one set of five bytes and calculates the rest when needed.
I also developed a routine that will be used as the base for the routines that convert ASCII to single or double precision, and it can be used as a base for any format.

FP to ASCII is the next challenge. I decided to use double precision as the base for the moment, but with the limitation of single precision: 10E-38 ... 10E+38. The main reason: I can check my own FP-to-ASCII conversion against Turbo Pascal's own code. When both conversions work as they should, I will switch to my own format: an extended single precision, not with 23 bits for the mantissa but 55.
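In Free Pascal that cross-check can be as simple as comparing against the compiler's own Str(). A rough sketch; MyFloatToStr below is only a stub standing in for the routine under test:

```pascal
program CrossCheck;

{ Sketch of cross-checking a home-grown FP-to-ASCII routine against
  Free Pascal's own Str(). MyFloatToStr below is only a stub that
  falls back to Str(); replace it with the routine under test. }

function MyFloatToStr(Value: Double): string;
var
  S: string;
begin
  Str(Value:0:10, S);      { stub: replace with your own conversion }
  MyFloatToStr := S;
end;

procedure CheckConversion(Value: Double);
var
  Reference: string;
begin
  Str(Value:0:10, Reference);                   { compiler's conversion }
  if MyFloatToStr(Value) <> Reference then
    WriteLn('Mismatch for ', Reference)
  else
    WriteLn('OK        for ', Reference);
end;

begin
  CheckConversion(1000.0);
  CheckConversion(12345.6789123);
  CheckConversion(0.625);
end.
```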
 
Some years ago I wrote up that grisu3 algorithm that Florian Loitsch described (that was linked in a research paper above). You can find somewhat compact C code here:


It might be portable enough to be copy-pasted straight into your project. Feel free to rip and use it in any form for any purpose; no attribution or anything like that needed.
 
Thank you very much, but I think I managed to work things out. "Think", because yesterday evening I was able to translate a string like "12345.6789123" to FP and back to ASCII again, but when I entered "1000", the resulting FP was OK yet I only got "1" back in ASCII. That's how I found out that there was still an error in the routine that creates the ASCII: it couldn't handle writing zeros at the end of an integer. So I more or less expect to run into other bugs, but of course I hope not.
What I still have to deal with is the E-notation, like 1.2345E+23, but I'm confident I will be able to handle that.
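The fix boils down to looping over every digit position and suppressing only leading zeros, so trailing zeros are never lost. A small sketch with plain integers, just to show the control flow (not my actual routine):

```pascal
program IntegerDigitsDemo;

{ Sketch of writing every digit position, suppressing only leading
  zeros, so that trailing zeros (as in 1000) are never lost. Plain
  LongInt arithmetic is used here just to show the control flow. }
procedure WriteIntegerPart(N: LongInt);
var
  Power: LongInt;
  Digit: Integer;
  Started: Boolean;
begin
  Power := 1000000000;               { largest power of ten that fits }
  Started := False;
  while Power > 0 do
  begin
    Digit := N div Power;
    N := N mod Power;
    if (Digit <> 0) or Started or (Power = 1) then
    begin
      Write(Digit);
      Started := True;
    end;
    Power := Power div 10;
  end;
  WriteLn;
end;

begin
  WriteIntegerPart(1000);            { prints 1000, not 1 }
  WriteIntegerPart(0);               { prints 0           }
end.
```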

I'm testing using double precision, but limited to the range 1E-38 to 1E+38, the limitation of single precision, because I can see how Free Pascal stores the numbers and so can check the FP-to-ASCII conversion. The next step is to change from double precision to my 64-bit extended single precision, which is 32-bit single precision plus another 32 bits for the mantissa.

How does it work? I created an array of 24 words: 14 words reserved for the fraction part, ten for the integer part. My extended version of a single-precision number always fits into this array after a conversion. A table with multiples of 10 and one with 0.1, 0.01, 0.001, etc. is used to convert the given ASCII to binary, which in its turn is converted to FP. The conversion back to ASCII is done by translating the FP into binary again and subtracting the multiples of 10 for the integer part and 0.1, 0.01, 0.001, etc. for the fraction part.
The above sounds simple, but translating it into software is another thing. And the next step is translating the whole thing into ML.
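Roughly, the digit extraction on that word array looks like the sketch below; the word order, the layout and the helper names are only illustrations, not my actual code:

```pascal
program FixedPointSketch;

{ Sketch of the fixed-point working array described above: 24 words,
  14 for the fraction part and 10 for the integer part. The layout,
  word order and helper names are my own guesses, only meant to show
  digit extraction by repeated subtraction on a multi-word number. }

const
  FracWords = 14;
  IntWords  = 10;
  NumWords  = FracWords + IntWords;

type
  { index 0 = least significant fraction word,
    index NumWords-1 = most significant integer word }
  TFixed = array[0..NumWords - 1] of Word;

{ Result: True when A >= B (so the subtraction did not underflow). }
function SubIfPossible(var A: TFixed; const B: TFixed): Boolean;
var
  Tmp: TFixed;
  I, Borrow, D: Integer;
begin
  Borrow := 0;
  for I := 0 to NumWords - 1 do
  begin
    D := Integer(A[I]) - Integer(B[I]) - Borrow;
    if D < 0 then begin D := D + $10000; Borrow := 1; end
    else Borrow := 0;
    Tmp[I] := Word(D);
  end;
  if Borrow = 0 then A := Tmp;       { keep the result only if A >= B }
  SubIfPossible := Borrow = 0;
end;

{ One decimal digit: subtract a power-of-ten table entry repeatedly. }
function ExtractDigit(var A: TFixed; const PowerOfTen: TFixed): Integer;
var
  Digit: Integer;
begin
  Digit := 0;
  while SubIfPossible(A, PowerOfTen) do
    Inc(Digit);
  ExtractDigit := Digit;
end;

var
  A, Ten: TFixed;
begin
  FillChar(A, SizeOf(A), 0);
  FillChar(Ten, SizeOf(Ten), 0);
  A[FracWords]   := 47;              { integer part = 47    }
  Ten[FracWords] := 10;              { table entry for 10^1 }
  Write(ExtractDigit(A, Ten));       { tens digit: 4        }
  Ten[FracWords] := 1;               { table entry for 10^0 }
  WriteLn(ExtractDigit(A, Ten));     { ones digit: 7        }
end.
```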

For those who are interested, code and documentation will be free when I'm finished.
 
Just an update: it seems that the conversion from ASCII to FP works fine, and vice versa as well. The use of the exponent still has to be implemented. I'm now busy with the translation to 8088 ASM.

Next/parallel steps:
- exponent
- implementing various functions in FP like multiplication, addition, etc. plus logical functions like AND, OR, etc.
 
Just an update: it seems that the conversion from ASCII to FP works fine, and vice versa as well. The use of the exponent still has to be implemented. I'm now busy with the translation to 8088 ASM.

Next/parallel steps:
- exponent
- implementing various functions in FP like multiplication, addition, etc. plus logical functions like AND, OR, etc.
What would the use case of logical operations on FP numbers be?
 