The "Ahl Benchmark" of BASIC performance

voidstar78 · May 4, 2024

As an excuse to fire up the old systems, and celebrate BASIC now being over 50 years old - I ran the "Ahl Benchmark" on a few hardware systems that I have.

Exploring Project Ahl in 2024 — voidstar

Some time ago, I came across an old Computer World article from 1980 that was comparing “$15,000 computers” (high end equipment at the time). That discussion is here. And the point of those articles was showing how some of the then-new-to-the-world microcomputers were seemingly catching up with...

voidstar.blog

Code:

SYSTEM                        ACCURACY          RANDOM-NESS         TIME(S)   YEAR CPU

IBM 5100 (1975)               0.000001566000    782.326940000000      184.3   1975 PALM
IBM 5110 (1978)               0.000000000346    954.589800000000       40.4   1978 PALM

IBM 5150 (4.77MHz)            0.01159668000       9.317505000000       23.9   1983 8088
NEC PC-8001 (1979)            0.03387450000       1.763920000000       90.7   1979 8088
NEC PC-8001 mk II (1983)      0.03387450000       1.763920000000       88.14  1983 8088
Sharp PC-5000 (1983)          0.00585937500       7.188416000000       29.7   1983 8088

Commodore PET (4032)          0.00104141235       7.323270560000      120.12  1980 6502
Commodore 64 (1MHz)           0.00104141235       1.155792950000      122.4   1982 6502
Apple IIe (1986 Platinum)     0.00104141235       0.821743011000      113.7   1986 6502

Tandy Color Computer 3 (6309) 0.00039628486       5.856625800000      142.5   1986 6309

Tandy Model 102 (1986)        0.00000020580      12.494425927600      295.8   1986 80C85

Commander X16 (8MHz 65C02)    0.00056564807       1.267472270000        7.8   2024 65C02
               
ACCURACY: 0 is perfect, lower is better.  Each run is same accuracy.
RANDOM-NESS: 0 is perfect, lower is better.   1-30 is good, larger is poorer but not bad.
TIME: Elapsed time to run the benchmark in seconds.
YEAR: Initial year the given system was released.


voidstar - May 2024 - REV #1

Eudimorphodon · May 4, 2024

Just for laughs try declaring the numeric variables double precision on the 5150. (Pretty sure BASICA defaults to single.)

Chuck(G) · May 4, 2024

Many modern MCUs feature a shot-noise random number generator. All these others are better termed "pseudo-random".

alank2 · May 4, 2024

NEC PC-8001 uses a Z80 variant CPU, right?

Eudimorphodon · May 4, 2024

A couple comments/questions about the blog post:

1: Were you hand stop-watching these results? Over how many runs?

2: You said this about the NEC-8001:

For the NEC PC-8001, it is the only system that continues to flash the BASIC console cursor throughout the runtime of the benchmark program. This makes it more apparent that during the runtime of any BASIC program, system interrupts are still being invoked even if they have fairly benign activity.

It's remarkably hard to find a *good* schematic/circuit diagram of an NEC-8001 (and it also seems to matter a lot which version you have), but from what I've been able to gather those machines used a programmable CRTC that had hardware cursor and blink support. Granted not every system with a CRTC used those functions, but if the 8001 did then the blinking cursor has *zero* impact on performance. (A bigger issue, really, would be the fact those systems used a DMA controller to shared RAM instead of dedicated VRAM for video refresh; with a Z80 that inevitably comes with some overhead that's going to make the machine at least a little slower than a straight 4mhz Z80 without shared video.)

Anyway, basically *every* microcomputer is going to be keeping interrupts on when it's running a normal software. Most machines have *some* kind of heartbeat interval interrupt they use to keep a timer advancing, machines with dumb matrix keyboards need to periodically scan it (and even machines with encoders that don't use interrupts, like the Apple II, still need to periodically check the keyboard port), etc, etc. Commodore 8-bits, for instance, do their clock updating and keyboard scan on a 60hz vertical refresh interrupt... etc. Even *if* the NEC PC-8001 were manually triggering the cursor instead of the CRTC doing it completely hands off the additional overhead would be microscopic, all told.

3: 6502 stuff:

Note how the Apple IIe, Commodore PET, and Commodore C64 all have the same floating point accuracy result. This is because not only are they using the same 6502 processor, but the floating point code in each of them is derived from the same original Microsoft BASIC source.

I'm actually a little curious why the Commander X16 doesn't have the same "float accuracy"; my extremely vague understanding was that its BASIC was based on the Commodore BASIC source. But, well, I suppose they have hacked the living snot out of it, it's certainly possible they updated the floating point code.

_________________

Just for laughs I copied your version of the benchmark to my Tandy 1000 to play with. I (initially) made one change, in the form of these two lines, because I don't have the patience to stopwatch it.

Code:

11 STARTTIME$=TIME$
111 PRINT STARTTIME$, TIME$

This rounds things to the second, but I don't think hand stopwatching is going to do better than that. Interestingly leaving the rest of the code unchanged gives me these results that are quite a lot different from your IBM 5150 results. (This is using Tandy's BASICA, which is the Tandy 1000-specific version of GW-BASIC.)

Tandy 1000HX (speed)	Float Accuracy	Randomness	Time
7.16mhz	5.859375E-03	7.188416	12 seconds
4.77mhz	same	same	16 seconds

One unfortunate difference is my machine has a V20 instead of an 8088; that probably explains the speed difference (in the 4.77mhz results), but I would have expected the other figures to be the same. Just to double check I tried running the same program in "generic" GW-BASIC", the last newest version (the one included with DOS 4.01), and the results were identical, so... interesting. I'm curious if someone has a V-20 equipped 5150 if their numbers match mine or the 8088 machine's numbers. (IE, is it the CPU, or is there a difference between the code in the IBM BASIC ROM and what went into the stand-alone disk version of BASIC.)

Anyway. I guess something I find interesting about this is it hardly matters at all in the execution time if I use double precision instead of single. Adding these two lines:

Code:

1 defdbl a,r,s
2 defint i,n

Gives you more precise answers, as below, but didn't change the runtime by more than a second. This suggests very strongly to me that what this program is actually measuring in large part is how long the random number generator takes to spit out things. Which... yeah, considering the various methods these BASICs used to come up with their numbers makes this an incredibly useless benchmark.

speed	f/a	rand	time
7.16mhz	5.9387445449777226D-03	7.188560009002686	12 seconds
4.77mhz	same	same	17 seconds

voidstar78 · May 4, 2024

Pardon, yes the NEC PC-8001 uses a Z80 "clone" branded by NEC. I'll correct the notes on that.

(1)
These are all slow systems, so a hand stopwatch is good enough. I did about 5 runs each. The first run let me dial in the time to expect, then all the subsequent runs were all within a quarter second of each other since from the first run I knew about when to STOP.

(2)
Yep, the cursor-blink itself might not impact the runtime, so maybe it's just a reminder that "stuff unseen is going on." Interrupts are firing and doing "who-knows-what." Some might even just be polling if a RESET or PAUSE button of some sort had been pressed, or polling RTC and updating RAM values with current time. Or maybe just checking to see if it is time to scroll the screen (maybe not in these particular systems). Some detect and buffer serial input for you. So a "slow system" might be providing a lot of utility in those interrupts, it just depends. I suspect on the Commodore systems, an interrupt is looking for that CLR button being pressed? (it's odd to me on those system they don't have a "CLS" keyword, but I understand why).

Or at least on the 6502 systems, sliding in your own ISR is easy enough (like to play the next note in some "background" audio).

In general, most BASICs only enable the cursor-blink when at the main input prompt, or when executing an INPUT keyword to poll for some user input. Or the "blink" can be implemented in different ways - usually just inverting a font bit, but like on the Color Computer it also cycles through colors.

(3)
Good observation about the X16. Yes, parts of the math library in the X16 ROM were revamped by Michael Jørgensen around 2020 (around rev 38). There is a note in the ROM source, where SQR in particular got an 80% boost:
<https://github.com/X16Community/x16-rom/tree/master/test/fp>

# Performance measurements of floating point subroutines

Code:

| Test                     | Original | Optimized | Speedup (%) |
| ------------------------ | -------- | --------- | ----------- |
| simple for loop          |   106    |     97    |    8.5      |
| counter                  |   333    |    315    |    5.4      |
| cos calculation          |   225    |    168    |   25.3      |
| multiplication           |   464    |    417    |   10.1      |
| multiplication underflow |   555    |    386    |   30.5      |
| sqr                      |   402    |     77    |   80.8      |

Yes - my 5150 is "bone stock" original 8088, not the V20. And I think it's very possible that BASICA uses different math-implementations (even across versions of MS-DOS that it was packaged with). We'd have to dig into the .COM file to see what calls it re-uses from the System ROM.

On the IBM 5110, I also found that using single vs double precision made little difference in the runtime. That just shows it wasn't the main bottleneck. OR, the difference is superficial - maybe it is still using double internally, but just truncs the output?

I don't think it is entirely useless - it is pretty telling on the exact flow of execution involved on your system, such that slight differences will get noticed. But in comparing systems, it needs some fair consideration. The original author, Ahl, notes one major issue is a system could just alternate its RNG from .1 to .9 and it would appear to be "perfect." This isn't a good benchmark for "raw" performance comparison, but from a casual "end user experience" on "as-delivered" system I think it's a fair assessment.

voidstar78 · May 4, 2024

I've updated about the Z80, and also added two new Z80 systems: the Aquarius+ and Agon Lite 2.

I was a little on the fence calling the Agon Lite 2 a "boots up to BASIC" system - it has no ROM BASIC, so you have to pick and choose a BASIC implementation (which they feature the BBC BASIC 1.06 right now).

Note that the IBM 5100/5110 use a full 8-byte double representation, while most of the microcomputer BASICs use some 5-byte representation. So, the accuracy is pretty good on the 5110.

The Aquarius+ is probably using a single point, hence the precision is fairly poor compared to the rest. But it goes to show how even within the same group of CPU, you can get pretty widely different performance.

Code:

SYSTEM                        ACCURACY          RANDOM-NESS         TIME(S)   YEAR CPU

IBM 5100 (1975)               0.000001566000    782.326940000000      184.3   1975 PALM
IBM 5110 (1978)               0.000000000346    954.589800000000       40.4   1978 PALM

IBM 5150 (4.77MHz)            0.01159668000       9.317505000000       23.9   1983 8088
Sharp PC-5000 (1983)          0.00585937500       7.188416000000       29.7   1983 8088

NEC PC-8001 (1979)            0.03387450000       1.763920000000       90.7   1979 Z80 (Japan-version)
NEC PC-8001 mk II (1983)      0.03387450000       1.763920000000       88.14  1983 Z80 (UK-version)
Agon Lite 2 (bbcbasic 1.06)   0.000257965903      3.262725350000        1.31  2023 Z80
Aquarius+                     0.18780500000       1.203430000000       71.34  2023 Z80

Commodore PET (4032)          0.00104141235       7.323270560000      120.12  1980 6502
Commodore 64 (1MHz)           0.00104141235       1.155792950000      122.4   1982 6502
Apple IIe (1986 Platinum)     0.00104141235       0.821743011000      113.7   1986 6502

Tandy Color Computer 3        0.00039628486       5.856625800000      142.5   1986 6309-upgraded

Tandy Model 102 (1986)        0.00000020580      12.494425927600      295.8   1986 80C85

Commander X16 (8MHz 65C02)    0.00056564807       1.267472270000        7.8   2024 65C02
             
ACCURACY: 0 is perfect, lower is better.  Each run is same accuracy.
RANDOM-NESS: 0 is perfect, lower is better.   1-30 is good, larger is poorer but not necessarily bad.
TIME: Elapsed time to run the benchmark in seconds.
YEAR: Initial year the given system was released.

cjs · May 4, 2024

alank2 said:
NEC PC-8001 uses a Z80 variant CPU, right?

It's a Z80 clone: a NEC μPD780 (sometimes called the μCOM-82). Other than looking at the chip itself, I've never seen any way you can tell the difference from a Z80.

Eudimorphodon said:
It's remarkably hard to find a *good* schematic/circuit diagram of an NEC-8001....

I've only ever seen the one standard one from I/O magazine, so I'm not clear if you think that one isn't good (it seems fine to me) or if you've discovered something I've not come across. Anyway, it's not hard to find; obviously just search for "I/O別冊 PC-8001活用研究." :-) But, for the sake of convenience, I chopped just the schematic out of that book (which includes a correction from the magazine-published version) and stuck it in a repo here:

sch · main · retroabandon / pc8001-re · GitLab

NEC PC-8001 and PC-8801 reverse engineering

gitlab.com

(Note that the one annoying thing about the schematic is that they don't label the chip part numbers directly on the schematic; there are just designations such as "IC8" that you need to look up on the BOM after the schematic to find out that it's a 74LS139.)

Eudimorphodon said:
...(and it also seems to matter a lot which version you have)...

The PC-8001, PC-8001mkII and PC-8001mkIISR are probably best thought of as three different machines with backwards compatibility. (The same is true of many entries in the PC-8801 series.) They differ more amongst each other than, say, the Apple II+ and Apple IIe.

Eudimorphodon said:
(A bigger issue, really, would be the fact those systems used a DMA controller to shared RAM instead of dedicated VRAM for video refresh; with a Z80 that inevitably comes with some overhead that's going to make the machine at least a little slower than a straight 4mhz Z80 without shared video.)

Right. Later PC-8801 (and possibly PC-8001) systems had no-wait-state VRAM but compatibility modes to re-introduce those wait states for software that depended on them for timing (typically games).

voidstar78 said:
I suspect on the Commodore systems, an interrupt is looking for that CLR button being pressed? (it's odd to me on those system they don't have a "CLS" keyword, but I understand why).

On the C64 the interrupt routine is scanning the keyboard matrix and entering all newly pressed keys into a keystroke buffer (15 bytes, if I remember correctly). The one exception is the RESTORE key, which is not a scanned key in the matrix but directly generates an NMI.

voidstar78 said:
Or at least on the 6502 systems, sliding in your own ISR is easy enough (like to play the next note in some "background" audio).

This is quite common on MSX machines, which provide a hook into the 60 Hz/50 Hz interrupt to let you add whatever you like to that.

It would be interesting to run your tests on an MSX machine, actually, since it had a fairly advanced version of 8-bit MS-BASIC, including some key differences in floating point. (All numbers are tokenised in the BASIC text, and FP was single (4-byte) and double (8-byte) precision BCD, with one byte for exponent and the rest for significand. (There's no sign in the tokenisation; that's done with a preceding - token; I don't recall how they dealt with the sign internally.)

Eudimorphodon · May 4, 2024

cjs said:
I've only ever seen the one standard one from I/O magazine, so I'm not clear if you think that one isn't good (it seems fine to me) or if you've discovered something I've not come across. Anyway, it's not hard to find; obviously just search for "I/O別冊 PC-8001活用研究."

I would say the main problem here, sure, is on my end, IE, I'm not conversant in the correct language to search for information about this computer.

Anyway, now that I have an actual schematic of it to look at (all I could find before was a block diagram) it's plain that, yes, this thing has a hardware cursor. So whatever other periodic interrupt overhead it might have blinking the cursor is probably not a component of it.

voidstar78 said:
The Aquarius+ is probably using a single point, hence the precision is fairly poor compared to the rest. But it goes to show how even within the same group of CPU, you can get pretty widely different performance.

Most Microsoft 8 and 16 bit BASICs that support both single and double precision default to single precision; since you didn't specify I'm pretty positive that's what you're getting on the IBM 5150. (The format of the value you put in the table in the block post actually makes it feel like it's a sure thing it was single precision, because it's the number of digits that the single precision results returned for me on GW-BASIC.) Also, FWIW:

Eudimorphodon said:
I'm curious if someone has a V-20 equipped 5150 if their numbers match mine or the 8088 machine's numbers. (IE, is it the CPU, or is there a difference between the code in the IBM BASIC ROM and what went into the stand-alone disk version of BASIC.)

I did the obvious control and ran the benchmark using GW-BASIC in DOSBOX; the precision and randomness numbers agree with my Tandy 1000. So I would *guess* it is a difference in the BASIC code stored in the original IBM ROM implementation and *not* a CPU variance.

Anyway, per my observation setting double precision with GW-BASIC at least hardly changed the execution time *at all*. I'm pretty confident saying that most of the variation in these scores has practically nothing to do with the actual CPU in the machine and it's all about the actual language implementation. I mean, recall my guess about what was actually taking so long for GW-BASIC was the repeated runs of the RND command? Just for laughs I stuck a REM in front of the contents of lines 40, 70, and 90, IE, commented out the square-ing, square root-ing, and the floating point addition at 90, just leaving the two sets of R+RND(1), and at 7.16mhz it cut the runtime from 12 seconds to 8. And then, to really take it over the top, I converted lines 45 and 75 to just "A-RND(1)", getting rid of the addition entirely, all the nested loops are doing is getting, what, 2000 random numbers? And... that takes about six seconds.

Trimming it down to the very, very minimum:

Code:

1 DEFDBL R
2 DEFINT N
20 FOR N=1 to 2000
75 R=RND(1)
95 NEXT N

IE, getting rid of the conversion of the iterator N from integer to float at the start of each cycle and ditching the nested loops, takes it down to 4 seconds at 7.16mhz.

So... yeah. It looks like at least a third of the runtime of this program when run on GW-BASIC is *just* getting the random numbers, and it also looks like the overhead of running the FOR loops is a significant factor; at least with GW-BASIC the time actually spent on doing the arithmatic is a minor enough component that it doesn't even matter if it's single or double precision.

You know what's really a hoot? Since it was there I decided to try running the program using several other BASICs I have installed on my Tandy 1000, IE, the "qbasic" interpreter included with DOS 5, and then I tried compiling it with actual QuickBASIC 4.5 for DOS. Here are the results in single and double precision at 7.16 mhz:

lang/precision	f/a	randomness	time
gwbasic/double	5.938744544977226D-03	7.188650009002686	12 seconds
qbasic/double	7.275846591880963D-13	6.234999179840088	34 seconds
qb/double	same	same	31 seconds
qbasic/single	1.152344E-02	6.234985	35 seconds
qb/single	same	same	30 seconds

These results make my head hurt, because usually qbasic is a *little* faster than GW-BASIC, while compiled QuickBASIC will completely and utterly blow the doors off it. (Well, at least with code that uses integers as much as possible.) But, no, both newer BASICS take three times longer than GW-BASIC.

What's really odd is that loop of 2000 RNDs takes the same 4 seconds as GW-BASIC. So apparently the two BASICs use the same method to generate random numbers... but why in the world is the math performance so much worse on Q[uick]Basic? The "float accuracy" for double precision *is* much, much higher than GW (if the numbers on your table are to believed it actually crushes *all* other contestants), but the speed doesn't improve with single precision? Does QuickBASIC do everything internally as double even if you say single?(*)

(*EDIT: Looked it up, apparently QBASIC/QuickBASIC use full IEEE representation for floating point numbers instead of Microsoft's binary format. So it's legit lugging a much bigger stone around its neck. If I had a numeric coprocessor that'd be great, but, well, I don't.)

Also, just for grins I tried this in a "cycle-accurate" TRS-80 Model I emulator (1.77mhz Z80) with both single and double precision specified. If I get a chance I'll try it on my real TRS-80 to confirm the speed is correct. Just like GW-BASIC single vs. double precision seems to make very little difference. I have to admit I find this rather puzzling. Is single vs. double more about saving memory than performance?

	f/a	randomness(*)	time
double precision	.03404722213744549	8.140957355499268	2:19
single precision	.0338745	23.7814 ... or 6.61066 ... or...	2:15

* the "randomness" value varies *a lot* between different runs on the TRS-80... which triggers vague memories that GW-BASIC uses the same seed every time you RUN unless you use the RANDOMIZE command at the start of a program, while the TRS-80 just uses the value of the DRAM refresh register as the seed unless you give it a specific one with RANDOM?

EDIT: Did some more reading on RND in Level II basic, and it says the value returned by RND(0) is always a single-precision value. I'm willing to wager that the bulk of the time difference between the single and double precision TRS-80 runs is just converting the output of RND into double on those lines.

Anyway, again... as a real world *hardware* benchmark I still have to say this thing... leaves a lot to be desired. Unless, I dunno, you're writing a lot of BASIC software that *really* relies on random numbers and will only consider in your evaluation the software that comes in ROM? What this *doesn't* do is tell you anything meaningful about the actual hardware of the computer. As *all* the numbers prove, your mileage is all down to the software implementation of your interpreter, not the hardware under it.

Chuck(G) · May 4, 2024

Didn't Byte or one of the other rags run a pretty wide comparison among BASICs/CPUs in the 1970s/80s? I think I hung unto a copy somewhere in my files.

Also, has anyone ever ported LINPACK to anyone's BASIC just for yucks?

voidstar78 · May 4, 2024

Maybe GW-BASIC has a lookup table for SQR's of 100 or less? On the 5150 I'm using the ROM BASIC (C1.10) from 1981 that it boots into when no system disks are available, not the GW-BASIC on any of the DOS boot disks. And my results for the 5150 match the original articles results for "IBM PC".

And my TRS-80 Model 1 isn't running right now, so I'd be curious on an on-hardware result on that (or a Model 3).

I looked around at MSX systems, but there are just too many variations of them across the markets - I don't know what a "typical" MSX is. It seems those originals were 8KB systems, so the MSX2 spec might be a bit more enjoyable. Actually, in the original article about this benchmark, they include a SpectraVideo 318/328 - wasn't that the ancestor to MSX?

krebizfan · May 5, 2024

The major set of benchmarks in BASIC were the Kilobaud suite. Those were important enough that the pre-release PC-DOS disk found a while back had those benchmarks along with a version of ROM BASIC that could be loaded from disk. I have linked in the past to a PDF that included results of those tests and the PCW enhanced tests and some additional tests including implementations done with other languages. https://en.wikipedia.org/wiki/Rugg/Feldman_benchmarks has a description of the benchmarks.

The Ahl benchmarks seems more in line with a lot of calculator tests that checked for accuracy after complex operations that are then reversed* and speed. The randomization test was done to see if the random number generator was balanced enough to be reliable for statistics use.

* Getting a square root followed by squaring the result was a staple of HP versus TI comparisons.

voidstar78 · May 5, 2024

Eudimorphodon said:
* the "randomness" value varies *a lot* between different runs on the TRS-80... which triggers vague memories that GW-BASIC uses the same seed every time you RUN unless you use the RANDOMIZE command at the start of a program, while the TRS-80 just uses the value of the DRAM refresh register as the seed unless you give it a specific one with RANDOM?

I'd say values from 0.5 on up to 20 or 30 is fine and can be considered within the same "quality tier." The original author speaks to this in the article. I generally just noted the best randomness value result of the runs I did. Higher values aren't necessarily bad (for purposes of modeling random events), just "poorer." The author also notes a loophole, if the RND just alternates between .1 and .9 then it would artificially appear like it is doing a very good job in this metric.

For each system, I did have to check by running RND(0) and RND(1) a couple of times, and make sure it was producing different values. It two back to back calls did the same value, then it is just re-initializing the seed. So you're right, there is a lot of inconsistency on some systems on if they "auto randomize a seed" when doing a RUN, or if you have to explicitly call RANDOMIZE (or something like RND(-TI) on Commodore systems).

Speak of Commodore systems, RND(.) parses faster on that system. Which is interesting because, to me, that means it is actively re-tokenizing during runtime. There are some BASIC implementations that tokenize "up front" (as soon as you press ENTER on the line of code). Plus if you type the code out of order, that can impact the linked list traveled when executing the code (even if you go out of order inadvertantly, like to add or remove or correct lines later on). I don't think that'd have a huge impact on a short program like this. But notice how some BASICs will re-format your original line input, to subjectively format things better (or at least more uniform). The ones that I do that, I suspect are tokenizing just once (when the line is entered, or maybe when doing a RUN or LIST). But some more compact BASIC's might actively re-token on the fly (perhaps ones that intend to run on limited 4KB RAM systems, and they don't want to hold the tokenized version of each line in memory- so they tokenize as they run, even within a FOR loop).

Chuck(G) · May 5, 2024

Dunno--square root, IIRC, converges pretty quickly--and inverse square root can be even faster {https://en.wikipedia.org/wiki/Fast_inverse_square_root). Of course, if you're interested in integer results, it's almost trivial.

Chuck(G) · May 5, 2024

krebizfan said:
The major set of benchmarks in BASIC were the Kilobaud suite.

That was the one I was thinking of, thank you!

cjs · May 5, 2024

voidstar78 said:
I looked around at MSX systems, but there are just too many variations of them across the markets - I don't know what a "typical" MSX is. It seems those originals were 8KB systems, so the MSX2 spec might be a bit more enjoyable. Actually, in the original article about this benchmark, they include a SpectraVideo 318/328 - wasn't that the ancestor to MSX?

They are all very similar; that was the whole point of MSX. The only thing you really need to worry about is MSX 1 vs. MSX 2 (and maybe 2+ etc.) if you are doing graphical stuff. But for your test, they're all effectively the same.

And yes, 8 KB was the minimum RAM allowed on an MSX system, but that was pretty rare; most of the earlier MSX1 machines had 16 or 32 KB, and later many of them went to 64 KB. (MSX 2 and above could go a lot higher.)

But this, again, makes no difference to your benchmark, since that will fit easily in a few KB. (And from BASIC you don't get access to more than 32 KB of memory, anyway.)

voidstar78 said:
Speak of Commodore systems, RND(.) parses faster on that system. Which is interesting because, to me, that means it is actively re-tokenizing during runtime.

Not quite. Technically, in the earlier MS-BASICs the numbers are not tokenised at all; they are parsed when the parser-of-tokenised-text reads the line. (Later MS-BASICs, such as MSX-BASIC, do parse numbers to tokenised form when they process the line and add it to the "text" area.)

voidstar78 said:
But some more compact BASIC's might actively re-token on the fly (perhaps ones that intend to run on limited 4KB RAM systems, and they don't want to hold the tokenized version of each line in memory- so they tokenize as they run, even within a FOR loop).

For MS-BASIC, the tokenised line is the only one you have in memory after you've entered it. the LIST, SAVE "...",A etc. commands all detokenise to produce their output.

voidstar78 · May 5, 2024

So any whitespace itself is also counted as a token? On the "mainframe" style BASIC of the 5110, it forces "minimal" whitespace. On the Agon Lite BBC BASIC, it does things like auto-identing the lines between a FOR and the corresponding NEXT (when doing a LIST). The SuperBASIC on the Feonix does some "syntax highlighting" of keywords during a LIST.

Might there be "two-stages" to the tokenization? For example, a given line might remain "dirty" until it is encountered during a RUN. This would allow syntax-invalid lines temporarily, that maybe haven't been tokenized yet. The 5110 doesn't allow this, if you mistype "PRNT" it forces you to correct that before proceeding.

And I've wondered about "self modifying BASIC" - such as if you could stream into std console, and inject new lines (or modifying exist ones) during the RUN of a BASIC program?

Eudimorphodon · May 5, 2024

In MS Basic tokenized command bytes had the high bit set (IE, they were all 80h or higher), while other text was stored raw. So effectively yes, space is a “token” that’s preserved and needs to be skipped over by the interpreter. This does mean that a tightly packed MS-BASIC program with no spaces can run *slightly* faster than a nicely formatted one.

On TRS-80 BASIC at least a well-known optimization technique for doing fast animated displays was something called “string packing”, where you’d use self-modifying techniques to locate the in-memory representation of a string you’d initialized with blanks or whatever and fill it with graphics characters, cursor positioning codes, etc. (The print statement was very fast at taking strings and shoving them into the terminal driver, so this technique was practically the sprite graphics of the era. Games like Dancing Demon and Voyage of the Valkyrie were BASIC/machine language hybrids that made good use of it.).

Anyway, if you LISTed a program that had packed strings embedded in it you could end up with a screenful of random BASIC keywords because the LIST would expand all those high graphics characters into BASIC keywords. Obviously that wasn’t super great if you had to edit the program later.

TRS-80 Tips and Tricks – String Packing and USR() Routines – Ira Goldklang's TRS-80 Revived Site

WSM · May 5, 2024

This got me curious to see how the eZ80F91 at 50 MHz on my Min-Ez system using BBC BASIC 3.00 under CP/M 2.2 would compare to the AgonLite 2. The accuracy is the same and the random-ness is all over the map from ~2.09 to ~23.77. Since I don't have internal BBC BASIC timing routines, the execution time is too quick to reliably use STOPWATCH. The Min-eZ is running at 50/18.342 MHz = 2.7 times the speed of the Agon Lite 2 which would equate to roughly 0.48 seconds.

krebizfan · May 5, 2024

http://justsolve.archiveteam.org/wiki/Tokenized_BASIC links to descriptions of many tokening systems. MS BASICs could include spaces in the file listing; Sinclair BASIC stripped out all the white space. Most avoided going with large amounts of leading spaces in a line. Narrow screens and narrow printers were common. With potentially 5 character line numbers and a space, that only leaves 16 characters before reaching the opposite side of the screen on a VIC-20.

Self modifying BASIC code was possible on the micro BASICs. Mini and mainframe BASICs made it a lot harder. Not impossible since that is what happens during debugging when a new statement replaces the old.

VCF West	Aug 01 - 02 2025,	CHM, Mountain View, CA
VCF Midwest	Sep 13 - 14 2025,	Schaumburg, IL
VCF Montreal	Jan 24 - 25, 2026,	RMC Saint Jean, Montreal, Canada
VCF SoCal	Feb 14 - 15, 2026,	Hotel Fera, Orange CA
VCF Southwest	May 29 - 31, 2026,	Westin Dallas Fort Worth Airport
VCF Southeast	June, 2026	Atlanta, GA

The "Ahl Benchmark" of BASIC performance

Veteran Member

Veteran Member

25k Member

Veteran Member

Veteran Member

Veteran Member

Veteran Member

Veteran Member

Veteran Member

25k Member

Veteran Member

Veteran Member

Veteran Member

25k Member

25k Member

Veteran Member

Veteran Member

Veteran Member

Experienced Member

Veteran Member