The running PDP-8/SBC6120 software project/questions thread

commodorejohn

So I'm dusting off something I started on a year or so back - I had wanted to cobble together a modest game project to demo on my SBC6120 at VCF West. Unfortunately, other stuff piled up on my to-do list and I ended up missing it anyway due to car trouble. I'd like to see if I can actually make a go of it this year, so I'm starting a thread in the hopes that that'll help keep me a little more on-task. Not a whole lot to share at present other than the basics: the aim is to throw together a simple roguelike, since that's something well-suited to terminal I/O, just a little ways out from being period-appropriate (the development of Rogue began around 1980 which was past the -8's heyday, but there were still plenty out in the field,) and not too taxing (I figure the game state for a simple one could fit comfortably within one 4K field, though text and program code are another matter.)

For a couple reasons (I want to keep code size down so there's more room for other stuff, I'd like to abstract it out a bit for portability, and I don't want to get too bogged down in PDP-8 assembler optimization,) I'm going to be building this on top of a simple interpreter (not bytecode..."wordcode?") implementing a basic stack machine, with performance-critical or overly juggly bits done in assembler. I'm currently in the middle of building that; I have a modest chunk of the core opcodes implemented, plus a few more complex utility routines, but there's still a bit left to go on that before I start the real work of actually building the game program.

While I'm working on that, though, I thought I'd see if any of the -8 enthusiasts 'round here wanted to weigh in on one of the bits that I had to think for a bit before coming up with a decent solution for: arithmetic-right-shift. My first attempt was pretty clunky and involved a frustrating amount of register-juggling, but after sitting down and having a good think about it, I was able to cut the process down to four instructions (minus the general overhead.) Just curious if anyone knows a better way to do it - here's what I've got now:
Code:
	CLA CLL CML RAR			/ Load AC with 4000
	TAD variable			/ Load/add the operand
	/ L now contains the original MSB, while the real MSB is inverted
	RAL				/ The inverted MSB is now in L
	CML RTR				/ Normalize the MSB and shift into place
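
For anyone who wants to check the sequence without an assembler handy, here is a rough Python model of it. This is only a sketch: the helper names `ral`, `rar`, and `asr` are made up for illustration, and L:AC is treated as a single 13-bit rotating register.

```python
MASK = 0o7777  # 12-bit accumulator


def ral(link, ac):
    """Rotate the 13-bit L:AC register left by one bit."""
    combined = ((link << 12) | ac) << 1
    combined = (combined | (combined >> 13)) & 0o17777
    return combined >> 12, combined & MASK


def rar(link, ac):
    """Rotate the 13-bit L:AC register right by one bit."""
    combined = (link << 12) | ac
    combined = (combined >> 1) | ((combined & 1) << 12)
    return combined >> 12, combined & MASK


def asr(value):
    """Model of the four-instruction arithmetic shift right."""
    # CLA CLL CML RAR: AC = 4000, L = 0
    link, ac = rar(1, 0)
    # TAD variable: 12-bit add, a carry out complements L
    total = ac + value
    if total > MASK:
        link ^= 1
    ac = total & MASK
    # RAL: the inverted MSB moves into L, the original MSB into the LSB
    link, ac = ral(link, ac)
    # CML RTR: restore the MSB in L, then rotate right twice
    link ^= 1
    link, ac = rar(link, ac)
    link, ac = rar(link, ac)
    return ac
```

Comparing `asr(v)` against a signed shift for all 4096 operands is a quick way to confirm that the sign bit is duplicated correctly.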
 
Your solution looks good. The rest of this is just spit balling ideas. Maybe you will see something in it you didn't think of.

The first solution that came to mind:
Code:
        CLA CLL
        TAD variable
        SPA
        CML
        RAR

This is one instruction longer than your solution unless we can assume that the AC and L are initially clear, in which case the two are the same length and the positive-variable path executes one instruction fewer. A lot of optimization depends on what comes before, since the way the AC and link were left by the previous operation is important. If the variable happens to be in the AC already, then this works:

Code:
        CLL
        SPA
        CML
        RAR

This is the fastest code sequence to copy the sign bit into the L without mucking up the AC.

One other solution I came up with is not particularly good but there might be a situation where the idea works:
Code:
        CLA
        TAD variable
        TAD variable
        CLA
        TAD variable
        RAR

And this is the same as the previous but slightly faster:
Code:
        CLA
        TAD variable
        RAL
        CLA
        TAD variable
        RAR

Good luck with your project! I look forward to seeing it.
 
Yeah...unfortunately, while AC is known to be clear going in, L is unknown (I'd have to add an instruction to the dispatch routine to clear it, and that'd just slow things down for most opcodes where it doesn't even matter.) The first alternative would still work, though.
 
Sounds like you are writing an interpreter. One of the speed-ups I came up with is placing the start of the dispatch routine on page zero. That way you can jump directly there, saving 1.2 microseconds by eliminating the defer portion of the JMP instruction cycle. A defer cycle adds 1.2 us on the 6120 CPU. It is worse on all the other CPUs. You might also consider placing the interpreter's PC in one of the auto-increment locations, although I was never able to make this work to my advantage since the auto-increment takes place before the defer.

I wrote an 8080 emulator in 1976 and finished debugging it about 2 years ago when a paper tape copy resurfaced. I then spent a considerable amount of time speeding it up. It runs 8080 code on the 8/a at about 1/60th the speed of a 2 MHz 8080. It runs MITS BASIC reasonably.

There can be a huge speed advantage to using the PC in the PDP-8 to keep track of the execution and eliminate the dispatch code. Every instruction turns into a JMS to a destination on page 0 or a JMS I through an address on page zero. The downside to this approach is that your program ends up living entirely in one field, so it is limited to 4K of code. If you treat it as a Harvard architecture and limit the code space to 4K, the rest of memory is your data space. This might work for your project.
 
Yeah, the zero-page dispatch and auto-increment IP are good notions - had those in place already. Didn't know about auto-increment taking place before the defer, though - I don't think that should be an issue, but it will mean needing to add a fix-up instruction to the jump/call routines.

Subroutine-threaded code is an interesting notion - it limits you to 128 opcodes, but that's not a huge deal; the other downside is that I wouldn't be able to use the scheme I'm currently using where the MSB of the opcode determines whether it's a constant or an instruction, so every constant push would require a preceding instruction - plus, if I'm understanding the extended addressing scheme correctly, it'd require a copy of the interpreter in every program field. But it would represent a major speed-up...I'll have to have a think on that.

For reference, this is the dispatch code as it currently stands:
Code:
const,	/ CONST routine - pushes opcode in AC onto the stack as a constant
	mql					/ save the value for later
	cla cma					/ load AC with -1
	tad z psp				/ load/decrement stack pointer
	dca z psp				/ save the new pointer
	swp					/ retrieve the value
	dca i z psp				/ save it to the stack
	/ falls through to:
next,	/ NEXT routine - assumes AC is clear on entry
	cdf 1					/ set data field to current program location
	tad i z ip				/ get the next opcode
stkfld,	cdf 0					/ select the stack field
	sma					/ if the MSB is clear,
	jmp z const				/ push it as a constant
	dca z tmp				/ otherwise, save it as a dispatch address
	jmp i z tmp				/ and dispatch
 
I am guessing you haven't run this yet. CDF 1 won't do what you want. It is CDF 10 to select field 1. The opcodes are built with a logical or of the individual codes. With CDF the encoding is 62n1 where the selected field replaces the n. CDF 1 would generate 6201 the same as CDF 00 which selects field 0. CDF 10 will give you 6211 which selects data field 1.
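
To make the 62n1 encoding concrete, here is a small Python sketch. The names `pal_or` and `cdf` are made up for illustration; the only assumptions are the ones stated above, namely that the assembler ORs the operand into the base opcode 6201 and that the field number sits in the "n" digit.

```python
CDF_BASE = 0o6201  # CDF with field 0 selected


def pal_or(opcode, operand):
    """Combine an opcode and its operand with a plain bitwise OR."""
    return opcode | operand


def cdf(field):
    """Build a CDF instruction for a field number 0-7 (the n in 62n1)."""
    return CDF_BASE | ((field & 0o7) << 3)


print(oct(pal_or(CDF_BASE, 0o1)))   # "CDF 1": the 1 vanishes into the low bit
print(oct(pal_or(CDF_BASE, 0o10)))  # "CDF 10": selects data field 1
```

The operand 1 lands on a bit that is already set in the base opcode, which is why `CDF 1` assembles to the same word as `CDF 00`.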

I am assuming that ip is in one of the auto increment registers. If I understand correctly your interpreter code is in the upper half (4000 - 7777) of some field. You can generate the jump/call destination addresses as destination-1 and then you won't have to patch up the destination on the fly when you load the new ip.
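
A rough Python model of the destination-1 idea, assuming ip lives in an auto-increment location that is bumped before it is used as an address. The class and method names are hypothetical and exist only for this sketch.

```python
class AutoIndexPointer:
    """Instruction pointer kept in a pre-incrementing auto-index location."""

    def __init__(self, memory):
        self.memory = memory  # maps 12-bit address -> word
        self.ip = 0

    def jump(self, destination_minus_one):
        # The program supplies destination-1, so no fix-up is needed here.
        self.ip = destination_minus_one & 0o7777

    def fetch(self):
        # The pointer is incremented first, then used as the address.
        self.ip = (self.ip + 1) & 0o7777
        return self.memory[self.ip]
```

With the jump target stored as 3777, the first fetch reads location 4000, which is the word the jump was meant to reach.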

This bit is well coded. I tried several variations and was not able to improve on it. My only comment is the use of the MQL and SWP which don't exist on the 8/s, 8/l or on other machines that don't have an EAE. Not certain if MQ functionality is present on 8/i or 8/e without EAE but I think it is there on the 8/a (M8315 CPU). It is not that big a deal to fix if someone wanted to run on a different platform.

It was pointed out to me a few years ago that the Z (page zero selector) is not needed and is in fact ignored by PAL, and that I was a luddite for using it. My personal feeling is that PAL should recognize the Z and issue a warning if you specify a Z for a variable that is not in page 0. It should also issue a warning if the Z is not given and the variable is on page 0. That is not the way it works, but I continue to use the Z where appropriate, and someday I will write a PAL with better error checking. PAL lets you do things that can't work without any notification.

Palbart doesn't care about case. I never tried to feed lower case into real PAL. I wonder if it would work? Your code looks strange to me because it is in lower case.
 
Hah, yeah, I haven't even tried a build yet as I still have more pretty core stuff left to go - but as regards the CDF instruction, you're right that it's wrong; init code and dedicated far-jump/far-call instructions are meant to replace that with a properly formatted field change to the current program field. Though I suppose it might as well be a correctly-formatted wrong initial value. As far as the EAE instructions go, my initial target is the 6120 (and the 8/e, if anybody out there should happen to want to try this when it's functional,) so I'm not concerning myself with other models at the moment - but, as you point out, it's fairly trivial to modify the code to not use the extra registers, especially since the real program is going to be running on top of this abstraction. Just adds more juggling with temporary variables into the equation, is all.

Anyway, the more I think about the subroutine-threaded approach you brought up, the more I think I'm going to redesign around that; the lack of a shortcut for constants is a bummer, but in practice any negative/large unsigned numbers would require a separate instruction anyway. Having to keep a copy of the interpreter in every program field is possibly a bigger deal, but looking at the average opcode-implementation routine I don't think the core interpreter code is going to end up that large - and the speed gain from reducing the whole dispatch routine from eight instructions down to two should be substantial, even given that they're both deferred.
 
I saw some code a long time ago and came across one of those things that I have used ever since. In a place where the code gets modified like a computed CDF or the entry point to a subroutine this guy used .-. as the placeholder. It is a visually distinctive thing and the code that gets generated for it is a 0.
Code:
        *200
START,  JMS JUNK
        JMP I C7605     /QUICK RETURN TO OS/8
C7605,  7605

JUNK,   .-.             /JUNK ENTRY POINT RETURN ADDRESS
        JMP I JUNK      /RETURN
        $

Looks like that. Whenever you see that you know it is going to get overwritten.
 
The 8/E also has the MQ, and the instructions directly related to it, even without an EAE. Same as the 8/A.
The 8/I does not.
Also, SWP only exists on the 8/E and newer.

The thing with the Z flag is a bit more complicated. Depending on which version of PAL you are using, an explicit Z might or might not be needed. PAL8, which is used in OS/8, does not need the Z indication. It's only there to be backwards compatible with other PAL versions. PAL8 explicitly checks whether the high 5 bits of an address are 0, and if so, it sets the Z flag in the instruction. The Z indicator itself is ignored, and you will not get any warning if it is used incorrectly.
A little twist to all this is that the 6120 is "funny" in that the index registers (addresses 10 to 17 in page 0) only act properly if you have the Z flag set in the instruction. So if you have code which actually resides in page 0, which does indirect references through the index registers, and which does not have the Z flag set, then the registers are not incremented. Other PDP-8 models do increment in this situation.

Anyway, bottom line: Z is not really needed or used in PAL8, and is only sort of silently accepted in code to be backward compatible.
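
As an illustration of the address check described above, here is a minimal Python sketch of how an assembler might choose between page-zero and current-page addressing. The function name `encode_mri` is made up, and this is not PAL8's actual code. It follows the usual PDP-8 memory-reference layout: 3-bit opcode, indirect bit 0400, page bit 0200 (clear for page zero, set for the current page), and a 7-bit offset.

```python
def encode_mri(opcode, target, here, indirect=False):
    """Encode a memory-reference instruction located at address `here`."""
    word = opcode | (target & 0o177)  # 7-bit offset within the page
    if indirect:
        word |= 0o400
    if target >> 7 == 0:
        pass                          # high 5 bits are 0: page-zero form
    elif target >> 7 == here >> 7:
        word |= 0o200                 # same page as the instruction
    else:
        raise ValueError("target is on neither page zero nor the current page")
    return word


print(oct(encode_mri(0o1000, 0o0010, 0o7200, indirect=True)))  # TAD I 10
print(oct(encode_mri(0o5000, 0o7205, 0o7200)))                 # JMP to 7205
```

The decision is made purely from the address, so a stray or missing Z in the source never changes the generated word.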
 
Interesting. Might as well leave it there for clarity/just in case, I figure. Trying to figure out how that 6120 example would work, though - you'd have to be running out of code located in ZP and do an indirect reference through 10-17 in the program (actually zero) page?
 
Correct. If the code is in page 0, you can address the locations in page 0 both with and without the Z bit set. And the 6120 doesn't increment the index registers if you use current-page addressing.
 
Update: been poking away at this for a while now, though I've been a bit busy with other stuff. Funnily enough, I actually ended up going back to the conventional indirect-threaded approach for a good while because I wasn't keen on the fact that the PDP-8's return-address-in-the-body scheme for subroutine calls ruled out some nice fall-through cases, and got most of the core instructions written that way, but I just realized today that doing it the other way frees up AC and enables a TOS-in-register scheme, and that changed the balance so majorly that I'm going back to the subroutine-threaded approach. Still working through that, but a lot of routines that were 5-9 instructions (plus 7 instructions in NEXT) are down to 1-3 plus the 2-instruction return/dispatch overhead, so I'm ending up saving space in addition to being faster. Loving it :D
 
Well, I've got the bulk of the interpreter put together in at least a first-draft form, I think. Switching to top-of-stack-in-register did slow down some operations a bit (particularly the more complex stack-juggling ones,) but the core math operations saw a significant speed-up, and most of the unary operations I was able to convert to straight-up inline single instructions, which is as fast as it gets :)
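
To show the shape of the top-of-stack-in-register idea, here is a tiny Python model. It's a sketch only: `TosMachine` and its method names are made up, `tos` stands in for AC, and `mem` stands in for the part of the stack that lives in memory.

```python
MASK = 0o7777  # 12-bit words


class TosMachine:
    def __init__(self, first):
        self.tos = first & MASK  # #0, held in the register
        self.mem = []            # #1 and below, held in memory

    def lit(self, value):
        # Pushing has to spill the old #0 to memory first.
        self.mem.append(self.tos)
        self.tos = value & MASK

    def add(self):
        # Binary math needs only one memory read: #1 is added onto #0.
        self.tos = (self.tos + self.mem.pop()) & MASK

    def neg(self):
        # Unary operations never touch memory at all.
        self.tos = (-self.tos) & MASK

    def dup(self):
        self.mem.append(self.tos)

    def swap(self):
        # Stack juggling has to shuffle between register and memory.
        self.tos, self.mem[-1] = self.mem[-1], self.tos
```

Unary operations reduce to a single register operation, which is why they can be inlined, while anything that reorders the stack pays for moving values in and out of the register.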

I briefly toyed with making the parameter-stack pointer an auto-increment register, since more operations take values off the stack than add values to it, but that turned out to be just too many plates to keep in the air (since you can't read a value off the stack without incrementing the pointer, you have to make a copy of the stack pointer every time you want to write back to the same location.) Still might be worth a look at some point...

And of course it still needs a few niceties like multiply/divide and, um, I/O. But the subroutine-threaded approach makes it trivial to intermingle interpreter code and native code, so there's plenty of room for fancy stuff outside the core "resident" area that has to be present in every program field. (In fact, I should probably move the LFSR routine out of the main segment...) Also feels kinda janky redefining the AND mnemonic to point to an interpreter routine when it's a valid PDP-8 instruction...eh.

Code:
	/ Stack interpreter - token/subroutine-threaded version, #0 in AC /
	
	/// ZP tables/initial values ///
	
	*0000		/ begin in zero page
	
	/ Default configuration is for return stack to begin (upper bound) at
	/ 7200, with three pages reserved for interpreter code, and parameter
	/ stack to begin immediately below the lower bound of the return stack.
	/ Note: since #0 is always in register, actual stack storage will begin
	/ one word below the specified address.
	
rsp,	7200		/ begin return stack immediately below interpreter
psp,	7160		/ begin parameter stack at 7160 - 16 words of return
	
	/ --- Temporary variable(s)/common constants --- /
	
tmp,	0
c7777,	7777
	
	/ --- Auto-increment variable(s) --- /
	
	*0010
	
tmpinc,	0
tmpinc2,	0
	
	/ --- Jump table --- /
	
	*0120		/ Reserve the last 48 words for the jump table
	
jlit,	dolit
jchf,	dochf
jld,	dold
jldn,	doldn
jst,	dost
jstn,	dostn
jadd,	doadd
jsub,	dosub
jmul,	domul
jdiv,	dodiv
jand,	doand
jor,	door
jxor,	doxor
jasr,	doasr
jrnd,	dornd
jskn,	doskn
jskp,	doskp
jskz,	doskz
jiskz,	doiskz
jrtn,	dortn
jjp,	dojp
jjpi,	dojpi
jcl,	docl
jcli,	docli
jjpf,	dojpf
jclf,	doclf
jdup,	dodup
jdrop,	dodrop
jdrip,	dodrip
jswap,	doswap
jover,	doover
jpick,	dopick
jptor,	doptor
jrtop,	dortop
jgetr,	dogetr
jrot,	dorot
	
	/// Initializer routine ///
	/ TODO: this
	
	/// Resident interpreter code begins here ///
	
	*7200		/ Default resident address for interpreter code
	
	/ --- Control instructions --- /
	
doskn,	/ DOSKN - skip (IP += 2) if #0 is negative
	0
	spa cla		/ if #0 is not negative,
	jmp sn		/ skip skipping
	isz doskn	/ otherwise, skip
	isz doskn
sn,	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i doskn	/ 7 instructions, 8 words
	/ HD6120: 47* cycles, 8/e: 15* us, 8: 19.5* us
	/ * 61 cycles/19 us/22.5 us if skip taken
	
doskp,	/ DOSKP - skip (IP += 2) if #0 is positive (non-negative, non-zero)
	0
	sma sza cla	/ if #0 is zero or negative,
	jmp sp		/ skip skipping
	isz doskp	/ otherwise, skip
	isz doskp
sp,	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i doskp	/ 7 instructions, 8 words
	/ HD6120: 47* cycles, 8/e: 15* us, 8: 19.5* us
	/ * 61 cycles/19 us/22.5 us if skip taken
	
doskz,	/ DOSKZ - skip (IP += 2) if #0 is zero
	0
	sna cla		/ if #0 is nonzero,
	jmp sz		/ skip skipping
	isz doskz	/ otherwise, skip
	isz doskz
sz,	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i doskz	/ 7 instructions, 8 words
	/ HD6120: 47* cycles, 8/e: 15* us, 8: 19.5* us
	/ * 61 cycles/19 us/22.5 us if skip taken
	
doiskz,	/ DOISKZ - increment RSP #0 and drop/skip if zero
	0
	isz i z rsp	/ increment return-stack #0
	jmp i doiskz	/ if result is nonzero, do nothing
	isz z rsp	/ otherwise, drop it
	isz doiskz	/ and skip
	isz doiskz
	jmp i doiskz	/ 6 instructions, 7 words
	/ HD6120: 29* cycles, 8/e: 10* us, 8: 12* us
	/ * 58 cycles/17.8 us/21 us if skip taken
	
dortn,	/ DORTN - jump to return-stack (#0)
	mql		/ save #0
	tad i z rsp	/ get the return address
	dca z tmp	/ save it
	isz z rsp	/ increment the return-stack pointer
	mqa		/ retrieve #0
	jmp i z tmp	/ 6 instructions, 6 words
	/ HD6120: 52 cycles, 8/e: 16.2 us, 8: 19.5 us
	
dojp,	/ DOJP - jump to (#0)
	dca z tmp	/ save the address
	tad i z psp	/ load the new #0
	isz z psp	/ increment the stack pointer
	jmp i z tmp	/ 4 instructions, 4 words
	/ HD6120: 46 cycles, 8/e: 15.2 us, 8: 18 us
	
dojpi,	/ DOJPI - jump to (IP)
	0
	mql		/ save #0
	tad i dojpi	/ get address from the instruction stream
	dca z tmp	/ save the address
	mqa		/ retrieve #0
	jmp i z tmp	/ 5 instructions, 5 words
	/ HD6120: 46 cycles, 8/e: 16.4 us, 8: 19.5 us
	
docl,	/ DOCL - save IP and jump to (#0)
	0
	dca z tmp	/ save the address
	cla cma		/ load AC with -1
	tad z rsp	/ load/decrement RSP
	dca z rsp	/ save it back
	tad docl	/ get the return address
	dca i z rsp	/ save it on the return stack
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i z tmp	/ 9 instructions, 9 words
	/ HD6120: 80 cycles, 8/e: 29.4 us, 8: 34.5 us
	
docli,	/ DOCLI - save IP and jump to (IP)++
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z rsp	/ load/decrement RSP
	dca z rsp	/ save it back
	tad i docli	/ get the address
	dca z tmp	/ save it
	tad docli	/ get the return address
	iac		/ increment it
	dca i z rsp	/ save it on the return stack
	mqa		/ retrieve #0
	jmp i z tmp	/ 11 instructions, 11 words
	/ HD6120: 92 cycles, 8/e: 30.4 us, 8: 36 us
	
	/// --- MUST BE IN THE SAME FIELD! --- ///
	
cthunk,	thunk		/ pointer to return thunk
doclf,	/ DOCLF - save IP and jump to (#1) in field #0
	0
	mql		/ save #0
	cla cma cll rtl	/ load AC with -3
	tad z rsp	/ load/decrement the return-stack pointer
	dca z rsp	/ save it back
	tad z rsp	/ get it again
	dca z tmpinc	/ save it in an autoincrement location
	tad cthunk	/ get the return-thunk address
	/ Field and address are swapped here because the return thunk pops
	/ them to the parameter stack (reversing order) before doing a JPF
	dca i z tmpinc	/ save it on top of the return stack
	rif		/ get the current instruction field
	dca i z tmpinc	/ save it next on the stack
	tad doclf	/ get the return address
	dca i z tmpinc	/ save it last on the stack
	mqa		/ retrieve #0
	/ 13 instructions, 14 words
	/ HD6120: 103 cycles, 8/e: 35.6 us, 8: 42 us
dojpf,	/ DOJPF - jump to (#1) in field #0
	tad ccdf	/ add in the CDF instruction
	dca jfdf	/ save it
	tad jfdf	/ get it back
	iac		/ make it a CIF instruction
	dca jfif	/ save it
	tad i z psp	/ get the address
	dca z tmp	/ save it
	isz z psp	/ increment the stack pointer
	tad i z psp	/ get the new #0
	mql		/ save it
jfdf,	.-.		/ work in the new instruction field
	tad z psp	/ get the stack pointer
	dca i z ppsp	/ transfer it to the new field
	tad z rsp	/ get the return-stack pointer
	dca i z prsp	/ transfer it to the new field
	cdf 00		/ work in the stack field again
	mqa		/ retrieve #0
jfif,	.-.		/ switch to the new instruction field
	jmp i z tmp	/ 19 instructions, 21 words - GAHHHHH
	/ HD6120: 176* cycles, 8/e: 48* us, 8: 57* us
	/ * 169/45.6 us/54 us when falling through from DOCLF
ppsp,	psp		/ pointer to the stack pointer
prsp,	rsp		/ pointer to the return-stack pointer
	
	/ --- Load/store instructions --- /
	
ccdf,	cdf 00
dochf,	/ DOCHF - changes the data field to #0
	0
	tad ccdf	/ add in the CDF instruction
	dca ldfld	/ store the result in place
	tad ldfld	/ get it back
	dca stfld	/ store it in the other place
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i dochf	/ 7 instructions, 9 words
	/ HD6120: 64 cycles, 8/e: 23 us, 8: 27 us
	
dold,	/ DOLD - pushes (#0)
	0
	dca z tmp	/ save the address
ldfld,	.-.		/ select the data field
	tad i z tmp	/ get the datum
	cdf 00		/ select the stack field
	jmp i dold	/ 5 instructions, 6 words
	/ HD6120: 46 cycles, 8/e: 15 us, 8: 18 us
	
dost,	/ DOST - saves #1 to (#0)
	0
	dca z tmp	/ save the address
	tad i z psp	/ get the datum
stfld,	.-.		/ select the data field
	dca i z tmp	/ save the datum to memory
	cdf 00		/ select the stack field
	isz z psp	/ increment the stack pointer
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i dost	/ 9 instructions, 10 words
	/ HD6120: 84 cycles, 8/e: 26.8 us, 8: 30 us
	
	/// --- END FIELD RESTRICTION --- ///
	
dolit,	/ DOLIT - pushes (IP)++
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z psp	/ load/decrement PSP
	dca z psp	/ save it back
	mqa		/ retrieve #0
	dca i z psp	/ save it to the stack
	tad i dolit	/ get constant as the new #0
	isz dolit	/ increment return address
	jmp i dolit	/ 9 instructions, 10 words
	/ HD6120: 76 cycles, 8/e: 24 us, 8: 30 us
	
doldn,	/ DOLDN - pushes (#0) in the stack field
	0
	dca z tmp	/ save the address
	tad i z tmp	/ get the datum
	jmp i doldn	/ 3 instructions, 4 words
	/ HD6120: 34 cycles, 8/e: 12.6 us, 8: 15 us
	
dostn,	/ DOSTN - saves #1 to (#0) in the stack field
	0
	dca z tmp	/ save the address
	tad i z psp	/ get the datum
	dca i z tmp	/ save the datum to memory
	isz z psp	/ increment the stack pointer
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i dostn	/ 7 instructions, 8 words
	/ HD6120: 72 cycles, 8/e: 25.4 us, 8: 30 us
	
	/ --- Arithmetic/logic instructions --- /
	
/donop,	/ DONOP - do nothing of consequence
/	0
/	jmp i z donop
/	/ HD6120: 17 cycles, 8/e: 7.6 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
doadd,	/ DOADD - add #0 and #1
	0
	tad i z psp	/ add #1 to #0
	isz z psp	/ increment the stack pointer
	jmp i doadd	/ 3 instructions, 4 words
	/ HD6120: 36 cycles, 8/e: 12.6 us, 8: 15 us
	
dosub,	/ DOSUB - subtract #0 from #1
	0
	cma iac		/ negate #0
	tad i z psp	/ add #1 to #0
	isz z psp	/ increment the stack pointer
	jmp i dosub	/ 4 instructions, 5 words
	/ HD6120: 42 cycles, 8/e: 13.8 us, 8: 16.5 us
	
domul,	/ DOMUL - multiply #0 and #1
	/ TODO: manual and hardware-assisted multiplication options
	0
	
dodiv,	/ DODIV - divide #1 by #0
	/ TODO: manual and hardware-assisted division options
	0
	
/doinc,	/ DOINC - increment #0
/	0
/	iac		/ increment #0
/	jmp i doinc	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
/dodec,	/ DODEC - decrement #0
/	0
/	tad c7777	/ add -1 to #0
/	jmp i dodec	/ 2 instructions, 4 words
/	/ HD6120: 34 cycles, 8/e: 8.8 us, 8: 10.5 us
	/ HD6120: 7 cycles, 8/e: 2.6 us, 8: 3 us
	
/doneg,	/ DONEG - negate #0
/	0
/	cma iac		/ negate #0
/	jmp i doneg	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
/donot,	/ DONOT - complement #0
/	0
/	cma		/ complement #0
/	jmp i donot	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
doand,	/ DOAND - AND #0 and #1
	0
	and i z psp	/ AND #1 with #0
	isz z psp	/ increment the stack pointer
	jmp i doand	/ 3 instructions, 4 words
	/ HD6120: 36 cycles, 8/e: 12.6 us, 8: 15 us
	
	/ Thanks to Doug Jones for the OR and XOR algorithms.
	
door,	/ DOOR - OR #0 and #1
	0
	dca z tmp	/ save #0 to memory
	tad i z psp	/ get #1
	and z tmp	/ find common (carry-causing) 1s
	cma		/ invert the result
	and z tmp	/ mask out all common 1s from #0
	tad i z psp	/ add it to #1
	isz z psp	/ increment the stack pointer
	jmp i door	/ 8 instructions, 9 words
	/ HD6120: 73 cycles, 8/e: 25.4 us, 8: 30 us
	
doxor,	/ DOXOR - XOR #0 and #1
	0
	dca z tmp	/ save #0 to memory
	tad z tmp	/ retrieve it
	and i z psp	/ find common (carry-causing) 1s
	cma iac		/ negate the result
	cll ral		/ double it
	tad z tmp	/ pre-un-carry any carries
	tad i z psp	/ and add this to #1
	isz z psp	/ increment the stack pointer
	jmp i doxor	/ 9 instructions, 10 words
	/ HD6120: 79 cycles, 8/e: 26.4 us, 8: 31.5 us
	
/dolsl,	/ DOLSL - shift #0 left
/	0
/	cll ral		/ shift #0 left
/	jmp i dolsl	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
/dolsr,	/ DOLSR - shift #0 right
/	0
/	cll rar		/ shift #0 right
/	jmp i dolsr	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
doasr,	/ DOASR - arithmetic-shift #0 right
	0
	cll		/ clear the link
	tad c4000	/ complement MSB, shifting it to link
	ral		/ rotate complemented MSB into L
	cml rtr		/ un-complement MSB and shift the whole thing right
	jmp i doasr	/ 5 instructions, 7 words
	/ HD6120: 42 cycles, 8/e: 12.4 us, 8: 15 us
c4000,	4000
	
/dobswp,	/ DOBSWP - byte-swap #0
/	0
/	bsw		/ rotate #0 six places
/	jmp i dobswp	/ 2 instructions, 3 words
/	/ HD6120: 23 cycles, 8/e: 7.4 us, 8: 9 us
	/ HD6120: 6 cycles, 8/e: 1.2 us, 8: 1.5 us
	
dornd,	/ DORND - LFSR pseudorandom-number generation with seed/state #0
	0
	cll rar		/ shift #0 right
	szl		/ if carry-out was nonzero,
	jmp i dornd	/ just return
	dca z tmp	/ save #0 to memory
	tad z tmp	/ and get it back
	and rnmask	/ find common (carry-causing) 1s
	cma iac		/ negate the result
	cll ral		/ and double it
	tad z tmp	/ pre-un-carry the common 1s
	tad rnmask	/ and add the mask value
	jmp i dornd	/ 11 instructions, 13 words
	/ HD6120: 29* cycles, 8/e: 10* us, 8: 10.5* us
	/ * 76/24 us/28.5 us if XOR is required**
	/ ** Yes, it's pseudo-random how long the pseudo-random number routine
	/ will take to execute. How fitting!
rnmask,	4051		/ the magic number
	
	/ --- Stack operations --- /
	
dodup,	/ DODUP - duplicate #0 on the stack
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z psp	/ load/decrement the stack pointer
	dca z psp	/ save it back
	mqa		/ retrieve #0
	dca i z psp	/ save it to the stack
	mqa		/ retrieve it again
	jmp i dodup	/ 8 instructions, 9 words
	/ HD6120: 65 cycles, 8/e: 18.8 us, 8: 24 us
	
dodrop,	/ DODROP - drop #0 from the stack
	0
	cla		/ discard #0
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i dodrop	/ 4 instructions, 5 words
	/ HD6120: 42 cycles, 8/e: 13.8 us, 8: 16.5 us
	
dodrip,	/ DODRIP - retrieve dropped #0
	0
	mql		/ save current #0
	cla cma cll ral	/ load AC with -2
	tad z psp	/ load/decrement the stack pointer
	dca z psp	/ save it back
	tad i z psp	/ get dropped #0
	isz z psp	/ increment the stack pointer
	mqa mql		/ retrieve initial #0
	dca i z psp	/ save it to the stack
	mqa		/ retrieve restored #0
	jmp i dodrip	/ 10 instructions, 11 words
	/ HD6120: 83 cycles, 8/e: 26.4 us, 8: 31.5 us
	
doswap,	/ DOSWAP - exchange #0 and #1
	0
	mql		/ save #0
	tad i z psp	/ get #1
	mqa mql		/ swap AC & MQ
	dca i z psp	/ save #0 as #1
	mqa		/ retrieve #1 as the new #0
	jmp i doswap	/ 6 instructions, 7 words
	/ HD6120: 55 cycles, 8/e: 17.4 us, 8: 21 us
	
doover,	/ DOOVER - copy #1 to TOS
	0
	mql		/ save #0
	tad z psp	/ get the stack pointer
	dca z tmp	/ save it
	cla cma		/ load AC with -1
	tad z psp	/ load/decrement the stack pointer
	dca z psp	/ save it back
	mqa		/ retrieve #0
	dca i z psp	/ save it to the stack
	tad i z tmp	/ get #1 as the new #0
	jmp i doover	/ 10 instructions, 11 words
	/ HD6120: 83 cycles, 8/e: 27.4 us, 8: 33 us
	
	/ Note: 0 PICK is *not* equivalent to DUP, but rather DRIP SWAP DROP
dopick,	/ DOPICK - copy item ##0 to TOS
	0
	tad z c7777	/ adjust for #0 being in register and not in memory
	tad z psp	/ add #0 to the stack pointer
	dca z tmp	/ save it for indirection
	tad i z tmp	/ get the specified stack item
	jmp i dopick	/ 4 instructions, 5 words
	/ HD6120: 48 cycles, 8/e: 17.8 us, 8: 21 us
	
doptor,	/ DOPTOR - move #0 to the return stack
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z rsp	/ load/decrement the return stack pointer
	dca z rsp	/ save it back
	mqa		/ retrieve #0
	dca i z rsp	/ save it to the return stack
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i doptor	/ 9 instructions, 10 words
	/ HD6120: 78 cycles, 8/e: 25.2 us, 8: 30 us
	
dortop,	/ DORTOP - move return-stack #0 to the stack
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z psp	/ load/decrement the stack pointer
	dca z psp	/ save it back
	mqa		/ retrieve #0
	dca i z psp	/ save it to the stack
	tad i z rsp	/ get the new #0
	isz z rsp	/ increment the return stack pointer
	jmp i dortop	/ 9 instructions, 10 words
	/ HD6120: 78 cycles, 8/e: 25.2 us, 8: 30 us
	
dogetr,	/ DOGETR - copy return-stack #0 to the stack
	0
	mql		/ save #0
	cla cma		/ load AC with -1
	tad z psp	/ load/decrement the stack pointer
	dca z psp	/ save it back
	mqa		/ retrieve #0
	dca i z psp	/ save it to the stack
	tad i z rsp	/ get the new #0
	jmp i dogetr	/ 8 instructions, 9 words
	/ HD6120: 69 cycles, 8/e: 22.6 us, 8: 27 us
	
dorot,	/ DOROT - bump #2 to top-of-stack and shift #0/#1 down accordingly
	0
	mql		/ save #0
	tad z psp	/ get the stack pointer
	iac		/ increment it
	dca z tmp	/ save it
	tad i z psp	/ get #1
	mqa mql		/ swap #0 and #1
	dca i z psp	/ save the new #1
	tad i z tmp	/ get #2
	mqa mql		/ swap #0 and #2
	dca i z tmp	/ "The new Number Two..."
	mqa		/ retrieve the new #0
	jmp i dorot	/ 12 instructions, 13 words
	/ HD6120: 101 cycles, 8/e: 32.6 us, 8: 39 us
	
doroll,	/ DOROLL - bump ##0 to top-of-stack and shift #0-#n down accordingly
	0
	cma iac		/ negate #0
	iac		/ increment it for counting and index purposes
	dca z tmp	/ save it as a counter
	tad z psp	/ get the stack pointer
	cma iac		/ negate it
	tad z tmp	/ add it to #0
	cma iac		/ de-negate it
	dca z tmpinc	/ save it as a pointer
	tad i z tmpinc	/ get ##0
	mql		/ save it for later
	tad z psp	/ get the stack pointer
	dca z tmpinc	/ save it for auto-increment
	tad z psp	/ get it again
	dca z tmpinc2	/ save it for a second auto-increment
	/ loop begins here
rloop,	tad i z tmpinc	/ get the next item on the stack
	mqa mql		/ save it and retrieve the previous
	dca i z tmpinc2	/ save the previous item in its place
	isz z tmp	/ loop (#0 - 1) times
	jmp rloop
	/ loop end - cleanup
	mqa		/ get the last saved item
	dca i z tmpinc2	/ save it in place
	tad i z psp	/ get the new #0
	isz z psp	/ increment the stack pointer
	jmp i doroll	/ 24 instructions, 25 words
	/ HD6120: 190* cycles, 8/e: 61.8* us, 8: 72* us
	/ * add 43 cycles/12.6 us/15 us per extra (#0 > 1) iteration
	
	/// Opcode defines ///
	
lit	= jms i z jlit
chf	= jms i z jchf
ld	= jms i z jld
st	= jms i z jst
ldn	= jms i z jldn
stn	= jms i z jstn
nop	= cll
add	= jms i z jadd
sub	= jms i z jsub
mul	= jms i z jmul
div	= jms i z jdiv
dec	= tad z c7777	/jdec,	dodec
inc	= iac		/jinc,	doinc
neg	= cma iac	/jneg,	doneg
not	= cma		/jnot,	donot
and	= jms i z jand
or	= jms i z jor
xor	= jms i z jxor
lsl	= cll ral	/jlsl,	dolsl
lsr	= cll rar	/jlsr,	dolsr
asr	= jms i z jasr
bswp	= bsw		/jbswp,	dobswp
rnd	= jms i z jrnd
skn	= jms i z jskn
skp	= jms i z jskp
skz	= jms i z jskz
iskz	= jms i z jiskz
rt	= jmp i z jrt
jp	= jmp i z jjp
jpi	= jms i z jjpi
cl	= jms i z jcl
cli	= jms i z jcli
jpf	= jmp i z jjpf
clf	= jms i z jclf
dup	= jms i z jdup
drop	= jms i z jdrop
drip	= jms i z jdrip
swap	= jms i z jswap
over	= jms i z jover
pick	= jms i z jpick
ptor	= jms i z jptor
rtop	= jms i z jrtop
getr	= jms i z jgetr
rot	= jms i z jrot
	
	/// Far-call return thunk ///
	
	*7775		/ placed at the end of the field, safeguards rollover

thunk,	/ Return thunk - pops address and field to the parameter stack and JPFs
	rtop
	rtop
	jpf
	/ HD6120: 332 cycles, 8/e: 98.4 us, 8: 117 us
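
For anyone who wants to poke at the stack ops without an -8 handy, here is a rough Python model of the convention the listing above uses: #0 is cached in AC, #1 and below sit in memory at PSP, and the stack grows downward. It only covers DUP, DROP, SWAP, OVER and ROT. The class name, the `push` helper, the `top` helper and the base address are made up for illustration and are not part of the original code.

```python
WORD = 0o7777  # 12-bit word mask


class StackModel:
    """#0 lives in `ac`; #1 and below live in `mem`, growing downward."""

    def __init__(self, base=0o200):  # base address is an arbitrary assumption
        self.mem = [0] * 0o10000
        self.psp = base
        self.ac = 0

    def push(self, value):  # hypothetical helper, not an opcode from the listing
        self.dup()
        self.ac = value & WORD

    def dup(self):  # DODUP: decrement PSP, spill #0 to memory, keep it in AC
        self.psp = (self.psp - 1) & WORD
        self.mem[self.psp] = self.ac

    def drop(self):  # DODROP: reload AC from memory, increment PSP
        self.ac = self.mem[self.psp]
        self.psp = (self.psp + 1) & WORD

    def swap(self):  # DOSWAP: exchange AC with the word at PSP
        self.ac, self.mem[self.psp] = self.mem[self.psp], self.ac

    def over(self):  # DOOVER: spill #0, then load the old #1 as the new #0
        old = self.psp
        self.dup()
        self.ac = self.mem[old]

    def rot(self):  # DOROT: old #2 becomes #0, old #0 and #1 move down one
        nxt = (self.psp + 1) & WORD
        old0, old1, old2 = self.ac, self.mem[self.psp], self.mem[nxt]
        self.mem[self.psp], self.mem[nxt], self.ac = old0, old1, old2

    def top(self, n):  # view the top n items, #0 first
        return [self.ac] + [self.mem[(self.psp + i) & WORD] for i in range(n - 1)]


m = StackModel()
for v in (1, 2, 3):
    m.push(v)
assert m.top(3) == [3, 2, 1]
m.rot()
assert m.top(3) == [1, 3, 2]
m.over()
assert m.top(4) == [3, 1, 3, 2]
```

Keeping #0 in AC is why DUP and DROP each touch memory only once, and why SWAP never moves the stack pointer.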
 
Aaaand of course I don't notice errors in DOJPF until after the edit timeout expires :/

Code:
	isz z psp	/ increment the stack pointer
	tad i z psp	/ get the new #0
	[b]isz z psp	/ increment the stack pointer[/b]
	mql		/ save it
jfdf,	.-.		/ work in the new instruction field
	tad z psp	/ get the stack pointer
	[b]dca i ppsp	/ transfer it to the new field[/b]
	tad z rsp	/ get the return-stack pointer
	[b]dca i prsp	/ transfer it to the new field[/b]
 
Yeah...unfortunately, while AC is known to be clear going in, L is unknown (I'd have to add an instruction to the dispatch routine to clear it, and that'd just slow things down for most opcodes where it doesn't even matter.) The first alternative would still work, though.
Sanity-check me on this, but I think I bettered it - three instructions, all microcoded:
Code:
CLL RTL
IAC RAR
CML RTR
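
A quick way to sanity-check the sequence above is to simulate the 13-bit L:AC ring in Python and compare against a plain sign-extending shift for every 12-bit value. This is only a sketch: it assumes the 8/e-style ordering where IAC is applied before the rotate in the same instruction, and the function names are made up.

```python
MASK12 = 0o7777


def ral(l, ac):
    """Rotate the 13-bit L:AC ring left by one bit."""
    v = ((l << 12) | ac) << 1
    v = (v | (v >> 13)) & 0o17777
    return v >> 12, v & MASK12


def rar(l, ac):
    """Rotate the 13-bit L:AC ring right by one bit."""
    v = (l << 12) | ac
    v = (v >> 1) | ((v & 1) << 12)
    return v >> 12, v & MASK12


def iac(l, ac):
    """Increment AC; a carry out of AC complements L."""
    ac += 1
    if ac > MASK12:
        l ^= 1
    return l, ac & MASK12


def asr3(ac, l):
    l = 0                     # CLL: the incoming L no longer matters
    l, ac = ral(*ral(l, ac))  # RTL
    l, ac = iac(l, ac)        # IAC (assumed to happen before the rotate)
    l, ac = rar(l, ac)        # RAR
    l ^= 1                    # CML
    l, ac = rar(*rar(l, ac))  # RTR
    return l, ac


def reference_asr(ac):
    """Sign-extending shift right by one: (bit shifted out, result)."""
    return ac & 1, (ac >> 1) | (ac & 0o4000)


for start_l in (0, 1):
    for value in range(0o10000):
        assert asr3(value, start_l) == reference_asr(value)
```

The loop passes for all 4096 values with L either set or clear on entry, and L ends up holding the bit that was shifted out.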
Progress has been a bit slow on the code end as I've been A. distracted by other projects and B. pondering design questions for the actual game program, but I'm trying to keep on it!
 
Nice! That one violates the "don't combine IAC and rotates in portable code" rule, but I suppose if you are going to presume MQL and SWP, you've already set a high bar for the required machine.

Vince
 
Yeah, I just realized that. I'm more or less targeting the 6120 and 8/e here, initially, but fortunately it's trivial to modify for straight-8 compatibility, and still faster than the original code (as well as not requiring a separate constant.)

MQ instructions I'm counting on both for convenience and because I expect the game to require extended addressing in any case (I think there was a super-bare-bones VIC-20 roguelike that ran in 4KB a few years back, but I'd like to get at least a bit fancier than that.) But it should be doable to modify it for vanilla straight-8 in the (unlikely?) event that anyone feels like using this for other projects.
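
To illustrate the straight-8 point: assuming the modification is simply splitting IAC RAR into two instructions (my guess at the obvious fix, not something spelled out above), a small Python model of the group 1 operate ordering (clears, then complements, then IAC, then rotates) can confirm the three- and four-instruction versions leave the same L and AC. The helper names are hypothetical.

```python
def operate(l, ac, mnemonics):
    """Apply one group 1 operate instruction in 8/e sequence order."""
    ops = set(mnemonics.split())
    if "CLA" in ops:
        ac = 0
    if "CLL" in ops:
        l = 0
    if "CMA" in ops:
        ac ^= 0o7777
    if "CML" in ops:
        l ^= 1
    v = (l << 12) | ac
    if "IAC" in ops:
        v = (v + 1) & 0o17777  # carry out of AC complements L
    for name, left, count in (("RAL", True, 1), ("RTL", True, 2),
                              ("RAR", False, 1), ("RTR", False, 2)):
        if name in ops:
            for _ in range(count):
                if left:
                    v = ((v << 1) | (v >> 12)) & 0o17777
                else:
                    v = (v >> 1) | ((v & 1) << 12)
    return v >> 12, v & 0o7777


def run(program, ac, l):
    for instruction in program:
        l, ac = operate(l, ac, instruction)
    return l, ac


COMBINED = ["CLL RTL", "IAC RAR", "CML RTR"]   # 8/e and 6120
SPLIT = ["CLL RTL", "IAC", "RAR", "CML RTR"]   # assumed straight-8 variant

for start_l in (0, 1):
    for value in range(0o10000):
        assert run(COMBINED, value, start_l) == run(SPLIT, value, start_l)
```

The split costs one extra word and one extra instruction time, and needs no separate constant.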
 
As far as I know, the IAC and rotate combination is only an issue on the straight 8 and the 8/S.
Anything that uses BSW or MQ already excludes those machines anyhow, so the IAC + rotate limitation doesn't really apply here anyway...
 
MQ has nothing to do with extended addressing, though.
 
Well, technically there are surviving straight-8s with EAE, though not many in working condition.

The requirement for MQ leaves out the 8/L, and BSW leaves out the EAE subset of the 8/I as well. I'm not sure which machines actually implement SWP, but I believe not all the EAE implementations do.

The 8/A has just barely enough of the EAE to support MQA, MQL, and SWP.

Vince
 