
A Z-80 Puzzle

gp2000

Here's a bit of code I wrote which does something entirely useless. It's for a Model 4 though only because it's the only TRS-80 where you can change the memory map to all RAM.

Can you figure out what it does? Can you figure out the benefit of the excess code?

Code:
	org	$8082
start:	di
	ld	bc,$384
	out	(c),b		; make all memory RAM
	ld	h,a
	ld	l,a
	ld	b,a
	cp	$cd
	jr	z,iscall
	cp	$34
	jr	z,idhl
	cp	$35
	jr	nz,notid
idhl:	ld	sp,hl
	pop	de
	ld	de,$e9d5
	inc	sp
	push	de
	ld	d,e
	jr	cont
notid:	and	$c7
	cp	$c7
	jr	nz,norst
	inc	hl
	inc	hl
iscall:	dec	hl
	dec	hl
	dec	hl
	dec	hl
norst:	ld	sp,hl
	pop	de
	ld	de,$e9d5
cont:	push	de
	ld	a,b
	and	8
	add	$f8
	sbc	a,a
	cpl
	and	$f5
	ld	c,a
	push	bc
	pop	af
	ld	c,a
	ld	d,a
	ld	e,a
	jp	(hl)
	end	start
 
Can you figure out what it does? Can you figure out the benefit of the excess code?

Oh, man...I can barely remember what my own z80 code does a few weeks after writing it unless I comment the heck out of it! And you want me to solve cryptic z80 puzzle code? :)
 
Can you figure out what it does?
Yes, but there doesn't seem to be a SPOILER tag, so I'll just say that it opiescay ethay Ayay egisterray otay everyay emorymay ocationlay. The four special cases all have the same principal effect as the general case, but I can't figure out what additional benefit they have.
 
Here's a bit of code I wrote which does something entirely useless. It's for a Model 4 though only because it's the only TRS-80 where you can change the memory map to all RAM.

Can you figure out what it does? Can you figure out the benefit of the excess code?

Hmm, an interesting snippet. It looks almost like simulator, exerciser, or debugger code. It looks to me like a timing piece of sorts, to generate specific signals. The code appears to fill RAM with the contents of A, working down from an address derived from the contents of A at the beginning. It fills at two different speeds depending on the contents of A, either two bytes or four bytes per iteration, presumably wrapping around afterwards and eventually executing the instruction in A (infinitely). The nice use of the D5 E9 opcode pair
Code:
PUSH DE
JP (HL)
and the second doubled D5 (PUSH DE again) makes it something of a timing loop, with the memory being filled faster if A is either 34H or 35H (INC (HL) and DEC (HL) opcodes). That is, sometimes you get:
Code:
 HL:    PUSH DE
        JP (HL)
and sometimes you get
Code:
 HL:    PUSH DE
        PUSH DE
        JP (HL)
where in both cases SP<HL at the end.

What this snippet is good for is seeing just how well you understand stack operations; using the LD SP,HL instruction and then doing various POPs and PUSHes makes you think.

Now, why you are doing the math based on bit 3 of A at label cont: is a bit of a mystery, as SP ends up pointing to a place where the first time the PUSH DE is executed the contents of (HL-1) and (HL-2) look like they will get overwritten. I realize the flags will end up with the contents of register C, and that those contents have been calculated, but to what end? And then what happens when the stack wraps and the E9 gets overwritten...... but I of course always reserve the right to be wrong, and may have just misanalyzed the whole snippet....

I look forward to your explanation; an interesting bit of code.
 
Yep, both answers nail down what the code mainly does. The second question isn't really fair in that you're trying to predict what some weirdo who fills RAM cares about. So here's a little hint -- it has to do with what happens after RAM has been filled. To re-purpose a marketing phrase, "Hey, Z-80, you've just filled RAM with $XX. What are you going to do next?"

Not incidentally, the idea came to me when I learned that the Z-80 can have arbitrarily large instructions because you can stick as many $DD and $FD prefixes in front of an IX or IY instruction as you want. Only the last one matters ($DD choosing IX, $FD choosing IY). But it is a single instruction because the Z-80 can't take an interrupt until it gets past all the prefixes.

Since you can fill all of RAM with $DD, you can get the Z-80 executing a single instruction for an infinite number of cycles. This amuses me to no end.

So the next time you're at a party and someone starts speculating about the longest Z-80 instruction, don't say "Oh, 'inc (ix)' if you mean time, 'res 0,(ix)' if you mean size." No, hit 'em with RAM full of $FD prefixes. Everyone will be soooo impressed. Or maybe be a little coy and start with $FD $DD $FD $DD $FD $E9 -- a whopping 24 T-State version of "JP (IY)".
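As a rough illustration of the coy version, here is that six-byte sequence laid out one byte per line. The origin address is hypothetical, `db` is assumed to be accepted by the assembler, and the T-state counts assume 4 T-states per opcode fetch with no wait states, which is what the quoted total of 24 implies.

```z80
	org	$8000		; hypothetical address; any RAM will do
	db	$fd		; IY prefix                      4 T-states
	db	$dd		; IX prefix, overrides the $FD   4
	db	$fd		; IY again                       4
	db	$dd		; IX again                       4
	db	$fd		; last prefix wins: IY           4
	db	$e9		; with $FD in effect: jp (iy)    4
				; total: 6 fetches x 4 = 24 T-states
```

Dropping the first four bytes leaves the ordinary two-byte, 8 T-state `jp (iy)`.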
 
...The second question isn't really fair in that you're trying to predict what some weirdo who fills RAM cares about. So here's a little hint -- it has to do with what happens after RAM has been filled. ...

As the whole snippet was originally billed as doing something 'useless' it is interesting that eventually you're going to have to execute an instruction, and that instruction is indeterminate (being the contents of A on entry) and infinite. Popping that F4 or F5 'just out of reach' so to speak is pretty bizarre, but since I've slept since I looked closely at where the stack pointer was right before the initial JP (HL).....

This sounds like a "Mythbusters look at the Z80: using instructions in ways they were never designed to be used."

Now, at the time I was doing the analysis by hand and at the assembly level; it could be that the 'extra' code's benefit is some unique binary combination used as a copyright violation detection mechanism; it's not likely at all that two different authors would use the exact same dummy code..... (There is this sort of code in the original LS-DOS 6.3.0's 'copy protection' scheme; code that sends you on a wild goose chase in the last sector extra bytes of system overlays, among other locations, and dummy code that basically serves in the same role as dummy towns and streets in various publishers' road atlases.)

EDIT:

As an enhancement to the idhl cases, you could always have the code push the D5 E9 into place at (HL+1) and (HL+2), but let the 34H or 35H get put at (HL) and watch the hilarity at (HL) on each loop.... just make sure SP ends up at (HL-1) or lower, of course.... That would then be one of my favorite jokes.... (my favorite Z80 joke, in hex: 0000: 01 FF FF 11 0C 00 21 0B 00 ED B0 76 but the punchline of that is just a bit too straightforward for this crowd..... :) ).
 
Okay, now I get it. Filling memory with a copy of A is just a means to an end. The actual goal is "execute an infinite sequence of instructions in which every byte of every instruction is the initial value of A", right?

To make it easier to talk about this at George's parties, let's define a "homogeneous instruction" to be one in which every byte is the same. Every single-byte instruction is trivially a homogeneous instruction, and there are also several 2-byte and 3-byte homogeneous instructions like LD B, 6 (which is 06 06) and JP 0C3C3h (C3 C3 C3), and there are two infinite homogeneous not-quite-instructions (DD DD DD ... and FD FD FD ...).
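A minimal sketch of the examples just named, written as raw bytes so the repetition is visible. The origin is hypothetical and `db` is assumed to be available in the assembler.

```z80
	org	$9000		; hypothetical address
	db	$06,$06		; ld b,6      two-byte homogeneous instruction
	db	$c3,$c3,$c3	; jp $c3c3    three-byte homogeneous instruction
	db	$dd,$dd,$dd	; start of an endless run of IX prefixes,
				; the "not-quite-instruction" case
```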

Most homogeneous instructions do not modify memory. For those instructions, it is sufficient to load every memory location with a copy of A, and then let the Z-80 infinitely fetch instructions from memory.

Dealing with all the homogeneous instructions that do modify memory is the point of all the extra code. A simple case to deal with is LD (HL), r. To make sure that doesn't modify memory, registers B through L are all initialized with copies of A. The cases of INC/DEC (HL), CALL, and RST are the source of the label names IDHL, ISCALL, and NORST. An important case that's not indicated by any of the label names is PUSH AF (code F5). For that to not modify memory, F must be initialized with F5, and that's the reason for the PUSH BC / POP AF lines.

Thanks for the cool puzzle, George! I'm going to be the life of the party at which I pull out this chestnut.
 
The actual goal is "execute an infinite sequence of instructions in which every byte of every instruction is the initial value of A", right?

Wait a second, that's not possible for nine of the possible values of A (the eight RST instructions and the unconditional CALL 0CDCDh), is it?

So, the achievable goal is "execute the longest possible sequence of instructions in which every byte of every instruction is the initial value of A". The maximum number of consecutive CALL 0CDCDh instructions is 32768. For each of the RST instructions, the maximum number is 32769. It looks like the puzzle code falls a little bit short and executes only 32765 of the CALL instruction, and in the case of RST 0 it falls more seriously short, executing only 25471 (0C700h/2 - 1) consecutive RST 0 instructions. Do I have that right?

For the homogeneous conditional CALL instructions (like CALL NC, 0D4D4h), an infinite sequence *is* possible, and the puzzle code makes this happen by setting the condition to be false.
 
As an enhancement to the idhl cases, you could always have the code push the D5 E9 into place at (HL+1) and (HL+2), but let the 34H or 35H get put at (HL) and watch the hilarity at (HL) on each loop.... just make sure SP ends up at (HL-1) or lower, of course.... That would then be one of my favorite jokes.... (my favorite Z80 joke, in hex: 0000: 01 FF FF 11 0C 00 21 0B 00 ED B0 76 but the punchline of that is just a bit too straightforward for this crowd..... :) ).

Oooh, I like it and will change the code accordingly. Nice little ping-pong between the two. I struggled to figure out a way to handle that case so had no energy for cleverness left when I got it working. More on that below.

Nice HALT bomber there, too.
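For anyone who does not read hex at sight, the joke quoted above appears to disassemble as follows. The mnemonics and comments are a reading of the twelve bytes, not something taken from the original post, and the sketch assumes RAM is mapped at address 0.

```z80
	org	$0000
	ld	bc,$ffff	; 01 FF FF   copy 65535 bytes
	ld	de,$000c	; 11 0C 00   destination: first byte after the halt
	ld	hl,$000b	; 21 0B 00   source: the halt itself
	ldir			; ED B0      smear the $76 upward through memory
	halt			; 76         sits at $000B
```

Because the destination wraps past $FFFF back to $0000, the copy should eventually overwrite the `ldir` at $0009 with $76, at which point the next fetch is a HALT.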
 
Wait a second, that's not possible for nine of the possible values of A (the eight RST instructions and the unconditional CALL 0CDCDh), is it?

So, the achievable goal is "execute the longest possible sequence of instructions in which every byte of every instruction is the initial value of A". The maximum number of consecutive CALL 0CDCDh instructions is 32768. For each of the RST instructions, the maximum number is 32769. It looks like the puzzle code falls a little bit short and executes only 32765 of the CALL instruction, and in the case of RST 0 it falls more seriously short, executing only 25471 (0C700h/2 - 1) consecutive RST 0 instructions. Do I have that right?

For the homogeneous conditional CALL instructions (like CALL NC, 0D4D4h), an infinite sequence *is* possible, and the puzzle code makes this happen by setting the condition to be false.

My aim was a little bit simpler -- once RAM has been filled try to maintain that state. Not that executing the longest possible sequence of homogeneous instructions isn't a laudable goal. Otherwise you're dead on that for the most part RAM can be unchanged by loading up all the registers with the target value. $ED prefix is OK since $ED $ED is an undocumented NOP. $CB prefix is OK since $CB $CB is "set 1,e".

I don't know if it is somehow convenient to the Z-80 internally, but it is nice that loading a zero into the F register will make "NZ", "NC", "PO" and "P" true while $FF will make "Z", "C", "PE" and "M" true. Thus all the conditional calls can be handled as a single case. The really sweet bit is that $F5 will work as well as $FF. $F5 is "PUSH AF" so it luckily turns out that the code to handle the conditional calls will put $F5 in F when the target value is $F5. I don't have to handle $F5 as a special case.
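The flag setup described here can be traced through the instructions at `cont:` in the original listing. The sketch below repeats those instructions with comments added; the `ld b,$d4` line is a hypothetical stand-in for the `ld b,a` done at the top of the real program, using `CALL NC` as the example target.

```z80
	ld	b,$d4		; hypothetical target value (call nc,$d4d4)
	ld	a,b
	and	8		; A = 8 if bit 3 of the target is set, else 0
	add	$f8		; carry out only when A was 8
	sbc	a,a		; A = $FF if carry, else $00
	cpl			; A = $00 if bit 3 was set, else $FF
	and	$f5		; A = $00 or $F5
	ld	c,a
	push	bc		; B = target, C = $00 or $F5
	pop	af		; A = target, F = $00 or $F5
; bit 3 clear ($C4 $D4 $E4 $F4 = call nz/nc/po/p, and $F5 = push af): F = $F5
; bit 3 set   ($CC $DC $EC $FC = call z/c/pe/m):                      F = $00
```

With $D4 the result is F = $F5, so carry is set and `call nc` is never taken. With $F5 the same path gives A = F = $F5, so `push af` writes only $F5 bytes.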

There is little to be done about the RST and CALL instructions. But by fudging the position of the fill loop I can make it so the first RST/CALL that executes keeps the RAM filled with the target value. So it only sustains the RAM full state for 2 instructions, but that's twice as good as one instruction.

The "inc/dec (hl)" case just about finished me. I thought it would just work out because it would do 65536 of the instructions which would get (hl) back to the target value ($34 or $35). Not so. Consider the "inc (hl)" ($34) case. The very last "push de" at $3434 wipes out the "jp (hl)" and the "push de". Then we execute at $3435. So we end up only doing 65535 "inc (hl)" instructions, leaving a $33 at $3434. Took me quite a while to come up with the idea to stick a "push de" at $3433 that would wipe out itself and set $3434 to $34. Now we do 65536 "inc (hl)" instructions so when we get back to $3434 it is still "inc (hl)". So while "inc/dec (hl)" cannot keep RAM all the same, it will be the same every 256 instructions.
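To make the $3433 trick easier to follow, here is a hand trace of the `idhl` path from the original listing for A = $34. The instructions are copied from the first post; the comments are a by-hand reading of them and should be treated as a sketch rather than verified behaviour.

```z80
; Excerpt of the idhl path, traced for A = $34, so HL = $3434.
idhl:	ld	sp,hl		; SP = $3434
	pop	de		; SP = $3436 (popped value is discarded)
	ld	de,$e9d5
	inc	sp		; SP = $3437
	push	de		; ($3436) = $E9, ($3435) = $D5, SP = $3435
	ld	d,e		; DE = $D5D5
	jr	cont
cont:	push	de		; ($3434) = $D5, ($3433) = $D5, SP = $3433
; Layout just before jp (hl):
;   $3433: D5   push de   the extra one, reached only after execution
;                         has run all the way around memory
;   $3434: D5   push de   <- HL, loop entry
;   $3435: D5   push de
;   $3436: E9   jp (hl)
```

When the program counter finally comes back around to $3433, that last `push de` overwrites both itself and $3434 with $34, which is the self-erasing step described above.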

My only concession to sanity was to not think about how this would go down with other processors. Much. 6809 and 68000 and ARM with their ability to push multiple registers in a single instruction might make it easy. Their rich indexing modes could make sustaining the fill a little trickier. More advanced chips can sidestep the problem entirely by running the clear loop in the instruction cache. I don't think the 6502 can do this in general since it has only JSR and BRK that can write 2 bytes at once. Seems like it could manage for 1 or 2 specific values, though.
 
If the goal is "set every memory location's value to X and keep it that way" (with the X argument being passed in the A register), then this is achievable for 245 out of the 256 possible values of X (namely, all values but the opcodes for INC/DEC (HL), RST, and unconditional CALL).

A related goal is "enter a loop during which every value ever placed on the data bus (either by the Z80 or by the memory) is X". The puzzle code happens to achieve this for the same 245 values.

Here's my puzzle for y'all: how can this be done for all 256 values?



Hint: it requires using a feature for which: (a) the Zilog documentation is insufficiently detailed for our purposes; (b) the description in "The Undocumented Z80 Documented" (version 0.91) is sufficiently detailed, but wrong; and (c) the implementation in trs80gp (version 1.8) is also wrong.

(I know that trs80gp 1.8 is way out of date, but that's the most recent version I could find today on http://members.shaw.ca/gp2000/. If George uploads a more recent version I'll tell you whether it gets this right now.)
 
A related goal is "enter a loop during which every value ever placed on the data bus (either by the Z80 or by the memory) is X". The puzzle code happens to achieve this for the same 245 values.

Here's my puzzle for y'all: how can this be done for all 256 values?

I think I know the answer and if so the latest version of trs80gp does not get it right. It comes ever so close and, funnily enough, I recently made a note to investigate this particular issue to confirm what I'd read. If you combine the various sources together and think "what would be the most straightforward implementation" then how it will be is clear. So there was what I read and what I should have read into it.

To not give away the answer yet, here's my program that demonstrates trs80gp getting it wrong:

7D00: F3AFD3EC21003C11013C01FF033676EDB0E9

For those who can't help but instantly read the Z-80 in a cramped hex string you have my apologies and admiration.
 
...
To not give away the answer yet, here's my program that demonstrates trs80gp getting it wrong:

7D00: F3AFD3EC21003C11013C01FF033676EDB0E9

For those who can't help but instantly read the Z-80 in a cramped hex string you have my apologies and admiration.

Heh, I only had to look up three opcodes.... that's not good.... I did have to pull it in to an editor and space it, though. My eyes aren't what they used to be.
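Spaced out and disassembled, the hex string above appears to read as follows. The mnemonics and comments are a reading of the bytes, not taken from the original post, and no attempt is made here to say what the write to port $EC is for.

```z80
	org	$7d00
	di			; F3
	xor	a		; AF
	out	($ec),a		; D3 EC     write 0 to port $EC
	ld	hl,$3c00	; 21 00 3C  start of video RAM
	ld	de,$3c01	; 11 01 3C
	ld	bc,$03ff	; 01 FF 03
	ld	(hl),$76	; 36 76     $76 is halt
	ldir			; ED B0     fill $3C00-$3FFF with halt
	jp	(hl)		; E9        HL = $3FFF after the ldir
```

The final jump therefore lands on a HALT in the very last byte of video RAM.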

Hmmm, special reset indeed... I remember reading this. Neat thought, misusing something like that..... And some folks don't realize you can run from video RAM (4P does this during boot and the initial memory test....).

What the 'Undoc Doc' says about HALT and the program counter is wrong. During the HALT state, the Z80 will fetch the instruction after HALT until a HALT exit condition is triggered; the PC is pointed to the instruction after HALT so that an NMI or INT will return to the correct place after execution, and the Z80's fetch logic fetches, but does not execute, the byte after the HALT.

Ok, my solution to Petrofsky's puzzle. This is likely not to work correctly on anything but a real Z80:
8100: F3210681777676
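Read as assembly, those seven bytes appear to be the following. The comments are an interpretation of the hex, and the trailing `db` is simply the placeholder byte that the store overwrites.

```z80
	org	$8100
	di			; F3
	ld	hl,$8106	; 21 06 81  address of the byte after the halt
	ld	(hl),a		; 77        plant the target value there
	halt			; 76        at $8105
	db	$76		; 76        at $8106, replaced by A at run time
```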

Petrofsky, nice technique you thought of there..... the challenge now would be to compensate for various incoming values of A and leverage the 'special reset' (I'm not sure any of the TRS-80's ever did this, and the Model I in particular used NMI and not reset.....) to make the machine cycle through, incrementing or decrementing the value pushed to the data bus....... heh, this has been a fun exercise! Thanks again, GP for piquing our interest in some really fun code....
 
Hmmm, special reset indeed

To be clear, Lowen is referring to this article, which talks about HALT and special resets, but I was not suggesting the use of any reset (special or not-so-special), and I don't believe a special reset (as described in patent 4,486,827) can ever happen on a TRS-80 unless you somehow modify the hardware to generate a single-cycle reset signal at just the right time.

Lowen already posted a solution to my puzzle, but for eccentrics who would like to see one in cryptic mnemonics rather than straightforward hexadecimal, here you go:
Code:
	DI
	LD (HLT+1), A
HLT	HALT

The "Z80 CPU User Manual" says (on page 14 of the January 2016 edition):
Each cycle in the HALT state is a normal M1 (fetch) cycle except that the data received from the memory is ignored and an NOP instruction is forced internally to the CPU.

What it doesn't say is whether the value placed on the address bus during those M1 cycles is: (a) the address that was used to fetch the HALT instruction (the wrong answer), or (b) the address that will be pushed on the stack when an interrupt arrives (the right answer, which is equal to the wrong answer plus one).

I used these two two-byte programs to test HALT handling:
Code:
	Org	3FFDh
Start	DI
	HALT
	End Start
Code:
	Org	3FFEh
Start	DI
	HALT
	End Start

On a Model III, the first program causes the three leftmost columns of the screen to go blank, and the second program does not. In most emulators, neither program will cause any blanking (simply because the emulators don't implement video blanking, regardless of whether or not they get HALT right). In an emulator that correctly implements blanking but incorrectly implements HALT, both programs will cause blanking.

And some folks don't realize you can run from video RAM (4P does this during boot and the initial memory test....)
I didn't know that about the 4P boot. I'll have to check it out sometime.

Of course, running from video RAM on a Model III/4/4P is pretty boring compared to running from video RAM on a Model I sans lower-case kit. You only have 128 opcodes to work with (20-5F and 80-BF). Anyone have any interesting examples of doing that?
 
To be clear, Lowen is referring to this article, which talks about HALT and special resets, but I was not suggesting the use of any reset (special or not-so-special), and I don't believe a special reset (as described in patent 4,486,827) can ever happen on a TRS-80 unless you somehow modify the hardware to generate a single-cycle reset signal at just the right time....

That was just me giving a cryptic reference to the document that first mentions what the data bus is doing during the assertion of /HALT (where the / denotes an active-low signal; the TRS-80 schematic standard is a * instead). I left a comment on that page pointing to this thread, incidentally.
 
On a Model III, the first program causes the three leftmost columns of the screen to go blank, and the second program does not. In most emulators, neither program will cause any blanking (simply because the emulators don't implement video blanking, regardless of whether or not they get HALT right). In an emulator that correctly implements blanking but incorrectly implements HALT, both programs will cause blanking.

Oh, boy, even if HALT handling were fixed trs80gp still wouldn't get this right because, much to my dismay, I don't implement that left side blanking. Or the more general case where the Z-80 is nominally denied access to video memory during active display but can "sneak in" when a non-graphics character is being drawn. I've put it off because it'll take a lot of work to get right and I can't think of a compelling demo to show it off.

But it is distressing because way back when the left-side flicker in Super Nova on the Model III was the first indication that beam interference was not a completely solved problem on the Model III. Though I didn't have the faintest clue about that at the time nor did I realize that the "hashing" on the Model I display had the same underlying cause.

Of course, running from video RAM on a Model III/4/4P is pretty boring compared to running from video RAM on a Model I sans lower-case kit. You only have 128 opcodes to work with (20-5F and 80-BF). Anyone have any interesting examples of doing that?

Although I was mistaken in its usefulness, I wrote a program that would archive itself onto Model I uppercase-only video RAM. It could then execute that program which would unpack the original back into ordinary memory. I'd have to dig it up, but essentially it encoded most of the data in a "7 bit safe" form and a tiny "7 bit safe" program would copy it from video memory to RAM. At that point I believe there was a second stage that would convert the "7 bit safe" data back to 8 bit original form.

It was an interesting though rather pointless exercise. In other words, right up my alley.

As a warm-up, think about writing programs that contain no zero bytes. Leo Christopherson did this in "Dancing Demon" and other programs which looked like BASIC code but were mostly raw data and machine language subroutines. BASIC is pretty forgiving of this kind of thing, but if you wish to retain the ability to edit the program you can't have zero bytes in a line. It's not that hard and certainly gets one thinking in the right direction on how to work around bigger restrictions.
 
Oh, boy, even if HALT handling were fixed trs80gp still wouldn't get this right because, much to my dismay, I don't implement that left side blanking. Or the more general case where the Z-80 is nominally denied access to video memory during active display but can "sneak in" when a non-graphics character is being drawn. I've put it off because it'll take a lot of work to get right and I can't think of a compelling demo to show it off.

I just want to say that I think your trs80gp emulator is a fantastic bit of work, even if I don't use it much (I only use Windows when I have to; I am normally a CentOS Linux user, and I have been a Linux user for 20 years, and a Unix (and TRS-80 Xenix) user for the 9 years before that. Yow, it will be 30 years next May since I first used a TRS-80 Model 16.....).

Having said that, this type of thing is where the MESS/MAME emulators really shine, since the emulation is hardware-level-accurate, or at least it strives to be. An accurate MAME/MESS implementation of the Model 3 would do everything exactly like the real Model 3 hardware does, within the limits of the individual chip implementation. And the Z80 in MESS/MAME does it wrong. Needs a bug report.

But it is distressing because way back when the left-side flicker in Super Nova on the Model III was the first indication that beam interference was not a completely solved problem on the Model III. Though I didn't have the faintest clue about that at the time nor did I realize that the "hashing" on the Model I display had the same underlying cause.

Being 'bug-compatible' is hard. :)

Although I was mistaken in its usefulness, I wrote a program that would archive itself onto Model I uppercase-only video RAM. It could then execute that program which would unpack the original back into ordinary memory. ...

This sounds like a cool exercise..... And I want to thank you again, GP, for posting this. I haven't had this much fun with Z80 stuff in years.
 