That seems to have turned out to be an urban legend. I got into this in a couple of questions and answers on the Retrocomputing SE where we discovered that the particular layout of the frame buffers just happens to accomplish your DRAM refresh within the necessary timings, whereas in certain modes a linear frame buffer wouldn't. So it seems to be another case of an ingenious Woz design resulting in fairly large savings, rather than being penny-pinching.
Do you mean this? Because this does not support what you're saying here. In fact one of the posters seems to be pulling his hair out trying to debunk the notion that the strange memory map is a necessary consequence of the DRAM refresh. In fact, if we just stop and think about it for a minute it's painfully obvious that's not why it's all messed up.
In "Understanding the Apple II" the author discusses DRAM refresh, how it's vital that every ROW address be accessed at least once every two milliseconds, and how the Apple II does some weird cut-and-shuffling of how the address lines are set up on the address multiplexors to achieve this using the video timing as the source of entropy. So let's think about the reality of the situation here.
Let's start with what the Apple II actually has: the weird memory map, in both text and graphics modes, exists because Woz wanted to use the *absolute minimum* of counter chips and latches for address generation of a 40 column screen. 40 is not a power of two. This is awkward if your display system has a text mode, because a text mode needs to run through the same 40 bytes of memory for 8 consecutive scanlines to generate the characters. So you can't just use a braindead linear ripple counter that gets 40 pulses per line to walk through the display (like you technically could if you were only doing graphics); you need to be able to latch and repeat that same set of addresses 8 times before you let it increment.
Here is a fragment of the video address generation schematic for a Commodore PET showing a standard, pedantic way to do that. Woz apparently didn't want to have to implement the 13 bits (*) of counter and latch he'd need to implement this for the Apple II, so he went all hawg wild with his clever scheme that packs three non-contiguous lines into each 128 byte address block, "wasting" the last 8 locations, etc, etc, and, yeah, that scheme saved a few chips, at the cost of creating a weird doubly-interlaced memory map. Yay, Woz is clever genius, saved some chips...
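The interleave is easy to see from the text page 1 address formula (a sketch in Python; the formula is the standard, well-documented Apple II one, the rest is just illustration):

```python
# Sketch of the Apple II text page 1 address layout. Text page 1 lives
# at $0400; each 128-byte block holds three non-contiguous screen rows.
def text_addr(row, col):
    """Byte address of character (row 0-23, col 0-39) on text page 1."""
    return 0x400 + 0x80 * (row % 8) + 0x28 * (row // 8) + col

# Rows 0, 8 and 16 share one 128-byte block, at offsets 0, 40 and 80:
print(hex(text_addr(0, 0)))    # 0x400
print(hex(text_addr(8, 0)))    # 0x428
print(hex(text_addr(16, 0)))   # 0x450
# ...and the last 8 bytes of each block (here 0x478-0x47f) go unused:
print(hex(text_addr(16, 39)))  # 0x477
```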
... but how exactly was this a *good* thing for memory refresh? It's actually farging terrible, and that stackexchange thread explains why. Apple's scheme divides the screen vertically into three sectors in which the same 40 addresses (6 bits' worth out of each 7-bit chunk) get replayed for 8 text mode lines, or 64 scanlines. Remember, we have to touch all 7 bits' worth of row addresses at least once every 2 milliseconds; 64 scanlines is about twice that, repeating the same 40 low-order values the whole time. So clearly we're badly screwed by this; we can't just use the low order address lines as our DRAM ROW inputs and expect this to work, because this system is going to take essentially a whole frame to generate every bit combination. Thus the need to do the weird things described in Understanding the Apple II to give us some rolling values derived from somewhere else to provide a full set of refresh addresses.
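You can see the problem directly by asking which low-order 7-bit values the beam touches during the top third of the screen (again using the standard text page 1 formula, the rest being illustration):

```python
# Which 7-bit low-address values appear while the beam scans the top
# third of the text screen (rows 0-7, i.e. 64 scanlines)?
def text_addr(row, col):
    return 0x400 + 0x80 * (row % 8) + 0x28 * (row // 8) + col

low7 = {text_addr(r, c) & 0x7F for r in range(8) for c in range(40)}
print(len(low7))           # 40
print(min(low7), max(low7))  # 0 39
# Only offsets 0-39 of each 128-byte block -- 40 of the 128 possible
# values -- for roughly 4 milliseconds straight.
```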
By contrast, let's just pretend that instead of this bizarre setup it had a pedantic counter/latch setup like the Commodore PET, with the low address lines used as ROW addresses on the DRAM chips. Sticking with character mode, with 8 line tall characters and no dedicated refresh cycles, you're guaranteed to walk all the way through 7 bits of addressing every 25 scanlines. (You'll be hitting 120 of the addresses 8 times, and the last 8 once.) 25 scanlines is about 1.6 milliseconds. So, BOOM, you're guaranteed acceptable DRAM refresh rates *just* from the address generation circuitry cycling through valid active-area video addresses(*),
no weird shuffling of address lines needed, and with a sane linear memory map.
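Here's a sketch of that hypothetical linear layout bearing out the 25-scanline figure (the 40-consecutive-bytes-per-row layout, the base address, and the ~63.5µs NTSC scanline are my assumptions for the illustration):

```python
# Hypothetical PET-style linear layout: each text row is 40 consecutive
# bytes, and the low 7 address bits feed the DRAM ROW inputs.
def linear_addr(row, col):
    return 0x400 + 40 * row + col   # base address arbitrary

# Walk the beam through text rows and count scanlines until every
# 7-bit low-address value has been seen at least once.
seen, scanlines = set(), 0
for row in range(24):
    for _ in range(8):              # each text row repeats for 8 scanlines
        scanlines += 1
        seen.update(linear_addr(row, c) & 0x7F for c in range(40))
        if len(seen) == 128:
            break
    if len(seen) == 128:
        break

print(scanlines)                    # 25
print(scanlines * 63.5e-6 * 1000)   # ~1.59 ms, comfortably under 2 ms
```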
(* We will need to keep running dummy reads during the vertical blanking period, of course, since that's like a quarter of the 16.7ms frame time.)
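(Rough arithmetic behind that "quarter", assuming NTSC-ish timing of 262 scanlines per frame with 192 of them in the active display area:)

```python
# Fraction of the frame spent in blanking, assuming 262 total scanlines
# per frame and 192 active (visible) ones.
blank = 262 - 192
print(blank)        # 70
print(blank / 262)  # ~0.267 -- roughly a quarter of the frame
```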
I don't know how much the chips needed to do it cleanly and linearly would have cost in 1976, I was... a little young to be into that sort of thing, so I'm not going to second guess whether the tradeoff here was worth it, but please don't spread false information.
(*Edit: We could get by with only the same number of bits the Commodore PET uses for its counter/latch system if we're willing to have interlaced graphics memory; we could just use the character generator row lines as the top three address bits in graphics mode. I'm deeply puzzled by the claim that in "some modes" a linear framebuffer would somehow skip iterating through all possible 7-bit addresses in under 2ms; the text/interlaced-graphics version with 10-bit addressing does it every 1.6ms, always, while a fully linear 13-bit address mode does it every 4 scanlines, or every quarter of a millisecond.)
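The fully linear graphics case is even easier to check (same assumptions as before: 40 bytes per scanline, arbitrary base, ~63.5µs scanlines):

```python
# Fully linear 13-bit graphics addressing: 40 consecutive bytes per
# scanline, addresses just incrementing. How many scanlines until all
# 128 low-address values have appeared?
seen, scanlines, addr = set(), 0, 0x2000   # base address arbitrary
while len(seen) < 128:
    scanlines += 1
    for _ in range(40):
        seen.add(addr & 0x7F)
        addr += 1

print(scanlines)                    # 4
print(scanlines * 63.5e-6 * 1000)   # ~0.25 ms
```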