
Extended Bootloader Blues

jackrubin

I finally finished assembling my 32K+Bootloader combo card today and popped it into my newly revived 8/E, planning to start running MAINDECs. As a warmup, I turned on the machine, noted the Cylon salute and then pushed the button on the bootloader until 26 was displayed in the MA register. I waited a few seconds until the loader loaded and then switched the select knob on the front panel to AC. I watched the accumulator increment and then the machine suddenly halted. The test should have looped forever, so I started it again. It ran for a few cycles and then halted again.

Since I have a known-good 32K "old" board that has passed memory diagnostics for hours at a time and also a stand-alone version of Roland's original extended boot board, I could do some testing.

First, using the new board with on-board boot, I again used the on-board button to auto-load the AC increment test. It loaded, ran for a little while, then failed. At this point, I moved the selector knob from AC to MD and another bit of code loaded automatically and ran. While running, it shows MA = 0025, AC = 0100, MD = 7435, so it seems to be looping.

Next, I replaced the new combo board with my original 32K board, loaded the AC increment test by hand and started the test. It ran without issues until I stopped it after 20 minutes.

I then re-installed the stand-alone boot board (switches set to 00) along with the original 32K board. I checked that the code at AD 0200 was still correct, restarted the AC increment test and again ran it until I got bored and halted the machine (about 10 minutes).

Now I removed the old boot board, left the old 32K board in place and re-installed the new board with all memory disabled. I loaded the AC test by hand and ran it. The machine stopped after about 5 cycles (each cycle is about 1:03 minutes), showing MA = 0001, AC = 7275. I switched the display to MD and again, code loaded and ran, finally displaying MA = 0025, AC = 0100, MD = 7435.
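
As a rough sanity check on the 1:03 figure, the loop timing can be estimated by hand. The sketch below assumes a "cycle" means one full wrap of the AC through all 4096 values, and it assumes PDP-8/E instruction times of 2.6 us for ISZ and 1.2 us for JMP and IAC; those times are recalled from the 8/E documentation and are not stated anywhere in this thread, so treat them as assumptions.

```python
# Assumed PDP-8/E instruction times in microseconds (not from this thread).
ISZ_US = 2.6  # memory-reference instruction: fetch plus execute
JMP_US = 1.2  # direct jump: fetch only
IAC_US = 1.2  # operate instruction: fetch only

WORD_VALUES = 4096  # a 12-bit word has 4096 distinct values


def seconds_per_ac_wrap():
    """Estimate the time for the AC to count through all 4096 values."""
    # Per AC increment: one IAC, then 4096 ISZ passes, each followed by a JMP
    # (JMP .-1 on 4095 of them, JMP 0200 on the pass where ISZ skips).
    per_increment_us = IAC_US + WORD_VALUES * (ISZ_US + JMP_US)
    return WORD_VALUES * per_increment_us / 1_000_000


if __name__ == "__main__":
    print(f"{seconds_per_ac_wrap():.1f} s per AC wrap")
```

Under those assumptions the estimate comes out a little under 64 seconds, which is in the same ballpark as the observed 1:03, so the machine appears to be executing the loop at normal speed right up until it halts.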

I continued to ring the changes, including swapping the ATMEGA chip from the new board with the one from the original bootloader. Regardless of the combinations, whenever I run with the new card in place, even with memory disabled, the AC increment test fails after some random number of cycles at some random AC value, but then consistently loads new code when I move the selector knob and loops with MA = 0025, AC = 0100 and MD = 7435.

I currently have the old 32K board in place, with the old boot card loaded with the new ATMEGA chip; I ran the AC increment test for an hour and now feel confident that the machine is stable and I can move on with MAINDECs.

I didn't swap RAM chips between the 32K board because the new chips were disabled during testing. When I get further along, I may swap them into the old board and run DHKMAD against them for a while, but I don't think they are at issue here.

Vince cautions that running the new board with Q1 and Q2 in place but without the ATMEGA (if I understood him correctly) will stall the system, so I haven't tried running without the micro.

So what might be wrong? I'm seeing a random failure that initiates a repeatable event. Does anyone recognize what code might be loading from the loop pattern? In all cases, the switches on either boot board are set to 00, so nothing should be loading.

????

Thanks!
 
It sounds to me like the boot loader is working to the point of properly loading the boot code, and the PDP8E takes off and runs. Then, in my opinion, the boot loader comes back into the picture, confuses the PDP8E, and it halts. Maybe you should disable Q1 and Q2. I did that when troubleshooting the memory problem. I seem to remember just removing a chip and pulling the base resistors high or low. Anyway, it worked and I didn't have to unsolder the transistors. Then run with the old boot loader and the new board memory.

On another note, I tried program 26, the AC INC and it has been running for nearly an hour on my machine and has not failed. So maybe there is a hardware problem on your new board? I have two of Vince's boards. Tomorrow I'll try the other board. Mike
 
I finally finished assembling my 32K+Bootloader combo card today and popped it into my newly revived 8/E, planning to start running MAINDECs. As a warmup, I turned on the machine, noted the Cylon salute and then pushed the button on the bootloader until 26 was displayed in the MA register. I waited a few seconds until the loader loaded and then switched the select knob on the front panel to AC. I watched the accumulator increment and then the machine suddenly halted. The test should have looped forever, so I started it again. It ran for a few cycles and then halted again.

Silly question, but are you trying to run with the Alliance memory chips, or with the replacement Cypress chips I sent you?

Vince
 
The new board has a pair of bright and shiny new Cypress CY62256NLL-70PXC chips, date code 1849. I've disabled them for most of my testing using the select switch but I haven't tried swapping them between the old 32K board and the new 32K+B board. Can you (or Mike) provide more detail on disabling/bypassing Q1 and Q2? Not much to play with on the bootloader side - it's built out with the components provided - ATMEGA + crystal + MAX232. I'll dig out the magnifier and re-check the resistor values today.

It seems like another one of those nasty issues where something is just on the verge of being "out of synch" and toddles along until it falls over the edge.

I'll build out my second board later this week but won't be able to get back to this machine for a couple of weeks. I guess it's time to get the 8/M running.
 
Jack, I ran the 2nd board this morning with the AC INC program 26 for nearly an hour and it is still running with no problems. To disable Q1, remove the 328 chip and connect a jumper from the junction of R10 and R13 to ground. I found a convenient ground at C9. See picture (20190603_085826.jpg). Q2 needs a jumper from the junction of R11 and R14 to ground. Mike
 
First, using the new board with on-board boot, I again used the on-board button to auto-load the AC increment test. It loaded, ran for a little while, then failed. At this point, I moved the selector knob from AC to MD and another bit of code loaded automatically and ran. While running, it shows MA = 0025, AC = 0100, MD = 7435, so it seems to be looping.
If I am following this, you loaded the IAC; ISZ 0300; JMP .-1; JMP 0200 loop at 0200, which is test 26?

I'm (like you) at a loss as to how MA can equal 25. What does "another bit of code loaded automatically and ran" mean? I don't think code should load just because you changed the display knob.

I continued to ring the changes, including swapping the ATMEGA chip from the new board with the one from the original bootloader. Regardless of the combinations, whenever I run with the new card in place, even with memory disabled, the AC increment test fails after some random number of cycles at some random AC value, but then consistently loads new code when I move the selector knob and loops with MA = 0025, AC = 0100 and MD = 7435.
Looping with that display, I think is consistent with having loaded and branched to bootstrap 01. If you single step from there, does it alternate 6411, 5024? (If you have a serialdisk setup, does it try to boot it?)
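
For readers who want to check the octal by hand, here is a minimal sketch of a PDP-8 instruction decoder. It only splits a 12-bit word into its standard fields (opcode, indirect bit, page bit, offset, or IOT device and function); the function name and output format are made up for illustration, and it does not try to name individual IOT or operate instructions.

```python
MRI = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP"]  # opcodes 0 through 5


def decode(word, addr=0):
    """Decode a 12-bit PDP-8 word fetched from `addr` into a short description."""
    op = (word >> 9) & 0o7
    if op <= 5:
        # Memory-reference instruction: bit 3 = indirect, bit 4 = current page.
        indirect = bool(word & 0o400)
        current_page = bool(word & 0o200)
        offset = word & 0o177
        page_base = (addr & 0o7600) if current_page else 0
        target = page_base | offset
        return f"{MRI[op]}{' I' if indirect else ''} {target:04o}"
    if op == 6:
        # IOT: 6-bit device code followed by a 3-bit function code.
        device = (word >> 3) & 0o77
        function = word & 0o7
        return f"IOT device {device:02o} function {function}"
    return f"OPR {word:04o}"


if __name__ == "__main__":
    for addr, word in [(0o24, 0o6411), (0o25, 0o5024)]:
        print(f"{addr:04o}: {word:04o}  {decode(word, addr)}")
```

Decoded this way, 5024 is a JMP to 0024 on page zero and 6411 is an IOT to device 41, so a 6411/5024 pair sitting at 0024 and 0025 would be a two-instruction wait loop, which fits the MA = 0025 display. Reading function 1 as a skip-on-flag, by analogy with the console's KSF (6031), is an assumption here, not something the decoder establishes.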

So there seem to be a couple of oddities..."increment test fails" and "loads new code". I assume when you say the former, you mean something has caused the AC to cease to increment? Can you press Single Step and get any info? Even just an MA value from before you move the knob could help.

If things are still wonky with R10/R13 grounded, ground R11/R14 as well. If *that* is still wonky, I'd set up to look at the signal "FLIP_FLOP", which should remain low when the boot loader is inactive.

Vince
 
Responding point-by-point:

1) Yes, that is the program I loaded and ran, either manually or by clicking in 26 with the on-board push button.

2) By "loaded automatically", I mean just that - when the AC stopped incrementing, I (after either a long or short delay) moved the selector switch from AC to MD and the panel lights started to flash. When the loader finished and the machine was looping, I halted it and then single stepped around the loop. It did indeed alternate between 5024 and 6411, so you've correctly identified the boot code that ran (bootloader select switch set to zero during all these exercises).

3) Removing the ATMEGA micro and then grounding R10/R13 locked the machine with MA = 7777 and I was unable to examine after deposit. Grounding R11/R14 in addition simply set EMA to 7 as well with no other change.

Unfortunately, I'm not well equipped to go much farther here so I'll take the boards back to Chicago and work on getting the 8/M up and running so I can continue testing there.

Thanks -
 
2) By "loaded automatically", I mean just that - when the AC stopped incrementing, I (after either a long or short delay) moved the selector switch from AC to MD and the panel lights started to flash.
Well, that's not supposed to happen :eh:.
3) Removing the ATMEGA micro and then grounding R10/R13 locked the machine with MA = 7777 and I was unable to examine after deposit. Grounding R11/R14 in addition simply set EMA to 7 as well with no other change.
That seems a little odd, too.

I had a draft, but the forum ate it instead of posting it, where I had a couple of other ideas.

Make sure C10 is installed, to minimize the possibility that stray noise could toggle SW.

There are some places where I have arguably placed a via too close to where the kit assembler would need to solder. Check especially that the via near pins 24 and 25 of the ATMEGA isn't shorted to either. There are also a couple of others near U8 and U9 that might affect FLIP_FLOP.

Vince
 
Moving along slowly, I built my second Mem+Bootloader card last week and brought it out to the Computerarium for testing. It exhibited the same behavior as the first card, running the Increment Accumulator test for a while (about 15 minutes) before failing and falling into a loop awaiting input from Serial Disk.

I also swapped RAM (new Cypress for old Cypress) between the new card and my five-year old 32K memory board (RAM date code 1343). Similar results - failed into a loop after about 20 minutes of incrementing accumulator. I put the new Cypress RAM (date code 1849) into the original 32K memory board and ran the increment test for 45 minutes before shutting it down out of boredom.

This morning I'm using the new RAM/old board and running through diagnostics, starting with Roland's v1.1 bootloader (short card) to load the toggle-in tests. I've got an uncomfortable feeling about ISZ since the D0F diagnostic ran strangely last fall. Maybe it's been failing but without an M847 in place to respond, it went unnoticed?

More to come -
 
Moving along slowly, I built my second Mem+Bootloader card last week and brought it out to the Computerarium for testing. It exhibited the same behavior as the first card, running the Increment Accumulator test for a while (about 15 minutes) before failing and falling into a loop awaiting input from Serial Disk.

Well dangit! Does your system run OK with the new board but the boot DIP switches set to 00?

I'm assuming you've checked on C10 and the close vias?

Vince
 
At Roland's suggestion, I installed a 1 uF capacitor between pin 15 (SW_RUN) and pin 8 (GND) on the Atmega to reduce the chance of EMI glitches. I successfully ran all the short tests in the bootloader code and I'm now running the Accumulator Increment test again. It's been running 50 minutes now on the modified board and I'm hoping it will make it to one hour, at which point I'll feel confident about running MAINDECs.

I think Roland's intuition is good - I might be suffering from the Davey Jone's effect - my lab, like his new space, is well lit with lots of overhead LED light bars (8 of them, with four within 6 feet of the rack). Previously, I've been running with the CPU extended from the rack and the cover off. Currently, it is pushed back into the rack. The cover is still off but it is sitting beneath the full-depth metal enclosure of my DSD-210 floppy drives. If this run goes to one hour, I'll go back and try the unmodified board but with the CPU metal cover installed.

I'll see if I can borrow a spectrum analyzer.
 
At Roland's suggestion, I installed a 1 uF capacitor between pin 15 (SW_RUN) and pin 8 (GND) on the Atmega to reduce the chance of EMI glitches.

Isn't that essentially what I was saying about C10, except he's suggested 1uF instead of 0.1uF?

Anyway, glad to hear it is working!

Vince
 
I just tried the first (unmodified) Boot+Mem card with the machine closed (no cover) and the lab lights off. It ran the Accumulator Increment test for 2 minutes before stopping, then dropping into the Serial Loader when I switched display registers.

Roland's prototype boards were multilayer with an embedded ground plane, so I expect that his original board was much less sensitive to stray emissions than the full-size combo boards.
 
C10 is in place on both boards and didn't seem to stop the glitching. I installed the filter cap (across R_SW, not SW_RUN as I first wrote) directly at the Atmega. I'll try increasing the value of C10 on the first board and see if that helps. Unfortunately, no more 1 uF caps here, so it will have to wait a week.

Time passes slowly out here in the Midwest.
 
Hi Jack,

Good to hear that a capacitor directly on the Atmega input stops the problem. Raising the value of C10 will probably not fix the problem. I think it's the place where C10 is positioned. Without measuring, this is just a guess: the track from C10 to the input of the Atmega is quite long on the PCB. That might work as an antenna. That antenna is grounded at C10 because C10 is a short circuit for high frequencies. The other end of the track is on the Atmega, which has nothing more than a pull-up resistor. Powerful noise might induce a voltage on the track and trigger the switch input on the Atmega.

Since the PDPs are known to transmit a lot of noise, it could also be the PDP itself. That makes me wonder whether anyone else has the same problem that you are experiencing. But I've also seen weird things happen with phone transmitters or radio transmitters near electronic equipment.

Roland's prototype boards were multilayer with an embedded ground plane, so I expect that his original board was much less sensitive to stray emissions than the full-size combo boards.

Nope, not multilayer. Just 2 layers... Probably less sensitive because a shorter track between that capacitor and the cpu...

I will make an update to the software to make it insensitive to these spikes. Give me a bit of time.
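
To illustrate the general idea of filtering spikes in software, here is a minimal sketch of a consecutive-sample debounce. It is not Roland's firmware and says nothing about how the real bootloader code is written; the function name, the sample count and the idle-high initial level are all assumptions chosen for the example. The filtered level only changes after the raw input has disagreed with it for a set number of samples in a row, so a short glitch is ignored.

```python
def debounce(samples, stable_count=4, initial=1):
    """Yield the filtered level for each raw sample.

    The output changes only after `stable_count` consecutive raw samples
    differ from the current output; shorter disturbances are ignored.
    """
    state = initial
    run = 0  # how many consecutive samples have disagreed with `state`
    for s in samples:
        if s != state:
            run += 1
            if run >= stable_count:
                state = s
                run = 0
        else:
            run = 0
        yield state


if __name__ == "__main__":
    glitchy = [1, 1, 0, 1, 1, 0, 0, 1, 1]   # short spikes only
    pressed = [1, 0, 0, 0, 0, 0, 1]         # a sustained low level
    print(list(debounce(glitchy)))
    print(list(debounce(pressed)))
```

A filter like this trades response time for noise immunity: the longer the required run, the longer a genuine switch action must be held before it is recognised.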

In the meantime, just test and use the boards with a little capacitor directly on that pin 15. Let me know how it goes. If there is ever another run of these boards, it might be wise to place a capacitor near that input. That is, if your boards continue to work. Again, I don't have Vince's nice board and I can't test anything. I'm just trying to help and think about what might be the problem.

Regards, Roland
 
@Roland - Mike Zahorik seems to be using a combo board without any problems, but I don't know his exact setup. I certainly understand the long trace across the board acting as an RF antenna but I've also used your prototype on an extender card without any apparent problems.

I realized after my post that your boards are in fact 2 layer but with large copper pours between traces, so maybe that provided enough shielding to prevent interference?

Anyhow, I'm glad to have the ghost exorcised from my machine and look forward to pushing it through the MAINDECs next weekend.
 
I haven't assembled my combo card yet for my PDP-8m, but I might also suggest it depends on where the combo board is within the backplane, and what card(s) are adjacent to the combo card in the backplane. Any EMI from adjacent cards would be strongest. If the combo card is on an extender then other EMI sources could come into play.

I am a big fan of 4 layer cards with full power and ground planes. They are not that much more expensive nowadays (from sites like JLCPCB at least; not OSH Park) and they provide a much better environment for on card signals.

Don
 
Well, the ghost came back, but I think it is now well and truly squashed.

Earlier this week, the Computerarium was honored with a visit from a small subset of the Family of Eight - Doug Ingraham and Mike Zahorik stopped in for a couple of days of manual labor and PDP-8 troubleshooting. Mike brought along his own boot+mem combo card, which has been working reliably for him on his home machine; we put it into my system and it failed just like my board did. My modified board worked better with the increment accumulator test but went crazy when I tried to load the RIM loader from the onboard Atmega micro.

Time to change perspective. We backed up and started with a power supply checkout. Everything looked good - values in spec and minimal ripple. Next was a discussion of the action of the SW switch and associated logic levels. It turns out that Mike generally was running his machine with the switch down, while I usually kept the switch up, then toggled it down and back up to initiate a boot of either my MI8E or Roland's boot loader. We discussed this, consulted the manual and agreed that typically the switch should be up during operation. I did, however, try running my board with the switch down and found that it was now running reliably. ??? We decided to scope the SW line on the OMNIBUS and saw a pretty surprising display:


The trace shows the SW line - it should be either a solid 0 or a solid 1 but it clearly wasn't. This isn't noise - it's data. To make a long story less long, SW is carried on OMNIBUS line D2V and DATA11 (bit 11) is carried on OMNIBUS line D1V. A quick binary search of the backplane (I pulled the inter-backplane connectors) verified that the SW line behaved normally when only the front backplane was in use. A little more investigation located a short in the very last slot of the rear backplane between D1V and D2V - opposing pins, normally isolated by the plastic connector housing, but the back corner of the housing was chipped and the spring loaded contacts were touching. During the increment accumulator test, I was actually running bit 11 output on the SW line; given the right (wrong) timing of a data transition, a new boot was initiated.
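
To see why the increment test puts so much activity on bit 11, here is a small sketch. DEC numbers bits from the most significant end, so bit 0 is the MSB and bit 11 is the LSB of a 12-bit word, and the LSB flips on every increment. The helper names are made up for illustration, and the model only looks at the values an ISZ writes back; the real bus also carries instruction fetches and everything else the CPU does, so the actual waveform is more complicated than this.

```python
def dec_bit(word, n, width=12):
    """Return DEC-numbered bit n of a word (bit 0 is the most significant)."""
    return (word >> (width - 1 - n)) & 1


def data11_during_isz(start, count):
    """Bit 11 of each value written back by `count` successive ISZ operations."""
    value = start
    out = []
    for _ in range(count):
        value = (value + 1) & 0o7777  # 12-bit wraparound, as ISZ does
        out.append(dec_bit(value, 11))
    return out


if __name__ == "__main__":
    # Starting from 0, the written-back values are 1, 2, 3, ...
    print(data11_during_isz(0, 8))
```

In this simplified model bit 11 alternates on every pass through the ISZ, so a line shorted to DATA11 would look like a stream of pulses instead of a steady level.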

It is interesting to note that this system had been used reliably for years and was only decommissioned last summer. So why no problems? Because in its original configuration, the CPU and interface cards were in the chassis I'm using while 32K of core memory resided in a separate expansion chassis. A set of low-profile ribbon cables occupied the last slot of the chassis and kept the opposing pins from touching.

The solution - a strategically positioned guitar pick to keep the two DV contacts in their appropriate positions.

Thanks to Doug and Mike for noodling this through with me and for helping to install the power supplies in my 8 and 8/I and then racking a TU56 and a pair of TU55's. Things are looking good in the Computerarium. Some stuff even works!
 
Hi Jack,

Glad that you have measured what's happening on that line.
So Vince's board is okay and I don't have to update the software :)

Regards, Roland
 