VaxStation 3100 M38 - help with boot errors (?)

dcminter (New Member; joined Jan 20, 2024; 2 messages)

Hello!

Having done a lot of messing around on a V8800, a V8650, and a small lab full of VaxStations (all running VMS) as a user in college (DEC$PHONE and DEC$NOTES have a special place in my heart) I've long had a hankering to own a Vax of my own. Happily I recently acquired a VaxStation 3100 M38 from a local (Stockholm, Sweden) auction site as a matrimonially safe option (not one of the 400 volt wardrobes), and now I'm starting to try to get it up and running.

Internally it seems to have 8M (2x4M) of memory, and the B&W graphics cards. There's a SCSI card and a SCSI hard drive that spins up and sounds surprisingly healthy. I have it hooked up to a software serial terminal as the alternate console and I'm able to get console output at boot time - here's what I see currently:
Code:
KA42-B  V1.5         

F...E...D...C...B...F...E...D...C...B...A...9...8...7...6...5...4...3...2...1...


 ?  F  010A  010A.1901
 ?  E  010A  010A.1901
 ?  D  010A  010A.1901
 ?  C  010A  010A.1901
 ?  B  010A  010A.1901
 ?  A  010A  010A.1901
 ?  9  010A  010A.1901
 ?  8  010A  010A.1901
 ?  7  010A  010A.1901
 ?  6  010A  010A.1901
 ?  5  010A  010A.1901
 ?  4  010A  010A.1901
 ?  3  010A  010A.1901
 ?  2  010A  010A.1901
 ?  1  010A  010A.1901

By my understanding the lack of any '?' or '_' characters in the initial long line ought to imply that it's passing the self test - but I'm puzzled to see B to F repeated in that. The following lines, however, I don't know how to interpret. One '?' seems to imply a "soft error" for the device, but if that's the case why would all the devices have the same error code and what does it mean? Meanwhile the diagnostic LED code on the back is 1000 0000 (i.e. only bit 7 set), which seems to mean the self-test is complete, with (presumably) the substate of 0 meaning no failures.
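That reading of the long line can be checked mechanically. Here is a minimal Python sketch, assuming each test is printed as one hex digit followed by three marker characters, with '?' marking an error and '_' an option that is not installed. The function name `scan_countdown` and the exact token layout are assumptions for illustration and are not taken from the DEC documentation.

```python
import re

# Assumed token layout: one hex test ID followed by three marker characters.
TOKEN = re.compile(r"([0-9A-F])([.?_]{3})")

def scan_countdown(line):
    """Return (test_id, state) pairs for each test in the countdown line."""
    results = []
    for test_id, marks in TOKEN.findall(line):
        if "?" in marks:
            state = "error"
        elif "_" in marks:
            state = "not installed"
        else:
            state = "ok"
        results.append((test_id, state))
    return results

# The long line exactly as captured from the console above.
line = "F...E...D...C...B...F...E...D...C...B...A...9...8...7...6...5...4...3...2...1..."
results = scan_countdown(line)
flagged = [r for r in results if r[1] != "ok"]
print(len(results), "tests,", len(flagged), "flagged")
```

On the captured line this finds 20 tokens with none flagged, and the first ten IDs run F to B twice, which matches the repetition noted above.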

No further output appears however long I leave it at this point. Hitting the "Halt" button on the back or sending Break just restarts the boot process from scratch. I haven't found any way to get it to the chevron prompt from which the test results might be shown. I have the manuals available on bitsavers but they don't seem to have the answers on how to get further.

Any suggestions? I'm kind of hoping I'm just missing something blindingly obvious :D
 
I don't know what it is worth, but I asked Bard, the large language model from Google. Here is what it told me:
The error messages on the boot screen indicate that there are problems with all of the SCSI devices on the system. Each "? n 010A 010A.1901" message corresponds to a different SCSI device, where "n" is the SCSI device ID. The "010A" and "010A.1901" characters are the error codes for the SCSI device.

The specific cause of these errors can vary depending on the specific SCSI devices and the system configuration.

etc...

It's probably only LLM hallucinations but who knows...
 
Unlike previous generations of DEC machines, which came with not only test documentation but also the test source code, the MicroVAX-era tests are a deep, dark secret. I ran into this here with a MVax 3100-96. It was failing a test consistently and dumping a lot of information, but unfortunately I was never able to find a shred of information on how to interpret it.
 
(Out for the day... back on the case now)

So, @jplr I think that LLM is just rambling; amongst other things this is a SCSI-1 system and I don't think you could have more than 8 devices on the bus before SCSI-2! Anyway as far as I've been able to follow from the VS3100 maintenance guide that's the status table and the columns in it reflect:
  • An indicator as to whether it's a soft or hard error (? or ?? respectively)
  • The test number - they correspond to the tests of the long line, so e.g. device E is for the Time-of-year clock.
  • The device ID (whatever that means in this context? But SCSI device IDs would be at most 2 digits)
  • The status code
Nothing in the doc suggests that you'd stall at this point and not get to the boot chevron (but nothing says you won't either). I suppose it's possible they changed everything with later versions of the boot ROM (the maintenance guide explicitly mentions V1.0 whereas I see V1.5) but hopefully (?) not. Unfortunately the maintenance guide does say...
Any code in the configuration table other than 0000.0001 on the MONO, DZ, MM, FP, IT, or SYS devices indicates a hard error and the system module must be replaced for proper operation of the system
...so that looks bad, although the long line seems to be saying that there are no "hard errors" so I don't know what to think.
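For anyone wanting to compare captures, the four columns listed above can be split out with a short script. This is only a sketch of that reading of the maintenance guide: the function name `parse_status_row` and the dictionary keys are made up for illustration, and the soft/hard meaning of '?' versus '??' is taken from the description in this post rather than verified independently.

```python
import re

# Columns as described above: error marker, test number, device ID, status code.
ROW = re.compile(
    r"^\s*(\?{1,2})\s+([0-9A-F])\s+([0-9A-F]{4})\s+([0-9A-F]{4}\.[0-9A-F]{4})\s*$"
)

def parse_status_row(row):
    """Split one status-table row into its four fields, or return None."""
    m = ROW.match(row)
    if m is None:
        return None
    marker, test, device_id, status = m.groups()
    severity = "hard" if marker == "??" else "soft"
    return {"severity": severity, "test": test,
            "device_id": device_id, "status": status}

# Two rows copied from the console capture earlier in the thread.
rows = [" ?  F  010A  010A.1901", " ?  E  010A  010A.1901"]
parsed = [parse_status_row(r) for r in rows]
distinct = {(p["device_id"], p["status"]) for p in parsed}
print(distinct)
```

Collecting the distinct (device ID, status) pairs makes it obvious when every row carries the same values, as they do in the capture above.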

It's also a little concerning that device "4" is shown with the three dots in the long line (i.e. soft errors only) because that's supposed to be the 8 plane graphics module and I haven't got one so it ought to be showing with underscore to indicate "option not installed." Sigh. The VaxStation also doesn't have an internal battery currently (a pity as otherwise @thephysicist 's suggestion would seem very plausible).

Anyway, let me give it a whirl with the SCSI board unplugged and see if there's any difference... (does thing) ...and there is, but only in a small way; every test in the status table now has a status code of 010A.1D01 which is perplexing. Also devices 6 & 7 definitely ought to be showing as 'not present' (underscore) because they're supposed to be the SCSI board's bus controllers. Hey ho.
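To see exactly what changed between the two runs, the status codes can be XORed. The sketch below assumes the dotted code can be treated as a single 32-bit hexadecimal value, which is a guess about the format; nothing here says what the differing bit actually means.

```python
# Status codes copied from the two runs described in this thread.
with_scsi = int("010A.1901".replace(".", ""), 16)
without_scsi = int("010A.1D01".replace(".", ""), 16)

# XOR leaves a 1 in every bit position where the two codes differ.
diff = with_scsi ^ without_scsi
changed_bits = [b for b in range(32) if (diff >> b) & 1]
print(f"{diff:08X}", changed_bits)
```

Under that assumption the two codes differ in a single bit (bit 10, counting from 0), so pulling the SCSI board flips one flag rather than producing an unrelated code.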

@shirsch I'm saddened to hear that you couldn't find more info on this either.

@g4ugm Yes, console mode is enabled (switch up) per the docs. With the switch in the other position there's no output at all. Oh and I'm plugged into the printer port also per the docs :D

I guess I'll try unplugging one of the memory modules... and no change. Well, I'm stumped at this point :'(

Edit: Ah. Since I was pulling bits out, I just popped out the motherboard to have a good look and I guess this is why the battery was removed:
IMG_20240121_193118758_HDR.jpg

Some corrosion. I guess I'm likely going to have to replace the board, but I'll see if I can clean it up and get away with it first.
 
A leaking NiCad battery is most likely responsible for killing my 3100 M96 motherboard. I gave it a soak in vinegar, cleaned off the obvious corrosion and studied it under a magnifier until my eyes crossed. No obviously digested traces or other chemical damage but it fails test 42, subtest 0 every time. There's a cryptic test name: "Memory_shorts". Could mean anything, but changing out the SIMM modules didn't get it going. Ended up buying an M90 motherboard on eBay. Hopefully that will result in a working system.
 
Maybe it can be saved with a cleaning and a few bodge wires. If not, the "Bits Please" guy on eBay with all the 3100s has three M38s left. They are pricey ATM, but he is amenable to offers. I got a few M76s from him for a great price. He has been sitting on those M38s for a while now, so they may go cheap if you want a spare for parts. Can't hurt to try.
 