
Exploring the NEC V20

GloriousCow

I've got a new blog post out summarizing my findings fuzzing the NEC V20: https://martypc.blogspot.com/2024/05/exploring-nec-v20-cpu.html
It covers various details on instruction differences vs. the 8088 and 80188, undefined opcode behavior, and a general overview of the performance improvements made by NEC.

Given that I have a V20 under microcontroller control, if you have any questions about specific instruction execution scenarios on the V20 I can do my best to answer them in this thread.
I can simulate all of the pin inputs to the CPU, although I can't test interactions with an FPU, sadly.

Of course general questions about the V20 are welcome too.
 
One interesting thing to explore is the behavior of "PUSH reg" instructions with opcode FFh. Normally this encoding would only be used for memory operands, but on other processors it works for registers too. Not so on the NEC - some text file says it will simply act as a NOP, but from my own experiments there must be more going on: a single-step (and possibly also any other interrupt) following it will corrupt the stack.

But maybe we would need the microcode to figure out what is going on there...

Fun fact: the 80186 also allows AAM with a zero divisor, with the same result in AH/AL. All other x86's throw a divide exception.

edit: oops, it's the POP instruction which has the bug
 
Is there a particular encoding you'd like to have me try? I tried PUSH AX via FF F0.
Code:
Initial register state:
AX: 1234 BX: 1D00 CX: 0050 DX: 0200
SP: FFFE BP: 0882 SI: 0000 DI: 0200
CS: F000 DS: 1000 ES: 4000 SS: 1D81
IP: 0100
FLAGS: F087 1111oditSz0a0P1C


00000000 A:[F0100:00000]    M:... I:... P:.. CODE T1         [        ]        |
00000001   [F0100:20100] CS M:R.. I:... P:.. CODE T2         [        ]        |
00000002   [F0100:201FF] CS M:R.. I:... P:.. PASV T3 r-> FF  [        ]        |
00000003   [F0100:201FF] CS M:... I:... P:.. PASV T4         [        ]        |
00000004 A:[F0101:F0101]    M:... I:... P:.. CODE T1         [FF      ]        |
00000005   [F0101:20101] CS M:R.. I:... P:.. CODE T2        F[        ] q-> FF | GRP5 @ [F0100]
00000006   [F0101:201F0] CS M:R.. I:... P:.. PASV T3 r-> F0  [        ]        |
00000007   [F0101:201F0] CS M:... I:... P:.. PASV T4         [        ]        |
00000008 A:[F0102:F0102]    M:... I:... P:.. CODE T1         [F0      ]        |
00000009   [F0102:20102] CS M:R.. I:... P:.. CODE T2        S[        ] q-> F0 | PUSH
00000010   [F0102:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [        ]        |
00000011   [F0102:20190] CS M:... I:... P:.. PASV T4         [        ]        |
00000012 A:[F0103:F0103]    M:... I:... P:.. CODE T1         [90      ]        |
00000013   [F0103:20103] CS M:R.. I:... P:.. CODE T2         [90      ]        |
00000014   [F0103:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [90      ]        |
00000015   [F0103:20190] CS M:... I:... P:.. PASV T4         [90      ]        |
00000016 A:[2D80C:2D80C]    M:... I:... P:.. MEMW T1         [9090    ]        |
00000017   [2D80C:1D834] SS M:.A. I:... P:.. MEMW T2         [9090    ]        |
00000018   [2D80C:1D834] SS M:.AW I:... P:.. PASV T3 <-w 34  [9090    ]        |
00000019   [2D80C:1D834] SS M:... I:... P:.. PASV T4         [9090    ]        |
00000020 A:[2D80D:2D80D]    M:... I:... P:.. MEMW T1         [9090    ]        |
00000021   [2D80D:1D812] SS M:.A. I:... P:.. MEMW T2         [9090    ]        |
00000022   [2D80D:1D812] SS M:.AW I:... P:.. PASV T3 <-w 12  [9090    ]        |
00000023   [2D80D:1D812] SS M:... I:... P:.. PASV T4         [9090    ]        |


AX: 1234 BX: 1D00 CX: 0050 DX: 0200
SP:*FFFC BP: 0882 SI: 0000 DI: 0200
CS: F000 DS: 1000 ES: 4000 SS: 1D81
IP: 0102
FLAGS: F087 1111oditSz0a0P1C

It appears to operate normally. You can see AX (1234h) indeed being pushed to the stack in the last two MEMW bus cycles (34, then 12).
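To make the trace easier to check by hand, here is a minimal sketch of what a normal 16-bit PUSH should put on the bus, assuming the usual real-mode address formula (segment * 16 + offset, wrapped to 20 bits). The helper name `push16` is made up for illustration and is not part of my test tooling.

```python
def push16(ss, sp, value):
    """Model a normal 8086-style 16-bit PUSH.

    Returns the new SP and the two byte writes as (physical_address, byte).
    """
    sp = (sp - 2) & 0xFFFF                        # SP is decremented by 2 first
    base = ss << 4                                # real-mode segment base
    lo_addr = (base + sp) & 0xFFFFF               # low byte goes to the lower address
    hi_addr = (base + ((sp + 1) & 0xFFFF)) & 0xFFFFF
    return sp, [(lo_addr, value & 0xFF), (hi_addr, value >> 8)]


# Register state taken from the trace above: SS=1D81, SP=FFFE, AX=1234
new_sp, writes = push16(ss=0x1D81, sp=0xFFFE, value=0x1234)
print(f"SP={new_sp:04X}", [(f"{a:05X}", f"{b:02X}") for a, b in writes])
```

With the register state above this gives SP=FFFC and writes of 34 at 2D80C and 12 at 2D80D, which matches the two MEMW cycles in the trace.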
 
8F with reg != 0 is undefined and extremely weird on the 8088 too, and I never quite figured out the rules for its behavior.

Let's assemble the following:
Code:
    push ax
    db 0x8f
    db 0xc1
    nop
    nop

This completely breaks the V20.
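For anyone following along, here is a small sketch that splits the two ModRM bytes used in these tests (F0 after FF, C1 after 8F) into their mod/reg/rm fields. The field layout is the standard x86 one; the helper name `decode_modrm` is just for illustration.

```python
def decode_modrm(byte):
    """Split a ModRM byte into (mod, reg, rm)."""
    return (byte >> 6) & 0x3, (byte >> 3) & 0x7, byte & 0x7


# 16-bit register names indexed by the rm field when mod == 3
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

for opcode, modrm in [(0xFF, 0xF0), (0x8F, 0xC1)]:
    mod, reg, rm = decode_modrm(modrm)
    operand = REG16[rm] if mod == 3 else "memory"
    print(f"{opcode:02X} {modrm:02X}: mod={mod} reg={reg} rm={rm} operand={operand}")
```

FF F0 comes out as mod=3, reg=6 (the PUSH slot of that group), rm=0 (AX). 8F C1 comes out as mod=3, reg=0, rm=1 (CX), so this particular test exercises the register-operand form with reg equal to 0.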

Code:
Initial register state:
AX: 1234 BX: 1D00 CX: 0050 DX: 0200
SP: FFFE BP: 0882 SI: 0000 DI: 0200
CS: F000 DS: 1000 ES: 4000 SS: 1D81
IP: 0100
FLAGS: F087 1111oditSz0a0P1C


00000005   [F0101:20101] CS M:R.. I:... P:.. CODE T2        F[        ] q-> 50 | PUSH @ [F0100]
00000006   [F0101:2018F] CS M:R.. I:... P:.. PASV T3 r-> 8F  [        ]        |
00000007   [F0101:2018F] CS M:... I:... P:.. PASV T4         [        ]        |
00000008 A:[F0102:F0102]    M:... I:... P:.. CODE T1         [8F      ]        |
00000009   [F0102:20102] CS M:R.. I:... P:.. CODE T2         [8F      ]        |
00000010   [F0102:201C1] CS M:R.. I:... P:.. PASV T3 r-> C1  [8F      ]        |
00000011   [F0102:201C1] CS M:... I:... P:.. PASV T4         [8F      ]        |
00000012 A:[2D80C:2D80C]    M:... I:... P:.. MEMW T1         [8FC1    ]        |
00000013   [2D80C:1D834] SS M:.A. I:... P:.. MEMW T2         [8FC1    ]        |
00000014   [2D80C:1D834] SS M:.AW I:... P:.. PASV T3 <-w 34  [8FC1    ]        |
00000015   [2D80C:1D834] SS M:... I:... P:.. PASV T4         [8FC1    ]        |
00000016 A:[2D80D:2D80D]    M:... I:... P:.. MEMW T1         [8FC1    ]        |
00000017   [2D80D:1D812] SS M:.A. I:... P:.. MEMW T2         [8FC1    ]        |
00000018   [2D80D:1D812] SS M:.AW I:... P:.. PASV T3 <-w 12  [8FC1    ]        |
00000019   [2D80D:1D812] SS M:... I:... P:.. PASV T4         [8FC1    ]        |
00000020 A:[F0103:F0103]    M:... I:... P:.. CODE T1        F[C1      ] q-> 8F | POP @ [F0101]
00000021   [F0103:20103] CS M:R.. I:... P:.. CODE T2        S[        ] q-> C1 |
00000022   [F0103:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [        ]        |
00000023   [F0103:20190] CS M:... I:... P:.. PASV T4         [        ]        |
00000024   [F0103:20190]    M:... I:... P:.. PASV T1         [90      ]        |
00000025   [F0103:20190]    M:... I:... P:.. PASV T1        F[        ] q-> 90 | NOP @ [F0103]
00000026 A:[2C80C:2C80C]    M:... I:... P:.. MEMR T1         [        ]        |
00000027   [2C80C:1C80C] SS M:R.. I:... P:.. MEMR T2         [        ]        |
00000028   [2C80C:1C800] SS M:R.. I:... P:.. PASV T3 r-> 00  [        ]        |
00000029   [2C80C:1C800] SS M:... I:... P:.. PASV T4         [        ]        |
00000030 A:[2C80F:2C80F]    M:... I:... P:.. MEMR T1         [        ]        |
00000031   [2C80F:1C80F] SS M:R.. I:... P:.. MEMR T2         [        ]        |
00000032   [2C80F:1C800] SS M:R.. I:... P:.. PASV T3 r-> 00  [        ]        |
00000033   [2C80F:1C800] SS M:... I:... P:.. PASV T4         [        ]        |
00000034 A:[F0104:F0104]    M:... I:... P:.. CODE T1         [        ]        |
00000035   [F0104:20104] CS M:R.. I:... P:.. CODE T2         [        ]        |
00000036   [F0104:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [        ]        |
00000037   [F0104:20190] CS M:... I:... P:.. PASV T4         [        ]        |
00000038 A:[F0105:F0105]    M:... I:... P:.. CODE T1         [90      ]        |
00000039   [F0105:20105] CS M:R.. I:... P:.. CODE T2        F[        ] q-> 90 | NOP @ [F0104]
00000040   [F0105:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [        ]        |
00000041   [F0105:20190] CS M:... I:... P:.. PASV T4         [        ]        |
00000042 A:[F0106:F0106]    M:... I:... P:.. CODE T1         [90      ]        |


AX: 1234 BX: 1D00 CX: 0050 DX: 0200
SP: FFFE BP: 0882 SI: 0000 DI: 0200
CS: F000 DS: 1000 ES: 4000 SS: 1D81
IP: 0105
FLAGS: F087 1111oditSz0a0P1C

Note that the stack read happens during the following NOP instruction! If I do not pad NOPs after the POP, the register store program fails. Depending on the following instruction, I can see how it could fail in spectacular ways.
 
Does the V20 have different microcode for prefixed and non-prefixed CMPS instructions?

That's how Intel did it in newer processors, starting with the 286. The "F1" flag and associated weirdness with MUL/DIV (and BOUND: https://news.ycombinator.com/item?id=34334799) only exists on the 80(1)86 / 88.

Intel documentation states that the nesting level is determined by the second immediate operand modulo 32. The V20 apparently ignores this detail and uses the unmasked value of the immediate.

The 186 also doesn't mask it, and I don't think it's documented to do it either. That was added in the 286.
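A tiny sketch of the difference being discussed, assuming the instruction in question is ENTER (the one with two immediates) and that "masking" means keeping only the low 5 bits of the level byte. The function name is hypothetical.

```python
def nesting_level(imm8, masks_level):
    """Effective nesting level for a given second immediate.

    masks_level=True models the documented 286-style behaviour (level mod 32);
    masks_level=False models a CPU that uses the raw byte.
    """
    return imm8 & 0x1F if masks_level else imm8


for imm8 in (31, 32, 33, 255):
    print(imm8, nesting_level(imm8, True), nesting_level(imm8, False))
```

The two models only diverge for level values of 32 and above, so a test has to use a large second immediate to tell them apart.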

edit:

Wow, that's interesting. Am I reading this correctly that it doesn't even read from the correct address (4K too low, and the second byte is read 3 bytes after the first instead of 1)?
 
Good to know, thanks. NEC didn't really document whether they were "borrowing" instruction behavior from the 186 or the 286.
 
Let me set a less random value for SS, 3000h.

Code:
00000005   [F0101:20101] CS M:R.. I:... P:.. CODE T2        F[        ] q-> 50 | PUSH @ [F0100]
00000006   [F0101:2018F] CS M:R.. I:... P:.. PASV T3 r-> 8F  [        ]        |
00000007   [F0101:2018F] CS M:... I:... P:.. PASV T4         [        ]        |
00000008 A:[F0102:F0102]    M:... I:... P:.. CODE T1         [8F      ]        |
00000009   [F0102:20102] CS M:R.. I:... P:.. CODE T2         [8F      ]        |
00000010   [F0102:201C1] CS M:R.. I:... P:.. PASV T3 r-> C1  [8F      ]        |
00000011   [F0102:201C1] CS M:... I:... P:.. PASV T4         [8F      ]        |
00000012 A:[3FFFC:3FFFC]    M:... I:... P:.. MEMW T1         [8FC1    ]        |
00000013   [3FFFC:1FF34] SS M:.A. I:... P:.. MEMW T2         [8FC1    ]        |
00000014   [3FFFC:1FF34] SS M:.AW I:... P:.. PASV T3 <-w 34  [8FC1    ]        |
00000015   [3FFFC:1FF34] SS M:... I:... P:.. PASV T4         [8FC1    ]        |
00000016 A:[3FFFD:3FFFD]    M:... I:... P:.. MEMW T1         [8FC1    ]        |
00000017   [3FFFD:1FF12] SS M:.A. I:... P:.. MEMW T2         [8FC1    ]        |
00000018   [3FFFD:1FF12] SS M:.AW I:... P:.. PASV T3 <-w 12  [8FC1    ]        |
00000019   [3FFFD:1FF12] SS M:... I:... P:.. PASV T4         [8FC1    ]        |
00000020 A:[F0103:F0103]    M:... I:... P:.. CODE T1        F[C1      ] q-> 8F | POP @ [F0101]
00000021   [F0103:20103] CS M:R.. I:... P:.. CODE T2        S[        ] q-> C1 |
00000022   [F0103:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [        ]        |
00000023   [F0103:20190] CS M:... I:... P:.. PASV T4         [        ]        |
00000024   [F0103:20190]    M:... I:... P:.. PASV T1         [90      ]        |
00000025   [F0103:20190]    M:... I:... P:.. PASV T1        F[        ] q-> 90 | NOP @ [F0103]
00000026 A:[3EFFC:3EFFC]    M:... I:... P:.. MEMR T1         [        ]        |
00000027   [3EFFC:1EFFC] SS M:R.. I:... P:.. MEMR T2         [        ]        |
00000028   [3EFFC:1EF00] SS M:R.. I:... P:.. PASV T3 r-> 00  [        ]        |
00000029   [3EFFC:1EF00] SS M:... I:... P:.. PASV T4         [        ]        |
00000030 A:[3EFFF:3EFFF]    M:... I:... P:.. MEMR T1         [        ]        |
00000031   [3EFFF:1EFFF] SS M:R.. I:... P:.. MEMR T2         [        ]        |
00000032   [3EFFF:1EF00] SS M:R.. I:... P:.. PASV T3 r-> 00  [        ]        |
00000033   [3EFFF:1EF00] SS M:... I:... P:.. PASV T4         [        ]        |

You are correct: we push to FFFC and FFFD, and read from EFFC and EFFF. Crazy.
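Here is a quick sketch that turns the physical addresses from this trace back into offsets within SS=3000h, assuming the standard real-mode mapping (physical = segment * 16 + offset). The helper name is made up.

```python
def offset_in_segment(phys, seg):
    """Recover the 16-bit offset that produced a physical address in a segment."""
    return (phys - (seg << 4)) & 0xFFFF


SS = 0x3000
write_addrs = [0x3FFFC, 0x3FFFD]   # MEMW cycles of the PUSH
read_addrs = [0x3EFFC, 0x3EFFF]    # MEMR cycles seen during the following NOP

write_offsets = [offset_in_segment(a, SS) for a in write_addrs]
read_offsets = [offset_in_segment(a, SS) for a in read_addrs]
displacement = write_addrs[0] - read_addrs[0]   # how far below the pushed word the read lands
read_stride = read_addrs[1] - read_addrs[0]     # gap between the two byte reads

print([f"{o:04X}" for o in write_offsets], [f"{o:04X}" for o in read_offsets])
print(f"displacement={displacement:X}h read_stride={read_stride}")
```

The first read lands 1000h bytes below the pushed word, and the two byte reads are 3 bytes apart instead of 1. The earlier trace with SS=1D81 shows the same pattern (2D80C vs. 2C80C).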
 
For some reason, D6 takes 14-18 more cycles than D7. If anyone has a clue why this might be, please let me know.

Somewhat unlikely idea: Could it be falling through from another part of the microcode? Would depend on how the ROM is addressed: on the 286, there is a separate PLA that selects an entry point based on the opcode and some state bits (0F prefix, REP prefix for string instructions, real/protected mode for segment register loads). Maybe on the V20 the address is instead something like opcode*constant?
 
Just curious, how do you know so much about the 286 internals? :)
 
I suspect 8E (mov cs, x) is slightly broken as well. The register store program fails to run after it, even though it should not care that CS has changed (I keep virtual program counters, so it will feed the store program on successive code fetches regardless of address).
 
From reading the patent (https://www.freepatentsonline.com/4442484.pdf), as well as long hours of staring at a slightly-too-low-resolution die shot (http://visual6502.org/images/pages/Intel_80286_die_shots.html). Seems impossible to get any bits out of the microcode ROM itself, but the decode PLA below it is slightly less impossible - enough to get an idea of how it's organized, even if a complete dump would take a lot more guesswork.

I hope some day Ken Shirriff will get to reverse engineering this chip :)
 
I personally 'begged' Ken to look at the 286 when I met him at VCF East. We'll see, I suppose :)
 
Additional discovery: 0F 20, ADD4S

The NEC manual makes this note:
bcd_note.png

Notice, *254* digits. If you give the BCD string instructions a count of 255, they never terminate.

This makes some sense if you imagine that an internal loop counter (note that CL itself is not modified) only changes in steps of 2: counting up, it would run from 0 to 254 and then overflow.

The instruction will also never terminate if given CL == 0, which is a little harder to explain. One might assume that they would have checked for 0.
 
That's really strange. I assume it does terminate with odd count values, right? Then it would normally have to stop when the counter wraps, since it would never reach zero either.

Thinking more about the gaps in the opcode map:

62h BOUND (modrm, reads 2 words from memory)
63h modrm, reads 1 word, 60 clocks ???

D6h 14-18 clocks ??? + XLAT
D7h XLAT

F0h LOCK prefix
F1h does it do something?

It could be (again, making somewhat naive assumptions here) that 63h is running the second half of the BOUND microcode, and D6h something before XLAT? Possibly some fuzzing might reveal that the word read from memory isn't ignored, but compared with some - possibly internal - register and may invoke INT 05h under the right conditions.

You didn't mention F1h in the blog post; this is an alias for LOCK on the 8086, causes INT 06h on the 186, and is an undocumented prefix for ICE on the 286. Does your bus monitor show anything happening for it?
 
Odd values are valid. I am suspecting that the logic is something like:
Code:
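terminating_count = CL;          // CL itself is never modified
counter = 0;                     // assumed 8-bit internal counter
while (counter < terminating_count) {
  // do operation
  counter += 2;                  // 254 + 2 wraps back to 0
}
if (counter != terminating_count) {
  // handle odd nibble
}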

This logic explains why CL==255 loops forever, but not why CL==0 does...
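To check that claim, here is a rough simulation of the count-up guess, assuming an 8-bit internal counter that starts at 0 and steps by 2. The iteration limit is only there so the sketch can report a loop that never exits; it is not something the CPU does.

```python
def count_up_iterations(cl, limit=100_000):
    """Iterations of the guessed count-up loop, or None if it never exits."""
    counter = 0
    iterations = 0
    while counter < cl:
        counter = (counter + 2) & 0xFF   # 8-bit counter: 254 + 2 wraps to 0
        iterations += 1
        if iterations >= limit:
            return None                  # treated as "runs forever"
    return iterations


for cl in (0, 1, 254, 255):
    print(cl, count_up_iterations(cl))
```

Under these assumptions CL=255 never exits (the counter wraps from 254 to 0 without ever reaching 255), CL=254 takes 127 iterations, and CL=0 exits immediately. So this model really does not account for the CL==0 hang.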

Also, CL is never updated. So I don't imagine that these string BCD ops are interruptible. However, with a maximum count of 254 that's probably not an issue.

I fuzzed 0x63 for a while. I don't have an exact count but it ran for a few hours. It never did anything besides read its EA operand.

I admit I skipped testing F1. I'm generating 0F tests presently but I'll test it in a bit.
 
But I'm almost certain nobody with any experience would do it this way in assembler / microcode. It's more efficient to count down and test for zero/carry, instead of comparing with the maximum on each iteration.
 
Agreed, but not much else makes sense. If we were counting down by 2, a count of 1 would underflow. Times like this you really wish you had the microcode to read :D

F1 appears to be treated as a prefix, at least. F1 90 produces the following:

Code:
00000058   [F0103:20190]    M:... I:... P:.. PASV T1        F[909090  ] q-> F1 | INVAL @ [F0100]
00000059   [F0103:20190]    M:... I:... P:.. PASV T1         [909090  ]        |
00000060 A:[F0104:F0104]    M:... I:... P:.. CODE T1        F[9090    ] q-> 90 | NOP @ [F0101]  
00000061   [F0104:20104] CS M:R.. I:... P:.. CODE T2         [9090    ]        |
00000062   [F0104:20190] CS M:R.. I:... P:.. PASV T3 r-> 90  [9090    ]        |

I don't currently have a command that returns the LOCK pin status. Give me a bit.
 
I've written an exhaustive test program for opcode 63h. The code does this:

Code:
        mov     bx,offset Bounds
        mov     si,offset TestVar
outer:  call    putvar          ;print value of TestVar
        xor     ax,ax
inner:  bound   ax,ds:[bx]      ;should never cause exception
        db      63h,04h         ;??? AX,[SI]
next:   inc     ax
        jnz     inner
        inc     word ptr ds:[si]
        jnz     outer

[...]

Bounds          dw -32768,32767 ;accept any value
TestVar         dw 0            ;will be read by opcode 63h

with an INT 05 handler that prints the value of AX if it should ever get called. The normal BOUND instruction is there to preload any internal registers in case that would have an effect.

The inner loop takes about a second on my palmtop, so running through every possible value for both AX and TestVar would be about 18 hours. Was hoping to get an early hit, but so far it looks like it really ignores the value read from memory...
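The 18 hour figure is easy to reproduce. This is just the arithmetic, assuming exactly one second per inner loop (the post only says "about a second"):

```python
inner_iterations = 1 << 16          # every 16-bit value of AX
outer_iterations = 1 << 16          # every 16-bit value of TestVar
seconds_per_inner_loop = 1.0        # assumption: "about a second" taken as exactly 1 s

total_cases = inner_iterations * outer_iterations
total_hours = outer_iterations * seconds_per_inner_loop / 3600

print(total_cases, round(total_hours, 1))
```

That is 2^32 combinations in total, and 65,536 seconds works out to roughly 18.2 hours.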
 
Okay, I have a new theory about the BCD string operations.

Let's assume that it uses the internal loop counter, and the loop counter always counts down, but it always counts down by one.

So to initialize the loop counter, we'd use (CL+1)>>1, because we can only read a whole byte. So if CL==1, 2>>1 = 1 byte. If CL==2, 3>>1 = also 1 byte. And so on.

(255+1)>>1 overflows to 0, and (0+1)>>1 is also 0. My guess is that the loop counter is decremented before it is checked against 0, so when initialized to 0 it underflows. But the loop counter is internally 16 bits. So maybe the instructions don't run forever: the validator has an instruction timeout of 100,000 cycles, but if we're running through all 65,536 values of a loop counter we could be going for a ride of over a million cycles.
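Here is a sketch of that theory as a simulation. Everything in it is a guess: the 8-bit add before the shift, the 16-bit counter, and the order "operate, decrement, then test for zero" are all assumptions, and the function names are made up.

```python
def initial_count(cl):
    """Guessed byte count: (CL + 1) >> 1, with the add done in 8 bits."""
    return ((cl + 1) & 0xFF) >> 1


def iterations(cl):
    """Loop iterations if the 16-bit counter is decremented before the zero test."""
    counter = initial_count(cl)
    n = 0
    while True:
        n += 1                             # process one byte (two BCD digits)
        counter = (counter - 1) & 0xFFFF   # 0 underflows to FFFF
        if counter == 0:
            return n


for cl in (1, 2, 254, 255, 0):
    print(cl, initial_count(cl), iterations(cl))
```

In this model CL=1 and CL=2 both take one iteration, CL=254 takes 127, and both CL=255 and CL=0 start with a counter of 0 and run for 65,536 iterations before stopping.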

EDIT:
Aaand there we go. CL=255 terminated after 1,441,867 cycles.

Code:
01441867   [401FF:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441868   [401FF:00100]    M:... I:... P:.. PASV T1        F[909090  ] q-> 90 | NOP @ [F0104]
01441869   [401FF:00100]    M:... I:... P:.. PASV T1         [909090  ]        |
01441870 A:[F0108:F0108]    M:... I:... P:.. CODE T1         [909090  ]        |

I'm not quite sure how we arrive at that exact figure. V20 documentation states that execution time is 19n, which would put us at 1,245,165. 1,441,867 is extremely close to 22n. Could the documentation have the wrong cycle counts?

Code:
01441821 A:[1FFFE:1FFFE]    M:... I:... P:.. MEMR T1         [90909090]        |
01441822   [1FFFE:3FFFE] DS M:R.. I:... P:.. MEMR T2         [90909090]        |
01441823   [1FFFE:3FF00] DS M:R.. I:... P:.. PASV T3 r-> 00  [90909090]        |
01441824   [1FFFE:3FF00] DS M:... I:... P:.. PASV T4         [90909090]        |
01441825   [1FFFE:3FF00]    M:... I:... P:.. PASV T1         [90909090]        |
01441826   [1FFFE:3FF00]    M:... I:... P:.. PASV T1         [90909090]        |
01441827   [1FFFE:3FF00]    M:... I:... P:.. PASV T1         [90909090]        |
01441828 A:[401FE:401FE]    M:... I:... P:.. MEMR T1         [90909090]        |
01441829   [401FE:001FE] ES M:R.. I:... P:.. MEMR T2         [90909090]        |
01441830   [401FE:00100] ES M:R.. I:... P:.. PASV T3 r-> 00  [90909090]        |
01441831   [401FE:00100] ES M:... I:... P:.. PASV T4         [90909090]        |
01441832   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441833   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441834   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441835   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441836   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441837 A:[401FE:401FE]    M:... I:... P:.. MEMW T1         [90909090]        |
01441838   [401FE:00100] ES M:.A. I:... P:.. MEMW T2         [90909090]        |
01441839   [401FE:00100] ES M:.AW I:... P:.. PASV T3 <-w 00  [90909090]        |
01441840   [401FE:00100] ES M:... I:... P:.. PASV T4         [90909090]        |
01441841   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |
01441842   [401FE:00100]    M:... I:... P:.. PASV T1         [90909090]        |


01441843 A:[1FFFF:1FFFF]    M:... I:... P:.. MEMR T1         [90909090]        |

It certainly looks like one iteration is 22 cycles.
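For reference, the arithmetic behind those figures, as a rough check. The 65,536 iteration count is an assumption based on a wrapped 16-bit counter; the 1,245,165 figure quoted earlier corresponds to n = 65,535.

```python
observed_cycles = 1_441_867                 # cycle number where the next NOP is fetched
documented_estimate = 19 * 65_535           # the 19n figure quoted above
cycles_per_iteration = 1_441_843 - 1_441_821  # gap between consecutive DS reads in the trace
assumed_iterations = 65_536                 # assumption: full wrap of a 16-bit counter

trace_estimate = cycles_per_iteration * assumed_iterations
leftover = observed_cycles - trace_estimate

print(documented_estimate, cycles_per_iteration, trace_estimate, leftover)
```

At 22 cycles per iteration, 65,536 iterations account for 1,441,792 cycles, leaving 75. That remainder would have to cover instruction setup and whatever ran in the trace before the instruction started, which I have not broken down here.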
 