
Intel 4040 little or big endian?

alank2

I've done some searching on the web:

The JUN/JMS instructions store addresses high-middle-low.

Also, the FIM instruction picks up a byte from program memory (which is addressed as bytes) and puts the high nibble in the lower register of the pair (R0, for example) and the low nibble in the higher register (R1, for example).
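
For illustration, the FIM nibble split looks roughly like this in C (the fim() name and reg[] array are just made up for the example, not taken from any actual emulator):

Code:
#include <stdint.h>

/* Sketch: FIM loads a program-memory byte into a register pair,
   high nibble into the even (lower-numbered) register, low nibble
   into the odd one.  reg[] is an illustrative 16-entry register file. */
void fim(uint8_t reg[16], int pair, uint8_t data)
{
    reg[pair * 2]     = data >> 4;    /* high nibble -> R0, R2, ... */
    reg[pair * 2 + 1] = data & 0x0F;  /* low nibble  -> R1, R3, ... */
}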

Is there agreement that the 4004 is big-endian?
 
I'd say neither, as endianness traditionally applies to the order of bytes in a word.

I have, however, had the experience of working on a bit-addressable supercomputer, so there's that.
 
The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993....
I've put plenty of time in on 6-bit and even BCD hardware, and we didn't use the term "bytes", either*. To the best of my memory, it was IBM who popularized the term "byte" with the S/360. While other bit groupings have used the term, post-1965 "byte" was pretty much synonymous with 8 bits. And the 4004 is much later than 1965.

As far as I know, aside from my bit-addressable vector super, no machine architecture has ever been completely little-endian, that is, with the lowest bit position in a word occupying the lowest bit address--and even then, the bytes, halfwords, fullwords, etc. were big-endian with respect to bit order.
--------------------
*Well, sort of. Back around 1975, there was a big discussion of how to represent an 8-bit character set on the CDC Cyber 70 series, which employed 60-bit words. One approach was to pack 7.5 "bytes" into a word, which was whimsically referred to as a "snaque" (get it? Bigger than a byte). Eventually, a rather complex system of 6- and 12-bit character codes was adopted, so the "snaque" died a-borning.
 
The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993....
You mean the document that defines a "byte" as, "A string that consists of a number of bits, treated as a unit, and usually representing a character or a part of a character.... The number of bits in a byte is usually 8"? (Emphasis mine.)
 
You mean the document that defines a "byte" as, "A string that consists of a number of bits, treated as a unit, and usually representing a character or a part of a character.... The number of bits in a byte is usually 8"? (Emphasis mine.)
"Standards" that avoid the word shall are a real pain. (Should, may, might, could, simply give folks ideas about how to lock in their own customer base.)

However, ISO/IEC 2382-1:1993 is just about vocabulary and thus is pretty much non-binding wherever you are. Disque souple ("floppy disk")?
 
You mean the document that defines a "byte" as, "A string that consists of a number of bits, treated as a unit, and usually representing a character or a part of a character.... The number of bits in a byte is usually 8"? (Emphasis mine.)
I always thought "byte" came from binary-eight: bi-eight, hence byte.
 
A byte has many definitions. The one that says "8 bits" normally comes from the definition that a byte is the smallest unit of data used to encode a character. With 4 bits you cannot encode alphanumeric characters at all, so that would not make it a byte.

However, a byte can also be defined as the smallest piece of data a given system can address with its address bus. In that case, for the 4004, 4 bits would be a byte and 8 bits would be a word.

No idea what definitions were valid when the 4004 came to market...

ps: someone should correct the typo in the title.
 
ps: someone should correct the typo in the title.
Too late for it to let me edit it, I guess. Luckily, the same question applies to the 4040 too.

As far as I know, aside from my bit-addressable vector super, no machine architecture has ever been completely little-endian, that is, with the lowest bit position in a word occupying the lowest bit address--and even then, the bytes, halfwords, fullwords, etc. were big-endian with respect to bit order.
I've always wondered about this; it does seem that they are little-endian as far as byte order goes, but not bit order.

So, the reason I am asking is that some of the features of the 4004 emulator I am writing involve conversions between byte memory (program memory) and nibble memory (data memory).

Code:
-D PGM:0-F
PGM:000    D1 FE F6 1A  00 D2 FE B0  FE B1 D3 FE  A0 FE A1 FE  ╤■÷..╥■░■▒╙■á■í■
-D DATA:0-F
DATA:000   0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  ........

Given the way the FIM instruction works, if I copy from program memory to data memory I'm planning on doing it big-endian style with the nibbles, like this, to match:

Code:
DATA:000   D 1 F E  F 6 1 A  0 0 D 2  F E B 0  ╤■÷..╥■░
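
In other words, the copy would be something like this in C (the array names and sizes are just placeholders, not my emulator's actual structures):

Code:
#include <stddef.h>
#include <stdint.h>

/* Sketch of the byte-memory -> nibble-memory copy, matching FIM's
   ordering: high nibble at the lower nibble address, low nibble at
   the higher one.  pgm[] and data[] are illustrative names. */
void copy_pgm_to_data(const uint8_t *pgm, uint8_t *data, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        data[i * 2]     = pgm[i] >> 4;    /* high nibble first */
        data[i * 2 + 1] = pgm[i] & 0x0F;  /* then low nibble */
    }
}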

Then there is the question of HOW to save nibble memory to Intel hex, which is a byte-based format. I can convert from nibbles back to the format in PGM:000 above, but what to do with the address field in the Intel hex? Convert it too? I suppose I could leave the addressing in nibbles, but then it would be a pretty strange Intel hex file that apparently skips addresses between lines.
 
I think it could be compatible either way.

Way 1 would be to use byte addressing, since Intel hex is a byte format, and convert the data as it is being loaded/saved from nibble-addressed memory. The data would be represented as bytes, normal in every sense of the word in the Intel hex file, though the addresses would be half of their real nibble addresses.

Way 2 would be to use nibble addressing in the address field. So you would have a line starting at address 0 with, say, 16 bytes (32 nibbles), and then the next line would begin at address 32 instead of 16. My concern here is that Intel hex tools might not be so happy about that, but maybe that concern is unfounded.

I'm leaning towards way 1 because the data is in one blob and has to be converted to bytes anyway, so I might as well convert the addresses too, but I can be swayed if anyone has good arguments why way 2 is better.
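
For what it's worth, a rough sketch of way 1 in C (the function name, the fixed data record type, and treating one call as one record are just assumptions for the example):

Code:
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch of way 1: pack two nibbles per byte and emit one Intel hex
   data record (type 00) using the byte address (nibble address / 2).
   nib[] points at nibble memory starting at nib_addr. */
void write_hex_record(FILE *f, const uint8_t *nib, unsigned nib_addr, unsigned nbytes)
{
    unsigned byte_addr = nib_addr / 2;
    uint8_t sum = (uint8_t)(nbytes + (byte_addr >> 8) + (byte_addr & 0xFF));

    fprintf(f, ":%02X%04X00", nbytes, byte_addr);
    for (unsigned i = 0; i < nbytes; i++) {
        uint8_t b = (uint8_t)((nib[i * 2] << 4) | (nib[i * 2 + 1] & 0x0F));
        fprintf(f, "%02X", b);
        sum += b;
    }
    fprintf(f, "%02X\n", (uint8_t)(0x100 - sum));  /* two's-complement checksum */
}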
 
A byte has many definitions. The one that says "8 bits" normally refers to the definition that a byte is the smallest type of data used to encode a character. With 4 bits, you can not encode alpha-numeric characters at all, so that would not make it a byte.
Depends on how a character is encoded. Consider 5-bit encodings like 5-level TTY Baudot code, which uses "shift" characters. Reserve one of a nibble's 16 combos as a shift value and you can encode 30 values with two-nibble sequences, for example (rough sketch below). As weird as it seems, it wasn't uncommon. I cited CDC Cyber 6-bit coding using certain reserved character values as escape codes to expand the character set. DEC WPS-8 does a similar thing to expand its 6-bit character set, using "shift" codes. What's a byte there? I have no idea. Then there's Shift-JIS...
The PDP-10 used five 7-bit characters in a 36-bit word. If 7 bits was a "byte", what was the extra bit?
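
Just as an illustration (not any actual historical code), a 4-bit scheme with one nibble reserved as a shift prefix might be decoded something like this in C:

Code:
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: nibble 0xF acts as a shift prefix.  Unshifted
   nibbles 0x0-0xE give 15 codes, and 0xF followed by 0x0-0xE gives
   15 more, for 30 values total. */
size_t decode_shifted(const uint8_t *nib, size_t n, uint8_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        if (nib[i] == 0xF && i + 1 < n)
            out[o++] = (uint8_t)(15 + nib[++i]);  /* shifted set: codes 15..29 */
        else
            out[o++] = nib[i];                    /* unshifted set: codes 0..14 */
    }
    return o;  /* number of decoded symbols */
}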
 
Bytes, nibbles, and bits go back a very long time.

I highly doubt the definitions were different in 1970 than today.
 
And the 1620 needed two 6-bit (CF8421) positions to represent an alphanumeric character, although it did have a numeric-only I/O mode (cf WNTY and WATY). And then, there was the numeric blank...

Sigh, it's a shame that the younger set aren't exposed to the widely varied architectures of early systems. :)
 
The 1620 was a variable word-length, nominally decimal machine. C was the parity bit, for some unknown reason made visible on the console display. F was the word marker ("flag")/sign bit, depending on position. Numeric fields were addressed by their low-order digit (highest address); records were addressed by their lowest address. Records were terminated by an xx8x2x marker.

Being variable word length, one could have a number that was the size of a memory bank (20,000 digits).

After that, it gets complicated....
 