For a non-isomorphism, consider 8086 assembly. What does "MOV" translate to? One thing that I found moderately amusing is that the same mnemonic and syntax translates to a different binary rendition when run through, say, MASM vs. DEBUG.
This is why I was careful to say that "Opcode mnemonics are part of the syntax" and "there is a strict one between an 8080 opcode and its operands." MOV itself could translate to multiple different opcodes; you need the operands in order to know the opcode.
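To make that concrete, here is a minimal Python sketch (mine, not taken from any assembler) of how 16-bit register MOVs encode on the 8086. The function names are made up for illustration. It shows that a register-to-register MOV has two equally valid encodings, and that an immediate MOV uses a different opcode byte altogether. It makes no claim about which form any particular assembler picks.

```python
# 16-bit register numbers as used in 8086 instruction encoding.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}

def mov_reg_reg(dst, src):
    """Return both valid encodings of MOV dst, src for 16-bit registers."""
    d, s = REG16[dst], REG16[src]
    # 89 /r : MOV r/m16, r16 (reg field = source, r/m field = destination)
    store_form = bytes([0x89, 0xC0 | (s << 3) | d])
    # 8B /r : MOV r16, r/m16 (reg field = destination, r/m field = source)
    load_form = bytes([0x8B, 0xC0 | (d << 3) | s])
    return store_form, load_form

def mov_reg_imm(dst, imm):
    """Encode MOV r16, imm16: opcode B8+reg, then a little-endian immediate."""
    return bytes([0xB8 | REG16[dst]]) + imm.to_bytes(2, "little")

if __name__ == "__main__":
    a, b = mov_reg_reg("AX", "BX")
    print(a.hex(), b.hex())                   # two encodings of MOV AX,BX
    print(mov_reg_imm("AX", 0x1234).hex())    # MOV AX,1234h
```

Both byte sequences returned by mov_reg_reg do the same thing when executed, so the mnemonic and operands alone do not pin down a single binary rendition.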
Hmm, but when you do a LIST, you are effectively "de-compiling" the set of BASIC tokens back into a set of symbolic keywords. So, from a parsing perspective it just seems a lot of similar concepts: if your opcode or p-code is one byte, then you have a set of 256 symbols that get interpreted into some kind of operation or action.
No, tokenised BASIC is very different from P-code. Tokenised BASIC is another form of source code; P-code is object code generated by a compiler, and is just one of many possible compilations of the source code.
That there's a direct and easy isomorphism (performed by the tokeniser and the LIST command) should make it clear that the tokenised form is still source code. And you can also observe that tokenisation does not check even syntax, much less semantics: you can type
10 =IF)7
into a C64, which will happily accept it and show it back to you with LIST, though this is nonsense as a BASIC program.
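A rough Python sketch of the idea, to show how little a tokeniser actually does. The keyword list and the token byte values here are hypothetical, not the ones in any real BASIC ROM, and a real tokeniser also has to leave string literals and REM text alone, which this one doesn't bother with.

```python
# Hypothetical keyword table; longest first so longer keywords win a match.
KEYWORDS = sorted(
    ["GOSUB", "RETURN", "THEN", "NEXT", "FOR", "END", "IF", "TO", "=", ">", "+"],
    key=len,
    reverse=True,
)
# Made-up token bytes from 0x80 upward (assumption, not real ROM values).
TOKEN = {kw: 0x80 + i for i, kw in enumerate(KEYWORDS)}
WORD = {byte: kw for kw, byte in TOKEN.items()}

def tokenise(text):
    """Replace each keyword with its one-byte token; copy everything else."""
    out = bytearray()
    i = 0
    while i < len(text):
        for kw in KEYWORDS:
            if text.startswith(kw, i):
                out.append(TOKEN[kw])
                i += len(kw)
                break
        else:
            # Not a keyword: store the character unchanged, no questions asked.
            out.append(ord(text[i]))
            i += 1
    return bytes(out)

def detokenise(data):
    """What LIST does: expand token bytes back into keywords."""
    return "".join(WORD.get(b, chr(b)) for b in data)

if __name__ == "__main__":
    nonsense = "=IF)7"
    packed = tokenise(nonsense)
    print(packed.hex(), detokenise(packed))
```

Note that nothing in tokenise ever looks at whether the tokens form a valid statement. It's a reversible substitution, which is why the nonsense line goes in and comes back out unchanged.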
I'm just saying there is a reason BASIC became popular on those late 70s micros rather than something like FORTRAN or C.
Yes. And that would be because Paul Allen, Monte Davidoff and Bill Gates wrote MS-BASIC right at the start of the personal computer revolution, marketed it well, and sold enough of it early on that it essentially became a standard that soon (nearly) every manufacturer decided to follow.
They chose BASIC because it was what they happened to know; there were other options that were both more powerful and easier to implement, such as Lisp.
Implementing BASIC is similar to an assembler, in just being a kind of token-translator....
No, that's very, very wrong. To see why that's so wrong, write a little "token-translator" that can deal with something even as simple as:
Code:
10 J=0
20 GOSUB 100
30 END
100 IF J > 100 THEN RETURN
110 FOR I = 1 TO 3: J = J + I : NEXT
120 GOSUB 100
What you will come up with will be very much different from an assembler.
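For anyone who doesn't want to do the exercise, here is a deliberately tiny Python sketch that runs just that program. It is not how any real BASIC is implemented: it handles only the statements used above, it splits statements naively on colons, and it cheats by using Python's eval for expressions. The point is only to show the run-time state that is needed even for this much: a line-number lookup, a variable store, a GOSUB return stack and a FOR stack.

```python
import re

PROGRAM = """\
10 J=0
20 GOSUB 100
30 END
100 IF J > 100 THEN RETURN
110 FOR I = 1 TO 3: J = J + I : NEXT
120 GOSUB 100
"""

def load(text):
    """Flatten the listing into (line_number, statement) pairs."""
    stmts = []
    for line in text.splitlines():
        if not line.strip():
            continue
        num, rest = line.split(None, 1)
        for stmt in rest.split(":"):   # naive: ignores colons inside strings
            stmts.append((int(num), stmt.strip()))
    return stmts

def run(text):
    stmts = load(text)
    first = {}                          # line number -> index of its first statement
    for i, (num, _) in enumerate(stmts):
        first.setdefault(num, i)
    variables, gosub_stack, for_stack = {}, [], []

    def value(expr):
        # Shortcut (assumption): lean on Python's eval for arithmetic/comparison.
        return eval(expr.strip(), {"__builtins__": {}}, variables)

    pc = 0
    while pc < len(stmts):
        line, s = stmts[pc]
        pc += 1
        m = re.fullmatch(r"IF\s+(.+?)\s+THEN\s+(.+)", s)
        if m:
            if not value(m.group(1)):
                # False condition: skip the rest of this line.
                while pc < len(stmts) and stmts[pc][0] == line:
                    pc += 1
                continue
            s = m.group(2)
        if s == "END":
            break
        elif s == "RETURN":
            pc = gosub_stack.pop()
        elif s.startswith("GOSUB"):
            gosub_stack.append(pc)      # remember where to resume
            pc = first[int(s.split()[1])]
        elif s.startswith("FOR"):
            var, start, limit = re.fullmatch(
                r"FOR\s+(\w+)\s*=\s*(.+?)\s+TO\s+(.+)", s).groups()
            variables[var] = value(start)
            for_stack.append((var, value(limit), pc))
        elif s == "NEXT":
            var, limit, body = for_stack[-1]
            variables[var] += 1
            if variables[var] <= limit:
                pc = body               # loop back to the statement after FOR
            else:
                for_stack.pop()
        else:                           # plain assignment, e.g. J = J + I
            name, expr = s.split("=", 1)
            variables[name.strip()] = value(expr)
    return variables, len(gosub_stack)

if __name__ == "__main__":
    print(run(PROGRAM))
```

None of the stacks or the jump-by-line-number logic has any counterpart in an assembler, which only ever walks forward through its source emitting bytes.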
Or just mentally think about how to implement a "real" high level language without a file system: you're parsing say 10KB of plain-text that's in RAM. That plain-text is parsed into multiple assembler files....
Well, yes, that would be very difficult, if you chose such a ridiculous way to implement it. Not even compilers with a file system available are quite so foolish.
If you're going to implement a compiled high-level language without a filesystem, you'd generally do it the same way assemblers without filesystems were done: you have an area of RAM for source code, another area of RAM for object code, and you directly assemble/compile the source to object code. There's no need for or point to a linker in such a situation; linkers are useful when you have library archives, which clearly you don't in that situation.
And, in fact, at least one (somewhat) high-level language was done in exactly this way. I can't recall the name of it now, but I believe it was something from Motorola designed for doing mathematical programs.