because a space was typed by mistake before the comma. Requiring a comment to start with a special character would have caught that error.
That's why I strip all whitespace that isn't in quotes at assembly time... My assembler has very simple rules.
* A colon means a label = PC.
* Semicolon = comment until EOL.
* Space is ignored unless in quotes.
* Only the same kind of quote terminates a quote, so 'x' and "x" are the same, and " ' " and ' " ' are both valid as a literal "quoted" character.
* $ = Hex.
* % = Binary.
* Anything else is decimal.
* 32 byte limit on variable names ( it ignores the rest, which possibly isn't ideal... but using the same variable twice will generally cause a value reassignment error. )
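To make those rules concrete, here is a rough sketch in Python (the real thing is in BASIC, so this is only an illustration). The names strip_line, parse_number and symbol_key are made up for the example, and treating the 32 byte limit as 32 characters assumes plain ASCII names.

```python
def strip_line(line):
    """Drop the comment and all whitespace that is not inside quotes."""
    out = []
    quote = None  # the quote character that opened the current literal, if any
    for ch in line:
        if quote:
            out.append(ch)
            if ch == quote:       # only the same kind of quote closes it
                quote = None
        elif ch in "'\"":
            quote = ch
            out.append(ch)
        elif ch == ";":           # comment runs until end of line
            break
        elif not ch.isspace():    # unquoted whitespace is ignored
            out.append(ch)
    return "".join(out)


def parse_number(text):
    """$ = hex, % = binary, anything else is decimal."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text, 10)


def symbol_key(name):
    """Only the first 32 characters of a name are significant (assumes ASCII)."""
    return name[:32]
```

Because the spaces are gone before the number is parsed, strip_line("%1101 0110 1010 0000") hands %1101011010100000 to parse_number without any extra work.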
Once that's done, there's an order to checking: if it's LD, it goes to the LD routine, which looks for the next component up to the comma. It then processes that and what's after it, getting both sides of the argument and terminating the argument at either EOL or a semicolon. It also works out for itself whether an operand is r or rr or something else by checking in a fixed order, eliminating the things it's not sure about early, such as 8 bit operations before 16 bit.
And it looks for things like ,( to indicate whether it's indirect or direct (since all whitespace is removed, it's always ,( and never , ( with a space in between).
It's a simple way to write an assembler, but it has its advantages too, such as not caring too much about formatting, case or even if a directive is preceded by a period.
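A minimal sketch of that LD handling in Python, assuming the line has already been stripped of whitespace and comments and uppercased. The register sets are cut down, the function names are invented for the example, and it makes no attempt to tell LD apart from LDIR or LDD, which a real routine would need to do first.

```python
REG8 = {"A", "B", "C", "D", "E", "H", "L"}
REG16 = {"BC", "DE", "HL", "SP", "IX", "IY"}


def classify(operand):
    """Rough operand kind; 8 bit registers are tried before 16 bit ones."""
    if operand.startswith("(") and operand.endswith(")"):
        return "indirect"
    if operand in REG8:
        return "r"
    if operand in REG16:
        return "rr"
    return "value"


def split_ld(line):
    """Split a stripped 'LDdst,src' line at the first comma."""
    if not line.startswith("LD"):
        raise ValueError("not an LD instruction")
    dst, sep, src = line[2:].partition(",")
    if not sep:
        raise ValueError("LD needs two operands separated by a comma")
    return (dst, classify(dst)), (src, classify(src))


def source_is_indirect(line):
    """With whitespace removed, an indirect source always shows up as ',('."""
    return ",(" in line
```

For example, split_ld("LDA,(HL)") gives (("A", "r"), ("(HL)", "indirect")).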
It started as a single pass assembler, but once the single pass elements were complete and stand-in values were assigned to unknown variables, it was easier to just perform a second pass, with the stand-ins stopping unknown values from creating errors in the first pass, rather than backfilling them and looking for ones still missing. Everything is stored as strings and arrays in BASIC, so it's not very efficient, but it doesn't matter much on a modern PC.
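The two pass idea can be sketched like this in Python. It is a toy that only knows labels, NOP and JP nn, working on already stripped lines, and the function name and the stand-in value of 0 are assumptions for the example rather than what my BASIC code does.

```python
def assemble(lines, origin=0):
    """Toy two pass assembler: pass 1 collects labels, pass 2 emits final code."""
    symbols = {}
    errors = []
    code = bytearray()
    for pass_no in (1, 2):
        pc = origin
        code = bytearray()            # output from pass 1 is thrown away
        for line in lines:
            if line.endswith(":"):    # colon means label = PC
                symbols[line[:-1]] = pc
            elif line == "NOP":
                code.append(0x00)
                pc += 1
            elif line.startswith("JP"):
                name = line[2:]
                if name in symbols:
                    target = symbols[name]
                else:
                    target = 0        # stand-in so pass 1 can keep going
                    if pass_no == 2:  # still unknown on pass 2 is a real error
                        errors.append("unknown label: " + name)
                # JP nn is opcode $C3 followed by the address, low byte first
                code += bytes([0xC3, target & 0xFF, (target >> 8) & 0xFF])
                pc += 3
            elif pass_no == 2:
                errors.append("unknown statement: " + line)
    return bytes(code), symbols, errors
```

The forward reference in ["JPend", "NOP", "end:", "NOP"] gets the stand-in on the first pass and the real address on the second, because the instruction sizes don't depend on the value.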
The maths evaluation routine is called regardless of the value, whether hex, variables, multiple variables or operators, and it processes in forward notation, i.e. strictly left to right with no operator precedence ( eg, 1 + 3 * 3 = 12 ), which again is lazy, but easy to think about. Also, it doesn't care if there's a number or a variable. If any variable value gets reassigned ( except for those that are unknown ) then it generates a warning. It just throws all the errors and warnings it detects into a big text string, spits them out at the end and refuses to save the binary in most cases, though some warnings will still produce code.
Also, not caring about whitespace means I can space values as I need for readability, eg, %1101 0110 1010 0000 is easier to read and check than %1101011010100000 - same with hex and digits.
Also it checks for 8 bit values in variables for 8 bit operations, and 16 bit values for 16 bit operations, and will error if a value doesn't fit, which I noted some assemblers intentionally do not do.
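Here is roughly what that left to right evaluation looks like as a Python sketch. The operator set ( + - * / ), integer division and the names TOKEN, value_of and evaluate are assumptions for the example, and the expression is expected to have had its whitespace stripped already.

```python
import re

# hex, binary, decimal, variable name, or a single operator
TOKEN = re.compile(r"\$[0-9A-Fa-f]+|%[01]+|[0-9]+|[A-Za-z_][A-Za-z_0-9]*|[-+*/]")


def value_of(token, symbols):
    """Numbers and variables go through the same routine."""
    if token in "+-*/":
        raise ValueError("expected a value, got operator " + token)
    if token.startswith("$"):
        return int(token[1:], 16)
    if token.startswith("%"):
        return int(token[1:], 2)
    if token.isdigit():
        return int(token)
    if token not in symbols:
        raise ValueError("unknown variable " + token)
    return symbols[token]


def evaluate(expr, symbols=None):
    """Evaluate strictly left to right, ignoring operator precedence."""
    symbols = symbols or {}
    tokens = TOKEN.findall(expr)
    if "".join(tokens) != expr or len(tokens) % 2 == 0:
        raise ValueError("malformed expression: " + expr)
    result = value_of(tokens[0], symbols)
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        rhs = value_of(operand, symbols)
        if op == "+":
            result += rhs
        elif op == "-":
            result -= rhs
        elif op == "*":
            result *= rhs
        elif op == "/":
            result //= rhs
        else:
            raise ValueError("expected an operator, got " + op)
    return result
```

So evaluate("1+3*3") comes out as 12, not the 10 you would get with normal precedence.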
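As a sketch of that width check in Python: the names fits and check_operand are invented here, and treating values as unsigned only ( 0 to 255, 0 to 65535 ) is an assumption, since negative or signed values might want different handling.

```python
def fits(value, bits):
    """True if value is an unsigned number that fits in the given width."""
    return 0 <= value < (1 << bits)


def check_operand(name, value, bits, errors):
    """Add a message to the shared error list if value is too wide."""
    if not fits(value, bits):
        errors.append(f"{name} = {value} does not fit in {bits} bits")
        return False
    return True
```

The errors list plays the part of the big text string: everything gets collected and reported at the end rather than stopping at the first problem.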
The only thing I intend to add at the moment is a way to assemble at a different offset but append linearly. Eg, if I do org $100 then I want to be able to do something like org-offset 8000, which will consider all absolute jumps and labels with colons after that point to be (label+8000), so I can create code that I intend to relocate later without having a huge file.
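Since this isn't written yet, the following Python is only a guess at how it might work: the output position keeps counting linearly, and the offset is added only when a label is given its value. The statement tuples, the function name and the $8000 in the usage note are all made up for the illustration.

```python
def place_labels(statements):
    """Walk (kind, arg) statements and return (symbols, final output position)."""
    pc = 0          # where the next byte is appended, always linear
    offset = 0      # added to label values only, never to the output position
    symbols = {}
    for kind, arg in statements:
        if kind == "org":
            pc = arg
        elif kind == "org-offset":
            offset = arg
        elif kind == "label":
            symbols[arg] = pc + offset
        elif kind == "bytes":       # stands in for any code or data of that size
            pc += arg
        else:
            raise ValueError("unknown statement kind: " + kind)
    return symbols, pc
```

With org $100, a label, 16 bytes, then org-offset $8000 and another label, the second label comes out as $8110 while the file itself only grows by the 16 bytes.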
Though it's my first time at writing an assembler, and I never intended it to be as complicated as it got... I'd definitely plan it out if I did it again, rather than just adding stuff as I thought of it. But the architecture did make it easy to "bolt on" new functions.
David.