
8088 assembly: Funny string stuff

otacon14112

Now that I got my 20x4 LCD display working on my 8088 homebrew computer, I'm playing around with strings. I'm trying to learn proper string manipulation, and so I thought I'd start with trying to find the length of a string. It sounds simple, especially since there is scasb. But I'm doing something wrong. Here's my code so far:

Code:
;              ************************************************
;              *                                              *
;              *  Experimenting with strings and their length *
;              *                                              *
;              ************************************************
;
;
;

section		.data
_1KB		equ	1024
_2KB		equ	2048
_32KB		equ	32*_1KB
_128KB		equ	128*_1KB
_256KB		equ	256*_1KB
PORT1		equ	0x01
PORT2		equ	0x02
ROM_SIZE	equ	_128KB		; Set size of ROM here

org		0x100

section		.text
start:		
		; Set stuff up
		mov	sp,0xFFFF	; Initialize the stack pointer to 64KB.
					;   This is 384KB in memory, 64KB 
					;   above the SS 
		mov	bx, ss		; Initialize BX to 0, since SS hasn't
					;   changed yet
		mov	ax,0x5000 	; Initialize SS to be at 320KB,
		mov	ss,ax		;   64KB below the end of RAM
		call	InitPorts	; Set up the 8255
		call	InitDelay	; Give the LCD time to self-initialize
		call	InitLCD		; Run the initialization sequence

		mov	al,"H"
		call	PrintChar
		mov	al,"e"
		call	PrintChar
		mov	al,"l"
		call	PrintChar
		mov	al,"l"
		call	PrintChar
		mov	al,"o"
		call	PrintChar
		mov	al, ":"
		call	PrintChar

		; Now try to use the stack to store a string
		push	0x00		; Null char to terminate string
		push	"g"
		push	"n"
		push	"i"
		push	"r"
		push	"t"
		push	"s"

		mov	ax, sp 		; Put address of string into AX
		call	GetStrLen

		mov	al, cl		; Put str len into AL
		add	al, 48		; Convert the number to ASCII
		call	PrintChar

		jmp	$

GetStrLen:
		mov	di, ax		; Move string address into di
		mov	cx, 0xFFFF	; Initialize CX
		sub	al, al		; Initialize AL
		cld
	repne	scasb
		not	ecx
		sub	ecx, 1		; Get the length of the string
		ret

InitDelay:	mov	word [bx],0x01FF; Set the countdown timer.
StartInitDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'nextloop' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartInitDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

CharDelay:	mov	word [bx],0x001F; Set the countdown timer.
StartCharDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'nextloop' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartCharDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

LatchDelay:	mov	word [bx],0x002F; Set the countdown timer.
StartDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'nextloop' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

PrintChar:	
		push	ax		; Save the ASCII character
		mov	al,0x02		; Make RS high, E low
		out	PORT2,al
		pop	ax		; Get the ASCII character back
		out	PORT1,al	; Send the character to the display
		mov	al,0x06		; Make E and RS high
		out	PORT2,al
		call	CharDelay
		mov	al,0x02		; Make RS high, E low
		out	PORT2,al
		call	CharDelay
		ret 

LatchCMD:
		mov	al,0x04		; Make E high to latch the data
		out	PORT2,al 
		call	LatchDelay
		call	ClearPort2
		call	LatchDelay
		ret

ClearPort2:
		mov	al,0x00		; Clear the port
		out	PORT2,al
		ret

InitPorts:	
		mov	al,0x90		; This sets the 8255 to operate
                out	0x03,al		;   in Mode 0 (basic I/O)
					;   Input	Output  
					;   *****	******
					;   Port 0	Ports 1, 2 
		ret

InitLCD: 
		; Reset sequence 1
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD

		; Reset sequence 2
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD

		; Reset sequence 3
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD 

		; 8-bit, 2 lines, 5x8 characters
		mov	al,0x38
		out	PORT1,al
		call	LatchCMD
		; End of step 1

		; Increment, and no display shift
		mov	al,0x06
		out	PORT1,al
		call	LatchCMD

		; Turn display on, cursor on, and do not blink the 
		; character at cursor
		mov	al,0x0C
		out	PORT1,al
		call	LatchCMD

		; Clear the display
		mov	al,0x01
		out	PORT1,al
		call	LatchCMD

		; DDRAM address set to home, position top left most character
		mov	al,0x80
		out	PORT1,al
		call	LatchCMD 
		ret 
			
		times	((ROM_SIZE-16) - ($-$$)) db 0
		db	0xEA			; far jump
		dw	start			; Sets the offset IP value
		dw	0x10000-(ROM_SIZE/16)-0x10; Target CS value
		times	ROM_SIZE - ($-$$) db 0

The display shows:
Code:
Hello:
and flickers. I have a 373 latching any data being sent out the ports, and the eight LEDs flash as the LCD flickers. When I run my simple "Hello" program that only prints "Hello", it works just fine, and the last thing the 373 latches is the LCD command (02h) that latches the last character, which is what it should do. But with this program, I have determined that it starts acting funny when I call GetStrLen.

I have tried to follow the advice of various articles showing how to use scasb. Maybe I'm not utilizing the stack correctly. The reason I pushed the string onto the stack, is because when I write more advanced programs, strings from input will not be hardcoded using "db", and so I was trying to experiment and practice how I would manipulate those strings.

I intended this program to:
  • Display "Hello:"
  • Load a string into memory with a trailing NULL character to symbolize the end of the string
  • Put the address of the string into what I've read are the necessary registers for scasb to do its job
  • Invert the value of CX
  • Decrement it by 1 to get the length of the string (which I read was a clever and quick way to find it)
  • This may be another part where I'm getting confused. I think the integer value of the length of the string would be in the lower byte, so I'm converting the lower byte to ASCII
  • Print the length
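
For anyone following the NOT/DEC arithmetic in the list above, here is a rough C model of what REPNE SCASB does to a 16-bit CX. The names scasb_strlen and digit_to_ascii are made up for this sketch, and the loop only imitates the instruction; it is not the 8088 code itself. It assumes the string has a terminator within the first 65535 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: mov cx,0xFFFF / sub al,al / cld / repne scasb / not cx / dec cx.
   Hypothetical helper for illustration; assumes a NUL terminator exists
   within the first 65535 bytes. */
uint16_t scasb_strlen(const char *s)
{
    assert(s != 0);
    uint16_t cx = 0xFFFF;          /* mov cx,0xFFFF */
    const char *di = s;            /* mov di,ax */

    /* repne scasb: compare a byte, advance DI, decrement CX,
       stop on a match or when CX reaches zero */
    while (cx != 0) {
        char c = *di++;
        cx--;
        if (c == '\0')
            break;
    }

    cx = (uint16_t)~cx;            /* not cx: bytes scanned, NUL included */
    cx--;                          /* dec cx: drop the NUL from the count */
    return cx;
}

/* "add al,48" only yields a digit character for values 0..9. */
char digit_to_ascii(uint16_t n)
{
    assert(n <= 9);
    return (char)('0' + n);
}
```

Calling scasb_strlen("string") scans seven bytes, leaving CX at 0xFFF8; inverting gives 7 and the decrement gives 6. The single-digit conversion is why adding 48 to CL is only safe for strings shorter than ten characters.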

If anyone could help me, I'd greatly appreciate it. Thanks in advance! :D
 
Well, I just tried setting ES to SS to see if that worked, but it didn't. The display still flickers, and there's still IO going on after "Hello:" prints, which the LEDs/latch picks up, and nothing is printed after the ":".

Thinking about it now, I should have mentioned how much RAM I have in the system. I have 384KB of SRAM from 0h - 5FFFFh.
 
That was the other thing--on an 8088, there's no eax, ebx, ecx, etc.

What are you using for an assembler? Most decent assemblers allow for setting the target architecture and will err out when you try to use a feature not present on the CPU.
 
I've discovered the problem using debug. Upon dumping the stack, I found that the characters I pushed onto the stack were each separated by a null character, because I was trying to push a single byte onto the stack, which is impossible because the stack is a word in width. This caused scasb to exit earlier than I intended, putting a "1" in CX.

So then, do strings normally get compacted together, with characters in both the H and L bytes of words on the stack? I guess so, right? Otherwise, they'd take up twice the space.
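
To make the two stack layouts concrete, here is a small C sketch that simulates a word-wide, downward-growing stack in a byte array. Everything in it (Stack16, push16, scan_len and the two demo functions) is invented for illustration; it only assumes little-endian word storage, as on the 8088.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_BYTES 32

/* Hypothetical model of a 16-bit stack: sp is the index of the lowest
   byte currently in use, and pushes move it down by two. */
typedef struct {
    uint8_t mem[STACK_BYTES];
    size_t  sp;
} Stack16;

void stack_init(Stack16 *s)
{
    assert(s != 0);
    for (size_t i = 0; i < STACK_BYTES; i++)
        s->mem[i] = 0xFF;              /* non-zero filler */
    s->sp = STACK_BYTES;
}

void push16(Stack16 *s, uint16_t w)
{
    assert(s != 0 && s->sp >= 2);
    s->sp -= 2;
    s->mem[s->sp]     = (uint8_t)(w & 0xFF);   /* low byte at lower address */
    s->mem[s->sp + 1] = (uint8_t)(w >> 8);     /* high byte above it */
}

/* Count bytes from sp up to the first zero byte, like a SCASB scan. */
size_t scan_len(const Stack16 *s)
{
    assert(s != 0);
    size_t n = 0;
    while (s->sp + n < STACK_BYTES && s->mem[s->sp + n] != 0)
        n++;
    return n;
}

/* One character per push: each word carries a zero high byte. */
size_t one_char_per_push(void)
{
    Stack16 s;
    stack_init(&s);
    push16(&s, 0);
    for (const char *p = "gnirts"; *p; p++)
        push16(&s, (uint8_t)*p);
    return scan_len(&s);
}

/* Two characters per push: first character in the low byte. */
size_t two_chars_per_push(void)
{
    Stack16 s;
    stack_init(&s);
    push16(&s, 0);
    push16(&s, (uint16_t)('n' | ('g' << 8)));
    push16(&s, (uint16_t)('r' | ('i' << 8)));
    push16(&s, (uint16_t)('s' | ('t' << 8)));
    return scan_len(&s);
}
```

With one character per word, the scan hits the zero high byte right after "s" and reports a length of 1, which matches the "1" seen in CX. Packing two characters per word leaves the bytes contiguous, so the scan sees all six.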

I use NASM. And that's a good idea; I really should specify to the assembler which CPU I'm writing for.
 
It's not that common to use the stack like that. A C compiler, for example, might produce code that passes a word-size pointer to a string on the stack (when calling some subroutine), but not the whole string itself! The string would be stored as-is using db "string",0 somewhere in the data segment.
 
Oh, I don't know.

Code:
void stringer( char *what)
{

    char s1[10];

    strcpy( s1, what);
    printf( s1);
}

s1 would be on the stack, no? Maybe not the best code in the world, but certainly within the constructs of C.
 
Are you saying the programmer would have to hardcode db "string", 0 in the program? That would create a constant, which is not very helpful. A programmer would not know how long a string would be that a user would type. That's why pointers to bytes or pointers to chars are so helpful in C; instead of knowing how many bytes to allocate, you just tell it to start putting data at the address contained in the pointer, which is in RAM.

Or are you saying that RAM isn't necessarily a stack, depending on where the pointer to the string points to?
 
The arguments passed to strcpy are addresses (pointers), not the full string itself. Still, the strings would be in RAM, but now, per, you have me wondering whether they would be stored somewhere in RAM that is not being used as a stack.
 
Yes, but the strcpy copies the string onto the stack, and the address of the string is passed in the call. So the string is contained on the stack. The method to get the string onto the stack is different, but the effect is the same.

You could even get closer using a varargs scheme and pass individual characters on the stack. Weird, yes, but C doesn't place any restrictions on weirdness.


Personally, if I'm programming in assembly and I'm dealing with character literals, I prefer letting the CALL do all of the work for me. e.g.

Code:
    CALL  LITOUT
    DB    'Hello, world',0
    ...

LITOUT:
     POP    BX
LITOUT2:
     MOV   AL,CS:[BX]
     INC    BX
     TEST  AL,AL
     JNZ    LITOUT4
     JMP    BX

LITOUT4:
     CALL   OUTCHAR
     JMP    LITOUT2

...or some such. For my ROM code, I use a somewhat more complicated version that allows for printf-type format strings and values passed in other registers.

On 8080/8085 architectures, PUSH can be a sneaky way to scavenge some cycles when transferring data from a device. Much faster than a MOV M,A/INX H pair.
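
The LITOUT trick above depends on the return address pointing at the literal. As a loose model of that control flow, here is a C sketch in which an array stands in for the code segment and an index stands in for BX. The function name litout and its parameters are assumptions made for this example; a real version would print through OUTCHAR rather than fill a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of LITOUT: 'ret_addr' plays the role of the return
   address popped into BX, 'image' is the code segment, and 'out' stands
   in for OUTCHAR. Returns the address where execution would resume. */
size_t litout(const uint8_t *image, size_t ret_addr, char *out, size_t out_cap)
{
    assert(image != 0 && out != 0 && out_cap > 0);
    size_t bx = ret_addr;              /* POP BX */
    size_t n = 0;

    for (;;) {
        uint8_t al = image[bx++];      /* MOV AL,CS:[BX] / INC BX */
        if (al == 0)                   /* TEST AL,AL */
            break;
        if (n + 1 < out_cap)           /* CALL OUTCHAR (bounded here) */
            out[n++] = (char)al;
    }
    out[n] = '\0';
    return bx;                         /* JMP BX: first byte after the NUL */
}
```

The value returned is the offset of the first byte after the terminator, which is where the real routine jumps to continue with the caller's code.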
 
Speaking as someone who has programmed in many languages over the years: it depends.

As a C programmer on a machine with loads of RAM, I would allocate a chunk of RAM to use. As an assembler programmer, I would use predefined general-purpose buffers in RAM.
If we are talking about string constants, then I would define them as constants.

As an input buffer in assembler, I'd define it at the end of the code so an overflow will trigger an error instead of overwriting code (assuming a .com program where ES, CS and DS are the same). However, it is better to ensure you don't overflow in your input routine.

mylabel db BUFFERSIZE dup (?)
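
The "don't overflow in your input routine" advice could look something like the following in C. This is only a sketch: read_bounded is a made-up name, the source string stands in for whatever keyboard or serial routine supplies characters, and the value of BUFFERSIZE is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFERSIZE 16   /* assumed size, mirrors BUFFERSIZE in the db line */

/* Copy at most cap-1 characters from 'src' into 'buf', stopping at a NUL
   or carriage return, and always NUL-terminate. Returns the number of
   characters stored. Hypothetical routine for illustration only. */
size_t read_bounded(char *buf, size_t cap, const char *src)
{
    assert(buf != 0 && src != 0 && cap > 0);
    size_t n = 0;
    while (src[n] != '\0' && src[n] != '\r' && n + 1 < cap) {
        buf[n] = src[n];
        n++;
    }
    buf[n] = '\0';
    return n;
}
```

The length check happens before each store, so the buffer can never be overrun no matter how much the user types; extra characters are simply dropped.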


 
Interesting. I like how you think, Chuck. I was on another site yesterday, and they acted like I was psycho for playing with an 8088. They even tried to persuade me to put it in a glass case and write code for CPUs with 32-bit registers or higher instead. Very insulting! I didn't even bother replying, because it was clear it wouldn't "register" with them. It was at that moment when I questioned why I was even on the site, and scolded myself for not immediately going here instead. I knew you guys would 100% think that messing with vintage stuff is encouraged. :D
 
I got two programs to work last night on the 8088:
  • A program to find the length of a string
  • A program that prints a string

I edited the post to combine both programs into one; they were the same anyway. This program was written for my 8088 homebrew computer, with 384KB of RAM beginning at the bottom of the memory map. The EEPROM is 128KB, which can be conveniently changed by setting the size in the .data section. If you use the program, you might have to change the ROM_SIZE constant and the SS and SP values.
Code:
;            ************************************************
;            *  Assembler: NASM                             *
;            *  Experimenting with strings and their length *
;            *                                              *
;            ************************************************
; GetStrLen takes the address of the first char of the string
;   and saves the length of the string, with the LSB in CL
;
; PrintStr takes the address of the first char of a string and
;   iterates through the string in memory, printing each char
;   as it encounters it, to the LCD display.
;
; Both subroutines return when they encounter a NULL 0x00 character.
; 

section		.data
_1KB		equ	1024
_2KB		equ	2048
_32KB		equ	32*_1KB
_128KB		equ	128*_1KB
_256KB		equ	256*_1KB
PORT1		equ	0x01
PORT2		equ	0x02
ROM_SIZE	equ	_128KB		; Set size of ROM here

org		0x100

section		.text
start:		
		; Set stuff up
		mov	sp,0xFFFF	; Initialize the stack pointer to 64KB.
					;   This is 384KB in memory, 64KB 
					;   above the SS 
		mov	bx, ss		; Initialize BX to 0, since SS hasn't
					;   changed yet
		mov	ax,0x5000 	; Initialize SS to be at 320KB,
		mov	ss,ax		;   64KB below the end of RAM
		call	InitPorts	; Set up the 8255
		call	InitDelay	; Give the LCD time to self-initialize
		call	InitLCD		; Run the initialization sequence

		mov	al,"H"
		call	PrintChar
		mov	al,"e"
		call	PrintChar
		mov	al,"l"
		call	PrintChar
		mov	al,"l"
		call	PrintChar
		mov	al,"o"
		call	PrintChar
		mov	al, ":"
		call	PrintChar

		; Now try to use the stack to store a string
		mov	ax, 0000h	; Null character to terminate string
		push	ax

		mov	ax, "ng"
		push	ax

		mov	ax, "ri"
		push	ax

		mov	ax, "st"
		push	ax
		
		mov	ax, sp 		; Put address of string into AX
		call	GetStrLen
		call	PrintStr

		mov	al, cl		; Put str len into AL
		add	al, 48		; Convert the number to ASCII
		call	PrintChar

		jmp	$

PrintStr:
		mov	bx, ax		; Move the address of the string into BX
printnextchar:
		cmp	byte [ss:bx], 0h; Compare the byte pointed to by BX
		jz	psfinished	; Jump if the zero flag has been set
		push	ax
		mov	al, byte [ss:bx]
		call	PrintChar
		pop	ax
		inc	bx		; If not finished, inc address by one 
		jmp	printnextchar
psfinished: 
		sub	bx, ax	; Subtract the start address (AX) from the end address (BX)
		mov	ax, bx	; AX now equals the number of bytes in our string
		ret

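GetStrLen:
		push	ax		; Preserve the string address for the caller
		mov	bx, ss
		mov	es, bx		; SCASB scans ES:DI, so point ES at the stack
		mov	di, ax		; Move string address into di
		mov	cx, 0xFFFF	; Initialize CX
		sub	al, al		; AL = 0, the byte to search for
		cld
	repne	scasb
		not	cx
		dec	cx		; Get the length of the string
		pop	ax		; Restore the address (AL was zeroed above)
		ret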

InitDelay:	mov	word [bx],0x01FF; Set the countdown timer.
StartInitDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'StartInitDel' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartInitDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

CharDelay:	mov	word [bx],0x001F; Set the countdown timer.
StartCharDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'StartCharDel' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartCharDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

LatchDelay:	mov	word [bx],0x002F; Set the countdown timer.
StartDel:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'StartDel' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartDel	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret 

VeryLongDelay:	mov	word [bx],0xFFFF; Reset the countdown timer.
StartVLD:	dec	word [bx]	; Decrement it by 1 each time.  
		cmp	word [bx],00h	; If the timer has counted down
					; all the way, return to the 
					; 'StartVLD' label so that it
					; can move on to the next hex
					; value to display on the LEDs.  
		jnz	StartVLD	; If the counter hasn't counted
					; down to 00h yet, keep going.
		ret

PrintChar:	
		push	ax		; Save the ASCII character
		mov	al,0x02		; Make RS high, E low
		out	PORT2,al
		pop	ax		; Get the ASCII character back
		out	PORT1,al	; Send the character to the display
		mov	al,0x06		; Make E and RS high
		out	PORT2,al
		call	CharDelay
		mov	al,0x02		; Make RS high, E low
		out	PORT2,al
		call	CharDelay
		ret 

LatchCMD:
		mov	al,0x04		; Make E high to latch the data
		out	PORT2,al 
		call	LatchDelay
		call	ClearPort2
		call	LatchDelay
		ret

ClearPort2:
		mov	al,0x00		; Clear the port
		out	PORT2,al
		ret

InitPorts:	
		mov	al,0x90		; This sets the 8255 to operate
                out	0x03,al		;   in Mode 0 (basic I/O)
					;   Input	Output  
					;   *****	******
					;   Port 0	Ports 1, 2 
		ret

InitLCD: 
		; Reset sequence 1
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD

		; Reset sequence 2
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD

		; Reset sequence 3
		mov	al,0x30
		out	PORT1,al
		call	LatchCMD 

		; 8-bit, 2 lines, 5x8 characters
		mov	al,0x38
		out	PORT1,al
		call	LatchCMD
		; End of step 1

		; Increment, and no display shift
		mov	al,0x06
		out	PORT1,al
		call	LatchCMD

		; Turn display on, cursor on, and do not blink the 
		; character at cursor
		mov	al,0x0C
		out	PORT1,al
		call	LatchCMD

		; Clear the display
		mov	al,0x01
		out	PORT1,al
		call	LatchCMD

		; DDRAM address set to home, position top left most character
		mov	al,0x80
		out	PORT1,al
		call	LatchCMD 
		ret 
			
		times	((ROM_SIZE-16) - ($-$$)) db 0
		db	0xEA			; far jump
		dw	start			; Sets the offset IP value
		dw	0x10000-(ROM_SIZE/16)-0x10; Target CS value
		times	ROM_SIZE - ($-$$) db 0

I wanted to share this code in case it helps anyone.
 
...
Personally, if I'm programming in assembly and I'm dealing with character literals, I prefer letting the CALL do all of the work for me. e.g.
...
...or some such. For my ROM code, I use a somewhat more complicated version that allows for printf-type format strings and values passed in other registers.

Something very similar was done in the Z80 control side of the TRS-80 Model 16 Xenix system. The Z80 code:
Code:
    call  print
    ascii 'Hello!',0
cont:
...
print:
    ex   (sp),hl
lp: ld   a,(hl)
    inc  hl
    or   a
    jr   z,done
    call putc
    jr   lp
done:
    ex   (sp),hl
    ret

This dates from ca. 1982, but looks quite similar to your technique, Chuck.
 
This is a bit late of a response but I wanted to add that Microsoft C 5.10 does put local strings on the stack as Chuck suggested earlier. Given the following code ...

Code:
#include <string.h>

void strfoo(char *s)
{
	char buf[20];
	strcpy(buf, s);
}

void main(void)
{
	strfoo("Hello World!");
}

The assembly for strfoo is ...

Code:
	PUBLIC	_strfoo
_strfoo	PROC NEAR
	push	bp
	mov	bp,sp
	mov	ax,20
	call	__chkstk
;	s = 4
;	buf = -20
; Line 6
	push	WORD PTR [bp+4]	;s
	lea	ax,WORD PTR [bp-20]	;buf
	push	ax
	call	_strcpy
; Line 7
	mov	sp,bp
	pop	bp
	ret	
	nop	

_strfoo	ENDP

I have also observed that, unless you explicitly tell the compiler to optimize for space, there will be empty gaps in the stack between odd-sized strings or variables, since the data there must be WORD-aligned.
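
The gaps come from rounding each object's size up to a whole number of words. A minimal C sketch of that rounding, with word_align as a made-up helper name (this shows the arithmetic only, not what any particular compiler emits):

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of 2 (one 16-bit word). */
size_t word_align(size_t nbytes)
{
    assert(nbytes <= (size_t)-1 - 1);   /* avoid wraparound on the +1 */
    return (nbytes + 1) & ~(size_t)1;
}
```

A 13-byte buffer would occupy 14 bytes under this rule, leaving one unused byte, while an even-sized buffer such as the 20-byte buf above needs no padding.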
 
Just as a point of discussion, Turbo Pascal gives the user a choice: Immutable strings (which are copied onto the stack and can then be manipulated as a local variable, then discarded once the procedure/function ends) and mutable strings (where a pointer to the source string is passed, and manipulating the string will be immediate and permanent).
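
The same choice can be sketched in C for comparison. This is an analogy rather than Turbo Pascal code: Str16, make_str and the two upcase functions are hypothetical names, and wrapping the array in a struct is simply how C lets a whole string be copied for the callee.

```c
#include <assert.h>
#include <string.h>

typedef struct { char text[16]; } Str16;   /* hypothetical fixed-size string */

/* Build a Str16 from a C string, truncating if necessary. */
Str16 make_str(const char *src)
{
    Str16 s;
    assert(src != 0);
    strncpy(s.text, src, sizeof s.text - 1);
    s.text[sizeof s.text - 1] = '\0';
    return s;
}

/* By value: the callee works on its own copy, so the caller's
   string is left untouched when the function returns. */
void upcase_first_by_value(Str16 s)
{
    if (s.text[0] >= 'a' && s.text[0] <= 'z')
        s.text[0] = (char)(s.text[0] - 'a' + 'A');
}

/* By reference: only a pointer is passed, so the change is
   immediate and permanent in the caller's string. */
void upcase_first_by_ref(Str16 *s)
{
    assert(s != 0);
    if (s->text[0] >= 'a' && s->text[0] <= 'z')
        s->text[0] = (char)(s->text[0] - 'a' + 'A');
}
```

After the by-value call the caller still holds "string"; after the by-reference call it holds "String".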
 
Something very similar was done in the Z80 control side of the TRS-80 Model 16 Xenix system. The Z80 code:

This dates from ca. 1982, but looks quite similar to your technique, Chuck.

Sure, I'm not claiming something new. I was doing this in the 1970s on x80 stuff, as were other people.

My point is that most messages (i.e. ASCII text) are, in fact, mostly read-only literals with just a few substitutions. So, if you're trying to make a small program in ROM, the technique eliminates the need for an extra instruction to load the address of the string. If you're reading a dump, it's also very convenient--you can see exactly where the message is invoked.
----
If anyone is really interested in this stuff, I can post tidbits of code.
 
Yes, please share it. I would really like to read it.
 