
Programming XMS memory usage from MS-DOS 6.22 with Borland Turbo C++ 3.0 (real mode)

juj

Hi all,

I am participating in Advent of Code ( https://adventofcode.com/ ) this year on an MS-DOS 386SX 16 MHz/2MB RAM/Borland Turbo C++ 3.0 setup as a computing theme. I've got through all the puzzles (1-18) so far on the machine, although in one of the puzzles I find that I would like to squeeze more usable memory out of the machine to optimize runtime performance.

As one possible direction, I am trying to understand how XMS memory works, and maybe use that. There are two good looking sources that I found for reference:

1) http://www.techhelpmanual.com/943-extended_memory_specification__xms_.html
2) http://www.phatcode.net/res/219/files/xms20.txt

It all seems straightforward, and I am able to code up a program based on that, but I have a problem: I don't quite understand how I can access the memory I allocate with XMS function 09h: http://www.techhelpmanual.com/954-xms_09h__allocate_extended_memory_block.html . The function returns a handle to the memory. The function 0Ch can be used to lock the handle and get a 32-bit memory address to that memory block: http://www.techhelpmanual.com/957-xms_0ch__lock_extended_memory_block.html .

However, in real mode, can I even access such a 32-bit memory address directly? (e.g. via a MK_FP(seg, ofs) type of construct, or similar?)
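To make the question concrete, here is a minimal sketch in portable C (not Turbo C++ specific) of how far a segment:offset pair can reach. The helper names are made up for illustration, and it assumes the A20 line is enabled so addresses do not wrap at 1 MB.

```c
#include <assert.h>
#include <stdint.h>

/* Linear address that a real-mode segment:offset pair refers to,
   assuming A20 is enabled (no wrap-around at 1 MB). */
static uint32_t real_mode_linear(uint16_t seg, uint16_t ofs)
{
    return ((uint32_t)seg << 4) + ofs;
}

/* Returns 1 if some segment:offset pair can express this linear address.
   The highest one is FFFF:FFFF = 0x10FFEF, i.e. the top of the HMA. */
static int reachable_from_real_mode(uint32_t linear)
{
    return linear <= real_mode_linear(0xFFFF, 0xFFFF);
}

/* Segment part of a normalized pointer for an address below 1 MB
   (the offset part would be linear & 0xF). */
static uint16_t segment_of(uint32_t linear)
{
    assert(linear < 0x100000UL);
    return (uint16_t)(linear >> 4);
}
```

Extended memory blocks normally sit above the HMA, so the 32-bit address returned by the lock function would fail this reachability check, which suggests a MK_FP(seg, ofs) pointer cannot express it.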

The only functions besides the locking and unlocking I find, are ones that perform a memcpy between conventional memory and XMS memory (identified by a handle). Is that the only way to access XMS memory from a real-mode program? i.e. by memcpying it over to conventional memory, and then back?

Or is there a way to memory map the XMS memory to address space of the real mode program so that it can be directly byte addressed?

I am starting to think that maybe there isn't (after all, real mode is real mode?), and so the handle-based mechanism is the only thing I can get with XMS?

In that case, I am pondering whether EMS would be more appropriate for my use case.

What I am trying to implement is a large hash table cache data structure, where each cache entry is 6 bytes in size. The hot inner loop of my program reads and writes these cache entries. If every access to the cache consists of a 6-byte XMS<->conventional memcpy operation, I suspect XMS would not be very fast for this. Or is there a direct way to access the XMS memory?

The accesses to the cache memory locations are practically random, so with EMS I suppose I would be switching the active EMS pages on demand in the page frame whenever the next memory access does not lie in the same page as the currently mapped one. Since the lookups are random, this would also amount to a lot of page frame switches, which might be quite slow as well.
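The on-demand page switching I have in mind could be sketched roughly like this. It is a portable C simulation that only counts how often a remap would be needed; the `EmsWindow` type and function names are hypothetical, and it tracks a single 16 KB window even though the standard 64 KB page frame holds four physical pages.

```c
#include <assert.h>
#include <stdint.h>

#define EMS_PAGE_SIZE    16384u  /* EMS logical pages are 16 KB */
#define ENTRY_SIZE       6u      /* cache entry size */
#define ENTRIES_PER_PAGE (EMS_PAGE_SIZE / ENTRY_SIZE) /* 2730, 4 bytes unused per page */

typedef struct {
    int           mapped_page; /* logical page currently mapped, -1 = none */
    unsigned long remaps;      /* number of simulated map calls so far */
} EmsWindow;

static void ems_window_init(EmsWindow *w)
{
    assert(w != 0);
    w->mapped_page = -1;
    w->remaps = 0;
}

/* Returns the byte offset of the entry inside the mapped page,
   "remapping" first if the entry lives in a different logical page.
   Entries never straddle a page boundary with this layout. */
static unsigned ems_locate(EmsWindow *w, uint32_t entry_index)
{
    int page = (int)(entry_index / ENTRIES_PER_PAGE);
    assert(w != 0);
    if (page != w->mapped_page) {
        /* Real code would call the EMS map function here (INT 67h, AH=44h). */
        w->mapped_page = page;
        w->remaps++;
    }
    return (unsigned)(entry_index % ENTRIES_PER_PAGE) * ENTRY_SIZE;
}
```

Feeding the real sequence of hash lookups through `ems_locate` and reading `remaps` afterwards would give an estimate of how many page switches the access pattern causes before committing to EMS.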

What would my options be here - what do you think would be the fastest method? (I have already stolen other memory blocks I can find, e.g. video memory. Haven't consumed UMB and HMA yet, but those are also on the list)

Thanks, and Merry Christmas to you all!
 
But if you want to go through DOS calls, the incantation is something along these lines: use the multiplex interrupt (INT 2Fh) to detect the XMS driver and get its entry point, then call that entry point to allocate a block of XMS, free a block of XMS, etc. But you can't directly access XMS memory; you can only issue calls to move data to or from it. Recall that switching from protected to real mode on the 80286 was originally done by saving the register file and rebooting. There's a byte in "CMOS" memory queried by the BIOS POST that tells the BIOS why the system was booted. Later, unreal mode enabled XMS routines to do their thing without rebooting.

It's all a hack anyway--but here's the seminal document: http://www.phatcode.net/res/219/files/xms30.txt

I used XMS back in the day to store image data when copying diskettes. Staged operation: base memory->EMS->XMS->hard disk, in that order. I can recommend this if you're not sure what sort of system you're running on. For example, if you have an 8088 with an EMS card, XMS doesn't exist. You can also have an 80386 running DOS without an EMS driver, but with XMS.
 
Thanks all, super informative and these clarify the situation a lot!

Given the requirement to block copy the data back and forth, it feels like XMS might not be a good solution for this kind of fast random access to 6-byte cache records, one at a time. I imagine the per-call copy overhead would dominate if naively done on such small amounts of data.
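For reference, this is roughly what a per-record copy would look like. It is a portable C simulation, not real driver code: the `XmsMove` layout follows the move descriptor in the XMS spec (function 0Bh), but `sim_xms_move`, the fake memory arrays, and `FAKE_HANDLE` are stand-ins I made up so the logic can run anywhere. In particular, for handle 0 a real driver treats the offset as a segment:offset pointer, whereas here it is just an index into a small array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Move descriptor layout per the XMS spec (16 bytes, no padding). */
#pragma pack(push, 1)
typedef struct {
    uint32_t length;     /* bytes to move; the spec requires an even value */
    uint16_t src_handle; /* 0 means conventional memory */
    uint32_t src_offset; /* offset into the block (seg:off pointer if handle is 0) */
    uint16_t dst_handle;
    uint32_t dst_offset;
} XmsMove;
#pragma pack(pop)

/* ASSUMPTION: simulated memory, standing in for the real thing. */
static unsigned char fake_conv[64];      /* "conventional" staging buffer */
static unsigned char fake_xms[6 * 1024]; /* one "extended memory block" */
#define FAKE_HANDLE 1u

static unsigned char *resolve(uint16_t handle, uint32_t offset, uint32_t length)
{
    unsigned char *base = (handle == 0) ? fake_conv : fake_xms;
    uint32_t size = (uint32_t)((handle == 0) ? sizeof fake_conv : sizeof fake_xms);
    if (handle > FAKE_HANDLE || offset > size || length > size - offset)
        return 0;
    return base + offset;
}

/* Simulated move call: returns 1 on success, 0 on failure. */
static int sim_xms_move(const XmsMove *m)
{
    unsigned char *src, *dst;
    assert(sizeof(XmsMove) == 16);
    if (m->length & 1u) return 0; /* odd lengths are rejected */
    src = resolve(m->src_handle, m->src_offset, m->length);
    dst = resolve(m->dst_handle, m->dst_offset, m->length);
    if (!src || !dst) return 0;
    memmove(dst, src, m->length);
    return 1;
}

/* Write one 6-byte cache entry: conventional -> extended block. */
static int entry_store(uint32_t index, const unsigned char entry[6])
{
    XmsMove m;
    memcpy(fake_conv, entry, 6);
    m.length = 6;
    m.src_handle = 0;           m.src_offset = 0;
    m.dst_handle = FAKE_HANDLE; m.dst_offset = index * 6u;
    return sim_xms_move(&m);
}

/* Read one 6-byte cache entry: extended block -> conventional. */
static int entry_fetch(uint32_t index, unsigned char entry[6])
{
    XmsMove m;
    m.length = 6;
    m.src_handle = FAKE_HANDLE; m.src_offset = index * 6u;
    m.dst_handle = 0;           m.dst_offset = 0;
    if (!sim_xms_move(&m)) return 0;
    memcpy(entry, fake_conv, 6);
    return 1;
}
```

Every `entry_store` or `entry_fetch` is one full driver call for just 6 bytes, which is the overhead I am worried about. Batching several entries per move would amortize it, but random access makes that hard.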

I think I'll try grabbing all the UMB and HMA next, and then look into EMS.

I had the impression that XMS had replaced or obsoleted EMS as the newer and better feature. It seems that this might not be so for all use cases, e.g. this kind of random hash map cache access pattern. With EMS I suppose I could get opportunistically lucky if subsequent cache accesses at least hit the same EMS page, so the EMS page remap overhead would be amortized a bit.

What is your gut feeling on the general overhead comparison: how slow would changing EMS page mapping be (on a 386, where EMM386.EXE is implementing EMS) vs a tiny 6-byte XMS block copy? I am guessing those would be roughly the same order of magnitude in terms of overhead, since they both amount to an interrupt call?

Maybe I'll find that either of them will be prohibitively slow, and UMB and HMA will be my best bet here.
 
XMS is simply a system for managing extended memory, with some helper functions for real mode. It is possible to directly access extended memory (including that managed by XMS) in real mode with LOADALL (286) or "unreal mode" (386+). This is how XMS managers like HIMEM.SYS work internally.

The XMS move function was intended for 286 era real mode programs to easily utilize extended memory without switching to protected mode or using the aforementioned hacks. It was slower than direct access, but it was fine for things like disk caching, ram drives, print spoolers, etc. As far as indirect XMS vs EMS performance, that's going to depend on your application and you can't really know until you test it.

With the introduction of the 386, protected mode was improved and easier to use (DPMI) and is really the "new and better feature". But you need a compiler that supports it.
 
Makes sense. One part of the challenge was to specifically use Borland Turbo C++ 3.0. (I did eventually switch to DJGPP+RHIDE as my main compiler target, which was all protected mode.)

I did a conclusions writeup of my Advent of Code challenge here: http://clb.confined.space/aoc2022/

Wish you all a wonderful end of the year!
 