
Best way to implement directories while maintaining CP/M 2.2 compatibility.

cj7hawk

Veteran Member
Joined
Jan 25, 2022
Messages
1,155
Location
Perth, Western Australia.
I'm just looking to program up some code to add directories to my CP/M implementation and was wondering about the best-regarded methods and ideas from the past.

I was thinking to make a file "filename.dir" with the DIR extension, then either add the directory-change functions into the BDOS or implement a command that can "mount" a directory as a root disk - eg, A:FILENAME.DIR becomes E:

To avoid file allocation issues, I'll need to "hook" a routine for each directory that remains resident and is called when the directory changes. I can store that in extended memory space using paging, so it won't consume space in the TPA. For each level of directory, I need to update all above-level allocations when writes occur - which will get a little problematic if too many are nested. I could limit nesting (eg, 1 or 2 directory levels deep), or alternately I could just allow it to extend while memory is available.

The nice thing about this approach is that directories can be treated as files, meaning rename and copy work just fine: directories can be renamed, moved between drives or copied using existing CCP commands, and it retains compatibility with CP/M.
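For reference, the structure being preserved here is simple - every CP/M 2.2 directory entry is a fixed 32-byte record, which is why a directory-as-a-file would show up as just another entry. A quick sketch of the layout (Python purely for illustration; the SUBDIR.DIR entry below is made up):

```python
# Illustrative only: decode one 32-byte CP/M 2.2 directory entry.
def decode_dirent(entry: bytes) -> dict:
    """Layout: UU F1-F8 T1-T3 EX S1 S2 RC AL0-AL15."""
    assert len(entry) == 32
    user = entry[0]                                   # 0-15; 0xE5 = erased
    name = bytes(b & 0x7F for b in entry[1:9]).decode("ascii").rstrip()
    ext  = bytes(b & 0x7F for b in entry[9:12]).decode("ascii").rstrip()
    ex, s2, rc = entry[12], entry[14], entry[15]
    return {
        "user": user, "name": name, "ext": ext,
        "extent": ((s2 & 0x3F) << 5) | (ex & 0x1F),   # logical extent number
        "rc": rc,                                     # 128-byte records used
        "alloc": list(entry[16:32]),                  # block numbers (8-bit form)
    }

# A made-up "SUBDIR.DIR" entry in user area 0:
raw = bytes([0]) + b"SUBDIR  " + b"DIR" + bytes([0, 0, 0, 8]) + bytes(range(1, 17))
e = decode_dirent(raw)
```

Since the high bits of the name and extension bytes carry the attribute flags, a hidden "system" bit on such an entry would keep it out of a normal DIR listing while still being a plain file to PIP.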

It's also possible to use the user number to extend directory structures, but that rules out reducing the allocation size per directory, and either constrains all filenames or means implementing directories that don't work directly with CP/M - which is also a pretty valid approach.

So what are some of the better directory structure ideas for CP/M? What worked well and didn't work so well in the past? My preference is to extend on the CP/M file system rather than adapt FAT or similar to CP/M.

All input appreciated -

Thanks
David
 
Just for us benighted-first-cuppa-coffee people, are you talking about doing a complete implementation of CP/M or dovetailing something into an existing OS, say MSDOS?
 
Based on what little I know about CP/M, the closest it ever came to supporting directories was "User Numbers". If you wanted to use them to fake nested subdirectories, I can imagine you could just come up with a CLI alias system where you place named objects (dummy files) in a user area that, when "CD'ed" to, automatically change the active user area to that one. (And you could modify how the drive prompt is generated so instead of displaying "2B" or whatever it shows the directory name in front of or behind the drive letter.) Of course, all the limits associated with user areas would persist, like the total number of possible files being limited to the one root directory's length, etc. But it would at least be "portable" in the sense that you'd be able to read the disk with "normal" CP/M by manually switching between the aliased user areas.
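The alias idea above is about as simple as it sounds - a minimal sketch, with every name here hypothetical and Python standing in for whatever the CLI would actually do:

```python
# Hypothetical alias table: "directory" names are just labels for
# CP/M user areas.  A real implementation would call BDOS function
# 32 (set user code) and patch the CCP's prompt routine.
ALIASES = {"ROOT": 0, "GAMES": 1, "LETTERS": 2}

def cd(name: str, state: dict) -> None:
    """Switch the active user area and re-label the prompt."""
    area = ALIASES[name.upper()]
    state["user"] = area
    state["prompt"] = f"A:{name.upper()}>"   # show the name, not the number

state = {"user": 0, "prompt": "A>"}
cd("games", state)
```

The disk stays a stock CP/M disk throughout; only the shell's notion of "where you are" changes.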

I'm a little confused about your "directories as files" idea; it reads like you want to literally implement the subdirectories as "sparse" disk images? (Which would essentially have a "header" that looks like a CP/M directory, but the block entries associated with files are fake/translated to the "real" blocks the files inside occupy?) That sounds... kind of like a nightmare in terms of overhead, and of course these "disk image" files aren't going to mean anything to a normal CP/M system. (Maybe that's not important.)

The blunt truth is that Digital Research never solved this problem. They ripped off FAT and incorporated it into "Dos Plus"/Concurrent CP/M-86/GEMDOS around 1984 and abandoned the original CP/M disk format. Switching away from the CP/M disk format isn't necessarily a deal breaker for program compatibility, the various dingus-es for running CP/M programs on top of MS-DOS generally work fine without trying to emulate the low-level disk format, and for that matter Microsoft's MSX-DOS, which is a Z80 OS that natively uses FAT, also manages to be mostly compatible with CP/M software.
 
UCSD p-System on CP/M had volumes as allocated space on the CP/M drive, and subvolumes could be allocated out of the various volumes to work around the 77-file limit. Changing drives within the p-System would change the active subvolume. Something like that, where the subdirectory is effectively a new drive, could be possible without too much redesign of CP/M.

I think that any subdirectory design would need a new file system. CP/M's classic file system doesn't readily handle enough files to make a subdirectory useful. The new file system doesn't have to be FAT, though FAT is a fairly good example of what a file system with a bit more memory to work with could do.
 
The objective is that while I could just design it into the BDOS, I really don't want to do that - I want either the stock BDOS or my BDOS to be able to handle the directories.

I also want to keep the standard directory structure as much as possible.

It's also OK if it only works on large disks.

I've been toying with ideas such as playing with the extent mask to recover half of the directory entry extent section to carry information on where the file is, as well as use of the user number to indicate differences in the directory, so that existing tools under CP/M that examine the allocations within extents in the directory entry will ignore any changes.
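To make the extent-mask trick concrete: per the CP/M 2.2 alteration guide, EXM is just a consequence of how much data one directory entry's 16-byte allocation map can cover. A sketch of that relationship, including what happens if part of the map is recovered for directory information (the `alloc_bytes` parameter is my own addition to model that idea):

```python
# EXM = (data mapped by one directory entry / 16K) - 1, per the
# CP/M 2.2 alteration guide.  big_disk (DSM >= 256) means two-byte
# block numbers, so half as many slots fit in the allocation map.
# Shrinking alloc_bytes models stealing part of the map.
def extent_mask(block_size: int, big_disk: bool, alloc_bytes: int = 16) -> int:
    slots = alloc_bytes // (2 if big_disk else 1)
    return (slots * block_size) // (16 * 1024) - 1

full = extent_mask(2048, big_disk=True)   # 8 slots x 2K = 16K mapped -> EXM 0
wide = extent_mask(4096, big_disk=True)   # 8 slots x 4K = 32K mapped -> EXM 1
```

The catch is visible immediately: halve the map on a big disk with 2K blocks and the entry maps only 8K, less than one logical extent, so the recovered space forces either bigger blocks or more entries per file.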

Critical functions would need to be:
* Normal CP/M must not corrupt anything and should still handle directories even if it can't show the files. (They can appear as files)
* Non-OS Software should be able to access it if the OS can not ( like LBR ).
* Software that makes direct disk access should not be confused, or corrupt the method.

Principles are;
* It's intended for a specific hardware architecture - so use of extended memory is OK, and it should not exist in the TPA if possible.
* Assume 128KB memory minimum, ideally 512KB to 768KB of memory.
* Assume the disk is fast ( Hard Disk or other large capacity drive ).
* Directories are not really necessary on an 80KB disk... but they're useful at more than 2MB. Somewhere in the middle, this changes.
* It should occur in a way that CP/M software that is not aware of directories can access it.


So faking directories is OK, though I'd prefer to have at least a single layer of directories - ie, a root-level directory of directories, and a first level with separate directories.

Use of a translation table is also OK - eg, LBA-style translation. For example, a 4K translation table would provide 2048 entries, which could refer to extents or entries rather than blocks... This means a file would have two extents - one in its local directory and one in the "root" directory. That model would support 33MB of disk, which would be quite era-respectable and aligns with DOS expectations of the same era. So it's a heck of a vector table, but 4K is a single page for the architecture I'm planning, isn't a huge imposition under CP/M, and would keep blocks as small as 2K if I modify the extent mask.
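A quick sanity check of that sizing arithmetic (assuming 2-byte table entries, each referring to one 16K logical extent - it works out to 32 MiB, i.e. the "33MB" figure in decimal megabytes):

```python
# Sizing the translation table: one 4K page of 16-bit entries,
# each entry referring to one 16K logical extent.
TABLE_BYTES  = 4 * 1024
ENTRY_BYTES  = 2
EXTENT_BYTES = 16 * 1024

entries = TABLE_BYTES // ENTRY_BYTES                   # entries in one page
capacity_mb = entries * EXTENT_BYTES // (1024 * 1024)  # addressable capacity
```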

Ideally, it would be great to be able to extend the FCB structure, which playing with the extent would allow - for example, a filename could contain the directory name in the remainder of the allocation area. That could even allow matching with a flat directory structure if the directory size was expanded, but a referenced model would allow each directory to be a new directory with its own limitations, unaffected by other directories.

The simplest way I can think of is to simply partition a much larger disk into multiple disks, with 16-bit allocation addressing and common directory offset values. That would let me change disks with just a BIOS rewrite of a few bytes, and a call to the routine that rebuilds the allocation vector tables. But it also means having fixed directory sizes and fixed directory locations.
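With identical partitions, the "few bytes" are essentially the OFF field (reserved tracks) in the Disk Parameter Block. A sketch of the idea - the DPB field subset is real CP/M 2.2, but all the numbers are made up:

```python
# Sketch: identical fixed-size partitions on one big drive, where
# "changing disks" is just rewriting OFF in the DPB and rebuilding
# the allocation vector.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class DPB:                # CP/M 2.2 DPB fields (subset, illustrative)
    spt: int              # 128-byte sectors per track
    bsh: int              # block shift: block size = 128 << bsh
    dsm: int              # highest block number on the "disk"
    drm: int              # highest directory entry number
    off: int              # reserved tracks before the directory

def partition(base: DPB, index: int, tracks_per_partition: int) -> DPB:
    """All partitions share one geometry; only OFF differs."""
    return replace(base, off=base.off + index * tracks_per_partition)

base = DPB(spt=64, bsh=4, dsm=1023, drm=255, off=1)   # made-up numbers
d2 = partition(base, 2, tracks_per_partition=128)
```

Since every partition shares DSM and DRM, the fixed directory size and location fall straight out of this model, exactly as noted above.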

The most complex is storing directory structure in the FCB next to the allocations by manipulating the extent mask to shorten the extent, then manipulating system writes to also write the directory file with additional extents, and then ignoring deletes (the directory doesn't return space to the pool) - or perhaps even using dummy higher-level directory files with a pre-determined structure and sparse files to make allocations easy to find when maintaining them, creating a pseudo-FAT.

Though I'm interested in what other ideas for creating a directory structure exist.
 
UCSD p-System on CP/M had volumes as allocated space on the CP/M drive, and subvolumes could be allocated out of the various volumes to work around the 77-file limit. Changing drives within the p-System would change the active subvolume. Something like that, where the subdirectory is effectively a new drive, could be possible without too much redesign of CP/M.

I think that any subdirectory design would need a new file system. CP/M's classic file system doesn't readily handle enough files to make a subdirectory useful. The new file system doesn't have to be FAT, though FAT is a fairly good example of what a file system with a bit more memory to work with could do.

This is similar to what I do with my existing memory drive - I have images which I re-load by manipulating the directory offset and size so that everything appears in M:, but I also have J: K: L: O: and P: which can be viewed as their own drives, but actually exist as a file within M:

It's very easy to implement, but requires that I pre-allocate the "subvolumes" and then mount them. Although I mount them in actual fixed drive letters above I:, it would be fairly simple to do the same to a real high capacity drive and mount the subvolume as another letter.

The downside is that while the directories can be easily manipulated by CP/M, the software has to be aware of what they are. In my case, some are ROMs ( All BIOS ROMs in my system appear as legitimate drives, so rather than looking for AA55, I just autoexecute the ROMs filename ) - Something on a large drive would fit the system architecture well, but is a bit wasteful of space as it requires the preallocation and fixed subdirectory sizes. It's fine for memory since memory is mostly pre-allocated, and even RAMDISKs are dynamic, but I think it could be improved for a hard disk.
 
Software that makes direct disk access should not be confused, or corrupt the method.

Ignorant question: when does “generic” CP/M software ever make “direct” disk accesses? I’m looking at ye old big list of BDOS calls and I don’t see anything akin to INT13h direct sector access, everything that touches the disk goes through an FCB. So far as I’m aware any programs that run under CP/M that touch the disk at a lower level than that (like for, say, reading alien formats) are platform specific.

MS-DOS and CP/M 4.x use FCBs to provide backwards compatibility on top of completely different underlying filesystems, so I’m not really clear why you think subdirectories really need to “look like” standard CP/M structures. This is a BIOS level thing, not BDOS, and well behaved CP/M programs don’t know squat about what happens in there.
 
Ignorant question: when does “generic” CP/M software ever make “direct” disk accesses? I’m looking at ye old big list of BDOS calls and I don’t see anything akin to INT13h direct sector access, everything that touches the disk goes through an FCB. So far as I’m aware any programs that run under CP/M that touch the disk at a lower level than that (like for, say, reading alien formats) are platform specific.

MS-DOS and CP/M 4.x use FCBs to provide backwards compatibility on top of completely different underlying filesystems, so I’m not really clear why you think subdirectories really need to “look like” standard CP/M structures. This is a BIOS level thing, not BDOS, and well behaved CP/M programs don’t know squat about what happens in there.
It was quite common for programs that needed direct disk access to make BIOS calls. Even Digital Research's reference implementation of SYSGEN does that. With CP/M 3, a BDOS call for "direct BIOS calls" was added, but that was more out of necessity due to banked memory (and was also an attempt to rein in programs that were bypassing the BDOS). Such programs were not expected to access the file area of the disk, at least not beyond simply copying it sector-for-sector.
 
No, the BIOS abstracts away from platform-specific features.

I guess what I mean is, what non-system applications (disk copy programs, etc) would use this? Is it a common use case outside of system and disk utilities, the latter of which would seem to be unsafe by design to unleash on a disk with anything nonstandard about it?

You used the example of SYSGEN: that’s an OS utility, and according to this reference it has the expected disk geometry baked into it and needs to be patched for different disk formats. Is this not the definition of a platform specific program?
 
I guess what I mean is, what non-system applications (disk copy programs, etc) would use this? Is it a common use case outside of system and disk utilities, the latter of which would seem to be unsafe by design to unleash on a disk with anything nonstandard about it?

You used the example of SYSGEN: that’s an OS utility, and according to this reference it has the expected disk geometry baked into it and needs to be patched for different disk formats. Is this not the definition of a platform specific program?
It is true that directly accessing the disk is not something a "standard" application would do. But it can be done in a way that abstracts away from specifics of a platform. Such a program can access the DPB and determine all it needs to know about the disk geometry, and that's how many disk recovery utilities work - while not being strapped to a specific platform.
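To illustrate what such a utility computes after fetching the DPB (BDOS function 31 returns its address), here's the geometry arithmetic, run against the classic 8" SSSD values as a check - Python standing in for what would be a few shifts and adds in Z80:

```python
# Geometry from DPB fields, as a platform-independent disk utility
# would derive it.
def geometry(bsh: int, dsm: int, drm: int) -> dict:
    block = 128 << bsh                    # BSH is the block shift factor
    return {
        "block_size": block,
        "data_bytes": (dsm + 1) * block,  # DSM = highest block number
        "dir_entries": drm + 1,           # DRM = highest entry number
    }

g = geometry(bsh=3, dsm=242, drm=63)      # standard 8" SSSD DPB values
```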
 
So… exactly the kind of programs that you shouldn’t run against filesystems that have something proprietary about them? I mean, analogous programs certainly exist for “MS-DOS” that go around the DOS API and directly hit the PC BIOS calls for similar reasons, but said programs also have a bad habit of nuking your disk if there’s anything “weird” about your system. (Or, for instance, your disk is formatted with a newer version of FAT than it expects, like when >32MB partitions came out. Or the issues that cropped up when translating controllers became a thing…)

Anyway. My only point here is that it seems like a stretch to expect compatibility with these sorts of programs for any sort of subdirectory implementation for CP/M that isn’t either just user areas in a trench coat or something that effectively is named disk volume mounts, not really subdirectories.
 
So… exactly the kind of programs that you shouldn’t run against filesystems that have something proprietary about them? I mean, analogous programs certainly exist for “MS-DOS” that go around the DOS API and directly hit the PC BIOS calls for similar reasons, but said programs also have a bad habit of nuking your disk if there’s anything “weird” about your system. (Or, for instance, your disk is formatted with a newer version of FAT than it expects, like when >32MB partitions came out. Or the issues that cropped up when translating controllers became a thing…)

Anyway. My only point here is that it seems like a stretch to expect compatibility with these sorts of programs for any sort of subdirectory implementation for CP/M that isn’t either just user areas in a trench coat or something that effectively is named disk volume mounts, not really subdirectories.

I guess it's because I'm writing an operating system and building an architecture that supports that operating system, not just supporting things that were written for it. So things like STAT and disk utilities are of particular use and value to ensure compatibility with - in fact, someone recently wrote a similar utility that reviews the file area for corruption and misconfiguration, and I found it useful. It worked right away when I ran it and helped me find some bugs, as well as proving useful as system software.

For example, I use STAT as a part of my MMU - it tells me what processes I have resident in memory - drivers, disk images, which BIOS are loaded and available, the current memory location of my video card etc, and it means that these can be read by any software through the BDOS - even if I swap out my BDOS for the DRI BDOS - it still works, and can be accessed by higher level software.

Maintaining system-level compatibility right down to the BIOS is a lot more important than you might imagine. I had issues with some Infocom games when I was developing the OS - they just wouldn't work... IIRC, they were accessing the BIOS directly for some graphics routines. I can't remember if they were doing random disk access as well - but I think they did that through the BDOS.

Anyway, it's an objective. A principle of design. It can be ignored and broken and I'm interested in suggestions that do away with the rules also - eg, simulated extent data is fine, but I'd like to know of the more elegant ways to do it.

The problems of supporting directories seems to boil down to a few key elements;

1) How do you support the base directory structure and access the files under CP/M when in a directory.
2) How do you change directories? eg, Mount a directory.
3) How do you *Navigate* directories? eg, Go up or down a directory level?
4) Where do you store this information and how is it accessed? And how do you maintain the file allocations across disparate directories?
5) What parts of CP/M can be leveraged to provide a solution that doesn't depart from CP/M 2.2 very much?

Preallocating blocks within a drive space and partitioning into lots of disks, for example, is pretty close to hitting all of these. You can store directory information in a fake file with minimal risk and no allocations, and use .COM-based commands to go up and down directories. You can even maintain the same drive name. But you can't access directories via file access if the program isn't aware of it, and you can't use dynamically allocated space due to the requirement for contiguous allocations. A bad-sector directory entry can also deal with bad sectors, which are likely on any reasonably sized disk - these could be marked in the root during the format process and hidden from the directory commands using sparse files.

Also, I am anticipating being able to break out of a program and back into the system command prompt while suspending the program's execution (I should add pre-emptive multitasking too at some point) - so it's possible to change the contents of a logged drive on the fly, which might be needed with fixed-size directories, but it's fraught with danger. Better to just expand directories while disk space remains, for compatibility. Still, it means I could swap disks that the user knows are only data disks.

So my fallback idea is to just partition the disk as a lot of disks and mount them via the OS as required... and avoid directory navigation altogether. I'd like to find a better and more elegant solution if possible.
 
What's wrong with using ZCPR3, where you can name the user groups, such as

That's definitely an option, especially for smaller disks, though I'm not sure how well it would scale to a large-format drive - say, 33MB. It might be a little small with respect to the number of files and directory entries.

There are 32 user numbers - 31 if you consider 0 to be "all" - so that's a limit. I suppose it's not too bad if you consider that 30MB means you get 30 directories, but you forgo any user IDs after that too. It's not too bad a tradeoff, and you could always still use the user IDs and give each user separate directories.

Also, it means there's a lot of directory space to search during file operations, which might be a little slow even on a hard disk. That applies to reads as well as writes.

And use of numbers rather than filenames isn't optimal... Also, you can't copy directories via the common command-line commands - eg, PIP NEW.DIR = OLD.DIR

So I'm exploring options and different thinking here so I can try to find better solutions.

The CP/M structure is pretty good. I find that with a little thinking, I can often find ways to use it that were never originally intended - eg, using the BDOS as my de facto MMU - it works better than great! But I'm certain it would have had a lot of people scratching their heads back in the 80s.
 
Just for laughs I'll note that there's an open-source version of MSX-DOS floating around, it might be interesting to fire it up in an emulator. This version has been modified to use FAT16, and has full subdirectory support while claiming to retain CP/M compatibility.

Granted it looks like trying to port this to a machine *not* structured like an MSX2 in terms of memory paging, etc, would be a real fun time.

 
Just for laughs I'll note that there's an open-source version of MSX-DOS floating around, it might be interesting to fire it up in an emulator. This version has been modified to use FAT16, and has full subdirectory support while claiming to retain CP/M compatibility.

Compatible is a very broad term... But it's quite practical to leave the CP/M format behind and still run much (most?) CP/M software.
 