
Commodore static PET (early dynamic PET) video timings?

3605.7692 / 5.7692 = 622.0909....

It looks like I pressed a wrong button on my calculator.

Doing the sums again, I find that a value of 625 works out to be an almost perfect correction factor!
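For what it's worth, the arithmetic checks out in a few lines of Python (this assumes the 260 x 64 = 16,640 cycle frame time that comes up later in this thread; the variable names are mine):

```python
# Sketch of the jiffy correction arithmetic (assumes the 260 * 64 = 16,640
# microsecond frame time measured later in this thread; names are mine).
frame_us = 260 * 64                       # 16,640 us per frame at 1 MHz
irq_hz = 1_000_000 / frame_us             # ~60.0962 Hz, not exactly 60
gain_per_hour = 3600 * (irq_hz / 60 - 1)  # ~5.7692 s gained per hour
print((3600 + gain_per_hour) / gain_per_hour)  # ~625.0 ticks per correction
# So dropping one jiffy count every 625 interrupts gives exactly 60 Hz:
# (1_000_000 / 16_640) * 624 / 625 == 60.0
```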

Can you advise which ROM locations hold the jiffy clock correction routine? This is something I've been meaning to look into for a while now, as I have a PET with an accurate 60 Hz system interrupt and it has been my plan to modify the ROM to delete the correction routine.

Thanks.
 
Hi all! I am new to this forum but not new to the PET (it was the first computer I ever used). I am currently the person in the VICE emulator team who does most of the PET stuff. And as a result of this discussion I now have a question :)

There is the HI-RES demo from Cursor (we also have it in the VICE testbench: https://sourceforge.net/p/vice-emu/code/HEAD/tree/testprogs/PET/hi-res/ ). More recently there is also the demo "A Bright Shining Star" at https://www.pouet.net/prod.php?which=91735 . Both of these demos use the "racing the beam" technique to modify the screen RAM just before it gets displayed.

For that it depends on the vertical retrace IRQ plus exact timing.

In hardware, that IRQ gets triggered by a signal from the video circuit which goes into a PIA, which is set to create an IRQ when the signal goes low (or high, whatever, I didn't look it up).

In the emulation, which was written for a large part by André, I think, this is modeled by using a "secret" 6545 CRTC which is pre-set to fixed settings and can't be changed (because it isn't present in the memory map). A value called "venable" (vertical enable, I guess, or perhaps video) is cleared when the 26th text line is about to be displayed, i.e. after the 25th line. I'm not sure if that would directly correspond to any particular physical signal from this thread.
In the emulation, the code runs at the "end" of every scan line, but I'd have to look carefully at how that corresponds to a position on the diagram from post #16.

Anyway, after this long sketch of the context: the demos mentioned above run better in emulation if the IRQ is triggered a bit later, by about 16-24 clocks (usec). Is there anything in the schematic, or measurements, which indicates when exactly the IRQ is triggered? Can this explain some 16-24 usec delay compared to something that could reasonably be called "the end of the line"?

Looking at the diagram again, it could for example be that the emulation code considers the right-hand side of the dark-green area "the end of the line". If you add the given time periods, you have the horizontal flyback (12 usec) plus the left border (6.7 usec), which makes 18.7 usec, and then the IRQ could be triggered at the left edge of the text area but just below it. This sounds plausible but is it backed up by the schematics?
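Making the arithmetic of that guess explicit (numbers from the diagram as quoted; a trivial check, nothing more):

```python
flyback_us = 12.0       # horizontal flyback
left_border_us = 6.7    # left border
print(flyback_us + left_border_us)  # 18.7 us, inside the 16-24 usec window
```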
 
Hey, welcome Rhialto!

Long time no talk...

Time to get out the data sheet for the PIA and see what the timing is on that.

Dave
 
In hardware, that IRQ gets triggered by a signal from the video circuit which goes into a PIA, which is set to create an IRQ when the signal goes low (or high, whatever, I didn't look it up).

The IRQ is triggered by a falling edge of the SYNC/VIDEO ON signal. (The PIA can be programmed to interrupt on the rising edge but I don't think any demos do that.)

Anyway, after this long sketch of the context: the demos mentioned above run better in emulation if the IRQ is triggered a bit later, by about 16-24 clocks (usec). Is there anything in the schematic, or measurements, which indicates when exactly the IRQ is triggered? Can this explain some 16-24 usec delay compared to something that could reasonably be called "the end of the line"?


I don't know much about the CRTC but I have simulated the discrete circuit to find the exact timings. I wanted to update my emulator to correctly model this kind of "beam racing" thing which I find fascinating.

From my notes: if the falling edge of SYNC is cycle 0, the rising edge of SYNC is cycle 3840 (cycles here are microseconds). The first read of video RAM is 23 cycles later, at cycle 3863. RAM is read every subsequent cycle for a total of 40 cycles, then there are 24 cycles until the next line's reads start at cycle 3927, etc. (a total of 64 cycles per line). This goes on for a total of 200 lines; the last byte of the last line is read at cycle 16638, and the SYNC signal falls again (triggering another IRQ) two cycles later.
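That schedule is compact enough to check in a few lines of Python (names are mine; cycle equals microsecond at 1 MHz, per the notes above):

```python
# Video RAM read schedule from the notes above (cycle == microsecond at 1 MHz).
CYCLES_PER_LINE = 64
FIRST_FETCH = 3863        # 23 cycles after SYNC rises at cycle 3840

def fetch_cycle(line, column):
    """Cycle at which video RAM is read for (line, column), both 0-based."""
    return FIRST_FETCH + line * CYCLES_PER_LINE + column

assert fetch_cycle(0, 39) == 3902      # last byte of the first line
assert fetch_cycle(1, 0) == 3927       # next line starts 24 idle cycles later
assert fetch_cycle(199, 39) == 16638   # last byte of the frame
# SYNC falls (triggering the IRQ) two cycles later, at 16640 == 260 * 64.
```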

Hope this helps.

--Thomas
 
From my notes: if the falling edge of SYNC is cycle 0, the rising edge of SYNC is cycle 3840 (cycles here are microseconds). The first read of video RAM is 23 cycles later, at cycle 3863. RAM is read every subsequent cycle for a total of 40 cycles, then there are 24 cycles until the next line's reads start at cycle 3927, etc. (a total of 64 cycles per line). This goes on for a total of 200 lines; the last byte of the last line is read at cycle 16638, and the SYNC signal falls again (triggering another IRQ) two cycles later.
So what you're saying is that the IRQ is triggered 2 cycles after the bottom scan line of the last character on the screen has been fetched. I suppose that means that it happens in the 2nd "phantom character in the border". I think that is actually before the current emulation code runs, and so it is already too late with the IRQ. If delaying it more seems to help some demos, then that would be more by accident. So I would need to trigger the IRQ earlier, not later... which is unfortunately slightly more tricky to do (I'd have to set it up at the end of the scan line before).

Edited to add: I changed some code in VICE to trigger the IRQ earlier, but judging from the resulting screen of the hi-res program, it is too early now. I wonder if somebody could run that program on a real pre-CRTC PET, and photograph the screen? It would be interesting to compare it with the file "hi-res.bitmap" from the same directory. That (text!) file represents what the hi-res program *attempts* to display; whether it manages to do that in practice is not yet 100% certain. I do remember from running that thing back in the day that it was somewhat unstable and/or had a lot of display snow.

Oh, and would there be a timing difference between different generations of the pre-CRTC PETs?
 
I'll just throw a curved ball into the ring. It is late in the UK and I am about to go to bed, but on CRTC PETs there are 50 and 60 Hz EDIT ROMs, and two 'modes' - graphics and text - that change the number of video lines per character row. Does this affect the equation?

Also, but this may have been mentioned, there are latches between the video RAM data output and the input of the character generator, and again at the output of the character generator, where it gets latched into the parallel-to-serial shift register to produce the actual stream of pixels on the screen.

Or is this what we are referring to as 'phantom characters'?

Dave
 
It is late here too, but in the first instance I'm interested in the pre-CRTC timings. I'm not sure if there are any similar demo programs for CRTC PETs. The emulation code in VICE uses a different criterion for the vertical blank anyway (vsync starts in the "text line" indicated in register 7, "vertical sync position", which is usually larger than 25, e.g. 29), which may or may not be close to how the CRTC really does it.
 
So what you're saying is that the IRQ is triggered 2 cycles after the bottom scan line of the last character on the screen has been fetched. I suppose that means that it happens in the 2nd "phantom character in the border". I think that is actually before the current emulation code runs, and so it is already too late with the IRQ. If delaying it more seems to help some demos, then that would be more by accident. So I would need to trigger the IRQ earlier, not later... which is unfortunately slightly more tricky to do (I'd have to set it up at the end of the scan line before).
Yes. The data is fetched in the second to the last cycle (16638), run through the ROM, and latched in the shift register. The video data for that last character goes out to the display during the last cycle (this is a simplification, it actually starts going out a few pixel clocks earlier than that) and VIDEO ON falls on the next cycle.
Edited to add: I changed some code in VICE to trigger the IRQ earlier, but judging from the resulting screen of the hi-res program, it is too early now. I wonder if somebody could run that program on a real pre-CRTC PET, and photograph the screen? It would be interesting to compare it with the file "hi-res.bitmap" from the same directory. That (text!) file represents what the hi-res program *attempts* to display; whether it manages to do that in practice is not yet 100% certain. I do remember from running that thing back in the day that it was somewhat unstable and/or had a lot of display snow.

(Photo attached: IMG_8527.JPG)
It does produce a lot of weird "snow". The vertical garbage flickers around the left and right side of the "hires" box but the hires pixels remain surprisingly stable.

I have been trying some "race the beam" experiments on this PET but the "snow" makes it difficult. The only way I've been able to do anything is to turn the screen blanking on (that takes 6 cycles), write some screen data, and then turn the screen blanking back off and make sure it is off when the beam is displaying the data. It's difficult and also means you can't have any static PETSCII graphics on the left or right side of the "hires" stuff.
Oh, and would there be a timing difference between different generations of the pre-CRTC PETs?
I simulated both the static and dynamic board video circuits and they are effectively the same timing-wise though a bit different internally. The dynamic RAM can be read or written twice in one microsecond cycle (thus no "snow").
 

(Attachment: IMG_8516.JPG)
Which version of BASIC have you got?

Dave

***COMMODORE BASIC*** and
###COMMODORE BASIC###.

These are called either BASIC 1 and BASIC 2 or BASIC 2 and BASIC 3 respectively, depending on which authority you're reading.
Jiffy clock issue outlined on page 9 here: pet2001clone.pdf
 
Interesting about the Jiffy pulses. It is essentially a firmware (or software) method to fudge a hardware clock that is running at slightly the wrong speed.

It kind of reminds me of the fudging that once went on (not so much nowadays) with the line power frequency, whether 50 or 60 Hz depending on the country. For a while in history, going back to the 1920s, the commonly available form of "accurate" household clock used a line-operated synchronous motor (typically 1 RPM) to drive the clock mechanism. But the line frequency drifted this way and that, so to keep the clocks accurate in the long run, the power generation companies would make small adjustments over time so that, on average, people's clocks would not consistently gain or lose time. Some people still using these clocks are now running them from crystal-controlled converters, since some power authorities have stopped putting in the corrections.

The whole idea though seems flawed to some extent, in that you make an inaccurate clock and then correct it; not dissimilar to using a GPS clock and dragging a less accurate clock into line while the GPS signal is being received. On the other hand, a mechanical clock (if properly adjusted) tends to run a little fast when fully wound, a little slow when nearly unwound, and about right when half wound, so the errors average out. Taking that to the extreme is the stopped clock, which is exactly right twice per day.
 
Yes. The data is fetched in the second to the last cycle (16638), run through the ROM, and latched in the shift register. The video data for that last character goes out to the display during the last cycle (this is a simplification, it actually starts going out a few pixel clocks earlier than that) and VIDEO ON falls on the next cycle.
ah yes of course, there is some delay between fetching the data from the screen memory and seeing it on the screen... that would affect how to emulate this, if it were cycle-exact, which it isn't.

I've been pondering "when exactly" the emulation renders the scan line, relative to the position of the imaginary video beam. But there isn't really an answer to that, unless it can be related to something visible to a program.

Whereas the hardware accesses the video memory one-by-one, in successive cycles, the emulation in essence just halts time, then accesses all 40 screen locations at once. That has an effect on whether the hi-res program appears correctly or not, depending on how exactly it would do its memory accesses.

What the hi-res program does in its IRQ routine is that it writes to the 10 screen locations in the first text line, to prepare the first scan line. Then it waits a "long time", essentially the whole vertical blank interval. Then, just after the first of the memory locations has been read out to be displayed, the program starts to replace those values with those for the next scan line. It takes 60 cycles to store to 10 memory locations (see the readme.txt at that VICE testbench location, which has a trace including clock cycles), so it has only 4 cycles to spare (2 NOPs). At first, the memory replacement is only just behind the "beam", and as the line goes on it gets more and more behind. Because it only works on a small part of the screen, it's done with its replacements before the "beam" gets back to the first of them. And then it already has to replace the values again, for the next scan line.
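Spelled out as a budget (a sketch; standard 6502 cycle counts assumed, since the readme's trace isn't reproduced here):

```python
# Cycle budget of the replacement loop: 10 stores must fit in one scan line.
LDA_IMM, STA_ABS, NOP = 2, 4, 2    # standard 6502 cycle counts
stores = 10 * (LDA_IMM + STA_ABS)  # 10x "load value, store to screen" = 60
slack = 2 * NOP                    # the 2 spare NOPs = 4 cycles
assert stores + slack == 64        # exactly one 64-cycle scan line
```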
For the emulation to show this correctly, with its instantaneous fetching of the screen memory, it has to do the rendering inside that 4 cycle window where the program is "between lines" as well. But (I hope I'm reasoning correctly here) this would need to be 1 scan line *earlier*, because the program on hardware stores the characters (slightly) ahead of time of being displayed, "behind the beam"; the emulation would then at the end of the line render the line in its already changed form, so the program appears to be "too early". This might explain why making the interrupt *later* rather than *earlier* makes the overall outcome better.
And unfortunately, different racing-the-beam programs would have different memory access patterns, so the instantaneous rendering would likely need to be at a slightly different time... bad news for emulation.
(photo quoted from the previous post: IMG_8527.JPG)
It does produce a lot of weird "snow". The vertical garbage flickers around the left and right side of the "hires" box but the hires pixels remain surprisingly stable.

This is an amazing picture! I have never seen the "snow" so structured! You can even recognize half * characters in there. No doubt the locations of the garbage correspond to the access pattern to the screen memory; the first 3 stores of each line happen closer together than the other ones, and those might be the 3 garbage patches on the right. (They are coded as LDX #, LDY #, LDA #, STX, STY, STA and then a sequence of LDA, STAs; I'm not sure why they did it that way since in the end it isn't faster.) There would be 3 character spaces between them, right? That would correspond to fetching the 3 instruction bytes of the STA $8xxx instruction.
This display partly seems to contradict my ideas about the timing earlier: I was expecting the replacement of the leftmost character to start right after it is displayed. But the code first does LDX #, LDY #, LDA #, which takes 6 cycles, and the STA $8xxx which takes another 3, so that would account for 9 positions on the screen. And there is the fuzziness of the 2-cycle display delay, so all in all it isn't too unlike expectations.

I have been trying some "race the beam" experiments on this PET but the "snow" makes it difficult. The only way I've been able to do anything is to turn the screen blanking on (that takes 6 cycles), write some screen data, and then turn the screen blanking back off and make sure it is off when the beam is displaying the data. It's difficult and also means you can't have any static PETSCII graphics on the left or right side of the "hires" stuff.
Does that Bright Shining Star demo suffer the same problem? If so, I wonder how the maker created it, maybe they just tried emulation or a snow-free model...
I simulated both the static and dynamic board video circuits and they are effectively the same timing-wise though a bit different internally. The dynamic RAM can be read or written twice in one microsecond cycle (thus no "snow").
I found https://github.com/skibo/PetVideoSim from a linked thread; I like that it can model the "snow". I've been wanting to have the snow in VICE too, but I didn't know how to model it in a realistic way.
I looked at it with gtkwave but I'll need to find my way around it. For now it's just too many trees to see the forest, for this software person :)
 
I'm not certain if this is relevant to the discussion: I would have to check on the PET with the scope, but probably there is a similar thing going on with the shift register with respect to the video signal timing:

A while back I built a Light Pen project with a Matrox S-100 video card, which generates the video signal, clocked out of its video RAM, as usual, from a shift register. I noticed there was a 12-pixel-clock delay from the start of a scan line in the active video area before the video data appeared out of the shift register, with respect to the video RAM address timing. I had to correct for this offset; I chose to do it in hardware with a gated 74LS93 counter IC to make a digital delay. It is mentioned on page 3 of this article. I would have to measure this offset in the PET, but this sort of thing is probably endemic to the design of shift-register-generated video, and it would have to be accounted for in any emulation, I would think. It may have already been accounted for in the emulators being discussed, but I know so little about how these work, I am not sure. So I thought to mention it.

 

(Attachment: LPEN.jpg)
ah yes of course, there is some delay between fetching the data from the screen memory and seeing it on the screen... that would affect how to emulate this, if it were cycle-exact, which it isn't.
That delay is only a microsecond so I don't think anybody will notice. :) The important part is knowing in which cycle the video RAM is read, because that determines what goes on the screen.
I've been pondering "when exactly" the emulation renders the scan line, relative to the position of the imaginary video beam. But there isn't really an answer to that, unless it can be related to something visible to a program.

Whereas the hardware accesses the video memory one-by-one, in successive cycles, the emulation in essence just halts time, then accesses all 40 screen locations at once. That has an effect on whether the hi-res program appears correctly or not, depending on how exactly it would do its memory accesses.
This would require the demo or program to do all its writes in the 24 cycles between lines, which isn't a lot of time.
What the hi-res program does in its IRQ routine is that it writes to the 10 screen locations in the first text line, to prepare the first scan line. Then it waits a "long time", essentially the whole vertical blank interval. Then, just after the first of the memory locations has been read out to be displayed, the program starts to replace those values with those for the next scan line. It takes 60 cycles to store to 10 memory locations (see the readme.txt at that VICE testbench location, which has a trace including clock cycles), so it has only 4 cycles to spare (2 NOPs). At first, the memory replacement is only just behind the "beam", and as the line goes on it gets more and more behind. Because it only works on a small part of the screen, it's done with its replacements before the "beam" gets back to the first of them. And then it already has to replace the values again, for the next scan line.
For the emulation to show this correctly, with its instantaneous fetching of the screen memory, it has to do the rendering inside that 4 cycle window where the program is "between lines" as well. But (I hope I'm reasoning correctly here) this would need to be 1 scan line *earlier*, because the program on hardware stores the characters (slightly) ahead of time of being displayed, "behind the beam"; the emulation would then at the end of the line render the line in its already changed form, so the program appears to be "too early". This might explain why making the interrupt *later* rather than *earlier* makes the overall outcome better.
And unfortunately, different racing-the-beam programs would have different memory access patterns, so the instantaneous rendering would likely need to be at a slightly different time... bad news for emulation.
You don't have to render the pixels immediately after the RAM read. You just have to determine what pixels would be created at that point, cache them somewhere, and render the whole line or screen in a reasonable time.
This is an amazing picture! I have never seen the "snow" so structured! You can even recognize half * characters in there. No doubt the locations of the garbage correspond to the access pattern to the screen memory; the first 3 stores of each line happen closer together than the other ones, and those might be the 3 garbage patches on the right. (They are coded as LDX #, LDY #, LDA #, STX, STY, STA and then a sequence of LDA, STAs; I'm not sure why they did it that way since in the end it isn't faster.) There would be 3 character spaces between them, right? That would correspond to fetching the 3 instruction bytes of the STA $8xxx instruction.
This display partly seems to contradict my ideas about the timing earlier: I was expecting the replacement of the leftmost character to start right after it is displayed. But the code first does LDX #, LDY #, LDA #, which takes 6 cycles, and the STA $8xxx which takes another 3, so that would account for 9 positions on the screen. And there is the fuzziness of the 2-cycle display delay, so all in all it isn't too unlike expectations.


Does that Bright Shining Star demo suffer the same problem? If so, I wonder how the maker created it, maybe they just tried emulation or a snow-free model...
I think it was run on a dynamic PET with BASIC 4 ROMs. I recall somebody on the Facebook group trying to run it and getting garbage on the screen.

I found https://github.com/skibo/PetVideoSim from a linked thread; I like that it can model the "snow". I've been wanting to have the snow in VICE too, but I didn't know how to model it in a realistic way.
I tried emulating "snow" by just replacing the video RAM data with the CPU read/write data during the cycle it occurs (the data goes through the character ROM). I think it was accurate, but it appears much harsher than on a real PET, probably because the phosphor on a real PET doesn't light up much if it's only driven for one refresh. I would kind of like to see this in emulators, because I think some people are developing programs on emulators and maybe don't realize they are getting a lot of snow on a lot of older machines.
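Something like the following, I suppose (a minimal sketch of that idea in Python; the function and parameter names are hypothetical, not the actual emulator code):

```python
# Minimal sketch of the snow model: on static-board PETs, if the CPU
# accesses video RAM in the same cycle the video circuit fetches a
# character, the character ROM sees the CPU's bus data instead.
def fetched_char(screen_ram, col_addr, cpu_bus_data):
    # cpu_bus_data: the CPU's data byte if it touched video RAM this
    # cycle, else None (dynamic boards interleave accesses, so no snow).
    if cpu_bus_data is not None:
        return cpu_bus_data          # bus conflict -> one frame of "snow"
    return screen_ram[col_addr]      # undisturbed fetch
```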
I looked at it with gtkwave but I'll need to find my way around it. For now it's just too many trees to see the forest, for this software person :)
Yeah, it's a lot of waveforms to sift through.

--Thomas
 
You don't have to render the pixels immediately after the RAM read. You just have to determine what pixels would be created at that point, cache them somewhere, and render the whole line or screen in a reasonable time.
My current plan is like so (a rough sketch in code follows the list):
- at the start of each scan line (which is the same as at the end of the previous line), pre-fetch all 40 screen memory values
- when the CPU stores to screen memory, after storing it in memory, also determine if it is in the range of the cached values, and if the store is ahead of the beam. If so, the value is also stored into the cache (so the updated value will be displayed); otherwise it isn't (and it is only stored to normal screen memory)
- at the end of the scan line, render from the cache, as modified. Any updates that didn't make it into the cache will be displayed on the next scan line.
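In rough Python-ish pseudocode (hypothetical names, and the 23-cycle fetch offset from earlier in the thread is an assumption; this is not actual VICE code):

```python
# Rough sketch of the per-scan-line cache plan above (not actual VICE code).
COLS = 40
FIRST_FETCH = 23   # cycles into the line before column 0 is fetched (assumed)

class LineCache:
    def start_of_line(self, screen_ram, row_base):
        self.row_base = row_base
        self.cache = list(screen_ram[row_base:row_base + COLS])  # pre-fetch

    def cpu_store(self, screen_ram, addr, value, cycle_in_line):
        screen_ram[addr] = value                    # the normal memory write
        col = addr - self.row_base
        if 0 <= col < COLS and cycle_in_line < FIRST_FETCH + col:
            self.cache[col] = value                 # still ahead of the beam

    def end_of_line(self, render_line):
        render_line(self.cache)   # stores behind the beam show up next line
```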

The idea works, broadly, except I haven't found all the right offsets for "is it ahead of the beam?". So the hi-res characters don't get displayed correctly in the left part.
And because I haven't generalized it enough, I broke all 80 column models in the meantime :)
 
Is the general way to avoid snow not to alter any of the video RAM contents during active display time on the VDU?

Most systems I have seen make all the changes to the video RAM content during the vertical retrace interval. There is a lot of time there, at 1.28 ms, so that the RAM contents are stable prior to the read. I had to do the same thing with the light pen project to make it snow free. You can make a change during the H flyback time too, but given there is only 12 us there, there is not as much time to play with. The VDU's image will only be stable and clean when the RAM content is stable and not changing over the RAM reads that create the displayed video field; that assumes nothing is interrupting the address sequence too. Each subsequent field then shows the new stable state of the RAM's contents, and no snow.

In the light pen project at least, the shift register delay was enough that I noticed it: the pixel received by the pen stopped the pen address counters at an address that did not match the video RAM address, and there was a noticeable displacement on the 9" VDU screen. Even a microsecond or so is significant, as the H beam velocity on the VDU face is very roughly about 4 mm/us on a 9" CRT. So in a light pen project that was a significant displacement. In a PET emulation it is really only about the width of one letter or graphics symbol. But one would assume an emulator should be as close as possible to the real thing. I will try to find out just what this delay is in the PET when I have some more time.
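As a rough sanity check on that velocity figure (my own assumed numbers: about 180 mm of active width on a 9" CRT, swept during the 40-character text window):

```python
active_width_mm = 180   # rough active width of a 9" CRT (my assumption)
sweep_us = 40           # the 40-character text window at 1 us per character
print(active_width_mm / sweep_us)   # ~4.5 mm/us, same ballpark as "about 4"
```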
 
Is the general way to avoid snow not to alter any of the video RAM contents during active display time on the VDU?

This is arguably the main reason why the original PET even has the vertical refresh interrupt; that you can keep time with it is a bonus. (The BASIC screen I/O commands waited for the blanking interval before touching the screen memory.)

That delay is only a microsecond so I don't think anybody will notice. :) The important part is knowing in which cycle the video RAM is read, because that determines what goes on the screen.

It would be expected behavior that the dots coming out on the screen would lag about one position behind the address generator. (I.e., the cycle goes: update character address counter -> address multiplexer set to character address -> latch memory read buffer -> buffer contents interpreted into dots by character ROM -> dots latched into shift register for output. Once the memory read is latched, the first part of the cycle can start over as the second part happens.)

Of course none of this would normally matter because the video output circuitry has no reason to care how much lag or overlapping phases there are in the pipeline before the bits actually make something on the screen glow. But yeah, a light pen would be the exception to that rule since it does actually “close the loop”.
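A toy model of that pipeline lag (my own sketch, not taken from any schematic): each stage's output becomes valid one character time after its input, so the dots on screen trail the address counter by about one position.

```python
# Toy model of the fetch -> char ROM -> shift register pipeline: dots for
# column N appear while the fetch for column N+1 is already under way.
def scan_line_dots(screen_ram, char_rom, columns, row):
    latched = None                        # the memory read buffer
    for col in range(columns):
        if latched is not None:
            yield char_rom[latched][row]  # dots for the *previous* column
        latched = screen_ram[col]         # fetch for the current column
    if latched is not None:
        yield char_rom[latched][row]      # the last column drains the pipe
```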
 
Is the general way to avoid snow not to alter any of the video RAM contents during active display time on the VDU?

Once Commodore advanced the design of the PET to use the faster DRAM chips for the video memory, CPU access to the video RAM was interlaced with the video generator; access alternating on opposite phases of the CPU clock cycle which also happens to be a character shifting interval during which the pixel bit-stream is buffered by an 8-bit shift register.

You might recall this thread: https://forum.vcfed.org/index.php?threads/make-pet-video-100-snow-free.1238801/

-----------------------------------------------------------------------------------------------------------------------------------------------------

Back on topic - does anyone know if there is an authoritative breakdown of the ROM jiffy clock routines out there?

Over and out.
 
Once Commodore advanced the design of the PET to use the faster DRAM chips for the video memory, CPU access to the video RAM was interlaced with the video generator

Minor correction: “DRAM” should be “SRAM”.

(All PETs through the “Universal” models used SRAM for video, but models from the “Dynamic” board onward do the interlaced access thing and therefore lack snow. Ironically I guess if they *had* used main DRAM they probably could have eliminated parts of the refresh circuitry, but with 16k banks they would have had to add another bank or done strange things to the memory map…

FWIW, the rare and exotic CBM/PET 8296 *did* use DRAM for video; it shared a single bank of 64k DRAM chips for both main and video memory. It looks like it pulled this off by effectively running the DRAM twice as fast as the older PETs so it could latch two bytes for video in every off phase of PHI0 instead of one.)
 