The story so far....
In the previous post we developed a routine which could write a green pixel anywhere across the top line of the DHR screen. This was achieved by SIX table lookups. Two to find the MAIN and AUX bytes used by this pixel, two more to MASK out the appropriate bits using the AND operation and finally another two to OR the bit pattern of a green pixel into memory.
Colour me impressed...
Adding the ability to select our pixel's colour may seem like something of a challenge since the pixel colour comes from the MAINGR and AUXGR tables. Obviously we can create more tables. For example, here are MAINMB and AUXMB tables for drawing a medium blue pixel. (From now on I'll refer to these as "colour tables".)

    MAINMB  LUP 20
            DB %00000000,%00000000,%00000110,%01100000
            DB %00000000,%00000001,%00011000
            --^
    AUXMB   LUP 20
            DB %00000011,%00110000,%00000000,%00000000
            DB %00001100,%01000000,%00000000
            --^

But the question remains: how do we select these tables instead of MAINGR and AUXGR? Well, if you remember, there are only two places where the colour table is referenced in our code, and they are
ORA MAINGR,X
and
ORA AUXGR,X
Wouldn't it be great if we could just change these instructions to point to different locations when we feel like it? Well brace yourself because that's exactly what we are going to do!
Self-modifying code
A machine language program is just a stream of bytes, and machine language programs are great at manipulating bytes, so it should come as no surprise that we can write programs that modify themselves. This technique is known as self-modifying code.
The ORA instruction is the one we are interested in. In the absolute,X form we are using, it is represented by the opcode byte $1D, which is then followed by two bytes forming an address. The 6502 stores this address in what is known as little-endian format, meaning that the low byte of the address comes first and the high byte second. So, for example, if AUXGR was located at $6E12 then the CPU would expect to read the $12 before the $6E. When the assembler sees ORA AUXGR,X it turns it into: 1D 12 6E.
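To make the byte order concrete, here is a minimal Python sketch that assembles that one instruction by hand. Python is used only as a modelling tool, and `encode_ora_abs_x` is a hypothetical helper name, not something from Merlin or the 6502 toolchain.

```python
ORA_ABS_X = 0x1D  # opcode for ORA absolute,X


def encode_ora_abs_x(address):
    """Return the three bytes for ORA address,X (hypothetical helper)."""
    low = address & 0xFF           # low byte is stored first (little-endian)
    high = (address >> 8) & 0xFF   # high byte follows
    return bytes([ORA_ABS_X, low, high])


encoded = encode_ora_abs_x(0x6E12)
print(" ".join(f"{b:02X}" for b in encoded))  # 1D 12 6E
```

The two bytes after the opcode are the ones a self-modifying routine needs to overwrite.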
From here, it should be easy to see that we could write a routine to change the address that both our ORA instructions are pointing at. To do so we're going to have to track exactly where these instructions are in memory. This can be done simply by giving the assembler a label for each instruction like so...

    DPLOT   LDY MBOFFSET,X
            BMI AUX
            STA PAGE1
            LDA $2000,Y
            AND MAINAND,X
    ORMAIN  ORA MAINGR,X
            STA $2000,Y
    AUX     LDY ABOFFSET,X
            BMI END
            STA PAGE2
            LDA $2000,Y
            AND AUXAND,X
    ORAUX   ORA AUXGR,X
            STA $2000,Y
    END     RTS

Now we have the labels ORMAIN and ORAUX pointing to where those instructions are!
Next, let's assume we have colour tables for each of the 16 DHR colours with the following names
Colour | Table name(s) |
---|---|
Black | MAINBL/AUXBL |
Magenta | MAINMG/AUXMG |
Brown | MAINBR/AUXBR |
Orange | MAINOR/AUXOR |
Dark Green | MAINDG/AUXDG |
Grey 1 | MAING1/AUXG1 |
Green | MAINGR/AUXGR |
Yellow | MAINYE/AUXYE |
Dark Blue | MAINDB/AUXDB |
Violet | MAINVI/AUXVI |
Grey 2 | MAING2/AUXG2 |
Pink | MAINPI/AUXPI |
Medium Blue | MAINMB/AUXMB |
Light Blue | MAINLB/AUXLB |
Aqua | MAINAQ/AUXAQ |
White | MAINWI/AUXWI |
We will now create four tables containing the following information:
Table name | Description |
---|---|
CLOM | Low byte of the address of all MAIN memory colour tables |
CHIM | High byte of the address of all MAIN memory colour tables |
CLOA | Low byte of the address of all AUX memory colour tables |
CHIA | High byte of the address of all AUX memory colour tables |
Most assemblers have a way of accessing the high byte and low byte of any label defined in your code. In the case of Merlin Pro the operators are < and >. So <MAINGR and >MAINGR refer to the low and high bytes of the MAINGR table respectively. So writing the following:

    CLOM    DB <MAINBL,<MAINMG,<MAINBR,<MAINOR,<MAINDG
            DB <MAING1,<MAINGR,<MAINYE,<MAINDB,<MAINVI
            DB <MAING2,<MAINPI,<MAINMB,<MAINLB,<MAINAQ,<MAINWI
    CHIM    DB >MAINBL,>MAINMG,>MAINBR,>MAINOR,>MAINDG,>MAING1
            DB >MAINGR,>MAINYE,>MAINDB,>MAINVI,>MAING2,>MAINPI
            DB >MAINMB,>MAINLB,>MAINAQ,>MAINWI
    CLOA    DB <AUXBL,<AUXMG,<AUXBR,<AUXOR,<AUXDG
            DB <AUXG1,<AUXGR,<AUXYE,<AUXDB,<AUXVI
            DB <AUXG2,<AUXPI,<AUXMB,<AUXLB,<AUXAQ,<AUXWI
    CHIA    DB >AUXBL,>AUXMG,>AUXBR,>AUXOR,>AUXDG,>AUXG1
            DB >AUXGR,>AUXYE,>AUXDB,>AUXVI,>AUXG2,>AUXPI
            DB >AUXMB,>AUXLB,>AUXAQ,>AUXWI

gives us a table with all the high and low byte addresses of our colour tables. Now all we need is a routine to set the colour table location. Let's call this program SETDCOLOR and expect the programmer to choose the colour by passing it in the accumulator:

    SETDCOLOR TAY
              LDA CLOM,Y
              STA ORMAIN+1
              LDA CHIM,Y
              STA ORMAIN+2
              LDA CLOA,Y
              STA ORAUX+1
              LDA CHIA,Y
              STA ORAUX+2
              RTS

Done. Now each time we call SETDCOLOR it updates our DPLOT routine to point to the appropriate table.
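If you want to convince yourself of what SETDCOLOR does to the bytes of DPLOT, the following Python sketch models it. Every address here is made up for illustration (the real ones are whatever the assembler assigns), and the 64K `memory` array simply stands in for the Apple II's RAM.

```python
ORA_ABS_X = 0x1D

# Hypothetical addresses, chosen only for illustration.
ORMAIN, ORAUX = 0x6030, 0x6043
main_tables = [0x6200 + 0x118 * i for i in range(16)]         # one per colour
aux_tables = [0x6200 + 0x118 * i + 0x8C for i in range(16)]   # one per colour

# The four lookup tables, built the way the < and > operators would build them.
CLOM = [a & 0xFF for a in main_tables]
CHIM = [a >> 8 for a in main_tables]
CLOA = [a & 0xFF for a in aux_tables]
CHIA = [a >> 8 for a in aux_tables]

memory = bytearray(0x10000)

# "Assemble" the two ORA instructions so they point at green (colour 6).
for site, tables in ((ORMAIN, main_tables), (ORAUX, aux_tables)):
    memory[site] = ORA_ABS_X
    memory[site + 1] = tables[6] & 0xFF
    memory[site + 2] = tables[6] >> 8


def setdcolor(colour):
    """Model of SETDCOLOR: patch only the operand bytes, never the opcode."""
    memory[ORMAIN + 1] = CLOM[colour]
    memory[ORMAIN + 2] = CHIM[colour]
    memory[ORAUX + 1] = CLOA[colour]
    memory[ORAUX + 2] = CHIA[colour]


def operand(site):
    """Read back the 16-bit address an ORA instruction currently points at."""
    return memory[site + 1] | (memory[site + 2] << 8)


setdcolor(14)  # aqua
print(hex(operand(ORMAIN)), hex(operand(ORAUX)))
```

After the call, both operands hold the addresses of the aqua tables while the opcode bytes are untouched.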
Vertical take off:
So what's left? Oh right! Our routine is still "imprisoned" on the first line of the hi-res screen at $2000. So how can we change this? Well...can't we use self-modifying code like we just did with the colour table information? Well you could....but....like anything we code, we need to ask the question: What are we assuming about our execution environment?
When we wrote SETDCOLOR we knew we had to change something about the way our program executed to get it to look at the right colour table. No matter what we did it was going to cost us time. Also it's not unreasonable to assume that a plotting program is going to plot a number of points in a single colour before it changes to a different colour.
Can we make the same assumption here? Maybe not. We will have to change these values every time we plot a point. Each time we do we're going to do four table lookups and rewrite four bytes. Is that going to be too much? We can get a sense of this by adding up the time it takes to execute the main part of our SETDCOLOR program:
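Instruction | Cycles |
---|---|
LDA CLOM,Y | 4 (5 if the read crosses a page boundary) |
STA ORMAIN+1 | 4 |
LDA CHIM,Y | 4 (5 if the read crosses a page boundary) |
STA ORMAIN+2 | 4 |
LDA CLOA,Y | 4 (5 if the read crosses a page boundary) |
STA ORAUX+1 | 4 |
LDA CHIA,Y | 4 (5 if the read crosses a page boundary) |
STA ORAUX+2 | 4 |
TOTAL | 32 (36 worst case) |
So every plot is going to cost us at least 32 machine cycles, and up to 36 if every table read crosses a page boundary. Let's compare this to a different method: indirect addressing using the zero page.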
Indirect addressing:
Anyone who has written 6502 assembly should know how to use indirect addressing so this will be a quick refresher. Examine the following code:

    LDA #$00     ;Load the low byte of the address $2000
    STA $1D      ;Store it in $001D
    LDA #$20     ;Load the high byte of the address $2000
    STA $1E      ;Store it in $001E
    LDA #$FF     ;Load the accumulator with 255
    LDY #$00     ;Load Y with 0
    STA ($1D),Y  ;Store the contents of the accumulator at location $2000

As you can see, STA ($1D),Y peeks into the two adjacent memory locations $1D and $1E and sees that they contain the bytes 00 and 20 respectively. It then puts them together to form the address $2000, adds the contents of the Y register, and stores the contents of the accumulator in the resulting memory location.
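For readers who prefer to see the address arithmetic spelled out, here is a small Python model of that one instruction. It is only a sketch: `sta_indirect_y` is a hypothetical function name and the `memory` array stands in for the machine's RAM.

```python
memory = bytearray(0x10000)  # stand-in for the 64K address space


def sta_indirect_y(zp, y, a):
    """Model STA (zp),Y: read a pointer from zp and zp+1, add Y, store A."""
    pointer = memory[zp] | (memory[(zp + 1) & 0xFF] << 8)  # low byte first
    target = (pointer + y) & 0xFFFF
    memory[target] = a
    return target


memory[0x1D] = 0x00   # low byte of $2000
memory[0x1E] = 0x20   # high byte of $2000
target = sta_indirect_y(0x1D, 0x00, 0xFF)
print(hex(target), hex(memory[target]))
```

Changing the two zero page bytes redirects every later `($1D),Y` access, which is exactly the property we want for choosing a screen row.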
But wait! Is that really any faster? Actually yes! First, we only have to read and write two bytes instead of four. Second, the writes to the zero page only take three cycles instead of four. The only drawback is that each access through ($1D),Y takes one cycle longer than the same access to $2000,Y. However, even with that we still come out ahead.
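Here is a rough tally of that comparison in Python. It is a sketch under stated assumptions: it uses the base 6502 cycle counts, where an indexed absolute read takes 4 cycles plus 1 if it crosses a page boundary (counting every table read at 5 cycles is what gives the worst-case total of 36), and it assumes DPLOT touches the screen with two loads and two stores.

```python
# Base 6502 cycle counts; indexed reads cost one more when they cross a page.
CYCLES = {
    "LDA abs,Y": 4,
    "STA abs": 4,
    "STA zp": 3,
    "STA abs,Y": 5,
    "LDA (zp),Y": 5,
    "STA (zp),Y": 6,
}


def total(instructions):
    """Sum the base cycle counts for a list of instruction forms."""
    return sum(CYCLES[name] for name in instructions)


# Four lookups and four patched operand bytes, as in SETDCOLOR.
self_modifying_setup = total(["LDA abs,Y", "STA abs"] * 4)

# Two lookups and two zero page stores to set up the row pointer.
zero_page_setup = total(["LDA abs,Y", "STA zp"] * 2)

# Extra cost of going through the pointer for the MAIN and AUX accesses.
absolute_access = total(["LDA abs,Y", "STA abs,Y"] * 2)
indirect_access = total(["LDA (zp),Y", "STA (zp),Y"] * 2)
indirect_penalty = indirect_access - absolute_access

zero_page_cost = zero_page_setup + indirect_penalty
print(self_modifying_setup, zero_page_cost)  # 32 18
```

Even after paying the penalty on every screen access, the zero page version costs roughly half as much per plot.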
The question remains: how do we implement this? You know the answer! That's right! Another table! This time it represents the high and low bytes of the beginning of each screen row. There are 192 screen lines, which is a lot of data, so for now I'll just give you the table for the first eight lines, which if you recall are $400 bytes apart.

    HTAB_LO DB $00,$00,$00,$00,$00,$00,$00,$00
    HTAB_HI DB $20,$24,$28,$2C,$30,$34,$38,$3C

I've called them HTAB_LO and HTAB_HI for the horizontal low-byte table and horizontal high-byte table respectively.
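Typing 192 entries by hand is error prone, so you may prefer to generate them. The Python sketch below assumes the usual Apple II hi-res row interleave ($400 between rows within a group of eight, $80 between groups, $28 between the three screen thirds); `row_address` and `as_db_lines` are hypothetical helper names. Its output agrees with the eight entries above and with the full tables in the listing further down.

```python
def row_address(y, base=0x2000):
    """Start address of screen row y (0-191) under the assumed interleave."""
    return base + (y % 8) * 0x400 + ((y // 8) % 8) * 0x80 + (y // 64) * 0x28


HTAB_LO = [row_address(y) & 0xFF for y in range(192)]
HTAB_HI = [row_address(y) >> 8 for y in range(192)]


def as_db_lines(values, per_line=8):
    """Format a list of bytes as Merlin-style DB lines."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append(" DB " + ",".join(f"${v:02X}" for v in chunk))
    return lines


print("\n".join(as_db_lines(HTAB_LO[:16])))
print("\n".join(as_db_lines(HTAB_HI[:16])))
```

The printed lines can be pasted under the HTAB_LO and HTAB_HI labels; extend the slices to cover all 192 rows.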
Now just as our program expects the horizontal co-ordinate in the X register we'll modify our code to expect the vertical co-ordinate in the Y register. To accomplish that we add the following to the beginning of our program:

    LDA HTAB_LO,Y
    STA $1D
    LDA HTAB_HI,Y
    STA $1E

Then we make one final change to the main part of our program that does the plotting. We substitute the places where we wrote:
LDA $2000,Y
with
LDA ($1D),Y
and likewise each STA $2000,Y becomes STA ($1D),Y.
And believe it or not we're done! The following is a fully commented routine to draw a pixel of any colour anywhere on the DHR page. The listing also includes our SETDCOLOR routine:

              XC                  ;Required for Merlin Pro to use 65C02 instructions
              ORG $6000           ;Start assembling at memory location $6000

    GRAPHON   EQU $C050
    HIRESON   EQU $C057
    FULLON    EQU $C052
    DHRON     EQU $C05E
    ON80STOR  EQU $C001
    ON80COL   EQU $C00D
    PAGE1     EQU $C054
    PAGE2     EQU $C055
    SCRN_LO   EQU $1D             ;Zero page location for low byte of our screen row
    SCRN_HI   EQU $1E             ;Zero page location for high byte of our screen row

    INIT      STA GRAPHON         ;Turn on graphics
              STA HIRESON         ;Turn on hi-res mode
              STA FULLON          ;Turn on fullscreen mode
              STA DHRON           ;Turn on Double hi-res
              STA ON80COL         ;Turn on 80 Column mode
              STA ON80STOR        ;Use PAGE1/PAGE2 to switch between MAIN and AUX memory
              LDA #$0E            ;Set colour to 14 = Aqua
              JSR SETDCOLOR
              LDX #00             ;Set column to 0
              LDY #00             ;Set row to 0

    DPLOT     LDA HTAB_LO,Y       ;Find the low byte of the row address
              STA SCRN_LO
              LDA HTAB_HI,Y       ;Find the high byte of the row address
              STA SCRN_HI
              LDY MBOFFSET,X      ;Find what byte if any in MAIN we are working in
              BMI AUX             ;If pixel has no bits in MAIN memory - go to aux routine
              STA PAGE1           ;Map $2000 to MAIN memory
              LDA (SCRN_LO),Y     ;Load screen data
              AND MAINAND,X       ;Erase pixel bits
    ORMAIN    ORA MAINGR,X        ;Draw coloured bits
              STA (SCRN_LO),Y     ;Write back to screen
    AUX       LDY ABOFFSET,X      ;Find what byte if any in AUX we are working in
              BMI END             ;If no part of the pixel is in AUX - end the program
              STA PAGE2           ;Map $2000 to AUX memory
              LDA (SCRN_LO),Y     ;Load screen data
              AND AUXAND,X        ;Erase pixel bits
    ORAUX     ORA AUXGR,X         ;Draw coloured bits
              STA (SCRN_LO),Y     ;Write back to screen
    END       RTS

    SETDCOLOR TAY                 ;Assume the desired colour is in the accumulator
              LDA CLOM,Y          ;Lookup low byte of MAIN memory colour table
              STA ORMAIN+1        ;Update the ORA instruction
              LDA CHIM,Y          ;Lookup high byte of MAIN memory colour table
              STA ORMAIN+2        ;Update the ORA instruction
              LDA CLOA,Y          ;Lookup low byte of AUX memory colour table
              STA ORAUX+1         ;Update the ORA instruction
              LDA CHIA,Y          ;Lookup high byte of AUX memory colour table
              STA ORAUX+2         ;Update the ORA instruction
              RTS

    MBOFFSET  DB 255,0,0,0,255,1,1,255,2,2,2,255,3,3
              DB 255,4,4,4,255,5,5,255,6,6,6,255,7,7
              DB 255,8,8,8,255,9,9,255,10,10,10,255,11,11
              DB 255,12,12,12,255,13,13,255,14,14,14,255,15,15
              DB 255,16,16,16,255,17,17,255,18,18,18,255,19,19
              DB 255,20,20,20,255,21,21,255,22,22,22,255,23,23
              DB 255,24,24,24,255,25,25,255,26,26,26,255,27,27
              DB 255,28,28,28,255,29,29,255,30,30,30,255,31,31
              DB 255,32,32,32,255,33,33,255,34,34,34,255,35,35
              DB 255,36,36,36,255,37,37,255,38,38,38,255,39,39

    ABOFFSET  DB 0,0,255,1,1,1,255,2,2,255,3,3,3,255
              DB 4,4,255,5,5,5,255,6,6,255,7,7,7,255
              DB 8,8,255,9,9,9,255,10,10,255,11,11,11,255
              DB 12,12,255,13,13,13,255,14,14,255,15,15,15,255
              DB 16,16,255,17,17,17,255,18,18,255,19,19,19,255
              DB 20,20,255,21,21,21,255,22,22,255,23,23,23,255
              DB 24,24,255,25,25,25,255,26,26,255,27,27,27,255
              DB 28,28,255,29,29,29,255,30,30,255,31,31,31,255
              DB 32,32,255,33,33,33,255,34,34,255,35,35,35,255
              DB 36,36,255,37,37,37,255,38,38,255,39,39,39,255

    MAINAND   LUP 20
              DB %01111111,%01111110,%01100001,%00011111
              DB %01111111,%01111000,%00000111
              --^
    AUXAND    LUP 20
              DB %01110000,%00001111,%01111111,%01111100
              DB %01000011,%00111111,%01111111
              --^

    MAINBL    LUP 20
              DB %00000000,%00000000,%00000000,%00000000
              DB %00000000,%00000000,%00000000
              --^
    AUXBL     LUP 20
              DB %00000000,%00000000,%00000000,%00000000
              DB %00000000,%00000000,%00000000
              --^
    MAINMG    LUP 20
              DB %00000000,%00000001,%00010000,%00000000
              DB %00000000,%00000100,%01000000
              --^
    AUXMG     LUP 20
              DB %00001000,%00000000,%00000000,%00000010
              DB %00100000,%00000000,%00000000
              --^
    MAINBR    LUP 20
              DB %00000000,%00000000,%00001000,%00000000
              DB %00000000,%00000010,%00100000
              --^
    AUXBR     LUP 20
              DB %00000100,%01000000,%00000000,%00000001
              DB %00010000,%00000000,%00000000
              --^
    MAINOR    LUP 20
              DB %00000000,%00000001,%00011000,%00000000
              DB %00000000,%00000110,%01100000
              --^
    AUXOR     LUP 20
              DB %00001100,%01000000,%00000000,%00000011
              DB %00110000,%00000000,%00000000
              --^
    MAINDG    LUP 20
              DB %00000000,%00000000,%00000100,%01000000
              DB %00000000,%00000001,%00010000
              --^
    AUXDG     LUP 20
              DB %00000010,%00100000,%00000000,%00000000
              DB %00001000,%00000000,%00000000
              --^
    MAING1    LUP 20
              DB %00000000,%00000001,%00010100,%01000000
              DB %00000000,%00000101,%01010000
              --^
    AUXG1     LUP 20
              DB %00001010,%00100000,%00000000,%00000010
              DB %00101000,%00000000,%00000000
              --^
    MAINGR    LUP 20
              DB %00000000,%00000000,%00001100,%01000000
              DB %00000000,%00000011,%00110000
              --^
    AUXGR     LUP 20
              DB %00000110,%01100000,%00000000,%00000001
              DB %00011000,%00000000,%00000000
              --^
    MAINYE    LUP 20
              DB %00000000,%00000001,%00011100,%01000000
              DB %00000000,%00000111,%01110000
              --^
    AUXYE     LUP 20
              DB %00001110,%01100000,%00000000,%00000011
              DB %00111000,%00000000,%00000000
              --^
    MAINDB    LUP 20
              DB %00000000,%00000000,%00000010,%00100000
              DB %00000000,%00000000,%00001000
              --^
    AUXDB     LUP 20
              DB %00000001,%00010000,%00000000,%00000000
              DB %00000100,%01000000,%00000000
              --^
    MAINVI    LUP 20
              DB %00000000,%00000001,%00010010,%00100000
              DB %00000000,%00000100,%01001000
              --^
    AUXVI     LUP 20
              DB %00001001,%00010000,%00000000,%00000010
              DB %00100100,%01000000,%00000000
              --^
    MAING2    LUP 20
              DB %00000000,%00000000,%00001010,%00100000
              DB %00000000,%00000010,%00101000
              --^
    AUXG2     LUP 20
              DB %00000101,%01010000,%00000000,%00000001
              DB %00010100,%01000000,%00000000
              --^
    MAINPI    LUP 20
              DB %00000000,%00000001,%00011010,%00100000
              DB %00000000,%00000110,%01101000
              --^
    AUXPI     LUP 20
              DB %00001101,%01010000,%00000000,%00000011
              DB %00110100,%01000000,%00000000
              --^
    MAINMB    LUP 20
              DB %00000000,%00000000,%00000110,%01100000
              DB %00000000,%00000001,%00011000
              --^
    AUXMB     LUP 20
              DB %00000011,%00110000,%00000000,%00000000
              DB %00001100,%01000000,%00000000
              --^
    MAINLB    LUP 20
              DB %00000000,%00000001,%00010110,%01100000
              DB %00000000,%00000101,%01011000
              --^
    AUXLB     LUP 20
              DB %00001011,%00110000,%00000000,%00000010
              DB %00101100,%01000000,%00000000
              --^
    MAINAQ    LUP 20
              DB %00000000,%00000000,%00001110,%01100000
              DB %00000000,%00000011,%00111000
              --^
    AUXAQ     LUP 20
              DB %00000111,%01110000,%00000000,%00000001
              DB %00011100,%01000000,%00000000
              --^
    MAINWI    LUP 20
              DB %00000000,%00000001,%00011110,%01100000
              DB %00000000,%00000111,%01111000
              --^
    AUXWI     LUP 20
              DB %00001111,%01110000,%00000000,%00000011
              DB %00111100,%01000000,%00000000
              --^

    CLOM      DB <MAINBL,<MAINMG,<MAINBR,<MAINOR,<MAINDG
              DB <MAING1,<MAINGR,<MAINYE,<MAINDB,<MAINVI
              DB <MAING2,<MAINPI,<MAINMB,<MAINLB,<MAINAQ,<MAINWI
    CHIM      DB >MAINBL,>MAINMG,>MAINBR,>MAINOR,>MAINDG,>MAING1
              DB >MAINGR,>MAINYE,>MAINDB,>MAINVI,>MAING2,>MAINPI
              DB >MAINMB,>MAINLB,>MAINAQ,>MAINWI
    CLOA      DB <AUXBL,<AUXMG,<AUXBR,<AUXOR,<AUXDG
              DB <AUXG1,<AUXGR,<AUXYE,<AUXDB,<AUXVI
              DB <AUXG2,<AUXPI,<AUXMB,<AUXLB,<AUXAQ,<AUXWI
    CHIA      DB >AUXBL,>AUXMG,>AUXBR,>AUXOR,>AUXDG,>AUXG1
              DB >AUXGR,>AUXYE,>AUXDB,>AUXVI,>AUXG2,>AUXPI
              DB >AUXMB,>AUXLB,>AUXAQ,>AUXWI

    HTAB_LO   LUP 4
              DB $00,$00,$00,$00,$00,$00,$00,$00
              DB $80,$80,$80,$80,$80,$80,$80,$80
              --^
              LUP 4
              DB $28,$28,$28,$28,$28,$28,$28,$28
              DB $A8,$A8,$A8,$A8,$A8,$A8,$A8,$A8
              --^
              LUP 4
              DB $50,$50,$50,$50,$50,$50,$50,$50
              DB $D0,$D0,$D0,$D0,$D0,$D0,$D0,$D0
              --^
    HTAB_HI   LUP 3
              DB $20,$24,$28,$2C,$30,$34,$38,$3C
              DB $20,$24,$28,$2C,$30,$34,$38,$3C
              DB $21,$25,$29,$2D,$31,$35,$39,$3D
              DB $21,$25,$29,$2D,$31,$35,$39,$3D
              DB $22,$26,$2A,$2E,$32,$36,$3A,$3E
              DB $22,$26,$2A,$2E,$32,$36,$3A,$3E
              DB $23,$27,$2B,$2F,$33,$37,$3B,$3F
              DB $23,$27,$2B,$2F,$33,$37,$3B,$3F
              --^

This assembles to about 5K of code - mostly tables. It's worth pointing out that there are many ways to cut this program down in terms of size. For example, since all the colour tables and mask tables repeat in cycles of seven, we could take the value in the X register modulo 7 before doing our lookup. This would save us a whopping 2K. It would also significantly increase the number of cycles we spend plotting each pixel.
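The repeat-every-seven property is easy to check off the machine. The Python sketch below rebuilds the two AND mask tables from an assumed bit layout (four bits per pixel, packed into a stream of 7-bit bytes that alternates AUX, MAIN, AUX, MAIN...) and confirms that a 7-entry table indexed by X modulo 7 gives the same answers as the full 140-entry one. The function and variable names are hypothetical.

```python
def build_and_masks(pixels=140):
    """Rebuild the MAINAND/AUXAND tables from the assumed DHR bit layout."""
    main_and, aux_and = [], []
    for x in range(pixels):
        main_mask = aux_mask = 0x7F
        for bit in range(4 * x, 4 * x + 4):
            stream_byte, bit_in_byte = divmod(bit, 7)
            if stream_byte % 2 == 0:      # even stream bytes live in AUX
                aux_mask &= ~(1 << bit_in_byte)
            else:                         # odd stream bytes live in MAIN
                main_mask &= ~(1 << bit_in_byte)
        main_and.append(main_mask)
        aux_and.append(aux_mask)
    return main_and, aux_and


MAINAND, AUXAND = build_and_masks()

# The compact versions: seven entries each instead of 140.
MAINAND7, AUXAND7 = MAINAND[:7], AUXAND[:7]

repeats = all(
    MAINAND[x] == MAINAND7[x % 7] and AUXAND[x] == AUXAND7[x % 7]
    for x in range(140)
)
print(repeats)
print([f"%{m:08b}" for m in MAINAND7])
print([f"%{m:08b}" for m in AUXAND7])
```

The catch on real hardware is that the 6502 has no divide instruction, so the modulo has to come from somewhere else, such as a subtraction loop or yet another lookup table, and that is where the extra cycles go.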
There are even a couple of ways to make our code plot faster, mostly using approaches that are less easy to explain. I may cover some of these speed/size optimizations in a later article but for now this code fits our two primary goals:
- sufficient speed to be used in an action game
- sufficiently clear so that the reader can go on and build out their own enhancements.
Next up, we learn about some code management and put this bad boy to work with some demo code...
2 comments:
It's really nice that you took the huge time and effort to produce this blog on DHR.
I have learnt quite a lot from it!
I have a question though: to write to AUX memory is a simple matter of switching page.
BUT, how to read from AUX memory? Does it work also by switching page?
In my application I want to Get as well as Set pixel colors.
Cheers,
Tony
Hi Tony,
In order to set individual pixels I have to read as well as write to AUX memory. So if you have the 80STOREON softswitch set ($C001) you can use the PAGE1/PAGE2 ($C054/$C055) softswitches to access AUX memory for writing OR reading.
However, this has a limitation. Since you are already using the PAGE1/PAGE2 softswitches you can't flip the page, which means that with 80STOREON you can only use DHGR page 1.
If you need BOTH DHGR pages then it gets tricky. I discuss the difficulties in part three: http://www.battlestations.zone/2017/04/apple-ii-double-hi-res-from-ground-up_70.html