The story so far....
In the previous post we developed a routine which could write a green pixel anywhere across the top line of the DHR screen. This was achieved by SIX table lookups. Two to find the MAIN and AUX bytes used by this pixel, two more to MASK out the appropriate bits using the AND operation and finally another two to OR the bit pattern of a green pixel into memory.
Colour me impressed...
Adding the ability to select our pixel's colour may seem like something of a challenge since the pixel colour comes from the MAINGR and AUXGR tables. Obviously we can create more tables. For example here are MAINMB and AUXMB tables for drawing a medium blue pixel. (From now on I'll refer to these as "colour tables".)
MAINMB
LUP 20
DB %00000000,%00000000,%00000110,%01100000
DB %00000000,%00000001,%00011000
--^
AUXMB
LUP 20
DB %00000011,%00110000,%00000000,%00000000
DB %00001100,%01000000,%00000000
--^
But the question remains: How do we select these tables instead of MAINGR and AUXGR? Well if you remember there are only two places where the colour table is referenced in our code and they are
ORA MAINGR,X
and
ORA AUXGR,X
Wouldn't it be great if we could just change these instructions to point to different locations when we feel like it? Well brace yourself because that's exactly what we are going to do!
Self-modifying code
Since a machine language program is just a stream of bytes, and machine language programs are great at manipulating bytes, it should come as no surprise that we can write programs that modify themselves. This technique is known as self-modifying code.
The ORA instruction is the one we are interested in. In the absolute,X addressing mode used here it is represented by the byte 1D, which is then followed by two bytes forming an address. The 6502 stores this address in what is known as little-endian format, meaning that the low byte of the address is stored first. So for example if AUXGR was located at $6E12 then the CPU would expect to read the 12 before the 6E. So when the assembler sees ORA AUXGR,X it turns it into: 1D 12 6E.
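To make the byte layout concrete, here is a small Python sketch, used purely as a host-side scratchpad rather than anything that runs on the Apple II. The function name `encode_ora_abs_x` is made up for this illustration, and the address $6E12 is the same hypothetical location for AUXGR used in the example above.

```python
def encode_ora_abs_x(address):
    """Return the three bytes of ORA address,X (absolute,X addressing)."""
    opcode = 0x1D                    # ORA absolute,X
    low = address & 0xFF             # low byte comes first (little-endian)
    high = (address >> 8) & 0xFF     # high byte comes second
    return bytes([opcode, low, high])

encoded = encode_ora_abs_x(0x6E12)
print(encoded.hex(" ").upper())      # 1D 12 6E
```

The two operand bytes are exactly the ones we will overwrite later to point the instruction at a different table.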
From here, it should be easy to see that we could write a routine to change the address that both our ORA instructions are pointing at. To do so we're going to have to track exactly where these instructions are in memory. This can be done simply by giving the assembler a label for each instruction like so...
DPLOT LDY MBOFFSET,X
BMI AUX
STA PAGE1
LDA $2000,Y
AND MAINAND,X
ORMAIN ORA MAINGR,X
STA $2000,Y
AUX LDY ABOFFSET,X
BMI END
STA PAGE2
LDA $2000,Y
AND AUXAND,X
ORAUX ORA AUXGR,X
STA $2000,Y
END RTS
Now we have the labels ORMAIN and ORAUX pointing to where those instructions are!
Next, let's assume we have colour tables for each of the 16 DHR colours with the following names:
Colour | Table name(s) |
---|---|
Black | MAINBL/AUXBL |
Magenta | MAINMG/AUXMG |
Brown | MAINBR/AUXBR |
Orange | MAINOR/AUXOR |
Dark Green | MAINDG/AUXDG |
Grey 1 | MAING1/AUXG1 |
Green | MAINGR/AUXGR |
Yellow | MAINYE/AUXYE |
Dark Blue | MAINDB/AUXDB |
Violet | MAINVI/AUXVI |
Grey 2 | MAING2/AUXG2 |
Pink | MAINPI/AUXPI |
Medium Blue | MAINMB/AUXMB |
Light Blue | MAINLB/AUXLB |
Aqua | MAINAQ/AUXAQ |
White | MAINWI/AUXWI |
We will now create four tables containing the following information:
Table name | Description |
---|---|
CLOM | Low byte of the address of all MAIN memory colour tables |
CHIM | High byte of the address of all MAIN memory colour tables |
CLOA | Low byte of the address of all AUX memory colour tables |
CHIA | High byte of the address of all AUX memory colour tables |
Most assemblers have a way of accessing the high byte and low byte of any label defined in your code. In the case of Merlin Pro the operators are < and >. So <MAINGR and >MAINGR refer to the low and high bytes of the MAINGR table respectively. So writing the following:
CLOM DB <MAINBL,<MAINMG,<MAINBR,<MAINOR,<MAINDG
DB <MAING1,<MAINGR,<MAINYE,<MAINDB,<MAINVI
DB <MAING2,<MAINPI,<MAINMB,<MAINLB,<MAINAQ,<MAINWI
CHIM DB >MAINBL,>MAINMG,>MAINBR,>MAINOR,>MAINDG,>MAING1
DB >MAINGR,>MAINYE,>MAINDB,>MAINVI,>MAING2,>MAINPI
DB >MAINMB,>MAINLB,>MAINAQ,>MAINWI
CLOA DB <AUXBL,<AUXMG,<AUXBR,<AUXOR,<AUXDG
DB <AUXG1,<AUXGR,<AUXYE,<AUXDB,<AUXVI
DB <AUXG2,<AUXPI,<AUXMB,<AUXLB,<AUXAQ,<AUXWI
CHIA DB >AUXBL,>AUXMG,>AUXBR,>AUXOR,>AUXDG,>AUXG1
DB >AUXGR,>AUXYE,>AUXDB,>AUXVI,>AUXG2,>AUXPI
DB >AUXMB,>AUXLB,>AUXAQ,>AUXWI
Gives us a table with all the high and low byte addresses of our colour tables. Now all we need is a routine to set the colour table location: Let's call this program SETDCOLOR and expect the programmer to choose the colour by passing it in the accumulator:
SETDCOLOR TAY
LDA CLOM,Y
STA ORMAIN+1
LDA CHIM,Y
STA ORMAIN+2
LDA CLOA,Y
STA ORAUX+1
LDA CHIA,Y
STA ORAUX+2
RTS
Done. Now each time we call SETDCOLOR it updates our DPLOT routine to point to the appropriate table.
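If it helps to see the patching happen outside the 6502, the following Python sketch simulates what SETDCOLOR does to memory. Every address in it (the locations of ORMAIN, ORAUX and the colour tables) is invented for the illustration; the real values depend on where the assembler places things.

```python
# Hypothetical addresses, chosen only for this illustration
ORMAIN = 0x6030                       # address of the ORA MAINGR,X instruction
ORAUX = 0x6040                        # address of the ORA AUXGR,X instruction
MAIN_TABLES = [0x6100 + 140 * i for i in range(16)]   # 16 MAIN colour tables
AUX_TABLES = [0x7000 + 140 * i for i in range(16)]    # 16 AUX colour tables

# Equivalent of the CLOM/CHIM/CLOA/CHIA tables built with < and >
CLOM = [a & 0xFF for a in MAIN_TABLES]
CHIM = [a >> 8 for a in MAIN_TABLES]
CLOA = [a & 0xFF for a in AUX_TABLES]
CHIA = [a >> 8 for a in AUX_TABLES]

memory = bytearray(0x10000)           # 64K of simulated RAM
memory[ORMAIN] = 0x1D                 # opcode for ORA absolute,X
memory[ORAUX] = 0x1D

def set_dcolor(colour):
    """Patch the operand bytes of both ORA instructions, like SETDCOLOR."""
    memory[ORMAIN + 1] = CLOM[colour]
    memory[ORMAIN + 2] = CHIM[colour]
    memory[ORAUX + 1] = CLOA[colour]
    memory[ORAUX + 2] = CHIA[colour]

def operand(instruction_address):
    """Read back the 16-bit little-endian operand of a 3-byte instruction."""
    low = memory[instruction_address + 1]
    high = memory[instruction_address + 2]
    return low | (high << 8)

set_dcolor(14)                        # 14 = Aqua
print(hex(operand(ORMAIN)), hex(operand(ORAUX)))
```

After the call, both ORA instructions carry the address of the Aqua tables, while the opcode byte itself is left untouched.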
Vertical take off:
So what's left? Oh right! Our routine is still "imprisoned" on the first line of the hi-res screen at $2000. So how can we change this? Well...can't we use self-modifying code like we just did with the colour table information? Well you could....but....like anything we code we need to ask the question: What are we assuming about our execution environment?
When we wrote SETDCOLOR we knew we had to change something about the way our program executed to get it to look at the right colour table. No matter what we did it was going to cost us time. Also it's not unreasonable to assume that a plotting program is going to plot a number of points in a single colour before it changes to a different colour.
Can we make the same assumption here? Maybe not. We will have to change these values every time we plot a point. Each time we do we're going to do four table lookups and rewrite four bytes. Is that going to be too much? We can get a sense of this by adding up the time it takes to execute the main part of our SETDCOLOR program:
Instruction | Cycles |
---|---|
LDA CLOM,Y | 5 |
STA ORMAIN+1 | 4 |
LDA CHIM,Y | 5 |
STA ORMAIN+2 | 4 |
LDA CLOA,Y | 5 |
STA ORAUX+1 | 4 |
LDA CHIA,Y | 5 |
STA ORAUX+2 | 4 |
TOTAL | 36 |
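So every plot it's going to cost us 36 machine cycles. (The table counts each indexed LDA at its worst case of 5 cycles; it takes 4 when the lookup doesn't cross a page boundary.) Let's compare this to a different method: indirect addressing using the zero page.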
Indirect addressing:
Anyone who has written 6502 assembly should know how to use indirect addressing so this will be a quick refresher. Examine the following code:
LDA #$00 ;Load the low byte of the address $2000
STA $1D ;Store it in $001D
LDA #$20 ;Load the high byte of the address $2000
STA $1E ;Store it in $001E
LDA #$FF ;Load the accumulator with 255
LDY #$00 ;Load Y with 0
STA ($1D),Y ;Store the contents of the accumulator at location $2000
As you can see STA ($1D),Y peeks into the two adjacent memory locations $1D and $1E and sees they contain the bytes 00 and 20 respectively. It then puts them together to form the address $2000, adds the contents of the Y register (here 0), and stores the contents of the accumulator in that memory location.
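For readers who prefer to see the address arithmetic spelled out, here is a rough Python model of what STA ($1D),Y does. The `memory` array and the helper name `sta_indirect_y` are inventions for this sketch, not part of the plotting code.

```python
memory = bytearray(0x10000)           # 64K of simulated RAM

def sta_indirect_y(zp, y, a):
    """Simulate STA (zp),Y: store a at the pointer held in zp/zp+1, plus Y."""
    base = memory[zp] | (memory[(zp + 1) & 0xFF] << 8)   # little-endian pointer
    target = (base + y) & 0xFFFF
    memory[target] = a
    return target

memory[0x1D] = 0x00                   # low byte of $2000
memory[0x1E] = 0x20                   # high byte of $2000
print(hex(sta_indirect_y(0x1D, 0x00, 0xFF)))   # 0x2000
print(hex(sta_indirect_y(0x1D, 0x05, 0xFF)))   # 0x2005
```

Changing the two zero-page bytes moves the base address, and Y then selects a byte relative to that base.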
But wait! Is that really any faster? Actually yes! First, we only have to read and write two bytes instead of four. Second, the writes to the zero page only take three cycles instead of four. The only drawback is that STA ($1D),Y takes one cycle longer than STA $2000,Y. However even with that we still come out ahead.
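As a back-of-the-envelope check, the sums can be written out in a few lines of Python. The cycle counts are the standard documented 6502 timings, with the indexed load taken at the 5-cycle worst case used in the table above. The count of four indirect accesses per plot is an assumption: it covers a pixel that touches both MAIN and AUX (one load and one store in each).

```python
# Standard 6502 timings (worst case for the indexed load)
LDA_ABS_Y = 5            # 4 cycles, +1 if a page boundary is crossed
STA_ABS = 4              # store to an absolute address
STA_ZP = 3               # store to a zero-page address
EXTRA_PER_INDIRECT = 1   # (zp),Y costs one cycle more than absolute,Y

# Self-modifying approach: four lookups, four absolute stores
self_modifying = 4 * LDA_ABS_Y + 4 * STA_ABS

# Zero-page approach: two lookups, two zero-page stores, plus the penalty
# on each of the (at most) four indirect accesses in the plot routine
zero_page = 2 * LDA_ABS_Y + 2 * STA_ZP + 4 * EXTRA_PER_INDIRECT

print(self_modifying, zero_page)     # 36 20
```

Even with the penalty counted on every access, the zero-page version comes out well under the self-modifying one.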
The question remains: How do we implement this? You know the answer! That's right! Another table! This time representing the high and low bytes of the beginning of each screen row. There are 192 screen lines, which is a lot of data, so for now I'll just give you the table for the first eight lines, which if you recall are $400 bytes apart.
HTAB_LO DB $00,$00,$00,$00,$00,$00,$00,$00
HTAB_HI DB $20,$24,$28,$2C,$30,$34,$38,$3C
I've called them HTAB_LO and HTAB_HI for the horizontal low-byte table and horizontal high-byte table respectively.
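Typing 192 entries by hand is error-prone, so one option is to generate them. The Python sketch below uses the usual Apple II hi-res row formula, which this post does not spell out, so treat it as an assumption you can check against the full listing. The function name `row_address` is made up for the example.

```python
def row_address(y):
    """Address of the first byte of hi-res row y (0-191) on the page at $2000."""
    return 0x2000 + (y % 8) * 0x400 + ((y // 8) % 8) * 0x80 + (y // 64) * 0x28

HTAB_LO = [row_address(y) & 0xFF for y in range(192)]
HTAB_HI = [row_address(y) >> 8 for y in range(192)]

# Print the first eight rows in the same form as the DB lines
print(" ".join(f"${lo:02X}" for lo in HTAB_LO[:8]))
print(" ".join(f"${hi:02X}" for hi in HTAB_HI[:8]))
```

The first eight rows come out $400 bytes apart, and the remaining rows follow the interleaved pattern that the LUP blocks in the full listing encode.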
Now just as our program expects the horizontal co-ordinate in the X register we'll modify our code to expect the vertical co-ordinate in the Y register. To accomplish that we add the following to the beginning of our program:
LDA HTAB_LO,Y
STA $1D
LDA HTAB_HI,Y
STA $1E
Then we make one final change to the main part of our program that does the plotting. We substitute the places where we wrote:
LDA $2000,Y
With
LDA ($1D),Y
The matching STA $2000,Y instructions become STA ($1D),Y in the same way.
And believe it or not we're done! The following is a fully commented routine to draw a pixel of any colour anywhere on the DHR page. The listing also includes our SETDCOLOR routine:
XC ;Required for Merlin Pro to use 65C02 instructions
ORG $6000 ;Start assembling at memory location $6000
GRAPHON EQU $C050
HIRESON EQU $C057
FULLON EQU $C052
DHRON EQU $C05E
ON80STOR EQU $C001
ON80COL EQU $C00D
PAGE1 EQU $C054
PAGE2 EQU $C055
SCRN_LO EQU $1D ;Zero page location for low byte of our screen row
SCRN_HI EQU $1E ;Zero page location for high byte of our screen row
INIT STA GRAPHON ;Turn on graphics
STA HIRESON ;Turn on hi-res mode
STA FULLON ;Turn on fullscreen mode
STA DHRON ;Turn on Double hi-res
STA ON80COL ;Turn on 80 Column mode
STA ON80STOR ;Use PAGE1/PAGE2 to switch between MAIN and AUX memory
LDA #$0E ;Set colour to 14 = Aqua
JSR SETDCOLOR
LDX #00 ;Set column to 0
LDY #00 ;Set row to 0
DPLOT LDA HTAB_LO,Y ;Find the low byte of the row address
STA SCRN_LO
LDA HTAB_HI,Y ;Find the high byte of the row address
STA SCRN_HI
LDY MBOFFSET,X ;Find what byte if any in MAIN we are working in
BMI AUX ;If pixel has no bits in MAIN memory - go to aux routine
STA PAGE1 ;Map $2000 to MAIN memory
LDA (SCRN_LO),Y ;Load screen data
AND MAINAND,X ;Erase pixel bits
ORMAIN ORA MAINGR,X ;Draw coloured bits
STA (SCRN_LO),Y ;Write back to screen
AUX LDY ABOFFSET,X ;Find what byte if any in AUX we are working in
BMI END ;If no part of the pixel is in AUX - end the program
STA PAGE2 ;Map $2000 to AUX memory
LDA (SCRN_LO),Y ;Load screen data
AND AUXAND,X ;Erase pixel bits
ORAUX ORA AUXGR,X ;Draw coloured bits
STA (SCRN_LO),Y ;Write back to screen
END RTS
SETDCOLOR TAY ;Assume the desired colour is in the accumulator
LDA CLOM,Y ;Lookup low byte of MAIN memory colour table
STA ORMAIN+1 ;Update the ORA instruction
LDA CHIM,Y ;Lookup high byte of MAIN memory colour table
STA ORMAIN+2 ;Update the ORA instruction
LDA CLOA,Y ;Lookup low byte of AUX memory colour table
STA ORAUX+1 ;Update the ORA instruction
LDA CHIA,Y ;Lookup high byte of AUX memory colour table
STA ORAUX+2 ;Update the ORA instruction
RTS
MBOFFSET DB 255,0,0,0,255,1,1,255,2,2,2,255,3,3
DB 255,4,4,4,255,5,5,255,6,6,6,255,7,7
DB 255,8,8,8,255,9,9,255,10,10,10,255,11,11
DB 255,12,12,12,255,13,13,255,14,14,14,255,15,15
DB 255,16,16,16,255,17,17,255,18,18,18,255,19,19
DB 255,20,20,20,255,21,21,255,22,22,22,255,23,23
DB 255,24,24,24,255,25,25,255,26,26,26,255,27,27
DB 255,28,28,28,255,29,29,255,30,30,30,255,31,31
DB 255,32,32,32,255,33,33,255,34,34,34,255,35,35
DB 255,36,36,36,255,37,37,255,38,38,38,255,39,39
ABOFFSET DB 0,0,255,1,1,1,255,2,2,255,3,3,3,255
DB 4,4,255,5,5,5,255,6,6,255,7,7,7,255
DB 8,8,255,9,9,9,255,10,10,255,11,11,11,255
DB 12,12,255,13,13,13,255,14,14,255,15,15,15,255
DB 16,16,255,17,17,17,255,18,18,255,19,19,19,255
DB 20,20,255,21,21,21,255,22,22,255,23,23,23,255
DB 24,24,255,25,25,25,255,26,26,255,27,27,27,255
DB 28,28,255,29,29,29,255,30,30,255,31,31,31,255
DB 32,32,255,33,33,33,255,34,34,255,35,35,35,255
DB 36,36,255,37,37,37,255,38,38,255,39,39,39,255
MAINAND
LUP 20
DB %01111111,%01111110,%01100001,%00011111
DB %01111111,%01111000,%00000111
--^
AUXAND
LUP 20
DB %01110000,%00001111,%01111111,%01111100
DB %01000011,%00111111,%01111111
--^
MAINBL
LUP 20
DB %00000000,%00000000,%00000000,%00000000
DB %00000000,%00000000,%00000000
--^
AUXBL
LUP 20
DB %00000000,%00000000,%00000000,%00000000
DB %00000000,%00000000,%00000000
--^
MAINMG
LUP 20
DB %00000000,%00000001,%00010000,%00000000
DB %00000000,%00000100,%01000000
--^
AUXMG
LUP 20
DB %00001000,%00000000,%00000000,%00000010
DB %00100000,%00000000,%00000000
--^
MAINBR
LUP 20
DB %00000000,%00000000,%00001000,%00000000
DB %00000000,%00000010,%00100000
--^
AUXBR
LUP 20
DB %00000100,%01000000,%00000000,%00000001
DB %00010000,%00000000,%00000000
--^
MAINOR
LUP 20
DB %00000000,%00000001,%00011000,%00000000
DB %00000000,%00000110,%01100000
--^
AUXOR
LUP 20
DB %00001100,%01000000,%00000000,%00000011
DB %00110000,%00000000,%00000000
--^
MAINDG
LUP 20
DB %00000000,%00000000,%00000100,%01000000
DB %00000000,%00000001,%00010000
--^
AUXDG
LUP 20
DB %00000010,%00100000,%00000000,%00000000
DB %00001000,%00000000,%00000000
--^
MAING1
LUP 20
DB %00000000,%00000001,%00010100,%01000000
DB %00000000,%00000101,%01010000
--^
AUXG1
LUP 20
DB %00001010,%00100000,%00000000,%00000010
DB %00101000,%00000000,%00000000
--^
MAINGR
LUP 20
DB %00000000,%00000000,%00001100,%01000000
DB %00000000,%00000011,%00110000
--^
AUXGR
LUP 20
DB %00000110,%01100000,%00000000,%00000001
DB %00011000,%00000000,%00000000
--^
MAINYE
LUP 20
DB %00000000,%00000001,%00011100,%01000000
DB %00000000,%00000111,%01110000
--^
AUXYE
LUP 20
DB %00001110,%01100000,%00000000,%00000011
DB %00111000,%00000000,%00000000
--^
MAINDB
LUP 20
DB %00000000,%00000000,%00000010,%00100000
DB %00000000,%00000000,%00001000
--^
AUXDB
LUP 20
DB %00000001,%00010000,%00000000,%00000000
DB %00000100,%01000000,%00000000
--^
MAINVI
LUP 20
DB %00000000,%00000001,%00010010,%00100000
DB %00000000,%00000100,%01001000
--^
AUXVI
LUP 20
DB %00001001,%00010000,%00000000,%00000010
DB %00100100,%01000000,%00000000
--^
MAING2
LUP 20
DB %00000000,%00000000,%00001010,%00100000
DB %00000000,%00000010,%00101000
--^
AUXG2
LUP 20
DB %00000101,%01010000,%00000000,%00000001
DB %00010100,%01000000,%00000000
--^
MAINPI
LUP 20
DB %00000000,%00000001,%00011010,%00100000
DB %00000000,%00000110,%01101000
--^
AUXPI
LUP 20
DB %00001101,%01010000,%00000000,%00000011
DB %00110100,%01000000,%00000000
--^
MAINMB
LUP 20
DB %00000000,%00000000,%00000110,%01100000
DB %00000000,%00000001,%00011000
--^
AUXMB
LUP 20
DB %00000011,%00110000,%00000000,%00000000
DB %00001100,%01000000,%00000000
--^
MAINLB
LUP 20
DB %00000000,%00000001,%00010110,%01100000
DB %00000000,%00000101,%01011000
--^
AUXLB
LUP 20
DB %00001011,%00110000,%00000000,%00000010
DB %00101100,%01000000,%00000000
--^
MAINAQ
LUP 20
DB %00000000,%00000000,%00001110,%01100000
DB %00000000,%00000011,%00111000
--^
AUXAQ
LUP 20
DB %00000111,%01110000,%00000000,%00000001
DB %00011100,%01000000,%00000000
--^
MAINWI
LUP 20
DB %00000000,%00000001,%00011110,%01100000
DB %00000000,%00000111,%01111000
--^
AUXWI
LUP 20
DB %00001111,%01110000,%00000000,%00000011
DB %00111100,%01000000,%00000000
--^
CLOM DB <MAINBL,<MAINMG,<MAINBR,<MAINOR,<MAINDG
DB <MAING1,<MAINGR,<MAINYE,<MAINDB,<MAINVI
DB <MAING2,<MAINPI,<MAINMB,<MAINLB,<MAINAQ,<MAINWI
CHIM DB >MAINBL,>MAINMG,>MAINBR,>MAINOR,>MAINDG,>MAING1
DB >MAINGR,>MAINYE,>MAINDB,>MAINVI,>MAING2,>MAINPI
DB >MAINMB,>MAINLB,>MAINAQ,>MAINWI
CLOA DB <AUXBL,<AUXMG,<AUXBR,<AUXOR,<AUXDG
DB <AUXG1,<AUXGR,<AUXYE,<AUXDB,<AUXVI
DB <AUXG2,<AUXPI,<AUXMB,<AUXLB,<AUXAQ,<AUXWI
CHIA DB >AUXBL,>AUXMG,>AUXBR,>AUXOR,>AUXDG,>AUXG1
DB >AUXGR,>AUXYE,>AUXDB,>AUXVI,>AUXG2,>AUXPI
DB >AUXMB,>AUXLB,>AUXAQ,>AUXWI
HTAB_LO
LUP 4
DB $00,$00,$00,$00,$00,$00,$00,$00
DB $80,$80,$80,$80,$80,$80,$80,$80
--^
LUP 4
DB $28,$28,$28,$28,$28,$28,$28,$28
DB $A8,$A8,$A8,$A8,$A8,$A8,$A8,$A8
--^
LUP 4
DB $50,$50,$50,$50,$50,$50,$50,$50
DB $D0,$D0,$D0,$D0,$D0,$D0,$D0,$D0
--^
HTAB_HI
LUP 3
DB $20,$24,$28,$2C,$30,$34,$38,$3C
DB $20,$24,$28,$2C,$30,$34,$38,$3C
DB $21,$25,$29,$2D,$31,$35,$39,$3D
DB $21,$25,$29,$2D,$31,$35,$39,$3D
DB $22,$26,$2A,$2E,$32,$36,$3A,$3E
DB $22,$26,$2A,$2E,$32,$36,$3A,$3E
DB $23,$27,$2B,$2F,$33,$37,$3B,$3F
DB $23,$27,$2B,$2F,$33,$37,$3B,$3F
--^
This assembles to about 5K of code - mostly tables. It's worth pointing out that there are many ways to cut this program down in terms of size. For example since all the colour tables and mask tables repeat in cycles of seven, we could take the value in the X register modulo 7 before doing our lookup. This would save us a whopping 2K. It would also increase the number of cycles we spend plotting each pixel significantly.
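The trade-off can be sketched in Python using the MAINGR pattern from the listing. The helper names `lookup_full` and `lookup_compact` are invented for the illustration. Bear in mind that the 6502 has no divide or modulo instruction, so the `% 7` below stands in for extra work the plot routine would have to do by hand.

```python
# The seven-byte MAINGR pattern from the listing
MAINGR_PATTERN = [0b00000000, 0b00000000, 0b00001100, 0b01000000,
                  0b00000000, 0b00000011, 0b00110000]

# What LUP 20 produces: the pattern repeated 20 times (140 bytes)
MAINGR_FULL = MAINGR_PATTERN * 20

def lookup_full(x):
    """One indexed read, as in the current code."""
    return MAINGR_FULL[x]

def lookup_compact(x):
    """Smaller table, at the price of reducing x modulo 7 first."""
    return MAINGR_PATTERN[x % 7]

# Both lookups agree for every column 0-139
assert all(lookup_full(x) == lookup_compact(x) for x in range(140))
print(len(MAINGR_FULL), len(MAINGR_PATTERN))   # 140 7
```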
There are even a couple of ways to make our code plot faster, mostly using approaches that are less easy to explain. I may cover some of these speed/size optimizations in a later article but for now this code fits our two primary goals:
- sufficient speed to be used in an action game
- sufficiently clear so that the reader can go on and build out their own enhancements.
Next up, we learn about some code management and put this bad boy to work with some demo code...