base:8bit_divide_8bit_product
base:8bit_divide_8bit_product [2014-06-10 13:11] – external edit 127.0.0.1 | base:8bit_divide_8bit_product [2017-10-26 07:21] (current) – [Smaller version] white_flame | ||
---|---|---|---|
+ | ====== 8bit Divide - 8bit Result ====== | ||
+ | |||
+ | ===== Normal binary division ===== | ||
+ | ...with shifting in loop. (If I remember right - submitted by Graham at CSDb forum) | ||
+ | |||
+ | < | ||
+ | ;normal binary division | ||
+ | ASL $FD | ||
+ | LDA #$00 | ||
+ | ROL | ||
+ | |||
+ | LDX #$08 | ||
+ | .loop1 | ||
+ | CMP $FC | ||
+ | BCC *+4 | ||
+ | SBC $FC | ||
+ | ROL $FD | ||
+ | ROL | ||
+ | DEX | ||
+ | BNE .loop1 | ||
+ | |||
+ | LDX #$08 | ||
+ | .loop2 | ||
+ | CMP $FC | ||
+ | BCC *+4 | ||
+ | SBC $FC | ||
+ | ROL $FE | ||
+ | ASL | ||
+ | DEX | ||
+ | BNE .loop2 | ||
+ | </ | ||
+ | |||
+ | Divides the value in $FD by the value in $FC. The 8-bit integer result is left in $FD, and the first 8 fraction bits are in $FE. | ||
+ | |||
+ | Of course both loops should be unrolled :) I didn't want to write down the unrolled code here. | ||
+ | |||
+ | doynax: The remainder (in the accumulator) in the fraction loop seems to overflow for divisors above $80. A BCS jumping directly from the top of the loop to the SBC and forcibly setting carry afterwards seems to work. Is there a cleaner solution? | ||
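As a cross-check for the routine above, here is a minimal Python sketch of the same shift-and-subtract division, producing the integer quotient and the first 8 fraction bits. It is a reference model, not a translation of the assembly: the name `div8_with_fraction` is made up for illustration. Because Python integers do not wrap at 8 bits, the doubled remainder in the fraction loop is simply allowed to grow to 9 bits, which is the case doynax describes for divisors above $80.

```python
def div8_with_fraction(dividend, divisor):
    """Reference model: 8-bit / 8-bit -> (integer quotient, first 8 fraction bits)."""
    if not (0 <= dividend <= 0xFF and 1 <= divisor <= 0xFF):
        raise ValueError("dividend must be 0..255 and divisor 1..255")

    quotient = 0
    remainder = 0
    # Integer part: bring down one dividend bit per step, most significant first.
    for bit in range(7, -1, -1):
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1

    fraction = 0
    # Fraction part: keep dividing, bringing down zero bits.
    for _ in range(8):
        remainder <<= 1          # may need 9 bits when divisor > $80
        fraction <<= 1
        if remainder >= divisor:
            remainder -= divisor
            fraction |= 1

    return quotient, fraction
```

For example, 7 / 2 gives quotient 3 and fraction $80 (one half), and 1 / 3 gives quotient 0 and fraction $55.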
+ | |||
+ | ==== Smaller version ==== | ||
+ | < | ||
+ | ; by White Flame | ||
+ | ; | ||
+ | ; Input: num, denom in zeropage | ||
+ | ; Output: num = quotient, .A = remainder | ||
+ | |||
+ | lda #$00 | ||
+ | ldx #$07 | ||
+ | clc | ||
+ | : rol num | ||
+ | rol | ||
+ | cmp denom | ||
+ | bcc :+ | ||
+ | sbc denom | ||
+ | : dex | ||
+ | bpl :-- | ||
+ | rol num | ||
+ | |||
+ | ; 19 bytes | ||
+ | ; | ||
+ | ; Best case = 154 cycles | ||
+ | ; Worst case = 170 cycles | ||
+ | ; | ||
+ | ; With immediate denom: | ||
+ | ; Best case = 146 cycles | ||
+ | ; Worst case = 162 cycles | ||
+ | ; | ||
+ | ; Unrolled with variable denom: | ||
+ | ; Best case = 106 cycles | ||
+ | ; Worst case = 127 cycles | ||
+ | ; | ||
+ | ; Unrolled with immediate denom: | ||
+ | ; Best case = 98 cycles | ||
+ | ; Worst case = 111 cycles | ||
+ | </ | ||
+ | |||
+ | |||
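+ | In case the :, :+ and :-- notation is unfamiliar: a lone : defines an anonymous label. | ||
+ | :+ refers to the next anonymous label after the instruction, and :-- to the second one before it (here, the top of the loop). | ||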
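To see how the quotient bits travel through the carry flag in the smaller version, the following Python sketch steps through the same instruction sequence with an explicit carry variable. The function name `div8_rol` is an assumption for illustration, and the model only covers the instructions shown in the listing, not their cycle counts.

```python
def div8_rol(num, denom):
    """Step-by-step model of the rol/cmp/sbc loop; returns (quotient, remainder)."""
    a = 0        # lda #$00
    carry = 0    # clc
    for _ in range(8):           # ldx #$07 ... dex / bpl runs 8 times
        # rol num: old carry enters bit 0, bit 7 leaves into carry
        num = (num << 1) | carry
        carry = num >> 8
        num &= 0xFF
        # rol: the dividend bit just shifted out enters the accumulator
        a = (a << 1) | carry
        carry = a >> 8
        a &= 0xFF
        # cmp denom: carry is set when a >= denom
        carry = 1 if a >= denom else 0
        if carry:
            a -= denom           # sbc denom with carry set, carry stays set
    # final rol num shifts in the last quotient bit
    num = ((num << 1) | carry) & 0xFF
    return num, a
```

With an 8-bit dividend the partial remainder never exceeds 127 before a shift, so the accumulator does not overflow here even for large divisors.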
+ | |||
+ | |||
+ | ===== Division using tables ===== | ||
+ | |||
+ | Comes from CSDb forum (FIXME source by... ???) | ||
+ | |||
+ | ;This will divide two 8-bit numbers in some 90-150 cycles. | ||
+ | ;The code can easily be extended to handle larger dividends. | ||
+ | |||
+ | < | ||
+ | _divu_8 | ||
+ | lda div_b | ||
+ | cmp #2 | ||
+ | bcs + ; >= 2 | ||
+ | |||
+ | lda div_a | ||
+ | rts | ||
+ | |||
+ | + ldx #8 | ||
+ | |||
+ | - dex | ||
+ | asl | ||
+ | bcc - | ||
+ | |||
+ | bne + | ||
+ | |||
+ | lda div_a | ||
+ | - lsr | ||
+ | dex | ||
+ | bne - | ||
+ | rts | ||
+ | |||
+ | + tay | ||
+ | lda r0_table,y | ||
+ | ldy div_a | ||
+ | |||
+ | sta zp8_1 | ||
+ | sta zp8_2 | ||
+ | eor #$ff | ||
+ | sta zp8_3 | ||
+ | sta zp8_4 | ||
+ | |||
+ | sec | ||
+ | lda (zp8_1),y | ||
+ | sbc (zp8_3),y | ||
+ | lda (zp8_2),y | ||
+ | sbc (zp8_4),y | ||
+ | |||
+ | clc | ||
+ | adc div_a | ||
+ | |||
+ | ror | ||
+ | - lsr | ||
+ | dex | ||
+ | bne - | ||
+ | rts | ||
+ | |||
+ | div_a | ||
+ | .byte $0 | ||
+ | div_b | ||
+ | .byte $0 | ||
+ | r0_table | ||
+ | .byte $01, | ||
+ | .byte $e2, | ||
+ | .byte $c8, | ||
+ | .byte $b0, | ||
+ | .byte $9a, | ||
+ | .byte $87, | ||
+ | .byte $75, | ||
+ | .byte $65, | ||
+ | .byte $56, | ||
+ | .byte $48, | ||
+ | .byte $3c, | ||
+ | .byte $30, | ||
+ | .byte $25, | ||
+ | .byte $1b, | ||
+ | .byte $12, | ||
+ | .byte $09, | ||
+ | |||
+ | </ | ||
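The idea behind the table routine can be sketched in Python as a multiplication by a stored reciprocal followed by shifts. Several things here are assumptions: the table rows in the listing are cut off after their first byte, so `build_r0_table` uses a formula inferred from those visible bytes (assuming 16 bytes per row); the names `build_r0_table` and `divu8_table` are made up; and a plain multiplication stands in for the pointer-based lookups through zp8_1..zp8_4, which appear to be a table-driven multiply whose tables are not shown.

```python
def build_r0_table():
    """Assumed reconstruction: entry i holds ceil(2**17 / (256 + i)) - 256."""
    table = []
    for i in range(256):
        n = 256 + i
        recip = (131072 + n - 1) // n      # ceil(2**17 / n)
        # Entry 0 would be 256 and wraps to 0 here; powers of two never read it.
        table.append((recip - 256) & 0xFF)
    return table


def divu8_table(a, d, r0_table):
    """Model of the reciprocal-table division: returns a // d for d >= 2."""
    if d < 2:
        return a                            # like the listing: divisor 0 or 1 returns the dividend
    p = d.bit_length() - 1                  # position of the divisor's top set bit
    i = (d << (8 - p)) & 0xFF               # bits below the top bit, left-aligned
    if i == 0:
        return a >> p                       # power of two: plain shift
    t = ((a * r0_table[i]) >> 8) + a        # high byte of the product plus a (up to 9 bits)
    return (t >> 1) >> p                    # the ror, then p more right shifts
```

The reciprocal is rounded up, so the estimate never falls below the true quotient, and for 8-bit dividends the excess is too small to reach the next integer.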
+ | |||
+ | ==== The same routine again, slightly optimized ==== | ||
+ | |||
+ | < | ||
+ | ; Let me bore you with an optimized version: | ||
+ | |||
+ | ; divide acc by y, result in acc | ||
+ | _divu_8 | ||
+ | ldx t0_table,y | ||
+ | stx b1+1 | ||
+ | ldx t1_table,y | ||
+ | beq + | ||
+ | |||
+ | ldy r0_table,x | ||
+ | |||
+ | sta zp8_1 | ||
+ | sta zp8_2 | ||
+ | eor #$ff | ||
+ | sta zp8_3 | ||
+ | sta zp8_4 | ||
+ | |||
+ | sec | ||
+ | lda (zp8_1),y | ||
+ | sbc (zp8_3),y | ||
+ | lda (zp8_2),y | ||
+ | sbc (zp8_4),y | ||
+ | |||
+ | clc | ||
+ | adc zp8_1 | ||
+ | ror | ||
+ | |||
+ | + sec | ||
+ | b1 bcs b1 | ||
+ | lsr | ||
+ | lsr | ||
+ | lsr | ||
+ | lsr | ||
+ | lsr | ||
+ | lsr | ||
+ | lsr | ||
+ | |||
+ | rts | ||
+ | |||
+ | .align $100 | ||
+ | r0_table | ||
+ | .byte $01, | ||
+ | .byte $e2, | ||
+ | .byte $c8, | ||
+ | .byte $b0, | ||
+ | .byte $9a, | ||
+ | .byte $87, | ||
+ | .byte $75, | ||
+ | .byte $65, | ||
+ | .byte $56, | ||
+ | .byte $48, | ||
+ | .byte $3c, | ||
+ | .byte $30, | ||
+ | .byte $25, | ||
+ | .byte $1b, | ||
+ | .byte $12, | ||
+ | .byte $09, | ||
+ | t0_table | ||
+ | .fill $100,0 | ||
+ | t1_table | ||
+ | .fill $100,0 | ||
+ | |||
+ | _divu_8_setup | ||
+ | ldy #1 | ||
+ | next | ||
+ | tya | ||
+ | ldx #$ff | ||
+ | - inx | ||
+ | asl | ||
+ | bcc - | ||
+ | sta t1_table,y | ||
+ | txa | ||
+ | sta t0_table,y | ||
+ | iny | ||
+ | bne next | ||
+ | rts | ||
+ | </ | ||
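What _divu_8_setup stores in the two tables can be modelled in a few lines of Python. This is a sketch that follows the loop in the listing; the function name `build_shift_tables` is an assumption, and entry 0 is left at zero because the loop starts at 1.

```python
def build_shift_tables():
    """Model of _divu_8_setup: returns (t0_table, t1_table) as 256-entry lists."""
    t0 = [0] * 256
    t1 = [0] * 256
    for y in range(1, 256):
        a = y
        x = -1                      # ldx #$ff, then inx on every pass
        while True:
            x += 1
            carry = a >> 7          # asl moves bit 7 into carry
            a = (a << 1) & 0xFF
            if carry:               # bcc loops until the top set bit falls out
                break
        t1[y] = a                   # divisor bits below its top bit, left-aligned
        t0[y] = x                   # number of leading zero bits in y
    return t0, t1
```

In the routine, t0 is written into the operand of the `bcs` at b1, so that many of the seven LSRs are skipped, and a zero in t1 marks a power of two, for which the multiply is bypassed.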
+ | |||
+ | ==== The init optimized (well, it packs better) ==== | ||
+ | |||
+ | < | ||
+ | r0_table | ||
+ | .byte $01, | ||
+ | .byte $e2, | ||
+ | .byte $c8, | ||
+ | .byte $b0, | ||
+ | .byte $9a, | ||
+ | .byte $87, | ||
+ | .byte $75, | ||
+ | .byte $65, | ||
+ | .byte $56, | ||
+ | .byte $48, | ||
+ | .byte $3c, | ||
+ | .byte $30, | ||
+ | .byte $25, | ||
+ | .byte $1b, | ||
+ | .byte $12, | ||
+ | .byte $09, | ||
+ | .fill $80,0 | ||
+ | t0_table | ||
+ | .fill $100,0 | ||
+ | t1_table | ||
+ | .fill $100,0 | ||
+ | |||
+ | _divu_8_setup | ||
+ | ldx #$7f | ||
+ | ldy #$ff | ||
+ | - | ||
+ | lda #0 | ||
+ | sta r0_table,y | ||
+ | dey | ||
+ | lda r0_table,x | ||
+ | sta r0_table,y | ||
+ | dey | ||
+ | dex | ||
+ | bpl - | ||
+ | ldy #1 | ||
+ | next | ||
+ | tya | ||
+ | ldx #$ff | ||
+ | - inx | ||
+ | asl | ||
+ | bcc - | ||
+ | sta t1_table,y | ||
+ | txa | ||
+ | sta t0_table,y | ||
+ | iny | ||
+ | bne next | ||
+ | rts | ||
+ | </ |
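The first loop of this setup expands the 128 packed bytes in place, working from the top of the table downwards so that no packed byte is overwritten before it is read. A minimal Python sketch of that loop, with the made-up name `unpack_r0_in_place`:

```python
def unpack_r0_in_place(table):
    """Model of the expansion loop: packed[x] moves to table[2*x], odd entries become 0."""
    if len(table) != 256:
        raise ValueError("table must have 256 entries")
    x = 0x7F                        # ldx #$7f
    y = 0xFF                        # ldy #$ff
    while x >= 0:                   # bpl loops while x has not wrapped below zero
        table[y] = 0                # odd index gets a zero
        y -= 1
        table[y] = table[x]         # even index gets the packed byte
        y -= 1
        x -= 1
    return table
```

The zeros at odd indices are never read: the lookup index comes from t1_table, whose values are produced by a left shift and are therefore always even.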
base/8bit_divide_8bit_product.txt · Last modified: 2017-10-26 07:21 by white_flame