base:fast_sqrt
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revisionLast revisionBoth sides next revision | ||
base:fast_sqrt [2019-07-27 21:28] – verz | base:fast_sqrt [2019-08-18 10:20] – verz | ||
---|---|---|---|
Line 390: | Line 390: | ||
} | } | ||
</ | </ | ||
- | where //p// is the number of bits of the result; or half the bits of the radicand | + | where //p// is the number of bits of the result; |
// | // | ||
Line 397: | Line 397: | ||
This is the code for a 32bit integer Sqrt. Provides the result and the remainder: | This is the code for a 32bit integer Sqrt. Provides the result and the remainder: | ||
- | < | + | < |
; | ; | ||
;* sqrt32 | ;* sqrt32 | ||
;* | ;* | ||
;* | ;* | ||
+ | ; | ||
+ | ;* by Verz - Jul2019 | ||
+ | ; | ||
;* | ;* | ||
- | ;* input: | + | ;* input: |
- | ;* output: sqrt, 16bit value | + | ;* output: sqrt, |
- | ;* remnd, | + | ;* remnd, |
- | ;* | + | ;******************************************** |
sqrt32 | sqrt32 | ||
sta sqrt ; R=0 | sta sqrt ; R=0 | ||
sta sqrt+1 | sta sqrt+1 | ||
+ | sta M+4 | ||
;sta T+1 ; (T+1) is zero until last iteration; (T+0) is always 0 | ;sta T+1 ; (T+1) is zero until last iteration; (T+0) is always 0 | ||
ldy #14 ; 15 iterations (14-->0) + last iteration | ldy #14 ; 15 iterations (14-->0) + last iteration | ||
- | loopsq | + | loopsq |
- | lda stablo, | + | lda sqrt |
- | | + | |
- | sta T+2 | + | sta T+2 ; will have not been affected yet |
- | lda stabhi,y | + | lda sqrt+1 |
- | adc sqrt+1 | + | ora stabhi,y |
sta T+3 | sta T+3 | ||
+ | bcs skip0 ; takes care of large numbers; if set, M>T | ||
| | ||
lda M+3 | lda M+3 | ||
Line 436: | Line 441: | ||
sbc T+3 | sbc T+3 | ||
sta M+3 | sta M+3 | ||
- | clc ; possibly unnecessary | ||
lda sqrt ; R=R+D | lda sqrt ; R=R+D | ||
- | | + | |
sta sqrt | sta sqrt | ||
lda sqrt+1 | lda sqrt+1 | ||
- | | + | |
sta sqrt+1 | sta sqrt+1 | ||
skip1 | skip1 | ||
Line 451: | Line 455: | ||
bpl loopsq | bpl loopsq | ||
lastiter | lastiter | ||
+ | bcs skp0 ; takes care of large numbers; if set, M>T | ||
; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1 | ; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1 | ||
lda M+3 | lda M+3 | ||
Line 474: | Line 479: | ||
sta M+3 | sta M+3 | ||
inc sqrt ; R=R+D with D=1 | inc sqrt ; R=R+D with D=1 | ||
- | bne skp1 | + | skp1 asl M+1 ; M=M*2; location |
- | inc sqrt+1 | + | |
- | skp1 asl M | + | |
- | rol M+1 | + | |
rol M+2 | rol M+2 | ||
rol M+3 | rol M+3 | ||
+ | rol M+4 | ||
rts | rts | ||
;**** Variables and Shift table | ;**** Variables and Shift table | ||
- | byte 0 | ||
stabhi byte 0, | stabhi byte 0, | ||
stablo BYTE $01, | stablo BYTE $01, | ||
byte 0, | byte 0, | ||
- | square | + | square = $57 |
- | sqrt = $5F | + | sqrt |
- | remnd | + | remnd = M+2 ; 2 B + 1 b: is in the high bytes of M (M LSR 16); msb is in T+0 (the 5th byte of square) |
- | T | + | T = $5B |
- | M | + | M = square |
</ | </ | ||
- | The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms with variables in page zero.\\ | + | The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms (using |
- | I think it could be convenient to add a control to check whether M=0 and exit earlier. | + | |
base/fast_sqrt.txt · Last modified: 2019-08-18 20:28 by verz