FPU Object
Updated: Feb 11, 2008

With some practice and perhaps the help of a manual on FPU function, even the
beginning programmer can use the FPU object.  In brief, one loads one or more
values and uses methods to modify them.  A statement as simple as x=FPU.pop
retrieves the result.

Developers can implement their formulas by using both the conventional and FPU
object instructions and compare the results as follows:

  DEFREAL10 x1,x2
  x1=y*cos(z) 'conventional coding
  FPU.load y: FPU.load z: FPU.cos: FPU.mul: x2=FPU.pop
' alternatively,
' WITH FPU
' .load y: .load z: .cos: .mul
' END WITH: x2=FPU.pop
  IF x1=x2 THEN PRINT "I coded it right."

The x2 calculation may appear more cumbersome than the x1 calculation.  But
suppose y is to be multiplied by a million numbers.  With the conventional
coding for x1, y would be loaded 999,999 more times than necessary.  With the
FPU Object, as in the x2 steps above, y can be loaded just once before
proceeding through the million values of z.
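
As a minimal sketch of that idea, the loop below loads y once and reuses it
for every z.  The arrays Z and R and the count n are hypothetical names used
only for illustration; the loop mirrors the normalization example that
follows.

  FPU.load y            'load y once, before the loop
  for i=1 to n
  x=Z(i): FPU.load x    'z is now top; y is at index 1
  FPU.cos: FPU.mul 1    'top = cos(z) * y; y stays on the FPU
  R(i)=FPU.pop          'store result; y is top again
  next i
  x=FPU.pop             'remove preloaded y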

Here is a common example of speed optimization.  We have calculated the mean
and standard deviation (sd) of a set of numbers D.  Now we normalize the data
in the array D:

  FPU.load mean: FPU.load sd
  for i=1 to n
  x=D(i): FPU.load x
  FPU.sub 2: FPU.div 1 'x=(x-mean)/sd
  D(i)=FPU.pop  'normalized value: D_mean=0; D_sd=1
  next i
  x=FPU.pop: x=FPU.pop  'remove preloaded values

Notice the conventional formula (x-mean)/sd would require loading the mean and
sd over and over with each iteration 1 to n above.  Rather, the FPU Object
allows you to load the mean and sd once, and then loop through the data
normalizing it.

Please keep in mind that a "constant" for a loop as shown above is simply any
value that does not change in the loop and thus may be a variable in your code.

HotBasic defaults to maximum precision by storing floating literals such as
"2.5" with REAL10 precision.  However, in loops and extensive calculations,
reading REAL10 values takes more time than reading DOUBLE values.  Hence, if
speed is a concern and your constant does not require REAL10 precision, save
such floating literals as DOUBLE constants:

  defdbl c2_5 = 2.5
  'code
  fpu.load c2_5: fpu.load x: fpu.mul 1  'x = x * c2_5
  x = fpu.pop  'get result

The exclusive HotBasic FPU Object differs from conventional coding in several
ways.  For operations on two operands (specifically ADD, SUB, SUBR, MUL, DIV,
DIVR, IADD, ISUB, ISUBR, IMUL, IDIV and IDIVR) the operand loaded first stays
on the FPU.  E.g., if a matrix is to be multiplied by a constant, that value
needs to be loaded just once.

Normally, IDIV and IDIVR can result in floating values; however, the FPU Object
is coded to produce integer results similar to the "\" operator in conventional
coding.  E.g., i = a \ b  'is integer divide.
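
A minimal sketch of such a comparison follows.  It assumes that IDIV with no
argument divides top by the value loaded before it, the same operand order
that "FPU.div 1" uses in the normalization example; IDIVR would give the
reverse order.  The variables a, b, i1 and i2 are hypothetical integers
declared elsewhere, and x is a scratch variable.

  i1 = a \ b                'conventional integer divide
  FPU.load b: FPU.load a    'a is top; b is at index 1
  FPU.idiv: i2=FPU.pop      'assumed: top divided by value at index 1
  x=FPU.pop                 'remove b, which stayed on the FPU
  IF i1=i2 THEN PRINT "Integer results agree."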

Note: "top" refers to ST or ST(0) or last value loaded.


PROPERTIES (Read/Write):
~~~~~~~~~~ ~~~~~~~~~~~~~
CONTROL    Returns or stores FPU control word.  See STATUS examples.

STATE      Saves or restores coprocessor state to/from a 14 byte buffer.

STATUS     Returns FPU status word.  fpustat=FPU.status
           Stores  FPU status word.  FPU.status=fpustat
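
  A minimal sketch of saving and restoring the control word in the same way,
  assuming cw is an integer variable (a hypothetical name):

  cw=FPU.control    'save current control word
  'code that changes rounding or precision settings
  FPU.control=cw    'restore saved control word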


PROPERTIES (Read Only Numeric):
~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~
ERROR      0 = no exception.

  FPU.error returns 7 FPU exception bits which may be examined:
  invalid operation (&H1), denormalized (&H2), divide by zero (&H4),
  overflow (&H8), underflow (&H10), precision reduced (&H20),
  FPU stack fault (&H40).

  i = FPU.error  'assign to non-float to avoid resetting FPU error flags
  IF i AND 28 THEN PRINT "zero divide or overflow or underflow"
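
  As a sketch, individual exceptions can be tested by masking single bits
  after a calculation.  The variables a, b and x are hypothetical; the bit
  values are those listed above.

  FPU.load b: FPU.load a: FPU.div 1    'a / b, where b may be zero
  i = FPU.error                        'read flags into an integer first
  x=FPU.pop: x=FPU.pop                 'remove result and b
  IF i AND 4 THEN PRINT "divide by zero"
  IF i AND 8 THEN PRINT "overflow"
  IF i AND 16 THEN PRINT "underflow"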


Except where specified, METHODS modify the value at "top" of FPU stack:

METHODS    Arguments & Comments
~~~~~~~    ~~~~~~~~~~~~~~~~~~~~
ABS        Absolute value       
ACOS       Arc cosine
ADD        Add
ADD n      Add FPU value at index n (1 to 7)
ADDP       Add FPU values 0 and 1; value 0 popped; result at 0
AND        Logical AND
ASIN       Arc sine
ATAN,ATN   Arc tangent where top = x/y
BCDLOAD src Load BCD data from src; src = real10-dimensioned variable.    
BCDPOP dst dst receives BCD data; dst = real10 variable (10 byte buffer).
CHS        Change sign
COPY       Load copy of top value; e.g., .load x: .copy: .mul gives x^2
COS        Cosine
DIV        Divide 
DIV n      Divide by FPU value at index n (1 to 7)
DIVP       Divide FPU value 0 by value 1; value 0 popped; result at 0 
DIVR       Divide reverse operand order
EXAM       Examine top value and set condition codes in STATUS.
EXP        Exponential function; e raised to the power of top
EXCHANGE n Exchange FPU value n with top, n = 1 to 7; default n = 1
EXTRACT    Exponent replaces top; mantissa pushed on to FPU stack.
FRAC       Fractional part of top replaces top
HCOS       Hyperbolic cosine
HSIN       Hyperbolic sine
HTAN       Hyperbolic tangent
IADD       Integer add (integer result)
IDIV       Integer divide (integer result)
IDIVR      Integer divide reverse operand order (integer result)
IMUL       Integer multiply (integer result)
INIT       Initialize FPU unit
INT        Convert to largest integer less than or equal to top
ISUB       Integer subtract (integer result)
ISUBR      Integer subtract reverse operand order (integer result)
LOAD src   Load FPU from src, any valid numeric data type.
LOADENV src Load FPU environment from src (94 byte buffer)
LN         Natural logarithm
LNTWO      Load natural log of 2
LOG        Logarithm base 10
LOG2E      Load logarithm base 2 of e
LOG2TEN    Load logarithm base 2 of 10
LOGTWO     Load logarithm base 10 of 2
MUL        Multiply
MUL n      Multiply by FPU value at index n (1 to 7)
MULP       Multiply FPU values 0 and 1; value 0 popped; result at 0 
NAPIER     Load Napier's constant e
NEG        Change sign; same as CHS
ONE        Load value = 1
OR         Logical OR FPU value 1 and top
PI         Load pi
POP        Retrieve value and "pop" FPU stack. E.g., x=FPU.pop
PREM       Partial remainder
READ       Retrieve value. E.g., x=FPU.read
ROUND      Convert to integer by round method
SAVEENV dst Save FPU environment to dst (94 byte buffer)
SCALE      Scale by power of 2
SGN        Get sign (1,0,-1) of top
SIN        Sine
SINCOS     Compute both sine and cosine of top (FPU sincos)
SQR        Square root
SUB        Subtract
SUB n      Subtract FPU value at index n (1 to 7)
SUBP       Subtract FPU value 1 from 0; value 0 popped; result at 0 
SUBR       Subtract reverse operand order
TAN        Tangent
TEST       Integer compare and pop
XOR        Logical XOR FPU value 1 and top
XEXPY      X ^ Y where X is FPU value 1; Y is top
YL2X       Y * logarithm base 2 of X
YL2XP1     Y * logarithm base 2 of (X+1)
2XM1       (2 ^ X) - 1
ZERO       Load value of zero
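
As a sketch of how several methods combine, the steps below compute
c = sqr(a^2 + b^2).  They assume the "P" forms behave as the table describes,
leaving only the result on the FPU; a, b and c are hypothetical variables.

  FPU.load a: FPU.copy: FPU.mulp    'top = a^2
  FPU.load b: FPU.copy: FPU.mulp    'top = b^2; a^2 is at index 1
  FPU.addp: FPU.sqr                 'top = sqr(a^2 + b^2)
  c=FPU.pop                         'retrieve result; nothing left on the FPU

With MUL and ADD in place of MULP and ADDP, the operands loaded first would
stay on the FPU and would need to be removed with extra FPU.pop statements.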


###########
hotfpu.bas and hotfpu2.bas show FPU Object coding examples.

Copyright 2003-2008 James J Keene PhD
Original Publication: Aug 26, 2003