I am working with a library that requires input in fixed-point notation, and I’d like a way to convert floating-point results into a fixed-point representation.
The fixed-point word length is 16 bits. Within that, the number of bits given to each part is variable, and I can specify the format. For instance, if I represent a number as Signed[3.13], up to 3 bits are used for the integer portion and 13 bits are used for the fraction (a resolution of 2^-13).
Q1. What is the best method to convert Mathematica floating-point output into a 16-bit number? Ideally, I’d like a function like:
f[x_, signbits_, integerbits_, fractionbits_]
Q2. Given a set of numbers, what is the best method to determine the ideal number of integer bits and fractional bits for the representation, in order to minimize truncation errors?
data = {0.0000618365, 0.0000701533, -0.0000747471, 0.0000595436, 0.0000705533, \
0.0000728675, 0.0000711056, 0.0000684559, 0.0000753624, -0.0000557638}
Answer
Answering my own question from a while ago.
It turns out the easiest method is to use Mathematica’s built-in Computer Arithmetic package:
<< ComputerArithmetic`
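(* Set arithmetic parameters: 6 digits, base 10 *)
SetArithmetic[6, 10, ExponentRange -> {-20, 20}];
(* Round the fractional part to the nearest multiple of 2^-fractionbits (integerbits is not used) *)
fpConvert[x_, integerbits_, fractionbits_] := ComputerNumber[IntegerPart[x] + Round[FractionalPart[x], 2^-fractionbits]];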
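One caveat: fpConvert does not use integerbits, so out-of-range values are not clamped, and as far as I can tell ComputerNumber also rounds the result to the 6 decimal digits chosen in SetArithmetic. For comparison, here is a rough package-free sketch that scales, rounds, and saturates to get the raw integer word. The names toFixedWord, fromFixedWord, and fixedBits are hypothetical. It assumes two's complement, with the sign bit counted among the integer bits (so Signed[3.13] is 3 + 13 = 16 bits); check this against what the target library actually expects.

```mathematica
(* Hypothetical helpers, not part of ComputerArithmetic`. *)
(* Assumption: two's complement, sign bit counted in integerbits, *)
(* so the word length is integerbits + fractionbits (3 + 13 = 16). *)
toFixedWord[x_?NumericQ, integerbits_Integer, fractionbits_Integer] :=
 Module[{n = integerbits + fractionbits},
  (* scale by 2^fractionbits, round to an integer, clamp to the signed range *)
  Clip[Round[x*2^fractionbits], {-2^(n - 1), 2^(n - 1) - 1}]]

(* Exact value represented by a stored word *)
fromFixedWord[w_Integer, fractionbits_Integer] := w/2^fractionbits

(* n-bit two's-complement bit pattern of a stored word, most significant bit first *)
fixedBits[w_Integer, n_Integer: 16] := IntegerDigits[Mod[w, 2^n], 2, n]
```

Under these assumptions, toFixedWord[Pi, 3, 13] gives 25736, which represents 25736/8192 = 3.1416015625, and inputs outside [-4, 4) saturate at -32768 or 32767.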
Let’s test the output:
N[fpConvert[Pi, 3, 2], {8, 8}]
N[fpConvert[Pi, 3, 4], {8, 8}]
N[fpConvert[Pi, 3, 8], {8, 8}]
N[fpConvert[Pi, 3, 13], {8, 8}]
Works fine.
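Q2 is not covered above. One common heuristic, offered only as a sketch, is to give the integer part just enough bits to hold the largest magnitude in the data and spend the remaining bits on the fraction. The name chooseFormat is hypothetical. It assumes the same convention as Signed[3.13], where the sign bit counts as an integer bit and the total is 16 bits.

```mathematica
(* Hypothetical helper: pick {integerbits, fractionbits} for a given word length. *)
(* Assumption: the sign bit counts as an integer bit, so representable values *)
(* lie in [-2^(integerbits - 1), 2^(integerbits - 1) - 2^-fractionbits]. *)
chooseFormat[data_List, wordlength_Integer: 16] :=
 Module[{m = Max[Abs[data]], ib},
  (* smallest integerbits >= 1 such that m < 2^(integerbits - 1) *)
  ib = Max[1, Floor[Log2[m]] + 2];
  {ib, wordlength - ib}]
```

With round-to-nearest, the error per value is then at most 2^-(fractionbits + 1). For the sample data in the question this gives {1, 15}, a resolution of 2^-15 (about 0.0000305), which is still coarse next to values around 0.00007. If the library allows it, scaling the data by a power of two before conversion would keep more precision.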