CPSC 220, Fall 2022
Lab 12: Floats and Files

This is a relatively short lab, worth ten points instead of the usual fifteen. You will use some of the x86-64 instructions for working with floating point numbers, and you will work with files using functions from the standard C library. Because your program calls those functions, it must be linked to the C library. You can use the same build script that you used for last week's lab, with the -c option.

This lab is an individual assignment. You cannot work with a partner.

This lab is due next Thursday, as usual. Please name the file lab12.asm and submit it to your homework folder.

(Note that because of the test and because it was a longer lab than usual, the due date for Lab 11 has been changed to Monday, November 14, at 3:00 PM.)

Processing a Data File

You will write this week's program from scratch. Create a new text file named lab12.asm for your program.

The program will do some simple numerical processing of floating point values. The data that you process will come from an input file, and the results will be written to an output file. This is just an exercise in using floating point and files; neither the data nor the processing is meaningful or interesting.

The input file for the program is data12.txt. You can get a copy from the folder /classes/cs220, or you can download it from the above link. The name of the output file is up to you; you do not need to turn in the output file. You do not need to ask the user to input the file names; you can hard code them into your program. Remember that when you open an existing file for output, the current contents of the file will be erased. For this lab, you are not required to check whether the output file exists before opening it.

The sample programs integral.asm and copyfile.asm work with floating point numbers and files. You might find some of the code from copyfile.asm, in particular, to be useful.


The assignment:

The format of the input file, data12.txt, is as follows: Each line of text contains a name, an integer, and several floating point numbers. The integer is the number of floating point values that follow it on the same line, and it is always greater than zero. Items on the line are separated by spaces. The name does not contain spaces, so it can be read using the format specification %s.

You will write one line to the output file for each line of input. The output line contains two items: a copy of the name from the input line and the average of the floating point values from the input line. (To find the average, add up the values and divide by the count.)

The end of the file is not marked in any way. You can recognize the end of file when fscanf cannot read the data from the start of a line, that is, when the return value from fscanf is less than the number of items that you were trying to read.

You should check the return value from fopen when opening the files. If a file can't be opened, you should print an error message and exit. As noted above, the name of the output file is up to you. The data in data12.txt has the specified format, so if your program is correct, the only error that you will encounter while processing it is the read error from fscanf at the end of the file. However, you might want to check for input errors anyway because they can reveal errors in your program. (When I wrote my own program, a bug caused it to output an endless stream of lines; it wrote several megabytes before I was able to stop it with Control-C.)
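
Here is a minimal sketch of the fopen check described above. It is not taken from the sample programs. It assumes NASM syntax on 64-bit Linux, an entry point named main, and a non-PIE link against the C library; your build script may differ. The names inname, rmode, errmsg, and infile are made up for the example.

```nasm
; Sketch: open data12.txt for reading and stop if fopen fails.
        global  main
        extern  fopen, fclose, printf

        section .data
inname  db      "data12.txt", 0
rmode   db      "r", 0
errmsg  db      "Error: cannot open %s", 10, 0

        section .bss
infile  resq    1               ; FILE pointer returned by fopen

        section .text
main:
        push    rbp             ; also leaves the stack 16-byte aligned
        mov     rbp, rsp

        mov     rdi, inname     ; first argument: file name
        mov     rsi, rmode      ; second argument: mode string
        call    fopen
        test    rax, rax        ; fopen returns 0 (NULL) on failure
        jz      .open_failed
        mov     [infile], rax   ; keep the FILE pointer for later calls

        ; ... the file would be processed here ...

        mov     rdi, [infile]
        call    fclose
        xor     eax, eax        ; return 0 from main
        pop     rbp
        ret

.open_failed:
        mov     rdi, errmsg
        mov     rsi, inname
        xor     eax, eax        ; no xmm registers carry arguments
        call    printf
        mov     eax, 1          ; return 1 to signal the failure
        pop     rbp
        ret
```

The output file would be opened the same way, with "w" as the mode string.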


Discussion and hints:

I had a strange bug when writing my solution to this lab. I spent a lot of time on it and learned something new: When calling a varargs function such as fprintf, the rax register is supposed to contain the number of xmm registers that are used for passing arguments to the function. I am still not sure why this is necessary, but in my program, when rax was zero, printf did not print out correct floating point values from xmm registers. (Apparently, rax really just has to be non-zero when passing values in xmm registers.)
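
For reference, the System V AMD64 calling convention used on 64-bit Linux says that when a variadic function is called, al must hold an upper bound on the number of vector (xmm) registers used to pass arguments. The sketch below shows one way to set this up for a printf call with a single floating point argument. It assumes NASM syntax, an entry point named main, and a non-PIE link against the C library; the names fmt, tag, and value are made up for the example.

```nasm
; Sketch: print a string and a double with printf.
        global  main
        extern  printf

        section .data
fmt     db      "%s %g", 10, 0
tag     db      "average", 0
value   dq      2.5

        section .text
main:
        push    rbp             ; also leaves the stack 16-byte aligned
        mov     rbp, rsp

        mov     rdi, fmt        ; first argument: format string
        mov     rsi, tag        ; second argument: the string for %s
        movsd   xmm0, [value]   ; first floating point argument: for %g
        mov     eax, 1          ; one xmm register holds an argument
        call    printf          ; prints "average 2.5"

        xor     eax, eax        ; return 0 from main
        pop     rbp
        ret
```

A call to fprintf works the same way, except that the FILE pointer goes in rdi and the other integer arguments each move over by one register. For fscanf, all of the arguments are addresses, so no xmm registers are used and rax can be set to zero.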

When using fscanf, you need to pass addresses to the function where you want the input values to be stored. This pretty much forces you to use variables for the data. My solution defines the variables that I needed in the following .bss section:

section .bss
     
    name resb 100  ; The name that is the first item on an input line.
    count resq 1   ; The integer that is the second item on an input line.
    data resq 1    ; One of the floating point data items from an input line.
    sum resq 1     ; The sum of the floating point data items from an input line.

Allowing 100 bytes for the name is more than enough. You can assume that all the names in the file are short.
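
A call to fscanf that reads the first two items on a line might look like the following fragment. It is only a sketch: it is meant to be dropped into a program that already has the .bss section shown above, and it assumes that the FILE pointer from fopen was saved in a 64-bit variable called infile. The names infile, headfmt, and read_header are made up for the example.

```nasm
        extern  fscanf

        section .data
headfmt db      "%s %ld", 0     ; a name, then a 64-bit integer

        section .text
; Reads the name and the count from the start of a line.
; On return, the zero flag is set if both items were read.
read_header:
        sub     rsp, 8          ; keep the stack 16-byte aligned for the call
        mov     rdi, [infile]   ; first argument: the FILE pointer
        mov     rsi, headfmt    ; second argument: the format string
        mov     rdx, name       ; address where the string will be stored
        mov     rcx, count      ; address where the integer will be stored
        xor     eax, eax        ; no xmm registers carry arguments
        call    fscanf
        add     rsp, 8
        cmp     eax, 2          ; fscanf returns the number of items stored
        ret
```

After a call to read_header, a jne instruction can branch to the code that handles the end of the file, since fscanf returns a value less than 2 when it cannot read both items.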

I learned the hard way that none of the xmm registers are safe when calling functions. This is another reason to use variables rather than registers for the floating point values in particular: The values of the variables are safe and will not be changed by calling a function.
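
The add-then-divide computation of the average, with the running total kept in memory rather than in a register, can be sketched as a small stand-alone program. The three values and the names vals and fmt are invented for the example; a real solution would get each value from fscanf instead of from an array. It assumes NASM syntax, an entry point named main, and a non-PIE link against the C library.

```nasm
; Sketch: average three doubles, keeping the running sum in memory.
        global  main
        extern  printf

        section .data
vals    dq      1.5, 2.5, 5.0   ; sample values, made up for this sketch
count   dq      3               ; how many values there are
fmt     db      "%g", 10, 0

        section .bss
sum     resq    1

        section .text
main:
        push    rbp             ; also leaves the stack 16-byte aligned
        mov     rbp, rsp

        pxor    xmm0, xmm0
        movsd   [sum], xmm0     ; sum = 0.0
        xor     ecx, ecx        ; rcx = index of the next value
.next:
        movsd   xmm0, [sum]     ; reload the total from memory
        addsd   xmm0, [vals + rcx*8]
        movsd   [sum], xmm0     ; store it back before doing anything else
        inc     rcx
        cmp     rcx, [count]
        jl      .next

        cvtsi2sd xmm1, qword [count]  ; convert the integer count to a double
        movsd   xmm0, [sum]
        divsd   xmm0, xmm1      ; xmm0 = sum / count
        mov     rdi, fmt
        mov     eax, 1          ; one xmm register holds an argument
        call    printf          ; prints 3

        xor     eax, eax        ; return 0 from main
        pop     rbp
        ret
```

Loading and storing sum on every pass looks wasteful here, but it is the pattern that stays correct once a call to fscanf sits between the additions.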

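You can use the %s format for reading and for writing the name. You can use either %d or %ld for integers and either %g or %lg for floating point values, but with fscanf the format has to match the size of the variable: %d stores a 32-bit integer and %g stores a 32-bit floating point value, while %ld and %lg store 64-bit values, which is the size of the resq variables shown above. With fprintf, floating point arguments are always passed as 64-bit values, so %g and %lg behave the same way. There are other formats for floating point values, but the "g" format generally works well.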
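
As an illustration, format strings for the 64-bit variables in the .bss section above could be defined as follows. This is a sketch, not the only possibility; the names headfmt, valfmt, and outfmt are made up for the example.

```nasm
        section .data
headfmt db      "%s %ld", 0         ; fscanf: a name, then a 64-bit integer
valfmt  db      "%lg", 0            ; fscanf: one 64-bit floating point value
outfmt  db      "%s %g", 10, 0      ; fprintf: name, average, end of line
```

Note that NASM does not translate \n inside a double-quoted string, which is why the end-of-line character is written as the separate byte 10.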