Hi,
this posting is about the internal implementation of Pascal files in the Run Time System (RTS). Most Pascal programs are not affected by it, so most programmers can simply ignore it.
The C library has two kinds of file access routines. One kind works on integer file handles (which interface directly to the OS on most platforms); the other uses `FILE *' pointers. The `FILE *' routines do internal buffering and some other things, and are themselves built on integer file handles.
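To make the distinction concrete, here is a minimal C sketch (not RTS code) that writes the same message once through an integer handle and once through a `FILE *'. The function names and the idea of writing a short message are made up for illustration; open(), write(), fopen(), fwrite() and fileno() are the standard POSIX/C routines.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Integer handle: write() hands the data to the OS directly. */
int write_with_handle(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, msg, strlen(msg));
    close(fd);
    /* A short write is treated as failure in this sketch. */
    return n == (ssize_t) strlen(msg) ? 0 : -1;
}

/* FILE *: fwrite() fills a stdio buffer, which is passed on to the
   underlying integer handle when it is flushed or closed. */
int write_with_stream(const char *path, const char *msg)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int handle = fileno(fp);   /* the integer handle under the FILE * */
    size_t n = fwrite(msg, 1, strlen(msg), fp);
    int rc = fclose(fp);       /* flushes the buffer, closes the handle */
    return (handle >= 0 && n == strlen(msg) && rc == 0) ? 0 : -1;
}
```

Both functions return 0 on success and -1 on failure; they can be called with any writable path, e.g. write_with_handle ("demo.txt", "hello\n").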
Pascal files in GPC are currently based on `FILE *' pointers. I want to change them to use integer file handles, because I've found that using `FILE *' has several disadvantages:
- Going through 2 layers (Pascal file -> FILE * and FILE * -> integer handle) is obviously less efficient than going through only 1 (Pascal file -> integer handle). So this change should speed up file operations (though probably not by very much).
- Both FILE * and Pascal files do internal buffering, which is not only inefficient but also makes it hard to control how much data is read ahead. This is important for terminals, pipes etc., and currently involves some kludges to get it approximately right.
- I've found that the semantics of FILE * and those of Pascal files don't really fit together well (e.g. the ways of representing run time errors or end-of-file), so currently a lot of RTS code deals with converting between those semantics.
- By using integer handles, we become less dependent on bugs in routines outside our control. E.g., I just came across a bug (I think) in fwrite() (a FILE * routine) in a special situation, while write() (the corresponding integer routine) works correctly in the same situation.
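The read-ahead point can be seen in a small C sketch (again not RTS code; the function name is made up). It puts six bytes into a pipe, asks stdio for a single character, and then checks how much is still available on the integer handle. With the usual default buffering, stdio has already taken all six bytes, so read() typically reports 0, i.e. end-of-file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Returns how many bytes can still be read from the pipe's integer
   handle after stdio delivered one character, or -1 on failure. */
long bytes_left_after_one_fgetc(void)
{
    int fds[2];
    char rest[16];

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "abcdef", 6) != 6) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);   /* so read() sees end-of-file instead of blocking */

    FILE *fp = fdopen(fds[0], "r");
    if (fp == NULL) {
        close(fds[0]);
        return -1;
    }

    int c = fgetc(fp);                            /* ask for one byte */
    ssize_t n = read(fds[0], rest, sizeof rest);  /* what is left? */
    fclose(fp);                                   /* also closes fds[0] */

    return c == 'a' ? (long) n : -1;
}
```

The program only wanted the `a', but the remaining bytes are now sitting in the FILE * buffer where nothing else reading from the handle can reach them. This is the kind of effect that needs kludges when a Pascal file with its own buffer sits on top of a FILE *.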
Note that this does not mean one should use integer file handles in Pascal programs. It only concerns the way `Text', `File' and `File of foo' are represented internally. The interface to Pascal programs would not change except for the 2 things that currently deal with FILE * (aka CFilePtr):
- AssignCFile and the CFile field of BindingType. I added them only recently, so most of you probably don't even know they exist. ;-) They are needed for some purposes in the RTS and the units, but I could do the same with integer handles.
- The GetFile function, which currently returns the FILE * of a Pascal file. This function would vanish, and instead there would be GetFileHandle, which does what `FileNo (GetFile (f))' does now. Since most uses of GetFile that I'm aware of are in fact combined with FileNo, this change would actually simplify those programs.
A problem would arise only if a program really uses the FILE * directly for anything. However, a solution using fdopen() seems possible.
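As a rough idea of what the fdopen() route could look like on the C side (a sketch under assumptions, not the planned RTS code): given an integer handle, such as the one the proposed GetFileHandle would return, a FILE * can be built on top of it. The function name below is made up; dup() and fdopen() are the standard POSIX routines. The handle is duplicated first so that fclose() does not close the original handle underneath its owner.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Wrap an integer handle in a temporary FILE * and write msg through it.
   Returns 0 on success, -1 on failure. The handle must be open for
   writing, since the mode given to fdopen() has to be compatible. */
int print_via_handle(int handle, const char *msg)
{
    int copy = dup(handle);    /* keep the caller's handle open */
    if (copy < 0)
        return -1;

    FILE *fp = fdopen(copy, "w");
    if (fp == NULL) {
        close(copy);
        return -1;
    }

    int ok = fputs(msg, fp) >= 0;
    if (fclose(fp) != 0)       /* flushes and closes only the copy */
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would pass any handle that is open for writing, e.g. print_via_handle (1, "hello\n") for standard output.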
So, if there are no protests, I'm going to make this change by the next release.
Frank