Dear friends,
I don't know if what I found is a bug or a correct behaviour I don't understand.
The following program halts with error 'attempt ro read past end of file';
program test; var f : file of char; begin rewrite(f,'test.dat'); seek(f,0); f^:='x'; (*GIVES ERROR HERE*) put(f); close(f); end.
If I just insert a call reading eof(f) everything seems to work:
program test; var f : file of char; begin rewrite(f,'test.dat'); seek(f,0); if eof(f) then ; (* do nothing *) f^:='x'; put(f); close(f); end.
I am puzzled!
(working with gpc 20030830)
Ciao, Francesco
Francesco Bonomi wrote:
I don't know if what I found is a bug or a correct behaviour I don't understand.
The following program halts with error 'attempt ro read past end of file';
program test; var f : file of char; begin rewrite(f,'test.dat'); seek(f,0); f^:='x'; (*GIVES ERROR HERE*) put(f); close(f); end.
Read the definitions of file state in the standard. As far as ISO 7185 is concerned, a seek is an extension. At any rate, rewrite has created a new file, positioned it at eof, and no reading is possible. The seek has probably created an anomalous condition in which reading is actually possible, and using lazy i/o has marked the file as having a get pending. Use of f^ says that that get must be implemented, but the file is at eof and cannot do so.
If I just insert a call reading eof(f) everything seems to work:
"seems" is the appropriate word. Read the ISO10206 standard, which adds extensions allowing you to have this sort of random access. At any rate, reading from an empty file should still trigger an error.
Francesco Bonomi wrote:
Dear friends,
I don't know if what I found is a bug or a correct behaviour I don't understand.
The following program halts with error 'attempt ro read past end of file';
program test; var f : file of char; begin rewrite(f,'test.dat'); seek(f,0); f^:='x'; (*GIVES ERROR HERE*) put(f); close(f); end.
I consider it as a bug. The following patch corrects the problem:
--- rts/files.pas.bb 2005-01-30 00:55:45.498296856 +0100 +++ rts/files.pas 2005-01-30 03:40:10.262624728 +0100 @@ -1810,7 +1813,6 @@ if IsReadable (f) and not f^.Status.EOF then begin GetN (f); - CheckReadableNotEOF (f) end else { Buffer cannot be read in. But perhaps someone only wants to
After the patch is applied `fjf490.pas' fails. However, I think that `fjf490.pas' is wrong. At least in standard mode accessing buffer variable when at the end of file is _not_ an error: the postcondition of `get' says that at the end of file value of the file buffer is undefined, and I see no restriction on access to undefined variables.
So, IMHO the following program is correct (and works with the patch):
program fjf490b(Output);
var t: Text; c: Char;
begin Rewrite (t); Reset (t); c := t^; writeln('OK') end.
Waldek Hebisch wrote:
... snip ...
So, IMHO the following program is correct (and works with the patch):
program fjf490b(Output);
var t: Text; c: Char;
begin Rewrite (t); Reset (t);
I disagree. At this point t is open, and using lazy i/o a get is pending.
c := t^;
To implement this the system should perform the actual get. The get fails, because the file is at EOF, signalling "attempt to read past EOF". It has nothing to do with the definedness of t^. Similarly eoln(t) should fail.
writeln('OK')
So this should be "writeln('Failed');
end.
Accessing f^, or executing eoln or any read on t, should call something that checks whether a get is pending, and executes it if so. This is a library issue. The compiler issure is to make sure that the f^ action calls the routine, which should return the pointer. You cannot have lazy i/o and just access the file pointer as another address.
CBFalconer wrote:
Waldek Hebisch wrote:
... snip ...
So, IMHO the following program is correct (and works with the patch):
program fjf490b(Output);
var t: Text; c: Char;
begin Rewrite (t); Reset (t);
I disagree. At this point t is open, and using lazy i/o a get is pending.
You mix the standard with implementation details. Standard suggests, but does not requre lazy i/o. Basic criterion for correctness of lazy i/o is that programs working without lazy i/o should still work. So, from this point `t^' should be available to the programmer (read the postcondition of `Reset'). The standard says that content of `t^' is undefined (since t is at eof), which in plain language means that the implementation should put some garbage into it. AFAICS if you do not want to look at the garbage, you need explicitly test for eof.
c := t^;
To implement this the system should perform the actual get. The get fails, because the file is at EOF, signalling "attempt to read past EOF". It has nothing to do with the definedness of t^. Similarly eoln(t) should fail.
Note that accessing `t^' is logically quite different from `get'. Remember that `t^' references a record _after_ current file position. At this point `get' would attempt to move file position past the end of file and access _second_ record past the end of file, which indeed is an error. `eoln' is different.
Accessing f^, or executing eoln or any read on t, should call something that checks whether a get is pending, and executes it if so. This is a library issue. The compiler issure is to make sure that the f^ action calls the routine, which should return the pointer. You cannot have lazy i/o and just access the file pointer as another address.
All the magic has to be in the compiler and the runtime. The user can treat the file buffer almost like another variable.
Waldek Hebisch wrote:
CBFalconer wrote:
Waldek Hebisch wrote:
... snip ...
So, IMHO the following program is correct (and works with the patch):
program fjf490b(Output);
var t: Text; c: Char;
begin Rewrite (t); Reset (t);
I disagree. At this point t is open, and using lazy i/o a get is pending.
You mix the standard with implementation details. Standard suggests, but does not requre lazy i/o. Basic criterion for correctness of lazy i/o is that programs working without lazy i/o should still work. So, from this point `t^' should be available to the programmer (read the postcondition of `Reset'). The standard says that content of `t^' is undefined (since t is at eof), which in plain language means that the implementation should put some garbage into it. AFAICS if you do not want to look at the garbage, you need explicitly test for eof.
c := t^;
To implement this the system should perform the actual get. The get fails, because the file is at EOF, signalling "attempt to read past EOF". It has nothing to do with the definedness of t^. Similarly eoln(t) should fail.
Note that accessing `t^' is logically quite different from `get'. Remember that `t^' references a record _after_ current file position. At this point `get' would attempt to move file position past the end of file and access _second_ record past the end of file, which indeed is an error. `eoln' is different.
Without lazyio the access to t^ should check for eof, and error out if it is set. Again, the error is "read past eof". That way there is no need for a unique invalid value in whatever actually holds the value f^. This keeps both methods agreeing with standards.
Accessing f^, or executing eoln or any read on t, should call something that checks whether a get is pending, and executes it if so. This is a library issue. The compiler issure is to make sure that the f^ action calls the routine, which should return the pointer. You cannot have lazy i/o and just access the file pointer as another address.
All the magic has to be in the compiler and the runtime. The user can treat the file buffer almost like another variable.
I don't think we really disagree. The key point is that the access to f^ has to be protected by checking EOF. BTW, eof also has to check for a pending get, and this gets tricky because it musn't execute such if it is already set. Auto final eoln insertion can be done by saying that if eof is true then eoln is true. The point is to avoid any logical differences between a lazyio system and a pure batch system.
Tha anomaly arrives when the user checks eof or eoln, and these can unexpectedly force an actual get. So a loop of the form:
WHILE NOT eoln DO BEGIN prompt('with something:'); read(values); IF eoln readln; END;
is likely to do its prompting at the wrong moment because the final get in the expansion of readln has set 'getpending' The cure is to leave the line incompletely read within the loop:
WHILE NOT eof DO BEGIN prompt('with something); WHILE NOT eoln DO BEGIN read(values); process(values); END; linendingstuff; readln; END;
but this is the sort of i/o considerations that seem to drive the C programmers out of their skulls.
Waldek Hebisch wrote:
After the patch is applied `fjf490.pas' fails. However, I think that `fjf490.pas' is wrong. At least in standard mode accessing buffer variable when at the end of file is _not_ an error: the postcondition of `get' says that at the end of file value of the file buffer is undefined, and I see no restriction on access to undefined variables.
What about this (6.8.1: Expressions - General):
: When a primary is used, it shall be an error if the variable : denoted by a variableÂaccess of the primary is undefined. : : primary > variableÂaccess
Of course, we could say that we currently don't check for undefined access in general (as a processor is permitted to leave errors undetected), so why should we bother here?
Frank
Frank Heckenbach wrote:
Waldek Hebisch wrote:
After the patch is applied `fjf490.pas' fails. However, I think that `fjf490.pas' is wrong. At least in standard mode accessing buffer variable when at the end of file is _not_ an error: the postcondition of `get' says that at the end of file value of the file buffer is undefined, and I see no restriction on access to undefined variables.
What about this (6.8.1: Expressions - General):
: When a primary is used, it shall be an error if the variable : denoted by a variable?access of the primary is undefined. : : primary > variable?access
Yes, we may raise error in `fjf490.pas'.
Of course, we could say that we currently don't check for undefined access in general (as a processor is permitted to leave errors undetected), so why should we bother here?
Worse, ATM we fail correct programs (the original report was about a program _writing_ to the buffer variable). AFAICS to properly distinguish read form writes we have to do that in the core compiler. It seems that we could do that using apropriate tree code and delaying expansion of the reference. The question is how much do we care.
Waldek Hebisch wrote:
Frank Heckenbach wrote:
Waldek Hebisch wrote:
After the patch is applied `fjf490.pas' fails. However, I think that `fjf490.pas' is wrong. At least in standard mode accessing buffer variable when at the end of file is _not_ an error: the postcondition of `get' says that at the end of file value of the file buffer is undefined, and I see no restriction on access to undefined variables.
What about this (6.8.1: Expressions - General):
: When a primary is used, it shall be an error if the variable : denoted by a variable?access of the primary is undefined. : : primary > variable?access
Yes, we may raise error in `fjf490.pas'.
Of course, we could say that we currently don't check for undefined access in general (as a processor is permitted to leave errors undetected), so why should we bother here?
Worse, ATM we fail correct programs (the original report was about a program _writing_ to the buffer variable). AFAICS to properly distinguish read form writes we have to do that in the core compiler. It seems that we could do that using apropriate tree code and delaying expansion of the reference.
Either that (though it may be impossible to check at compile-time when used as a `var' parameter), or do it at runtime by marking the buffer undefined and give an error on reading. For general undefined-checks (with normal variables) we'd probably have to do this anyway (which seems to be a huge effort, both in the compiler and at runtime, nothing I plan to do anytime soon).
The question is how much do we care.
Probably not too much. So I'm applying your change and changing fjf490.pas, but adding a comment there (for possibly future changes in this area).
Frank