Preben Mikael Bohn wrote:
Hi all, I have a program where I need to iterate through a binary file many times, and where each binary data needs to be processed each time. I use something like:
read(f, data); { process data } seek(f, filepos(f)-1); write(f, data);
My program seems to spend a lot of time on file I/O, so just for fun I tried removing the seek (of course the program now gives crap, but anyway :-)). This sped it up by a factor of 4-5!
Now, I *could* just use two files and swap these once per iteration, but since I have a lot of these files this would really be a mess. Can anyone tell me why it's so slow, and what may be done to speed it up? Isn't Linux also supposed to have a very efficient file-caching system, so the above shouldn't really be a problem?
GPC does some caching, but not in the presence of seeks (this is possible, but more complicated to implement, especially when writes are also involved).
Linux's efficiency doesn't matter so much here, since the mere need for three system calls (read, seek, write) per data item is quite an overhead.
I really suggest using two files (or a memory buffer if space permits) here. Swapping them doesn't seem too hard (either reset/rewrite them to different file names each turn, or use pointers to the file variables and swap them).
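The pointer-swapping variant might look roughly like the following. This is only an untested sketch: the element type, the file names 'data.a' and 'data.b', the pass count and all identifiers are made up for illustration, and it assumes 'data.a' already exists and holds the input.

```pascal
program TwoFileSwap;

{ Sketch only: TData, the file names and NumPasses are assumptions. }

const
  NumPasses = 10;           { hypothetical number of iterations }

type
  TData = Integer;          { placeholder for the real element type }
  TDataFile = file of TData;
  PDataFile = ^TDataFile;

var
  FileA, FileB: TDataFile;
  Src, Dst, Tmp: PDataFile;
  Data: TData;
  Pass: Integer;

begin
  Assign(FileA, 'data.a');  { assumed to exist and hold the input }
  Assign(FileB, 'data.b');
  Src := @FileA;
  Dst := @FileB;
  for Pass := 1 to NumPasses do
  begin
    Reset(Src^);            { read the previous pass's output ... }
    Rewrite(Dst^);          { ... and write a fresh file }
    while not EOF(Src^) do
    begin
      Read(Src^, Data);
      { process Data }
      Write(Dst^, Data)
    end;
    Close(Src^);
    Close(Dst^);
    { swap roles: this pass's output is the next pass's input }
    Tmp := Src;
    Src := Dst;
    Dst := Tmp
  end
end.
```

Both files are now accessed strictly sequentially, with no seek in the loop. After the loop, Src^ is the file holding the most recent results (with an odd number of passes that is 'data.b').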
(I'm writing this without having tried anything. If you send a demo program, I might try it, but I don't expect very different results.)
Frank