Hi all, I have a program that needs to iterate through a binary file many times, processing each data item on every pass. I use something like:
read(f, data); { process data } seek(f, filepos(f)-1); write(f, data);
My program seems to spend a lot of time on file I/O, so just for fun I tried removing the seek (of course the program now gives crap, but anyway :-)). This sped it up by a factor of 4-5!
Now, I *could* just use two files and swap them once per iteration, but since I have a lot of these files this would really be a mess. Can anyone tell me why it's so slow, and what can be done to speed it up? Isn't Linux also supposed to have a very efficient file-caching system, so the above shouldn't really be a problem?
Best regards
Preben Bohn
Preben Mikael Bohn wrote:
> Hi all, I have a program that needs to iterate through a binary file many times, processing each data item on every pass. I use something like:
> read(f, data); { process data } seek(f, filepos(f)-1); write(f, data);
> My program seems to spend a lot of time on file I/O, so just for fun I tried removing the seek (of course the program now gives crap, but anyway :-)). This sped it up by a factor of 4-5!
> Now, I *could* just use two files and swap them once per iteration, but since I have a lot of these files this would really be a mess. Can anyone tell me why it's so slow, and what can be done to speed it up? Isn't Linux also supposed to have a very efficient file-caching system, so the above shouldn't really be a problem?
GPC does some caching, but not in the presence of seeks (this is possible, but more complicated to implement, especially when writes are also involved).
Linux's efficiency doesn't matter so much here, since the mere need for three system calls (read, seek, write) for every item is quite an overhead.
I really suggest using two files (or a memory buffer if space permits) here. Swapping them doesn't seem too hard (either reset/rewrite them to different file names each turn, or use pointers to the file variables and swap them).
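To make the pointer-swapping variant concrete, here is a minimal, untested sketch. The element type `TData`, the file names `data.a` and `data.b`, the pass count, and the doubling that stands in for "process data" are all assumptions made up for illustration; it also assumes `Assign` and the `@` operator, which GPC accepts as Borland extensions. Each pass then only reads sequentially from one file and writes sequentially to the other, with no seeks.

```pascal
program TwoFileSwap;

type
  TData = Integer;              { hypothetical element type }
  TDataFile = file of TData;
  PDataFile = ^TDataFile;

const
  Passes = 3;                   { hypothetical number of passes }

var
  FileA, FileB: TDataFile;
  Src, Dst, Tmp: PDataFile;
  Data: TData;
  i, Pass: Integer;

begin
  { Hypothetical file names. }
  Assign(FileA, 'data.a');
  Assign(FileB, 'data.b');

  { Create a small input file so the sketch is self-contained. }
  Rewrite(FileA);
  for i := 1 to 5 do
  begin
    Data := i;
    Write(FileA, Data);
  end;
  Close(FileA);

  Src := @FileA;
  Dst := @FileB;

  for Pass := 1 to Passes do
  begin
    Reset(Src^);                { read the previous results sequentially }
    Rewrite(Dst^);              { write the new results sequentially }
    while not EOF(Src^) do
    begin
      Read(Src^, Data);
      Data := Data * 2;         { stand-in for "process data" }
      Write(Dst^, Data);
    end;
    Close(Src^);
    Close(Dst^);

    { Swap roles: this pass's output is the next pass's input. }
    Tmp := Src;
    Src := Dst;
    Dst := Tmp;
  end;

  { After the last swap, Src^ refers to the file holding the latest results. }
  Reset(Src^);
  while not EOF(Src^) do
  begin
    Read(Src^, Data);
    WriteLn(Data);
  end;
  Close(Src^);
end.
```

With many data files, the same loop could be wrapped in a procedure that takes the two file names, so the bookkeeping stays in one place.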
(I'm writing this without trying anything. If you'd send a demo program, I might try it, but I don't expect much different results.)
Frank
On Fri, 23 Nov 2001, Preben Mikael Bohn wrote:
> Hi all, I have a program that needs to iterate through a binary file many times, processing each data item on every pass.
> [..]
> My program seems to spend a lot of time on file I/O,
> [..]
> Can anyone tell me why it's so slow, and what can be done to speed it up? Isn't Linux also supposed to have a very efficient file-caching system, so the above shouldn't really be a problem?
While the program is running, check (from another window) whether swap is being used. There is a known speed problem with the VM that was fixed in kernel 2.4.14.
Russ