Adriaan suggested I use --propagate-units to improve my compile speed.
I tried that, but given I use --uses=GPCMacOSAll, where GPCMacOSAll is all the Mac OS X system interfaces in a single unit, and it compiles to a gpi of 27Meg, that would make each of my gpi's 27Meg (perhaps worse if it ends up being included multiple times). But regardless, it didn't work because a simple compile of a trivial unit ended up taking multiple minutes itself, so something is not happy.
I'll have to see whether I can add explicit system units to the uses clauses of my units so that I can use --propagate-units and see whether that improves things, but I haven't done that yet.
Then I tried turning off one of my processors, reverting back to the original gp so it compiles one unit at a time, and using the Mac OS X Shark tool to profile the system while compiling some units.
It's not clear from the results what percentage of the time is actually being spent inside gpc1, but the report on a gpc1 process is interesting (attached below). It shows 50+% in import_interface. This matches what Adriaan saw in the progress messages (that the progress messages spent a long time in the lines around the uses clause).
Enjoy, Peter.
# Time Profile of Everything
SharkProfileViewer
# Generated from the visible portion of the outline view
+ 73.0% start (gpc1)
| + 73.0% _start (gpc1)
| | + 72.0% toplev_main (gpc1)
| | | + 70.2% main_yyparse (gpc1)
| | | : + 70.1% yyuserAction (gpc1)
| | | : | + 54.3% do_extra_import (gpc1)
| | | : | | + 54.3% import_interface (gpc1)
| | | : | | | + 41.7% load_gpi_file (gpc1)
| | | : | | | : + 27.4% load_node (gpc1)
| | | : | | | : | - 6.4% get_identifier (gpc1)
| | | : | | | : | - 4.7% load_string (gpc1)
| | | : | | | : | - 4.7% mread1 (gpc1)
| | | : | | | : | - 3.1% set_identifier_spelling (gpc1)
| | | : | | | : | - 1.8% make_node (gpc1)
| | | : | | | : | - 1.4% free (libSystem.B.dylib)
| | | : | | | : |   1.1% szone_free (libSystem.B.dylib)
| | | : | | | : |   0.3% mseek (gpc1)
| | | : | | | : |   0.2% ggc_alloc (gpc1)
| | | : | | | : |   0.1% itab_store_node (gpc1)
| | | : | | | : | - 0.1% ht_lookup (gpc1)
| | | : | | | : | - 0.1% build_decl (gpc1)
| | | : | | | : |   0.1% sort_fields (gpc1)
| | | : | | | : |   0.1% dyld_stub_free (gpc1)
| | | : | | | : |   0.1% allocate_decl_lang_specific (gpc1)
| | | : | | | :   11.8% compute_checksum (gpc1)
| | | : | | | : - 1.5% gpi_open (gpc1)
| | | : | | | : - 0.1% mread1 (gpc1)
| | | : | | | - 12.4% import_node (gpc1)
| | | : | - 13.2% finish_routine (gpc1)
| | | : | - 2.0% finalize_module (gpc1)
| | | : | - 0.3% import_interface (gpc1)
| | | : | - 0.1% build_predef_call (gpc1)
| | | : | - 0.1% start_unit_implementation (gpc1)
| | | : - 0.2% yylex (gpc1)
| | | - 1.5% write_global_declarations (gpc1)
| | | - 0.1% init_regs (gpc1)
| | | - 0.1% yyparse (gpc1)
| | | - 0.1% lang_init_3_4 (gpc1)
| | | - 0.1% init_emit_once (gpc1)
| |   0.8% write_global_declarations (gpc1)
| |   0.1% recog_12 (gpc1)
| |   0.1% init_regs (gpc1)
| | - 0.1% _call_mod_init_funcs (gpc1)
- 15.9% thandler (mach_kernel)
- 10.7% shandler (mach_kernel)
- 0.3% unix_syscall (mach_kernel)
- 0.1% thread_continue (mach_kernel)
- 0.1% _dyld_start (dyld)
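The percentages in the report are fractions of all samples on the system, not of gpc1 alone. As a rough sketch of how to rescale them, the Python snippet below takes a few lines copied from the report and expresses each routine as a share of the samples under start (gpc1). It assumes that the 73.0% under start is all of gpc1's user-space time and that none of the mach_kernel handler time belongs to gpc1, which the report does not confirm. The names PROFILE, parse and share_of are made up for this illustration.

```python
import re

# A few lines copied from the Shark report (percentages are of ALL samples).
PROFILE = """\
+ 73.0% start (gpc1)
| | | : | | + 54.3% import_interface (gpc1)
| | | : | | | + 41.7% load_gpi_file (gpc1)
| | | : | | | : + 27.4% load_node (gpc1)
| | | : | | | :   11.8% compute_checksum (gpc1)
| | | : | | | - 12.4% import_node (gpc1)
| | | : | - 13.2% finish_routine (gpc1)
"""

# Matches "<percent>% <symbol> (<module>)" anywhere in a report line.
LINE = re.compile(r"(\d+(?:\.\d+)?)% (\S+) \((\S+)\)")


def parse(text):
    """Map each symbol to the first percentage reported for it."""
    totals = {}
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            totals.setdefault(match.group(2), float(match.group(1)))
    return totals


def share_of(totals, symbol, root="start"):
    """Percentage of the root's samples attributed to `symbol`.

    Assumption: the root entry covers all of the process's own time.
    """
    return 100.0 * totals[symbol] / totals[root]


totals = parse(PROFILE)
for name in ("import_interface", "load_gpi_file", "load_node",
             "compute_checksum", "finish_routine"):
    print(f"{name}: {share_of(totals, name):.1f}% of gpc1 samples")
```

Under those assumptions, import_interface accounts for roughly 74% of the gpc1 samples (54.3 out of 73.0), with load_gpi_file alone at about 57%.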
Peter N Lewis wrote:
Adriaan suggested I use --propagate-units to improve my compile speed.
More precisely, I was referring to an earlier message on the mailing list, http://www.gnu-pascal.de/crystal/gpc/en/mail11158.html, which describes three steps:
(a) activate --propagate-units (this copies .gpi data of used units into the .gpi file of the compiled unit)
(b) remove all USES clause entries for units that are no longer needed (because they are not used at all or because of the --propagate-units flag)
(c) use gp instead of gpc (gp works very well and doesn't have the bugs of --automake)
Step (b) is essential: each unit should be imported *only once* in the unit chain.
I tried that, but given I use --uses=GPCMacOSAll, where GPCMacOSAll is all the Mac OS X system interfaces in a single unit, and it compiles to a gpi of 27Meg, that would make each of my gpi's 27Meg (perhaps worse if it ends up being included multiple times). But regardless, it didn't work because a simple compile of a trivial unit ended up taking multiple minutes itself, so something is not happy.
Using --propagate-units on its own doesn't improve compile speed, because it increases the size of the .gpi files.
I'll have to see whether I can add explicit system units to the uses clauses of my units so that I can use --propagate-units and see whether that improves things, but I haven't done that yet.
That is not necessary, as the idea is to import units only once in the unit chain. And for the Mac interfaces, you can set up a MyMacOS master unit (like the MacOS.pas unit) that includes only those units that you actually use. This decreases the size of the .gpi file compared to GPCMacOSAll.pas. See the original message at http://www.gnu-pascal.de/crystal/gpc/en/mail11158.html.
Then I tried turning off one of my processors, reverting back to the original gp so it compiles one unit at a time, and using the Mac OS X Shark tool to profile the system while compiling some units.
It's not clear from the results what percentage of the time is actually being spent inside gpc1, but the report on a gpc1 process is interesting (attached below). It shows 50+% in import_interface. This matches what Adriaan saw in the progress messages (that the progress messages spent a long time in the lines around the uses clause).
Ah, that's interesting, because it indicates that a lot of speed can be gained there!
Regards,
Adriaan van Os