Re: GPC performance kludge with local

8 Jun 2003


      Frank Heckenbach wrote:
...
Mirsad Todorovac wrote:
...
You might think that the function is well optimized, since it requires
only two comparisons and a table lookup per character checked?
Alas, GPC makes a real call to memcpy() to copy the complete
``v'' array on each call of function DigitValue() !!!
Normal local variables are (by definition!) created each time the
routine is called. To avoid this, give them the `static' attribute
(non-standard), or declare them as `const' (non-standard, BP). The
latter is obviously preferable since the array is really constant.
BTW, since you only access the array in the range '0' .. 'z', you
only need to declare this part and can omit most of the `-1'
entries. Also, char indices are perfectly all right, so you don't have
to use `Ord' here.
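A minimal sketch of the typed-constant approach described above, assuming an ASCII-ordered character set (the table layout depends on it). The name `V`, the explicit range check, and the test calls in the main block are illustrative and not taken from the original code.

```pascal
program DigitTable;

const
  { Typed constant (BP extension): set up once, not copied on every call.
    Indexed directly by Char over '0' .. 'z' only; ASCII order is assumed. }
  V: array ['0' .. 'z'] of Integer = (
     0,  1,  2,  3,  4,  5,  6,  7,  8,  9,              { '0' .. '9' }
    -1, -1, -1, -1, -1, -1, -1,                          { ':' .. '@' }
    10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,  { 'A' .. 'M' }
    23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,  { 'N' .. 'Z' }
    -1, -1, -1, -1, -1, -1,                              { '[' .. '`' }
    10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,  { 'a' .. 'm' }
    23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35); { 'n' .. 'z' }

{ Two comparisons and one table lookup per character. }
function DigitValue (Dig: Char): Integer;
begin
  if (Dig >= '0') and (Dig <= 'z') then
    DigitValue := V[Dig]
  else
    DigitValue := -1
end;

begin
  { These calls should report the values 7, 15, 35 and -1. }
  WriteLn (DigitValue ('7'));
  WriteLn (DigitValue ('f'));
  WriteLn (DigitValue ('Z'));
  WriteLn (DigitValue ('@'))
end.
```

Because the table is a constant rather than an initialized local variable, there is nothing to re-create on each call, which is the point of the advice above.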
...
Just FYI, making ``v'' an array [0..255] of Integer (for aligned access) made
it even 10s slower (probably problems with FSB and cache), contrary to what
is commonly said,
Probably because of the initialization (see above) which takes 4 or
8 times as long then, of course (and which takes most of the time overall).
...
and the complete code is not a bit faster than this variant:
function DigitValue (Dig: Char): Integer; attribute (inline, const);
  var d : Integer; attribute (register);
  begin
    if      (Dig >= '0') and (Dig <= '9' ) then
      DigitValue  := Ord (Dig) - Ord ('0')
    else if (Dig >= 'a') and (Dig <= 'z') then
      DigitValue  := Ord (Dig) - Ord ('a') + 10
    else if (Dig >= 'A') and (Dig <= 'Z') then
      DigitValue  := Ord (Dig) - Ord ('A') + 10
    else
      DigitValue := -1
  end;
... even though this code has six branches.
Only 3 branches (the backend optimized better than you think --
unfortunately in this case only with `{$B+}', since Boolean
shortcuts require special handling) which makes the array above look
rather questionable ...
My view:
PROGRAM try(output);
CONST
      digits     = ['0'..'9'];
      upcase     = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 
                    'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R',
                    'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'];
      lowcase    = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 
                    'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r',
                    's', 't', 'u', 'v', 'w', 'x', 'y', 'z'];
      ordmaxchar = 255;  (* not guaranteed immutable *)
(* and so far I am char set independent - EBCDIC is fine *)
TYPE
      base       = 2..36;
VAR
      digval     : ARRAY[char] OF integer;
      alfamerics : SET OF char;
(* 1--------------1 *)
PROCEDURE initdigval;
VAR c : char;
BEGIN (* initdigval *)
      alfamerics := digits + upcase + lowcase;
      FOR c := chr(0) TO chr(ordmaxchar) DO BEGIN
         digval[c] := -1;
         IF c IN digits THEN digval[c] := ord(c) - ord('0');
         END;
      digval['a'] := 10;      digval['A'] := 10;
      digval['b'] := 11;      digval['B'] := 11;
      digval['c'] := 12;      digval['C'] := 12;
      (* I'm tired -- you get the idea *)
      END; (* initdigval *)
(* 1--------------1 *)
(* Now we are forced to get into 'what is a string'
    * Feel free to adjust to other possibilities 
    * My attitude is that a number can be a substring 
    * and that an invalid char signifies the end of 
    * that substring.  So this routine doesn't deserve
    * existence, the process should extract a number
    * from a string, and indicate the end of the substr.
    *)
FUNCTION isvalidnumber(VAR s : string; b : base) : boolean;
VAR
        cv : integer;
BEGIN (* isvalidnumber *)
     cv := digval[s[1]];
     isvalidnumber := (cv < b) AND (cv >= 0);
     (* this statement may indicate that -1 is a poor choice *)
     (* Maybe the default should be MAXINT in initdigval     *)
     END; (* isvalidnumber *)
and I won't tire you with further code.  However, portability
should IMO include proper adaptation to char sets, which the above
handles.  Simple variants can handle languages with other
characters, other casing rules, etc.  For number conversion we
need not make any such allowances, but we do need to spell things
out.
There should be no need for assert statements, with properly
restricted subranges, as in the type definition of base above.
Note that the above table can be used in an EBCDIC machine to
translate alfamerics to ASCII.
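As a rough sketch of the "extract a number and indicate the end of the substring" idea, the following fills the whole table without assuming that letters are contiguous in the character set, then scans digits until the first invalid character. The names `ScanNumber`, `Next`, `Lower`, `Upper` and `EndPos` are hypothetical, `Low`/`High` on `Char` and `const` parameters are assumed to be available (they are common extensions), and overflow is deliberately not checked.

```pascal
program ExtractNumber;

const
  { Letters spelled out, so no contiguity of 'a' .. 'z' is assumed. }
  Lower = 'abcdefghijklmnopqrstuvwxyz';
  Upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';

type
  Base = 2 .. 36;

var
  DigVal: array [Char] of Integer;
  EndPos: Integer;

{ Build the lookup table once, at program start. }
procedure InitDigVal;
var
  c: Char;
  i: Integer;
begin
  for c := Low (Char) to High (Char) do
    DigVal[c] := -1;
  for c := '0' to '9' do              { digits are contiguous in Pascal }
    DigVal[c] := Ord (c) - Ord ('0');
  for i := 1 to 26 do
  begin
    DigVal[Lower[i]] := 9 + i;
    DigVal[Upper[i]] := 9 + i
  end
end;

{ Hypothetical helper: read base-b digits of s beginning at Start.
  Returns the value read; Next is the index of the first character
  that was not consumed (Length (s) + 1 if the string was exhausted). }
function ScanNumber (const s: String; Start: Integer; b: Base;
                     var Next: Integer): LongInt;
var
  Value: LongInt;
  cv: Integer;
  Done: Boolean;
begin
  Value := 0;
  Next := Start;
  Done := Next > Length (s);
  while not Done do
  begin
    cv := DigVal[s[Next]];
    if (cv >= 0) and (cv < b) then
    begin
      Value := Value * b + cv;
      Next := Next + 1;
      Done := Next > Length (s)
    end
    else
      Done := True                    { invalid char ends the number }
  end;
  ScanNumber := Value
end;

begin
  InitDigVal;
  { For 'ff+1' in base 16 this should give the value 255 and EndPos = 3. }
  WriteLn (ScanNumber ('ff+1', 1, 16, EndPos));
  WriteLn (EndPos)
end.
```

If `Next` equals `Start` on return, no valid digit was found, which covers the case the `isvalidnumber` function above was testing for.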
aside: Frank, I am still mulling a reply to your previous note on
initialization - I think you misunderstand my attitude.
-- 
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
   http://cbfalconer.home.att.net  USE worldnet address!
