j.logsdon@lancaster.ac.uk wrote:
Perhaps an odd request to come from the English-speaking world but I am processing some text that comes from certain European countries which can include accents. Is there a default set of characters that includes such characters that occur from time to time in non-English languages? Particularly if it is also case-specific.
i.e. if c in ['A'..'Z'] ... will throw a wobbly if presented with umlauts or grave accents, but I want to have either (a) an automatic translation into the unaccented character (it is just used for storing) or (b) an extended set of (say) upper-case accented characters. Even just the accented characters themselves would do, since I could obviously extend the basic set.
After all, UpCase and LoCase recognise accents ...
There are functions IsAlphaNum (checks whether a character is alphanumeric) etc. in gpc.pas for this purpose. I just noticed that IsAlpha is missing. I'll add it soon (if you don't want to upgrade GPC, you can, of course, use IsAlphaNum and make sure the character is not a digit).
These functions are locale-dependent, i.e. they work according to the language specified in the environment variable LANG or LC_CTYPE (just like UpCase and LoCase).
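A rough sketch of the IsAlphaNum workaround, untested here. IsLetter is only a name made up for illustration (it is not part of gpc.pas), and I'm assuming IsAlphaNum takes a Char and returns a Boolean, as the description above suggests:

```pascal
program LetterCheck;

uses GPC;  { provides IsAlphaNum }

{ Hypothetical helper, not part of gpc.pas: True if ch is a letter
  in the current locale, i.e. alphanumeric but not a digit. }
function IsLetter (ch: Char): Boolean;
begin
  IsLetter := IsAlphaNum (ch) and not (ch in ['0' .. '9'])
end;

var
  c: Char;

begin
  { Read one character and classify it; the result for accented
    characters depends on LANG or LC_CTYPE. }
  Read (c);
  if IsLetter (c) then
    WriteLn ('letter')
  else
    WriteLn ('not a letter')
end.
```

With this, a test such as "if c in ['A'..'Z']" could be replaced by "if IsLetter (c)", or by "if IsLetter (c) and (UpCase (c) = c)" if only upper-case letters are wanted. Whether an accented character is accepted still depends on the locale and on the single-byte character set in use.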
Frank