Re: Strings and standard Pascal

14 Mar 2006


      At 13:16 +0100 13/3/06, Frank Heckenbach wrote:
...
Peter N Lewis wrote:
...
There may not be any point in supporting Unicode any further.  From
I don't agree. It's not only Length (which is defined by the
(mis-)using them as UTF-8 bytes, but then you're on your own if
Length, SubStr/Copy, Index/Pos etc. behave strangely.)
Actually, with UTF-8, there are rarely any issues with Length, SubStr, 
Copy, Index, or Pos.
With UTF-8:
* Assuming valid UTF-8 strings, Pos will never mis-match.
* Length returns the "size" of the string.  Given UTF-8, there must 
be two different functions, one to return the size of the string in 
bytes and one to return it in chars - which one you call "Length" is 
personal preference.
* Searching for an ASCII character will always work as expected.
* SubStr/Copy require valid indexes and length, but the result will 
be explicitly either correct, or an invalid UTF-8 string.
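The points above can be sketched in a few lines. This is an illustration in Python rather than Pascal, working on raw bytes so that the operations correspond to byte-indexed Length, Pos and Copy. The helper names (byte_length, char_count, is_valid_utf8) are hypothetical and not part of any RTS. The reason a match can never start in the middle of a character is that UTF-8 lead bytes and continuation bytes use disjoint bit patterns.

```python
def byte_length(s: bytes) -> int:
    """Size of the string in bytes (what a byte-based Length returns)."""
    return len(s)


def char_count(s: bytes) -> int:
    """Size in code points: count every byte that is not a
    continuation byte (bit pattern 10xxxxxx). Assumes valid UTF-8."""
    return sum(1 for b in s if (b & 0xC0) != 0x80)


def is_valid_utf8(s: bytes) -> bool:
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        s.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "naïve café".encode("utf-8")

# Two different "lengths" for the same string.
n_bytes = byte_length(text)
n_chars = char_count(text)

# Byte-level search (the analogue of Pos) for an ASCII character
# and for a multi-byte character; offsets are 0-based byte offsets.
pos_ascii = text.find(b"e")
pos_accent = text.find("é".encode("utf-8"))

# Byte-level slicing (the analogue of Copy): a cut on a character
# boundary is valid UTF-8, a cut inside a character is not.
good_cut = text[0:4]
bad_cut = text[0:3]
```

The search for the ASCII "e" finds the real "e" in "naïve" and does not match inside the encoded "é", since every byte of a multi-byte character has its high bit set.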
For example, if you have a search string, a replace string, and a 
source string, the exact same code using Pos and Copy will work for 
ASCII and for UTF-8, assuming all the strings are valid ASCII or 
valid UTF-8 respectively.
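A minimal sketch of that search-and-replace case, again in Python on raw bytes as a stand-in for byte-indexed Pascal strings. The function name replace_all is hypothetical; find plays the role of Pos and slicing plays the role of Copy. The code never inspects the encoding, which is the point: it only assumes all three inputs are valid in the same encoding.

```python
def replace_all(source: bytes, search: bytes, replacement: bytes) -> bytes:
    """Replace every occurrence of search in source using only
    byte-level find (Pos) and slicing (Copy)."""
    if not search:
        return source  # nothing to search for; avoid an endless loop
    pieces = []
    start = 0
    while True:
        pos = source.find(search, start)
        if pos < 0:
            break
        pieces.append(source[start:pos])  # text before the match
        pieces.append(replacement)
        start = pos + len(search)         # continue after the match
    pieces.append(source[start:])         # remaining tail
    return b"".join(pieces)


# Same code, ASCII input.
ascii_result = replace_all(b"one two one", b"one", b"1")

# Same code, UTF-8 input.
utf8_source = "über größe über".encode("utf-8")
utf8_result = replace_all(
    utf8_source, "über".encode("utf-8"), "unter".encode("utf-8")
)
```

Because matches in valid UTF-8 always begin and end on character boundaries, the result is valid UTF-8 whenever the inputs are.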
Handling case insensitivity is more entertaining of course, but then 
it's already rarely handled well even with just ISO-8859-1.
Anyway, if someone thinks Unicode32 is worth implementing in the RTS, 
go for it, I'd just suggest that it's becoming less and less relevant.
Enjoy,
    Peter.
-- 
http://www.stairways.com/  http://download.stairways.com/
