Question about string type

Pascal Viandier

25 May 2005

7:28 p.m.

Hi,

I am porting SUN Pascal code to GNU Pascal. When I first read the GNU Pascal documentation (the PDF manual), I was pleased to see that GNU Pascal strings are in fact records holding the length and the string text. This maps perfectly onto the SUN Pascal VARYING[..] OF CHAR we use, since that is a record with 4 bytes for the length and an array of char. So, no porting problem...

But now I see in gdb that a GNU Pascal variable of type STRING(20) no longer matches the documentation: there are two fields of type Cardinal holding the capacity and the actual length of the string, plus the array of char. Since the low-level parts of our applications are written in C, many of them receive Pascal strings as parameters and map them onto C structs. They rely on the SUN Pascal layout to find the length and the text. Is there a way (a compiler option?) to make the STRING type map as explained in the documentation, or will I have to modify all the C routines (a big task)?

Pascal

Prof A Olowofoyeku (The African Chief)

25 May

8:10 p.m.

It is always dangerous to rely on the internal representation of strings. What happens when it changes again? (and I am assuming that it will change when GPC gets ansistrings, UCSD style strings, and such). Be that as it may, you may wish to declare a Pascal data type that maps into your C struct, and then have a Pascal routine that converts a GPC string into that data type, ready for passing to the C routines. When the internal structure of GPC strings changes, all you will need to amend is your Pascal conversion routine.
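A minimal sketch of what the fixed interchange type might look like from the C side, assuming the layout described in the question (a 4-byte length followed by the characters). The names `SunVarying20`, `sun_varying_set` and `sun_varying_to_cstr` are hypothetical, and the signedness of the length field is an assumption. In the scheme suggested above, the filling of the record would be done by the Pascal conversion routine; `sun_varying_set` only stands in for it here so that the sketch is self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SUN_CAPACITY 20

/* Hypothetical C view of VARYING [20] OF CHAR as described in the question:
   a 4-byte length followed by the characters (assumed signed here). */
typedef struct {
    int32_t length;               /* current length, 0..SUN_CAPACITY */
    char    text[SUN_CAPACITY];   /* characters, not NUL-terminated */
} SunVarying20;

/* Stand-in for the Pascal conversion routine: fills the interchange
   record from n characters, truncating to the capacity if necessary. */
static void sun_varying_set(SunVarying20 *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (n > SUN_CAPACITY)
        n = SUN_CAPACITY;
    dst->length = (int32_t)n;
    memcpy(dst->text, src, n);
}

/* Example of an existing low-level C routine: it depends only on the
   interchange record, never on the compiler's own string layout. */
static void sun_varying_to_cstr(const SunVarying20 *s,
                                char out[SUN_CAPACITY + 1])
{
    assert(s != NULL && s->length >= 0 && s->length <= SUN_CAPACITY);
    memcpy(out, s->text, (size_t)s->length);
    out[s->length] = '\0';
}
```

With this arrangement the C routines keep working against `SunVarying20` unchanged, and only the one conversion routine has to follow any change in the compiler's string representation.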

Best regards, The Chief
--------
Prof. Abimbola A. Olowofoyeku (The African Chief)
web: http://www.greatchief.plus.com/

Frank Heckenbach

27 May

11:49 p.m.

Pascal Viandier wrote:

...

I am porting SUN Pascal code to GNU Pascal. When I first read the GNU Pascal documentation (the PDF manual), I was pleased to see that GNU Pascal strings are in fact records holding the length and the string text. This maps perfectly onto the SUN Pascal VARYING[..] OF CHAR we use, since that is a record with 4 bytes for the length and an array of char. So, no porting problem...

Then we need to correct the documentation. Where exactly does it say so? (The manual is long.)

Frank

--
Frank Heckenbach, frank@g-n-u.de, http://fjf.gnu.de/, 7977168E
GPC To-Do list, latest features, fixed bugs: http://www.gnu-pascal.de/todo.html
GPC download signing key: ACB3 79B2 7EB2 B7A7 EFDE D101 CD02 4C9D 0FE0 E5E8

Gale Paeper

28 May

4:40 a.m.

Frank Heckenbach wrote:

Then we need to correct the documentation. Where exactly does it say so? (The manual is long.)

I presume what Pascal Viandier is referring to is in section 6.2.11.5 EP's Schema Types including String http://www.gnu-pascal.de/gpc/Schema-Types.html#Schema-Types. Toward the end of the section there is this description of the GPC String schema type implementation:

"An important schema is the predefined String schema (according to Extended Pascal). It has one predefined discriminant identifier Capacity. GPC implements the String schema as follows:

 type
   String (Capacity: Cardinal) = record
     Length: 0 .. Capacity;
     Chars: packed array [1 .. Capacity + 1] of Char
   end;

The Capacity field may be directly referenced by the user, the Length field is referenced by a predefined string function Length (Str) and contains the current string length. Chars contains the chars in the string. The Chars and Length fields cannot be directly referenced by a user program. "

The type declaration of the implementation's data layout is somewhat misleading in that it doesn't explicitly show that the storage for the Capacity discriminant identifier is actually implemented as a "hidden" record field of the string storage layout.

Perhaps a better way to describe the implementation layout details is:

"... GPC implements the required String schema data type, String(Capacity), as a record in memory with the following data fields:

 Capacity: Cardinal;
 Length: 0 .. Capacity;
 Chars: packed array [1 .. Capacity + 1] of Char

..."

However, I question the wisdom of embedding the implementation details of storage layouts in a general description of the language's type capabilities, since it complicates maintaining accurate user documentation when "subject to change" implementation details are scattered throughout the documentation. A better approach I've seen in some Pascal compiler manuals is to have all the internal storage implementation details for the data types in a separate section of the manual. It is better for users since all the data type internal memory layout details can be found in one place when one needs to know that information for interfacing with other languages and/or compiler implementations. And it is better for documentation maintainers since there is only one place in the documentation which needs to be revised when the compiler's internal implementation details for data types change. (For GPC, it would be preferable to have such internal data type implementation details autogenerated as much as practical from the compiler sources since that would be the best way to maintain accurate documentation without requiring a lot of additional manual effort for documentation.)

Gale Paeper gpaeper@empirenet.com

Frank Heckenbach

8:33 p.m.

Gale Paeper wrote:

I presume what Pascal Viandier is referring to is in section 6.2.11.5 EP's Schema Types including String http://www.gnu-pascal.de/gpc/Schema-Types.html#Schema-Types. Toward the end of the section there is this description of the GPC String schema type implementation:

"An important schema is the predefined String schema (according to Extended Pascal). It has one predefined discriminant identifier Capacity. GPC implements the String schema as follows:
 type
   String (Capacity: Cardinal) = record
     Length: 0 .. Capacity;
     Chars: packed array [1 .. Capacity + 1] of Char
   end;
The Capacity field may be directly referenced by the user, the Length field is referenced by a predefined string function Length (Str) and contains the current string length. Chars contains the chars in the string. The Chars and Length fields cannot be directly referenced by a user program. "

The type declaration of the implementation's data layout is somewhat misleading in that it doesn't explicitly show that the storage for the Capacity discriminant identifier is actually implemented as a "hidden" record field of the string storage layout.

Perhaps a better way to describe the implementation layout details is:

"... GPC implements the required String schema data type, String(Capacity), as a record in memory with the following data fields:
 Capacity: Cardinal;
 Length: 0 .. Capacity;
 Chars: packed array [1 .. Capacity + 1] of Char
..."

However, I question the wisdom of embedding the implementation details of storage layouts in a general description of the language's type capabilities, since it complicates maintaining accurate user documentation when "subject to change" implementation details are scattered throughout the documentation. A better approach I've seen in some Pascal compiler manuals is to have all the internal storage implementation details for the data types in a separate section of the manual. It is better for users since all the data type internal memory layout details can be found in one place when one needs to know that information for interfacing with other languages and/or compiler implementations. And it is better for documentation maintainers since there is only one place in the documentation which needs to be revised when the compiler's internal implementation details for data types change.

I see, it's a misunderstanding. This declaration is not meant to describe the storage layout, but rather the type structure of the string schema. (Just as an equivalent user declaration would define the storage layout only modulo the implementation-dependent layout algorithm of the given compiler. I.e., if compiled with GPC it would result in the same layout as the built-in type has; if compiled with another compiler, it may result in a layout with the capacity field not part of the structure, etc.)

Since this applies to all schema types, I can add the following paragraph (just to avoid confusion -- of course, by default, storage is never specified in Pascal, and this holds for schema types, unless the GPC manual elsewhere makes any claims for schemata, which would be a bug):

: Note that the memory layout of schema types (including
: @samp{String}, see below) is not specified in Pascal. E.g., some
: compilers store the discriminants inside the schema variable, some
: store them separately. GPC currently does the former, but relying on
: this behaviour is not portable and thus not recommended. Do not use
: schema types for interfacing with code written in other languages.
: (You can, e.g., use fields of schema records, as long as they
: contain portable types.)
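To illustrate what relying on the current behaviour would look like, and why it is fragile, here is a hedged C sketch. It assumes the field order reported from gdb earlier in this thread (capacity, length, characters) and assumes that Cardinal corresponds to `unsigned int`; neither is guaranteed by GPC. All type and function names (`GpcString20View`, `InterchangeString20`, `to_interchange`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 20

/* ASSUMPTION: mirrors the layout observed in gdb for String (20), with the
   discriminant stored inside the variable. Not specified, may change. */
typedef struct {
    unsigned int capacity;        /* discriminant Capacity */
    unsigned int length;          /* current length */
    char         chars[CAP + 1];  /* Capacity + 1 characters */
} GpcString20View;

/* Fixed layout owned by the C code, independent of the compiler. */
typedef struct {
    int  length;
    char text[CAP];
} InterchangeString20;

/* Copies the observed view into the fixed layout. Returns 0 on success,
   -1 if the data does not look like the assumed layout. */
static int to_interchange(const GpcString20View *src, InterchangeString20 *dst)
{
    assert(src != NULL && dst != NULL);
    if (src->capacity != CAP || src->length > src->capacity)
        return -1;
    dst->length = (int)src->length;
    memcpy(dst->text, src->chars, src->length);
    return 0;
}
```

The sanity check can only catch some mismatches. If the discriminant were stored elsewhere, or the field sizes differed, the view would silently misread memory, which is the portability problem the note describes.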

BTW, two side notes about your suggestion: "Required" is ISO terminology, and seen from the perspective of the standard. To a user not familiar with this kind of language, it seems confusing -- perhaps making them think they were required to use string schemata (in contrast to fixed-strings) or something like that. (It was confusing to me before I read the ISO standards.)

Second, GPC does not even guarantee that schemata are stored in memory. I suppose there are cases where they can be stored in registers. Anyway, this is a backend issue which I don't want to make any statements about. (This probably wasn't your intention, but you see, on strict reading, your formulation actually makes more claims about storage layout than the original one.)

...

(For GPC, it would be preferable to have such internal data type implementation details autogenerated as much as practical from the compiler sources since that would be the best way to maintain accurate documentation without requiring a lot of additional manual effort for documentation.)

Not really IMHO. There are some things that are subject to change, and some that are not (or should not be). The former should not be documented at all, and for the latter, if they do change, it's probably easier to change them manually (on those rare occasions, such as last year with CInteger) than setting up (and maintaining!) an automatic mechanism.

Frank
