Comparison of (Ansi) strings

This forum is meant for questions and discussions about the X# language and tools
ArneOrtlinghaus
Posts: 384
Joined: Tue Nov 10, 2015 7:48 am
Location: Italy

Post by ArneOrtlinghaus »

How can I influence the sorting of strings?
The following code does not give the same order as in VO. In VO the result has the same order as the input array (ordered by the internal ASCII value). In X# I get something like chr(1), chr(2), ..., chr(6), chr(149), chr(7), ..., chr(15), chr(164), chr(16), ..., chr(19), chr(182), chr(20), chr(167), chr(21), ..., chr(33), chr(147), chr(148), chr(168), chr(34), ...

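a := {}
for i := 1 upto 255
    // each entry: the character under test, "_", then its numeric code
    c := chr(i) + "_" + ntrim(i)
    aadd(a, c)
next

// sort with the default string comparison, then show one entry per line
ASort(a)
msginfo(array2string(a, crlf))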
Chris
Posts: 4584
Joined: Thu Oct 08, 2015 7:48 am
Location: Greece

Post by Chris »

Hi Arne,

Is this real text that you are trying to sort, or is it binary data? Because strings in .NET are Unicode, Chr() for values 128...255 does not return the char with that Unicode value; it returns the Unicode equivalent of the ANSI character instead. So you cannot directly compare these routines in VO and X#.

If it's binary data that you are working with (it does look like it), then I think the best way to deal with it is to write your own sorting and/or use standard .NET functions like String.Compare().
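
A minimal sketch of the conversion described above. It assumes Windows-1252 is the ANSI code page (the thread does not name one) and that the code runs on .NET Framework; on newer .NET versions code page 1252 may first have to be made available through an encoding provider. The variable names are illustrative only.

```xsharp
USING System
USING System.Text

FUNCTION Start() AS VOID
    // Assumption: Windows-1252 is the ANSI code page in use
    LOCAL oAnsi AS Encoding
    LOCAL cChar AS STRING
    oAnsi := Encoding.GetEncoding(1252)
    // Byte 149 (0x95) is the bullet character in Windows-1252 ...
    cChar := oAnsi:GetString(<BYTE>{149})
    // ... and its Unicode code point is U+2022, not U+0095
    Console.WriteLine(System.Char.ConvertToUtf32(cChar, 0):ToString("X4"))
    RETURN
```

Because the resulting character is no longer "number 149" in Unicode, a comparison of the .NET strings cannot be expected to follow the ANSI byte order.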
Chris Pyrgas

XSharp Development Team test
chris(at)xsharp.eu
Terry
Posts: 306
Joined: Wed Jan 03, 2018 11:58 am

Post by Terry »

Arne

Don't know why things are different in VO.

But format the composite string (c in this case) in the order you want.

for example c:=ntrim(i) + chr(i).

Convert i to a string and pad it left before adding it to the array.

This should work for any string complexity. Others may know a better way.
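
A rough sketch of this suggestion, assuming the VO dialect with the X# runtime functions available. BuildKeys is a hypothetical name, and the fixed width of 3 is chosen only because the codes here never exceed 255.

```xsharp
// Hypothetical helper: builds the test array with the numeric code first
FUNCTION BuildKeys() AS ARRAY
    LOCAL a AS ARRAY
    LOCAL i AS DWORD
    a := {}
    FOR i := 1 UPTO 255
        // Zero-pad to a fixed width so that "009" sorts before "010"
        AAdd(a, PadL(NTrim(i), 3, "0") + "_" + Chr(i))
    NEXT
    ASort(a)
    RETURN a
```

The padded digits are unique, so the comparison is decided before the Chr(i) part of each entry is ever reached.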

Terry.

Post by ArneOrtlinghaus »

Hi Chris and Terry,

In VO we use this to order records by several columns: the columns are composed into one unique sort string, which is then used by a Quicksort procedure. For descending columns, the ANSI characters are substituted by chr(255)-chr(xxx). We probably have other places where sorting relies on the ANSI values.

Now I am searching for something similar. I have looked into the DLL XSharp.Core and tried to understand how ASC and CHR work. A helper class, stringhelpers, is used there. This class contains a method CompareWindows, which seems to be what I need. Is it possible to use this method?

Arne

Post by Chris »

Hi Arne,

Chr(255) - Chr(xxx) will not work in .Net, unless you only use standard Latin chars (< 128 ascii), so no umlauts etc. Why don't you just sort in reverse order?

Post by ArneOrtlinghaus »

Hi Chris,
some columns are sorted in ascending order and some in descending order, just as the user selects. For the first column I can use reverse ordering when it is descending. But if the other columns differ from the direction of the first column, I have to "reverse" the character sequence of that part. Until now this has also worked quite well for the Western European characters above ASCII 128.
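
One possible alternative to reversing the characters, sketched under assumptions: compare the rows column by column and flip the sign of the result for descending columns. CompareRows and its parameters are hypothetical names, not part of the X# runtime. String.CompareOrdinal compares UTF-16 code units, so the order matches the ANSI order only for characters whose Unicode value equals their ANSI value.

```xsharp
USING System

// Hypothetical helper: compares two rows column by column.
// aDescending[n] is TRUE when column n sorts in descending order.
// Native arrays are indexed from 1 here, assuming /az is not set.
FUNCTION CompareRows(aLeft AS STRING[], aRight AS STRING[], aDescending AS LOGIC[]) AS INT
    LOCAL n AS INT
    LOCAL nResult AS INT
    FOR n := 1 UPTO aLeft:Length
        nResult := String.CompareOrdinal(aLeft[n], aRight[n])
        IF nResult != 0
            // Flip the sign for descending columns; the text stays untouched
            RETURN IIF(aDescending[n], -nResult, nResult)
        ENDIF
    NEXT
    // All columns are equal
    RETURN 0
```

A function like this could be called from the comparison step of an existing Quicksort procedure, so that no composed sort string is needed.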

Post by Terry »

Hi Arne

>> There a helper class stringhelpers is used. This class contains a method CompareWindows, which seems to be what I could need. Is it possible to use this method?

Sorry I can't answer that - I just don't know.

But the way I suggested will work, and with suitable conditional extractions you'll be able to sort things in whatever way or ways you like.

You are only sorting 256 characters, which is not much, so the following won't apply here. But it is worth bearing in mind if you ever want to do the same thing over thousands of strings: strings are immutable, and some forced garbage collection may improve performance.

It may be that the "CompareWindows" method includes something like that.

Terry

Post by ArneOrtlinghaus »

A workaround now works: transforming the single bytes to hex strings. Indeed, it is more a binary comparison than a real string comparison. I have found some other places that I have to look at.
There are places where we have used chr(255), assuming that it is a placeholder for the last possible character. Looking at my tests in X#, chr(255) = ÿ comes at position 225 and is located, for example, before Ö chr(214) and Ü chr(220), which are used in German.
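
The actual workaround is not shown in the thread. The following is only a sketch of how such a hex key might be built, assuming Windows-1252 as the ANSI code page and .NET Framework; AnsiHexKey and its parameters are hypothetical names.

```xsharp
USING System.Text

// Hypothetical helper: turns a string into a hex key of its ANSI bytes.
// Assumption: Windows-1252 is the code page the VO application used.
FUNCTION AnsiHexKey(cText AS STRING, lDescending AS LOGIC) AS STRING
    LOCAL oKey AS StringBuilder
    LOCAL nByte AS INT
    oKey := StringBuilder{}
    FOREACH b AS BYTE IN Encoding.GetEncoding(1252):GetBytes(cText)
        nByte := b
        IF lDescending
            // Same idea as chr(255)-chr(xxx), applied to the byte value
            nByte := 255 - nByte
        ENDIF
        // Two uppercase hex digits per byte, so the key is plain ASCII
        oKey:Append(nByte:ToString("X2"))
    NEXT
    RETURN oKey:ToString()
```

Since the key contains only the characters 0-9 and A-F, its sort order should not depend on how characters above 127 are collated. The descending variant is reliable only when the column values have a fixed width, because a shorter key still sorts before a longer one with the same prefix.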

Post by ArneOrtlinghaus »

Looking again at the code I have found some switches that influence string comparisons:
- The compiler switch /VO13 (compatible string comparisons)
- If /VO13 is set, the SetExact() and SetCollation() settings
In our case /VO13 is set and SetExact is false. Testing the collation modes SetCollation(#clipper), SetCollation(#ORDINAL), SetCollation(#windows), SetCollation(#UNICODE) changed the sorting behavior, but did not reproduce the same result as in VO.

Post by Chris »

Hi Arne,

Yes, for compatibility with VO you need to use /vo13, the same as with almost every other VO compatibility option. Also note that #ORDINAL and #UNICODE did not exist as collation settings in VO, so you need to use either #CLIPPER or #WINDOWS.

Also please remember the fundamental difference between VO and X#: VO has 8-bit strings, while X# (.NET) has Unicode ones. String comparison in X# was designed to give correct (VO-compatible) results when "real" text is used, so it makes the necessary ANSI<->Unicode conversions. It was not designed for binary data. And using Chr(255)-Chr(realchar) effectively changes the text into binary data.

Have you seen any difference of the behavior of X# to the one of VO with text data? If you have, can you please post a sample so we can look into it?