FoxPro function list updated

Chris · Post by **Chris** » Mon Mar 02, 2020 8:48 pm

Hi Thomas,

Good point, in order to make it thread safe, better use a new object each time. You could use BEGIN LOCK...END to lock the single one while it is being used by one thread, but that would not allow it to work in parallel, so no point.

But the class can be made much more lightweight. For example, there's no need to assign cCR, cLF etc. every time; those can be made (external) DEFINEs, or STATIC members of the class, so they get initialized only once. Even better, you can make them CONST, so they are guaranteed to be initialized only once and cannot be modified in the class code. Also, for literal chars you can use the syntax cChar := c'\r' for CR (13), c'\n' for LF, c' ' for space etc. So you can change the code to:

PRIVATE CONST cCR := c'\r' AS Char
PRIVATE CONST cLF := c'\n' AS Char
PRIVATE CONST cSpace := c' ' AS Char
PRIVATE CONST cTab := c'\t' AS Char

This will improve performance, because the Chr() function is quite slow: it has to do ANSI<->Unicode conversions that are not needed in this case.

In general regarding performance, when the speed in X# is comparable to that of VFP, I would leave it as it is, no need to spend a lot of time to squeeze just a couple more %, better use the time instead to implement more functions. Unless someone complains that a function is not fast enough, in which case we can go back to it

mainhatten · Post by **mainhatten** » Tue Mar 03, 2020 11:25 am

Hi Chris

Chris wrote:Good point, in order to make it thread safe, better use a new object each time. You could use BEGIN LOCK...END to lock the single one while it is being used by one thread, but that would not allow it to work in parallel, so no point.

Yup, deleting those few lines felt bad from a performance point of view, but deleting "Global" also feels "right".

Chris wrote:But the class can be made much more lightweight, for example there's no need to assign a cCR, cLF etc every time, those can be made (external) DEFINEs, or STATIC members of the class, so they get initialized only once. Even better, you can make them a CONST, so they are guaranteed to be initialized only once and not be modified in the class code. Also for getting literal chars, you can use the syntax cChar := c'\r' for CR(13), c'\n' for LF, c' ' for space etc. So you can change the code to:
PRIVATE CONST cLF := c'\n' AS Char
this will improve performance, because the Chr() function is quite slow, as it has to do ansi<->unicode conversions that are not needed in this case.

I had already reached the same conclusion - that was how I found the doc error reported yesterday (already fixed by Robert). The current code follows, although I am not sure whether the _Chr mentioned in the docs, coupled with the defines, might be better or at least more readable.

Code: Select all

    Define Asc_A_Low := 097 && Asc_A + 32
    Define Asc_Z_Low := 122 && Asc_Z + 32
	CLASS GetWordHandler
	    /// Single
        /// todo: Measure Predicate as parameter
        /// todo: Measure Predicate as local Func
        /// lMany:
        /// todo: Measure ReversOrder/Stack Exit if smaller 
        /// todo: Measure Guard against Containskey
        /// todo: Check Guard about AscW-logic against Containskey
        *-- gets faster runtime if declared immutable? Not really...
        private const _cSpace := c' '    as Char && _Chr(ASC_BLANK)  Chr(032)[0]
        private const _c_CRet := c'\r'   as Char && _Chr(ASC_CR)     Chr(013)[0]
        private const _c_LinF := c'\n'   as Char && _Chr(ASC_LF)     Chr(010)[0]
        private const _c__Tab := c'\t'   as Char && _Chr(ASC_Tab)    Chr(009)[0]
        private const _c_Up_A := c'A'    as Char && _Chr(ASC_A)      Chr(065)[0]
        private const _c_Up_Z := c'Z'    as Char && _Chr(ASC_Z)      Chr(090)[0]
        private const _c_Lw_a := c'a'    as Char && _Chr(ASC_A_Low)  Chr(097)[0]
        private const _c_Lw_z := c'z'    as Char && _Chr(ASC_Z_Low)  Chr(122)[0]

Among the already defined ASC_xxx constants, explicitly naming the upper- and lower-case begin/end letters might be clearer.

Chris wrote:In general regarding performance, when the speed in X# is comparable to that of VFP, I would leave it as it is, no need to spend a lot of time to squeeze just a couple more %, better use the time instead to implement more functions. Unless someone complains that a function is not fast enough, in which case we can go back to it

Yes and No - I realize that "function speed" at the moment does not need more polish - but my education in things Dotnet AND characteristics of xSharp does.

And it already brought at least one speedup: the answer to the old C joke

What is the fastest C data structure?

holds true for C# / xSharp as well:

Code: Select all

        internal method isVfpWhiteSwitchChar(tc2Check as Char) as Boolean
            /// direct switch checking against constants
            /// ToDo: flagged version, flags set on SetDict? Measure setup time!
            switch  tc2Check 
                case _cSpace 
                case _c_LinF
                case _c__Tab
                    return .t.
                otherwise
                    return .f.
            end switch

The idea of feeding in "typical ranges" also proved very beneficial, as it eliminates most calls to dict:ContainsKey. Most chars fall into the ranges a..z and A..Z and are typically not part of the delimiter set, which allows coding a guard. I read the source of AscW

Code: Select all

FUNCTION AscW(c AS STRING) AS DWORD
	LOCAL ascValue := 0 AS DWORD
	LOCAL chValue AS CHAR
	IF ( !String.IsNullOrEmpty(c) ) 
		chValue := c[0]
		ascValue := (DWORD) chValue
	ENDIF
	RETURN ascValue

and since I already have a Char known to be != Null, I could safely just use

Code: Select all

		ascValue := (DWORD) chValue

on the notion that the lower 128 values (0..127) of Unicode are identical to ASCII, which could result in efficient code

Code: Select all

        internal method isViaDictGuardedAscW(tc2Check as Char) as Boolean
            /// Asc/AscW method signature is (String), is taking the relevant cast from AscW enough for Chars<Chr(128)?
            /// as Chr(122)=="z" normal ascii letters included, which are main part in US and Europe
            /// as Callup crashes earlier (also in vfp) on tcString is Null, even shortened version of Nullcheck not necessary?
            /// IIF ( Object.ReferenceEquals(tc2Check, Null), 0, (DWORD) tc2Check )
            local lnDword := (DWord) tc2Check as DWord
            if  (Asc_A_Low<=lnDword .and. lnDword<=Asc_Z_Low)    && lower letters first, as occurring more often
                return .f.
            elseif  (Asc_A<=lnDword .and. lnDword<=Asc_Z)
                return .f.
            else
                switch  lnDword 
                    case Asc_Blank 
                        return .t.
                    case Asc_LF
                        return .t.
                    case Asc_Tab
                        return .t.
                    otherwise
                        return  Self:hcCmp:ContainsKey(tc2Check)
                end switch
            endif

but here I hesitate at the moment. The dedicated returns inside the switch are there on purpose, as they might later be combined with flags set in Self:SetDict(). When you asked in the other thread

Chris wrote:But why are you thinking to use those, why are they better in your case than normal procedural code?

I wanted to "plug together" such filters via small filter functions, depending on tcDelimiter. In normal procedural code this needs either one long method guarded by class member flags or a separate method for every combination of filters. I will probably settle on flags. Funny: in VFP, slinging those filters together into one Boolean statement would be faster. I assumed that in xSharp a short-circuiting Boolean statement and the more readable if/elseif/else would perform identically. Not so: the if version was faster!
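A minimal sketch of the two forms being compared, assuming a plain "is this an ASCII letter" filter. The function names are hypothetical and not taken from the attached code; the literals 97/122/65/90 stand in for the Asc_A_Low, Asc_Z_Low, Asc_A and Asc_Z defines. It only shows that both variants answer the same question; which one runs faster has to be measured.

```xsharp
// Variant 1: one short-circuiting Boolean expression
FUNCTION IsAsciiLetterExpr(tc2Check AS CHAR) AS LOGIC
    LOCAL lnDword := (DWORD) tc2Check AS DWORD
    RETURN (97 <= lnDword .AND. lnDword <= 122) .OR. (65 <= lnDword .AND. lnDword <= 90)

// Variant 2: the same test written as if/elseif/else
FUNCTION IsAsciiLetterIf(tc2Check AS CHAR) AS LOGIC
    LOCAL lnDword := (DWORD) tc2Check AS DWORD
    IF 97 <= lnDword .AND. lnDword <= 122        // a..z first, as more frequent
        RETURN TRUE
    ELSEIF 65 <= lnDword .AND. lnDword <= 90     // A..Z
        RETURN TRUE
    ELSE
        RETURN FALSE
    ENDIF
```

Keeping both variants side by side in a test project makes it easy to time them against the same input before deciding which one goes into a filter.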

regards
thomas

mainhatten · Post by **mainhatten** » Wed Mar 04, 2020 11:29 am

Hi Chris,

Chris wrote:In general regarding performance, when the speed in X# is comparable to that of VFP, I would leave it as it is, no need to spend a lot of time to squeeze just a couple more %, better use the time instead to implement more functions. Unless someone complains that a function is not fast enough, in which case we can go back to it

For now I have squeezed out enough speed, at least for long strings: most of the time I can now surpass vfp speed on them. The next step is to cut most of the things I tried but which ended up being handled otherwise - probably 50% will be eliminated. When do you plan to release the next version? I am not totally happy with the inner "interface", but the existing vfp-compatible function layer already works and will continue to work. The next 2 days are somewhat hectic here again, and cutting is not a task I like to do in a hurry - even when the task is helped by the compiler, the "test runner" and the existing plan between my ears...

regards
thomas

P.S.
I looked over the GIT list and could sometimes add a few lines to the point of an issue - sometimes directly related, sometimes very "near". Does the team prefer that I create a new issue or broaden the existing one?

Chris · Post by **Chris** » Wed Mar 04, 2020 3:55 pm

Hi Thomas,

Please take your time, it's not like we have tight deadlines for including VFP functions, it will take months to include them all anyway.

Regarding the GIT issues list, please just do what makes more sense to you. For very similar or slightly extended material, it is better to append the info to the existing entry. But either way is fine actually!

mainhatten · Post by **mainhatten** » Sun Mar 08, 2020 12:26 pm

Chris wrote:Please take your time, it's not like we have tight deadlines for including VFP functions, it will take months to include them all anyway.

Attached in the Zip are 4 versions of GetWordNum/GetWordCount. If possible, put them into Git in the order 1, 2, 3, 4, so that one can later look back and see which approaches were tried and dismissed. Version 1 corresponds to the version you already saw; version 2 searched for more efficient ways. Version 3 is the pruned version of 2, sporting only the best approaches plus a hint on how to probably speed up non-Latin alphabets/charsets also found in unicode-8. Version 3 keeps the "delegate" approach, which IMO makes it easier to try out other filters and similar approaches, but probably has a heftier class load time when called via the traditional vfp functions, which load the class into memory to stay threadsafe.
Version 4 is very close in capability to version 3, only rewritten as "normal inheritance". The size increase is mostly due to a specially optimized GetWordNum for a single char, which is not backported into version 3. It uses a "revolving executor" schema if preloaded and kept in memory; if called via the old functions, it first loads the handler, then lazy loads only the executor for the specific task needed, which can be: Single Char, Vfp Default [9,10,32], other multichar delimiter sets, or the specific DotNet:IsWhiteSpace.
In version 4 a few cosmetic changes would probably bring it more in line with typical Java/Dotnet style: coding the lazy load not in the rather global function SetActive, but instead creating accessors for each of the specialized lazy-load objects

Code: Select all

        public oVfpDefault := Null as GetVfpDefault
        public oDotWhiteSp := Null as GetDotNetWhite
        public oSingleChar := Null as GetSingle
        public oSingle_Opt := Null as GetSingleOpt
        public oMultipleCs := Null as GetMultiple

which make certain that a real object is assigned to the active executor. Casting the executor as object also seems "unDotNet": it should be cast to the declared base interface, but I did not find the correct way - I probably need to read up on GetInterface some more.
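A minimal sketch of the accessor idea, under stated assumptions: IWordExecutorSketch, GetVfpDefaultSketch and HandlerSketch are hypothetical stand-ins, not the classes from the attached Zip, and the word count shown only splits on blanks. It illustrates a property that creates its executor on first access, and an active-executor field typed as the base interface instead of OBJECT, so that calls on it can be early bound.

```xsharp
USING System

// Hypothetical base interface that every executor implements
INTERFACE IWordExecutorSketch
    METHOD GetWordCount(tcString AS STRING) AS LONG
END INTERFACE

// Stand-in for one specialized executor (blank-delimited words only)
CLASS GetVfpDefaultSketch IMPLEMENTS IWordExecutorSketch
    PUBLIC METHOD GetWordCount(tcString AS STRING) AS LONG
        LOCAL aDelim := <CHAR>{ c' ' } AS CHAR[]
        RETURN tcString:Split(aDelim, StringSplitOptions.RemoveEmptyEntries):Length
END CLASS

CLASS HandlerSketch
    PRIVATE _oVfpDefault := NULL AS GetVfpDefaultSketch
    // Typed against the interface rather than OBJECT
    PUBLIC oActive := NULL AS IWordExecutorSketch

    // Accessor: the executor is created on first use only
    PUBLIC PROPERTY oVfpDefault AS GetVfpDefaultSketch
        GET
            IF SELF:_oVfpDefault == NULL
                SELF:_oVfpDefault := GetVfpDefaultSketch{}
            ENDIF
            RETURN SELF:_oVfpDefault
        END GET
    END PROPERTY

    PUBLIC METHOD UseVfpDefault() AS VOID
        // No explicit cast needed: the class implements the interface
        SELF:oActive := SELF:oVfpDefault
        RETURN
END CLASS
```

With this shape, a caller would run `oHandler:UseVfpDefault()` and then `oHandler:oActive:GetWordCount(...)`; the accessor guarantees that oActive never receives NULL from oVfpDefault.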

Versions 3 and 4 were only tested for correctness. Speed seems to be VERY similar; the speed benefit hoped for from switching to traditional overridden virtual methods did not show in test runs of ~10 secs.

The speedups I added to eliminate costly dict lookups only/mostly help with the Latin char set below ASCII 127, which is identical to Unicode - but they help a lot. Calling those functions on large texts in Greek, Cyrillic, Hiragana or other subgroups will be less effective, probably by A LOT. Guarding for every possible char set would slow down normal processing.
If given large Greek texts to add to the existing tests, I could add more than a blueprint for other char sets if needed. The test data is still the 4 records, each a Tolstoy novel, posted in the other thread; the result dbf structure is included.
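For illustration only, a guard for another char set could mirror the a..z / A..Z guard with different code point ranges. The sketch below assumes basic Greek letters (U+03B1..U+03C9 for alpha..omega, U+0391..U+03A9 for Alpha..Omega); the function name is hypothetical and nothing like it is confirmed to exist in the attached versions. Accented letters such as U+03AC fall outside these ranges and would still take the slower dictionary path.

```xsharp
// Hypothetical guard: TRUE means "a plain Greek letter, so very unlikely
// to be a delimiter"; the caller can then skip the dictionary lookup.
FUNCTION IsBasicGreekLetter(tc2Check AS CHAR) AS LOGIC
    LOCAL lnDword := (DWORD) tc2Check AS DWORD
    IF 0x03B1 <= lnDword .AND. lnDword <= 0x03C9        // lower case first
        RETURN TRUE
    ELSEIF 0x0391 <= lnDword .AND. lnDword <= 0x03A9    // upper case
        RETURN TRUE
    ENDIF
    RETURN FALSE
```

Whether such a guard pays off depends on the text: on mostly Latin input it adds two comparisons per char, which is the trade-off mentioned above.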

Safe.zip (20.68 KiB)

comments welcome

thomas

Chris · Post by **Chris** » Tue Mar 10, 2020 1:32 pm

Thanks Thomas, I have added the functions to the runtime solution in Git. Unfortunately it was complicated doing it for every iteration (had to make some changes in the code), so I just added the final version, I think that's fine as well.

I think there's no point putting even more effort on those anymore, I think they are good enough and there are 100s of other functions that need to be implemented! In case someone finds an issue, we will revisit them.

thanks again for your efforts!

mainhatten · Post by **mainhatten** » Wed Mar 11, 2020 7:48 pm

Hi Chris,

Chris wrote: I have added the functions to the runtime solution in Git. Unfortunately it was complicated doing it for every iteration (had to make some changes in the code), so I just added the final version, I think that's fine as well.

Sorry if you had to make changes - if I know what to change, I will prepare next functions better.

Chris wrote:I think there's no point putting even more effort on those anymore, I think they are good enough and there are 100s of other functions that need to be implemented! In case someone finds an issue, we will revisit them.

I will focus on StrExtract and DefaultExt next, as the Just* and Force* functions are done and DefaultExt has a few peculiarities on the vfp side. There are a few relatively easy targets still undone, like StrToFile, FileToStr and some of the date conversions - but I hesitate to target them all, as they might be a good entry point for others to implement some functions in xSharp, same as the Force/Just and string functions were for me: a way to get a running start without having to read up on lots of new things for the first function. Unless the dev team thinks I should target those other "easy" ones first, I will switch over to reading the RDD/Server code and perhaps do a few not-too-difficult functions there once I understand the structures, or try a few things with the DotNet GUI.

Another option might be to create a table-based test runner that runs the same 1-liner tests in vfp and xSharp with automatic logging of the results, for instance to flush out differences in NULL handling. In Lianja I had built something similar, able to run scripts from memo fields in both languages, which at the moment seems not to be possible with the vfp dialect.

best regards
thomas

Chris · Post by **Chris** » Wed Mar 11, 2020 10:45 pm

Hi Thomas,

It was just minor things, just putting consistent var casing because the runtime is being compiled with case sensitivity enabled, and some small changes here and there. It just felt overkill doing that 4 times.

Please do any functions you feel more comfortable with. Easy functions are also important, they may be easy, but they are also plenty...

robert · Post by **robert** » Thu Mar 12, 2020 11:16 am

Thomas,
I have looked at your GetWord..() functions.
There are some things defined in the code that are not used (anymore) such as:
- delegates
- interface
- several classes
Do you want to keep these in there ?

You have also introduced new classes and defines. Should these stay public (and then be documented), or can I make them internal so that they do not have to be documented?

Some other questions:
- oActiveObjc is declared as OBJECT and then the method GetWordHandler().SetStru() calls its method SetStru(). This is compiled as a late bound call now. I would like to use early bound code in the runtime (faster and less chance of errors). Is there a special reason why this is untyped?
- What is the idea of iMethod in GetWordHandler(), and what is the meaning of the numbers such as 22, 37, -4, 1 and 3? In DotNet I would normally use an Enum with descriptive names for this.
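As a sketch of the Enum suggestion, with hypothetical names: the members below are derived from the executor kinds listed earlier in the thread (single char, Vfp default, other multichar sets, DotNet IsWhiteSpace). The actual meaning of 22, 37, -4, 1 and 3 is not stated here, so no numeric values are mapped.

```xsharp
// Hypothetical replacement for a numeric iMethod selector
ENUM WordExecutorKind
    MEMBER SingleChar
    MEMBER VfpDefault
    MEMBER MultiChar
    MEMBER DotNetWhiteSpace
END ENUM

// Example consumer: the compiler now checks the member names
FUNCTION DescribeExecutor(kind AS WordExecutorKind) AS STRING
    SWITCH kind
        CASE WordExecutorKind.SingleChar
            RETURN "single delimiter char"
        CASE WordExecutorKind.VfpDefault
            RETURN "VFP default delimiters (9, 10, 32)"
        CASE WordExecutorKind.MultiChar
            RETURN "other multi-char delimiter set"
        CASE WordExecutorKind.DotNetWhiteSpace
            RETURN "DotNet IsWhiteSpace"
        OTHERWISE
            RETURN "unknown"
    END SWITCH
```

A call site then reads `DescribeExecutor(WordExecutorKind.VfpDefault)` instead of passing a bare number, which documents the intent without extra comments.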

Robert

mainhatten · Post by **mainhatten** » Fri Mar 13, 2020 1:35 pm

Chris wrote:It was just minor things, just putting consistent var casing because the runtime is being compiled with case sensitivity enabled, and some small changes here and there. It just felt overkill doing that 4 times.

I must have missed the casing... I used the VStudio setting from GIT. I will set VStudio to case sensitive and leave XIDE case insensitive for my own spelunking and for parts brought over from fox.
// Edit: in VStudio, which I used for the GetWord*, case sensitive was already set? What was there to correct?

Chris wrote:Please do any functions you feel more comfortable with. Easy functions are also important, they may be easy, but they are also plenty...

Will try to balance a bit, between own learning and vfp functions done - I need to get deeper in GUI and/or RDD/DBServer code...

regards
thomas
