xsharp.eu • Random data, and Entity Framework performance ...

Random data, and Entity Framework performance ...

Posted: Wed Jan 04, 2017 1:02 pm
by Phil Hepburn
Hi guys,

I have come across some interesting stuff lately, while doing my LINQ and EF6 research.

One thing that I will share (here) with you is that we can get a HUGE difference in the performance of an Entity Framework 6 application, depending on how we go about saving our changes to the Entity objects in our .NET code (at runtime).

As part of my eNote development (and session material preparation) I was coding my test app to randomly create Orders for Customers (data 'seeding') - I set a limit of five thousand new orders. Each Order has a bunch of OrderLine entities, about 18 thousand in total.

The performance test was to see what difference exists in overall data creation time, between saving each Order separately, or waiting to save all 5005 when everything was done and finished.

Check the following attached image to see line 223 (best result) and line 214 (by far the worst result). Times were 21 seconds compared to 263 seconds - the placement of the 'SaveChanges' call was the only difference :-
[Attached image: PearlsXSlinqEF6_01.jpg]
I have a feeling that when web articles talk about EF6 being slow in some respects, the authors need to take into account more than just simple comparisons.

Nick has more experience in such matters and may help us to understand performance issues, and when we may need to save changes ASAP. Obviously, 5004 more 'SaveChanges' operations have been carried out the 'slow' way. That is as compared to only one for the whole set of new Orders.
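
For anyone who wants to play with the idea without an EF6 project to hand, here is a rough sketch of the two placements. It is in Python rather than X#, and it uses an in-memory sqlite3 database as a stand-in for EF6 and SQL Server, so 'commit' plays the part of 'SaveChanges'. The table names, the function 'seed_orders' and the ID ranges are all made up for illustration; this is not the code from the image above.

```python
import random
import sqlite3


def seed_orders(n_orders, commit_each_order, seed=1):
    """Insert n_orders orders with a few lines each; return (orders, lines, commits)."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
    conn.execute(
        "CREATE TABLE order_lines "
        "(id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER)"
    )
    commits = 0
    for _ in range(n_orders):
        cur = conn.execute(
            "INSERT INTO orders (customer_id) VALUES (?)", (rng.randint(1, 50),)
        )
        order_id = cur.lastrowid
        for _ in range(rng.randint(1, 6)):
            conn.execute(
                "INSERT INTO order_lines (order_id, product_id) VALUES (?, ?)",
                (order_id, rng.randint(1, 200)),
            )
        if commit_each_order:
            conn.commit()  # the 'slow' placement: one save per order
            commits += 1
    if not commit_each_order:
        conn.commit()  # the 'fast' placement: one save after the whole loop
        commits += 1
    n_saved_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    n_saved_lines = conn.execute("SELECT COUNT(*) FROM order_lines").fetchone()[0]
    conn.close()
    return n_saved_orders, n_saved_lines, commits


slow = seed_orders(5005, commit_each_order=True)
fast = seed_orders(5005, commit_each_order=False)
print("save per order:", slow)
print("save at end:   ", fast)
```

Both runs end up with the same rows in the database; only the number of save points differs (5005 against 1). An in-memory sqlite3 database will not reproduce the 21 versus 263 second gap, since each commit is cheap there, so treat this as a picture of where the calls go rather than as a benchmark.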

The handling of the Random values generation (here) is my first working attempt, but I now have a much better and 'smarter' solution which I will share in my next post from here.

STATIC classes can be most helpful, not just STATIC methods.

Regards,
Phil.
Wales, UK.

Posted: Wed Jan 04, 2017 1:24 pm
by Phil Hepburn
Okay folks,

We now need to look at how the Auto 'seeding' code has been tidied up when we use a new and 'smarter' Random STATIC class with suitable static methods.
[Attached image: PearlsXSlinqEF6_03.jpg]
In the attached code above, we have used five different static methods from my new STATIC class called 'RandomStockValues'. There is another related call from line 235, which does some initialisation for us.

Now let's look at the STATIC class itself - see below :-
[Attached image: PearlsXSlinqEF6_02.jpg]
The static Constructor is run the first time any of the methods is called, so 'MyInitializer' is run when we call up the method 'FindMaxes'. It is now ready to be used over, and over, as we generate the random 5005 Orders.

I did things this way so we could pass in a 'DbContext' object to allow connection to the back-end database.

Since we can't instantiate this class we need another way to pass in objects to be used - hence the method.
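
To make the 'initialise once, then use everywhere' pattern a bit more concrete, here is a small sketch. It is in Python rather than X#, so it is only an analogy: Python has no static constructor, and the class body (which runs once, when the class is defined) stands in for it. A sqlite3 connection stands in for the 'DbContext'. The method names 'find_maxes', 'customer_id' and 'product_id', and the tables, are my own assumptions and are not taken from the image above.

```python
import random
import sqlite3


class RandomStockValues:
    """Never instantiated: all state lives on the class itself."""

    # Runs once, when the class is defined (a loose stand-in for a static constructor).
    _rng = random.Random()
    _max_customer_id = None
    _max_product_id = None

    @classmethod
    def find_maxes(cls, conn):
        """Initialiser: the connection is passed in, since there is no instance to hold it."""
        cls._max_customer_id = conn.execute("SELECT MAX(id) FROM customers").fetchone()[0]
        cls._max_product_id = conn.execute("SELECT MAX(id) FROM products").fetchone()[0]

    @classmethod
    def customer_id(cls):
        """Random customer ID between 1 and the highest ID found in the database."""
        if cls._max_customer_id is None:
            raise RuntimeError("call find_maxes(conn) first")
        return cls._rng.randint(1, cls._max_customer_id)

    @classmethod
    def product_id(cls):
        """Random product ID between 1 and the highest ID found in the database."""
        if cls._max_product_id is None:
            raise RuntimeError("call find_maxes(conn) first")
        return cls._rng.randint(1, cls._max_product_id)


# Usage: a tiny hypothetical database with 7 customers and 20 products.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO customers (id) VALUES (?)", [(i,) for i in range(1, 8)])
conn.executemany("INSERT INTO products (id) VALUES (?)", [(i,) for i in range(1, 21)])

RandomStockValues.find_maxes(conn)
draws = [(RandomStockValues.customer_id(), RandomStockValues.product_id()) for _ in range(100)]
print(draws[:3])
```

After the one call to 'find_maxes', every later call can draw IDs that fit the data actually in the database, without the connection being passed around again.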
 
The newer code seems to work well and has produced the WPF form with data being bound to three DataGrid controls - note the max counts for OrderId and OrderLineId.
[Attached image: PearlsXSlinqEF6_04.jpg]
I have to say that I find STATIC members and classes extremely useful in business app related work. And now that I have found LINQ, most of these methods contain some LINQ query code. However did I manage with such stuff ? I won't be going back!!!

Hope some of this stuff is useful to you, or even just interesting.

Regards,
Phil.

Posted: Wed Jan 04, 2017 1:31 pm
by Phil Hepburn
Whoops!

should read ... "however did I manage without such stuff ..."

Phil.

Posted: Wed Jan 04, 2017 3:41 pm
by Frank Maraite
Hi Phil,

I don't understand anything about what you are doing with the data. And (only for now!) I don't try to understand.

But let me say something about your STATIC thing: it is absolutely not needed and has some disadvantages. The most important is: you cannot derive from or inherit those. That means you cannot create substitutes for testing purposes. You showed one kind of testing here: you have to comment out the lines where the difference occurs and recompile.

The only reason to have a STATIC class is to have only one instance of this class. But there is a better way: use a class factory to create and hold the one and only instance.

CLASS Factory

    PRIVATE _TestClass1 AS TestClass1
    METHOD GetTestclass1() AS TestClass // No typo, this is the base class
        // Create the single instance on first use, then keep returning it
        IF _TestClass1 == NULL
            _TestClass1 := TestClass1{}
        ENDIF
        RETURN _TestClass1

    PRIVATE _TestClass2 AS TestClass2
    METHOD GetTestclass2() AS TestClass // No typo, this is the base class
        // Same pattern for the second implementation
        IF _TestClass2 == NULL
            _TestClass2 := TestClass2{}
        ENDIF
        RETURN _TestClass2

END CLASS

This way you can instantiate two different classes at once and do the performance test in one run.
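
As a rough illustration of what "both in one run" could look like, here is a sketch in Python rather than X#. It is a variation on the factory above, not the same code: one generic 'get' method instead of one getter per class. The class names 'SaveStrategy', 'SaveEachOrder' and 'SaveOnceAtEnd' are invented for the example, and the "saves" are only counted, not sent to any database.

```python
class SaveStrategy:
    """Base class: decides when pending orders get saved."""

    def __init__(self):
        self.saves = 0

    def after_order(self):
        pass  # hook called once per generated order

    def after_loop(self):
        pass  # hook called once, when the whole loop is done


class SaveEachOrder(SaveStrategy):
    def after_order(self):
        self.saves += 1  # stands in for a save inside the loop


class SaveOnceAtEnd(SaveStrategy):
    def after_loop(self):
        self.saves += 1  # stands in for a single save after the loop


class Factory:
    """Creates each strategy on first request and holds on to that one instance."""

    def __init__(self):
        self._instances = {}

    def get(self, cls):
        if cls not in self._instances:
            self._instances[cls] = cls()
        return self._instances[cls]


def run(strategy, n_orders):
    """The seeding loop only knows the base class, so either strategy fits."""
    for _ in range(n_orders):
        strategy.after_order()
    strategy.after_loop()
    return strategy.saves


factory = Factory()
each = run(factory.get(SaveEachOrder), 5005)
once = run(factory.get(SaveOnceAtEnd), 5005)
print(each, once)
```

Because 'run' is written against the base class, switching strategy is a matter of asking the factory for a different class, with no commenting out and recompiling, and a test double can be slotted in the same way.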

Just my 2 ct's
Frank

Posted: Thu Jan 05, 2017 11:02 am
by Phil Hepburn
Hi Frank,

I will explain some things then ;-)

My static class does exactly what I wanted it to - it is actually a "Server" to my whole application code - once initialised with the 'DbContext' object (for connection to SQL database) it provides random numbers suitable for my 'Stock' database business system. And the scope is not an issue as it is within the namespace of the whole application. I certainly don't need more than one such server.

Now then, what am I doing with my data? Well, quite simply I am trying to automate a couple of methods to fill a newly made (Entity Framework) SQL database with some suitable test data - data which I can use to develop more new LINQ features.  Far too many to be hand coded.

Since running the app causes the SQL DB to be deleted and re-created each time, then filled with data, I can't possibly run two tests at the same time. So for the 'SaveChanges' test, with the code line 'in or out' of the 5005 For/Next loop, the app has to be recompiled each time.

The point of the test was to say to others that we must think carefully as to what we do with our EF code, as far as impact on performance is concerned. A small change can slow things down a huge amount.

Each time we call 'SaveChanges' the EF system looks at what it has auto-recorded as unsaved changes, and then composes the SQL commands which are then sent to the SQL engine.

YES - it is 'safer' (in some ways), to some of you guys, to save after each new order is created - BUT - we then get 5005 separate save operations being made instead of one (yes, '1'). Hence the extra 240 seconds or so, about 4 minutes of extra runtime.

In fact the EF6 system handles 5005 new Order changes, and 18,000 new order line changes very well it would seem, and does these all in ONE go.

I have this morning tidied up the way I use 'Servers' to provide suitable information for this process, and now, as well as the Random number server, I have one to supply a new Order, and another to supply a new OrderLine - it looks much simpler and tidier.

I will post details on this later. Your last post made me improve my code - THANKS !!!

Yes, you are right, in that we must NOT use Static classes unwisely, or for the wrong reasons, but in this case I think it was the right choice for me. It works well.

Remember, a STATIC class has its Constructor run the very first time one of its members is used; after that, any data changes made to its properties etc. remain for the duration of the app, or until something else changes the values. No instantiation. Quite useful in the right places.

Hope this explains a few things better.

Cheers,
Phil.

Posted: Fri Jan 06, 2017 1:47 pm
by Phil Hepburn
Hi again guys,

In case readers are wondering, the reason I have been doing so much experimental X# stuff with Entity Framework 6, and 'Code First' database creation, is because we can use a lot of LINQ style code. And I mean a lot!

If we can get a reasonable business application to run with EF6 then our LINQ skills will be suitably tested, and developed. We can then use these skills in other places, even just for in-memory collections.

In a recent post to Frank I said that he had helped me change and improve my 'Server' method code, to aid me with seeding and populating just over five thousand 'Orders', for the STOCK database (made by Code First from EF6). Well the code is shown below, starting with the general 'auto population' for the Orders :-
[Attached image: PearlsXSlinqEF6_14.jpg]
This is much simpler now, a loop run 5005 times over :- just pick a customer at random by ID, and then use the ID (and LINQ) to find and retrieve the Customer object. Then ask my coded 'Server' for new Orders for an Order object. Add the order to the customer's list of orders. Hold back the call to 'SaveChanges' on the Entity list until the large loop has finished - to save time and improve performance.

Let's now look at the Server for new Orders :-
[Attached image: PearlsXSlinqEF6_13.jpg]
The loop is to handle the fact that an Order has a handful (a few) of order lines. The new order lines are supplied from a 'Server' (for OrderLines) and we now check to see that we have not already included a line for the product just selected. [Orders do not usually have duplicate lines for a single product. If duplicated we just skip it and keep going to find another.]

You will see my Random number server used throughout the three coded images.

Now let's view the 'Server' for new 'OrderLine' objects :-
[Attached image: PearlsXSlinqEF6_12.jpg]
Two calls to the Random server and then a LINQ query to find and retrieve the Product object - then we have enough data to create the new OrderLine object, to be passed back from the static method.
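
For readers without the images, the three pieces fit together roughly as in the sketch below. It is in Python rather than X#, works purely in memory, and every name in it ('new_order_line', 'new_order', 'seed', the ID ranges) is invented for illustration. Two assumptions of mine to flag: a plain product ID stands in for the Product object that the LINQ query retrieves, and a skipped duplicate simply uses up one pass of the loop rather than being retried.

```python
import random
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product_id: int
    quantity: int


@dataclass
class Order:
    customer_id: int
    lines: list = field(default_factory=list)


PRODUCT_IDS = range(1, 21)   # hypothetical catalogue: product IDs 1..20
CUSTOMER_IDS = range(1, 11)  # hypothetical customers: IDs 1..10


def new_order_line(rng):
    """'Server' for order lines: one random draw for the product, one for the quantity."""
    return OrderLine(product_id=rng.choice(PRODUCT_IDS), quantity=rng.randint(1, 10))


def new_order(rng, customer_id):
    """'Server' for orders: a handful of lines, never two for the same product."""
    order = Order(customer_id)
    seen = set()
    for _ in range(rng.randint(2, 6)):
        line = new_order_line(rng)
        if line.product_id in seen:
            continue  # duplicate product: skip it and carry on with the loop
        seen.add(line.product_id)
        order.lines.append(line)
    return order


def seed(n_orders, seed_value=1):
    """Main loop: pick a customer at random by ID and give them a new order."""
    rng = random.Random(seed_value)
    orders_by_customer = {cid: [] for cid in CUSTOMER_IDS}
    for _ in range(n_orders):
        customer_id = rng.choice(CUSTOMER_IDS)
        orders_by_customer[customer_id].append(new_order(rng, customer_id))
    # A real EF6 app would make its single SaveChanges call here, after the loop.
    return orders_by_customer


orders_by_customer = seed(5005)
all_orders = [o for orders in orders_by_customer.values() for o in orders]
print(len(all_orders), "orders,", sum(len(o.lines) for o in all_orders), "order lines")
```

Keeping the duplicate check inside the order 'server' means the main loop stays as short as described above: pick a customer, ask for an order, add it.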

Don't forget line 299, which then causes all these thousands of changes to be saved (persisted) to disk and the back-end database.

Thanks Frank for jiggling my elbow enough to make me improve my static code, class and methods to make my three 'servers'.

The code works well, and would not work without LINQ at its heart. All current LINQ queries are in the class named 'ReturnEntityLists'; these have not been shown here, but similar ones were posted in earlier posts by me on this theme.

Hope this interests a few of you guys ;-0)

Have a nice weekend,
Phil.