Speed of csv data import
Posted: Mon Mar 04, 2024 9:24 am
Hi,
I'm using VO2.8SP4b to write a tool that opens a comma-separated file, analyzes its structure and then reads chunks of data from it. The data chunks are all 1024 rows long and 3 fields wide (10N0, 10N0, C3). The rows of each data chunk are stored in a barrayserver.
After the csv file is imported, the barrayserver is 3 fields wide and a multiple of 1024 records long.
A typical length would be 16 data chunks, i.e. a barrayserver with 16*1024 rows.
Below is some core code to import the csv data.
What I observe is the following:
The csv file is opened.
The first few data chunks are imported rather quickly.
The following data chunks are read in ever more slowly.
The import finishes after just a few seconds (2-3).
When another, new csv file with 16 or fewer data chunks is then opened, the import is almost instantaneous (1 sec.). If the csv is larger, the first 16 data chunks are imported quickly but the following ones ever more slowly.
In a more extreme case (160 data chunks) the import takes almost 3 minutes, which is far more than the expected 10x a few seconds.
Why is that? Is it a memory issue? How could I improve the import speed for the first import?
TIA
Jack
SELF:ba_spectra:Zap()
//(10N0, 10N0, C3)
SELF:o_csv:GoTop()
n_spec:=0
DO WHILE !SELF:o_csv:EoF
    s_ln:=SELF:o_csv:ReadLn()
    IF 'Spectrum:,' $ Left(s_ln, 10)
        //new data chunk found
        n_spec++
        //all spectra are 1024 long
        FOR i:=1 TO 1024
            GetAppObject():Exec(EXECWHILEEVENT)
            SELF:ba_spectra:Append()
            s_ln:=SELF:o_csv:ReadLn()
            //ParseCSVRecord returns an array with the fields
            a_res:=ParseCSVRecord(s_ln, '"', ',')
            //every record is 5 fields wide; we only use 2
            SELF:ba_spectra:FIELDPUT(1, Val(a_res[3]))
            SELF:ba_spectra:FIELDPUT(2, Val(a_res[5]))
            //also store the spectrum number
            SELF:ba_spectra:FIELDPUT(3, NTrim(n_spec))
        NEXT i
    ENDIF
ENDDO