how can i schedule a csv import for large files?

sales2010

Well-Known Member
Hello,

I use import csv plugin to do it for small files and it work great, but for large files, i get the "time out" message. I increased the max_execution_time in php.ini to 120 seconds, but still get the error.

The csv file have almost 8 MB with over 100.000 rows. The plugin should remove old data from the table and then will insert new values.

Thank you
 
I have the same problems with a file of 4 MB and 7000 rows.
I hope we can find a solution for this.

Regards,
Stefan
 
Ive not tried the following idea so cant guarantee if it will work but....

You could try using the cron csv import plugin. To avoid the timeouts you would need to run it as a real crontab task probably calling the php exe and not going via wget. Sorry to be vague as i really don't know if this approach is possible
 
Hmmm, this isn't a trivial problem.

Obviously all our code runs as part of page loads, subject to the main php.ini settings. The way we run cron plugins is by hooking in to Joomla's 'onLastProcess' event (or whatever it's called), which allows system plugins to be run as the very last thing during a page load. And doing CSV imports is a time consuming operation, as there's a lot of processing we have to do, which can easily exceed reasonable max execution times set in php.ini designed for normal web usage.

Delaying the inserts wouldn't buy much, as most of the overhead is in our internal processing rather than the actual database operations, working out what needs to be inserted, and what should update existing rows, and whether it's raw values, and whether there are any new elements to add, and whether there is joined data we need to handle, etc etc. We also couldn't do delayed inserts if there is any joined data (due to the chicken and egg problem of updating the FK with the newly created PK val of the joined data).

The only way I can think of is essentially what Rob is suggesting, but it would need a little "wrapper" script which instantiates the Joomla framework, and as much of Fabrik as is needed to directly call the cron CSV import plugin.

One other option might be for us to add another option to the generic cron plugin params, where you could specify a PHP script timeout value, and we could attempt to do an ini_set() for max_execution_time before running the plugin. This would be an easier quick fix.

As an experiment, can you try this:

In ./plugins/fabrik_cron/importcsv/importcsv.php, around line 70, so it's the first line in the process() method, put this:

PHP:
@ini_set('max_execution_time', 300);

... and try running your import task using the "Run" button on the Scheduled Tasks page.

300 (5 minutes) should be enough for fastserv, with 7000 rows.

For sales2010, you may need more than 300, for your 100k rows. I have no idea how long it would take for that many rows, although you could exptrapolate by testing with 1000 rows, see how long it takes, then multiplying by 100 (and add about 10% for luck).

NOTE - this will only work if your host allows setting of INI variables in the code, which typically is not the case in cheap shared hosting plans, but should be possible with any kind of virtualized or dedicated plan. You'll be able to tell if the ini_set() worked, by timing how long it is before the script times out, and seeing if it was the value set in php.ini, or the value you used in the ini_set().

-- hugh
 
Hello again,

We were moved on a new (virtualized) host - previous we were on a dedicated host - and i set the value to a very high level:

@ini_set('max_execution_time', 100000);

..i can't make it work automatically for files having almost 1 MB..

Work fine for other cron types and also work fine if i run the cron manually.

in php.ini i have this values (cannot modify the values):

max_execution_time = 60
max_input_time = 60
memory_limit = 256M
 
For me, working like a charm without table join, but I can?t get import to a joined table work.
Rob told me I have to use {tablename___elementname} instead of the short name. But whatever I try only the first record make it into the table.
 
Hello,

so, for you work fine with modified version of the importcsv.php file? If yes, what settings do you have set in php.ini and importcsv.php? Or do you use an external cron?
 
friendly bump..i really need a solution to make it work properly, we run daily several reports for commisions and other stuffs and we need to have updated data all the time
 
Friendly bump...sorry to be stressful, but i'm stuck with this problem and i don't have any other option to run a scheduled task ( i don't know how to access importcsv.php plugin from outside Joomla)
 
I made a quick (and dirty) sql file and import the csv via server cron. This works only if you have access to server cron.

Code:
data="/full/server/path/to/your/data.csv"
sql2="LOAD DATA LOCAL INFILE '$data' REPLACE INTO TABLE your_dbtable FIELDS TERMINATED BY ';' ENCLOSED BY '\"' LINES TERMINATED BY '\n' IGNORE 1 LINES (here,are,the,columns,you,want,to,import)"
mysql -u sqluser --password=sqlpassword -h localhost --database=sqldb  --execute="$sql1"
Copy the above in a file call it somewhat.sql set the right permissions change it to your needs and run it from cron. BUT pay attention no checks will be done!
 
We are in need of some funding.
More details.

Thank you.

Members online

No members online now.
Back
Top