Experience with heavy data processing nightly?

hpl123
Posts: 2594
Joined: Fri Feb 01, 2013 1:13 pm
Location: Scandinavia

Post by hpl123 »

Hi all,
I am working on an app with a lot of statistics in it, and the data needs parsing and processing on a daily basis. The way I have it set up now, a nightly Aware process parses and processes all data per tenant (and there could be quite a lot of data per tenant). The process consists of about 10 sub-processes that parse data, count stats etc., and each sub-process has a lot of sub-processes of its own, so it's quite complex.

The Aware process works flawlessly in testing, but I don't have any experience with heavy, long-running processes, so I am a bit concerned about it working correctly. The stats are a crucial part of the app, so they need to be processed flawlessly.

I have a couple of questions and would appreciate some feedback:
- Does anyone have any experience with running heavy and long processes on a daily basis?
- If the process works flawlessly in testing, is it safe to assume it will also work flawlessly every night? I am thinking about normal operations of course, not the data center exploding or some other highly unlikely scenario.
- Is it better to run separate nightly processes, i.e. divide the monster process up into one process at 2am, one at 3am etc., or is it OK to run it all in one go at e.g. 2am? No users would be logged on at that time, so the server should have enough resources to run a heavy/long process.
- Is the scheduler 100% reliable, or should I build in some check? E.g. the main process is scheduled to run at 2am, so do a check at 4am to make sure it really ran and, if not, run it then.
- When running nightly processes like these, would it be a good idea to build in some error flags notifying the administrator? (and possibly halting the steps after the error in the process).
- What are the recommended server memory allocations IF different from "normal" recommendations?
- Anything else I can configure or optimize on the server, in the DB etc. etc. to make sure heavy and long processes run smoothly?
- Anything else to think about?
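
To make the 4am check idea concrete, here is roughly the logic I have in mind, written as a small Python sketch rather than Aware rules. The function name `needs_rerun` and the 3-hour window are just assumptions for illustration; in Aware this would be a second scheduled process reading a "last successful run" timestamp from some status object.

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_rerun(last_success: Optional[datetime],
                now: datetime,
                max_age: timedelta = timedelta(hours=3)) -> bool:
    """Return True if there is no successful run recent enough.

    last_success: timestamp of the last completed nightly run, or None
                  if the job has never completed (hypothetical status field).
    max_age:      how old the last success may be before we rerun.
    """
    if last_success is None:
        return True
    return now - last_success > max_age


# Check performed at 4am for a job scheduled at 2am.
check_time = datetime(2022, 3, 5, 4, 0)

print(needs_rerun(datetime(2022, 3, 5, 2, 40), check_time))  # False: ran tonight
print(needs_rerun(datetime(2022, 3, 4, 2, 35), check_time))  # True: last success was yesterday
print(needs_rerun(None, check_time))                         # True: never ran
```

The point is that the check looks at a recorded success timestamp and not at whether the scheduler claims to have fired, so it also catches a run that started but died halfway.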

Thanks in advance.
Henrik (V8 Developer Ed. - Windows)
pbrad
Posts: 781
Joined: Mon Jul 17, 2006 11:03 pm
Location: Ontario, Canada

Re: Experience with heavy data processing nightly?

Post by pbrad »

Henrik, a couple of thoughts in no specific order:
- Any chance of passing some of the heavy lifting out to stored procedures?

- If you break the process into separate, spread-out processes, make sure that each process checks that any previous process it relies upon did not fail.

- If you create a small tracking object and write to it throughout your process with a text comment and timestamp, you can track how long various sections of your large process took to run and whether the overall process ran properly (or at all). I have found this very useful for monitoring scheduled processes in the past.

- If the process runs seamlessly during the day then it will run the same way at night assuming that no other heavy processes are scheduled to run before the first process would likely end. This might not be a problem but it is always best to avoid it.

- Always take the time to analyze your logs to see if you can reduce the number of rules that repeatedly fire. It is a slow process but well worth it.
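
To illustrate the tracking object and the "did the previous step fail" check, here is a minimal sketch in Python rather than Aware rules. Everything in it (`RunLog`, `TrackingEntry`, `run_step`, the status strings) is a made-up name for illustration; in Aware the log would be a business object with step, status, comment and timestamp attributes.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Optional, Sequence


@dataclass
class TrackingEntry:
    step: str
    status: str          # "started", "ok", "failed" or "skipped"
    comment: str
    timestamp: datetime


@dataclass
class RunLog:
    entries: List[TrackingEntry] = field(default_factory=list)

    def write(self, step: str, status: str, comment: str = "") -> None:
        self.entries.append(TrackingEntry(step, status, comment, datetime.now()))

    def succeeded(self, step: str) -> bool:
        return any(e.step == step and e.status == "ok" for e in self.entries)

    def duration(self, step: str) -> Optional[float]:
        """Seconds between the 'started' and the 'ok'/'failed' entry of a step."""
        start = next((e.timestamp for e in self.entries
                      if e.step == step and e.status == "started"), None)
        end = next((e.timestamp for e in self.entries
                    if e.step == step and e.status in ("ok", "failed")), None)
        if start is None or end is None:
            return None
        return (end - start).total_seconds()


def run_step(log: RunLog, step: str, work: Callable[[], None],
             requires: Sequence[str] = ()) -> bool:
    """Run one step, but only if every step it relies on finished with 'ok'."""
    missing = [r for r in requires if not log.succeeded(r)]
    if missing:
        log.write(step, "skipped", "prerequisite not ok: " + ", ".join(missing))
        return False
    log.write(step, "started")
    try:
        work()
    except Exception as exc:
        log.write(step, "failed", repr(exc))
        return False
    log.write(step, "ok")
    return True


def parse_data() -> None:
    pass  # stand-in for a real sub-process


def count_stats() -> None:
    raise ValueError("bad row")  # simulate a failure


log = RunLog()
run_step(log, "parse", parse_data)
run_step(log, "count", count_stats, requires=["parse"])
run_step(log, "report", parse_data, requires=["count"])  # skipped: "count" failed

for e in log.entries:
    print(e.timestamp.isoformat(timespec="seconds"), e.step, e.status, e.comment)
```

Reading the log the next morning shows which step stopped the chain and how long each section took, which is the same information you would otherwise have to dig out of the server logs.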

Cheers,
Pete
Pete Bradstreet
Contract developer of commercialized applications

AwareIM Ver. 8.2
hpl123
Posts: 2594
Joined: Fri Feb 01, 2013 1:13 pm
Location: Scandinavia

Re: Experience with heavy data processing nightly?

Post by hpl123 »

pbrad wrote: Sat Mar 05, 2022 11:54 am Henrik, a couple of thoughts in no specific order: [...]
Good thoughts, thanks for sharing, Pete. I will incorporate this in my processes.
Henrik (V8 Developer Ed. - Windows)
tford
Posts: 4238
Joined: Sat Mar 10, 2007 6:44 pm

Re: Experience with heavy data processing nightly?

Post by tford »

Always a good idea to follow Pete's advice. He taught me SO much early on during my Aware journey.

In addition to his log suggestion, I've also fired off an email to system admin (myself) at the end of regularly scheduled update processing. I include key totals & time/date stamps (beginning and ending) in the email which leaves a good audit trail and allows a quick overview without logging in.
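
For anyone who wants to see the shape of that summary email, here is a rough Python sketch using the standard library's `email.message.EmailMessage`. The function name `build_summary_email`, the addresses and the totals are placeholders for illustration; in Aware itself this would be a SEND action at the end of the process, not Python.

```python
from datetime import datetime
from email.message import EmailMessage
from typing import Dict


def build_summary_email(started: datetime, finished: datetime,
                        totals: Dict[str, int],
                        sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) an end-of-run summary for the administrator."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Nightly processing finished {finished:%Y-%m-%d %H:%M}"

    lines = [
        f"Started:  {started:%Y-%m-%d %H:%M:%S}",
        f"Finished: {finished:%Y-%m-%d %H:%M:%S}",
        f"Duration: {finished - started}",
        "",
        "Key totals:",
    ]
    lines += [f"  {name}: {value}" for name, value in totals.items()]
    msg.set_content("\n".join(lines))
    return msg


summary = build_summary_email(
    started=datetime(2022, 3, 7, 2, 0, 0),
    finished=datetime(2022, 3, 7, 3, 30, 0),
    totals={"tenants_processed": 12, "records_parsed": 48210},  # made-up numbers
    sender="scheduler@example.com",
    recipient="admin@example.com",
)
print(summary)

# Sending is left out here; with a real mail server it would be something like
# smtplib.SMTP("mail.example.com").send_message(summary)  (hypothetical host).
```

Putting the start and end timestamps in the body means a missing or unusually late email is itself a signal that something went wrong overnight.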
Tom - V8.8 build 3137 - MySql / PostGres
hpl123
Posts: 2594
Joined: Fri Feb 01, 2013 1:13 pm
Location: Scandinavia

Re: Experience with heavy data processing nightly?

Post by hpl123 »

tford wrote: Mon Mar 07, 2022 2:57 am Always a good idea to follow Pete's advice. [...]
Yeah, Pete is an old-timer, and so are you and I at this point, Tom ;). There are always things to learn in Aware though, so it's good to get some feedback.
Henrik (V8 Developer Ed. - Windows)