I need to scale up my application. When I reach a particular user volume, the UI becomes unresponsive and users can't log in in a reasonable time. I give the AwareIM and Tomcat servers 4GB and 2GB respectively. In the control panel, neither process ever exceeds 1GB and the CPU is not challenged. I'm really not sure what exactly I'm running out of, but it's something. I would have thought that I would see one or both Java processes getting close to their memory allocations and then pffft, all of this would happen. But it's something else.
I am happy to set up a load balancer, but based on what I know of them, the load balancer polls the machine for signs of stress, and at the machine level there is none.
So first, I need to better understand the Java internals to understand what is happening. Second, I need a strategy for scaling that will work. I really like the load balancer approach because I get server redundancy and higher availability. I think the next best option is to set up a second node. Okay, but I'd rather not.
If you are operating at high load or any scale above a single app server, I hope you add some value to this thread.
Scaling Challenges
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
Re: Scaling Challenges
I would be interested to know:
1. The size of your MySQL database
2. MySQL setting: innodb_buffer_pool_size
3. Number of simultaneous users
Re: Scaling Challenges
Database size is about 150GB
buffer_pool_size=140509184
Concurrent users around 150
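To put the two reported figures side by side, here is a small arithmetic sketch. It assumes the posted `buffer_pool_size` is the `innodb_buffer_pool_size` asked about above, expressed in bytes, and it reads "150GB" as GiB; the function name is made up for illustration.

```python
# Figures reported in the thread.
buffer_pool_bytes = 140_509_184      # buffer_pool_size as posted (assumed to be bytes)
database_bytes = 150 * 1024**3       # "about 150GB", read here as GiB (assumption)


def buffer_pool_summary(pool_bytes: int, db_bytes: int) -> dict:
    """Return the pool size in MiB and the share of the database it could hold."""
    return {
        "pool_mib": pool_bytes / 1024**2,
        "share_of_db_percent": 100 * pool_bytes / db_bytes,
    }


summary = buffer_pool_summary(buffer_pool_bytes, database_bytes)
print(f"buffer pool: {summary['pool_mib']:.0f} MiB")
print(f"share of database that fits in the pool: {summary['share_of_db_percent']:.3f}%")
```

The pool works out to 134 MiB, well under 0.1% of the database. How much that matters depends on how much of the 150GB is hot data rather than stored documents, which the thread does not say.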
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
Re: Scaling Challenges
Are your documents stored in the DB or the file system?
Re: Scaling Challenges
I would definitely start by moving all the document images to the file system before starting other initiatives. I bet that's where the huge database size comes from. We went from a 50GB DB down to 2.6GB once this had been done, and things seem to be a lot easier and more efficient.
We don't have as many concurrent users on the system as you do, but we hope to get there in the very near future. So I am always keen to know more when this topic comes up, because there are no real stats around to prove the efficiency of AwareIM at these levels. So one has to take a gamble: first get to high user volumes, then hold thumbs that you can navigate a path forward.
All I can say, from my past experience and the crap we had with images stored in the DB, is that I would never go there again. This could be part of your problem.
Re: Scaling Challenges
My experience in this area is that it all comes down to how expensive the Aware IM transactions are. For example, if you have processes that update multiple tables at once, or that lean on the regular user too much for context, this can place a massive strain on things.
We have a very big application that is used by many users. Over the course of last year we had lots of performance issues.
There was lots we did, but the headlines are:
- start monitoring for any deadlocks in the DB and review which (if any) tables are getting hit the hardest.
- check the active processes and see which are not clearing quickly. These will be good candidates for moving to a stored procedure (SP). We have moved a significant number of Aware IM processes to stored procedures and the performance boost has been profound. For example, having Aware IM calculate ages for 150,000 people might take 4 hours or so. An SP can do this in about 5 seconds.
- avoid writing to the regular user as this will likely trigger a bunch of rules. Instead, we write to the ref user with an SP. Reading an object carries no cost.
- we have also stripped out as many rules as possible from the major objects in our system and instead these are governed with triggers in the DB.
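A minimal sketch of the set-based idea behind the stored procedure suggestion above. It is not a stored procedure and not Aware IM code: it uses an in-memory SQLite database through Python only so that it runs anywhere, and the `person` table, its columns, the sample rows, and the reference date are all hypothetical. A real SP in MySQL or SQL Server would use that engine's own date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, birth_date TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO person (birth_date) VALUES (?)",
    [("1980-03-15",), ("1995-12-31",), ("2000-07-05",)],
)

today = "2017-07-05"  # fixed reference date so the result is reproducible

# Set-based: a single statement updates every row inside the database engine,
# instead of loading each record into the application and saving it back.
conn.execute(
    """
    UPDATE person
    SET age = CAST(strftime('%Y', :today) AS INTEGER)
            - CAST(strftime('%Y', birth_date) AS INTEGER)
            - (strftime('%m-%d', :today) < strftime('%m-%d', birth_date))
    """,
    {"today": today},
)

ages = [row[0] for row in conn.execute("SELECT age FROM person ORDER BY id")]
print(ages)
```

The point of the sketch is only the shape of the work: one statement over the whole table, with no per-record round trip and no rules firing per update.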
In summary:
Monitor and resolve deadlocks
Switch processes for stored procedures
Switch rules for triggers
These changes will significantly increase your application's capacity for extra users without the need for a load balancer. However, we have also worked with Vladimir to support storing a user's session ID in a database rather than in Tomcat memory, which is an asset if you go down that route.
Making the changes I am outlining requires a different style of thinking, as you will need to think much more about version control for your SPs and triggers because they are not wrapped up in the BSV, but from our limited experience the change was more than worth it.
Martyn
AwareIM V7.1. (productions) V8 Development, MS SQL Server 2014,
Re: Scaling Challenges
"In summary:
Monitor and resolve deadlocks
Switch processes for stored procedures
Switch rules for triggers"
The reason I use AwareIM is its simplicity and powerful rules engine. IMHO, dumping the rules engine in the way you suggest just does not make sense: simplicity goes out the window, and I can't imagine how complicated it is to maintain the app.
Resolving deadlocks in the DB and ensuring the MySQL settings are optimised for your application is key to any serious application. One needs to pursue scaling within the context of the way AwareIM is supposed to work, rather than going the alternative route you suggest. According to Vlad, AwareIM in its current form is supposed to scale.
There is a switch on each rule that lets you disable reference rules. If you use this on as many of your rules as possible, the problem of the rules engine thrashing away unnecessarily goes away.
We reference the RegularUser on every transaction and have never had problems.
"... Aware IM calculate ages for 150,000 people might take 4 hours or so. An SP can do this in about 5 seconds."
BTW, I would be interested to know what process or rule you used that made AwareIM take this long in the above calculation. Something seems wrong here.
Last edited by ACDC on Wed Jul 05, 2017 7:59 am, edited 1 time in total.
Re: Scaling Challenges
You are likely right and I am likely wrong.
I am only reflecting on my own experience of building a very large, very successful application used by many hundreds of users.
Thanks
AwareIM V7.1. (productions) V8 Development, MS SQL Server 2014,
Re: Scaling Challenges
I appreciate the discussion. I have optimized my app according to the recommendations. I can't say I've done everything, but I did reduce the workload for a lot of rules. Here's the scene. Around 5:00 pm Eastern time I start to get heavy traffic. If I try to log in, I get nowhere. Judging by the CPU graph, some users are in and there is good CPU activity, but I can't get in. Then I reset the server, blammo! I go to log in immediately after and it's like nothing cleared. Is it possible the users are all just jumping right back in? It seems so. Judging by the access log, I'm getting 4-7 logins per minute. Those add up fast.
In this app, users log in to complete personal history questionnaires as they apply for police officer positions. So they care and want to get it done. They have a task.
All of the above suggestions are good at some point, but what I need right now is to scale up. Having another front-end server or two would at least give me some space, but how is a load balancer supposed to know that the Java stack is dead, or very sluggish?
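As a rough way to see how a login rate like that adds up, here is a sketch using Little's law (average number in the system = arrival rate x average time in the system). The 4-7 logins per minute comes from the access log figure above; the session lengths are assumed values for illustration, not measurements from this app.

```python
def concurrent_sessions(logins_per_minute: float, avg_session_minutes: float) -> float:
    """Little's law: average concurrency = arrival rate x average time in system."""
    return logins_per_minute * avg_session_minutes


# 4-7 logins per minute is the figure from the access log.
# The session lengths below are assumptions for illustration only.
for rate in (4, 7):
    for session_minutes in (15, 30, 60):
        n = concurrent_sessions(rate, session_minutes)
        print(f"{rate}/min x {session_minutes} min -> ~{n:.0f} concurrent sessions")
```

If questionnaire sessions last around half an hour, 4-7 logins per minute corresponds to roughly 120-210 concurrent sessions, which brackets the figure of about 150 concurrent users mentioned earlier in the thread. It would also be consistent with a restart not helping much if the same users log straight back in, though that is a guess rather than something the numbers prove.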
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
Re: Scaling Challenges
I don't understand what it means in practical terms from an AIM architecture perspective, but the concept of a process sitting in a "wait" state is a concern to me. It breaks the tenet of web applications being stateless and, on the surface, feels like a performance killer.
My response helps you exactly zero, Kingsley. Sorry for that.
VocalDay Solutions - Agility - Predictability - Quality
We specialize in enabling business through the innovative use of technology.
AwareIM app with beautiful UI/UX - https://screencast-o-matic.com/watch/crfUrrVeB3t
Re: Scaling Challenges
I think I have no choice but to add another front-end server and a load balancer. Near as I can tell, the load balancer can basically tap an HTML page, like logon.html, and measure the response. Seems like it would work. My fundamental problem here is user volume.
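A minimal sketch of that kind of check: request a page, time the response, and call the node unhealthy if it errors, times out, or answers too slowly. This is not how any particular load balancer implements its health checks; the `probe` function, the `ProbeResult` type, and the thresholds are hypothetical, and the URL in the usage comment is a placeholder.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    healthy: bool
    status: Optional[int]   # HTTP status code, or None if no response arrived
    latency_s: float
    reason: str


def probe(url: str, timeout_s: float = 5.0, max_latency_s: float = 2.0) -> ProbeResult:
    """Fetch url once and judge health by status code and response time."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return ProbeResult(False, exc.code, time.monotonic() - start, f"HTTP {exc.code}")
    except OSError as exc:
        # Refused connections, DNS failures and timeouts all end up here.
        return ProbeResult(False, None, time.monotonic() - start, f"error: {exc}")
    latency = time.monotonic() - start
    if latency > max_latency_s:
        return ProbeResult(False, status, latency, "too slow")
    return ProbeResult(True, status, latency, "ok")


# Usage (placeholder host and path):
#   print(probe("http://app-server.example/logon.html"))
```

One caveat worth checking before relying on this: a static page may be served by Tomcat without touching the Aware IM server or the database, so a probe of logon.html could report healthy while logins are stuck. A page or request that exercises the full stack would be a more telling target, if one is available.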
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
Re: Scaling Challenges
Some optimisation suggestions:
Tomcat > conf > server.xml
* maxThreads="300" - determines the maximum number of simultaneous requests that can be handled.
MySQL > my.ini > [mysqld] (my settings)
* max_connections = 160 - the maximum number of simultaneous client connections the MySQL server will allow.
* innodb_buffer_pool_size = 10G
* innodb_buffer_pool_instances = 16
* innodb_open_files = 500
* innodb_read_io_threads = 64
* innodb_write_io_threads = 64
* innodb_io_capacity = 1000
* max_allowed_packet = 132M
* query_cache_size = 201M
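A small sketch for sanity-checking settings like these before applying them. It parses a my.ini-style fragment and works out how much memory each buffer pool instance gets; the `to_bytes` helper and the inline fragment are illustrative, and only three of the settings above are reproduced.

```python
import configparser

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(value: str) -> int:
    """Convert MySQL-style sizes such as '10G' or '132M' to bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


# A fragment using the values posted above.
MY_INI = """
[mysqld]
max_connections = 160
innodb_buffer_pool_size = 10G
innodb_buffer_pool_instances = 16
"""

# allow_no_value lets the parser accept bare flags, which real my.ini files often contain.
parser = configparser.ConfigParser(allow_no_value=True)
parser.read_string(MY_INI)
mysqld = parser["mysqld"]

pool_bytes = to_bytes(mysqld["innodb_buffer_pool_size"])
instances = int(mysqld["innodb_buffer_pool_instances"])
per_instance_mib = pool_bytes / instances / 1024**2
print(f"{per_instance_mib:.0f} MiB per buffer pool instance")
```

With 10G split across 16 instances, each instance gets 640 MiB. As far as I recall, the MySQL manual suggests sizing things so that each instance is at least 1GB, so on a 10G pool something like 8 instances (1.25GB each) may be a better fit; treat that as something to verify against the manual for your MySQL version.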
Re: Scaling Challenges
Could you share your my.ini publicly or privately, please?
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
Re: Scaling Challenges
Sure, my.ini for MySQL 5.5 (attachment renamed to my.zip)
Parameters reference:
https://dev.mysql.com/doc/refman/5.5/en ... eters.html
Attachments:
my.zip (MySQL 5.5, rename to my.ini; 9.07 KiB)