I think a tutorial on this would be nice, and it would demonstrate AwareIM's ability to scale. Here's my primary consideration regarding scaling, based on my experience. I've often seen one or the other of the Java processes break down irrespective of the machine load. There are many ways to set up load balancing, but at its most basic level the load balancer monitors the machine's CPU, RAM, or similar factors. No matter how much RAM your machine has, only a certain amount of it is allocated to the two Java processes. When either of them runs out of memory, the load balancer won't know unless you have configured it to monitor the Java processes at some level. There are tools for doing so, but they are complicated (for me) and I haven't messed with them.
I've suffered a number of system failures, which have been noted on this forum. In every case my machine was never stressed; it was the Java processes that failed. At this point my app server is running superbly, and I literally have no idea where the breaking point is, or what it is, but I don't believe it will be the capacity of the machine. Theoretically, in 64-bit operation, you should be able to give the Java processes as much RAM as the machine can spare, and if your CPU can keep up, you will be serving a lot of customers before something starts to break routinely.
The question then will be: what is breaking? Under a very heavy load, either your system will simply take longer to process requests, or one of the Java processes will start failing (probably due to memory starvation), and it will be time to get a second server and a load balancer. If you can't afford to run the load balancer around the clock, you will have to devise ways to determine when to turn it on and when to shut it down. If you must provide service at the five-nines level, there is a solid case for at least two application servers purely for maintenance and system availability.
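To make the "when to turn it on and when to shut it down" part concrete, here is a minimal Python sketch of one way to decide. It is not something I run in production and it is not an AwareIM feature. It assumes you can hit some page of your app over HTTP and treat a slow or failed response as a sign that a Java process is struggling, since machine-level CPU and RAM numbers won't show that. The URL, the function names `probe` and `decide`, and the thresholds are all made up for illustration.

```python
import http.client
import time
import urllib.request
from typing import Optional, Sequence


def probe(url: str, timeout_s: float = 5.0) -> Optional[float]:
    """Request the URL once; return the response time in seconds, or None on failure."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return None
    return time.monotonic() - start


def decide(samples: Sequence[Optional[float]], extra_capacity_on: bool,
           slow_s: float = 2.0, fast_s: float = 0.5) -> bool:
    """Return True if the extra capacity should be on after this batch of samples.

    Hypothetical thresholds: turn on when the average response time reaches
    slow_s or most probes fail; turn off only when the average is at or below
    fast_s with no failures. In between, keep the current state so the
    decision does not flap back and forth.
    """
    if not samples:
        return extra_capacity_on
    failures = sum(1 for s in samples if s is None)
    if failures > len(samples) / 2:
        return True
    ok = [s for s in samples if s is not None]
    average = sum(ok) / len(ok)
    if average >= slow_s:
        return True
    if average <= fast_s and failures == 0:
        return False
    return extra_capacity_on


if __name__ == "__main__":
    # Placeholder URL; substitute a page your own app actually serves.
    samples = [probe("http://localhost:8080/") for _ in range(5)]
    print("extra capacity should be on:", decide(samples, extra_capacity_on=False))
```

The gap between the two thresholds is deliberate: a single cutoff would switch the second server on and off every time the load hovered around it. You would still have to wire the result to whatever actually starts and stops your second server and load balancer.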
My last tip is this. AwareIM is fantastic, but ultimately you are responsible for architecting your app so that it can operate at scale. If you think you're going to need a server farm, you have a lot of up-front considerations, and you must think about operations at scale in everything you do. Be sure to watch the tutorial on optimization and consider your alternatives in designing objects and relationships. A good example is Ben Hayat's recent post (https://www.awareim.com/forum/d/9042), in which he is looking for the optimal design of the RegularUser object. This is the kind of thinking you must apply (ask me how I know...).
To the exact point of your question: you should be able to scale your app server up or down fairly rapidly by maintaining a static snapshot of the server and provisioning it to a larger or smaller instance. There is no license issue there. There is downtime for sure, but if it's well planned it should only take a few minutes. I don't do this with any regularity, but when I have opted to scale up, that's what I did, and it's pretty easy.