There are probably a lot of people that can explain this better but I'll make an attempt at this.
The scale of the graphs you see in the application monitoring does not directly represent the size of your Cloud Node. 100% on the graph is not the maximum your environment can use.
Load indicates usage of your (virtual) cores: a load of 1.2 means that one core is consistently at 100% and a second core is at 20%. That's ok for a short period of time, but a load that stays above 1 means that a single microflow (thread) has been running for a long time and is consistently using a whole core.
I found this website giving a very good explanation about load: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
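As a rough sketch of that explanation (assuming a Unix-like system, since `os.getloadavg()` isn't available on Windows), you can relate a load average to your core count like this:

```python
import os

def per_core_load(load: float, cores: int) -> float:
    """Normalize a load average by the number of (virtual) cores."""
    return load / cores

# The example from above: a load of 1.2 means 1.2 cores' worth of work.
# On a single-core machine that is 20% more work than the CPU can handle;
# on a 4-core machine it only uses 30% of the total capacity.
print(per_core_load(1.2, 1))  # 1.2 -> overloaded
print(per_core_load(1.2, 4))  # 0.3 -> plenty of headroom

# On a live system you can read the 1/5/15-minute load averages:
one, five, fifteen = os.getloadavg()
print(per_core_load(one, os.cpu_count()))
```

The same load number can therefore be healthy or alarming depending on how many cores the node actually has.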
As you mention you have 300% idle time, that means the activities are waiting 300% of the time because of your load. The 20% of work that can't run right away is very costly in your case.
CPU usage literally means how active your CPUs are: if it is at 400%, I would assume you have 4 CPUs running at 100%.
An overly simplified scenario: you have 6 tasks, 4 CPUs, and each CPU runs 1 task to completion at a time.
You now have a load of 6 (1.5 per core): all 4 CPUs are 100% occupied processing the first 4 tasks.
The other 2 tasks are waiting until a CPU frees up.
If a single task takes about 10 minutes to execute, a 300% idle/wait time means that your task is idle for 300% of its runtime. In other words, it's waiting for 30 minutes.
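The scenario above can be sketched as a tiny queue simulation (the numbers are the hypothetical ones from the example, just to illustrate the arithmetic):

```python
def schedule(num_tasks: int, num_cpus: int, minutes_per_task: float):
    """Return (start, finish) times when tasks queue up on a fixed CPU pool."""
    # Each CPU becomes free again when its current task finishes.
    cpu_free_at = [0.0] * num_cpus
    times = []
    for _ in range(num_tasks):
        # The next task starts on whichever CPU frees up first.
        start = min(cpu_free_at)
        finish = start + minutes_per_task
        cpu_free_at[cpu_free_at.index(start)] = finish
        times.append((start, finish))
    return times

# 6 tasks, 4 CPUs, 10 minutes each: the first 4 start immediately,
# the last 2 wait 10 minutes for a CPU to free up.
times = schedule(6, 4, 10)
print(times)

# A task that waits 30 minutes before its 10 minutes of work has a
# wait/work ratio of 300% -- the "300% idle" figure from the question.
wait_ratio = 30 / 10 * 100
print(wait_ratio)  # 300.0
```

Adding CPUs shortens the queue, but as long as the one problematic task keeps a core pinned, the rest of the work keeps stacking up behind it.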
What you should focus on first is finding the process that causes this, because this isn't something you want to keep. Getting a bigger server would only mask the problem. Once you have identified the exact process that is causing it, you can post that on the forum too; there are probably enough people here who could give you suggestions on improving it.