Wednesday, February 10, 2010

Oracle database tuning and being "nice"

I was reading this blog about the Linux "nice" command
http://prodlife.wordpress.com/2008/04/09/yet-another-nice-myth/

which was referenced in another blog about 100% CPU utilization.
http://hoopercharles.wordpress.com/2010/02/05/faulty-quotes-6-cpu-utilization/

Both reminded me of experiences while I was researching how to insert one billion rows per day that I'd written about in my first post.

I'd started out around 6K rows per second insert speed. With more research I was able to get the insert speed up to about 9k or 10k per second. Usually I ran Oracle data dictionary queries to monitor the performance and other activity. Then I started using unix commands as well.

On the dedicated database server, there were some unix processes taking up the majority of the cpu. So all the CPU were running at least 99%, 100% of the time.

It turned out that the unix admin was, on the company servers, privately running processes on the "spare" cpu using "nice". The processes had to do with something astronomical, calculating things with brute force algorithms for a non profit.

I said that this was slowing down Oracle, and my testing. He insisted that since it was running at the lowest priority of nice, it could not possibly be slowing down the database. Intuitively, I knew this was wrong.

Eventually, we got those astronomical calculation processes off the server. The insert speed immediately went up to 13k rows per second. This was a very significant increase in performance.

Conclusion: the extra "nice" processes on the CPU had in fact been slowing down the Oracle database.

So, given these experiences, on multi user, multi tasking Oracle, I don't think it's a good idea to have 100% CPU server utilization for sustained periods of time. Especially if there are a lot of inserts or updates (reads and writes to the hard drives) occurring.

2 comments:

  1. In an older post called "Nice Processes Finish Last" I explained that even the nicest process still uses CPU time and other processes will spend some time waiting for it.

    You were correct and your admin did not know nice as well as I'd expect from a unix admin.

    ReplyDelete
  2. Reminds me of a horrible story that I've encountered a couple of times in my travels. People renice-ing Oracle processes of low-priority business tasks so that the Oracle processes of higher-priority business tasks could run faster.

    ...Except that those low-priority processes that were sitting in the runqueue not running were holding the cache buffers chains latches that the higher-priority processes needed to complete. The higher-priority processes were getting all the CPU time they needed to spin for latches that weren't available. CPU utilization was way up there.

    Nothing could get done.

    Ugly.

    ReplyDelete