On my post about observing the Exadata V1, Mark Seger (author of collectl and the collectl utilities) left an interesting comment about correlating activity across a system, the sample and snap times, and seeing the state of the subsystem before and after…
The comment made me curious about the effect of snap intervals on the performance numbers of the datafiles and block devices, especially the latency numbers. So I made a few test cases and wrote some scripts that give me latency numbers from the database at 5-second, 10-minute, and 60-minute intervals. I also ran OSWatcher at a 5-second interval to get a view of the block devices.
While doing all of this, I made an interesting discovery about how the datafile latency output is computed. With the performance numbers in hand, I was able to quantify how an average can be misleading and mask a problem with datafile I/O latency.
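To make the masking effect concrete, here is a minimal sketch in Python using made-up numbers, not the output of my test cases. It assumes the latency for an interval is the total I/O time divided by the number of I/Os in that interval; the sample values, the two-minute stall, and the helper names (`make_samples`, `avg_latency_ms`, `rollup`) are all hypothetical.

```python
# Synthetic data only: one hour of 5-second samples (720 of them).
# Every number below is invented for illustration.
SAMPLE_SECS = 5
SAMPLES_PER_HOUR = 3600 // SAMPLE_SECS  # 720


def make_samples():
    """Return one (io_count, total_io_ms) tuple per 5-second sample."""
    samples = []
    for i in range(SAMPLES_PER_HOUR):
        io_count = 100
        # Hypothetical 2-minute stall: samples 300..323 take 200 ms per I/O,
        # everything else takes 5 ms per I/O.
        per_io_ms = 200.0 if 300 <= i < 324 else 5.0
        samples.append((io_count, io_count * per_io_ms))
    return samples


def avg_latency_ms(samples):
    """Assumed formula: total I/O time / total I/O count over the interval."""
    ios = sum(n for n, _ in samples)
    total_ms = sum(t for _, t in samples)
    return total_ms / ios if ios else 0.0


def rollup(samples, samples_per_bucket):
    """Average latency per bucket of consecutive samples."""
    return [
        avg_latency_ms(samples[i:i + samples_per_bucket])
        for i in range(0, len(samples), samples_per_bucket)
    ]


samples = make_samples()
five_sec = rollup(samples, 1)     # 5-second view
ten_min = rollup(samples, 120)    # 10-minute view
one_hour = rollup(samples, 720)   # 60-minute view

print("worst 5-second latency :", max(five_sec), "ms")
print("worst 10-minute latency:", max(ten_min), "ms")
print("60-minute latency      :", one_hour[0], "ms")
```

With these invented numbers, the 5-second view shows the stall at 200 ms, the 10-minute view that contains it reports 44 ms, and the 60-minute view reports 11.5 ms. The longer the interval, the more the stall fades into the average.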
I did a one-take (amateur) clip straight from my iPhone 3GS to give you an overview of what I found.
In the next post I will go into detail on the following:
- where and how to get datafile IO latency
- how it is computed
- how a long averaging interval can affect the latency output
- how this affects performance tuning
- what you can do about it
