I mostly like and use DB2’s Self Tuning Memory Memory Manager (STMM) for my OLTP databases where I have only one DB2 Instance/Database on a database server. I do have some areas that I do not let it set for me. I’ve recently learned about an analysis tool – Adam Storm did a presentation that mentioned it at IDUG 2014 in Phoenix.
Parameters that STMM Tunes
To begin with, it is important to understand what STMM tunes and what it doesn’t. I recommend reading When is ‘AUTOMATIC’ Not STMM?. There are essentially only 5 areas that STMM can change:
- DATABASE_MEMORY if AUTOMATIC
- SORTHEAP, SHEAPTHRES_SHR if AUTOMATIC, and SHEAPTHRES is 0
- BUFFERPOOLS if number of pages on CREATE/ALTER BUFFERPOOL is AUTOMATIC
- PCKCACHESZ if AUTOMATIC
- LOCKLIST, MAXLOCKS if AUTOMATIC (both must be automatic)
Any other parameters, even if they are set to “AUTOMATIC” are not part of STMM.
Why I don’t use STMM for PCKCACHESZ
A number of the e-commerce database servers I support are very much oversized for daily traffic. This is common for retail sites because there are always peak periods, and servers tend to be sized to handle those. Many retail clients have extremely drastic peak periods like Black Friday, Cyber Monday, or other very critical selling times.
I noticed for one of my clients that was significantly oversized on Memory, DB2 was making the package cache absolutely huge. I saw this:
Package cache size (4KB) (PCKCACHESZ) = AUTOMATIC(268480)
That’s a full GB allocated to the package cache. There were over 30,000 statements in package cache, the vast majority with only a single execution. The thing is that for my OLTP databases the statements for which performance is critical are often static SQL or they’re using parameter markers. Most of the ad-hoc statements that are only executed once I don’t really care if they’re stored in package cache. This was about a 50-100 GB database on a server with 64 GB of memory. The buffer pool hit ratios were awesome, so I guess DB2 didn’t really need the memory there, but still. In my mind, for well-run OLTP databases, that much package cache does not help performance. I am certain there may be databases that need that much or more in the Package Cache, but this database was simply not one of them. Because of this experience I set the package cache manually and tune it properly.
A few STMM Caveats
Just a few things to note – I have heard rumors of issues with STMM when there are multiple DB2 instances running on a server. I have not personally experienced this. Also, the settings that STMM is using are not transferred at all to the HADR standby, so when you fail over, you may have poor performance while STMM starts up. You could probably script a regular setting of the STMM parameters to deal with this. Also if you have a well-tuned, well performing non-STMM database there is probably little reason and not much reward in changing it to STMM. Most experts with database performance can likely tune the database better than STMM, but we can’t all be performance experts, or give as much time as we’d like to every database we support.
The STMM Log Parser
STMM logs the changes it make in parameter sizes both to the db2diag.log and to some STMM log files. (hint: IBM, maybe these could be used to periodically update the HADR standby too?). The log files are in the stmmlog
subdirectory of the DIAGPATH. The log files aren’t exactly tough to read, but they don’t really present the information in an easy to view way. Entries look a bit like diagnostic log entries:
2014-07-02-23.44.40.788684+000 I10464203A600 LEVEL: Event PID : 18677976 TID : 46382 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : WC42P1L1 APPHDL : 0-12466 APPID: *LOCAL.DB2.140620223552 AUTHID : DB2INST1 HOSTNAME: ecprwdb01s EDUID : 46382 EDUNAME: db2stmm (WC42P1L1) 0 FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:2065 DATA #1 : String, 115 bytes Going to sleep for 180000 milliseconds. Interval = 5787, State = 0, intervalsBeforeStateChange = 0, lost4KPages = 0 2014-07-02-23.47.40.807231+000 I10464804A489 LEVEL: Event PID : 18677976 TID : 46382 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : WC42P1L1 APPHDL : 0-12466 APPID: *LOCAL.DB2.140620223552 AUTHID : DB2INST1 HOSTNAME: ecprwdb01s EDUID : 46382 EDUNAME: db2stmm (WC42P1L1) 0 FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:1909 MESSAGE : Activation stage ended 2014-07-02-23.47.40.807661+000 I10465294A488 LEVEL: Event PID : 18677976 TID : 46382 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : WC42P1L1 APPHDL : 0-12466 APPID: *LOCAL.DB2.140620223552 AUTHID : DB2INST1 HOSTNAME: ecprwdb01s EDUID : 46382 EDUNAME: db2stmm (WC42P1L1) 0 FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:1913 MESSAGE : Starting New Interval 2014-07-02-23.47.40.808193+000 I10465783A925 LEVEL: Event PID : 18677976 TID : 46382 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : WC42P1L1 APPHDL : 0-12466 APPID: *LOCAL.DB2.140620223552 AUTHID : DB2INST1 HOSTNAME: ecprwdb01s EDUID : 46382 EDUNAME: db2stmm (WC42P1L1) 0 FUNCTION: DB2 UDB, Self tuning memory manager, stmmLogRecordBeforeResizes, probe:590 DATA #1 : String, 435 bytes *** stmmCostBenefitRecord *** Type: LOCKLIST PageSize: 4096 Benefit: -> Simulation size: 75 -> Total seconds saved: 0 (+ 0 ns) -> Normalized seconds/page: 0 Cost: -> Simulation size: 75 -> Total seconds saved: 0 (+ 0 ns) -> Normalized seconds/page: 0 Current Size: 27968 Minimum Size: 27968 Potential Increase Amount: 13984 Potential Increase Amount From OS: 13984 Potential Decrease Amount: 0 Pages Available For OS: 0 2014-07-02-23.47.40.808580+000 I10466709A993 LEVEL: Event PID : 18677976 TID : 46382 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : WC42P1L1 APPHDL : 0-12466 APPID: *LOCAL.DB2.140620223552 AUTHID : DB2INST1 HOSTNAME: ecprwdb01s EDUID : 46382 EDUNAME: db2stmm (WC42P1L1) 0 FUNCTION: DB2 UDB, Self tuning memory manager, stmmLogRecordBeforeResizes, probe:590 DATA #1 : String, 502 bytes *** stmmCostBenefitRecord *** Type: BUFFER POOL ( BUFF_REF16K ) PageSize: 16384 Saved Misses: 0 Benefit: -> Simulation size: 2560 -> Total seconds saved: 0 (+ 0 ns) -> Normalized seconds/page: 0 Cost: -> Simulation size: 2560 -> Total seconds saved: 0 (+ 0 ns) -> Normalized seconds/page: 0 Current Size: 25000 Minimum Size: 5000 Potential Increase Amount: 12500 Potential Increase Amount From OS: 12500 Potential Decrease Amount: 5000 Pages Available For OS: 5000 Interval Time: 180.029
Scrolling through each 10 MB file of this is not likely to give us a complete picture very easily. IBM is offering us, through developerWorks a log parser tool for STMM. The full writeup on it is here: http://www.ibm.com/developerworks/data/library/techarticle/dm-0708naqvi/index.html
The tool is free, and is a Perl script that DBAs can modify if they like. AIX and Linux tend to include Perl, and it’s not hard to install on Windows using ActivePerl or a number of other options. I happen to rather like a Perl utility as I do the vast majority of my database maintenance scripting in Perl.
Download and Set Up
The developerWorks link above includes the Perl script. Scroll down to the “download” section, click on “parseStmmLogFile.pl”, if you accept the terms and conditions, click “I Accept the Terms and Conditions”, and save the file. Then upload it to the database server you wish to use it on.
Syntax
There are several options here. Whenever you execute it, you will need to specify the name of one of your STMM logs, and the database name. The various options beyond that are covered below.
Examples
The default if you specify nothing beyond the file name and the database name is the s
option. This gives you the new size at each interval of each heap that STMM manages. The output looks something like this:
./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE s # Database: SAMPLE [ MEMORY TUNER - LOG ENTRIES ] [ Interv ] [ Date ] [ totSec ] [ secDif ] [ newSz ] [ ] [ ] [ ] [ ] [ LOCKLIST BUFFERPOOL - BUFF16K:16K BUFFERPOOL - BUFF32K:32K BUFFERPOOL - BUFF4K BUFFERPOOL - BUFF8K:8K BUFFERPOOL - BUFF_CACHEIVL:8K BUFFERPOOL - BUFF_CAT16K:16K BUFFERPOOL - BUFF_CAT4K BUFFERPOOL - BUFF_CAT8K:8K BUFFERPOOL - BUFF_CTX BUFFERPOOL - BUFF_REF16K:16K BUFFERPOOL - BUFF_REF4K BUFFERPOOL - BUFF_REF8K:8K BUFFERPOOL - BUFF_SYSCAT BUFFERPOOL - BUFF_TEMP16K:16K BUFFERPOOL - BUFF_TEMP32K:32K BUFFERPOOL - BUFF_TEMP4K BUFFERPOOL - BUFF_TEMP8K:8K BUFFERPOOL - IBMDEFAULTBP ] [ 1 ] [ 02/07/2014 00:17:27 ] [ 180 ] [ 180 ] [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ] [ 2 ] [ 02/07/2014 00:20:27 ] [ 360 ] [ 180 ] [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ] [ 3 ] [ 02/07/2014 00:23:27 ] [ 540 ] [ 180 ] [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ] [ 4 ] [ 02/07/2014 00:26:27 ] [ 720 ] [ 180 ] [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ] [ 5 ] [ 02/07/2014 00:29:27 ] [ 900 ] [ 180 ] [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]
If you have a number of bufferpools, this can be hard to read, even on a large screen. the width of the numeric values is not hte same as their names, making it not all that tabular. To fix that, you can try the d
option, which delimits the output with semicolons, making it easier to get into your favorite spreadsheet tool. The output in that case, raw looks like this:
./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE s d # Database: SAMPLE MEMORY TUNER - LOG ENTRIES Interval;Date;Total Seconds;Difference in Seconds; LOCKLIST ; BUFFERPOOL - BUFF16K:16K ; BUFFERPOOL - BUFF32K:32K ; BUFFERPOOL - BUFF4K ; BUFFERPOOL - BUFF8K:8K ; BUFFERPOOL - BUFF_CACHEIVL:8K ; BUFFERPOOL - BUFF_CAT16K:16K ; BUFFERPOOL - BUFF_CAT4K ; BUFFERPOOL - BUFF_CAT8K:8K ; BUFFERPOOL - BUFF_CTX ; BUFFERPOOL - BUFF_REF16K:16K ; BUFFERPOOL - BUFF_REF4K ; BUFFERPOOL - BUFF_REF8K:8K ; BUFFERPOOL - BUFF_SYSCAT ; BUFFERPOOL - BUFF_TEMP16K:16K ; BUFFERPOOL - BUFF_TEMP32K:32K ; BUFFERPOOL - BUFF_TEMP4K ; BUFFERPOOL - BUFF_TEMP8K:8K ; BUFFERPOOL - IBMDEFAULTBP ; ; 1;02/07/2014 00:17:27;180;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000; 2;02/07/2014 00:20:27;360;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000; 3;02/07/2014 00:23:27;540;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000; 4;02/07/2014 00:26:27;720;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000; 5;02/07/2014 00:29:27;900;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;
Save it off to a file, import it into a spreadsheet, and you get something like this:
Ok, and finally, you can make a pretty graph to look at these in a more human way:
Now that would be a lot more exciting if I ran it on a database where things were changing more often, but that’s the one I have to play with at the moment.
There are some other interesting options besides the s
option. The b
option shows the benefit analysis that STMM does, which looks pretty boring on my database, but still:
./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE b # Database: SAMPLE [ MEMORY TUNER - LOG ENTRIES ] [ Interv ] [ Date ] [ totSec ] [ secDif ] [ benefitNorm ] [ ] [ ] [ ] [ ] [ LOCKLIST BUFFERPOOL - BUFF16K:16K BUFFERPOOL - BUFF32K:32K BUFFERPOOL - BUFF4K BUFFERPOOL - BUFF8K:8K BUFFERPOOL - BUFF_CACHEIVL:8K BUFFERPOOL - BUFF_CAT16K:16K BUFFERPOOL - BUFF_CAT4K BUFFERPOOL - BUFF_CAT8K:8K BUFFERPOOL - BUFF_CTX BUFFERPOOL - BUFF_REF16K:16K BUFFERPOOL - BUFF_REF4K BUFFERPOOL - BUFF_REF8K:8K BUFFERPOOL - BUFF_SYSCAT BUFFERPOOL - BUFF_TEMP16K:16K BUFFERPOOL - BUFF_TEMP32K:32K BUFFERPOOL - BUFF_TEMP4K BUFFERPOOL - BUFF_TEMP8K:8K BUFFERPOOL - IBMDEFAULTBP ] [ 1 ] [ 02/07/2014 00:17:27 ] [ 180 ] [ 180 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 2 ] [ 02/07/2014 00:20:27 ] [ 360 ] [ 180 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 3 ] [ 02/07/2014 00:23:27 ] [ 540 ] [ 180 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 4 ] [ 02/07/2014 00:26:27 ] [ 720 ] [ 180 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 5 ] [ 02/07/2014 00:29:27 ] [ 900 ] [ 180 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
The o
option shows only database memory and overflow buffer tuning:
./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE o # Database: SAMPLE [ MEMORY TUNER - DATABASE MEMORY AND OVERFLOW BUFFER TUNING - LOG ENTRIES ] [ Interv ] [ Date ] [ totSec ] [ secDif ] [ configMem ] [ memAvail ] [ setCfgSz ] [ 1 ] [ 02/07/2014 00:17:27 ] [ 180 ] [ 180 ] [ 6912 ] [ 6912 ] [ 1990 ] [ 2 ] [ 02/07/2014 00:20:27 ] [ 360 ] [ 180 ] [ 6912 ] [ 6912 ] [ 1990 ] [ 3 ] [ 02/07/2014 00:23:27 ] [ 540 ] [ 180 ] [ 6912 ] [ 6912 ] [ 1990 ] [ 4 ] [ 02/07/2014 00:26:27 ] [ 720 ] [ 180 ] [ 6912 ] [ 6912 ] [ 1990 ] [ 5 ] [ 02/07/2014 00:29:27 ] [ 900 ] [ 180 ] [ 6912 ] [ 6912 ] [ 1990 ]
There is also a 4
option that you can use to convert all values to 4k pages.
Summary
There are some useful things in the STMM log parser, if you want to understand the changes DB2 is making. Many of us, coming from fully manual tuning naturally distrust what STMM or other tuning tools are doing, so this level of transparency helps us understand what is happening and why it is or is not working in our environments. I would love to see more power in this. Being able to query this data with a table function or administrative view (we can with the db2diag.log!) would be even more useful so the output could be further limited and tweaked. The script is well documented, and I imagine I could tweak it to limit it if I wanted to. I’d love to have it call out actual changes – that would be harder to graph, but for the text output, could be more useful for a fairly dormant system.