STMM Analysis Tool

July 15, 2014, 4:00 am

I mostly like and use DB2’s Self-Tuning Memory Manager (STMM) for my OLTP databases where I have only one DB2 instance and database on a database server. There are some areas that I do not let it set for me. I recently learned about an analysis tool for it: Adam Storm mentioned it in a presentation at IDUG 2014 in Phoenix.

Parameters that STMM Tunes

To begin with, it is important to understand what STMM tunes and what it doesn’t. I recommend reading “When is ‘AUTOMATIC’ Not STMM?” for background. There are essentially only five areas that STMM can change:

DATABASE_MEMORY if AUTOMATIC
SORTHEAP, SHEAPTHRES_SHR if AUTOMATIC, and SHEAPTHRES is 0
BUFFERPOOLS if number of pages on CREATE/ALTER BUFFERPOOL is AUTOMATIC
PCKCACHESZ if AUTOMATIC
LOCKLIST, MAXLOCKS if AUTOMATIC (both must be automatic)

Any other parameters, even if they are set to “AUTOMATIC”, are not part of STMM.

Why I don’t use STMM for PCKCACHESZ

A number of the e-commerce database servers I support are very much oversized for daily traffic. This is common for retail sites because there are always peak periods, and servers tend to be sized to handle those. Many retail clients have extremely drastic peak periods like Black Friday, Cyber Monday, or other very critical selling times.

I noticed that for one of my clients whose server was significantly oversized on memory, DB2 was making the package cache absolutely huge. I saw this:

Package cache size (4KB)                   (PCKCACHESZ) = AUTOMATIC(268480)

That’s a full GB allocated to the package cache. There were over 30,000 statements in the package cache, the vast majority with only a single execution. The thing is that for my OLTP databases, the statements for which performance is critical are often static SQL or use parameter markers. I don’t really care whether most of the ad-hoc statements that are executed only once are stored in the package cache. This was about a 50-100 GB database on a server with 64 GB of memory. The buffer pool hit ratios were awesome, so I guess DB2 didn’t really need the memory there, but still. In my mind, for well-run OLTP databases, that much package cache does not help performance. I am sure there are databases that need that much or more in the package cache, but this database was simply not one of them. Because of this experience, I set the package cache manually and tune it properly.
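As a quick sanity check on that number, PCKCACHESZ is reported in 4 KB pages, so the AUTOMATIC value converts to a size directly. The sketch below is only an illustration in Python (it is not part of DB2); the function names are made up for this example, and it assumes the configuration line looks like the one shown above.

```python
import re

PAGE_SIZE_BYTES = 4096  # PCKCACHESZ is reported in 4 KB pages


def automatic_pages(cfg_line):
    """Return the page count from a '... = AUTOMATIC(n)' line, or None if absent."""
    match = re.search(r"AUTOMATIC\((\d+)\)", cfg_line)
    return int(match.group(1)) if match else None


def pages_to_gib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to GiB (1 GiB = 1024**3 bytes)."""
    return pages * page_size / 1024**3


line = "Package cache size (4KB)                   (PCKCACHESZ) = AUTOMATIC(268480)"
pages = automatic_pages(line)
print(f"{pages} pages = {pages_to_gib(pages):.2f} GiB")  # 268480 pages = 1.02 GiB
```

The same conversion works for any other value reported in 4 KB pages.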

A few STMM Caveats

Just a few things to note – I have heard rumors of issues with STMM when there are multiple DB2 instances running on a server. I have not personally experienced this. Also, the settings that STMM is using are not transferred at all to the HADR standby, so when you fail over, you may have poor performance while STMM starts up. You could probably script a regular setting of the STMM parameters to deal with this. Also, if you have a well-tuned, well-performing non-STMM database, there is probably little reason and not much reward in changing it to STMM. Most experts in database performance can likely tune the database better than STMM, but we can’t all be performance experts, or give as much time as we’d like to every database we support.

The STMM Log Parser

STMM logs the changes it makes in parameter sizes both to the db2diag.log and to some STMM log files (hint to IBM: maybe these could be used to periodically update the HADR standby too?). The log files are in the stmmlog subdirectory of the DIAGPATH. The log files aren’t exactly tough to read, but they don’t present the information in an easy-to-view way. Entries look a bit like diagnostic log entries:

2014-07-02-23.44.40.788684+000 I10464203A600        LEVEL: Event
PID     : 18677976             TID : 46382          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : WC42P1L1
APPHDL  : 0-12466              APPID: *LOCAL.DB2.140620223552
AUTHID  : DB2INST1             HOSTNAME: ecprwdb01s
EDUID   : 46382                EDUNAME: db2stmm (WC42P1L1) 0
FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:2065
DATA #1 : String, 115 bytes
Going to sleep for 180000 milliseconds.
Interval = 5787, State = 0, intervalsBeforeStateChange = 0, lost4KPages = 0

2014-07-02-23.47.40.807231+000 I10464804A489        LEVEL: Event
PID     : 18677976             TID : 46382          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : WC42P1L1
APPHDL  : 0-12466              APPID: *LOCAL.DB2.140620223552
AUTHID  : DB2INST1             HOSTNAME: ecprwdb01s
EDUID   : 46382                EDUNAME: db2stmm (WC42P1L1) 0
FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:1909
MESSAGE : Activation stage ended

2014-07-02-23.47.40.807661+000 I10465294A488        LEVEL: Event
PID     : 18677976             TID : 46382          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : WC42P1L1
APPHDL  : 0-12466              APPID: *LOCAL.DB2.140620223552
AUTHID  : DB2INST1             HOSTNAME: ecprwdb01s
EDUID   : 46382                EDUNAME: db2stmm (WC42P1L1) 0
FUNCTION: DB2 UDB, Self tuning memory manager, stmmMemoryTunerMain, probe:1913
MESSAGE : Starting New Interval

2014-07-02-23.47.40.808193+000 I10465783A925        LEVEL: Event
PID     : 18677976             TID : 46382          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : WC42P1L1
APPHDL  : 0-12466              APPID: *LOCAL.DB2.140620223552
AUTHID  : DB2INST1             HOSTNAME: ecprwdb01s
EDUID   : 46382                EDUNAME: db2stmm (WC42P1L1) 0
FUNCTION: DB2 UDB, Self tuning memory manager, stmmLogRecordBeforeResizes, probe:590
DATA #1 : String, 435 bytes

***  stmmCostBenefitRecord ***
Type: LOCKLIST
PageSize: 4096
Benefit:
  -> Simulation size: 75
  -> Total seconds saved: 0 (+ 0 ns)
  -> Normalized seconds/page: 0
Cost:
  -> Simulation size: 75
  -> Total seconds saved: 0 (+ 0 ns)
  -> Normalized seconds/page: 0
Current Size: 27968
Minimum Size: 27968
Potential Increase Amount: 13984
Potential Increase Amount From OS: 13984
Potential Decrease Amount: 0
Pages Available For OS: 0

2014-07-02-23.47.40.808580+000 I10466709A993        LEVEL: Event
PID     : 18677976             TID : 46382          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : WC42P1L1
APPHDL  : 0-12466              APPID: *LOCAL.DB2.140620223552
AUTHID  : DB2INST1             HOSTNAME: ecprwdb01s
EDUID   : 46382                EDUNAME: db2stmm (WC42P1L1) 0
FUNCTION: DB2 UDB, Self tuning memory manager, stmmLogRecordBeforeResizes, probe:590
DATA #1 : String, 502 bytes

***  stmmCostBenefitRecord ***
Type: BUFFER POOL ( BUFF_REF16K )
PageSize: 16384
Saved Misses: 0
Benefit:
  -> Simulation size: 2560
  -> Total seconds saved: 0 (+ 0 ns)
  -> Normalized seconds/page: 0
Cost:
  -> Simulation size: 2560
  -> Total seconds saved: 0 (+ 0 ns)
  -> Normalized seconds/page: 0
Current Size: 25000
Minimum Size: 5000
Potential Increase Amount: 12500
Potential Increase Amount From OS: 12500
Potential Decrease Amount: 5000
Pages Available For OS: 5000
Interval Time: 180.029
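
Each stmmCostBenefitRecord above is plain "key: value" text, so a few lines of script can pull out the fields of interest. This is a minimal Python sketch, assuming records begin at the stmmCostBenefitRecord marker and end at the next blank line, as in the sample entries above. The function name is hypothetical, and other DB2 versions may format these records differently.

```python
def parse_cost_benefit_records(log_text):
    """Return one dict per stmmCostBenefitRecord found in the log text."""
    records = []
    current = None  # dict for the record being read, or None between records
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("***") and "stmmCostBenefitRecord" in line:
            current = {}
            records.append(current)
            continue
        if current is None:
            continue  # header lines of the diagnostic entry
        if not line:
            current = None  # a blank line ends the record
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip "->" detail lines (Benefit and Cost reuse the same keys)
        # and section labels such as "Benefit:" that carry no value.
        if not sep or line.startswith("->") or not value:
            continue
        current[key.strip()] = int(value) if value.isdigit() else value
    return records


sample = """\
DATA #1 : String, 435 bytes

***  stmmCostBenefitRecord ***
Type: LOCKLIST
PageSize: 4096
Benefit:
  -> Simulation size: 75
Current Size: 27968
Minimum Size: 27968
Potential Increase Amount: 13984

2014-07-02-23.47.40.808580+000 I10466709A993        LEVEL: Event
"""

for record in parse_cost_benefit_records(sample):
    print(record["Type"], record["Current Size"], record["Minimum Size"])
```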

Scrolling through each 10 MB file of this is not likely to give us a complete picture very easily. IBM offers a log parser tool for STMM through developerWorks. The full writeup on it is here: http://www.ibm.com/developerworks/data/library/techarticle/dm-0708naqvi/index.html

The tool is free, and is a Perl script that DBAs can modify if they like. AIX and Linux tend to include Perl, and it’s not hard to install on Windows using ActivePerl or a number of other options. I happen to rather like a Perl utility as I do the vast majority of my database maintenance scripting in Perl.

Download and Set Up

The developerWorks link above includes the Perl script. Scroll down to the “download” section and click on “parseStmmLogFile.pl”. If you accept the terms and conditions, click “I Accept the Terms and Conditions” and save the file. Then upload it to the database server you wish to use it on.

Syntax

There are several options here. Whenever you execute it, you will need to specify the name of one of your STMM logs, and the database name. The various options beyond that are covered below.

Examples

The default if you specify nothing beyond the file name and the database name is the s option. This gives you the new size at each interval of each heap that STMM manages. The output looks something like this:

 ./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE s
# Database: SAMPLE
[ MEMORY TUNER - LOG ENTRIES ]
[ Interv ]      [        Date         ] [ totSec ]      [ secDif ]      [ newSz ]
[        ]      [                     ] [        ]      [        ]      [ LOCKLIST  BUFFERPOOL - BUFF16K:16K BUFFERPOOL - BUFF32K:32K BUFFERPOOL - BUFF4K BUFFERPOOL - BUFF8K:8K BUFFERPOOL - BUFF_CACHEIVL:8K BUFFERPOOL - BUFF_CAT16K:16K BUFFERPOOL - BUFF_CAT4K BUFFERPOOL - BUFF_CAT8K:8K BUFFERPOOL - BUFF_CTX BUFFERPOOL - BUFF_REF16K:16K BUFFERPOOL - BUFF_REF4K BUFFERPOOL - BUFF_REF8K:8K BUFFERPOOL - BUFF_SYSCAT BUFFERPOOL - BUFF_TEMP16K:16K BUFFERPOOL - BUFF_TEMP32K:32K BUFFERPOOL - BUFF_TEMP4K BUFFERPOOL - BUFF_TEMP8K:8K BUFFERPOOL - IBMDEFAULTBP ]
[      1 ]      [ 02/07/2014 00:17:27 ] [    180 ]      [    180 ]      [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]
[      2 ]      [ 02/07/2014 00:20:27 ] [    360 ]      [    180 ]      [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]
[      3 ]      [ 02/07/2014 00:23:27 ] [    540 ]      [    180 ]      [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]
[      4 ]      [ 02/07/2014 00:26:27 ] [    720 ]      [    180 ]      [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]
[      5 ]      [ 02/07/2014 00:29:27 ] [    900 ]      [    180 ]      [ 27968 12500 2500 2000000 50000 500000 25000 1000000 50000 1000000 25000 1000000 50000 50000 1000 1000 1000 1000 10000 ]

If you have a number of bufferpools, this can be hard to read, even on a large screen. The numeric values are not the same width as the names above them, so the output is not all that tabular. To fix that, you can try the d option, which delimits the output with semicolons, making it easier to get into your favorite spreadsheet tool. The raw output in that case looks like this:

./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE s d
# Database: SAMPLE
MEMORY TUNER - LOG ENTRIES
Interval;Date;Total Seconds;Difference in Seconds; LOCKLIST  ;  BUFFERPOOL - BUFF16K:16K ;  BUFFERPOOL - BUFF32K:32K ;  BUFFERPOOL - BUFF4K ;  BUFFERPOOL - BUFF8K:8K ;  BUFFERPOOL - BUFF_CACHEIVL:8K ;  BUFFERPOOL - BUFF_CAT16K:16K ;  BUFFERPOOL - BUFF_CAT4K ;  BUFFERPOOL - BUFF_CAT8K:8K ;  BUFFERPOOL - BUFF_CTX ;  BUFFERPOOL - BUFF_REF16K:16K ;  BUFFERPOOL - BUFF_REF4K ;  BUFFERPOOL - BUFF_REF8K:8K ;  BUFFERPOOL - BUFF_SYSCAT ;  BUFFERPOOL - BUFF_TEMP16K:16K ;  BUFFERPOOL - BUFF_TEMP32K:32K ;  BUFFERPOOL - BUFF_TEMP4K ;  BUFFERPOOL - BUFF_TEMP8K:8K ;  BUFFERPOOL - IBMDEFAULTBP ; ;
1;02/07/2014 00:17:27;180;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;
2;02/07/2014 00:20:27;360;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;
3;02/07/2014 00:23:27;540;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;
4;02/07/2014 00:26:27;720;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;
5;02/07/2014 00:29:27;900;180; 27968; 12500; 2500; 2000000; 50000; 500000; 25000; 1000000; 50000; 1000000; 25000; 1000000; 50000; 50000; 1000; 1000; 1000; 1000; 10000;

Save it off to a file, import it into a spreadsheet, and you get something like this:

Ok, and finally, you can make a pretty graph to look at these in a more human way:

Now that would be a lot more exciting if I ran it on a database where things were changing more often, but that’s the one I have to play with at the moment.
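
For a quiet system like this one, it can be easier to list only the intervals where a size actually changed. The following is a rough Python sketch that reads the semicolon-delimited output of the s and d options shown above. The function names are my own, it assumes every size column is an integer, and the changed value in the sample data is invented purely to have something to report.

```python
def parse_delimited(text):
    """Parse 's d' output into (heap_names, rows of (interval, date, sizes))."""
    names, rows = [], []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(";")]
        # Trailing delimiters leave empty fields at the end; drop them.
        while fields and fields[-1] == "":
            fields.pop()
        if len(fields) < 5:
            continue  # comment line, banner, or blank line
        if fields[0] == "Interval":
            names = fields[4:]
        elif fields[0].isdigit():
            rows.append((int(fields[0]), fields[1], [int(v) for v in fields[4:]]))
    return names, rows


def size_changes(names, rows):
    """Yield (interval, date, heap, old, new) where a size differs from the prior interval."""
    previous = None
    for interval, date, sizes in rows:
        if previous is not None:
            for name, old, new in zip(names, previous, sizes):
                if old != new:
                    yield interval, date, name, old, new
        previous = sizes


# Shortened sample; the 2100000 in interval 3 is made up for illustration.
sample = """\
# Database: SAMPLE
MEMORY TUNER - LOG ENTRIES
Interval;Date;Total Seconds;Difference in Seconds; LOCKLIST  ;  BUFFERPOOL - BUFF4K ; ;
1;02/07/2014 00:17:27;180;180; 27968; 2000000;
2;02/07/2014 00:20:27;360;180; 27968; 2000000;
3;02/07/2014 00:23:27;540;180; 27968; 2100000;
"""

names, rows = parse_delimited(sample)
for interval, date, heap, old, new in size_changes(names, rows):
    print(f"interval {interval} ({date}): {heap} {old} -> {new}")
```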

There are some other interesting options besides the s option. The b option shows the benefit analysis that STMM does, which looks pretty boring on my database, but still:

./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE b
# Database: SAMPLE
[ MEMORY TUNER - LOG ENTRIES ]
[ Interv ]      [        Date         ] [ totSec ]      [ secDif ]      [ benefitNorm ]
[        ]      [                     ] [        ]      [        ]      [ LOCKLIST  BUFFERPOOL - BUFF16K:16K BUFFERPOOL - BUFF32K:32K BUFFERPOOL - BUFF4K BUFFERPOOL - BUFF8K:8K BUFFERPOOL - BUFF_CACHEIVL:8K BUFFERPOOL - BUFF_CAT16K:16K BUFFERPOOL - BUFF_CAT4K BUFFERPOOL - BUFF_CAT8K:8K BUFFERPOOL - BUFF_CTX BUFFERPOOL - BUFF_REF16K:16K BUFFERPOOL - BUFF_REF4K BUFFERPOOL - BUFF_REF8K:8K BUFFERPOOL - BUFF_SYSCAT BUFFERPOOL - BUFF_TEMP16K:16K BUFFERPOOL - BUFF_TEMP32K:32K BUFFERPOOL - BUFF_TEMP4K BUFFERPOOL - BUFF_TEMP8K:8K BUFFERPOOL - IBMDEFAULTBP ]
[      1 ]      [ 02/07/2014 00:17:27 ] [    180 ]      [    180 ]      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
[      2 ]      [ 02/07/2014 00:20:27 ] [    360 ]      [    180 ]      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
[      3 ]      [ 02/07/2014 00:23:27 ] [    540 ]      [    180 ]      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
[      4 ]      [ 02/07/2014 00:26:27 ] [    720 ]      [    180 ]      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
[      5 ]      [ 02/07/2014 00:29:27 ] [    900 ]      [    180 ]      [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]

The o option shows only database memory and overflow buffer tuning:

./parseStmmLogFile.pl /db2diag/stmmlog/stmm.43.log SAMPLE o
# Database: SAMPLE
[ MEMORY TUNER - DATABASE MEMORY AND OVERFLOW BUFFER TUNING - LOG ENTRIES ]
[ Interv ]      [        Date         ] [ totSec ]      [ secDif ]      [ configMem ]   [ memAvail ]    [ setCfgSz ]
[      1 ]      [ 02/07/2014 00:17:27 ] [    180 ]      [    180 ]      [ 6912 ]        [ 6912 ]        [ 1990 ]
[      2 ]      [ 02/07/2014 00:20:27 ] [    360 ]      [    180 ]      [ 6912 ]        [ 6912 ]        [ 1990 ]
[      3 ]      [ 02/07/2014 00:23:27 ] [    540 ]      [    180 ]      [ 6912 ]        [ 6912 ]        [ 1990 ]
[      4 ]      [ 02/07/2014 00:26:27 ] [    720 ]      [    180 ]      [ 6912 ]        [ 6912 ]        [ 1990 ]
[      5 ]      [ 02/07/2014 00:29:27 ] [    900 ]      [    180 ]      [ 6912 ]        [ 6912 ]        [ 1990 ]

There is also a 4 option that you can use to convert all values to 4k pages.

Summary

There are some useful things in the STMM log parser if you want to understand the changes DB2 is making. Many of us, coming from fully manual tuning, naturally distrust what STMM or other tuning tools are doing, so this level of transparency helps us understand what is happening and why it is or is not working in our environments. I would love to see more power in this. Being able to query this data with a table function or administrative view (we can with the db2diag.log!) would be even more useful so the output could be further limited and tweaked. The script is well documented, and I imagine I could tweak it to limit the output if I wanted to. I’d love to have it call out actual changes – that would be harder to graph, but for the text output, it could be more useful for a fairly dormant system.

Quick Tip: Simple Errors on Database Connection

July 17, 2014, 4:00 am

There are a couple of errors that you can get on database connection that simply mean you typed something wrong, but I figure there are people who will search on these errors, so I thought I would share. If you do not already have a database connection, you can get:

db2 conenct to SAMPLE
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL1024N  A database connection does not exist.  SQLSTATE=08003

If you already have a connection to some other database, you might get:

db2 conenct to SAMPLE
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0104N  An unexpected token "to" was found following "conenct ".  Expected
tokens may include:  "JOIN ".  SQLSTATE=42601

In both of the cases above, the error was simply because I misspelled “connect”. Sometimes my fingers type faster than my brain. Simply correcting my syntax leads to a successful connection:

db2 connect to SAMPLE

   Database Connection Information

 Database server        = DB2/AIX64 10.5.3
 SQL authorization ID   = DB2INST1
 Local database alias   = SAMPLE

DB2 Basics: Storage Groups

July 22, 2014, 4:00 am

What is a Storage Group?

A storage group is a layer of abstraction between the data in your database and disk. It is only used with Automatic Storage Tablespaces (AST). It allows us to group tablespaces together to live in similar places. Storage groups were first introduced in a roundabout way with automatic storage databases in DB2 8.2. These databases allowed you to specify one or more paths for the entire database, and DB2 would manage spreading the data across them. It soon became clear that a level between the old-school “control where everything goes” and the newer “I don’t care, just spread it across as many read/write heads as possible” was needed. Personally, I’m just fine with having only two locations for my data. I could manage that just fine with the old methodology using DMS tablespaces, and I manage it just fine in my more recent databases with storage groups.

With DB2 10.1, IBM introduced this middle level of control. We can now create a number of storage groups. This was introduced as a way to handle multi-temperature data and disks of varying speeds. But it’s clear that we can use it in ways beyond that. I use it to separate my administrative, event monitor, and performance data to a separate filesystem from the regular database data – mostly so that if that data gets out of control, it doesn’t affect the rest of my database. If you do have SSD or different speeds of disk, the multi-temperature approach sure makes sense.

This image represents a standard DMS tablespace with file containers:

This image represents an AST tablespace within a storage group:

You can see the additional level of abstraction represented by a storage group in gray. Assuming a tablespace is added after both storage paths are added to the storage group, DB2 will create tablespace containers on each storage path. Interestingly, all the old DMS standards like keeping all the tablespace containers for a tablespace the same size still apply with AST tablespaces and storage groups. DB2 will continue to stripe things across the tablespace containers in the same way that it does for DMS tablespaces.

Automatic Storage Tablespaces

Storage groups can only be used with automatic storage tablespaces. Automatic storage tablespaces are essentially DMS under the covers, with mechanisms for automatically extending them. They combine the best of both SMS and DMS tablespaces in that they can have the performance of DMS tablespaces with the ease of administration of SMS tablespaces. IBM has actually deprecated both SMS and DMS tablespace types (for regular data) in favor of AST tablespaces. This means that in the future, our ability to use these tablespace types may be removed.

How to Create a Storage Group

Unless you specified AUTOMATIC STORAGE NO on the CREATE DATABASE command or have upgraded a database all the way from DB2 8.2 or earlier, you likely already have a default storage group in your database, even if you have not worked with storage groups at all. You can look at the storage groups in a database with this SQL:

select  substr(sgname,1,20) as sgname, 
        sgid, 
        defaultsg, 
        overhead, 
        devicereadrate, 
        writeoverhead, 
        devicewriterate, 
        create_time 
    from syscat.stogroups 
    with ur

SGNAME               SGID        DEFAULTSG OVERHEAD                 DEVICEREADRATE           WRITEOVERHEAD            DEVICEWRITERATE          CREATE_TIME
-------------------- ----------- --------- ------------------------ ------------------------ ------------------------ ------------------------ --------------------------
IBMSTOGROUP                    0 Y           +6.72500000000000E+000   +1.00000000000000E+002                        -                        - 2014-05-07-17.44.21.791318
DB_ADM_STOGRP                  1 N           +6.72500000000000E+000   +1.00000000000000E+002                        -                        - 2014-05-08-18.49.16.108712

  2 record(s) selected.

Notice that a lot of disk characteristics that you may be used to seeing at the tablespace level are now available at the storage group level. Tablespaces can be created or altered to inherit disk settings from the storage group. Assuming each storage group is associated with similar kinds of disks, it makes sense to do things this way. To alter an existing AST tablespace to inherit from the storage group, use this syntax:

alter tablespace TEMPSPACE1 overhead inherit transferrate inherit
DB20000I  The SQL command completed successfully.

Creating a Storage Group and AST Tablespace

Creating a new storage group is easy if you know what filesystems you want associated with it:

db2 "create stogroup DB_ADM_STOGRP on '/db_data_adm/SAMPLE'"
DB20000I  The SQL command completed successfully.

Then creating an automatic storage tablespace using that storage group can be done simply as well:

db2 "create large tablespace DBA32K pagesize 32 K managed by automatic storage using stogroup DB_ADM_STOGRP autoresize yes maxsize 4500 M bufferpool BUFF32K overhead inherit transferrate inherit dropped table recovery on"
DB20000I  The SQL command completed successfully.

Since storing permanent data in DMS and SMS tablespaces has been deprecated, it is clear that IBM’s direction is to eliminate the use of these in favor of AST and storage groups.

See these blog entries for more detailed information on Automatic Storage Tablespaces:
(AST) Automatic Storage – Tips and Tricks
Automatic Storage Tablespaces (AST): Compare and Contrast to DMS

DB2 Basics: Aliases

July 29, 2014, 4:00 am

My blog entries covering the DB2 Basics continue to be my most popular blog entries. This is probably because they appeal to a wider audience – even non-DBAs are interested in them and I continue to rank highly in Google search results. My blog entry on how to catalog a DB2 database gets a ridiculous 50 page views per day, which is more than my whole blog got per day for the first year I was blogging. I am also amazed that I can still come up with topics in this series – it seems like there are only so many things that count as “basics”. But there are a lot of different things in DB2 and some can be covered at a simple level.

The other day, I had to look in depth at some table aliases. I knew I had made a mistake with one in another environment, so I was checking what table each alias pointed to, and it occurred to me that it wasn’t something I had covered in the basics series. Both table and database aliases are discussed in this blog entry.

Table Aliases

What Table Aliases Are

Table aliases are essentially a different name that you give to an existing table within the database. They allow you to call the table by a different name without having a separate object.

An alias can be used in regular SQL, but you cannot use an alias for things like ALTER TABLE or DESCRIBE TABLE statements.

The table that an alias points to is called the base table for the alias.

Use Cases for Table Aliases

The most frequent place that I use table aliases is to allow a user to query things without specifying a schema. While one of my pet peeves is an application that does not specify a schema in its queries, WebSphere Commerce is just such an application. Since the majority of the databases I support are WebSphere Commerce databases, I have to deal with developers and others wanting to submit queries without specifying the schema name. This quickly gets unreasonable when there are many developers on a project.

There are some custom tables in my WebSphere Commerce databases that I know are not related to the base WebSphere Commerce functionality, but are related to integration points such as AEM or custom data load processes. I like to put these tables in separate schemas so I know in the future that they are indeed separate in some way – this is useful for upgrades of WebSphere Commerce. But the developers still want to be able to refer to them without specifying a schema name. So I place them in a separate schema (and sometimes a separate tablespace too), and create an alias for them in the schema that WebSphere Commerce uses.

Aliases can also be used to control table access or allow you to present different tables to the user with the same name – but not with as much power as views allow. For example, if you had two tables – FOO.BAR_V2 and FOO.BAR_V3, you could have an alias called FOO.BAR that pointed to FOO.BAR_V2, and when an upgrade required a change in the table, you could point your FOO.BAR alias at FOO.BAR_V3. This might result in less down time during upgrades, when done in conjunction with other strategies.

Aliases do not have independent permissions from their base tables, so they cannot be used like views can to restrict the permissions differently from the base table.

An alias in DB2 for z/OS can mean something different than LUW. In DB2 for z/OS, an alias can refer to what we in LUW call a nickname – a table on a remote server accessed using federation.

Creating Table Aliases

Table aliases are easy to create. An example of the basic syntax is below:

db2 "CREATE ALIAS WSCOMUSR.X_SALES_CATEGORIES FOR TABLE AEM.X_SALES_CATEGORIES"
DB20000I  The SQL command completed successfully.

Table aliases can be created for objects that do not exist – a warning is returned, but not an error. Table aliases are also NOT dropped when the base table they point to is dropped. This can be surprising behavior the first time you encounter it.

If the schema specified does not exist, DB2 will attempt to create the schema. If the user doesn’t have implicit schema permissions at the database level, they will not be able to create the schema and the CREATE ALIAS command will fail.

Investigating Table Aliases in a Database

It is very possible to make a mistake when creating an alias. DB2 will not care if you copy and paste the wrong table name, and that can be a bit frustrating for developers if it’s an alias they use. Below is an example of exploring the mapping of a base table to an alias.

select  type,
        substr(tabschema,1,12) as tabschema, 
        substr(tabname,1,30) as tabname, 
        substr(base_tabschema,1,12) as base_tabschema, 
        substr(base_tabname,1,30) as base_tabname 
    from syscat.tables 
    where tabname like 'X_CAT%'  
    with ur
TYPE TABSCHEMA    TABNAME                        BASE_TABSCHEMA BASE_TABNAME
---- ------------ ------------------------------ -------------- ------------------------------
T    AEM          X_CATEGORIES_PRODUCTS          -              -
A    WCR101       X_CATEGORY_NAME                WSCOMUSR       X_CATEGORY_NAME
A    WSCOMUSR     X_CATEGORIES_PRODUCTS          AEM            X_CATEGORIES_PRODUCTS
T    WSCOMUSR     X_CATEGORY_NAME                -              -

  4 record(s) selected.

Notice the TYPE in the query output above. Information about aliases and tables (and views) is stored in the same system catalog view, and this column tells us whether each row is for a table (T), an alias (A), or a view (V). Note that tables will not have values for BASE_TABSCHEMA or BASE_TABNAME. Those fields will only be populated for aliases.

Dropping Table Aliases

You can drop table aliases independently of the tables they refer to. In fact you must do so if you want to get rid of the aliases, because dropping a table does not remove the aliases that refer to it. The key is to ensure that you use the ALIAS keyword – otherwise you may inadvertently drop the base table that the alias refers to. An example of the syntax for dropping an alias is below.

db2 drop alias WCR101.X_CATEGORY_NAME
DB20000I  The SQL command completed successfully.

Database Aliases

What Database Aliases Are

Database aliases are different from table aliases. Database aliases are defined using the CATALOG DATABASE command and stored in the database directory. They can be created for local or remote databases.

Use Cases for Database Aliases

Similar to some of the use cases for table aliases, you can use a database alias to swap connectivity from one database to another or even to prevent access to a database. For example, if I have a database named SAMPLE, I can catalog it as SAMP_USR – and then only have users connecting in to SAMP_USR. If I then want to prevent users from connecting, I can uncatalog the alias SAMP_USR, and the users who were connecting in to SAMP_USR will not be able to connect, but I or anyone connecting into SAMPLE will be able to.

I have seen aliases used to simplify connections for apps over multiple similar environments (they all use the same database name for different databases on different servers).

Database aliases are sometimes used as friendly names for databases – but aliases still must be 8 characters or fewer, just like database names.

Database aliases can be defined either at the DB2 server or at the DB2 client.

Creating Database Aliases

Database aliases are created using the catalog database command. Technically, there is an alias created on database creation that is identical to the database name. You can see this in the database directory:

db2 list db directory

 System Database Directory

 Number of entries in the directory = 1

Database 1 entry:

 Database alias                       = SAMPLE
 Database name                        = SAMPLE
 Local database directory             = /db_data
 Database release level               = 10.00
 Comment                              =
 Directory entry type                 = Indirect
 Catalog database partition number    = 0
 Alternate server hostname            =
 Alternate server port number         =

To explicitly create an alias for an existing database, use the CATALOG DATABASE command:

db2 catalog database sample as samp_usr
DB20000I  The CATALOG DATABASE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is
refreshed.

To see the results of this, list the db directory:

db2 list db directory

 System Database Directory

 Number of entries in the directory = 2

Database 1 entry:

 Database alias                       = SAMP_USR
 Database name                        = SAMPLE
 Local database directory             = /db_data
 Database release level               = 10.00
 Comment                              =
 Directory entry type                 = Indirect
 Catalog database partition number    = 0
 Alternate server hostname            =
 Alternate server port number         =

Database 2 entry:

 Database alias                       = SAMPLE
 Database name                        = SAMPLE
 Local database directory             = /db_data
 Database release level               = 10.00
 Comment                              =
 Directory entry type                 = Indirect
 Catalog database partition number    = 0
 Alternate server hostname            =
 Alternate server port number         =

Though the above is for a local database, you can also perform the same for a remote database.

All investigation of aliases can be done through the database directory.

If you have multiple aliases for a database, be cautious with any scripts that parse your database directory – you can end up doing something for the same database twice. For example, if you grepped the db directory from above on “Database name”, you would get two entries:

db2 list db directory |grep "Database name"
 Database name                        = SAMPLE
 Database name                        = SAMPLE

You get the same database twice. If you are scripting, use some method of making sure you have a unique value to avoid processing the same database twice.
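
One way to get that unique value is to de-duplicate on the database name before looping over it. This is a minimal sketch, assuming the directory listing format shown above; the function name unique_db_names is made up for illustration, and the here-document stands in for real directory output:

```shell
# Print each distinct "Database name" value once, given
# `db2 list db directory` output on stdin.
unique_db_names() {
    awk -F'= ' '/Database name/ {print $2}' | sort -u
}

# Sample input copied from the listing above; on a real server you would
# run: db2 list db directory | unique_db_names
dbs=$(unique_db_names <<'EOF'
 Database alias                       = SAMP_USR
 Database name                        = SAMPLE
 Database alias                       = SAMPLE
 Database name                        = SAMPLE
EOF
)
echo "$dbs"
```

Keying on the database name rather than the alias is what collapses the two entries into one.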

Dropping Database Aliases

Much like table aliases, database aliases are not removed if you drop the database.

db2 drop db sample
DB20000I  The DROP DATABASE command completed successfully.
db2 list db directory

 System Database Directory

 Number of entries in the directory = 1

Database 1 entry:

 Database alias                       = SAMP_USR
 Database name                        = SAMPLE
 Local database directory             = /db_data
 Database release level               = 10.00
 Comment                              =
 Directory entry type                 = Indirect
 Catalog database partition number    = 0
 Alternate server hostname            =
 Alternate server port number         =

Database aliases are removed with the uncatalog command:

db2 uncatalog db SAMP_USR
DB20000I  The UNCATALOG DATABASE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is
refreshed.

Other Aliases

Aliases can also be created for modules and sequences, in a similar manner to table aliases.

↧

When Index Scans Attack!

August 4, 2014, 11:00 pm

We all know that table scans can be (but aren’t always) a negative thing. I have spent less time worrying about index scans, though. Index access = good, right? I thought I’d share a recent scenario where an index scan was very expensive. Maybe still better than a table scan, but with one index, I reduced the impact of a problem query by 80%.

I’ve recently gotten my hands on the DBI suite of tools http://www.dbisoftware.com/ and am starting to use them to analyze a new set of databases in support of a WebSphere Commerce site that will be going live in a few weeks. I’m using screen shots from those tools in this blog post. I’m not endorsing them explicitly, just showing how I used them.

First, I had a performance problem. SOLR reindexing was taking a long time. The issue was being addressed from multiple directions, but seeing as it spiked CPU on the database server to 80-90% for 45 minutes, SQL analysis of the time period was in order.

Here’s what I saw looking at statements for the period in question:

Clearly, I have multiple dragons to slay here. But that first statement is using over 50% of the CPU used during this time period. The text of the statement turns out to be:

SELECT IDENTIFIER, CATGROUP_ID_CHILD, URLKEYWORD, SEQUENCE
    FROM CATGRPREL T1, SEOURL T2, SEOURLKEYWORD T3, CATGROUP T4
    WHERE CATGROUP_ID_CHILD = T4.CATGROUP_ID
        AND T2.SEOURL_ID = T3.SEOURL_ID
        AND TOKENVALUE = CATGROUP_ID_CHILD
        AND LANGUAGE_ID = ?
        AND T3.STATUS = :ln
        AND CATGROUP_ID_CHILD IN
            (SELECT CATGROUP_ID
                FROM CATGROUP
                WHERE MARKFORDELETE = :ln)
        AND CATGROUP_ID_PARENT = ?
        AND TOKENNAME = :ls
        AND CATALOG_ID = ?
    ORDER BY SEQUENCE WITH UR

A side note here – Brother-Panther aggregates multiple statements together for me. Note the :ln and :ls in there – those replace literal values. Multiple statements might show up in my package cache because my application is specifying different literal values for those, or the same literal values may have been used over and over again, too. Parameter markers still show up as question marks, like anywhere else.

I ran an explain and got:

Note that I have collapsed some parts of the explain above to focus in on where the bulk of the cost is coming from. Note the place where the expense really comes in is through an index scan. DB2 will tell us “index scan” in an explain plan when it uses the root page to find the intermediate page, and the intermediate page to find the leaf pages, and then fetches from the table by RID. But it will also show “index scan” for a scan where it reads every single leaf page in an index to get the RIDs it needs. I think that’s what was occurring here – page scans of every single leaf page in that index. Even with read-ahead sequential prefetch, the query was still using an awful lot of CPU cycles.

Running the design advisor on the query came up with 4 index recommendations (for over 80% improvement), one of which was:

CREATE UNIQUE INDEX "DB2INST1"."IDX1408012049550" ON "WSCOMUSR"."SEOURL" ("TOKENNAME" ASC, "TOKENVALUE" ASC, "SEOURL_ID" ASC) ALLOW REVERSE SCANS COLLECT DETAILED STATISTICS;

This index has exactly the same columns as an existing index, just in a different order. This might make me shy away from it. But I could tell based on the explain that this was the index I needed for this particular query – it would be the one with the ability to give me the most impact.

I created the index, and afterward the explain looked like this:

What a difference! Just from changing the order of two columns in a three column index.

WebSphere Commerce does not allow me to drop or in any way alter any existing unique index. I have tried before, even just changing the include columns of a unique index and keeping the same index name, but the application fails if I do so. That means that I now have two indexes on this table that cover the same data. Luckily insert/update performance on this particular table is not critical, so I can accept this for now. It is important to understand which tables in an OLTP database have critical insert/update performance and which do not.

↧

Bad Message Queue Handler. Sit. Stay.

August 7, 2014, 4:00 am

There’s an error message that appears in my db2 diagnostic logs rather frequently. It looks like this:

2014-07-30-13.34.58.446316+000 E1638372A476         LEVEL: Error (OS)
PID     : 32374944             TID : 1              PROC : db2
INSTANCE: db2inst1             NODE : 000
HOSTNAME: redacted
EDUID   : 1
FUNCTION: DB2 UDB, oper system services, sqlodque, probe:2
MESSAGE : ZRC=0x870F003E=-2029060034=SQLO_QUE_BAD_HANDLE "Bad Queue Handle"
          DIA8555C An invalid message queue handle was encountered.
CALLED  : OS, -, msgctl
OSERR   : EINVAL (22) "Invalid argument"

Mostly I’ve just ignored it, but I recently had a client with an over-zealous Tivoli implementation that started sending me alerts every time it occurred. Dozens of times per day. Now I’m personally just fine with figuring out something isn’t really a problem and then ignoring it. But when I have to explain why it isn’t a problem to a client, I have to provide more details. It didn’t take long to come up with this technote: http://www-01.ibm.com/support/docview.wss?uid=swg21259051

Reading through that, I learned that this error message is written to the diagnostic log whenever someone pipes the output of a db2 command to head, tail, or more (which I do fairly frequently). It also occurs when several different languages execute db2 commands without a database interface – so in Perl when I use open, system, or backticks. Yeah, I run scripts that do that at least hourly if not more frequently.
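
If you want to read the diagnostic log without this noise, one option is to drop the matching entries before looking at the rest. This is a rough sketch, assuming entries are separated by blank lines and contain no blank lines of their own; the function name is hypothetical, and the second sample entry is invented for illustration:

```shell
# Drop db2diag.log entries that mention SQLO_QUE_BAD_HANDLE and print the rest.
# Entries are treated as blank-line-separated paragraphs (awk "paragraph mode").
filter_bad_queue_handle() {
    awk 'BEGIN { RS = ""; ORS = "\n\n" } !/SQLO_QUE_BAD_HANDLE/'
}

# Two-entry sample; the second entry is made up. Real use would be
# something like: filter_bad_queue_handle < /db2diag/db2diag.log
sample_log='2014-07-30-13.34.58.446316+000 E1638372A476         LEVEL: Error (OS)
FUNCTION: DB2 UDB, oper system services, sqlodque, probe:2
MESSAGE : ZRC=0x870F003E=-2029060034=SQLO_QUE_BAD_HANDLE "Bad Queue Handle"

2014-07-30-13.35.10.000000+000 E1638372B500         LEVEL: Error
MESSAGE : some other error worth a look'

filtered=$(printf '%s\n' "$sample_log" | filter_bad_queue_handle)
echo "$filtered"
```

The same pattern could sit in front of whatever feeds a monitoring tool, so that only entries worth a look get through.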

The part that perplexes me here is why DB2 considers this type of thing an ERROR level event. Is there a scenario where this same error is returned and it is a really serious problem?

I have to go now, so I can train some Tivoli admins on working with a DBA rather than just blindly alerting on everything in the DB2 diagnostic log that says “Error”…

↧

DB2 Error Logging

August 12, 2014, 4:00 am

(Edited 8/12/2014 to add links to the old tutorials from IBM)

There are a number of ways to cover error logging. I have covered some specific elements in previous posts, so I’m going for a more comprehensive approach in this post.

There used to be this great “Problem Determination Mastery” certification available. The study material and the test were only online. But the Tutorials associated with it were pretty good. They’re a bit on the old side now, but there’s still good information in them. I’ve included a copy of them here:
Part 1
Part 2: Installation
Part 3: Connectivity
Part 4: Tools
Part 5: Engine
Part 6: Performance
Part 7: Multi-Node
Part 8: Application
Part 9: OS

These were clearly written by IBM and not me. If I could, I’d link to IBM’s site for them, but I can’t find them on any IBM site anymore. I do wish they’d release an updated version – I found them some of the most useful material.

I also cover a bit of problem determination in my post on What to do when DB2 won’t work.

Understanding the basics of DB2’s error logging is really critical for any DB2 DBA.

DB2 Error Log Locations

DB2’s default location for error logging is $INSTHOME/sqllib/db2dump. This is also true for clients – which do have a DB2 diagnostic log. You can find or change the location that db2 is using for error logging through the DIAGPATH dbm cfg parameter. Check the value using:

db2 get dbm cfg |grep -i DIAG
 Diagnostic error capture level              (DIAGLEVEL) = 3
 Diagnostic data directory path               (DIAGPATH) = /db2diag/
 Current member resolved DIAGPATH                        = /db2diag/
 Alternate diagnostic data directory path (ALT_DIAGPATH) =
 Current member resolved ALT_DIAGPATH                    =
 Size of rotating db2diag & notify logs (MB)  (DIAGSIZE) = 0

There are two major error logging files. The db2diag.log is called the DB2 diagnostic log, and it is written to based on the value of DIAGLEVEL. The other main file is called the administration notification log, and is named <instance_name>.nfy. Other files can be written to the DIAGPATH as well.

There have been some recent changes in this area. Because the db2 instance is largely unavailable if the filesystem holding DIAGPATH fills up, ALT_DIAGPATH has been added to allow you to specify a secondary location for error logging to ensure that the database instance can continue to function. You can also now specify a DIAGSIZE value. When you specify a number in MB, this is the total size you want to allow for your DB2 diagnostic log and administration notification log (on Linux or UNIX – on Windows it does not include the administration notification log). DB2 will allocate 90% of this size to the DB2 diagnostic log and 10% to the administration notification log. It will then manage the rotation of 10 log files each 1/10th the size specified – when the 10th file is full, it deletes the 1st file and creates the 11th file.
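
To make the split concrete, here is the arithmetic for a hypothetical DIAGSIZE of 1024 MB on Linux or UNIX. It reads "1/10th" as one tenth of each log's share, which is an assumption, and shell integer division rounds down, so treat the numbers as approximate:

```shell
# Split a DIAGSIZE value (MB) the way the paragraph above describes for
# Linux/UNIX: 90% diagnostic log, 10% notification log, 10 files each.
diagsize_mb=1024                                  # hypothetical setting
diag_total_mb=$(( diagsize_mb * 90 / 100 ))       # share for db2diag logs
notify_total_mb=$(( diagsize_mb * 10 / 100 ))     # share for notify logs
diag_file_mb=$(( diag_total_mb / 10 ))            # one rotating db2diag file
notify_file_mb=$(( notify_total_mb / 10 ))        # one rotating notify file

echo "db2diag: ${diag_total_mb} MB total, about ${diag_file_mb} MB per file"
echo "notify:  ${notify_total_mb} MB total, about ${notify_file_mb} MB per file"
```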

Since those are relatively new additions, I tend to just go with my old management strategy – which is to archive and compress the DB2 diagnostic log file and administration log file at the beginning of each month, and clear out some of the accumulated dump and misc files in the DIAGPATH at the same time. I retain the compressed files for 3-9 months depending on space.

If either the DB2 diagnostic log or the administration notification log are removed or renamed, db2 will re-create them so it has something to write to.
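
Because the files are re-created, a plain rename-and-compress is enough for a monthly archive job. This is only a sketch of that idea, not the exact process described above; the function name archive_log is hypothetical, and a scratch directory stands in for DIAGPATH:

```shell
# Rename a log with a date stamp and compress it. DB2 re-creates the
# original file when it next needs to write, as noted above.
archive_log() {
    log_file=$1
    stamp=$(date +%Y%m%d)
    [ -f "$log_file" ] || return 0          # nothing to archive
    mv "$log_file" "$log_file.$stamp" && gzip -f "$log_file.$stamp"
}

# Example with a scratch directory standing in for DIAGPATH.
demo_dir=$(mktemp -d)
echo "sample entry" > "$demo_dir/db2diag.log"
archive_log "$demo_dir/db2diag.log"
ls "$demo_dir"
```

The same function could be pointed at the <instance_name>.nfy file, with a cron entry to run both at the start of each month.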

DB2 Diagnostic Log

The DB2 diagnostic log (db2diag.log) has been around forever. It contains detailed information about certain activities in the database.

DIAGLEVEL

The level of information captured in db2diag.log depends on the value of DIAGLEVEL. The possible values of DIAGLEVEL from the IBM DB2 Knowledge Center are:
0 – No diagnostic data captured
1 – Severe errors only
2 – All errors
3 – All errors and warnings
4 – All errors, warnings and informational messages

The default DIAGLEVEL is 3. 4 is usually too much information and may slow down processing – particularly LOADs. 1 or 2 may be too little information. In reality, I’ve only seen 4 used when troubleshooting a problem.

I have recently been subjected to a Tivoli monitoring implementation that emails me for EVERY error. And let me tell you, there are some things that I really wouldn’t consider error worthy. Examples include: a user specifies a password incorrectly, a database name is spelled incorrectly in some commands, some SQL syntax errors, a db2 command is piped to head or tail, and certain ways of executing db2 commands in perl scripts. I now fat-finger some things and I get multiple emails from Tivoli telling me so. It is a bit frustrating. I’m remembering why I don’t usually go in for db2diag.log parsing and alerting on the results.

Parsing the DB2 Diagnostic Log

I still tend to default to just reading the DB2 diagnostic log directly for quick checks, and then accessing it with SQL for more complicated analysis. I only use the db2diag tool for archiving really – SQL access is easier. Click on either of the links to get details on using the other methods for accessing the data.

In my opinion, it is still important to be able to read a text db2diag.log and glean information from it. There is a wealth of data in there. The IBM DB2 Knowledge Center has an excellent level of detail on this:
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.trb.doc/doc/c0020815.html?lang=en

If you go to that link, you’ll see that the general form of a db2diag.log entry is:

A couple of the most important things to note there are #3, which is the “level” indicator – it can help you eliminate some things as informational or warning rather than errors. #9 is the database name, which is useful if you have more than one database in an instance. Finally, a lot of error messages or other data are at the end, in #16 or after. Those are merely the highlights – I highly recommend reading the IBM DB2 Knowledge Center page on it to really understand reading (and therefore querying) entries in the db2diagnostic log.
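
As a small illustration of using the level indicator, the sketch below counts entries by level from the header line of each entry. It assumes the header carries a "LEVEL: <name>" field as in the error shown in the earlier post; the function name is hypothetical, and the second and third header lines are made up:

```shell
# Count db2diag.log entries by severity, using the LEVEL: field on each
# entry's header line.
count_by_level() {
    awk '/LEVEL: / {
             sub(/.*LEVEL: /, "")      # keep only what follows "LEVEL: "
             split($0, parts, " ")     # first word is the level name
             count[parts[1]]++
         }
         END { for (lvl in count) print lvl, count[lvl] }' | sort
}

# Header lines only; all but the first are invented for illustration.
levels=$(count_by_level <<'EOF'
2014-07-30-13.34.58.446316+000 E1638372A476         LEVEL: Error (OS)
2014-07-30-13.40.01.000000+000 I1638372B300         LEVEL: Warning
2014-07-30-13.41.07.000000+000 E1638372C476         LEVEL: Error
EOF
)
echo "$levels"
```

On a real system the input would be the db2diag.log file itself rather than a here-document.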

DB2 Administration Notification Log

The DB2 Administration Notification Log has been added since I first became a DBA. It is billed as a more human readable place to look for errors, but I’m still an addict of the DB2 diagnostic log – once you learn to read it, there’s so much more information in the DB2 diagnostic log.

I recently ran a highly unscientific poll on the blog, asking the question “Do you use the DB2 administration notification log (instance_name.nfy)?”. This is what the results looked like:

I should really force voters to login so I could see how many of the responses were just Ian messing with me.

Twenty percent of the responses indicated they didn’t know what the DB2 administration notification log was. A larger number than I expected admitted to using it. It is up to you whether or not you use the DB2 administration notification log. Just know that there is often more information in db2diag.log, for better or worse. I’d love to hear comments on how people work with it.

OS Error logging

Never forget your OS level logging when investigating db2 problems. A disk issue, for example, can explain a lot of oddness that may be harder to explain just from database level error logging.

Other Error Related Output

This entry has focused on the DB2 diagnostic log and the DB2 administration notification log. But there are other things that land in this directory, including:

dump files
trap files
first occurrence data collection (FODC) packages
STMM logs
locktimeout files

These files can be useful, either for your own purposes or for providing to IBM DB2 support in case you have to open a PMR about an issue. They should still be cleaned up on a regular basis to prevent them from filling up the disk. I usually keep them for a minimum of 30 days.
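
A cautious way to start on that cleanup is to list the candidates before deleting anything. This is a sketch under some assumptions: the function name and file names are invented, only regular files are considered (so files inside FODC directories are listed individually), and the two main logs are skipped by name:

```shell
# List files under a diagnostic directory that are more than N days old,
# skipping the two main logs. Prints candidates only; nothing is deleted.
list_old_diag_files() {
    diag_dir=$1
    days=${2:-30}
    find "$diag_dir" -type f -mtime +"$days" \
        ! -name 'db2diag*.log' ! -name '*.nfy'
}

# Scratch directory standing in for DIAGPATH; file names are invented.
scratch=$(mktemp -d)
touch -t 200001010000 "$scratch/old.dump.bin"   # very old file
touch "$scratch/new.trap.txt"                   # created just now
touch -t 200001010000 "$scratch/db2diag.log"    # old, but excluded by name
old_files=$(list_old_diag_files "$scratch" 30)
echo "$old_files"
```

Once the list looks right, the output could be reviewed and then passed to rm, or archived first if IBM support might still ask for the files.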

↧

DB2 Basics: db2top

August 26, 2014, 4:00 am

There are a lot of things I can cover on db2top, and probably more tips and tricks using db2top than many other tools out there. Searching the web on db2top gets more good results than on many other db2 topics. I thought I’d start with some of the basics. Using db2top requires some general knowledge of how db2 works. I really debated whether it even qualified for my DB2 Basics series, but there are a few basic things that can help with using db2top.

What is db2top?

db2top is a “real-time” tool for seeing what is currently going on within a DB2 instance or database. I put real-time in quotes, because technically it is a couple of seconds (depending on how you invoke it) behind real time, and it can also be used in a number of ways. db2top presents data in a similar manner to the unix/linux tool top. I think of it as a text-based GUI. Not fully GUI, but it is more graphical and dynamic than many text interfaces.

db2top originally came out of IBM’s alphaWorks. It was bundled into db2 starting in fixpack 10 of 8.2, 9.1 FP6, 9.5 FP2, and 9.7 and higher GA. I installed the alphaWorks version on databases long before it was officially included with db2. db2top is a revelation in database monitoring for those of us who haven’t generally had performance monitoring tools: a way to watch what’s going on in db2 in near real time.

db2top is only available on the unix and linux platforms, not on Windows.

My understanding is that db2top interfaces with the snapshot monitors. I think that db2top enables the monitoring switches that it needs when you start it up – for db2top’s use only, so that you get data even if you don’t have all the monitoring switches enabled by default.

Running db2top

It is simple to run db2top. There are two basic modes you can run db2top in – interactive or batch mode. This blog entry will focus on interactive mode. I’ve only run it a few times in batch mode, and found replaying the data a bit difficult and the data collected a bit large in size. For historical data, I prefer a custom process of capturing snapshot data periodically or a vended tool for that.

To use db2top, you must have SYSMON or higher (SYSMAINT, SYSCTRL, or SYSADM) authority. As a privileged user, simply issue:

db2top -d sample

If you do not specify the database name, and have not previously written out a .db2toprc, then you’ll get this error:

*************************************************************************
***                                                                   ***
*** Error: Can't find database name                                   ***
***                                                                   ***
*************************************************************************

Once you have entered db2top successfully, it will look something like this:

Notice on this screen that the instance and database name are displayed in the upper right – I’ve blurred them out. I also have the CPU displayed in the upper right. That’s a customization I’ll explain later. In the upper left, you can see a number of options. One of them is the refresh interval, which in this case is the default of 2. You can specify a larger refresh value by invoking db2top with the -i option and the number of seconds. Also, if you have a problem on the system that is preventing timely refresh, this value will show the time since the last refresh, even if it is larger than the specified interval. This can be useful to know on systems that are really in trouble.

This initial screen also tells you how long this database has been active, and the time of the last backup – useful pieces of information in some cases. The initial screen also lists a few of the more common interactive options.

.db2toprc

If you only have one database that you want to run db2top for on a particular instance, or if you have a default database you want db2top to run for, once you’re in db2top, hit ‘w’ to write your current configuration out to .db2toprc. It will include the database and column orders on the various screens. Once you have done this once, the database name you used when invoking db2top that time will be assumed, and you can just get into db2top using db2top – no need to specify a database name.

A file called .db2toprc will be generated in the home directory of the current user. If you need to specify a different location, you can use the environment variable DB2TOPRC to specify the location to look for the .db2toprc file. In the absence of this variable, db2top will look first in the current directory, and last in the current user’s home directory. I like to keep my .db2toprc files in the home directory to keep things simple.

There are some fun things you can do with .db2toprc. It allows you to store your favorite order for columns on the various db2top screens – you can store this either by using w when in a session with the desired order, or by editing .db2toprc directly.

One of my favorite things to do with .db2toprc is to include cpu utilization. The line for this is in the various documentation locations – both the PDF that has been passed around since AlphaWorks days and in the IBM Knowledge Center. In some versions there were typos, and different element numbers are needed on Linux and AIX, so make sure the line works for you. To get that nifty little cpu report in the upper right corner, you just add this line to .db2toprc (this one works for me on AIX 7):

cpu=vmstat 2 2 | tail -1 | awk '{printf("%d(usr+sys)",$14+$15);}'

Note that this method generally has numbers that are 4 seconds behind, but tends to be close enough. I like to have it to glance at while I’m monitoring, but if I need something detailed or long term, my SA is likely to have much better numbers and historical information.
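
Since the element numbers differ by platform, it can help to work them out from the vmstat header rather than guess. The sketch below finds the "us" column and takes "sy" as the one right after it. The header strings are typical AIX and Linux layouts typed in as assumptions, and they can vary with OS level and options, so check against your own vmstat output:

```shell
# Print the field numbers of the "us" and "sy" CPU columns, given a vmstat
# column-header line on stdin. "sy" is taken as the column right after "us",
# because AIX also has an earlier "sy" (system calls) column.
vmstat_cpu_fields() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "us") { print i, i + 1; exit } }'
}

aix_header=' r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa'
linux_header=' r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st'

aix_fields=$(echo "$aix_header" | vmstat_cpu_fields)
linux_fields=$(echo "$linux_header" | vmstat_cpu_fields)
echo "AIX: $aix_fields  Linux: $linux_fields"
```

The two numbers printed are the ones that would replace $14 and $15 in the cpu= line for that platform.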

Help Screen

Pressing the h key in db2top will get you the help screen, which has a list of all the nifty options:

I cannot possibly go through all the available db2top options in this article, but that’s a nice place to play from and see what you can find.

From the help screen, just hit enter to get back to the screen you were on when you hit h.

Database Screen

Pressing the d key in db2top will get you to the database screen:

This is the screen I often start from before moving into other screens. Note the “Lock Wait” field at the lower right. If there are any lock-waits, that will be highlighted in white, making it stand out. It’s great for quickly seeing if you’re seeing lock-wait issues, lock time out issues, deadlock issues, and so forth. This screen also shows you values for overall key performance indicators – such as your bufferpool hit ratio, percent of sort overflows, Total and active sessions, deadlocks, log reads, and so forth. Too bad it doesn’t have an index read efficiency value.

Another tip that’s good to cover on this screen – by default, db2top subtracts metrics and shows you only what has occurred in the last interval (2 seconds by default). You can press k to tell db2top to show you everything since the last database activation or since the last reset in this session. This can be useful for seeing things longer term, but honestly, if I wanted data since the last activation, I’d be using the mon_get functions and views. What you can do is use k in addition to R. If you press R, it will reset the monitor switches in the db2top session, meaning you can set a point in time manually, and then see everything since that time. If you press R, it will prompt you to confirm like this:

Once you have confirmed, it will switch to this cumulative view. Pressing k repeatedly will toggle between the cumulative view (with whatever reset point you’ve specified), and the current view of only things that have happened in the last interval seconds – you can switch between the two repeatedly without losing your reset point. This behavior applies on most screens, not just the database screen. So if you then hit l to go to the sessions screen while you’re in cumulative view, you’ll be in the cumulative view and can easily get out of the cumulative view by pressing k.

Sessions Screen

This is another of my favorite screens. I like to look at it to understand how busy my database is. Press l to get to the sessions screen:

This is a screen I like to view in a wider window. db2top does well with resizing your screen to see additional columns of output – it will give you as much as you can fit on screen, so simply making my terminal window wider, I get:

Also, depending on what I’m looking for, I often play with column order on this screen. To change the order of the columns on any db2top screen, simply hit c:

On this screen, you can enter the numbers of columns in the order you want them. You don’t have to specify every column you want to see. If you just want to bring a single column to the front, specify only that column.

Hitting enter will then take you back to the sessions screen with the different column order:

From the screen itself, you can change the column sort order by hitting z (for desc sort) or Z (for asc sort), and then entering the column number. But wait, the column number for sort that you enter is the DEFAULT column number. This gets confusing because you could have a totally different order on your screen. To get the column number that you really want, you’ll have to toggle over to the columns screen by pushing c and then go back to the sessions or whatever screen that you want the sort to occur on, and then toggle back to the columns screen again to make sure the sort is what you wanted. Also remember that numbering starts with 0:

And this is the result on the columns screen:

One last useful thing that I want to cover on the sessions screen. Pressing i will toggle whether idle sessions are visible or not. Especially if you’re using a non-standard sort-order, it can be useful to only look at sessions that are not idle.

Locks Screen

My favorite way to look at lock chaining issues is using db2top – it is so much quicker to see this way. To get to the locks screen, use the U option (upper case, not lower case):

Now this screen looks pretty boring on a largely idle database, but notice in the middle of the bottom, it says “L: Lock Chain”. What this means is that if you press L, it will bring up a lock chain that shows all application handles that are currently holding locks that are blocking other applications.

(thanks to @idbjorh for the image)

Dynamic SQL Screen

While I now prefer using mon_get_package_cache_stmt for digging into SQL, I have also used this screen before. Press D in db2top to get:

This lists the dynamic SQL currently executing in the database. Notice that like the locks screen, there’s an option for more information in the center bottom of the screen: “L: Query Text”. Since the column only shows you so much of the query, you can press L and then enter the SQL_Statement Hashvalue to get the full query text:

The full text appears like this:

Notice the options at the bottom of the window. Directly from there, you can explain the statement or write it out to a file – which is good because copying from that “window” does not work well.

Nifty New Tip

And in the “stuff I didn’t know, but learned while blogging about something I do every day” category – did you know you can scroll through the columns by pressing < and >? It works, and can be interesting. I do not know of a way to scroll up and down, but with a two second refresh, I can understand how scrolling up and down would be pretty meaningless. I wonder if there’s a way to do that in replay mode.

I also learned that hitting the right or left arrow will cause an immediate refresh of the data, not waiting for the 2 second interval.

This is Just a Primer

There are so many things on each of those screens that I did not cover in detail. Things I do fairly frequently. There are also screens I use at least once a week that I did not cover. I remember when they made it so buffer pools could easily be changed online, I spent most of a holiday peak period watching the buffer pools screen (among others) and tweaking buffer pool sizes.

db2top is astoundingly powerful, but it requires some experimenting and researching to learn about it and figure out how it works for you. If you have vended or pay-for-use IBM utilities, it may not be all that useful, but it’s great for when you don’t have those tools.

References

developerWorks article on db2top: http://www.ibm.com/developerworks/data/library/techarticle/dm-0812wang/
The K guy’s excellent series on db2top: http://www.thekguy.com/db2top

↧

DB2 Basics: Executing a Script of SQL

September 2, 2014, 4:00 am

There are quite a few scenarios in which DBAs need to execute a script of SQL. Sometimes developers provide such a script to be executed. Sometimes we just have a large number of commands that need to be done as a whole.

File Naming

For Linux and UNIX systems the file name does not matter a bit. I like to name my SQL files with .sql at the end. I also find .ddl and .dml acceptable for data definition language and data manipulation language when a file is specific to only one of those. Windows is more problematic in my experience, as are MS applications like Outlook. You may need to stick to or avoid certain file extensions for Windows.

A list of file extensions frequently used with DB2 databases:

Extension	Purpose
.sql	General SQL statements
.ddl	Data Definition Language – such as CREATE TABLE, ALTER TABLE, and CREATE INDEX
.dml	Data Manipulation language – such as INSERT, UPDATE, and DELETE
.del	Delimited data, often CSV, but may use other delimiters
.csv	Comma delimited data
.ixf	DB2’s Integrated Exchange Format – includes both data and information on table structure
.asc	Non-delimited ASCII data

Some of those extensions are not specifically related to executing a script of SQL, but it is good information to have.

SQL vs. Shell or Other Language Scripts

Scripts of SQL commands may show up as shell scripts, actually. If they are shell scripts, they will look something like this:

db2 "connect to sample"
db2 "create table...."
db2 "alter table ...."
db2 "create index ...."
db2 "insert into ...."

Notice the “db2” before each statement, and the quotes around the statements. The quotes around the statements may be optional in some scenarios. These files should end in .sh or .ksh to indicate they are shell scripts specifically. Shell scripts can be executed at the UNIX or Linux command line simply like this:

./filename.sh |tee ./filename.sh.out

Always send the output somewhere so you can review it later if needed.

A file of pure SQL should look more like:

connect to sample;
create table....;
alter table ....;
create index ....;
insert into ....;

Note the lack of the “db2” at the beginning of each line. Note also that each statement terminates in a semicolon (;). The semicolon is the delimiter in this example. Since the semicolon is the default delimiter, you could execute the above file using:

db2 -tvmf filename.sql |tee filename.sql.out

Finally, you might have a file that uses an alternate delimiter. This is required when working with certain triggers and stored procedures. In the case of an alternate delimiter, the file might look like this:

connect to sample@
create table....@
alter table ....@
create index ....@
insert into ....@

That file would be executed using:

db2 -td@ -vmf filename.sql |tee filename.sql.out
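If you receive files from several sources, it is easy to forget which delimiter a given file uses. The following is a minimal sketch, assuming bash and assuming the file ends with a terminated statement; the function names detect_delim and db2_cmd_for are hypothetical helpers, not part of DB2. It only prints the command it would use, so you can review it before running anything.

```shell
# Hypothetical helper: guess the statement terminator used in an SQL file
# by looking at the last character of its last non-blank line.
detect_delim() {
    local file=$1 last
    # Strip trailing whitespace, drop blank lines, keep the final line.
    last=$(sed -e 's/[[:space:]]*$//' -e '/^$/d' "$file" | tail -n 1)
    case "$last" in
        *@) echo "@" ;;
        *)  echo ";" ;;
    esac
}

# Hypothetical helper: print (not run) the db2 command line for a file,
# using the same options shown above.
db2_cmd_for() {
    local file=$1 delim
    delim=$(detect_delim "$file")
    if [ "$delim" = ";" ]; then
        echo "db2 -tvmf $file"
    else
        echo "db2 -td$delim -vmf $file"
    fi
}
```

This is only a heuristic: a file that ends in a comment, or that uses some delimiter other than ; or @, would need to be checked by hand.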

Basic Command Line Options

I used several command line options above. Here is a breakdown of the options I use most frequently:

t – terminated – the statements are terminated with a delimiter. The default delimiter is the semi-colon
d – delimiter – the default delimiter is being overridden, and db2 uses the character immediately following d as the delimiter.
v – verbose – the statement will be echoed in output prior to the result of the statement. This is extremely useful when reviewing output or troubleshooting failed statements
m – prints the number of rows affected by DML
f – file – indicates that db2 should execute statements from a file, with the filename specified one space after the f.

There are plenty of other interesting command line options available in the DB2 Knowledge Center: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_10.5.0%2F3-5-2-0-2&lang=en

Things that may Happen Inside or Outside of a Script

There are things that may either be a part of the SQL script you’re executing, or done outside of the SQL script prior to executing the script.

For example you could have this SQL script:

create table....;
alter table ....;
create index ....;
insert into ....;

And first connect to a database or issue SET CURRENT SCHEMA or other commands, followed by executing the script:

$ db2 connect to sample

   Database Connection Information

 Database server        = DB2/AIX64 10.5.3
 SQL authorization ID   = DB2INST1
 Local database alias   = WC42U1L1

$ db2 -tvmf filename.sql |tee filename.sql.out

The other choice is to incorporate the connect into the script like this:

connect to sample;
create table....;
alter table ....;
create index ....;
insert into ....;

And simply execute the full script:

$ db2 -tvmf filename.sql |tee filename.sql.out

The choice between the two depends on your personal preferences and environment. If you’ll be executing a script against multiple databases, would you prefer to have to edit the file to include the proper connection information for each one, or would you prefer to just connect first? Having the database name in the file can be useful for the output to verify later that you connected to the right database when executing the script.
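If you go with connecting outside the script, a small wrapper can still record which database each run went against. This is a minimal sketch, assuming bash, the db2 command on the PATH, and an SQL file with no CONNECT statement in it; run_on_dbs is a hypothetical name.

```shell
# Sketch: run one SQL file against several databases, connecting outside the file.
# Usage: run_on_dbs filename.sql DB1 DB2 ...
run_on_dbs() {
    local script=$1 db
    shift
    for db in "$@"; do
        # Connect first, exactly as you would by hand.
        db2 connect to "$db"
        # One output file per database, so the database name is part of the record.
        db2 -tvmf "$script" | tee "$script.$db.out"
        # Drop the connection before moving on to the next database.
        db2 connect reset
    done
}
```

Putting the database name in the output file name gives you much of the benefit of having the connect statement in the file, without editing the file for each database. The sketch does not check whether the connect worked, so the output files still need to be reviewed.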

↧

Checking the Output of SQL Scripts and Commands for Errors

September 4, 2014, 4:00 am

Many DBAs who have been DBAs for a while have been bitten by executing an SQL script, not thoroughly checking the output, and finding later that one or more statements in the SQL script failed. I have certainly been guilty of this at times.

Success of a Command

In Linux and UNIX, when a command is run, the OS notifies us if the command fails. If you’re scripting in these, you’ll check the return code of the command to see if it was successful. In my favorite scripting language, Perl, any error message is stored in $!. However, it is a little different when executing DB2 commands. Whether DB2 thinks the command succeeded or failed, the operating system is checking at a different level. Therefore even DB2 errors are seen as successes by the OS. From the Operating System’s point of view, it passed a command to DB2, and DB2 successfully handled the command and returned any output that needed to be returned, even if that output was an error from DB2 saying DB2 could not do what you asked.

This means that parsing the output from DB2 commands takes a bit more work than just checking what’s written to STDERR.

Parsing the Output of an SQL Script for Errors

A lot of work is done with DB2 using SQL scripts of commands. See my blog entry on DB2 Basics: Executing a Script of SQL if you’re not familiar with how to execute a script of SQL.

The first step in making sure you can easily find errors in your output is saving that output somewhere. It is also important to use the v option when executing SQL to ensure that you are capturing both the statement that is failing and the output from that statement. My preferred syntax for executing SQL files is:

db2 -tvmf filename.sql |tee filename.sql.out

The tee allows me to see the output while the statements are being run. I do change the tee to a simple “>” if I’m doing a large number of statements or expect a large number of lines as output. tee can add significant processing time in these situations that “>” avoids.

The “m” tells db2 to output the count of rows affected for DML, and is only applicable in certain situations. It was introduced in DB2 9.1, so if you are seriously down level, it may not be available to you.

Using syntax like this gives you good information in checking for errors. If there are too many lines of output to easily parse through visually, I use grep to help me search through the output file for errors. I’ve found the following works well to catch many errors and eliminate many false positives:

cat filename.sql.out |grep SQL |grep -v DB20000I|grep -v "LANGUAGE SQL" |grep -v "READS SQL DATA"

Errors generally start with “SQL”. Running this on an output file I’ve been working with lately, I get this:

DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0601N  The name of the object to be created is identical to the existing
name "DB_ADM_STOGRP" of type "STOGROUP".  SQLSTATE=42710
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0601N  The name of the object to be created is identical to the existing
name "USERTEMP32K" of type "TABLESPACE".  SQLSTATE=42710
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0601N  The name of the object to be created is identical to the existing
name "DBA32K" of type "TABLESPACE".  SQLSTATE=42710
CONTAINS SQL
        DECLARE SQLCODE INTEGER DEFAULT 0;
        WHILE (SQLCODE=0) DO

So the last three lines there are not errors, but lines that happen to contain the string “SQL”. But before that are valid errors. This is much easier to look at than the 9,091 rows of the actual file I was parsing in this case.
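A tighter filter is possible if you match only lines that begin with a message identifier. This is a minimal sketch, assuming that identifiers start in the first column (as in the output above) and that the severity suffixes N, C, E and W mark problems while I marks informational messages; db2_problems is a hypothetical name.

```shell
# Sketch: list lines that start with a DB2 message identifier whose
# suffix indicates a problem (N, C, E or W), with their line numbers.
# Informational messages such as DB20000I are not matched.
db2_problems() {
    grep -nE '^(SQL|DB2)[0-9]{4,5}[NCEW]([[:space:]]|$)' "$1"
}

# Example (not run here):
#   db2_problems filename.sql.out
```

Because the pattern is anchored to the start of the line, lines such as CONTAINS SQL or DECLARE SQLCODE no longer show up. The tradeoff is that only the first line of each message is printed, so use the line numbers to find the full text in the output file. Messages with other prefixes would need to be added to the pattern.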

More Advanced Error Checking and Scripting

When you get beyond simple SQL files to shell or Perl or other scripting, you have to decide how to locate and handle errors. In my Perl scripts, I have several error checking routines that I can hand the output of every SQL statement to – some die on finding an SQL error, some simply warn, and others allow me to pass in an error that is OK, and so forth. If you’re using Perl, the DBI module will do a lot of that for you, and may make error handling much easier. Many scripting languages more sophisticated than KSH have that kind of construct.

These languages get and parse the SQLCA. The SQLCA is the SQL Communications Area. It consists of a number of variables that are updated at the end of every SQL statement. The information defined there is well laid out in the IBM DB2 Knowledge Center: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0002212.html?cp=SSEPGG_10.5.0%2F2-9-12-0

If you want to play with the SQLCA and start understanding it, you can use -a on the command line:

$ db2 -a "select npages from syscat.bufferpools where BPNAME='IBMDEFAULTBP'"

NPAGES
-----------
         -2

  1 record(s) selected.


SQLCA Information

 sqlcaid : SQLCA     sqlcabc: 136   sqlcode: 0   sqlerrml: 0
 sqlerrmc:
 sqlerrp : SQLRI01F
 sqlerrd : (1) -2147221503      (2) 1                (3) 1
           (4) 0                (5) 0                (6) 0
 sqlwarn : (1)      (2)      (3)      (4)        (5)       (6)
           (7)      (8)      (9)      (10)       (11)
 sqlstate: 00000

Usually the sqlcode is the thing we are most interested in.

If the sqlcode is a negative number, then it’s an error:

db2 -a "select junk from syscat.bufferpools where BPNAME='IBMDEFAULTBP'"

SQLCA Information

 sqlcaid : SQLCA     sqlcabc: 136   sqlcode: -206   sqlerrml: 4
 sqlerrmc: JUNK
 sqlerrp : SQLNQ075
 sqlerrd : (1) -2145779603      (2) 0                (3) 0
           (4) 0                (5) -10              (6) 0
 sqlwarn : (1)      (2)      (3)      (4)        (5)       (6)
           (7)      (8)      (9)      (10)       (11)
 sqlstate: 42703

And if it’s a positive number, then it’s a warning:

CREATE INDEX WSCOMUSR.I_ATTRVAL01 ON WSCOMUSR.ATTRVAL (ATTR_ID, VALUSAGE, FIELD3, FIELD2, FIELD1, STOREENT_ID, IDENTIFIER, ATTRVAL_ID) ALLOW REVERSE SCANS COLLECT DETAILED STATISTICS

SQLCA Information

 sqlcaid : SQLCA     sqlcabc: 136   sqlcode: 605   sqlerrml: 20
 sqlerrmc: WSCOMUSR.I_ATTRVAL01
 sqlerrp : SQLRL1CF
 sqlerrd : (1) -2145779603      (2) 0                (3) 0
           (4) 0                (5) 0                (6) 0
 sqlwarn : (1)      (2)      (3)      (4)        (5)       (6)
           (7)      (8)      (9)      (10)       (11)
 sqlstate: 01550

Of course there may be more than just the sqlcode that you’re interested in, and at the command line, the actual error text is often more useful than the SQLCA. Using the -a option replaces the usual text output with the SQLCA output. But if you’re scripting it may be easier to parse the SQLCA. This is the information that the Perl DBI and such modules look at.
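As an illustration of parsing the SQLCA in a shell script, here is a minimal sketch, assuming bash and the -a output layout shown above. The function names sqlcode_from_sqlca and classify_sqlcode are hypothetical.

```shell
# Sketch: read "db2 -a" output on stdin and print the first sqlcode found.
sqlcode_from_sqlca() {
    sed -n 's/.*sqlcode: *\(-\{0,1\}[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Classify a sqlcode as described above:
# negative = error, positive = warning, zero = success.
classify_sqlcode() {
    local code=$1
    if [ "$code" -lt 0 ]; then
        echo error
    elif [ "$code" -gt 0 ]; then
        echo warning
    else
        echo ok
    fi
}

# Example (not run here):
#   code=$(db2 -a "select junk from syscat.bufferpools" | sqlcode_from_sqlca)
#   [ "$(classify_sqlcode "$code")" = "error" ] && echo "statement failed: $code"
```

A real script would probably also want the sqlstate and the tokens in sqlerrmc, which can be pulled out the same way.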

It is every DBA’s responsibility to check the success or failure of every statement they run at the command line and in scripts. You should never assume that something was successful without checking. This may require you to step up your scripting skills, but it is really important to ensure that things are doing what they are expected to.

↧

HADR Tools: the HADR Simulator

September 9, 2014, 4:00 am

I have not made extensive use of the HADR Tools that IBM offers in the past. Most of my HADR setups to date have either been same-data-center using NEARSYNC or have used ASYNC to copy data between data centers. I haven’t had much cause to tweak my network settings or change my SYNCMODE settings based on hardware/networking.

However, I have a chance to make use of these tools in several scenarios now, so I thought I would share what I’m finding. I do not claim to be the foremost expert on these tools, and there is an incredible amount of detail on them available from IBM. For the full technical specifications and details on using the HADR tools, see:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/DB2HADR/
http://www.ibm.com/developerworks/data/library/techarticle/dm-1310db2luwhadr/index.html?ca=dat-

I thought I would share my own journey with these tools to help others. Comments, corrections, and additions are all welcome in the comments form below.

What are the HADR tools?

IBM provides three major HADR tools on a developerWorks wiki site.

The HADR Simulator is used to look both at disk speed and network details around HADR. It can be used in several different ways, including helping you to troubleshoot the way HADR does name resolution.

The DB2 Log Scanner is used to look at log files and report details about your DB2 Workload. The output is a bit cryptic, and this tool is best used in conjunction with the HADR Calculator. This does require real log files from a real workload, so if you’re setting up a new system, you will need to have actual work on the system before you can use it. Also, IBM will not provide the tool they use internally to uncompress automatically compressed log files, so if you want to use it, you’ll have to turn automatic log compression off. I tried to get the tool, but they would not give it to me.

The HADR Calculator takes input from the DB2 Log Scanner, and values that you can compute using the HADR Simulator, and tells you which HADR SYNCMODEs make the most sense for you.

These three tools do NOT require that you have DB2 on a server to run – they are fully standalone. There are versions of the first two for each operating system. The third requires that you have Perl, but can be run anywhere, including on a laptop or personal computer. This gives you flexibility in evaluating a network or server you are thinking of using before actually using it, and allows you to analyze log files without adding workload to a server.

Using the HADR Simulator

In this post, I’m going to focus on the HADR simulator.

First of all, the download and details can be found at: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/DB2HADR/page/HADR%20simulator. Note that there are some child pages and links there with good detail.

The HADR Simulator is a stand-alone tool. This means that you do not need DB2 on the servers in question. It is a binary executable. To use it, you simply download it from the link above to one or more servers. You can simulate primary-standby network interaction by running it on two servers at the same time. You can also run it on one server alone to look at things like disk performance.

Simulating HADR with the HADR Simulator

To use it in the main intended way, you download the right version for your OS, place it on each of the servers in question, make sure you have execute permission on it, and execute it like this:
Primary:

 simhadr_aix -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -role primary -syncmode NEARSYNC -t 60

Standby:

simhadr_aix -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -role standby

The ports in the above should be the ports you plan to use for HADR. However, you cannot use the same ports that HADR is currently running on if you happen to already be running HADR on the servers. If you try that, you will get output like this:

+ simhadr -lhost host1 -lport 18819 -rhost host2 -rport 18820 -role primary -syncmode NEARSYNC -t 60

Measured sleep overhead: 0.000004 second, using spin time 0.000004 second.
flushSize = 16 pages

Resolving local host host1 via gethostbyname()
hostname=host1
alias: host1.domain
address_type=2 address_length=4
address: 000.000.000.000

Resolving remote host host2 via gethostbyname()
hostname=host2
alias: host2.domain
address_type=2 address_length=4
address: 000.000.000.000

Socket property upon creation
BlockingIO=true
NAGLE=true
TCP_WINDOW_SCALING=32
SO_SNDBUF=262144
SO_RCVBUF=262144
SO_LINGER: onoff=0, length=0

Binding socket to local address.
bind() failed on local address. errno=67, Address already in use

You should be passing in the host names as you would use them with HADR. This allows the tool to show you how the names are resolving. The HADR Simulator can be used for that purpose alone if you’re having name resolution issues. The ports that you pass in must be numeric; /etc/services or its equivalent is not consulted, so port names defined there will not work.

The output from the HADR Simulator, invoked using the syntax above, looks something like this:
Primary:

+ simhadr -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -role primary -syncmode NEARSYNC -t 60

Measured sleep overhead: 0.000004 second, using spin time 0.000004 second.
flushSize = 16 pages

Resolving local host host1.domain via gethostbyname()
hostname=host1.domain
alias: host1.domain.local
address_type=2 address_length=4
address: 000.000.000.000

Resolving remote host host2.domain via gethostbyname()
hostname=host2.domain
alias: host2.domain.local
address_type=2 address_length=4
address: 000.000.000.000

Socket property upon creation
BlockingIO=true
NAGLE=true
TCP_WINDOW_SCALING=32
SO_SNDBUF=262144
SO_RCVBUF=262144
SO_LINGER: onoff=0, length=0

Binding socket to local address.
Listening on local host TCP port 18821

Connected.

Calling fcntl(O_NONBLOCK)
Calling setsockopt(TCP_NODELAY)
Socket property upon connection
BlockingIO=false
NAGLE=false
TCP_WINDOW_SCALING=32
SO_SNDBUF=262088
SO_RCVBUF=262088
SO_LINGER: onoff=0, length=0

Sending handshake message:
syncMode=NEARSYNC
flushSize=16
connTime=2014-06-15_18:24:42_UTC

Sending log flushes. Press Ctrl-C to stop.

NEARSYNC: Total 18163171328 bytes in 60.000131 seconds, 302.718861 MBytes/sec
Total 277148 flushes, 0.000216 sec/flush, 16 pages (65536 bytes)/flush

Total 18163171328 bytes sent in 60.000131 seconds. 302.718861 MBytes/sec
Total 277148 send calls, 65.536 KBytes/send,
Total 0 congestions, 0.000000 seconds, 0.000000 second/congestion

Total 4434368 bytes recv in 60.000131 seconds. 0.073906 MBytes/sec
Total 277148 recv calls, 0.016 KBytes/recv

Distribution of log write size (unit is byte):
Total 277148 numbers, Sum 18163171328, Min 65536, Max 65536, Avg 65536
Exactly      65536      277148 numbers

Distribution of log shipping time (unit is microsecond):
Total 277148 numbers, Sum 59711258, Min 175, Max 3184, Avg 215
From 128 to 255               263774 numbers
From 256 to 511                13335 numbers
From 512 to 1023                  23 numbers
From 1024 to 2047                 15 numbers
From 2048 to 4095                  1 numbers

Distribution of send size (unit is byte):
Total 277148 numbers, Sum 18163171328, Min 65536, Max 65536, Avg 65536
Exactly      65536      277148 numbers

Distribution of recv size (unit is byte):
Total 277148 numbers, Sum 4434368, Min 16, Max 16, Avg 16
Exactly         16      277148 numbers

Standby:

+ simhadr -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -role standby

Measured sleep overhead: 0.000004 second, using spin time 0.000004 second.

Resolving local host host2.domain via gethostbyname()
hostname=host2.domain
alias: host2.domain.local
address_type=2 address_length=4
address: 000.000.000.000

Resolving remote host host1.domain via gethostbyname()
hostname=host1.domain
alias: host1.domain.local
address_type=2 address_length=4
address: 000.000.000.000

Socket property upon creation
BlockingIO=true
NAGLE=true
TCP_WINDOW_SCALING=32
SO_SNDBUF=262144
SO_RCVBUF=262144
SO_LINGER: onoff=0, length=0

Connecting to remote host TCP port 18821
connect() failed. errno=79, Connection refused
Retrying.

Connected.

Calling fcntl(O_NONBLOCK)
Calling setsockopt(TCP_NODELAY)
Socket property upon connection
BlockingIO=false
NAGLE=false
TCP_WINDOW_SCALING=32
SO_SNDBUF=262088
SO_RCVBUF=262088
SO_LINGER: onoff=0, length=0

Received handshake message:
syncMode=NEARSYNC
flushSize=16
connTime=2014-06-15_18:24:42_UTC

Standby receive buffer size 64 pages (262144 bytes)
Receiving log flushes. Press Ctrl-C on primary to stop.
Zero byte received. Remote end closed connection.

NEARSYNC: Total 18163171328 bytes in 59.998903 seconds, 302.725057 MBytes/sec
Total 277148 flushes, 0.000216 sec/flush, 16 pages (65536 bytes)/flush

Total 4434368 bytes sent in 59.998903 seconds. 0.073907 MBytes/sec
Total 277148 send calls, 0.016 KBytes/send,
Total 0 congestions, 0.000000 seconds, 0.000000 second/congestion

Total 18163171328 bytes recv in 59.998903 seconds. 302.725057 MBytes/sec
Total 613860 recv calls, 29.588 KBytes/recv

Distribution of log write size (unit is byte):
Total 277148 numbers, Sum 18163171328, Min 65536, Max 65536, Avg 65536
Exactly      65536      277148 numbers

Distribution of send size (unit is byte):
Total 277148 numbers, Sum 4434368, Min 16, Max 16, Avg 16
Exactly         16      277148 numbers

Distribution of recv size (unit is byte):
Total 613860 numbers, Sum 18163171328, Min 376, Max 65536, Avg 29588
From 256 to 511                  166 numbers
From 1024 to 2047              55614 numbers
From 2048 to 4095               8845 numbers
From 4096 to 8191              18028 numbers
From 8192 to 16383             34458 numbers
From 16384 to 32767           227758 numbers
From 32768 to 65535           264416 numbers
From 65536 to 131071            4575 numbers

Ok, that’s great, right, but what do I do with that?

Well, here’s one thing – you can tune your send and receive buffers using this information. Run this process several times using different values for those like this:

./simhadr_aix -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -sockSndBuf 65536 -sockRcvBuf 65536 -role primary -syncmode NEARSYNC -t 60
./simhadr_aix -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -sockSndBuf 65536 -sockRcvBuf 65536 -role standby

./simhadr_aix -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -sockSndBuf 131072 -sockRcvBuf 131072 -role primary -syncmode NEARSYNC -t 60
./simhadr_aix -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -sockSndBuf 131072 -sockRcvBuf 131072 -role standby

./simhadr_aix -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -sockSndBuf 262144 -sockRcvBuf 262144 -role primary -syncmode NEARSYNC -t 60
./simhadr_aix -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -sockSndBuf 262144 -sockRcvBuf 262144 -role standby

./simhadr_aix -lhost host1.domain -lport 18821 -rhost host2.domain -rport 18822 -sockSndBuf 524288 -sockRcvBuf 524288 -role primary -syncmode NEARSYNC -t 60
./simhadr_aix -lhost host2.domain -lport 18822 -rhost host1.domain -rport 18821 -sockSndBuf 524288 -sockRcvBuf 524288 -role standby

In the line of output that looks like this:

NEARSYNC: Total 14220328960 bytes in 60.000083 seconds, 237.005155 MBytes/sec

Pull out the MBytes per second, and graph it like this:

In this example, it is clear that the throughput levels off at a buffer size of 128 KB. Your results are likely to vary. To allow some headroom, in this example we would choose values of 256 KB, and set them using this syntax:

db2set DB2_HADR_SOSNDBUF=262144
db2set DB2_HADR_SORCVBUF=262144
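If you save the primary-side output of each simulator run to its own file, the throughput figures can be collected for graphing with a small helper. This is a minimal sketch, assuming bash and awk, and assuming the summary line begins with the sync mode as in the NEARSYNC example above; hadr_throughput and the file names in the example are hypothetical.

```shell
# Sketch: print "<file> <MBytes/sec>" for each saved primary-side output file.
# Only the summary line that starts with the sync mode (e.g. "NEARSYNC: Total ...")
# is used; the later "Total ... bytes sent" line is ignored.
hadr_throughput() {
    local f
    for f in "$@"; do
        awk -v name="$f" '
            /^[A-Z]+: Total .* MBytes\/sec/ { print name, $(NF-1); exit }
        ' "$f"
    done
}

# Example (not run here), with one output file per buffer size:
#   hadr_throughput run_65536.out run_131072.out run_262144.out run_524288.out
```

The two-column output can be pasted into a spreadsheet or any plotting tool to find the point where throughput levels off.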

This is the kind of thing I might never have gone into detail on if I didn’t blog. And yet it led to me changing parameters used and improving what I’m doing at work.

I am also interested in what I might do with some of the disk information supplied here. I sometimes have trouble getting disk information from hosting providers and, depending on the situation, there might be numbers here that I could use.

I’m really disappointed that IBM won’t share their internal log uncompression tool for use with the log scanner – I’m not sure I can justify running with manually compressed logs just to run the log scanner. Automatic log compression is one of my favorite recent features. If I get the opportunity, I’ll play with that tool and blog about it too.

↧

Example of A Clustering Low-Cardinality Index Helping Query Performance

September 16, 2014, 4:00 am

The request from the developers was something along the lines of “Help, Ember, this query needs to perform better”. Sometimes the query I’m working on is not one that shows up as a problem from the database administrator’s perspective, but one that is especially important in some part of application functioning. In this case, this query is related to the performance of searches done on the website – a particularly problematic area for this client.

The query I was asked to help with is this one:

select distinct 
    ADSC.SRCHFIELDNAME, 
    AVD.value, 
    AV.storeent_id, 
    AVD.sequence, 
    AVD.image1, 
    AVD.image2 
from    attrdictsrchconf ADSC, 
    attr A, 
    attrval AV, 
    attrvaldesc AVD 
where 
    ADSC.ATTR_ID is not NULL 
    and ADSC.attr_id = A.attr_id 
    and ADSC.mastercatalog_id = ? 
    and ADSC.srchfieldname in (?) 
    and A.storeent_id in (?, ?, ?) 
    and A.facetable = 1 
    and AV.attr_id = A.attr_id 
    and AV.storeent_id in (?, ?, ?) 
    and AV.attrval_id = AVD.attrval_id 
    and AVD.language_id = ?

The data model in question here is a typical WebSphere Commerce database, with no customizations involved in this query.

Doing an explain (using DBI’s Brother-Panther), I get this:

A typical db2advis gives me nothing useful, but I know better than to stop there. The thing I tried first was to do an explain that recommends MQTs and MDC tables. I’ve talked to WebSphere Commerce support on this and they’ve told me that MDC Tables are considered a customization. They don’t see why it wouldn’t work, but they won’t say it’s supported. C’mon IBM – there are several areas I can see MDCs making big performance gains for me. But even if I can’t use MDC tables, the db2 advisor for them can sometimes give me ideas for clustering indexes, which I can use.

In DBI’s Brother-Panther, I just click the right boxes:

If doing this at the command line, you would use the ‘-m C’ option on db2advis to get the same kinds of recommendations.

In this case, it recommended one clustering index and that one table be converted to an MDC table. But the MDC recommendation includes only one dimension. This makes it particularly likely that a clustering index could help me nearly as much. Both index recommendations are duplications of existing indexes, but just adding clustering. From what I’ve seen, the WebSphere Commerce data model seems to favor a lot of indexes on single columns. This leads to some low-cardinality indexes. If you want to understand why low-cardinality indexes can hurt performance, please see my DeveloperWorks article, Why low cardinality indexes negatively impact performance.

Looking at the index in question, it is clearly a low-cardinality index:

select substr(indschema,1,12) as indschema, substr(indname,1,12) as indname, fullkeycard from syscat.indexes where indname='I0001468' with ur

INDSCHEMA    INDNAME      FULLKEYCARD
------------ ------------ --------------------
WSCOMUSR     I0001468                        2

  1 record(s) selected.

Hard to get much more low-cardinality than 2. The cardinality of this full table is 1,255,954. The data is relatively evenly distributed across these two values. DB2 is (correctly) choosing not to use the index because if it’s low cardinality and the table is not clustered on it, it would make performance worse.
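To look for other indexes like this one, the catalog can be queried for indexes with very few distinct key values on tables with many rows. This is a minimal sketch, assuming bash and current statistics; low_card_index_sql is a hypothetical helper that only writes out the SQL, and the default threshold of 10 distinct values is an arbitrary assumption.

```shell
# Sketch: print a catalog query that lists indexes in a schema whose
# FULLKEYCARD is at or below a threshold, largest tables first.
# Usage: low_card_index_sql SCHEMA [MAX_DISTINCT_KEYS]
low_card_index_sql() {
    local schema=$1 max_keys=${2:-10}
    cat <<EOF
select substr(i.indname,1,18) as indname,
       substr(i.tabname,1,30) as tabname,
       i.fullkeycard,
       t.card
from syscat.indexes i
join syscat.tables t
  on t.tabschema = i.tabschema and t.tabname = i.tabname
where i.indschema = '$schema'
  and i.fullkeycard between 0 and $max_keys
  and t.card > 0
order by t.card desc
with ur;
EOF
}

# Example (not run here):
#   low_card_index_sql WSCOMUSR 10 > low_card.sql
#   db2 -tvf low_card.sql | tee low_card.sql.out
```

A FULLKEYCARD of -1 means statistics have not been collected, so those indexes are left out by the BETWEEN condition. Anything the query returns is only a candidate to investigate, not something to change automatically.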

I changed the index to be a clustering index like this:

$ db2 "drop index wscomusr.I0001468"
DB20000I  The SQL command completed successfully.
$ db2 "create index wscomusr.i_attrvaldesc_01 on wscomusr.attrvaldesc (language_id) cluster allow reverse scans"
DB20000I  The SQL command completed successfully.
$ db2 "REORG TABLE WSCOMUSR.ATTRVALDESC INDEX wscomusr.i_attrvaldesc_01 "
DB20000I  The REORG command completed successfully.
$ db2 "RUNSTATS ON TABLE WSCOMUSR.ATTRVALDESC with distribution and detailed indexes all"
DB20000I  The RUNSTATS command completed successfully.

After doing that, the explain looks like this:

Note that it's now using my clustering index for that table, even though it can't make it into index-only access. And the cost of that query is reduced by a whopping 41%. I got the cost of the query down a bit more with another clustering index, and the developers were quite happy. They asked if I could reduce it further and my response was "Well, can you stop using DISTINCT and IN?"

Lessons learned:

Information beyond the standard db2advis can be useful even if you cannot make use of all DB2 features based on vendor restrictions.
C'mon IBM - you're the vendor of both the application and the database. Can we get the use of MDC tables for base WebSphere Commerce tables certified?
I am a lifelong GUI-hater, but I'm loving DBI's Brother-Panther for looking at things.

I somewhat doubt that I would have added this index by itself if it wasn't replacing a base WebSphere Commerce index with the exact same columns, though I might have if I had happened on this same performance improvement. I might have experimented more with a composite index in that case. But I like sticking as close to the WebSphere Commerce data model as possible unless I have very specific reasons to do otherwise.

↧

DB2 Quick Tip: Checking Connection State

September 18, 2014, 4:00 am

Sometimes the connection state is unclear. The following can all make it fuzzy whether or not you have a valid connection:

A db2 error or warning related to your connection
A system error related to network connectivity
Changing VPNs or adding a VPN connection
Leaving a connection up overnight or over longer periods

This tip is strictly related to the DB2 command line. Various database interfaces in scripting languages or tools may have another way of doing this.

To check a connection state at the DB2 command line, you can issue:

 >db2 get connection state

   Database Connection State

 Connection state       = Connectable and Connected
 Connection mode        = SHARE
 Local database alias   = SAMPLE
 Database name          = SAMPLE
 Hostname               =
 Service name           =

There is no specific authorization needed to execute this command, unlike many DB2 commands.

The possible states include:

Connectable and connected – meaning you are connected to the database listed
Connectable and unconnected – you are not connected, but may connect
Unconnectable and connected
Implicitly connectable

In addition to the mode of SHARE seen above, the connection mode could also be EXCLUSIVE. The Hostname and the Service Name are only populated if the connection is a remote one over TCP/IP.
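In a script, the state can be pulled out of that output and tested before running anything else. This is a minimal sketch, assuming bash and the output layout shown above; connection_state is a hypothetical name.

```shell
# Sketch: read "db2 get connection state" output on stdin and print
# only the value of the "Connection state" line.
connection_state() {
    sed -n 's/^[[:space:]]*Connection state[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
}

# Example (not run here):
#   state=$(db2 get connection state | connection_state)
#   case "$state" in
#       *[Uu]nconnected) echo "no connection, reconnect before continuing" ;;
#       *)               echo "state: $state" ;;
#   esac
```

Printing the state rather than returning a simple yes or no leaves it to the calling script to decide how to treat each of the states listed above.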

The Knowledge Center page on the GET CONNECTION STATE command is: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0001953.html?cp=SSEPGG_10.5.0%2F3-5-2-4-41&lang=en

↧

Tracking Table Activity using Triggers

September 23, 2014, 4:00 am

There are a number of situations in which a DBA may need to determine when data in a table is being inserted or changed. The most obvious tool for tracking this may be using the db2 auditing facility. If you haven’t looked at audit for a while, it has been significantly improved in more recent versions, and you can limit the scope of auditing to a single table.

There’s another quick and dirty method too – using triggers on every operation against the table to write data to a separate table.

What to Track

There are two different directions to go on tracking data, and the direction chosen depends on the goals of tracking the information and on the structure of a particular table.

The first direction is to create a very simple table that has the primary key of the table being tracked, plus two extra columns to tell you when a change is made and what kind of change. This can help if you’re not understanding when changes are happening and need to know.

The second direction is to track every column with before and after information for updates so you can understand exactly what is happening and when it is happening. This works better than the first method if you need more detail, or if there are many changes and you want to understand what the sequence of them is. This direction will require your tracking table to have every column that the table being tracked has, plus two or three additional columns. This can be cumbersome for a particularly large table.

Creating Tracking Tables

There are a couple of shortcuts to creating the tables you need, assuming you are taking the second direction described above. The harder way is to grab db2look information for the table, and then take the create table syntax and alter it for table name and to include the additional columns needed. In a recent example, I pulled table information from db2look that looked like this:

CREATE TABLE "WSCOMUSR"."TI_DELTA_CATENTRY"  (
                  "MASTERCATALOG_ID" BIGINT NOT NULL ,
                  "CATENTRY_ID" BIGINT NOT NULL ,
                  "ACTION" CHAR(1 OCTETS) ,
                  "LASTUPDATE" TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP )
                 IN "USERSPACE1"
                 ORGANIZE BY ROW;

Note that it’s unlikely that you want to replicate the indexes that are on the table being tracked. You may need indexes if performance matters or if you’re worried about flooding your bufferpools with table scans. Those decisions depend on how long you will be using the table and how much data you expect. In the example I’m using here, IBM WebSphere Commerce support requested that we track this data in detail. I don’t expect a lot of data, and I only expect to have the tracking in place for a couple of days or a week to try to catch a specific scenario.

I’m not even adding a primary key to the table, which actually drives me nuts – there’s a little voice inside me screaming “Every table must have a primary key!” But the goal is as little impact when the triggers fire as possible, and so I’m leaving off all indexes and constraints for the moment.

Using the above syntax as a starting point, I came up with the following syntax for the table I’ll be using to track changes:

CREATE TABLE DBA.X_CHG_TI_DELTA_CATENTRY  (
                  action_timestamp timestamp not null with default current timestamp,
                  action_type char(1),
                  before_or_after char(1),
                  "MASTERCATALOG_ID" BIGINT NOT NULL ,
                  "CATENTRY_ID" BIGINT NOT NULL ,
                  "ACTION" CHAR(1 OCTETS) ,
                  "LASTUPDATE" TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP )
                 IN DBA32K
                 ORGANIZE BY ROW;

The major changes from the db2look output are the table name, the tablespace, and three added columns. Of course the table name must be different. Of the three new columns, the first will note when the trigger was fired, and the second will record whether this is an insert (‘I’), update (‘U’), or a delete (‘D’). The third will track whether the data stored in the change tracking table represents the before (‘B’) or after (‘A’) data – which I want to track for updates – not just the new or old values. There are obviously multiple ways of handling this, and I’m not claiming this is better than others.

A faster way to create the tracking table would be to create a table like an existing table:

CREATE TABLE DBA.X_CHG_TI_DELTA_CATENTRY LIKE WSCOMUSR.TI_DELTA_CATENTRY IN DBA32K

Then just alter the new table to add the additional columns:

alter table DBA.X_CHG_TI_DELTA_CATENTRY
     add column action_timestamp timestamp not null with default current timestamp 
     add column action_type char(1) 
     add column before_or_after char(1);

Creating Tracking Triggers

Once we have the change tracking tables, the triggers are the next thing needed. We need three triggers – one for update, one for insert, and one for delete. If this were code that would have to be maintained over time, we might consider the use of a multi-action trigger instead.

Here are the triggers as I defined them for my little support-requested scenario:

CREATE TRIGGER T_CHG_TI_DELTA_CATENTRY_I01
     AFTER INSERT ON WSCOMUSR.TI_DELTA_CATENTRY
     REFERENCING NEW AS N
     FOR EACH ROW
     BEGIN ATOMIC
     INSERT INTO DBA.X_CHG_TI_DELTA_CATENTRY (
        action_timestamp,
        action_type,
        before_or_after,
        MASTERCATALOG_ID,
        CATENTRY_ID,
        ACTION,
        LASTUPDATE)
         Values (
                current timestamp,
                'I',
                'A',
                N.MASTERCATALOG_ID,
                N.CATENTRY_ID,
                N.ACTION,
                N.LASTUPDATE
         );
     END@

 CREATE TRIGGER T_CHG_TI_DELTA_CATENTRY_U01
     AFTER UPDATE ON WSCOMUSR.TI_DELTA_CATENTRY
     REFERENCING NEW AS N OLD AS O
     FOR EACH ROW
     BEGIN ATOMIC
     INSERT INTO DBA.X_CHG_TI_DELTA_CATENTRY (
        action_timestamp,
        action_type,
        before_or_after,
        MASTERCATALOG_ID,
        CATENTRY_ID,
        ACTION,
        LASTUPDATE)
         Values (
                current timestamp,
                'U',
                'B',
                O.MASTERCATALOG_ID,
                O.CATENTRY_ID,
                O.ACTION,
                O.LASTUPDATE
         );
     INSERT INTO DBA.X_CHG_TI_DELTA_CATENTRY (
        action_timestamp,
        action_type,
        before_or_after,
        MASTERCATALOG_ID,
        CATENTRY_ID,
        ACTION,
        LASTUPDATE)
         Values (
                current timestamp,
                'U',
                'A',
                N.MASTERCATALOG_ID,
                N.CATENTRY_ID,
                N.ACTION,
                N.LASTUPDATE
         );

     END@

 CREATE TRIGGER T_CHG_TI_DELTA_CATENTRY_D01
     AFTER DELETE ON WSCOMUSR.TI_DELTA_CATENTRY
     REFERENCING OLD AS O
     FOR EACH ROW
     BEGIN ATOMIC
     INSERT INTO DBA.X_CHG_TI_DELTA_CATENTRY (
        action_timestamp,
        action_type,
        before_or_after,
        MASTERCATALOG_ID,
        CATENTRY_ID,
        ACTION,
        LASTUPDATE)
         Values (
                current timestamp,
                'D',
                'B',
                O.MASTERCATALOG_ID,
                O.CATENTRY_ID,
                O.ACTION,
                O.LASTUPDATE
         );
     END@
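
Since this kind of tracking is meant to be temporary, it helps to have the review query and the cleanup ready before the triggers go in. The following is a sketch only: the function names are hypothetical, and the ORDER BY assumes that sorting on before_or_after descending (so ‘B’ sorts ahead of ‘A’) is enough to pair up the two rows of an update when they share a timestamp.

```shell
# Print a query that lists captured changes in the order they happened.
# Table and column names are the ones used in the triggers above.
review_changes_sql() {
    echo "select action_timestamp, action_type, before_or_after, mastercatalog_id, catentry_id, action, lastupdate from DBA.X_CHG_TI_DELTA_CATENTRY order by action_timestamp, mastercatalog_id, catentry_id, before_or_after desc with ur"
}

# Print DROP statements for the insert/update/delete tracking triggers.
#   $1 = trigger base name, e.g. T_CHG_TI_DELTA_CATENTRY
# Assumes the _I01/_U01/_D01 suffixes used above and unqualified trigger
# names, so run the output with the same CURRENT SCHEMA as the CREATEs.
drop_tracking_triggers() {
    for suffix in I01 U01 D01; do
        printf 'DROP TRIGGER %s_%s;\n' "$1" "$suffix"
    done
}
```

With a connection in place, db2 -v "$(review_changes_sql)" runs the query, and the output of drop_tracking_triggers T_CHG_TI_DELTA_CATENTRY can be saved to a file and run with db2 -tvf once support has what they need.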

Summary

If you’re looking at a more long-term solution, you might also want to consider temporal tables. The reason I didn’t consider it here is that it requires adding columns to the table being tracked, and that was not an option for me. I only do this a couple of times a year, and I’m sure I have readers who do it all the time. Please share any suggestions and advice in the comments box at the end of this article.

References

My favorite syntax guide is Graeme Birchall’s: http://mysite.verizon.net/Graeme_Birchall/id1.html, though he does not intend to update it past 9.7. Triggers are on page 333.
IBM DB2 Knowledge Center Page on CREATE TRIGGER statement
Article on Temporal Tables: http://www.ibm.com/developerworks/data/library/techarticle/dm-1210temporaltablesdb2/index.html?ca=dat

A Better Understanding of TSA Resources and States

September 30, 2014, 4:00 am

It is very possible to support DB2’s HADR using TSA to automate failover without understanding many of the TSA components and details, and even the TSA commands. I’ve posted several blog entries on TSA issues resolved without any deeper knowledge of things. In this post, I hope to shed a bit more light on the details of TSA. This means I’ll be sharing commands that are not DB2 commands, but are TSA commands, and that I’ve run as root.

Environment

The output I’m sharing in this blog entry comes from a system that I set up using the guidelines set forth in my HADR/TSA Series:
HADR
Using TSA/db2haicu to automate failover – Part 1: The Preparation
Using TSA/db2haicu to automate failover – Part 2: How it looks if it goes smoothly
Using TSA/db2haicu to Automate Failover Part 3: Testing, Ways Setup can go Wrong and What to do.

It’s a simple two-server cluster using a network quorum. I’m using DB2 10.5, FixPack 3.

Basic Status Check

My favorite way to check the status has always been lssam:

And since I have to redact a fair amount in that image, here’s what that looks like in text:

Online IBM.ResourceGroup:db2_db2inst1_db2inst1_SAMPLE-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs
                |- Online IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs:host01
                '- Offline IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs:host02
        '- Online IBM.ServiceIP:db2ip_111_00_00_30-rs
                |- Online IBM.ServiceIP:db2ip_111_00_00_30-rs:host01
                '- Offline IBM.ServiceIP:db2ip_111_00_00_30-rs:host02
Online IBM.ResourceGroup:db2_db2inst1_host01_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host01_0-rs
                '- Online IBM.Application:db2_db2inst1_host01_0-rs:host01
Online IBM.ResourceGroup:db2_db2inst1_host02_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host02_0-rs
                '- Online IBM.Application:db2_db2inst1_host02_0-rs:host02
Online IBM.Equivalency:db2_db2inst1_db2inst1_SAMPLE-rg_group-equ
        |- Online IBM.PeerNode:host01:host01
        '- Online IBM.PeerNode:host02:host02
Online IBM.Equivalency:db2_db2inst1_host01_0-rg_group-equ
        '- Online IBM.PeerNode:host01:host01
Online IBM.Equivalency:db2_db2inst1_host02_0-rg_group-equ
        '- Online IBM.PeerNode:host02:host02
Online IBM.Equivalency:db2_public_network_0
        |- Online IBM.NetworkInterface:en12:host02
        '- Online IBM.NetworkInterface:en12:host01

The first thing I look at quickly is the color. Things should generally be green, with a few areas of blue. The blue items are “offline”, but that’s ok, because a database or IP can only be online on one of the nodes at a time. If for some reason I can’t see the color (like if a teammate sends me the text), then I check that each larger group shows as “Online”, that there are no groups where both items in the group are “Offline”, and that there are no occurrences of the terms “SuspendedPropagated” or “Pending online” – both of which indicate common issues.
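
When all I have is the text, part of that check can be roughed out as a filter. This is a sketch under a few assumptions: the function name is hypothetical, the input is plain lssam text without color escape codes, and it only covers the problem states and the top-level “Online” check, not the case where both members of a group are offline.

```shell
# Read plain-text lssam output on stdin and print anything suspicious.
# Returns 0 when nothing was flagged, 1 otherwise.
check_lssam() {
    awk '
        # States called out above as signs of common issues
        /SuspendedProp[ae]gated|Pending online/ {
            print "PROBLEM STATE: " $0; bad = 1; next
        }
        # Top-level entries (no leading whitespace) should all be Online
        NF > 0 && substr($0, 1, 1) != " " && substr($0, 1, 1) != "\t" && $1 != "Online" {
            print "NOT ONLINE: " $0; bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

As root, lssam | check_lssam prints nothing for a healthy cluster. It works just as well on output a teammate has pasted into a file.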

But those are very basic and visual cues. What are we really looking at here?

Domain

It is unlikely that the domain is going to appear to be offline, but here’s how to check it:

$lsrpdomain
Name       OpState RSCTActiveVersion MixedVersions TSPort GSPort
UATec_db2h Online  3.1.4.4           No            12347  12348

If it were down, you could use:

startrpdomain UATec_db2h

You can also stop the domain:

stoprpdomain UATec_db2h

Starting and stopping at this level are not things you’re likely to do, except perhaps during an upgrade.
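
If the domain state needs to go into a script or a monitoring check, it can be pulled out of the lsrpdomain output. A small sketch, with a hypothetical function name, that assumes the column layout shown above (name first, OpState second):

```shell
# Read lsrpdomain output on stdin and print the OpState of one domain.
#   $1 = domain name
domain_opstate() {
    # Skip the header line, then match on the Name column
    awk -v dom="$1" 'NR > 1 && $1 == dom { print $2 }'
}
```

For the output above, lsrpdomain | domain_opstate UATec_db2h prints Online. An unknown domain name prints nothing.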

Resource Groups

A resource is something that can be controlled by TSA (hardware or software). A resource group is a virtual group of resources. In the lssam output, we see that there is a resource group that contains the database and the virtual or floating IP. These are in the same resource group because they should always fail over together. This is the output from above that represents that resource group:

Online IBM.ResourceGroup:db2_db2inst1_db2inst1_SAMPLE-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs
                |- Online IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs:host01
                '- Offline IBM.Application:db2_db2inst1_db2inst1_SAMPLE-rs:host02
        '- Online IBM.ServiceIP:db2ip_111_00_00_30-rs
                |- Online IBM.ServiceIP:db2ip_111_00_00_30-rs:host01
                '- Offline IBM.ServiceIP:db2ip_111_00_00_30-rs:host02

Note that I can also limit the lssam output to a single resource group using the -g option (lssam -g db2_db2inst1_db2inst1_SAMPLE-rg). This is a useful way of filtering the output to look quickly at the most interesting of my resource groups.

There is also a resource group for the db2 instance on the primary server and a resource group for the db2 instance on the standby server. Both of these should be online, simultaneously, unless you have an instance down for maintenance. This is the output from above that represents those two resource groups:

Online IBM.ResourceGroup:db2_db2inst1_host01_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host01_0-rs
                '- Online IBM.Application:db2_db2inst1_host01_0-rs:host01
Online IBM.ResourceGroup:db2_db2inst1_host02_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host02_0-rs
                '- Online IBM.Application:db2_db2inst1_host02_0-rs:host02

The last remaining section of the lssam output shows us Equivalencies. Equivalencies are fixed sets of resources that provide the same function. A good example of this is the network interface card – in the output above, en12. There is one of these on each of the servers in our cluster. The other resources may only use one of these at a time, and it’s not something that TSA can fail over.

Floating IP Address

Additional information is available on resources. For the floating IP address, for example, we can get more information like this:

$ lsrsrc -Ab IBM.ServiceIP
Resource Persistent and Dynamic Attributes for IBM.ServiceIP
resource 1:
        Name              = "db2ip_111_00_00_30-rs"
        ResourceType      = 0
        AggregateResource = "0x2029 0xffff 0x454eac36 0x0cb8029a 0x9377a68d 0xa5d2f010"
        IPAddress         = "111.00.00.30"
        NetMask           = "255.255.255.0"
        ProtectionMode    = 1
        NetPrefix         = 0
        ActivePeerDomain  = "UATec_db2h"
        NodeNameList      = {"host01"}
        OpState           = 1
        ConfigChanged     = 0
        ChangedAttributes = {}
resource 2:
        Name              = "db2ip_111_00_00_30-rs"
        ResourceType      = 1
        AggregateResource = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
        IPAddress         = "111.00.00.30"
        NetMask           = "255.255.255.0"
        ProtectionMode    = 1
        NetPrefix         = 0
        ActivePeerDomain  = "UATec_db2h"
        NodeNameList      = {"host01","Unknown_Node_Name"}
        OpState           =
        ConfigChanged     =
        ChangedAttributes =

Application Details

Additional information, including the exact scripts that TSA uses to manage DB2, is available with this command:

$ lsrsrc -Ab IBM.Application
Resource Persistent and Dynamic Attributes for IBM.Application
resource 1:
        Name                  = "db2_db2inst1_db2inst1_SAMPLE-rs"
        ResourceType          = 0
        AggregateResource     = "0x2028 0xffff 0x454eac36 0x0cb8029a 0x9377a677 0x8bef7c78"
        StartCommand          = "/usr/sbin/rsct/sapolicies/db2/hadrV105_start.ksh db2inst1 db2inst1 SAMPLE"
        StopCommand           = "/usr/sbin/rsct/sapolicies/db2/hadrV105_stop.ksh db2inst1 db2inst1 SAMPLE"
        MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/hadrV105_monitor.ksh db2inst1 db2inst1 SAMPLE"
        MonitorCommandPeriod  = 21
        MonitorCommandTimeout = 29
        StartCommandTimeout   = 900
        StopCommandTimeout    = 900
        UserName              = "root"
        RunCommandsSync       = 1
        ProtectionMode        = 1
        HealthCommand         = ""
        HealthCommandPeriod   = 10
        HealthCommandTimeout  = 5
        InstanceName          = ""
        InstanceLocation      = ""
        SetHealthState        = 0
        MovePrepareCommand    = ""
        MoveCompleteCommand   = ""
        MoveCancelCommand     = ""
        CleanupList           = {}
        CleanupCommand        = ""
        CleanupCommandTimeout = 130
        ProcessCommandString  = ""
        ResetState            = 0
        ReRegistrationPeriod  = 0
        CleanupNodeList       = {}
        MonitorUserName       = ""
        ActivePeerDomain      = "UATec_db2h"
        NodeNameList          = {"host01"}
        OpState               = 1
        ConfigChanged         = 1
        ChangedAttributes     = {}
        HealthState           = 0
        HealthMessage         = ""
        MoveState             = [32768,{},0x0000 0x0000 0x00000000 0x00000000 0x00000000 0x00000000]
        RegisteredPID         = 0
resource 2:
        Name                  = "db2_db2inst1_db2inst1_SAMPLE-rs"
        ResourceType          = 1
        AggregateResource     = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
        StartCommand          = "/usr/sbin/rsct/sapolicies/db2/hadrV105_start.ksh db2inst1 db2inst1 SAMPLE"
        StopCommand           = "/usr/sbin/rsct/sapolicies/db2/hadrV105_stop.ksh db2inst1 db2inst1 SAMPLE"
        MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/hadrV105_monitor.ksh db2inst1 db2inst1 SAMPLE"
        MonitorCommandPeriod  = 21
        MonitorCommandTimeout = 29
        StartCommandTimeout   = 900
        StopCommandTimeout    = 900
        UserName              = "root"
        RunCommandsSync       = 1
        ProtectionMode        = 1
        HealthCommand         = ""
        HealthCommandPeriod   = 10
        HealthCommandTimeout  = 5
        InstanceName          = ""
        InstanceLocation      = ""
        SetHealthState        = 0
        MovePrepareCommand    = ""
        MoveCompleteCommand   = ""
        MoveCancelCommand     = ""
        CleanupList           = {}
        CleanupCommand        = ""
        CleanupCommandTimeout = 130
        ProcessCommandString  = ""
        ResetState            = 0
        ReRegistrationPeriod  = 0
        CleanupNodeList       = {}
        MonitorUserName       = ""
        ActivePeerDomain      = "UATec_db2h"
        NodeNameList          = {"host01","Unknown_Node_Name"}
        OpState               =
        ConfigChanged         =
        ChangedAttributes     =
        HealthState           =
        HealthMessage         =
        MoveState             =
        RegisteredPID         =
resource 3:
        Name                  = "db2_db2inst1_host01_0-rs"
        ResourceType          = 0
        AggregateResource     = "0x2028 0xffff 0x454eac36 0x0cb8029a 0x9377a673 0x4fc0f21c"
        StartCommand          = "/usr/sbin/rsct/sapolicies/db2/db2V105_start.ksh db2inst1 0"
        StopCommand           = "/usr/sbin/rsct/sapolicies/db2/db2V105_stop.ksh db2inst1 0"
        MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/db2V105_monitor.ksh db2inst1 0"
        MonitorCommandPeriod  = 10
        MonitorCommandTimeout = 120
        StartCommandTimeout   = 900
        StopCommandTimeout    = 900
        UserName              = "root"
        RunCommandsSync       = 1
        ProtectionMode        = 1
        HealthCommand         = ""
        HealthCommandPeriod   = 10
        HealthCommandTimeout  = 5
        InstanceName          = ""
        InstanceLocation      = ""
        SetHealthState        = 0
        MovePrepareCommand    = ""
        MoveCompleteCommand   = ""
        MoveCancelCommand     = ""
        CleanupList           = {}
        CleanupCommand        = ""
        CleanupCommandTimeout = 130
        ProcessCommandString  = ""
        ResetState            = 0
        ReRegistrationPeriod  = 0
        CleanupNodeList       = {}
        MonitorUserName       = ""
        ActivePeerDomain      = "UATec_db2h"
        NodeNameList          = {"host01"}
        OpState               = 1
        ConfigChanged         = 0
        ChangedAttributes     = {}
        HealthState           = 0
        HealthMessage         = ""
        MoveState             = [32768,{},0x0000 0x0000 0x00000000 0x00000000 0x00000000 0x00000000]
        RegisteredPID         = 0
resource 4:
        Name                  = "db2_db2inst1_host01_0-rs"
        ResourceType          = 1
        AggregateResource     = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
        StartCommand          = "/usr/sbin/rsct/sapolicies/db2/db2V105_start.ksh db2inst1 0"
        StopCommand           = "/usr/sbin/rsct/sapolicies/db2/db2V105_stop.ksh db2inst1 0"
        MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/db2V105_monitor.ksh db2inst1 0"
        MonitorCommandPeriod  = 10
        MonitorCommandTimeout = 120
        StartCommandTimeout   = 900
        StopCommandTimeout    = 900
        UserName              = "root"
        RunCommandsSync       = 1
        ProtectionMode        = 1
        HealthCommand         = ""
        HealthCommandPeriod   = 10
        HealthCommandTimeout  = 5
        InstanceName          = ""
        InstanceLocation      = ""
        SetHealthState        = 0
        MovePrepareCommand    = ""
        MoveCompleteCommand   = ""
        MoveCancelCommand     = ""
        CleanupList           = {}
        CleanupCommand        = ""
        CleanupCommandTimeout = 130
        ProcessCommandString  = ""
        ResetState            = 0
        ReRegistrationPeriod  = 0
        CleanupNodeList       = {}
        MonitorUserName       = ""
        ActivePeerDomain      = "UATec_db2h"
        NodeNameList          = {"host01"}
        OpState               =
        ConfigChanged         =
        ChangedAttributes     =
        HealthState           =
        HealthMessage         =
        MoveState             =
        RegisteredPID         =

Note the full path to the db2 scripts – that could be useful to know if you want to change things. Note that it’s not just start and stop scripts listed, but also the monitoring script. There is a verbose mode for this script that you can use if you see any issues with a failure being detected. See the first article in the references section for more details on this.
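
That output is long, so when I only want the script paths, a filter helps. This is a sketch with a hypothetical function name. It assumes the attribute = "value" layout shown above and that the commands themselves contain no "= " sequence.

```shell
# Read "lsrsrc -Ab IBM.Application" output on stdin and print each distinct
# start, stop and monitor command once.
list_tsa_scripts() {
    awk -F'= ' '
        # Match only the three command attributes, not *Period or *Timeout
        $1 ~ /^[ \t]*(StartCommand|StopCommand|MonitorCommand)[ \t]*$/ {
            gsub(/"/, "", $2)            # strip the surrounding quotes
            if (!seen[$2]++) print $2    # print each command only once
        }
    '
}
```

Run as lsrsrc -Ab IBM.Application | list_tsa_scripts. Against the output above it would list the three hadrV105 scripts and the three db2V105 scripts with their arguments.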

Reference

Great Article on TSA and HADR: http://www.ibm.com/developerworks/data/tutorials/dm-1009db2hadr/ Seriously, read this one.
Great article on TSA: https://www.ibm.com/developerworks/tivoli/library/tv-tivoli-system-automation/

Clustering a Table on an Index

October 7, 2014, 4:00 am

I have been playing a fair amount lately with clustering indexes and have been rehashing my views on reorging tables on an index. This is still a work in progress, but thought I’d share some details and see if others out there have any thoughts to share with me and others on it.

Clustering a Table on an Index

I’ve long been of the opinion that as long as I’m doing a table reorg in an OLTP database, I might as well reorg the table on an index. If there is a clustering index on the table, this is the easiest call in the world – of course the table will be reorged on that index – you can’t do it on any other index. But when you don’t have a clustering index, determining the right index is more difficult.

I have long reorged tables on the primary key when there is no clustering index defined. But with research I’ve done over the last couple of years, I have come around on this. I don’t see the performance gain there in many cases. When my tables are being accessed, how likely is it that they’ll be accessed by sequential primary key, especially in my WebSphere Commerce databases where the primary keys are almost always simply generated numbers in sequential order?

I’m increasingly convinced that my tables should be clustered on a low-cardinality index – I can get the most performance gain there. WebSphere Commerce databases have plenty of low-cardinality indexes that I’m not ready to drop outright. For a low-cardinality index, I am much more likely to be accessing a chunk of the data by the index value at a time. See my developerWorks article, Why low cardinality indexes negatively impact performance for a description of why clustering can make such a performance difference when these indexes are used to access a table.

What is a Clustering Index?

Before we dig in further, let’s first start with the basics. A clustering index is created when you specify the CLUSTER keyword on index creation (if it is a new index on an existing table, the table should then be reorged on the clustering index). DB2 then attempts to maintain the physical data in the table in the same order as the index. It respects a clustering index on LOAD, and even on insert DB2 attempts to physically place rows where they belong in the order of the clustering index. However, the clustering is not guaranteed, and there are a number of situations in which DB2 is unable to maintain the table in the order of the index. This is where regular reorgs come into play – assuming data is changed or inserted across the range of values in the index column or columns, regular reorgs will be needed to keep the table clustered on the index.

MDC is a concept introduced all the way back in DB2 version 8. It is guaranteed clustering over one or more columns. I’d love to get my hands on it, but WebSphere Commerce considers it a “customization”, so I won’t be talking much about it here.

Performance Impact on Reorg of Reorging on an Index

The reorg itself will run longer if you’re doing inplace reorgs on an index as opposed to a reorg that does not cluster the table on any index. The reason for this is that a reorg without specifying an index will scan the table starting at the end, and move things around as it makes sense, and at the end may have a range of pages that can be released from the table. It never has to scan an index. An inplace reorg on an index, though, must first clear a range of pages at the beginning of the table, then scan an index to determine what rows need to be in what order and move those rows into the proper positions. This is a lot more data to move around, and can take significantly longer. Especially in larger databases, the amount of time a reorg takes may be a limiting factor in the kinds of reorgs you do, especially considering that inplace reorgs are not fully online (see When is a Reorg Really Online?).

Determining Indexes for Clustering and Clustering Reorgs

The absolute best way to determine the ideal clustering indexes that you’re going to use is to look at your SQL workload on your database and find where a clustering index might help you there. This is time consuming as you consider each table and the SQL related to that table (which can be hard to find without a tool such as DBI’s Brother-Panther). It is also rigid, and likely requires you to define your indexes either as clustering indexes or to store the indexes you want to reorg tables on somewhere in an unchanging format. For DBAs supporting hundreds of databases with hundreds of tables in each database, this may be impossible.

Even when I consider only WebSphere Commerce databases, there is a vast variety in where clustering indexes are going to make the most impact. I have one client that uses multiple stores, but only one language, so there I’m looking strongly at clustering several tables on storeent_id. Another client doesn’t have many stores, but does have many languages that are being used. Certainly for the DESC tables, there I need to work more towards using language_id as a common clustering index. I wrote recently about how I helped a critical query using a couple of clustering indexes, and that was based on SOLR functionality that is only in the latest WCS Fix/Feature Pack, so changes between WCS versions can change things as well.

Some Queries in the Right Direction

What I really want is a query I can run that will tell me not just what index for any given table will make the most sense as the clustering index or index that I regularly cluster on using a reorg, but also how much of an improvement in my workload it will give me. There may be a way I can script some of that – explaining and/or advising the SQL in my package cache, and perhaps looking at the explain/advise tables? Sounds like a fun science project, but I’m not there yet.

What I’m starting out with is something a bit simpler. The two ksh scripts below make a start in this direction. The first simply describes all the indexes on the table, in a way I like better than describe indexes. I’m still playing with it:

$ cat desc_ind.sh
tabname=`echo $1 | tr '[:lower:]' '[:upper:]'`;
db2 connect to wc005p01;
db2 -v "select i.lastused, substr(indname,1,20) as indname, substr(colnames,1,30) as colnames, fullkeycard as fullkeycard, card as table_card, decimal(clusterfactor,10,5) as clustefactor, indextype, index_scans from syscat.indexes i join syscat.tables t on i.tabname=t.tabname and i.tabschema=t.tabschema join table(mon_get_index( i.tabschema, i.tabname, -2 )) mgi on mgi.iid = i.iid where t.tabschema='WSCOMUSR' and t.tabname='$tabname'  with ur";
db2 connect reset;

This script is executed simply with ./desc_ind.sh TABNAME. It could easily be altered to also specify a schema name – that’s just not something I need at the moment. The output of this script looks something like this:
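
One way that schema change could look is sketched below. The function name is hypothetical, the schema is an optional second argument that defaults to WSCOMUSR, and the query is the one from desc_ind.sh, except that the clusterfactor alias is spelled out in full.

```shell
# Build the index-description query for a table and, optionally, a schema.
#   $1 = table name
#   $2 = schema name (optional, defaults to WSCOMUSR as in desc_ind.sh)
desc_ind_sql() {
    tabname=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    tabschema=$(echo "${2:-WSCOMUSR}" | tr '[:lower:]' '[:upper:]')
    echo "select i.lastused, substr(indname,1,20) as indname, substr(colnames,1,30) as colnames, fullkeycard as fullkeycard, card as table_card, decimal(clusterfactor,10,5) as clusterfactor, indextype, index_scans from syscat.indexes i join syscat.tables t on i.tabname=t.tabname and i.tabschema=t.tabschema join table(mon_get_index( i.tabschema, i.tabname, -2 )) mgi on mgi.iid = i.iid where t.tabschema='$tabschema' and t.tabname='$tabname' with ur"
}
```

Inside the script, the db2 -v line would then become db2 -v "$(desc_ind_sql "$1" "$2")". Keeping the query in a function also makes it easy to print and inspect without connecting to anything.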

$ ./desc_ind.sh XSTORESKU

   Database Connection Information

 Database server        = DB2/LINUXX8664 9.7.7
 SQL authorization ID   = DB2INST1
 Local database alias   = WC005P01

select i.lastused, substr(indname,1,20) as indname, substr(colnames,1,30) as colnames, fullkeycard as fullkeycard, card as table_card, decimal(clusterfactor,10,5) as clustefactor, indextype, index_scans from syscat.indexes i join syscat.tables t on i.tabname=t.tabname and i.tabschema=t.tabschema join table(mon_get_index( i.tabschema, i.tabname, -2 )) mgi on mgi.iid = i.iid where t.tabschema='WSCOMUSR' and t.tabname='XSTORESKU'  with ur

LASTUSED   INDNAME              COLNAMES                       FULLKEYCARD          TABLE_CARD           CLUSTEFACTOR INDEXTYPE INDEX_SCANS
---------- -------------------- ------------------------------ -------------------- -------------------- ------------ --------- --------------------
09/15/2014 P_XSTORESKU          +XSTORESKU_ID                               3550167              3550167      0.85567 REG                     586462
09/04/2014 I_XSTORESKU01        +STLOC_ID                                      2681              3550167      0.18429 REG                         41
09/16/2014 I_XSTORESKU02        +UPC                                          15955              3550167      0.03371 REG                        334
09/16/2014 U_XSTORESKU01        +STLOC_ID+UPC                               3550167              3550167      0.15293 REG                     406417
01/01/0001 U_XSTORESKU02        +UPC+STLOC_ID                               3550167              3550167      0.00156 REG                          0

  5 record(s) selected.


DB20000I  The SQL command completed successfully.

This brings data together from several sources I want to know about when looking at my indexes in this context. Notice that I’m pulling the lastused to see if the index is used at all (in this case, there are some stupid indexes on the table), and also pulling from MON_GET_INDEX to look at the number of index scans.

The next script I’ve been playing with is designed to return the name of a single index that appears to be the best candidate for a clustering index without actually looking at the SQL. It finds indexes where the index cardinality is less than 50% of the table (not sure I’m happy with that number yet), but with a cardinality of greater than 1, because clustering a table on an index with a cardinality of one is patently ridiculous. Then it orders them in descending order of the number of index scans reported in MON_GET_INDEX, and returns only the index with the most scans for this table. This tells me that DB2 is already using this low-cardinality index even without the clustering, which tends to tell me that it may be causing potential performance problems.

 $ cat clus_ind.sh
tabname=`echo $1 | tr '[:lower:]' '[:upper:]'`;
db2 connect to wc005p01;
db2 -v "select substr(indname,1,20) as indname, float(fullkeycard)/float(card)
from syscat.indexes i join syscat.tables t on i.tabname=t.tabname and i.tabschema=t.tabschema join table(mon_get_index( i.tabschema, i.tabname, -2 )) mgi on mgi.iid = i.iid
where t.tabschema='WSCOMUSR' and t.tabname='$tabname' and float(fullkeycard)/float(card) < .5 and fullkeycard > 1 order by index_scans desc fetch first 1 row only  with ur";
db2 connect reset;
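
To see what that rule picks, the same logic can be applied by hand to the XSTORESKU figures from the earlier output. The awk sketch below is only an illustration of the selection rule, not a replacement for the query. The function name is hypothetical, and it expects one line per index in the form: index name, fullkeycard, table cardinality, index scans.

```shell
# Pick a clustering candidate from lines of:
#   indname fullkeycard table_card index_scans
# Same rule as clus_ind.sh: fullkeycard/card < 0.5, fullkeycard > 1,
# and the index with the most index scans wins.
pick_cluster_candidate() {
    awk '
        $3 > 0 && $2 / $3 < 0.5 && $2 > 1 && (name == "" || $4 + 0 > best) {
            best = $4 + 0; name = $1
        }
        END { if (name != "") print name }
    '
}
```

Fed the five XSTORESKU rows, only I_XSTORESKU01 (2681 distinct keys, 41 scans) and I_XSTORESKU02 (15955 distinct keys, 334 scans) pass the cardinality filter, and the function prints I_XSTORESKU02 because it has more scans. The primary key and the two unique indexes are excluded because their cardinality equals the table cardinality.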

The problem with this approach is that I’m not looking at what index could have the most impact – maybe one of these indexes would be used very heavily if only the cluster factor was higher, since DB2 tends to avoid the usage of low-cardinality indexes when a table is not well clustered over them.

What other factors should I be considering here? Does anyone have any suggestions on how to make this methodology (or some other method) better at finding the indexes on which a table would most optimally be clustered? I feel like my approach could be heavily improved on, but wonder how to do it without extremes like explaining all the SQL in my package cache.

I’m still sure I should be dropping the lowest cardinality indexes in many situations. Perhaps choosing a clustering index is really more applicable to the “medium” cardinality indexes.

↧

Querying Tables for an Activity Event Monitor

October 14, 2014, 4:05 am

I’ve been working with a developer to profile some SQL from a process he’s running. He has no insight into the SQL that the tool is using, primarily to insert data. I thought I’d share some of the details. I still think of this kind of event monitor as a “Statement Event Monitor”.

Setup

I started with the same exact setup from Using an Event Monitor to Capture Statements with Values for Parameter Markers.

Enabling Activity Data Collection

There are several levels at which you can enable data collection. You can do it at the session level. That didn’t work for me, because I’m collecting data from a web-based application (WebSphere Commerce), and I need to just capture whatever it does in a controlled scenario. That means my option was to capture this at the Workload Manager level. Now I don’t have a license for WLM. But even if you’re not licensed, all of your database activity goes into one of two default workloads. The workloads in my 10.5 implementation are:

SYSDEFAULTADMWORKLOAD
SYSDEFAULTUSERWORKLOAD

All user queries are automatically in that latter “SYSDEFAULTUSERWORKLOAD” if you have no other workloads defined, and by default that workload maps to the SYSDEFAULTSUBCLASS service subclass under SYSDEFAULTUSERCLASS.

As always, verify any licensing statements with IBM before counting on them, but my understanding is that you can enable activity data collection on these default classes without having licensed the WLM feature. To enable the activity data collection I want for my activity statement event monitor, I had to alter that service class like this:

db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data on all database partitions with details and values"

If you’re actually licensed for and using WLM, you would have to alter the appropriate service class to collect the data you need.

Presumably, there is overhead involved here, so you would not want to leave this enabled beyond the short period that you need to use it. To disable it, use:

db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data none"

Event Monitor

Once you have enabled the collection of activity data, you need to create an event monitor to actually receive that data. Here, many of the options that you may be familiar with on event monitors are available. My preference for later data analysis is an event monitor that writes to a table, but if you were dealing with a fragile performance situation, that does add additional overhead.

Let’s stop for a minute and talk about when you might want to capture statements with an event monitor and this methodology. In my particular situation, we’re concerned about exactly what SQL is executing, and how long that SQL is taking. The data load process in question is taking longer than we would expect to insert 120,000 rows. I ran this event monitor during specifically defined activity in our non-production environment – with other people asked to stay out of the environment. In a production environment, you are likely not just to impact performance of the main workload with an event monitor with this level of detail, but you’re also likely to generate an incredible amount of data – which may either fill up your available disk space or just be that much harder to get a reasonable analysis out of because there is so much data.

Here’s the syntax I used for this event monitor:

db2 create event monitor act_stmt for activities write to table manualstart

That part actually seemed pretty simple to me.

Not being familiar with this event monitor, I wanted to see what tables it actually created. In my case, I didn’t really care which tablespace the tables ended up in for these test/staging environments, but you could specify that as well. So looking at the tables created, I see:

db2 list tables for all |grep DB2INST1
ACTIVITYMETRICS_ACT_STMT        DB2INST1        T     2013-11-15-13.05.39.368590
ACTIVITYSTMT_ACT_STMT           DB2INST1        T     2013-11-15-13.05.38.306542
ACTIVITYVALS_ACT_STMT           DB2INST1        T     2013-11-15-13.05.38.655985
ACTIVITY_ACT_STMT               DB2INST1        T     2013-11-15-13.05.37.208514

Of course to actually activate it when ready to capture the data, I used:

db2 "set event monitor act_stmt state = 1"
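
Putting the pieces above in order, a capture window can be sketched as a small shell function that only prints the commands, so they can be reviewed before anything is run. The function name capture_steps is hypothetical, and the state = 0 command that stops the monitor is my addition; the other statements are the ones shown above.

```shell
# Print, in order, the commands for one capture window.
# $1 = event monitor name (defaults to act_stmt, the monitor created above).
capture_steps() {
    mon=${1:-act_stmt}
    cat <<EOF
db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data on all database partitions with details and values"
db2 "set event monitor $mon state = 1"
# ... run the workload being profiled ...
db2 "set event monitor $mon state = 0"
db2 "alter service class sysdefaultsubclass under sysdefaultuserclass collect activity data none"
EOF
}
```

Running capture_steps with no arguments lists the steps for the act_stmt monitor. Turning collection off again is the last step, so the overhead is limited to the window you are profiling.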

Querying Output

It took me a bit to figure out how to query the data to get what I wanted. What I finally came up with was this:

        select
                (select substr(stmt_text,1,70) as stmt_text from db2inst1.ACTIVITYSTMT_ACT_STMT as as1 where as1.EXECUTABLE_ID=as.EXECUTABLE_ID fetch first 1 row only),
                count(*) as NUM_EXECS,
                sum(STMT_EXEC_TIME) as SUM_STMT_EXEC_TIME,
                sum(TOTAL_CPU_TIME) as SUM_TOTAL_CPU_TIME,
                sum(LOCK_WAIT_TIME) as SUM_LOCK_WAIT_TIME,
                sum(ROWS_READ) as SUM_ROWS_READ,
                sum(ROWS_MODIFIED) as SUM_ROWS_MODIFIED
        from db2inst1.ACTIVITYSTMT_ACT_STMT as as
        left outer join db2inst1.ACTIVITYMETRICS_ACT_STMT av
                on as.appl_id=av.appl_id
                        and as.uow_id=av.uow_id
                        and as.activity_id=av.activity_id
        group by EXECUTABLE_ID
        order by 3 desc

Note that the substring length of the statement chosen was small to fit on the screen. Once I had looked at statements, I would likely query again to get the full text. The output looks like this:

STMT_TEXT                                                              NUM_EXECS   SUM_STMT_EXEC_TIME   SUM_TOTAL_CPU_TIME   SUM_LOCK_WAIT_TIME   SUM_ROWS_READ        SUM_ROWS_MODIFIED
---------------------------------------------------------------------- ----------- -------------------- -------------------- -------------------- -------------------- --------------------
INSERT INTO XPRICECHGSETTING (STORE, START_DATE, TRAN_TYPE, ITEM, NEW_      120111                14744               692148                    0                 6006               120111
SELECT INSTANCENAME,DBNAME,DBPARTITIONNUM,FACILITY,          (CASE REC          77                 7677              4486013                    0                    0                    0
DELETE FROM xpricechgsetting                                                     1                 1021               386150                    0                    0               120111
select i.INDSCHEMA, i.INDNAME, i.DEFINER, i.TABSCHEMA, i.TABNAME, i.CO           1                   89                37761                    0                31707                13358
SELECT TABNAME, COLNAME FROM SYSCAT.COLUMNS WHERE GENERATED='A' AND TA           2                   73                11406                    0                16454                    0
TRUNCATE TABLE wscomusr.X_GLINK_TEMP IMMEDIATE                                   1                   72                 3840                    0                   17                    0
TRUNCATE TABLE wscomusr.X_ITM_MASTER IMMEDIATE                                   1                   70                 2527                    0                   15                    0
CALL SYSIBM.SQLSTATISTICS (?,?,?,?,?,?)                                          4                   57                19488                    0                   20                    0
CALL SYSIBM.SQLCOLUMNS (?,?,?,?,?)                                               4                   56                26473                    0                    7                    0
CALL SYSIBM.SQLTABLES (?,?,?,?,?)                                                4                   53                 7045                    0                    2                    0
CALL SYSIBM.SQLFOREIGNKEYS (?,?,?,?,?,?,?)                                       4                   45                24777                    0                   15                    0
CALL SYSIBM.SQLPRIMARYKEYS (?,?,?,?)                                             4                   32                 7748                    0                    0                    0
TRUNCATE TABLE wscomusr.X_CATGPENREL_TEMP IMMEDIATE                              1                   24                 2259                    0                   16                    0
...

In this case, I’ve sorted on the execution time in descending order, because that’s the item of most concern to me. The others are interesting to me to be aware of too, and there are many more metrics in the ACTIVITYMETRICS table that might be interesting depending on what you’re looking for.

Here, I learned that the insert statements (the first statement) that this process is running are taking only about 15 seconds in the database (STMT_EXEC_TIME is reported in milliseconds). So the fact that this overall data load process is running 30-40 minutes is not based on slowness in the database.
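
To put that first row in perspective, the totals can be turned into a per-execution average. This is a minimal sketch, assuming STMT_EXEC_TIME is in milliseconds (which matches the roughly 15 seconds above); the helper name avg_exec_ms is made up for illustration.

```shell
# Average elapsed time per execution, in milliseconds.
# $1 = summed STMT_EXEC_TIME (ms), $2 = number of executions.
avg_exec_ms() {
    awk -v total="$1" -v execs="$2" 'BEGIN {
        if (execs <= 0) exit 1          # avoid dividing by zero
        printf "%.3f\n", total / execs
    }'
}
```

Calling avg_exec_ms 14744 120111 with the numbers from the INSERT row gives a little over a tenth of a millisecond per insert, which backs up the conclusion that the database is not where the time is going.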

A side note – the second statement is one that I’ve already identified as coming from a Tivoli Monitoring implementation that I do not control, and it just keeps popping up in my analyses, both using DBI’s Brother-Panther and manual analyses like this, and it’s really beginning to bug me.

↧

Looking Forward to IBM Insight

October 21, 2014, 4:00 am

IBM’s Insight conference (the conference formerly known as IOD or Information On Demand) is next week in Las Vegas. It’s a large conference that covers more than just DB2, but it is still DB2 geek heaven. I’ll be there – stop me to chat if you see me, I love to talk tech with anyone.

Last year was my first year at IBM Insight – I’ve generally been more of an IDUG gal myself, and IDUG NA is still my favorite conference. But this conference by IBM is still worthwhile. Here are a few tips for navigating Insight to make the most out of your time there.

General Navigation and Tips

Even if you’re staying at Mandalay Bay, there’s a lot of walking in a day. You’ll minimally have to walk through the casino (fun fact I was surprised by last year – slot machines don’t even take coins any more!), and then even within the conference center, there is a vast array of rooms that you may need to go to depending on where they put the DB2 tracks this year. Last year, they seemed to group them in one area upstairs, but I’ve heard in past years they’ve been downstairs, so we’ll see. They did seem to group them together, which is nice for the breaks to be able to find folks. Maps of the Mandalay Bay Conference center are available online for anyone looking to study up.

Comfortable shoes are an absolute must. I always have to reject my desire to wear whatever cute but questionably comfortable shoes I currently have in my closet, but given how tired my feet get in my Birkenstocks, I’m always happy I’ve stuck with them.

I also find a sweater or jacket of some sort a must – I freeze my tail off at most conferences.

Overall Schedule

The general conference schedule is available here: http://www-01.ibm.com/software/events/insight/agenda/schedule/

Highlights to note from that:

Certification hours start at noon on Sunday, and 10 am most weekdays
Breakfast is Mon-Wed from 6:45 to 7:45 but 7-8 on Thursday
Lunch is Mon-Thu at 12:30

I actually find the general schedule page helpful when planning my conference to have general ideas of what is when.

Certification

Certification is available this year at the greatly reduced rate of $30 per test for Information Management tests. No freebies this year, and no more Susan Weaver, who was always the one sitting outside the certification room in past years, but has left IBM. Will be interesting to see how the certification room runs without her.

Also disappointing to me – there are literally no new DB2 certification tests out of IBM in the last year, which means I likely won’t be gaining any certifications this year, since I pretty much ran the board last year at this conference. Where’s 10.5 Advanced DBA? Where’s the SQL Procedure Developer for 10.anything?

I highly recommend certification – it always makes me study outside of my box and investigate areas I might not think about every day.

The certification room will be Surf F, MBCC South, Level 2. Certification hours are:
Sunday, October 26 12:00 p.m. – 6:00 p.m. LAST seating is at 4:00 p.m.
Monday, October 27 10:00 a.m. – 6:00 p.m. LAST seating is at 4:00 p.m.
Tuesday, October 28 10:00 a.m. – 6:00 p.m. LAST seating is at 4:00 p.m.
Wednesday, October 29 10:00 a.m. – 8:00 p.m. LAST seating is at 6:00 p.m.
Thursday, October 30 8:00 a.m. – 4:00 p.m. LAST seating is at 4:00 p.m.

Link to the certification information at Insight: http://www-01.ibm.com/software/events/insight/agenda/certification/index.html

Keynotes

Kevin Spacey is the celebrity keynote this year (8:15 AM on Wednesday). The slick style of some of the keynotes can drive me a bit insane, but last year, I found myself inspired by host Jake Porway, and I’ve heard good things about hearing Kevin Spacey speak in this kind of a setting.

The Information Management keynote this year is at 11:30 on Monday, and features a geek celebrity – Grant Imahara from Mythbusters.

The number of keynotes and general sessions at this conference can be a bit overwhelming. I try to make the first day’s keynote, the celebrity keynote and the information management keynote. It’s amazing the random places I find inspiration at conferences, and while the keynotes can be one of those, I also sometimes feel like a captive audience for an infomercial.

I’m sure I’ll be tweeting about the keynotes/general sessions. Tweeting is one of the fun things about this conference – a community to share and to find new people to follow.

Expo

The Expo at Insight is a level of salesy-ness beyond the normal conference expo hall. But it’s a place to hang out and run into people, and there are usually a few fun things to see and lots of swag to bring back for the kids. Hey, some of my favorite pens have come from conferences, and I have to use a pen at least once a month or so.

Seriously, it is worth a tour around to see what is going on, even if the Expo at Insight is so big that some of the smaller businesses have given up on it in favor of hosting private get-togethers either in suites or off site.

Evening Events

The big evening event this year is a concert by No Doubt. While I’m not generally a fan of the level of loud that tends to accompany concerts, I’ll likely go – I can’t miss “Don’t Speak” and “Just a Girl”.

In general, take advantage of the networking opportunities – find people you were in sessions with to go up and chat with. This is how you build the network of colleagues that you can turn to for technical advice, for a job, and for friendship.

Volunteer Work

What better way to give your brain a break from the intense education than by volunteering some time for a good cause? Stop Hunger Now is offering this opportunity – they hope to package 125,000 meals. Sign up for a spot here: https://stophungernow.secure.force.com/events/SA_EVENTS__Home?id=70170000000lH98AAE

Books

One of the fun spots to visit is always IBM’s bookstore. It’s located in the Bayside Foyer – not sure if it’s right at the bottom of the escalators like last year, but that was a convenient location. There you can find more DB2 and database and big data and related books than you’ll ever see in one place in your life. There’s quite a schedule of book signings that Susan Visser has laid out on her blog.

It looks like the bookstore hours this year are:
Sunday, October 26 12:00pm – 8:00pm
Monday, October 27 10:00am – 7:00pm
Tuesday, October 28 7:30am – 7:00pm
Wednesday, October 29 7:30am – 5:00pm
Thursday, October 30 7:30am – 2:00pm

Susan’s blog also lists a few free giveaways, available in limited quantities at specific times.

Hands On Labs

Last year, I somehow managed not to get to a hands-on lab. This year, I’m going to make it to at least one to see how they work. Oddly enough, you cannot find the labs when searching for sessions, so instead log on to the newer conference planning tool using the ID IBM sent you after you registered for the conference. Then click on “Sessions” in the top navigation, click on the “Tracks” box, scroll down until you see the “Lab Center” section in the right column, and click on “IM: DB2 for LUW” or any other track you might be interested in.

I could not get the labs to come up in the session search tool – this was the only way I could find them after considerable experimentation.

Sessions

This is the fun part for me – going to see my fellow DBAs speaking and learn from all the awesome sessions that are on display. Once you pass a certain level, conferences provide the best education available – the ability to see what others are doing with the technology, to think about DB2 in different ways, to be inspired to do new things in better ways in your job and your career.

I inevitably end up in at least one session that is a very thinly veiled sales pitch. When planning my schedule, I always start with the speakers I know I’ll love, and then fill in around them – Melanie Stopfer, Matt Huras, and Dale McInnis tend to make the very top of my list. I was looking at the schedule this week, and saw a lot of people from outside IBM tend to be paired with IBMers.

There are, confusingly enough, two tools for scheduling your sessions this year. One appears to be new this year, and the other looks a lot like last year’s.

Everyone should have been sent a login to the newer one after registering for the conference. I recommend changing your password to something you’ll remember, especially if you plan to use the mobile app. I’ve found the search a bit hit or miss with it – once, searching for ‘BLU’ returned no results, repeatedly over the course of several hours, and there are tons of sessions on BLU. If it is not working for you, come back later, and maybe it will be working then. I hope they resolve whatever back end issues are causing that before the conference starts.

There is again a mobile app this year to help you plan and view your schedule. It’s the IBM Event Connect Center app on the Apple or Android app stores. It appears to also offer social networking features, so feel free to connect with me there. I wonder how many will be updating statuses there versus on Twitter? It looks similar to the newer scheduling tool.

I’ve found that at this conference, the best discussions are generally over dinners with groups of friends, or during the breaks between sessions. There are snack breaks most days where everyone stands around talking, and that’s one of my favorite parts.

Now if only they would start recognizing that some of us drink Diet Coke at breakfast. Maybe I should sell an advertising spot on the blog to whoever can provide me with a constant supply of cold Diet Coke during the conference. ;-)

↧

DB2 Basics: Backups of Data and Configuration

November 4, 2014, 4:00 am

Why Backup?

Backups are so ingrained into DBAs. They should really be the first and the last thing we think of and ensure we do properly. We do regular backups so we can get data back in case of some failure, be it human, software, or hardware. We do ad-hoc backups before and after upgrades or fixpacks, before and after major application or database structure changes. Frequently, backups are used to move data between servers.

Developing a Recovery Strategy

How often you back up depends on your recovery strategy. Developing a recovery strategy involves things like explicitly stating a Recovery Point Objective (RPO) and a Recovery Time Objective (RTO). These are decisions that may be outside of the DBA’s sphere of influence. Often a DBA has to ask questions and listen carefully to understand what is important about the recovery objectives for a particular database, along with helping a business understand the budgetary implications of decisions they make in this area. Determining such a strategy is beyond the scope of this article.

DB2’s Backups

The specific files that are called backups by DB2 are binary files that represent every bit of data in a DB2 database. They are as big as the entire database as a result. Since they are at the bit level, they also cannot be restored between operating systems that may use different representations of characters. Restores are not supported between Windows and Linux/Unix. Restores are not supported between big-endian Linux/UNIX and little-endian Linux/Unix. See this IBM Knowledge Center page for details on cross-platform restrictions: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.ha.doc/doc/c0005960.html?lang=en

One of the questions I get most frequently from non-DBAs is “so are they like MySQL dumps?” Nope, not at all. There are other tools you could combine to generate the SQL to create every object in the database (db2look), and then to export/import all the tables in the database (db2move), but they are not what we call backups.

Backup Options

There are a lot of options when taking a backup. I’m not going to cover all of them in this article, but focus on a few that are most basic.

Online vs. Offline

One of the most basic backup choices is whether the database will be online or offline while the backup is being taken. If an offline database backup is taken, no one will be able to access the database while the backup is running, and you must force off all connections and deactivate the database before the backup can be taken.
It is important to consider whether your database is even enabled for online backups. Only databases that are using archive logging allow online backups. If a database uses the default of circular logging, you will not be able to take an online backup (SQL2413N Online backup is not allowed because the database is not recoverable or a backup pending condition is in effect). To understand if a database is using circular logging, you can use:

$ db2 get db cfg for SAMPLE |grep LOGARCHMETH
 First log archive method                 (LOGARCHMETH1) = DISK:/db_arch_logs/SAMPLE/
 Second log archive method                (LOGARCHMETH2) = OFF

Replace ‘SAMPLE’ above with your database name. In the above example, the database is using archive logging and not circular logging. If both LOGARCHMETH1 and LOGARCHMETH2 are set to OFF, then the database is using circular logging, and online backups will not be possible. There are other things to consider when choosing a logging method, so be sure to research the implications before changing your archiving method. Among other things, if you enable archive logging, you will have to manage deleting old transaction log files.
Taking an online database backup is done by including the ONLINE keyword in the proper place in the BACKUP DATABASE command. An offline backup is taken if you do not use the ONLINE keyword.
If it is easy to take the outage for an offline database backup, then choose an offline one. They are slightly easier to restore from when you don’t need to rollforward.
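
The LOGARCHMETH check above can also be scripted. This is a minimal sketch that reads the db2 get db cfg output on standard input and reports archive or circular; the function name logging_mode is hypothetical, and it assumes the output format shown in the example above.

```shell
# Read "db2 get db cfg" output on stdin and report the logging mode.
# Prints "archive" if either LOGARCHMETH parameter is set to something
# other than OFF, "circular" if both are OFF, and "unknown" (exit 1)
# if neither parameter appears in the input.
logging_mode() {
    awk -F'= *' '
        /\(LOGARCHMETH[12]\)/ {
            seen = 1
            value = $2
            sub(/[ \t\r]+$/, "", value)      # trim trailing whitespace
            if (value != "" && value != "OFF") archive = 1
        }
        END {
            if (!seen) { print "unknown"; exit 1 }
            print (archive ? "archive" : "circular")
        }'
}
```

It would be used as db2 get db cfg for SAMPLE | logging_mode, and a result of circular means an online backup attempt will fail with the SQL2413N error mentioned earlier.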

Backup Types

The types of backups that can be taken include:

Full – the entire database is backed up
Incremental – changes since the last full backup are backed up (restore requires a full image and the incremental image)
Delta – changes since the last full or incremental or delta backup are backed up (restore requires a full image and all incremental and delta images since the full image)

I prefer to take a full backup whenever possible. I use incremental backups sometimes to reduce the time it takes to perform a restore when either space or backup duration prevent me from taking more frequent full backups. I don’t like delta backups in most scenarios because I worry about managing all the files needed for a restore.
To take an incremental backup, the INCREMENTAL keyword is used in the BACKUP DATABASE command. To take a delta backup, the INCREMENTAL DELTA keywords are used in the BACKUP DATABASE command. If neither INCREMENTAL nor DELTA is specified, then the backup will be a full backup.
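
To make the file-management worry concrete, here is a minimal sketch that works out which images a restore to the most recent backup depends on. It assumes an incremental image is cumulative since the last full backup, so deltas taken before the most recent incremental are already covered by it. The function name restore_chain and the F/I/D notation are made up for illustration; as I understand it, DB2 works this out itself from the recovery history, so treat this as an illustration of the dependency logic only.

```shell
# Given a backup history as arguments, oldest first
# (F = full, I = incremental, D = delta), print the 1-based positions
# of the images needed to restore to the most recent backup.
restore_chain() {
    n=0; full=0; incr=0; deltas=""
    for t in "$@"; do
        n=$((n + 1))
        case "$t" in
            F) full=$n; incr=0; deltas="" ;;   # a new full backup restarts the chain
            I) incr=$n; deltas="" ;;           # cumulative: supersedes earlier deltas
            D) deltas="$deltas $n" ;;          # each delta after that is needed
            *) echo "unknown backup type: $t" >&2; return 1 ;;
        esac
    done
    if [ "$full" -eq 0 ]; then
        echo "no full backup in history" >&2
        return 1
    fi
    chain="$full"
    if [ "$incr" -gt 0 ]; then
        chain="$chain $incr"
    fi
    echo "$chain$deltas"
}
```

For a week of F D D I D D, restore_chain reports images 1, 4, 5 and 6: the full, the incremental, and the two deltas after it. Every one of those files has to be available at restore time, which is why I shy away from long delta chains.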

Backup Scope

Backup can capture the entire database, or it can just capture a single tablespace or subset of tablespaces. A subset of tablespaces should only be used in specific situations. To back up only a subset of tablespaces, the TABLESPACE keyword is used in the BACKUP DATABASE command.

Backup Compression

You can optionally tell DB2 to compress the backup while it is being taken. This is a good option if you are taking backups to disk. If you have a dedup device or other location that applies its own compression to the backup files, then you’ll want to avoid compression when the backup is taken – compression on top of compression is not the best idea. To compress the backup as it is taken, use the COMPRESS keyword in the BACKUP DATABASE command. You can also compress the backup after it is taken using gzip or your favorite compression command. This method has the advantage of reducing the backup duration, but it requires more disk space, more overall time, and may require more time on restore as the backup image has to be uncompressed before it can be used.

Backup Syntax

See the IBM Knowledge Center for full backup command syntax.
My favorite backup syntax is usually:

db2 backup db sample online to /db_bkup/full compress without prompting

There are many databases that only get offline backups at upgrade time, and at no other time.
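
The options covered above combine in a fixed keyword order, which a small helper can make explicit. This is a minimal sketch that only prints the command so it can be reviewed before it is run; the function name backup_cmd and its argument order are hypothetical, and it covers only the handful of options discussed in this article.

```shell
# Print a BACKUP DATABASE command for the given options.
# usage: backup_cmd DBNAME TARGET_DIR [online|offline] [full|incremental|delta] [compress]
backup_cmd() {
    db=$1; dir=$2; mode=${3:-offline}; btype=${4:-full}; comp=${5:-}
    cmd="db2 backup db $db"
    if [ "$mode" = "online" ]; then
        cmd="$cmd online"                       # omitted means an offline backup
    fi
    case "$btype" in
        full) ;;                                # no keyword means a full backup
        incremental) cmd="$cmd incremental" ;;
        delta) cmd="$cmd incremental delta" ;;
        *) echo "unknown backup type: $btype" >&2; return 1 ;;
    esac
    cmd="$cmd to $dir"
    if [ "$comp" = "compress" ]; then
        cmd="$cmd compress"
    fi
    echo "$cmd without prompting"
}
```

Calling backup_cmd sample /db_bkup/full online full compress prints the same command as my favorite syntax above, and leaving off the optional arguments prints an offline, full, uncompressed backup command.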

What is in a DB2 Backup and What isn’t

By default, online DB2 backup images include the transaction log files that were written to while the backup was being taken. No other transaction log files are included in the backup image. Both the structure and data of all objects in the database are included in the backup image. The backup also includes the database history file and the database configuration (though these are only part of restore in specific circumstances).
I like to also regularly copy other data to ensure I have it in addition to what’s in the backup itself. Data I collect regularly includes:

db2look extracting SQL to recreate objects and permissions
list of tablespaces
list of tablespace containers
node directory
database directory
dbm cfg
db cfg
db2 registry (db2set)

Some of these items are included in database backups, but it may be hard to extract only a small part of the data from a backup. I don’t want to have to restore the whole database to see the syntax to create a single dropped table or to get the database configuration.

↧

IBM Insight 2014 – Brain Dump

November 6, 2014, 4:00 am

Whew, what a week. I’m coming out of the conference with at least 20 ideas for new blogs, including a new “Internals” series that I want to work on. I call this post my Brain Dump as I share out random things I learned and ideas.

Saturday

For me, the conference started on Saturday. As an IBM Gold Consultant, I’m eligible to go to what’s called “CAC” – which I think stands for Customer Advisory Council. A chunk of what was discussed was covered by an NDA, so I cannot go into much detail. I think I can tell you it exists, though. It was a preview and a more in-depth version of some of the other sessions at the conference, with slightly less vague mentions of some of the things IBM is working on. It was my first year there, and I’m probably the youngest Gold Consultant or person eligible for that. It felt in some ways like being a newbie again, but the other consultants and partners are really nice and welcoming. Unfortunately, lunch was rather non-vegetarian, and when I asked about vegetarian food and they said there was nothing, I ate my salad and cookie, and slipped out to buy some supplementing food at a little nearby store. Dinner was fun on Saturday night, on IBM’s dime – sat with some Gold Consultants and IBM execs and chatted. Again, pretty close to the youngest in the room, and felt rather on the junior side.

Sunday

On Sunday, I took it a bit easy, in preparation for the week. Had lunch with a couple of sales guys from my new employer and a few of their contacts, and the evening held a Gold Consultant reception and the Expo hall opening.

Monday

Monday started out with the general session, which was generally just fine, and fun to tweet about. After that was an interesting but not airtight session on converting information about SQL statements into monetary costs (The Monetary Cost of SQL Statements with Christopher Godfrey and Mike Galtys), which I was interested in not so much for the ability to charge back, but for the ability to quantify to a client how much working on their SQL has saved them. I don’t think I could go that far with it without some work, but it’s an interesting idea that is still percolating.

The IM Keynote was on Monday, and it featured Grant Imahara from Mythbusters. He was ok, but the scripted banter at these things always gets me. It was cool to have someone who worked on R2D2 and a giant Lego ball that my kids would go nuts over there – he had geek credentials, anyway.

One of the announcements on Monday was about DashDB which is an offering on BlueMix and maybe Softlayer that delivers BLU as a service. It might be interesting to play with, and from bits and pieces I’ve heard about it, I think it’s pretty well engineered on the back end.

I went to two other sessions on Monday – one was Matt Huras’s session on BLU (IBM DB2 with BLU Acceleration in 2014: The Latest News from the Lab). An excellent session which I unfortunately came into late, but still managed to learn a ton about shadow tables (they’re using replication to move the data to the shadow tables). A lot of the session was about Cancun – they’ve increased performance in several areas, including optimizations for group by, better CTE support, increased update performance, filtering on FKs, better execution plans in nested joins, better use of PK indexes at query runtime, better varchar compression, the support of user-maintained MQTs on columnar tables, the support of the MERGE statement, HADR with BLU.

I closed Monday out with Berni Schiefer’s session IBM DB2 Performance Update. Berni had a lot of interesting things to say – the two biggest that stood out to me were the fact that we should now be aiming for 8-16GB of memory per core on DB2 servers, and a couple of interesting things I want to blog in more depth about – the use of GPUs for DB2, and the base performance impact of DB2 10.5 over 10.1. He also talked about how BLU uses CPU at a higher rate than previous releases of DB2. He said that in the past, 80% CPU utilization was a sign of contention, but with BLU it is just expected that DB2 will use almost as much CPU as you can throw at it. There were other great things too – I’ve got at least a page of notes, and tagged this as one of the presentations I definitely want to download. (Note: presentation not available for download yet as of this writing!)

Tuesday

Tuesday dawned awfully early after a late night on Monday, and I skipped breakfast and the general session. That’s the first general session I’ve skipped. My first session was What’s New in IBM DB2 BLU Acceleration: Tips and Insights into the Latest 2014 Enhancements with Sam Lightstone and Guy Lohman. This session had some repeats of stuff I’d already heard, but some interesting things as well. I learned that while BLU was already faster than row based tables for complex workloads, in Cancun, it’s even faster – with base 10.5 being 10-50 times as fast as row based tables, while Cancun is 35-73 times as fast. Focus for Cancun was placed on getting inserts and loads faster -with ELT (which they seem to be using now instead of ETL) improvements of 49-112 times and insert from subselect up to 35% faster. The caveat to these is that such operations are still slow if they would have relied on the tertiary indexes that cannot be created on BLU tables. Accessing a single row by PK is still 50% to 2 times slower on BLU than on row based tables. One interesting concept that came across in this session was thinking of traditional indexes as an approximation of a columnar store – I’d never thought of it that way. Also, they laid out the expectation that for BLU systems, memory should be about 5% of the data size.

The next session is one of my favorites of every conference – the panel (Ask the IBM DB2 Technical Leadership Team with Phil Downey, Matt Huras, Leon Katsnelson, Sam Lightstone, and Berni Schiefer). This one was on the smaller side, but I always learn so much from the questions that others ask. I was torn in this time slot as Melanie Stopfer was giving her only presentation of the conference at the same time. How in the world could Melanie Stopfer, the favorite presenter of so many, have only one session?!?! I realize scheduling is so hard at these things, and I want to go to so much, but Melanie should have at least two or three sessions. The most interesting thing that stuck out to me in the panel was a discussion about parallelization of backups. Backups are done in parallel only by tablespace – so one large tablespace can cause backups to run long. The way to get around this is to split up into more tablespaces – maybe even using range partitioning to split large tables across tablespaces. So sad I had to skip out early to make a lunch meeting.
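To make the point about tablespace-level parallelism concrete, here is a minimal sketch in Python. It is a toy model, not how DB2 actually schedules backup work: it assumes each tablespace is handled by a single worker, every worker reads at the same rate, and tablespaces are handed out largest-first. The function name, tablespace sizes, worker count, and throughput are all made up for illustration.

```python
import heapq


def estimate_backup_hours(tablespace_gb, workers, gb_per_hour_per_worker):
    """Estimate elapsed backup time when parallelism is per tablespace.

    Assumption (hypothetical model): a tablespace is never split across
    workers, so the busiest worker determines the elapsed time.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if gb_per_hour_per_worker <= 0:
        raise ValueError("throughput must be positive")

    # Min-heap holding the GB already assigned to each worker.
    loads = [0.0] * workers
    heapq.heapify(loads)

    # Hand the largest tablespaces out first, always to the least-loaded worker.
    for size in sorted(tablespace_gb, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + size)

    return max(loads) / gb_per_hour_per_worker


# Same 1,000 GB of data, laid out two different ways (illustrative numbers).
one_big_tablespace = [800, 50, 50, 50, 50]
split_tablespaces = [200, 200, 200, 200, 50, 50, 50, 50]

print(estimate_backup_hours(one_big_tablespace, 4, 100))  # 8.0
print(estimate_backup_hours(split_tablespaces, 4, 100))   # 2.5
```

In this toy model the single 800 GB tablespace pins the elapsed time at 8 hours no matter how many workers are available, while splitting the same data across more tablespaces lets the four workers finish in 2.5 hours.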

On Tuesday afternoon, I went to my first lab – I hadn’t managed to do one when I went last year. I found it absolutely awesome. I chose a lab on HADR and TSAMP – a bit worried that it wouldn’t be advanced enough, since I consider myself a bit of an expert in this area. But I have always wanted to better understand the TSAMP structures behind db2haicu, so I went for it. The lecture part of this scheduled lab was a bit on the basic side for me. But the lab included three options – basic, intermediate, and advanced. I skipped straight to the advanced one, and wow, was it good. Exactly the level of detail I wanted and the commands to do everything. Because I took it not as a drop-in lab, but as a scheduled one, I got to keep the book. I plan to blog about the stuff I learned there. Just goes to prove to me that there is always something new to learn, even in areas of expertise.

I did go to the No Doubt concert on Tuesday night, but I’m not a big fan of LOUD or strobe lights, so it wasn’t really for me. I feel for the performers at these conferences with audiences that are only half into them.

Wednesday

Wednesday’s general session was really good. Captain Phillips was great and would have been the high point of the general session speakers if it weren’t for Kevin Spacey, who was truly incredible. No odd scripted banter with IBM execs for these guys – just good speaking and good humor. Kevin Spacey somehow made the tie-ins to the conference themes not seem wooden like many speakers do, and dropped a few F-bombs without seriously offending anyone that I talked to. He played the audience so incredibly well – I’ve never seen anything like it. If you get a chance to see him speak – go! As a side note, I’ve now become a House of Cards addict.

IBM also announced a partnership with Twitter surrounding Twitter’s data that could prove interesting if done right. I’m interested to see what that turns into.

I went to the analytics panel (Architect Panel: Distinguished Engineers on Analytics), which was interesting, but did not get a ton out of it since I’ve been so focused on OLTP databases for the past 8 years. One interesting suggestion from a participant there, given frustration with the Knowledge Center and the horrible search issues there, was to feed the DB2 documentation into Watson and see how searches would work then. Great idea – I hope the right people at IBM were listening. Think about how that would prove to people who work with data the power of the Watson platform.

I did a drop-in lab on shadow tables that was excellent – “OLTAP – The New Frontier of Database Workload Powered by DB2 with BLU acceleration”. It was great and thorough – just what I was looking for to really understand the new BLU shadow tables introduced with Fixpack 4 (Cancun). I’d love to get these going in a real environment and blog about it, and may do it in a sandbox and blog about that.

The sleeper stand-out session of the conference for me was “Hidden Gems: The Top 10 Most Underused Functions in IBM DB2”. I was glad I chose this in a time slot where there were literally 4(!) other sessions I also wanted to go to. I missed “Fun with SQL” – sad, that’s a fun session. And I so wanted to see Chip speak in another session. But this session should have been titled “9 things Ember can blog about” – and it’s 9 only because one of them I’ve already covered in a blog post. So many great things here to base blog entries on.

Thursday

Thursday was the best technical day of the conference for me. Packed full of great technical sessions.

For me, any day that starts with a Matt Huras session is just awesome. The guy knows and can explain so much. He goes to the right depth, and doesn’t over-explain or under-explain. He gives me a strange look now when I go to his internals sessions at every single conference – twice a year. He tweaks the presentation a bit, but what I find is that first, I change between conferences – what I’m interested in or what I’m working on – so I hear and see different things as important each time. Also, the questions that people ask tell me about what people are interested in and lead to whole new tangents. My big epiphany this conference is that there are so many things from this session I can blog about! I now have a whole series of Internals blogs planned. I feel absolutely idiotic for not thinking of this sooner – one of the things I always lament about not being able to find anywhere other than a conference is the material covered in this internals session. Well there ya go – I know enough of many of the topics now to write well about them. I’m so excited, I want to jump in and start writing these!

I went to Ken Shaffer and Dom Turchi’s session – IBM DB2 Linux, UNIX, and Windows Online Version Upgrades. I’ve got a client interested in an online upgrade, so this was a good one for me to see. Ken’s also a friend. It was interesting to see the way they were able to get through an upgrade with just 15 minutes of downtime.

One of my favorite newer IBM speakers is Michael Kwok. He’s the head of data warehouse performance for DB2, and I went to see his session on “Enhancing your Operation Analytics Warehouse Performance Experience with IBM DB2 10.x”. He reminded me that just moving to 10 from 9.7 gets about a 10% performance improvement out of the box. He also went through a lot of the details as to why the performance is better. I really enjoyed a performance session that wasn’t just about BLU, but talked about more details on how compression helps performance almost across the board, how prefetching was improved with 10, and how the optimizer was improved as well. He also gave me some ideas on how I could better make use of charts on the blog, and he had some interesting ways of displaying some of his data.

On Thursday, I went to the in-person version of the Lab I had gone to on shadow tables. This let me see the presentation part of the lab, and ask a few questions, as well as get a copy of the lab book. I left early since I had already done the lab portions, and wanted to get to the last two sessions of the day.

I like to hear Dale McInnis talk as well. I went to his session – Building a Continuously Available System with IBM DB2. I spend a fair amount of time with HADR and db2haicu/TSAMP and like to know the options for availability for my clients. Dale also has an ear on what IBM is working on with their big WebSphere Commerce clients, so it’s interesting for me to hear what little I can get in that area as well. He talks at a high level about things to consider for availability – RTO vs RPO and all that – and also digs down into the details of some solutions. It sounds like he is heavily involved with the availability choices for DashDB and SQLDB on Bluemix, and those are interesting to hear. I was interested to hear about the GDPC option, which would only be done with lab services from IBM involved, but which sounds fun to figure out and set up.

For my final session of the conference, I went to “A Close Look at Deploying IBM DB2 in a Shared Multitenant Environment”. This was a good session to think about the various ways of doing multi-tenant and the details and drawbacks. Using RCAC and table partitioning, you could even serve multiple clients from the same database. Think about backing them up separately, partitioning what data they can see, etc – very interesting.

Overall Experience

This year was a very different experience for me than last year. Daily page views on my blog have more than tripled in that time, and at least 10 people came up to me to introduce themselves and tell me that they read the blog – a truly wonderful feeling for me. There was even one person who was going down the big escalators as I was going up who pointed at me, got off, and then came back up to talk to me.

The only way to get hard copies of the lab books for the drop-in labs was to wait in line on Thursday afternoon. I missed most of a session to do this for a couple that I really wanted to see. It seems like IBM could provide soft copies of them on the lab machines that one could just email to oneself or upload to Dropbox or something.

I knew the big scale and most of the routine, so it was much more fun – I didn’t have to resort to my technique of finding Ian and standing next to him at all – though I met some interesting people that way last year.

Early in the week I was disappointed with the technical content (though there were still some good sessions). I also felt oddly like in some time slots there was nothing too interesting, while in others there were just so many incredible things I had trouble picking which one to go to. I had never before, at any conference, had 5 sessions I wanted to see in the same time slot. Until this conference, I had also never skipped a session just to talk with people. I know that scheduling these things must be a nightmare because so many people have so many different opinions on what is best – you’re never going to make everyone happy. Wednesday and Thursday were the best in technical content in my opinion – though there was at least one excellent session each day. I would love to see more of Melanie Stopfer while keeping up the number of sessions from Matt Huras, Michael Kwok, and Dale McInnis. I would love to see a little less on BLU – I know they’re pushing it, but there are still workloads out there that cannot benefit from it. I would also love to get my own sessions approved, but that’s a whole other ball of wax, I’m sure – I’ll learn how to get there one of these years.

Networking

Over time, you develop friends at conferences and it’s so much fun to just hang out with them. I ate meals with so many interesting people, and realized when submitting my expense report that I somehow only managed to pay for two or three meals the whole week.

I escalated to a new level this conference, making some interesting connections with IBM executives. I still enjoy talking to technical folks the most, but you get good advice and ideas in all kinds of places.

Conference Guide

The hard-copy conference guide is nearly useless without speaker names. I understand IBM wants to be able to change speakers at a moment’s notice, but the speaker makes the session, and there are some speakers I would go listen to no matter what the topic because I know I would learn from them. I only used the guide when the mobile app wouldn’t work.

Ah, the mobile app. The favorite target of complaints all week. My biggest problem with it was that it did not keep a local copy of my schedule, so if the wifi went down (which it did, several times) and/or cell data service wasn’t fully working, I didn’t know what I had planned. Other suggestions would include:

A way to get sessions I’ve signed up for on my own calendar – doesn’t seem like it would be that hard – a calendar I could subscribe to online
Not going back to the home screen when my phone times out and goes to sleep or when I switch to another app – I’d like to go right back where I was – why is it running in the background otherwise?
When showing me sessions I have signed up for, going to the current day instead of making me scroll through everything since Sunday
An easy way to look at only Labs
Ability to overlap sessions with the end of labs – I don’t always stay for the entire lab session
Ability to filter what’s happening “NOW” by track
Keynotes and Meals with locations
Ability to add custom appointments to the schedule

Sessions Online

You can see replays of the general sessions and the keynote sessions online at: https://ibm.biz/insightGOreg

Downloading Presentations

If you attended the conference, you can download the presentations. Go to https://connections.ibmeventconnect.com/files/app#/communityfiles. If prompted, sign in with your Insight Conference Connect user id and password. To find a presentation, you must have the number it goes by in the program. I found that half or less of the presentations I wanted to download were available. I’d love a download by track option too. Here’s a thought – integrate with the schedule tool and let me download everything on my schedule!!
