Wednesday, July 16, 2014

DB2 for z/OS Buffer Pool Enlargement is NOT Just an Elapsed Time Thing

A couple of weeks ago, I got a question from a mainframe DB2 DBA about the impact of DB2 buffer pool enlargement on application and system performance. This individual had requested an increase in the size of a buffer pool on his system, and the system administration team had agreed to implement the change; however, one of the system administrators told the DBA that while the larger buffer pool would improve elapsed times for application processes accessing objects assigned to the pool, no CPU efficiency gains should be expected.

I am regularly surprised at the persistence of this notion that bigger DB2 for z/OS buffer pools do not drive CPU savings. Let me see if I can set the record straight in clear terms: YES, THEY DO. I'll explain herein why this is so, and I'll provide a means whereby you can measure the CPU -- yes, CPU -- impact of a DB2 buffer pool size increase.

I don't think that anyone would dispute that a larger buffer pool will decrease I/O activity (especially read I/O activity) for objects (table spaces and/or indexes) assigned to the pool. The disagreement is over the impact of I/O activity on the CPU consumption of DB2-accessing applications, and on the CPU consumption of DB2 itself. What I've found is that some people believe that a System z server's I/O assist processors handle ALL of the processing associated with I/O operations (this view seems to be more widely held by people who have been working with mainframes for a long time, perhaps because I/O assist processors were a more talked-about feature of the platform back in the day). This is not true. I/O assist processors offload from general-purpose engines a substantial portion -- but not all -- of the work involved in reading and writing data from and to disk. I/O assist processors are great, and they are one reason that System z has long excelled as a platform for I/O-intensive applications, but general-purpose engines (and zIIP engines, for that matter) still have to shoulder some of the read/write load.

Thus it is that a reduction in I/O activity will reduce CPU consumption on a mainframe system. If you enlarge a DB2 buffer pool (to reduce disk reads and writes) AND you change that buffer pool to be page-fixed in real storage (via -ALTER BUFFERPOOL bpname PGFIX(YES)), you'll get even more in the way of CPU savings, because one of the things that a general-purpose engine typically has to do in support of a DB2 I/O operation is fix in memory the page holding the DB2 buffer in question (the one into which data will be read or from which data will be written) until the I/O action is complete, after which the page is released (i.e., made pageable again). This is done so that the buffer won't be paged out to auxiliary storage in the middle of the I/O operation. When a pool's buffers are fixed in memory from the get-go (true when PGFIX(YES) is in effect), the page-fix/page-release actions formerly needed for I/Os are not required, and CPU consumption is reduced accordingly. In a DB2 10 or 11 for z/OS system, you can get even more CPU efficiency benefits from page-fixed buffer pools, because in those environments DB2 will request that a page-fixed buffer pool be backed by 1 MB page frames, versus 4 KB page frames (the LFAREA parameter of the IEASYSxx member of PARMLIB specifies the amount of a z/OS LPAR's memory that is to be managed in 1 MB frames). The 1 MB page frames save CPU by improving the efficiency of virtual storage to real storage address translation.
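
By way of illustration, the pool enlargement and page-fixing described above come down to commands like these (BP5 and the VPSIZE value are hypothetical -- substitute your own pool name and target buffer count):

-ALTER BUFFERPOOL(BP5) VPSIZE(200000) PGFIX(YES)
-DISPLAY BUFFERPOOL(BP5) DETAIL

Keep in mind that a PGFIX(YES) specification takes effect the next time the pool is allocated (for an in-use pool, that often means the next time DB2 is recycled), and that -DISPLAY BUFFERPOOL with the DETAIL option is a handy way to check a pool's attributes and read I/O activity before and after the change.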

OK, on now to measuring the effect of a buffer pool change (such as enlarging a pool, or page-fixing the buffers in a pool) on application and DB2 CPU efficiency. For the application-level CPU effect, use Accounting Long Reports that can be generated by your DB2 for z/OS monitor (depending on the monitor that you use, these might be called Accounting Detail Reports). Input to these reports is the data contained in records generated when DB2 accounting trace classes 1, 2, and 3 are active (these records are typically written to SMF). With those trace classes active (and BEFORE you've implemented the buffer pool change), do the following:
  • Generate an Accounting Long Report for a particular day of the week (e.g., Tuesday) and a particular time period. That time period could capture a "peak" of system activity (e.g., 9-11 AM), or it might be an entire 24 hours -- go with the FROM and TO times that are of interest to you. You can have the DB2 monitor aggregate information in the report in a variety of ways (using an ORDER or GROUP specification -- or something similar, depending on the monitor that you use -- in the report control statement in the SYSIN part of the JCL for the report-generating job). Use the aggregation level (or levels -- you could choose to generate several reports) of interest to you. Want to see the CPU impact on the overall application workload for the DB2 system? Have the data aggregated at the DB2 subsystem level. Want to see the impact for different subcomponents of the workload (e.g., CICS-DB2 work, DRDA work, call attach facility batch work, etc.)? Have the data aggregated by connection type. Note that, by default, a DB2 monitor will typically aggregate accounting information by primary DB2 authorization ID within DB2 plan name -- an aggregation that I usually find to be not very useful.
  • Implement the buffer pool change.
  • Generate an "after" Accounting Long Report, for the same day of the week (e.g., Tuesday) and the same time period (e.g., 9-11 AM) as for the "before" report. Use the same aggregation specification as before (e.g., at the DB2 subsystem level). Looking at the "before" and "after" reports, find the average in-DB2 CPU time (also known as the average class 2 CPU time), which is the average CPU time for SQL statement execution. Note that this time will be in two fields: general-purpose engine time, and "specialty engine" CPU time (this is typically zIIP engine time). Do NOT overlook the specialty engine time -- for some workloads, particularly the DRDA workload that comes through the DB2 DDF address space, specialty engine CPU time can be greater than general-purpose CPU time. See how these CPU times (general-purpose and specialty engine) have changed, and there's your effect at the application level (the "average" is per DB2 accounting trace record -- one of these is usually generated per online transaction, and per batch job). If you requested that the monitor aggregate data at (for example) the connection type level, you will have in the accounting report a sub-report for each connection type (one for the CICS connection type, one for DRDA, one for call attach, etc.), and there will be an average in-DB2 CPU time (again, both a general-purpose engine and a specialty engine time) in each of these sub-reports.
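
If you need to confirm that the accounting trace classes on which these reports depend are active -- or start them if they are not -- the standard DB2 commands look like this (just a sketch; at many sites these classes are started automatically at DB2 start-up via the SMFACCT parameter in ZPARM):

-DISPLAY TRACE(ACCTG)
-START TRACE(ACCTG) CLASS(1,2,3) DEST(SMF)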

The procedure for measuring the impact of a buffer pool change on DB2's CPU consumption (i.e., on the CPU time charged to DB2 tasks versus tasks associated with DB2-accessing application programs) is similar to what I described above:
  • BEFORE making the buffer pool change, use your DB2 monitor to generate a Statistics Long Report for the subsystem (your monitor might refer to this as a Statistics Detail Report). Input to this report is the data in records generated by the "standard" DB2 statistics trace classes (1, 3, 4, 5, and 6). Use the same day of the week and same time period as for the aforementioned Accounting Long Reports.
  • AFTER making the buffer pool change, generate another Statistics Long Report, for the same day of the week and the same time period as before. In the "before" and "after" reports, find the section of the report in which the CPU times for the DB2 address spaces are provided. Look at the CPU times for the DB2 database services address space (the one most affected by I/O activity -- it handles prefetch reads and database writes), and there's your DB2 CPU impact. I say "look at the CPU times" because you should see both a total CPU time for the address space and a field with a name like "preemptable IIP SRB time." The latter is zIIP engine time, and it is NOT included in the former (reported "total" CPU time is general-purpose engine time).
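
Similarly, if the statistics trace classes mentioned above are not already active on your system (they often are, via the SMFSTAT parameter in ZPARM), here is a sketch of the command that would start them:

-START TRACE(STAT) CLASS(1,3,4,5,6) DEST(SMF)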

To summarize this blog entry's message: buffer pool size increases should deliver CPU savings on your system, at both the application level and the DB2 subsystem level, by reducing I/O activity. Those CPU savings can be boosted further by page-fixing pools (usually done most effectively for your higher-activity pools), and page-fixed pools save additional CPU when they are backed by 1 MB page frames (automatic in DB2 10 and DB2 11 environments, when LFAREA in IEASYSxx sets aside some of the LPAR's memory resource to be managed in 1 MB frames). When you've made a buffer pool change that should provide enhanced CPU efficiency for your DB2 applications and subsystem, by all means measure that impact. Your best measurement tool for that purpose is your DB2 monitor, and the Accounting and Statistics Long Reports that it can generate.

I hope that this information will be useful to you.

Thursday, June 26, 2014

DB2 for z/OS: the Functional Advantages of Native SQL Procedures

I have been a big fan of DB2 for z/OS native SQL procedures ever since the functionality was introduced with DB2 9, back in 2007. I have posted quite a few blog entries on the topic, the first in 2008 (written while I was working as an independent consultant) and the most recent just last month. In these blog entries, and in presentations I've delivered over the past several years (and in discussions generally), I've focused largely on performance aspects of native SQL procedures versus external stored procedures. These performance pluses (native SQL procedures run in the DB2 database services address space, they run under the task of the calling application process, and they are zIIP-eligible when called by a DRDA requester) are important, but lately I've found myself thinking, more and more, about the functional advantages of native SQL procedures. That's the thrust of this blog post.

Right out of the gate with DB2 9 for z/OS (new-function mode) there was some space, functionality-wise, between native and external SQL procedures: a native SQL procedure could include a nested compound statement, while an external SQL procedure could not (a compound SQL statement, typically the heart of a SQL procedure, is a group of SQL statements, set off by BEGIN and END, in which there can be variable declarations and associated assignments, along with SQL logic flow control statements such as IF, WHILE, and ITERATE). What does support for a compound SQL statement within another compound SQL statement mean to a developer? It means that he or she can create SQL procedures that have multi-statement condition handlers, for more sophisticated processing of exception and/or error conditions that might occur in the execution of the procedure.
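
To make that concrete, here is a minimal sketch of a native SQL procedure with a multi-statement condition handler (the EMP and ERROR_LOG tables, and the logic itself, are hypothetical):

CREATE PROCEDURE RAISE_SALARY (IN P_EMPNO CHAR(6), IN P_PCT DECIMAL(5,2))
  LANGUAGE SQL
BEGIN
  DECLARE V_MSG VARCHAR(100);
  -- the handler body is a nested compound statement (a BEGIN/END block
  -- within the outer BEGIN/END block) -- something only a native SQL
  -- procedure can contain
  DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
      SET V_MSG = 'Salary update failed for employee ' || P_EMPNO;
      INSERT INTO ERROR_LOG (MSG_TEXT, LOG_TS) VALUES (V_MSG, CURRENT TIMESTAMP);
    END;
  UPDATE EMP SET SALARY = SALARY * (1 + P_PCT / 100) WHERE EMPNO = P_EMPNO;
END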

The functional advantage of native SQL procedure usage advanced further with DB2 10 for z/OS (new-function mode), which allowed a native SQL procedure -- and only a native SQL procedure -- to have input and output parameters of the XML data type (and to specify the XML data type for variable declarations). Before DB2 10 (and even in a DB2 10 or DB2 11 environment, using anything other than a native SQL procedure), getting an XML data value to a stored procedure required serializing the XML document into a character string or a CLOB (depending on its size) and passing it to the stored procedure (and working with it in the stored procedure) in that form. Yuck.
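
With a native SQL procedure, by contrast, an XML value can be passed and worked with directly -- a minimal, hypothetical sketch (the INVOICES table and its column are made up for illustration):

CREATE PROCEDURE STORE_INVOICE (IN P_DOC XML)
  LANGUAGE SQL
BEGIN
  -- an XML parameter and an XML variable: both native-SQL-procedure-only
  DECLARE V_DOC XML;
  SET V_DOC = P_DOC;
  INSERT INTO INVOICES (INV_DOC) VALUES (V_DOC);
END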

DB2 11 for z/OS (in new-function mode) delivers two really cool SQL-procedure-only functional enhancements: array parameters and autonomous procedures. More information on these enhancements follows.

Array parameters

An array parameter is a parameter that contains, essentially, a "stack" of values. Before passing an array parameter to a native SQL procedure (or receiving an array as an output parameter of a native SQL procedure, or declaring and using an array variable in a native SQL procedure), you first have to create the array. Why? Because an array in a DB2 for z/OS context is a user-defined data type (UDT). In creating an array, you have two choices: ordinary and associative. In an ordinary array, elements are referenced by their ordinal position within the array (for example, the third element added to an ordinary array would be referenced as value 3 of that array). Elements in an associative array are referenced by user-provided index values; so, if I assigned an index value of 'Home' to a data value in an associative array containing phone numbers, I could reference a person's home number by using the 'Home' index value.

Here is the CREATE statement for an ordinary array:

CREATE TYPE EMPL_NUMS AS CHAR(6) ARRAY[20];

In the above example statement, CHAR(6) refers to the data type of the values placed in the array (an example would be employee number '089234'), and 20 refers to the number of values that the array can hold (if no value were there, i.e., if [] had been specified instead of [20] after the keyword ARRAY, the maximum number of values that the array could hold would default to the high positive value for the INTEGER data type, which is 2147483647).

The CREATE statement for an associative array would look like this:

CREATE TYPE SCHOOLS_ATTENDED AS VARCHAR(40) ARRAY[VARCHAR(30)];

As with the ordinary array, the data type after the AS keyword refers to the data values that will be stored in the array ('Eastern State University' could be an example). The second data type specified for the array (and that second data type is your indication that it's an associative array) refers to index values for the array ('Graduate school - Masters' could be one such value). Note that an associative array, unlike an ordinary array, does not have an explicitly or implicitly specified maximum cardinality -- the cardinality of an associative array is based on the number of unique index values used when elements are assigned to the array.

As mentioned, a DB2 for z/OS stored procedure defined with array-type input and/or output parameters (or with references to array variables in the procedure body) must be a native SQL procedure; furthermore, the CALL that invokes such a SQL procedure can come from only two types of program: another SQL procedure language routine (SQL PL is the language in which SQL procedures are coded, and in which compiled SQL scalar functions can be coded) or a Java program that accesses the DB2 for z/OS server via the IBM Data Server Driver for JDBC and SQLJ type 4 driver. That second program type is really important in my eyes. Java programmers regularly work with arrays, and plenty of those folks had requested, prior to DB2 11's debut, the ability to pass an array to, or receive an array from, a DB2 for z/OS stored procedure.
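
Here is a small, hypothetical sketch of a native SQL procedure that works with the EMPL_NUMS ordinary array type created above (bracket notation references an element by its ordinal position, and the CARDINALITY function returns the number of elements currently in the array):

CREATE PROCEDURE FIRST_AND_COUNT
  (IN P_EMPLS EMPL_NUMS, OUT P_FIRST CHAR(6), OUT P_COUNT INTEGER)
  LANGUAGE SQL
BEGIN
  -- first element of the ordinary array parameter
  SET P_FIRST = P_EMPLS[1];
  -- number of elements that have been assigned to the array
  SET P_COUNT = CARDINALITY(P_EMPLS);
END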

Autonomous procedures

Consider this scenario: you have a transaction for which you want to record some information for each execution, even if the transaction fails before completion and is rolled back by DB2. A rollback would undo any data changes made by the transaction, right? So, how do you persist some information associated with the transaction? With DB2 11, you can do that with an autonomous procedure, which is a type of native SQL procedure. How would this work? Well, the transaction would call the autonomous procedure, and that SQL procedure would do its thing -- inserting, for example, some data into DB2 table T1. Control would then pass back to the caller, and the transaction would do its thing -- updating, let's say, data in table T2. If the transaction fails after updating T2, what happens? DB2 will back out the transaction's change of T2 data, but the insert into T1 performed by the autonomous procedure will not be backed out. Cool, eh?

What makes a DB2 11 native SQL procedure an autonomous procedure? Technically, it's the specification of the AUTONOMOUS option in the associated CREATE PROCEDURE (or ALTER PROCEDURE) statement. AUTONOMOUS would be used in place of the COMMIT ON RETURN option. Completion of a called autonomous procedure will drive a commit, but that commit will "harden" only the data changes made by the autonomous procedure -- it will have NO EFFECT on any data changes made up to that point by the calling program. This is made possible by the fact that the autonomous procedure's DB2 unit of work is independent from that of the calling application process. Because of this independence, locks acquired by DB2 for an application process are not shared with locks acquired for an autonomous procedure called by the application process. It is therefore theoretically possible that an autonomous procedure will encounter lock contention vis-a-vis its caller. That possibility might influence decisions you'd make about the locking granularity that you'd like DB2 to use for a table space (e.g., row versus page), if an autonomous procedure and its caller will change data in the same table.
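
A minimal sketch of the syntax (the procedure name and the columns of table T1 are hypothetical):

CREATE PROCEDURE LOG_TRAN_ATTEMPT (IN P_TRAN_ID CHAR(8))
  LANGUAGE SQL
  AUTONOMOUS
BEGIN
  -- this insert is committed when the procedure returns, in a unit of
  -- work that is independent of the calling transaction's unit of work
  INSERT INTO T1 (TRAN_ID, ATTEMPT_TS) VALUES (P_TRAN_ID, CURRENT TIMESTAMP);
END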

The future?

Will future versions of DB2 for z/OS introduce other enhancements that fall into the "SQL procedures only" category? We'll have to wait and see about that, but it sure has been interesting to see the progressive augmentation of native SQL procedure functionality just over the past three DB2 versions. I'll be looking for more of the same, and that's one of the reasons that I'm a native SQL procedure advocate: they get better and better.

Friday, June 13, 2014

DB2 for z/OS: Getting to Universal Table Spaces

Often, there is a bit of a time lag between the introduction of a DB2 for z/OS feature and the widespread adoption of the new technology. Take universal table spaces, for example. These were introduced with DB2 9 for z/OS (almost 7 years ago), but some organizations are only now beginning to convert non-universal table spaces to the universal variety. In this blog post I want to go over the incentives for undertaking a conversion to universal table spaces, highlight the important way in which DB2 10 eased the conversion process, and raise some matters one should consider in the course of effecting table space conversions.

Why universal?

As I see it, there are two main reasons to convert non-universal table spaces to the universal kind. First, it's the only way in which you can leverage the benefits of partition-by-growth table spaces, one of two varieties of universal table space (the other being partition-by-range). A partition-by-growth (PBG) table space -- as the name implies -- is one that is partitioned as needed to accommodate a table's growth; so, if in creating a table space (or altering an existing table space) one specifies a DSSIZE (data set size) of 2G (2 gigabytes), then upon the table reaching 2 GB in size DB2 will add a second partition to the table space. If that partition fills up (i.e., when it reaches 2 GB in size), a third partition will be added to the table space, and so on up to the maximum number of partitions specified for the table space (that being MAXPARTITIONS, an alterable value).

What's good about this? Well, first and foremost it eliminates the 64 GB size limit that previously existed for table spaces that are not range-partitioned -- a PBG table space, like a range-partitioned table space, can reach a size of 128 TB (and that's for the non-LOB data in the table -- with LOB data the data capacity of a table space can far exceed 128 TB). Of course, you might think of a table in your DB2 for z/OS environment that would never approach 64 GB in size, and wonder, "Why should I convert THAT table's table space to universal PBG?" That would be a good question, if size were the only incentive for converting a non-universal table space to universal. There are, in fact, quite a few non-size-related reasons for getting these conversions done. I'll get to these presently, but first I want to clear up a misconception. Some folks think that PBG table spaces are not appropriate for small tables because of the "P" (for partition) in PBG: we've historically thought of partitioning as a means of getting more than 64 GB of data in a table, and so we equate "partition" with "big" and write off PBG for smaller tables. Time to change that thinking. Is PBG a good choice for a table that will never hold more than, say, 20 KB of data (this could be a reference or code table), even with 1G being the smallest allowable DSSIZE value? Sure it is. Will that little table's table space be 1 GB in size, with 20 KB of the space used and the rest wasted? Of course not. The table space's physical size will be determined by its PRIQTY and SECQTY specifications (primary and secondary space allocation, respectively). If those specifications are chosen appropriately, the table with 20 KB of data will occupy 20 KB of disk space. The 1G DSSIZE specification means that IF the table space size reaches 1 GB then DB2 will add another partition to the table space. If the table space size never reaches 1 GB then the table space will stay at one partition. Got it?
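
Before moving on: for illustration, a PBG table space definition along the lines described above might look something like this (the database, table space, and buffer pool names, and the option values, are just placeholder examples):

CREATE TABLESPACE TS_ORDERS IN DB_SALES
  MAXPARTITIONS 64
  DSSIZE 2 G
  SEGSIZE 32
  BUFFERPOOL BP2;

It's the MAXPARTITIONS specification that makes this a PBG table space; DB2 adds partitions as the data grows, with DSSIZE determining the point at which each new partition gets added, up to the MAXPARTITIONS limit.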

OK, on to the non-size-related incentives for going universal. A growing list of DB2 features can ONLY be used in conjunction with universal table spaces -- pending DDL, which I'll get to in a moment, being a prime example -- and it is likely that future releases of DB2 for z/OS will introduce more features with a universal table space requirement.

Getting there got a lot easier starting with DB2 10

In my universal table space incentive list above I mentioned pending DDL. This is a capability that became available with DB2 10 running in new-function mode. Big picture-wise, what pending DDL made possible was the non-disruptive alteration of a number of aspects of the definition of a table space or table or index. Want to change the SEGSIZE of a table space? The page size of an index? The DSSIZE (data set size) of a table space? No problem. Thanks to pending DDL, you just issue the appropriate ALTER statement and then materialize the change by way of an online REORG. Between the ALTER and the online REORG, is application access to the target object compromised? No. The table space or index affected is placed in the new (with DB2 10) and non-restrictive AREOR state (which basically means that a pending DDL change has been issued for the object but has not yet been materialized via online REORG).

So, what does this have to do with conversion of non-universal to universal table spaces? Well, it so happens that the only pending DDL change possible for a non-universal table space is a change that would, when materialized, result in the non-universal table space becoming universal. For a simple or segmented table space containing a single table, that change is an ALTER TABLESPACE that adds a MAXPARTITIONS specification to the object's definition. Issue such an ALTER, and after a follow-on online REORG the table space will be a universal partition-by-growth (PBG) table space. For a table-controlled partitioned table space, the change is an ALTER TABLESPACE that adds a SEGSIZE specification to the object's definition. Run an online REORG for the object after the ALTER, and voila -- you have a universal partition-by-range (PBR) table space. Easy.
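
In command terms, the two conversion paths just described come down to something like the sketch below (the object names are hypothetical, and the MAXPARTITIONS and SEGSIZE values are just examples):

-- single-table simple or segmented table space --> PBG universal
ALTER TABLESPACE DB1.TS_SIMPLE MAXPARTITIONS 32;

-- table-controlled partitioned table space --> PBR universal
ALTER TABLESPACE DB1.TS_PARTED SEGSIZE 32;

Either pending change is then materialized by an online REORG of the table space in question (e.g., REORG TABLESPACE DB1.TS_SIMPLE SHRLEVEL CHANGE).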

Some considerations

As you plan for the conversion of your non-universal table spaces to the universal variety, there are things to which you should give some thought:
  • The non-disruptive process described above for converting simple and segmented table spaces to PBG universal table spaces (ALTER with MAXPARTITIONS, followed by online REORG) is available, as I mentioned, only for single-table simple and segmented table spaces. For a simple or segmented table space containing multiple tables, you'd have to either go the unload/drop/create/re-load route (with one PBG universal table space created for each table in the multi-table simple or segmented table space), or wait and see if a future release of DB2 provides a non-disruptive universal table space conversion process for multi-table simple and segmented table spaces (this is a known requirement).
  • The non-disruptive process for converting traditional range-partitioned table spaces to PBR universal table spaces (ALTER with SEGSIZE, followed by online REORG) is available, as I mentioned, only for table-controlled partitioned table spaces. For an index-controlled partitioned table space, you'll first need to accomplish the conversion to table-controlled partitioning. That's most easily done via issuance of an ALTER INDEX statement with NOT CLUSTER for an index-controlled partitioned table space's partitioning index (as described in a blog post I wrote a couple of years ago).
  • Materialization of the change to universal table space from non-universal (via online REORG after the appropriate ALTER statement) will invalidate packages that depend on a given table space. These packages will be automatically rebound (by default) when the associated programs are next executed, or you can rebind them explicitly. Plan management should be active on your system (it is by default -- that is, the value of the PLANMGMT parameter in ZPARM is EXTENDED or BASIC). That being the case, on the off chance that a package rebind operation leads to performance degradation (performance will typically be the same or better following a package rebind), you can very quickly switch back to the previous instance of a package via a REBIND PACKAGE command with the SWITCH(PREVIOUS) option, as sketched below.
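
A sketch of that fallback (the collection and package names are placeholders):

REBIND PACKAGE(COLL1.PKG1) SWITCH(PREVIOUS)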

There you have it. If you've not already started the process of converting your non-universal table spaces to universal PBG or PBR table spaces, it's time to get that effort going. Your organization will benefit from your actions.

    Friday, May 30, 2014

    DB2 for z/OS: Some Matters Pertaining to Nested DB2 Stored Procedures

    I worked recently with some IT people who were themselves engaged in the development and enhancement of a high-volume application that accesses DB2 for z/OS data by way of stored procedures (the client side of the application communicates with the target DB2 system via the distributed data facility, also known as DDF). Various actions implemented and plans formulated by this IT team serve to effectively illustrate some important points regarding the nesting of DB2 stored procedures ("nesting," in this context, refers to a situation in which stored procedures call other stored procedures). Through this blog entry, I want to bring these points to light.

    Why nest in the first place?

    This basically comes down to what you want in terms of the granularity of the functionality provided by your DB2 for z/OS stored procedures. Suppose that you have a stored procedure that performs a variety of data-related actions -- perhaps inserting some rows into one table, updating rows in another table, and generating a result set that will be consumed by the calling process. If these actions are always to be accomplished every time the stored procedure is called, keeping them in the one stored procedure is probably the way to go for optimal performance and for the sake of simplicity (simple is generally good); however, if it is likely that a typical CALL will end up exercising just one of the stored procedure's data processing functions, separating those functions into several different, smaller stored procedures could be advantageous. Of course, such a break-up doesn't necessarily imply nesting -- client applications could just call the individual stored procedures directly. The question, then, is this: at what level do you want small-grain (referring to scope of functionality) stored procedures to be grouped so as to provide larger-grain database services? If you want this grouping to be done by client-side application programs, have those programs call the small-grain stored procedures directly. If, on the other hand, you want the grouping of small-grain stored procedures for larger-grain processing to be handled by higher-level "orchestrating" stored procedures, go the nesting route.

    In the case of the application I mentioned in the opening paragraph of this entry, the IT team wanted the flexibility provided by small-grain stored procedures, but they didn't want client-side developers to have to do the work of "stringing together" small-grain stored procedures to accomplish more-comprehensive data processing tasks. Given that situation, the decision to implement nested stored procedures is understandable.

    When one stored procedure invokes another: the CALL statement

    Because the application to which I've referred is quite dynamic with respect to changes in client-required data processing, the IT team in charge of stored procedure design and development went with a maximally flexible implementation: they created a "parent" stored procedure that would invoke nested, "child" stored procedures with calls of this form (here ":hvar" refers to a host variable):

    CALL :hvar (:hvar, :hvar, :hvar, :hvar) 

    That CALL-coding decision, in turn, dictated the use of an external stored procedure for the "parent" routine, versus a native SQL procedure. Why? Because a CALL in the body of a native SQL procedure must be of the form CALL procedure-name. In other words, a nested stored procedure invoked by a native SQL procedure must be explicitly named (i.e., "hard-coded") in the CALL statement issued by the native SQL procedure.

    COBOL was chosen by the IT team as the coding language for the "parent" stored procedure. It was further decided that the nested stored procedures would be native SQL procedures.

    "Hey, where's my zIIP offload?"

    When the stored procedure set-up went live, the supporting IT folks were surprised to see that the native SQL procedures were getting very little in the way of zIIP engine offload. Aren't native SQL procedures supposed to be majorly zIIP-eligible when they are called by DRDA requesters (i.e., through DDF)? Yes, but that is only true when a native SQL procedure is directly called by a DRDA requester (as I pointed out in an entry I posted to this blog a few months ago). See, a native SQL procedure always runs under the task of its caller. When the caller is a DRDA requester, that z/OS task is an enclave SRB in the DB2 DDF address space, and that makes the called native SQL procedure zIIP-eligible (to the tune of 55-60%). If a native SQL procedure is called by an external DB2 stored procedure, the native SQL procedure will execute under the task of the external stored procedure -- a TCB in a stored procedure address space -- and so will get little, if anything, in the way of zIIP offload.

    Making a result set generated by a nested stored procedure available to a program that is more than "one level up" from the nested procedure

    The aforementioned lack of zIIP offload seen for the nested native SQL procedures was exacerbated by the mechanism used to make result set rows generated by a cursor declared and opened in a nested stored procedure available to a "top-level" calling program (i.e., to a client-side program that initiated a chain of nested stored procedure calls): the result set rows were inserted by the nested stored procedure into a temporary table, and the top-level calling program would then retrieve the rows from that temporary table. Not an uncommon approach, but also not great from a performance perspective, and more complex than you'd like. DB2 10 for z/OS gave us a much better way of getting this job done: ditch the temporary table, and have the nested stored procedure declare its cursor WITH RETURN TO CLIENT (versus WITH RETURN TO CALLER, formerly the only valid specification, with which a cursor's result set can be directly fetched only by the stored procedure's direct caller). When a cursor in a stored procedure is declared WITH RETURN TO CLIENT, the result set rows can be directly fetched by the top-level calling program, no matter how far down the result-set-generating stored procedure is in a chain of nested calls.
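
    To illustrate, the cursor declaration in a nested native SQL procedure could look something like this (a hypothetical sketch -- the cursor, table, and column names are made up):

    DECLARE ORDER_CSR CURSOR WITH RETURN TO CLIENT FOR
      SELECT ORDER_ID, ORDER_TOTAL
      FROM ORDERS
      WHERE CUST_ID = P_CUST_ID;
    OPEN ORDER_CSR;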

    Rethinking things: there's more than one kind of CALL

    Even WITH RETURN TO CLIENT cursors might not have delivered the performance that was desired for the stored procedures of which I've been writing, particularly in light of the fact that, as mentioned, the nested, native SQL procedures were getting very little zIIP engine offload. Of course, the parent stored procedure could have been changed to a native SQL procedure from an external stored procedure, so as to maximize zIIP offload (and minimize general-purpose CPU consumption), but that would have required going with CALL procedure-name invocation of the nested stored procedures, and the IT team really wanted to stay with the CALL :hvar approach. Ultimately, a decision was made to go with an option made possible by the fact that the parent stored procedure was written in COBOL: the nested routines would be changed from native SQL procedures to COBOL subroutines, and they would be invoked as such from the parent stored procedure. That makes sense in this case: if zIIP offload isn't your ticket to minimizing consumption of general-purpose engine capacity, get CPU efficiency by reducing path length -- a COBOL subroutine called by a COBOL stored procedure program should be more CPU-efficient than an equivalent-function nested stored procedure called by that same COBOL stored procedure program (I say "should" because I haven't yet seen the results of the IT team's change to the COBOL subroutine approach, but I expect efficiency gains).

    In conclusion...

    You have plenty of options when you're looking to implement stored procedure-based DB2 database services -- external and native, nested and non-nested, WITH RETURN TO CALLER and WITH RETURN TO CLIENT cursors. There's not a one-size-fits-all "right way" that will always be the best way. Consider the particulars of your situation, and make the choices that deliver the flexibility, performance, and functionality that YOU want.

    Tuesday, May 13, 2014

    Some zIIP Things of Which DB2 for z/OS People Should be Aware

    zIIP is short for System z Integrated Information Processor. It's a so-called specialty engine for an IBM mainframe server. The zIIP raison d'etre is reduced cost of computing on the System z platform, accomplished in two ways: 1) zIIP engines cost less than general-purpose processors (often referred to as CPs -- short for central processors), and 2) zIIP engines do not factor into the cost of software that runs in z/OS LPARs. zIIP engines aren't new (they were introduced in 2006), but there are some relatively recent zIIP-related developments that are important from a DB2 for z/OS perspective. I hope that this blog entry will help mainframe DB2 people to keep their zIIP knowledge up to date.

    The circle of zIIP-eligible work continues to expand

    At many System z sites, the number one driver of zIIP utilization is execution of SQL statements that get to DB2 via the distributed data facility, or DDF (i.e., statements that come from DRDA requesters). Two factors emerged in recent years to boost zIIP offload for DDF workloads. First, there was an increase in zIIP eligibility for work executing under enclave SRBs in the DDF address space: a DB2 9 PTF took this zIIP offload percentage from about 55% to about 60%. Second, native SQL PL routines were introduced, also with DB2 9. SQL PL -- short for SQL procedure language -- is the language in which DB2 native SQL procedures are written. A native SQL procedure executes under the task of the application process that calls it. When the calling process is a DDF-connected application, the task is an enclave SRB in the DDF address space. In that case, a native SQL procedure is zIIP-offloadable to the tune of about 60%, as is true of SQL statements in general that are issued by DRDA requesters (in contrast, an external stored procedure always runs under its own TCB in a WLM-managed stored procedure address space, and so is not zIIP-eligible). A number of organizations have been engaged for some time in converting external DB2 stored procedures called by DRDA requesters to native SQL procedures, so as to boost redirection of work to zIIP engines.

    This SQL PL-related zIIP offload booster was expanded in scope when, with DB2 10, user-defined functions as well as stored procedures could be written in SQL PL. UDFs coded in SQL PL, along with UDFs that have a RETURN statement that includes a scalar fullselect, are referred to as compiled SQL scalar UDFs, or, sometimes, as non-inline SQL scalar UDFs. Compiled SQL scalar UDFs, like inline SQL scalar UDFs and SQL table UDFs, run under the task of the UDF-invoking application process, just as do native SQL procedures. As noted previously, when the application process is a DRDA requester, the task in the z/OS system is a DDF enclave SRB, and that makes any invoked SQL UDF -- whether compiled scalar, inline scalar, or table -- about 60% zIIP-eligible.

    Some IBM DB2 utility operations contribute to zIIP engine utilization. In a DB2 9 system, it's the index maintenance activity associated with LOAD, REORG, and REBUILD execution. DB2 10 added RUNSTATS processing to this mix. DB2 11 delivers still greater zIIP offload for RUNSTATS, plus zIIP offload for inline statistics generation performed as an option of LOAD, REBUILD INDEX, and REORG.

    Other DB2-related zIIP utilization drivers include query parallelization (DB2 V8), prefetch read and database write operations (DB2 10), XML schema validation (DB2 10), log reads and writes (DB2 11), and clean-up of pseudo-deleted index entries (DB2 11).

    And don't think that DB2 alone exercises zIIP engines. I'm seeing more and more Java programs running in z/OS systems -- certainly in WebSphere Application Server for z/OS, but also in the form of DB2 for z/OS Java stored procedures, and as batch processes and started tasks that utilize IBM's JZOS Toolkit. Java code executing in a z/OS system can be redirected to another type of specialty engine called a zAAP, but on a mainframe server with zIIPs and no zAAPs, Java programs will drive utilization of zIIPs by way of what's called zAAP-on-zIIP functionality.

    The bottom-line message is this: DB2 itself, with each succeeding release, provides more and more opportunities for shifting work from general-purpose mainframe engines to zIIPs. Java programs, increasingly prevalent in z/OS systems, also boost zIIP utilization. That's good news from a System z cost-of-computing perspective, as it means that you can grow a mainframe workload without having to add general-purpose engines to a system; however, it also means that you need to keep an eye on zIIP engine utilization, because running those engines too hot could have a negative impact on application performance, as explained below.

    Don't over-utilize zIIP engines

    One thing that organizations have long liked about mainframe computers is the fact that you can run general-purpose engines at very high levels of utilization -- like 90% or more -- while still getting excellent application performance and throughput. The story is different for zIIP engines, and here's why: if a zIIP engine is not available when zIIP-eligible work is ready to be dispatched, that work can be directed instead to a general-purpose engine, but such redirection introduces a degree of delay. This delay can affect performance noticeably in a DB2 10 (or later) system, because (as previously noted) starting with DB2 10 prefetch processing became zIIP eligible. If prefetch reads are slowed because of overly high zIIP engine utilization, throughput for prefetch-intensive workloads (think batch, and decision support applications) can be negatively impacted. In a case of that nature, a slowdown in prefetch processing would show up in a DB2 monitor accounting long report (or an online display of thread detail data) as elevated "wait for other read" time (that is one of the so-called class 3 wait time categories).

    Frankly, I'd start thinking about adding more zIIP capacity to a System z server if I saw zIIP utilization regularly reaching 60% or more.

    July 2013: a zIIP capacity boost

    With more and more zIIP-eligible work in z/OS systems, and given the importance of not over-utilizing zIIP engines, you'll want to make sure that zIIP capacity on your mainframe servers stays well ahead of demand for same. That task became easier with an important announcement that IBM made in July of 2013: on a zEC12 or zBC12 server, the number of zIIP engines can be up to two times the number of general-purpose engines (prior to that announcement, the number of zIIP engines on a server could not exceed the number of general-purpose engines). The more heavily weighted your DB2 workload is towards DDF-connected applications (and the more Java code you have running in z/OS LPARs), the more heavily weighted your processing capacity should be towards zIIP engines versus general-purpose engines.

    zIIP engines are enabling organizations to grow z/OS system workloads in an ever more cost-effective manner. That said, you need to keep an eye on zIIP engine utilization. Yes, take actions to drive zIIP capacity utilization (so as to shift work off of general-purpose engines), but take actions as well to ensure that zIIP resources on your system are adequate to handle zIIP-eligible work in an optimal fashion.

    Tuesday, April 29, 2014

    A Tiger Changes His Stripes: I LIKE Java on z/OS

    I well remember a phone conversation, from about six or seven years ago, during which a friend of mine and I discussed an application project at his site which involved use of DB2 for z/OS stored procedures. Me: "In what language will the stored procedure programs be written?" My friend (sounding like someone whose favorite sports team had just lost the championship game): "Java." Me (groaning): "Java? Why Java?" My friend: "Because some 'insultants' told our management that that was the right way to go." I probably winced at that point -- both because I felt his pain, and because I was working at the time as an independent DB2 insultant (er, consultant).

    My disappointment on hearing my friend's news was genuine. Sometime later, I was asked to weigh in on an issue that people with another organization were trying to decide: should some new DB2 for z/OS stored procedures be coded in Java or SQL PL (i.e., SQL Procedure Language, used to create DB2 native SQL procedures). That was a no-brainer, as far as I was concerned. SQL PL was my recommendation because I really like DB2's native SQL procedure functionality, and because I really didn't like the idea of running Java programs in z/OS systems. I was fine with Java code executing in a browser on my laptop, or in an application server on a distributed systems platform, but on a mainframe? No! I saw Java, in the context of a z/OS execution environment, as being a CPU and a memory hog, a waste of processing capacity. Want performance and efficiency from your z/OS programs? You don't want Java.

    Well, that was then, and a funny thing happened on the way to now: I started encountering more and more organizations running Java programs on z/OS systems, and they were getting excellent performance and throughput, and people weren't griping about mainframe resource utilization. How could that be? Had reality been warped in some way? No, nothing that cosmic. Basically, what changed was technology -- IBM System z and z/OS technology, in particular. A lot of the changes were on the hardware side, including:
    • zAAPs and zIIPs to reduce the cost of computing for z/OS-based Java applications. zAAPs (System z Application Assist Processor) and zIIPs (System z Integrated Information Processor) are specialty engines that pack the power and performance of System z general-purpose central processors (GCPs) but do not factor into the cost of mainframe software (that's based on GCP capacity). zAAPs are specifically intended to run Java workloads (as well as handling XML data processing functions). zIIPs, while typically driven primarily by DB2 for z/OS DDF/DRDA (i.e., network-attached client-server) processing, can also be used for zAAP-eligible work when a system does not have zAAP engines (that's called zAAP-on-zIIP). zAAPs and zIIPs have been around for a while, but they've come lately to the forefront in terms of Java processing in large part because of the aforementioned zAAP-on-zIIP capability, and as a result of the recent (July 2013) increase in the allowable ratio of zAAP and zIIP engines to GCPs on a zEC12 or zBC12 server (whereas the number of zAAP or zIIP engines on a mainframe formerly could not exceed the number of GCPs on the server, now for a zEC12 or zBC12 server there can be up to twice the number of zAAPs or zIIPs as GCPs).
    • More and cheaper mainframe memory, and new ways to manage it. I no longer think of Java as a memory hog on z/OS systems, any more than I think of DB2 as a memory hog. Like DB2, Java applications are exploiters of large server memory resources: give them more memory, and they'll perform better and with greater efficiency. Providing Big Memory for Java applications on z/OS is much more feasible and practical now than was the case just a few years ago. A modern mainframe (z196 or zEC12) can be configured with up to three terabytes of central storage, and that memory is far less costly per gigabyte than it would have been in the not-too-distant past. And, this is not just a more-and-cheaper story -- there's a smarter angle, too, in the form of large frame support. Instead of managing all memory in 4 KB page frames, modern mainframes provide the ability to have some of a z/OS system's memory managed in 1 MB frames, thereby boosting CPU efficiency by reducing the overhead associated with translating virtual to real storage addresses. A JVM (Java virtual machine) in a z/OS system can be set up to use non-pageable 1 MB page frames for heap storage, and (if the server has Flash Express for super-fast paging) pageable 1 MB frames for JIT (just-in-time compiler) work areas. With a zEC12 or zBC12 server, the Java heap can be backed by 2 GB page frames for even greater efficiency.
    • Transactional Execution (TX) facility. This capability, known in the industry as hardware transactional memory, is essentially an in-hardware implementation of optimistic locking for transactional workloads. The TX facility, a feature of zEC12 and zBC12 servers, is aimed largely at boosting the performance of Java application workloads through lock avoidance.
    • In-hardware boosts for decimal floating-point and IEEE binary floating-point operations. Java applications often make use of these operations.
    • Runtime instrumentation (RI). This is a hardware facility, introduced with the zEC12 and zBC12 servers, that delivers a dynamic and self-tuning online recompilation capability for Java applications.

    So, today's zEnterprise servers and the z/OS operating system provide an outstanding execution environment for Java applications. How are organizations implementing DB2 for z/OS-accessing Java programs in this environment? This is what I'm seeing:
    • The IBM JZOS Toolkit. This is a set of classes, distributed with the IBM Java SDKs for z/OS, that can be used to launch Java applications as batch jobs or started tasks in a z/OS system. Also included in the toolkit are Java methods that facilitate access to z/OS data sources and services such as DB2. More information about the classes associated with the JZOS Toolkit can be found in developerWorks on IBM's Web site.
    • DB2 for z/OS Java stored procedures. Java as a language for stored procedure development has been supported by DB2 for z/OS since Version 5 (late 1990s), but for a long time there was not a lot of activity here, due largely, I think, to the historically negative view of Java on z/OS to which I alluded in the opening part of this blog entry. Nowadays, Java stored procedures are showing up at a growing number of DB2 sites. This trend should get a boost from DB2 11 for z/OS, which added support for running Java stored procedures in 64-bit JVMs (previously, only 31-bit JVMs were supported for Java stored procedures). Not only that, but these are multi-threaded JVMs; so, in a DB2 11 system you'll have a single 64-bit JVM per WLM-managed stored procedure address space in which Java stored procedures run, versus having a 31-bit JVM for each TCB in such an address space. That means a smaller JVM storage footprint, less overhead associated with starting JVMs, and more concurrently active Java stored procedures per stored procedure address space.
    • WebSphere Application Server (WAS) for z/OS. I'm encountering WAS on z/OS a lot more than was the case just a few years ago. An organization with which I work pretty extensively is a case in point: they have a modern, high-volume, business-critical, client-server application that is wholly contained within a z/OS system, with business logic running in WAS and a DB2 subsystem on the same LPAR providing data management (and a good bit of data access logic in the form of stored procedures and user-defined functions). Plenty of other companies have gone, or are planning on going, this same route. One of the interesting aspects of such an environment is the decision presented regarding the choice of JDBC driver for the WAS-based, DB2-accessing Java applications: type 2 (a "local" connection using the DB2 Recoverable Resource Services Attach Facility) or type 4 (a "remote" connection through the DDF address space of the target DB2 subsystem). The type 2 JDBC driver provides the more CPU-efficient connection to a DB2 subsystem in the same LPAR as WAS. The type 4 driver, while adding to connection pathlength (this because the connection involves going into the LPAR's TCP/IP stack and then into the DB2 DDF address space), increases zIIP engine offload for SQL statement execution because the statements run under enclave SRBs in DDF. Which JDBC driver should you use? That will vary by application. For an application with transactions that issue a lot of quick-running SQL statements, the type 2 driver might be a good fit, as there could be a premium in that case on getting to DB2 and back as efficiently as possible. For applications characterized by a smaller number of longer-running SQL statements, the greater zIIP offload for SQL execution provided by the type 4 JDBC driver might tilt you in that direction. The good news? Going from one driver to the other for an application is a pretty simple thing. It's not a big deal to try both of them.

    And so you have it. The positive results that organizations with which I work have gotten in running Java on z/OS systems have caused me to change my tune (or my stripes, as suggested by this entry's title). Tell me now that you're running Java on z/OS, and you won't hear me groan. Instead, I'm likely to say, "That's a fine choice you've made, there." My colleagues at the IBM Poughkeepsie (zEnterprise and z/OS) and Silicon Valley Labs (DB2 for z/OS), who've worked hard to make z/OS an excellent platform for Java applications, would wholeheartedly agree.

    Monday, April 21, 2014

    DB2 for z/OS: the Importance of Historical Performance Data

    Just within the past couple of weeks, I've received questions of a similar nature from two different DB2 for z/OS people. In each message, a DBA described a change that he had made in a production DB2 environment (in one case a size increase for several buffer pools, and in the other a change from an external stored procedure to a native SQL procedure). The questions asked pertained to assessing the impact of the change on system and/or application performance.

    I've been fielding questions of this nature for a long time. In every such instance, my first question is pretty much the same: do you have accounting and statistics trace information for the DB2 subsystem in question from before the time of the change? If the answer to that question is, "No," I know that assessing the impact of the DB2 system or application change will be less easy -- maybe a lot less easy -- than it otherwise would have been. In this blog entry, I hope to convey to you the importance of having historical DB2 for z/OS performance information on hand (ideally, information that is based on DB2 accounting and statistics trace information), and to provide you with some related guidance and recommendations.

    Why you need "before" performance data

    In a nutshell: because the value of your "after" data (meaning, from after the change that you made and want to assess) is considerably diminished if there's no "before" data that can be used for comparative purposes. This is true not only for situations in which you want to determine the performance impact of an alteration of a DB2 setting (e.g., a change to PGFIX(YES) for a buffer pool), or of some aspect of a database object's physical design (e.g., going to LOCKSIZE ROW for a table space) or of DB2-accessing application code (e.g., going to multi-row INSERT for a program) -- it's also applicable to situations in which something's gone wrong, performance-wise (perhaps a transaction's response time has quadrupled, or a batch job is running past a time specified in an SLA), and you need to get to the root of the problem. Online monitoring is great for seeing things from a "now" perspective, but sometimes you need the "then" view of things, too, to avoid going down dead-end streets in your pursuit of a solution.

    Your historical data raw materials

    First, you need to have DB2 writing trace records to data sets on disk -- not just to the in-memory buffers that are used by performance monitors. In my experience, DB2 trace records are most often written to SMF data sets (GTF is another option). As for the trace classes to have active, I'd go with accounting trace classes 1, 2, and 3, and statistics trace classes 1, 3, 4, 5, and 6 (class 5 is applicable to a DB2 data sharing system). The combined overhead of running all these trace classes should be less than 5%. There are organizations that like to have DB2 accounting trace classes 7 and 8 on, too, to provide package-level information. I agree that package-level figures can be very useful, but at some sites there is concern about the volume of trace records generated via accounting trace classes 7 and 8. If that's a concern at your shop, you could leave classes 7 and 8 off most of the time, and activate them on-demand if you have a need for package-level accounting data. Here's another idea: leave accounting trace classes 7 and 8 on, and use the new (with DB2 10) ZPARM parameter SMFCOMP to activate compression of DB2-generated SMF trace records. You're likely to get a disk space reduction of 60-80% for these compressed SMF records, and the CPU overhead associated with the compression work is likely to be very small (like 1% or less). Before compressing SMF trace records, verify that your DB2 monitor can handle SMF records in compressed form (IBM's OMEGAMON DB2 monitor can) -- if your monitor can't, you can use DB2 sample program DSNTSMFD, provided via APAR PM27872, to decompress DB2 SMF trace records that have been compressed.

    Options for keeping historical performance data in a space-efficient way

    With DB2 trace records being written to SMF data sets, you then need to decide how long to keep the data around. This is going to have something to do with the volume of trace data produced on a daily basis (workload dependent -- lots more DB2 trace data will be generated at some sites versus others), and the amount of space available for the data in your disk subsystem. Personally, I'd like to have at least 60 days of DB2 accounting and statistics trace data available to me -- 30 days at a minimum. What if you'd like to keep a rolling 60 days of DB2 trace data but don't have the space needed to hold that much in its raw SMF form (even with SMF data compression turned on)? In that case, you have at least a couple of options, which I'll briefly describe below.

    Option 1: print to disk. Have a batch process kick off once each night to generate (by way of your DB2 monitor's batch reporting function) a standard set of accounting and statistics reports, and "print" those reports to disk. At a DB2-using site at which I worked in the early- and mid-2000s, we had such reports written to a set of GDGs, with each GDG having 60 data sets (thus, an automatically maintained rolling 60 days of reports). If you go with the "print to disk" approach, I'd recommend generating, each night (using the previous day's SMF records), the following reports:
    • At least one statistics long report for each production DB2 subsystem. That report should have FROM and TO times that will capture 24 hours of system activity (midnight-to-midnight), and the data should be summarized at 1-hour intervals (so, one report would consist of twenty-four 1-hour subreports). If you are running DB2 in data sharing mode, in addition to member-level statistics long reports generate one group-level statistics long report for each production data sharing group.
    • At least one accounting long report for each production DB2 subsystem. If you just generate one report, I'd go with a report that captures 24 hours of data (as with the previously mentioned statistics long report, this should be midnight-to-midnight, with data summarized at 1-hour intervals), and I'd have the data ordered by connection type (i.e., aggregated at the connection type level, so that I could see, for a given time period, the CICS-DB2 activity, the DRDA activity, the call attach activity, etc.). If you want to slice and dice things a little more finely, consider generating at least a couple more accounting long reports each night: 1) for your CICS-DB2 and batch-DB2 activity, consider a report that includes only the CICS and call attach connection types (and/or TSO attach, depending on how your batch jobs connect to DB2), and that orders information by correlation name (batch job names and CICS transaction names are correlation names in DB2 accounting lingo), so that you'll see detailed accounting data for each DB2-accessing batch job and CICS transaction; and 2) for your DDF activity, generate an accounting detail report that includes only the DRDA connection type, with data ordered by primary authorization ID (that should then show you information at the DDF-using application level if, as is typically the case, each such application has its own ID with which it connects to the DB2 server).

    Option 2: a performance database. Here, you'd use two functions of your monitor (if your DB2 monitor has these functions): a "reduce" function (probably, to filter and/or summarize information contained in the raw SMF trace records) and a "save" or "file" function. The "save" or "file" function would write data to a file, and you could subsequently use that file to load the DB2 tables that would comprise your performance database. If you go this route, you'll want to follow the directions provided in your monitor's documentation to create and populate the tables in the performance database. The performance database approach might be more involved than the print-to-disk option (in terms of set-up, that is), but once you get the prep work done and the data loaded, you'll have a lot of flexibility in terms of how you view and report on your historical DB2 performance data.

    Information is power

    When you have historical DB2 performance data at your fingertips, you'll be able to much more effectively point to the effects of your DB2 tuning changes, and say, "Our team did that." What was the impact of a buffer pool change on DB2 DBM1 address space CPU time? How did thread reuse increase when you set up some CICS-DB2 protected entry threads, and how did average in-DB2 CPU time change for transactions that use those protected threads to execute DB2 packages bound with RELEASE(DEALLOCATE)? How did the change to row-level locking for a particular set of table spaces affect deadlock and timeout totals, and how was per-transaction in-DB2 CPU time impacted? How did an increase in the value of the SRTPOOL parameter in ZPARM affect activity in the buffer pools used for the work file table spaces? How did an increase in the page size of a set of indexes affect the throughput of a certain batch job? How did leveraging of DB2 10 high-performance DBATs impact per-transaction in-DB2 CPU times for a DDF-using application? How did a change from an external stored procedure to a native SQL procedure affect zIIP offload for a DDF-using application? How did migration to DB2 11 from DB2 10 affect average in-DB2 CPU times for your applications? How did the archive and purge of "old and cold" data in DB2 tables affect GETPAGE counts in the system?

    And so on, and so on. Yes, your DB2 monitor's "what's happening now" capability makes it a powerful tool, but combine that with some historical data, and watch that power grow exponentially. The cost is low, and the return is high. If you're not there, get busy.