Home View (Oracle Performance Analyst)

From RapidSQL
Jump to: navigation, search

Go Up to Oracle Performance Analyst Statistics

The Home view lets you review availability and overall performance of all monitored databases from a single window. The Home page includes the following sections:

Key Ratio Analysis Pane

Database performance analysts typically use one of two methods for examining the performance levels of a database: Ratio-based or wait/bottleneck-based. Ratio-based analysis involves examining a number of key database ratios that can be used to indicate how well a database is running. Performance ratios serve as roll-up mechanisms for at-a-glance performance analysis. Succinctly presented performance ratios can be quickly scanned to see if any database needs immediate attention.

While there are certainly many opinions as to which rules to follow, there are some standards to adhere to. To start with, many of the formulas that make up ratio-based analysis must be derived from delta measurements instead of cumulative statistics. Many of the global ratios a you examine come from the v$sysstat performance view. The performance view maintains a count of all the occurrences (in the VALUE column) of a particular database incident (in the NAME column) since the database was brought up. For databases that are kept up for long periods of time, these values can grow quite large and impacts how a particular ratio is interpreted. However, if delta statistics are used (taking, for a specified sampling period, the before and after counts of each statistic that make up a ratio), then an accurate and current portrayal of the various ratios can be had.

While there are a number of rules of thumb that can be used as starting points to begin the evaluation of database performance, DBAs must determine each database’s individual ‘personality’ with respect to the key performance ratios. Some hard and fast rules simply do not apply to every database. The danger in using blanket ratio standards is that they can lead to haphazardly take action, which can at times contribute nothing to the situation, and sometimes even degrade performance.

The following ratios are used on the Performance Analyst Home page to succinctly communicate the general overall performance levels of the monitored database:

Buffer Cache Hit Ratio

Data read from memory will produce end-user response times many times faster than when that same data is read from disk. Keeping physical I/O's to an absolute minimum is one of the purposes of the Oracle buffer/data cache.

The buffer cache hit ratio is a terrific indicator of how often user requests for data are satisfied through memory vs. being physically read from disk. A number of key counters in Oracle are used to arrive at this statistic. The table below lists them:

Counter Description

DB BLOCK GETS

Data read from memory for DML operations.

CONSISTENT GETS

Data read from rollback segments in memory.

PHYSICAL READS

Data read physically from disk.

DIRECT READS

Data read physically from disk that bypasses the buffer cache. Direct reads are filtered out of overall physical reads so an accurate cache hit ratio can be determined.

Dividing the data read from memory by data read from disk yields the cache hit ratio.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane

Metrics

To help ensure excellent performance, you want to keep your cache hit ratio in the neighborhood of 90% or higher. However, you should be aware that every database has its own ‘personality’ and may exhibit excellent performance with below average readings for the cache hit ratio. You should also be aware that excessive logical I/O activity can produce a very high cache hit ratio while actually degrading overall database performance. Consistently viewed low readings (60% or less) may require tuning attention on the part of the DBA. Note: The db_cache_size parameter value may need to be adjusted. The db_cache_size parameter is dynamic and may be altered without stopping and starting the database instance.

Correction

If a problem is found, then: Increase the size of the db_cache_size parameter through use of the ALTER SYSTEM SET db_cache_size command value (take caution to not over-allocate; ensure enough free memory exists on server before increasing value). Monitor the new value to see if performance improves. If using an SPFILE, save the new configuration values so Oracle will reuse them each time the database is stopped and re-started.

Library Cache Hit Ratio

Oracle's shared pool is made up of two main memory areas - the Library Cache and the Data Dictionary Cache, and offers a number of tuning possibilities.

The library cache holds commonly used SQL statements - basically database code objects. A method for improving performance in Oracle is to encourage the reuse of SQL statements so expensive parse operations may be avoided. The library cache assists this tuning effort.

The library cache hit ratio offers a key indicator in determining the performance of the shared pool. It shows how often SQL code is being reused by other database users vs. the number of times a SQL statement is broken down, parsed, and then loaded (or reloaded) into the shared pool.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane

Metrics

A high library cache hit ratio is a desirable thing. You should strive for a hit ratio between 95-100%, with 99% being a good performance benchmark for code reuse. Note that when a database is first started, the library cache hit ratio will not be at an optimal level because all code being used will be relatively new, and as such, must be parsed and placed into the shared pool. If, however, after a solid hour or two of steady database time, the library cache hit ratio has not increased to desirable levels, you should look into the possibility of increasing the shared_pool_size parameter. Other red flags that can indicate a too small shared pool include: A wait count for the event 'latch free' of ten or greater The library cache wait count of two or greater. These indicators can be tracked with Performance Analyst’s Bottleneck and Wait detail views. A way of improving the library cache hit ratio is by encouraging code reuse through the implementation of bind variables. Discouraging hard coded literals in application code and instead making use of variables bound at run time aids in the reuse of SQL code that is maintained in Oracle's shared pool. Note that bind variables can have an affect on the cost-based optimizer though. A second way is to pin frequently used code objects in memory so they will always be there when needed. This can be accomplished through the use of the system supplied DBMS_SHARED_POOL package. You can use Performance Analyst to view objects in the shared pool that are always present and/or have increasing reload numbers to help identify objects that are good candidates for pinning.

Correction

If a problem is found, then: Increase the size of the shared_pool_size parameter through use of the ALTER SYSTEM SET shared_pool_size command value (take caution to not over-allocate; ensure enough free memory exists on server before increasing value). Monitor the new value to see if performance improves. If using an SPFILE, save the new configuration values so Oracle will reuse them each time the database is stopped and re-started. If you determine that SQL literals are causing SQL to not be reused, you can: Change the cursor_sharing parameter to FORCE by using the ALTER SYSTEM SET cursor_sharing=FORCE command. Monitor database to see if parse activity is reduced and library cache reloads shrink. If using an SPFILE, save the new configuration values so Oracle will reuse them each time the database is stopped and re-started. If using an Init.ora file, add the cursor_sharing=FORCE parameter to the file.

Dictionary Cache Hit Ratio

Oracle's shared pool offers an number of tuning possibilities and is made up of two main memory areas:

  1. Library Cache
  2. Data Dictionary Cache

The dictionary cache hit ratio offers another key indicator in determining the performance of the shared pool. It shows how often object definitions are found in memory vs. having to read them in from disk. Because Oracle references the data dictionary many times when an SQL statement is processed, it is imperative that as much of this vital reference information be kept in RAM as possible.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane

Metrics

Just as with the library cache, a high data dictionary cache hit ratio is desirable. Strive for a hit ratio between 90-100%, with 95% being a good performance benchmark. NOTE: When a database is first started, the data dictionary cache hit ratio is not at an optimal level because all references to object definitions are relatively new, and as such, must be placed into the shared pool. Look for hit ratios in the eighty's for new database startups. If, however, after a solid hour or two of steady database time, the data dictionary cache hit ratio has not increased to desirable levels, increase the shared_pool_size parameter. Databases supporting applications that involve large number of objects (such as an Oracle Financials installation) should have larger than normal shared pools to support the required object definitions. Although each parameter is not individually tunable (it was in Oracle6), you can see which area of the dictionary cache could be pulling the overall hit ratio down.

Correction

If a problem is found, do the following: Increase the size of the shared_pool_size parameter through use of the ALTER SYSTEM SET shared_pool_size command value. Take caution to not over-allocate; ensure enough free memory exists on server before increasing value. Monitor the new value to see if performance improves. If using an SPFILE, save the new configuration values so Oracle reuses them each time the database is stopped and re-started.

Memory Sort Ratio

Oracle9i or later has the option of running automatic PGA memory management. Oracle has introduced an Oracle parameter called pga_aggregate_target. When this parameter is set and you are using dedicated Oracle connections, Oracle ignores all of the PGA parameters in the Oracle file, including sort_area_size, hash_area_size and sort_area_retained_size. Oracle recommends that the value of pga_aggregate_target be set to the amount of remaining memory (less a 10% overhead for other server tasks) on a server after the instance has been started.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane Oracle Performance Analyst Statistics > Users View (Oracle Performance Analyst) > Home > Key User Analysis Pane

Metrics

If the memory sort ratio falls below 90%, and you are on Oracle9i or later, investigate the use of pga_aggregate_target. Once the pga_aggregate_target has been set, Oracle automatically manages PGA memory allocation, based on the individual needs of each Oracle connection. Oracle9i or later allows the pga_aggregate_target parameter to be modified at the instance level with the alter system command, thereby lets you dynamically adjust the total RAM region available to Oracle9i. Oracle9i also introduced a new parameter called workarea_size_policy. When this parameter is set to automatic, all Oracle connections benefits from the shared PGA memory. When workarea_size_policy is set to manual, connections allocates memory according to the values for the sort_area_size parameter. Under the automatic mode, Oracle tries to maximize the number of work areas that are using optimal memory and uses one-pass memory for the others.

Correction

If you find a problem, do the following: Edit the Init.ora or SPFILE file for the database. Increase the amount of sort_area_size to a higher value. Take caution to not over-allocate; ensure enough free memory exists on server before increasing value. EVERY user receives this amount for sorting. Cycle the Oracle server when possible to allow the new value to take effect. Monitor new value to see if performance improves. In addition to increasing the amount of memory devoted to sorting, find inefficient SQL that cause needless sorts. For example, UNION ALL does not cause a sort whereas UNION does in an SQL query (to eliminate duplicate rows). DISTINCT is frequently misapplied to SQL statements and causes unnecessary sort actions. There are times you simply cannot stop sort activity. This being the case, try to keep it in memory whenever possible. However, large data warehousing systems oftentimes simply exhaust RAM sort allotments, so if disk sorts must occur, try to ensure three things: Your user's TEMPORARY TABLESPACE assignment is not the SYSTEM tablespace, which is the default assignment. NOTE: For Oracle9i or later, you can specify a default tablespace other than SYSTEM for every user account that is created. The TEMPORARY TABLESPACE assigned to your users is placed on a fast disk. The TEMPORARY TABLESPACE has the tablespace parameter TEMPORARY assigned to it, which allows sort activity to be performed in a more efficient manner.

Parse/Execute Ratio

Each time a new SQL statement is submitted to Oracle, the kernel must ‘parse’ the statement, which involves syntax checks, security checks, and object validations. The Parse/Execute Ratio shows the percentage of SQL executed that did not incur a hard parse.

Note: This statistic is available on the Home page, Memory home page, and the Users home page.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane Oracle Performance Analyst Statistics > Users View (Oracle Performance Analyst) > Key User Analysis Pane

Metrics

Seeing low values might indicate that users are firing SQL with many hard-coded literals instead of using bind variables within an application. High values (90% and greater) generally indicate Oracle is saving precious CPU by avoiding heavy parse tasks.

Correction

The best way to reduce unnecessary parse activity is to encourage SQL statement reuse. This can be done by promoting SQL execution through the use of stored procedures or applications where bind variables can be used. Oftentimes, literals in otherwise identical SQL statements can cause unneeded parse work for Oracle. The use of bind variables can counter that problem. If you determine that SQL literals are causing SQL to not be reused, do the following: Change the cursor_sharing parameter to FORCE by using the ALTER SYSTEM SET cursor_sharing=FORCE command. Monitor database to see if parse activity is reduced and library cache reloads shrink. If using an SPFILE, save the new configuration values so Oracle reuses them each time the database is stopped and re-started. If using an Init.ora file, add the cursor_sharing=FORCE parameter to the file.

Buffer Busy Waits

Buffer busy waits occur when a process needs to access a data block in the buffer cache, but cannot because it is being used by another process. A wait event generally happens because a buffer is being read into the buffer cache by another process or the buffer is in the buffer cache, but cannot be switched to a compatible mode immediately.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Home > Bottleneck Analysis pane

Metrics

Buffer busy waits normally center around contention for rollback segments, too small an INITRANS setting for tables, or insufficient free lists for tables.

Correction

You can use the automatic segment management feature in Oracle9i locally-managed tablespaces to help make free list problems a thing of the past. Using an UNDO tablespace in 9i or later can help remedy any rollback contention problem. You can also obtain which objects have actually experienced buffer busy waits in Oracle9i or later by querying the sys.v_$segment_statistics. This view is not populated unless the configuration parameter statistics_level is set to TYPICAL or ALL.

Rollback Contention Ratio

Rollback segments are used by Oracle to hold data needed to rollback any changes made through inserts, updates, or deletes to various Oracle objects. They also allow Oracle to have read consistency for long running queries, are used for recovery purposes, and play a role during exports of database information. In a heavy transaction processing environment, rollback segments are accessed continuously and therefore are subject to contention problems. The Rollback Contention Ratio helps identify contention occurring on the system relating to rollbacks.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Objects View (Oracle Performance Analyst) > Home > Key Ratio Analysis Pane

Metrics

Overall, if the rollback contention ratio approaches 1% or more, create more rollback segments. Also consider creating a specialized, larger, rollback segment to be used by long running transactions. Doing so alleviates dynamic rollback extensions and cuts down heavily on ORA-01555 Snapshot Too Old errors.

Correction

Begin by creating new rollback segments and altering them to be online for use. Then monitor the overall contention ratio to see if it begins to drop. If you are using Oracle9i or later, consider using an UNDO tablespace and allowing Oracle to automatically control rollback segment management.

Latch Miss Ratio

Protecting the many memory structures in Oracle's SGA are latches. They ensure that one and only one process at a time can run or modify any memory structure at the same instant. Much more restrictive than locks (which at least allow for some collective user interaction), latches have no queuing mechanism - so either you get it or you do not and are forced to continually retry.

The latch miss ratio defines the number of times a process obtained a willing-to-wait latch vs. missing the attempt.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane

Metrics

If the latch miss ratio exceeds 1%, you should take action to resolve the amount of latch contention.

Correction

Examine the details regarding the latch contention. Increasing the shared_pool_size can assist in latch problems also. The table below describes latches:

Latch Description

Cache buffer chain latch

Protects paths to database block buffers in the buffer cache. High I/O loads tend to cause contention for this latch. You can alleviate contention somewhat by adding more buffers to the cache (through the db_block_buffers or db_cache_size parameter) or by adding more LRU latch chain latches with the db_block_lru_latches parameter.

Library cache latches

Protects cached SQL statements in the library cache area of the Oracle shared pool. Contention for this latch is the usual result of literals being used for SQL statements instead of bind variables.

Parallel Query Busy Ratio

Oracle's parallel query feature, when used properly, allows for terrific increases in performance for many SQL and utility operations. Parallel operations can be introduced through SQL hints, specified in an object's DDL, or used in command line arguments for utility programs (like SQL*Loader). To effectively service parallel query requests, ensure that enough query servers exist in the database instance. The Parallel Query Busy Ratio is an indicator of how busy all the servers are on the database in question.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane

Metrics

If the Parallel Query Busy Ratio approaches 80-90%, add more query servers to the database, or examining parallel requests to ensure they are being used in an efficient and necessary manner.

Correction

To fix, do the following: Edit the Init.ora file or SPFILE for the database. Increase the amount of parallel_max_servers to a higher value. Cycle the Oracle server when possible to allow the new value to take effect. Monitor new value to see if performance improves. You can also investigate the use of the parallel_automatic_tuning parameter.

Free Shared Pool Percent

Oracle's shared pool need not use all of the memory given to it through the shared_pool_size parameter. If the database does not have many object and code definitions to reference, then the shared pool can contain an abundance of free memory that is not being used.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Key Ratio Analysis Pane Oracle Performance Analyst Statistics > Memory View (Oracle Performance Analyst) > Key Ratio Analysis pane

Metrics

Under-allocating the shared pool size can have a serious impact on your database's performance, but over-allocating the shared pool can have run time ramifications as well. If you have a good chunk of memory allocated to the Oracle shared pool that is never used, it might be more of a performance enhancement to reduce the shared pool amount and instead give the memory to the buffer/data cache, or even back to the operating system itself. In terms of knowing when to reduce the shared pool, a good benchmark is continually seeing 2-3MB of free memory. On the other hand, if after an hour or so of beginning database operation, you see that virtually no free memory is left in the shared pool, or you are seeing ORA-4031 errors (that indicate definitions cannot find enough contiguous free space in the shared pool), increase the pool by 10% or more.

Correction

If you continuously see little or no free memory in the shared pool, do the following: Increase the size of the shared_pool_size parameter through use of the ALTER SYSTEM SET shared_pool_size command value. Take caution to not over-allocate; ensure enough free memory exists on server before increasing value. Monitor the new value to see if performance improves. If using an SPFILE, save the new configuration values so Oracle reuses them each time the database is stopped and re-started. You can also use the ALTER SYSTEM FLUSHED SHARED_POOL command to remove all objects from the shared pool and start with a clean slate.

Bottleneck Analysis Pane

When an Oracle database is unning, every connected process is either busy doing work or waiting to perform work. A process that is waiting can mean nothing in the overall scheme of things or it can be an indicator that a database bottleneck exists. And this is where wait-based or bottleneck analysis comes into play. DBAs used this form of performance analysis to determine if perceived bottlenecks in a database are contributing to a performance problem.

Bottleneck analysis is a valid method of measuring performance because it helps you track where a database has been spending its time. If latch contention or heavy table scan activity has been dragging a database’s performance down, you can use bottleneck analysis to confirm the actual root cause. Once one or more wait events or other bottlenecks have been pinpointed as possible performance vampires, you can determine which sessions and objects are causing the problem.

For wait event analysis to be properly carried out, it is imperative that the timed_statistics initialization parameter be set to TRUE. By default this parameter is set to FALSE, which disallows the collection of wait times for each wait event defined to the Oracle engine. For one to really understand the impact of wait events on database performance, you need to discover what the database is or has been waiting on, and the durations of the waits. Having both allows a complete picture to be formed regarding the magnitude of wait-initiated performance degradations. Almost all Oracle experts now agree that allowing the collection of timing statistics adds little if anything to database overhead, so setting timed_statistics to TRUE should not be a worry. The Performance Analyst Home page identifies the top system and session waits that can decrease performance.

When using bottleneck analysis, you cannot rely only on the information contained in the wait event views that Oracle provides. For example, an object can attempt to extend into another extent of space in a tablespace and yet be denied if no such free space exists. Such a failure is not reflected in any wait event, but still represents a very real bottleneck to the database. In the same way that you cannot depend on only a few ratios to properly carry out ratio-based performance analysis, an administrator must include other statistical metrics in their overall bottleneck analysis framework to obtain an accurate performance risk assessment. Performance Analyst works to identify bottlenecks in your database that fall outside of pure wait events so you can get a total picture of all stoppages in your system. The Performance Analyst Home page displays space-related bottlenecks as well as other sources of contention/bottlenecks that can decrease the overall performance of your system.

The following bottleneck indicators are used on the Performance Analyst Home page to succinctly communicate the general overall performance level of the monitored database:

Top System Bottlenecks

When viewing wait statistics, there are many levels of detail that you can view. The first level is the system view, which provides a global, cumulative snapshot of all the waits that have occurred on a system. Viewing these numbers can help you determine which wait events caused the most commotion in a database so far. The Top System Bottlenecks section identifies the top waits that have occurred on the Oracle database based on the number of waits per event.

For Oracle 10g only you see a graph that provides a historical look at the last hour and allows you to see what wait classes experienced the most wait activity in terms of time waited. To get detail into which wait events have accumulated the most time, you can drill down into the System Waits view.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Bottleneck Analysis Pane

Correction

Appendix A in the Oracle Reference manual contains a listing and description of every current wait event defined in Oracle. DBAs unfamiliar with what each event represents should keep this listing close by as they examine wait-based event metrics. For example, a ‘db file scattered read’ event is typically indicative of table scan operations. If you see many of these events, then you can begin to see if large table scans are occurring in the database. Like the ‘db file scattered read’ event, each wait event has its own meaning and individual end-resolution diagnosis. After looking at system-level wait activity, you can discover which current connections are responsible for waits at the system level. Performance Analyst reports on historical and current wait events at the session level, making this investigation easy to accomplish

Top Session Bottlenecks

When viewing wait statistics, there are many levels of detail that you can view. The first level is the system view, which provides a global, cumulative snapshot of all the waits that have occurred on a system. The second level is a historical look at waits from a session level. The third is which sessions are currently experiencing waits. The Top Session Bottlenecks section identifies the top sessions that are currently waiting, based on their wait time in seconds.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Bottleneck Analysis Pane

Correction

Oracle documentation contains a listing and description of every current wait event defined in Oracle. DBAs unfamiliar with what each event represents should keep this listing close by as they examine wait-based event metrics. The most common wait viewed at the session level is an ‘enqueue’ wait, which typically identifies lock contention. If enqueue waits are observed, then you can check the “Current Object Blocks” count on the Performance Analyst Home page, as well as the Users View (Oracle Performance Analyst) which displays locks and blocking locks detail. As with an enqueue event, each wait event has its own meaning and individual end-resolution diagnosis.

Problem Tablespaces

The Problem Tablespaces metric is a count of all tablespaces in the database whose free space percentage amount has fallen below a user-defined limit.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Bottleneck Analysis Pane

Metrics

A rule of thumb for dynamic tablespaces (those with growing objects) is to keep a minimum of 10-15% free space available for object growth.

Correction

There are two ways to prevent a tablespace from running out of available free space: Turn AUTOEXTEND on for the underlying tablespace's datafiles. This allows them to automatically grow when free space in the datafile has been exhausted. Using the ALTER TABLESPACE … ADD DATAFILE… command, you can manually add a new datafile to a tablespace that is about to run out of available free space.

Problem Objects

The Problem Objects statistic is a count of all objects in the database that are in danger of reaching their maximum extent limit or cannot allocate their next extent of free space because of a lack of overall or contiguous free space in their parent tablespace.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Bottleneck Analysis Pane

Metrics

Modify any object approaching their maximum extent limit or unable to allocate a new extent of space so they can continue to grow in size.

Correction

Depending on the situation, there are a number of things you can do to prevent object extent problems: Turn AUTOEXTEND on for the underlying parent tablespace's datafiles. This allows a tablespace to automatically grow when free space in the datafile has been exhausted, and allows an object to extend even when little or no current free space is available. Using the ALTER TABLESPACE … ADD DATAFILE… command, you can manually add a new datafile to a tablespace that is about to run out of available free space. You can alter an object that is at or near their maximum extent limit so that the object has unlimited extents. You can use locally-managed tablespaces to ensure that no object ever reaches its maximum extent limit, because all objects are allowed unlimited extents. An object can be reorganized into another tablespace or reorganized in general to reduce the number of extents the object currently takes up.

Current Object Blocks

A single blocking user has the potential to stop work for nearly all other processes on a small system, and can cause major headaches even on large systems. Although Oracle supports unlimited row-level locking, blocking lock situations do crop up. Blocks are most often caused by user processes holding exclusive locks and not releasing them via a proper COMMIT frequency.

Note: This statistic is also called Sessions Blocked on the Users View (Oracle Performance Analyst) and Session Blocks on the Objects View (Oracle Performance Analyst) Home page.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Bottleneck Analysis Pane

Metrics

Investigate any indicator above zero immediately before the situation has a chance to mushroom.

Correction

Once discovered, a blocking lock situation can normally be quickly remedied. You can issue a KILL against the offending process, which eliminates the user's stranglehold on the objects they were accessing. Other user processes then nearly almost always complete in an instant. Discovering the blocked lock situation is made easier by using tools like Performance Analyst, but preventing the blocking lock situation in the first place is where it gets tricky. You can look at the Users View (Oracle Performance Analyst) detail and view all current blocking locks to see exactly which sessions are holding the currently restrictive locks. The culprit of blocking lock scenarios is usually the application design, or the SQL being used within the application itself. Properly coding an application to reference database objects in an efficient order, and then using the right SQL to get the job done, is an art. Most DBAs who have had to face Oracle Forms applications have suffered through the dreaded SELECT … FOR UPDATE statements that place unnecessary restrictive locks on nearly every read operation, and know all too well that good coding practice is important. The key to avoiding lock contention is to process user transactions in the quickest and most efficient manner possible - something not always easy to do.

Enqueue Waits

An enqueue is an advanced locking device that allows multiple database processes to share certain resources. Enqueue waits typically occur when sessions wait to be granted a requested lock. Sometimes these locks are internal Oracle locks while other times they could be locks for rows of data. Enqueues are issued implicitly by Oracle.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Top Session Bottlenecks Oracle Performance Analyst Statistics > Objects View (Oracle Performance Analyst) > Bottleneck Analysis Pane

Metrics

Investigate any enqueue waits that read consistently above one or more (delta statistics).

Correction

Removing contention for enqueues is almost always an application design issue. If many enqueue waits are seen, this normally indicates either contention for certain rows in the database, or the result of database-initiated lock escalation. Examine the use of indexes to make sure all referencing foreign keys are indexes and that SQL is tuned to not tarry over rows in the database during modification operations.

Enqueue waits can also be the result of space management tasks (such as objects extending) and disk sorts (mainly in tablespaces that do not make use of the TEMPORARY tablespace parameter).

Free List Waits

Free lists are lists of Oracle data blocks that contain free space for an Oracle object. Every table has at least one free list. Free lists are used to locate free blocks of space when a request is made of a table for the insertion of a row. Free list contention can reduce the performance of applications when many processes are involved in the insertion of data to the same table.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Top Session Bottlenecks Oracle Performance Analyst Statistics > Objects View (Oracle Performance Analyst) > Bottleneck Analysis Pane

Metrics

If consistent numbers for free list waits continues to appear, add additional free lists to the most dynamic database objects through the use of the STORAGE parameter. Another indicator of insufficient free lists are consistent, non-zero numbers for the buffer busy wait event.

Correction

If free list waits are observed, add more free lists to tables and indexes with high insert rates. For Oracle9i or later, objects can be created inside of locally-managed tablespaces that use the automatic segment management feature, which entirely does away with free lists.

Storage Analysis Pane

While DBAs focus on memory settings and tuning SQL, storage problems can play a major role in wrecking an otherwise well-running database.

Storage problems generally take one of two forms:

  1. Hit-the-wall, which can bring things to a complete standstill.
  2. Performance vampire, which slowly drains the performance of a database over time.

Storage problems have the capability to bring the curtain down on a database very quickly, as in the case of an archive log destination running out of free space. But storage problems can also silently work behind the scenes to slowly but surely rob a database of its performance. For example, you can access a hub table in a busy database when an application is first given life, but over time, if it develops a heavy migrated row problem, it can cause things to run very differently.

The Storage Analysis section of the Performance Analyst Home page reports on the total used and free space of all tablespaces in the database as well as whether the database is running in Archive Log mode. For more detailed information on database space, use the Performance Analyst Space View (Oracle Performance Analyst).

Total Used Space/Total Free Space

These statistics represent the total used and free space available in all tablespaces/datafiles in the database. Although good to know, a more detailed listing by tablespace is needed to determine where any actual space shortages exist in the database.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Storage Analysis Pane

Metrics

If any one tablespace begins to approach 90% used (and the objects contained within it are dynamic and growing as opposed to static), take action to prevent any future space allocation errors.

Correction

Here are a some things you can do to prevent a tablespace from running out of available free space: Turn AUTOEXTEND on for the underlying tablespace's datafiles. This allows them to automatically grow when free space in the datafile has been exhausted. Using the ALTER TABLESPACE … ADD DATAFILE… command, you can manually add a new datafile to a tablespace that is about to run out of available free space.

Archive Log

Oracle can be placed into archivelog mode, which tells the database to make copies of the online redo log files for point-in-time recovery purposes. The Archive Log statistic displays whether the database is running in archivelog mode or not. This information is displayed in the Storage Analysis section of the Performance Analyst Home page. For more information, see Storage Analysis Pane.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Storage Analysis Pane

SQL Analysis Pane

Much of a database's overall performance can be attributed to SQL statement execution. Poorly optimized SQL statements can drag an otherwise well-configured database down in terms of end user response times. SQL statements that use much memory can also cause a problem in a database.

Before you can identify problem SQL in your database, you have to ask the question, “What is bad SQL?” What criteria do you use when you begin the hunt for problem SQL in your critical systems? Understand that even the seasoned experts disagree on what constitutes efficient and inefficient SQL; so there is no way to sufficiently answer this question to every Oracle professional's satisfaction. The table below lists some general criteria you can use when evaluating the output from various database monitors or personal diagnostic scripts:

Criteria Description

Overall Response (Elapsed) Time

The amount of time the query took to parse, execute, and fetch the data needed to satisfy the query. It should not include the network time needed to make the round trip from the requesting client workstation to the database server. This statistic is available in Oracle9i or later.

CPU Time

The amount of CPU time the query took to parse, execute, and fetch the data needed to satisfy the query.

Physical I/O

This is often used as the major statistic in terms of identifying good vs. bad SQL, this is a measure of how many disk reads the query caused to satisfy the user's request. While you certainly want to control disk I/O where possible, it is important that you not focus solely on physical I/O as the single benchmark of inefficient SQL. Make no mistake, disk access is slower than memory access and also consumes processing time making the physical to logical transition, but you need to look at the entire I/O picture of a SQL statement, which includes looking at a statements' logical I/O as well.

Logical I/O

the number of memory reads the query took to satisfy the user's request. The goal of tuning I/O for a query should be to examine both logical and physical I/O, and use appropriate mechanisms to keep both to a minimum.

Repetition

The number of times the query has been executed. A problem in this area is not as easy to spot as the others unless you know your application well. A query that takes a fraction of a second to execute can still cause a headache on your system if it is executed erroneously (for example, a query that executes in a runaway PL/SQL loop) over and over again.

There are other criteria that you can examine like sort activity or access plan statistics (that show items like Cartesian joins and the like), but more often than not, these measures are reflected in the criteria listed above. Fortunately, Oracle records all the above measures (some only in 9i), which makes tracking the SQL that has been submitted against an Oracle database much easier.

Metrics

When you begin to look for inefficient SQL in a database, there are two primary questions you want answered:

  1. What HAS been the worst SQL that has historically run in my database?
  2. What IS the worst SQL that is running right now in my database?

When troubleshooting a slow system, you should be on the lookout for any query that shows an execution count that is significantly larger than any other query on the system. It can be that the query is in an inefficient PL/SQL loop, or other problematic programming construct. Only by bringing the query to the attention of the application developers will you know if the query is being mishandled from a programming standpoint.

There is the possibility that the SQL statement just is not tuned well. To determine that, you can go further into Performance Analyst's Top SQL view and, if you have IDERA SQL Tuner installed, you can port the SQL over to SQL Tuner to better optimize the statement.

Workload Analysis Pane

When your database experiences performance slowdowns, you should answer the following:

  • Who is currently connected to the database?
  • What resources are they using?
  • What are they currently executing?

The Workload Analysis section of the Performance Analyst Home page provides insight into the leading resource hogs of a database as well as a count of total active and inactive processes. You can view detailed information into what each leading session is currently involved with.

The Workload Analysis section of the Home page includes:

The buffer cache hit ratio is an indicator of how often user requests for data are satisfied through memory vs. being physically read from disk. Data read from memory produces user response times many times faster than when that same data is read from disk. Keeping physical I/Os to an absolute minimum is one of the Oracle buffer cache's purposes in life.

The table below describes the key counters Performance Analyst uses to calculate the buffer cache hit ratio:

Key Counter Description

DB BLOCK GETS

Data read from memory for DML operations.

CONSISTENT GETS

Data read from rollback segments in memory.

PHYSICAL READS

Data read physically from disk.

Direct Reads

Data read physically from disk that bypasses the buffer cache. Direct reads are filtered out of overall physical reads so an accurate cache hit ratio can be determined.

Dividing the data read from memory by data read from disk yields the cache hit ratio.

Metrics

To help ensure excellent performance, you want to keep your cache hit ratio in the neighborhood of 90% or higher. However, every database has its own ‘personality’ and can exhibit excellent performance with below average readings for the cache hit ratio. Excessive logical I/O activity can produce a very high cache hit ratio while actually degrading overall database performance.

Investigate consistent low readings of 60% or less.

Note: The db_cache_size parameter may need adjusting. The db_cache_size parameter is dynamic and can be altered without stopping and starting the database instance.

Correction

If a problem is found, do the following:

  • Increase the size of the db_cache_size parameter through use of the ALTER SYSTEM SET db_cache_size command value (take caution to not over-allocate; ensure enough free memory exists on server before increasing value).
  • Monitor the new value to see if performance improves.
  • If using an SPFILE, save the new configuration values so Oracle reuses them each time the database is stopped and re-started.

Top Processes

When the database population as a whole experiences a system slowdown, it is not uncommon to find one or two users who are responsible for bringing the system to its knees. In the best of worlds, users have an evenly divided amount of memory usage, disk I/O, CPU utilization, and parse activity. Unfortunately, this is not usually the case. Users submit large batch jobs during peak OLTP activity, or when sessions that are firing off untuned queries on a critical system.

If you are seeing a slowdown in your database, and cannot seem to find a root cause, one thing to examine is the resource consumption of the leading sessions on a system. A handful of users can overwhelm the I/O capabilities of Oracle (through untuned queries or runaway batch jobs) or hammer the CPU or memory structures of the database and server.

Performance Analyst makes it easy to pinpoint the top sessions by showing the leading processes at the physical I/O, logical I/O, memory, and CPU usage levels.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Workload Analysis Pane

Metrics

If any one session uses more than 50% of a total resource (CPU, memory, etc.) go into the session to find out what they are currently executing.

Active User Processes

The Active User Processes statistic is the total number of active and open threads reported in the database. Active Sessions displays the number of processes actively performing work.

For more information, see Inactive User Processes.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Workload Analysis Pane

Inactive User Processes

This metric is the total number of threads logged on to the database that are idle at the current time.

Location

Oracle Performance Analyst Statistics > Home View (DBArtisan - Oracle Performance Analyst) > Workload Analysis Pane

Metrics

A large number of inactive users could indicate user sessions that have mistakenly been left logged on. Because each user thread consumes a portion of memory on the Oracle server, sever any sessions not needing a connection to reduce resource usage.

Correction

Double-click this statistic to open the Top Sessions Tab of the Users Detail page. On this tab you can check sessions that have many seconds idle and/or that have been logged on for very long periods of time as indicated by the logon time column. After verifying that the session is no longer necessary, you can KILL the session. For more information, see Active User Processes.