Monday, 5 March 2012

Large number of instances in the SOA 11g dehydration store causes EM console performance issues


SOA Suite 11g R1 (11.1.1.3, 11.1.1.4, 11.1.1.5)

Scenario and Symptoms:

The BPEL Engine Audit level is set to Development.

The BPEL engine is load tested with a very large number of transactions, which generates a large number of instances in the dehydration store (SOAINFRA schema).

The number of instances is, say, 10-50 lakh (1-5 million). The developers complain that the EM console is very slow: drilling into composites and instances takes minutes. The following activities in the EM console take a long time:

1. Loading the EM login page
2. Logging in to EM
3. Rendering the home page for SOAINFRA
4. Expanding SOAINFRA and each partition within it
5. Rendering the home page for each deployed composite
6. Rendering the details for each instance of any deployed composite

Cause:

There is a large number of instances in the dehydration store, and the audit level is set to Development. When you log in to the EM console, it tries to load large amounts of instance and fault data from the database, which slows down the console's response time.

Solution:

Improving the Loading of Pages in Oracle Enterprise Manager Fusion Middleware Control Console
You can improve the loading of pages that display large amounts of instance and fault data in Oracle Enterprise Manager Fusion Middleware Control Console by setting two properties in the Display Data Counts section of the SOA Infrastructure Common Properties page.

These two properties enable you to perform the following:
  • Disable the fetching of instance and fault count data to improve loading times for the following pages:
    • Dashboard pages of the SOA Infrastructure, SOA composite applications, service engines, and service components
    • Delete with Options: Instances dialog
    These settings disable the loading of all metrics information upon page load. For example, on the Dashboard page for the SOA Infrastructure, the values that typically appear in the Running and Total fields in the Recent Composite Instances section and the Instances column of the Deployed Composites section are replaced with links. When these values are large, it can take time to load this page and other pages with similar information.
  • Specify a default time period that is used as part of the search criteria for retrieving recent instances and faults for display on the following pages:
    • Dashboard pages and Instances pages of the SOA Infrastructure, SOA composite applications, service engines, and service components
    • Dashboard pages of services and references
    • Faults and Rejected Messages pages of the SOA Infrastructure, SOA composite applications, services, and references
    • Faults pages of service engines and service components
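
To see why the default time period helps, here is a toy Python model of a time-restricted search. It is not how EM actually queries the SOAINFRA schema; the record layout, the `recent_instances` helper and the 24-hour window are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def recent_instances(instances, now, window_hours=24):
    """Return only the instances created within the last `window_hours`."""
    cutoff = now - timedelta(hours=window_hours)
    return [inst for inst in instances if inst["created"] >= cutoff]


# Hypothetical data: one instance per hour going back 30 days.
now = datetime(2012, 3, 5, 12, 0, 0)
instances = [
    {"id": i, "created": now - timedelta(hours=i)}
    for i in range(30 * 24)
]

recent = recent_instances(instances, now, window_hours=24)
print(len(instances), "instances stored,", len(recent), "inside the window")
```

The point of the sketch is only that the page has to deal with the rows inside the window, not the whole history, so the cost of rendering stops growing with the total number of instances.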

Other Suggestions/Best Practices:

1. Purge the instances. The moment we purge the instances, we see good performance. The flip side is that you cannot purge data regularly in production (to meet SLAs). My take is that on a development server you can afford to purge instances frequently. In Stage/Prod the audit level would be set to Production, so performance issues due to a large number of instances should not be seen. After you purge the dehydration store, make sure you shrink the SOAINFRA tables along with their indexes (or rebuild the indexes).

2. Set the Audit Level to Production/Off. The flip side is that developers won't be able to troubleshoot issues with their composites. Go ahead with these settings in Production.

3. Another strategy is switching the audit configuration to 'Deferred', which allows the auditing operations to be performed asynchronously, resulting in performance comparable to setting the audit level to disabled. Please refer to Tuning BPEL audit performance [ID 1328382.1]. This is recommended in production and can also be applied to Dev/Stage environments.

4. As the number of records in the dehydration store grows, the EM console takes longer to return information about instances. I believe this is a result of badly performing queries and full table scans. Generate an AWR report and see if you can tune some queries and build some indexes on the tables. Get this done by the DBAs.

5. Use a fast single-threaded server (e.g. M5000/9000) for the database instead of slow CMT servers like the Sun T5140/5240. Please refer to Migration from fast single threaded CPU machine to CMT UltraSPARC T1 and T2 results in increased CPU reporting and diminished performance [ID 781763.1].

6. Try using M series boxes for the application tier as well. But if you have to use CMT servers like the T5240, make sure you follow note 860459.1 and apply the steps in its solution section to adapt all the components to the CMT machine.
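
Suggestion 1 above mentions shrinking the SOAINFRA tables and their indexes after a purge. A minimal sketch of what that could look like, written as a small Python helper that only prints Oracle DDL for a DBA to review. The `shrink_statements` helper is hypothetical, the table names are examples and not a complete list, and a segment shrink assumes the tables sit in a tablespace that supports it, so treat this as a starting point and not a tested procedure.

```python
def shrink_statements(tables):
    """Build Oracle DDL that shrinks each table and its dependent indexes."""
    statements = []
    for table in tables:
        # Row movement must be enabled before a segment shrink.
        statements.append(f"ALTER TABLE {table} ENABLE ROW MOVEMENT")
        # CASCADE extends the shrink to dependent segments such as indexes.
        statements.append(f"ALTER TABLE {table} SHRINK SPACE CASCADE")
        # Put the table back the way it was found.
        statements.append(f"ALTER TABLE {table} DISABLE ROW MOVEMENT")
    return statements


# Example table names only; confirm them against your own SOAINFRA schema.
tables = ["CUBE_INSTANCE", "AUDIT_TRAIL", "COMPOSITE_INSTANCE"]
for stmt in shrink_statements(tables):
    print(stmt + ";")
```

The script deliberately does not connect to the database. Printing the statements keeps the decision of when and whether to run them with the DBAs.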

