Wednesday, January 15, 2014

New Technical article: Back Up a Thousand Databases Using Enterprise Manager Cloud Control 12c


I am pleased to announce that a new technical article of mine has been published (January 2014) on the Oracle Technical Network.

Back Up a Thousand Databases Using Enterprise Manager Cloud Control 12c

This detailed technical article explains the setup and scheduling of full and incremental RMAN database backups for thousands of databases using Enterprise Manager Cloud Control (Enterprise Manager) 12c, and how this is easier and more efficient than the older, more time-consuming manual method of writing Unix shell scripts, RMAN scripts, and cron jobs for each database to be backed up.

And with the Database Group Backup feature new to Enterprise Manager Cloud Control 12c, it can be even faster to set up RMAN backups for multiple databases - even if there are thousands - that are part of an Enterprise Manager Database Group.

The article also highlights the advantages of using PDBs in Oracle Database 12c and backing them up using RMAN. RMAN cannot back up individual schemas, and it has always been difficult to perform point-in-time recovery (PITR) at an individual schema level, since schemas can easily be distributed across multiple tablespaces. The advantage of using PDBs in a Container Database is that you can easily set up RMAN backups at the Container Database level, and yet perform PITR at the PDB level. This is a clear technical advantage of the Multitenant architecture of Oracle Database 12c.

The setup and scheduling of RMAN database backups forms part of the Base Database Management features of Enterprise Manager, which enable numerous customers to use Enterprise Manager 12c more and more. In fact, I personally introduced Enterprise Manager to HDFC Bank in India in 2007 for their RMAN backups. That was the first time they used it, and today they are a DBaaS-Exadata reference customer who has presented at OOW for the last 2 years.



Thursday, January 9, 2014

What is EM 12c DBaaS Snap Clone?

Happy New Year to all! Since this is the first blog post of the new year, let's look at a relatively new feature in EM that has gained significant popularity over the last year - EM 12c DBaaS Snap Clone.
The ‘Oracle Cloud Management Pack for Oracle Database’, a.k.a. the Database as a Service (DBaaS) feature in EM 12c, has grown tremendously since its release two years ago. It started with basic single instance and RAC database provisioning, a technical service catalog, an out-of-the-box self service portal, metering and chargeback, etc. Since then we have added provisioning of schemas and pluggable databases, full clones using RMAN backups, and Snap Clone. This video showcases the various EM12c DBaaS features.
This blog will cover one of the most exciting and popular features – Snap Clone. In one line, Snap Clone is a self service way of creating rapid and space efficient clones of large (~TB) databases.
Self Service - empowers the end users (developers, testers, data analysts, etc.) to get access to database clones whenever they need them.
Rapid - implies the time it takes to clone the database. This is in minutes and not hours, days, or weeks.
Space Efficient - represents the significant reduction in storage (>90%) required for cloning databases 
Customer Scenario 
To best explain the benefits of Snap Clone, let’s look at a Banking customer scenario:
  • 5 production databases total 30 TB of storage
  • All 5 production databases have a standby
  • Clones of the production database are required for data analysis and reporting
  • 6 total clones across different teams every quarter
  • For security reasons, sensitive data has to be masked prior to cloning
Based on the above scenario, the storage required, if using traditional cloning techniques, can be calculated as follows:
5 Prod DB                  = 30 TB
5 Standby DB            = 30 TB
5 Masked DB             = 30 TB (These will be used for creating clones)
6 Clones (6 * 30 TB) = 180 TB
Total                           = 270 TB
Time = days to weeks
As the numbers indicate, this is quite horrible. Not only does 30 TB turn into 270 TB, but creating 6 clones of all production databases would also take forever. In addition, there are other issues with data cloning, such as:
  • Lack of automation. Scripts are good but often not a long term solution.
  • Traditional cloning techniques are slow, while existing storage vendor solutions are DBA unfriendly
  • Data explosion often outpaces storage capacity and hurts IT's ability to provide clones for dev and testing
  • Archaic processes that require multiple users to share a single clone, or that support only fixed refresh cycles
  • Different priorities between DBAs and Storage admins
Snap Clone to the Rescue 
All of the above issues lead to slow turnaround times, and users have to wait for days and weeks to get access to their databases. Basically, we end up with competing priorities and requirements, where the user demands self service access, rapid cloning, and the ability to revert data changes, while IT demands standardization, better control, reduction in storage and administrative overhead, better visibility into the database stack, etc.
EM 12c DBaaS Snap Clone tries to address all these issues. It provides:
  • Rapid and space efficient cloning of databases by leveraging storage copy-on-write (or similar) technology
  • Support for all database versions from 10g to 12c
  • Support for various storage vendors and configurations, both NAS and SAN
  • Lineage and association tracking between clone master and its various clones and snapshots
  • 'Time Travel' capability to restore and access past data
  • Deep visibility into storage, OS, and database layer for easy triage of performance and configuration issues
  • Simplified access for end user via out-of-the-box self service portal
  • RESTful APIs to integrate with custom portals and third party products
  • Ability to meter and charge back on the clone databases
So how does Snap Clone work?
The secret sauce lies in the Storage Management Framework (SMF) plug-in. This plug-in sits between the storage system and the DBA, and provides the much needed layer of abstraction required to shield DBAs and users from the nuances of the different storage systems. At the storage level, Snap Clone makes use of storage copy-on-write (or similar) technology. There are two options in terms of using and interacting with storage:
1. Direct connection to storage: Here storage admins can register NetApp and ZFS storage appliances with EM, and then EM directly connects to the storage appliance and performs all required snapshot and clone operations. This approach requires you to license the relevant options on the storage appliance, but is the easiest, most efficient, and most fault tolerant approach.
2. Connection to storage via ZFS file system: This is a storage vendor agnostic solution and can be used by any customer. Here, instead of connecting to storage, the storage admin mounts the volumes on a Solaris server and formats them with the ZFS file system. All snapshot and clone operations required on the storage are then conducted via the ZFS file system. The good thing about this approach is that it does not require thin cloning options to be licensed on the storage, since the ZFS file system provides these capabilities.
For more details on how to set up and use Snap Clone, refer to a previous blog post.
Now, let's go back to our Banking customer scenario and see how Snap Clone helped them reduce their storage cost and time to clone.
5 Prod DB                      = 30 TB
5 Standby DB                 = 30 TB
5 Masked DB                 = 30 TB
6 Clones (6 * 5 * 2 GB) = 60 GB (previously 6 * 30 TB = 180 TB)
Total                               = 90 TB (previously 270 TB)
Time = minutes (previously days to weeks)
Assuming the clone databases will have minimal writes, we allocate about 2 GB of write space per clone. For 5 production databases and 6 clones, this totals just 60 GB of required storage space. Compared with the 180 TB that full clones would need, this is a whopping 99.97% savings in clone storage. Plus, these clones are created in a matter of minutes and not the usual days or weeks. The product has out-of-the-box charts that show the storage savings across all storage devices and cloned databases. See the screenshot below.
Snap Clone Savings
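The arithmetic behind these numbers can be checked with a short Python sketch. It is only a worked example of the banking scenario above: the function and constant names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB) and that the 30 TB is the combined size of all 5 production databases.

```python
# Worked example of the banking scenario; all names here are illustrative.
PROD_TB = 30.0          # combined size of the 5 production databases
NUM_DATABASES = 5
NUM_CLONES = 6          # clones requested per quarter
WRITE_SPACE_GB = 2.0    # write space allocated per clone database
GB_PER_TB = 1000.0      # assumption: decimal units


def traditional_total_tb():
    """Prod + standby + masked copy, plus 6 full copies for the clones."""
    return 3 * PROD_TB + NUM_CLONES * PROD_TB


def snap_clone_space_gb():
    """Only the write space is new; clones share blocks with the masked copy."""
    return NUM_CLONES * NUM_DATABASES * WRITE_SPACE_GB


def snap_clone_total_tb():
    """Prod + standby + masked copy, plus the clone write space."""
    return 3 * PROD_TB + snap_clone_space_gb() / GB_PER_TB


def clone_savings_pct():
    """Savings on clone storage only (60 GB versus 180 TB of full clones)."""
    full_clone_gb = NUM_CLONES * PROD_TB * GB_PER_TB
    return 100.0 * (1 - snap_clone_space_gb() / full_clone_gb)


print("Traditional total: %.2f TB" % traditional_total_tb())
print("Snap Clone total:  %.2f TB" % snap_clone_total_tb())
print("Clone storage savings: %.2f%%" % clone_savings_pct())
```

Note that the 99.97% figure applies to the storage used by the clones themselves; the production, standby, and masked copies still account for 90 TB.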
Where can you use Snap Clone databases?
As I said earlier, Snap Clone is most effective when cloning large databases (~TBs). Common scenarios where we see our customers make the best use of Snap Clone are:
  • Application upgrade testing. For example, E-Business Suite upgrade to R12.
  • Functional testing. For example, testing using production datasets.
  • Agile development. For example, run parallel development sprints by giving each sprint its own cloned database.
  • Data Analysis and Reporting. For example, stock market analysis at the close of market every day.
It's obvious that Snap Clone has a strong affinity to applications, since it's application data that you want to clone and use. Hence it is important to add that the Snap Clone feature, when combined with EM12c middleware-as-a-service (MWaaS), can provide a complete end-to-end self service application deployment experience. If you have existing portals or need to integrate Snap Clone with existing processes, then use our RESTful APIs for easy integration with third party systems.
In summary, Snap Clone is a new and exciting way of dealing with data cloning challenges. It shields DBAs from the nuances of different storage systems, while allowing end users to request and use clones in a rapid and self service fashion. All of this while saving storage costs. So try this feature out today, and your development and test teams will thank you forever.
In subsequent blog posts, we will look at some popular deployment models used with Snap Clone.
-- Adeesh Fulay (@adeeshf)

Database Lifecycle Management for Cloud Service Providers

Adopting the Cloud Computing paradigm enables service providers to maximize revenues while driving capital costs down through greater efficiencies of working capital and OPEX changes. In the case of an enterprise private cloud, corporate IT, which plays the role of the provider, may not be interested in revenues, but still cares about providing differentiated service at lower cost. The efficiency and cost eventually make the service profitable and sustainable. This basic tenet has to be satisfied irrespective of the type of service: infrastructure (IaaS), platform (PaaS), or software application (SaaS). In this blog, we specifically focus on the database layer and how its lifecycle gets managed by the Service Providers.

Any service provider needs to ensure that:
  • Hardware and software population are in control. As new consumers come in and some consumers retire, there is a constant flux of resources in the data center. The flux has to be managed and controlled
  • The platform for providing the service is standardized, so that operations can be conducted predictably and at scale across a pool of resources
  • Mundane and repeatable tasks like backup, patching, etc are automated
  • Customer attrition does not happen owing to heightened compliance risk
While the Database Lifecycle Management features of Enterprise Manager have been widely adopted, I feel that the applicability of the features with respect to service providers is not yet well understood and hence not fully appreciated. In this blog, let me try addressing how the lifecycle management features can be effective in addressing each of the above requirements.
1. Controlling hardware and software population:
Enterprise Manager 12c provides a near real-time view of the assets in a data center. It comes with out-of-box inventory reports that show the current population and the growth trend within the data center. The inventory can be further sliced and diced based on cost center, owner, etc. In a cloud, whether private or public, the target properties of each asset can be appropriately populated, so that the provider can easily figure out the distribution of assets. For example, a question such as how many databases are owned by the Marketing LOB can be easily answered. The flux within the data center is usually higher when virtualization techniques such as server virtualization and the Oracle 12c multitenant option are used. These technologies make the provisioning process extremely nimble, potentially leading to a higher number of virtual machines (VMs) or pluggable databases (PDBs) within the data center and hence accentuating the need for such ongoing reporting. The inventory reports can also be created using BI Publisher and delivered to non-EM users, such as a CIO.
Now, not all reports are always readily available. There can be situations where a data center manager seeks ad hoc information, such as how many databases owned by a particular customer are running on Exadata. This involves an ad hoc query based upon an association (a database running on Exadata) and target properties (the owner being the customer). Enterprise Manager 12c provides a sophisticated Configuration Search feature that lets administrators define such ad hoc queries and save them for reuse.
2. Standardization of platform:
The massive standardization of platform components is not merely a nice-to-have for a cloud service provider; it is a must-have. A provider may choose to offer various levels of services, tagged with levels such as gold, silver and bronze. However, for each such level, the platform components need to be standardized, not only for ease of manageability but also for ensuring consistency of QOS across all the tenants. So how can the platform be standardized? We can highlight two major Enterprise Manager 12c features here:
  • The ability to roll out gold images that can be version controlled within Enterprise Manager's Software Library. The inputs of the provisioning process can be "locked down" by the designer of the provisioning process, thereby ensuring that each deployment is a replica of the other.
  • The ability to compare the configuration of deployments (often referred to as the "Points of Delivery" of the services). This is a very powerful feature that supports 1-n comparisons across multiple tiers of the stack. For example, one can compare an entire database machine, from storage cells and compute nodes to databases, with one or more other database machines.
3. Automation of repeatable tasks:
A large portion of OPEX for a service provider is expended while executing mundane and repeatable tasks like backup, log file cleanup or patching. Enterprise Manager 12c comes with an automation framework comprising Jobs and Deployment Procedures that lets administrators define these repetitive actions and schedule them as needed. EMCC’s task automation framework is scalable and provides functions such as the ability to schedule, resume, and retry, which are of paramount importance when conducting mass operations in an enterprise-scale cloud. The task automation verbs are also exposed through the EMCLI interface. Oracle Cloud administrators make extensive use of EMCLI for large scale operations on thousands of tenant services.
One of the most popular features of Enterprise Manager 12c is the out-of-box procedures for patch automation. The patching procedures can patch the Linux operating system, clusterware and the database. For minimizing the downtime involved in the patching process Enterprise Manager 12c also supports out-of-place patching that can prepare the patched software ahead of time and migrate the instances one by one as needed. This technique is widely adopted by the service providers to make sure the tenants' downtime related SLAs are respected and adhered to. The co-ordination of such downtime can be instrumented by Enterprise Manager 12c's blackout functionality.
4. Managing Compliance risks:
In a service-driven model, the provider is liable in case of security breaches. The consumer and, in turn, the customers of the consumer's apps need to be assured that their data will not be breached owing to platform-level vulnerabilities. Security breaches often happen owing to faulty configuration such as default passwords, relaxed file permissions, or an open network port. The hardening of the platform therefore has to be done at all levels: OS, network, database, etc. To manage compliance, administrators can create baselines referred to as Compliance Standards. Any deviation from the baselines triggers a compliance violation notification, alerting administrators to resolve the issue before it creates risk in the environment.
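The baseline idea can be sketched in a few lines of Python. This is not how Enterprise Manager evaluates a Compliance Standard; it is a minimal illustration of "compare the observed configuration with a baseline and report every deviation". The `find_violations` helper and the setting names are hypothetical, chosen to match the misconfigurations mentioned above.

```python
def find_violations(baseline, actual):
    """Return (setting, expected, found) tuples for every deviation."""
    violations = []
    for setting in sorted(baseline):
        expected = baseline[setting]
        found = actual.get(setting)  # None when the setting was not collected
        if found != expected:
            violations.append((setting, expected, found))
    return violations


# Hypothetical baseline covering the misconfigurations named above.
BASELINE = {
    "default_passwords_present": False,
    "config_file_mode": "640",
    "open_ports": (22, 1521),
}

# Hypothetical configuration observed on one deployment.
observed = {
    "default_passwords_present": True,
    "config_file_mode": "640",
    "open_ports": (22, 1521, 8080),
}

for setting, expected, found in find_violations(BASELINE, observed):
    print("VIOLATION %s: expected %r, found %r" % (setting, expected, found))
```

In this made-up example the default passwords and the extra open port are reported, while the file mode matches the baseline and stays silent.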
We can therefore see how four major asks from a service provider can be satisfied with the Lifecycle Management features of Enterprise Manager 12c. As substantiated through several third party studies and customer testimonials, these result in higher efficiency with lower OPEX.

Using EM CLI for mass update of Lifecycle Status Property Value

I co-presented a session at Oracle Open World in September, Manage Beyond Limits: Enterprise Manager CLI and Other Extensibility Features. I focused on the enhancements to the Enterprise Manager Command Line Interface, EM CLI. I enthused about the two new modes, Interactive and Script mode, and how they compare to the standard mode of previous releases, from the SQL*Plus-like environment of Interactive mode to the scalable, JSON formatted output of script mode. I highlighted the ease of use and the scalable power of EM CLI.
After my session a number of you asked me for a copy of the scripts that I demoed. This is one. 
Why do we take on the extra task involved in learning something new? …because we know it will lead to personal growth, ultimately solve a problem or two, and maybe even look good on our resume. Learning Jython scripting will tick all of those boxes. Plus, it’s fun!
This script tries to solve the problem of mass updates to the Lifecycle Status property value. This is a new property introduced in Oracle Enterprise Manager 12c, and can be used to indicate the importance of a target, e.g. “Mission Critical", or to determine where a target is in its life cycle, e.g. “Stage”, “Test” or “Production”. Consider a new deployment of several hundred Oracle Databases, half of which are Mission Critical and the other half are in “Test”, but are about to go “Production”.
What is the best way to transition from “Test” to “Production”?
EM CLI in script mode!
EM CLI in script mode takes advantage of the Jython scripting language to use Enterprise Manager in a programmatic way, allowing task automation. The EM CLI Jython script below automates the setting of the Lifecycle Status Property Value, and uses standard programming constructs to make iterating through several Targets simpler, more robust and less error prone.
At a high level, every EM CLI Jython script can effectively be broken down into two parts:
Step 1: Setting and defining the necessary variables, such as which OMS URL to connect to, how secure you want your communication channel to be, and which Administrator to log in to the OMS as.
Step 2: Calling or manipulating EM CLI 12c procedures. Procedures were called verbs in previous releases, and verb options are now procedure arguments in script and interactive mode. You can explore the online verb reference for more information.
Let’s break the script down further in to the major functional blocks of code.
Line 19: Sets the variable EMCLI_OMS_URL, which determines which OMS URL we shall connect to.
Line 21: Sets the variable EMCLI_TRUSTALL, which determines the level of security associated with the communication channel between the EM CLI and the OMS. We are choosing the lowest level of security.
Both of these variables could also have been set as environment variables.
Lines 26 – 40: In the if – else block, we check for the arguments that are passed to the script. We pass two arguments into this script. The following is what it looks like when calling an EM CLI Jython script, with arguments, on the command line:
$>./emcli OWUSER Production
Where: - is the name of our Jython Script.
OWUSER - is the username used to log into the OMS. The script will prompt for a password to authenticate this user. The mode of authentication is the same as is configured for the Console. Authentication modes supported are Repository, SSO or LDAP.
Production - is the Lifecycle Status property Value we shall set.
Line 27: We log into the OMS.
Line 29: We search through all targets where the version, “DBVersion” is greater than or equal to 12.1. This is passed to an internal procedure defined in Line 10.
Line 11: We construct the SQL command, based on the arguments passed in, then use the EM CLI list() procedure to convert the returned output to an easily parseable JSON-formatted syntax (line 15). We then return the Response Object, obj (line 16). The information returned is the set of all targets of the appropriate version.
Line 37: We then take the information and parse it, filtering further on oracle_database Target types. Finally we parse and print TARGET_NAME, TARGET_TYPE, PROPERTY_NAME and PROPERTY_VALUE for all databases which fit our criteria.
Line 39: We call the set_target_property_value() procedure which accepts a colon separated list of property records, in the form, TARGET_NAME:TARGET_TYPE:PROPERTY_NAME:PROPERTY_VALUE.
Please copy the code, save it with the *.py extension and change the EMCLI_OMS_URL value to the valid OMS URL for your environment.
Play around with it, and take your Jython scripting knowledge from Test to Production.
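As a small illustration of the record format described for set_target_property_value() above, here is a sketch in plain Python (it should also run under Jython). It is not the script from the session: it only filters a list of targets and builds the TARGET_NAME:TARGET_TYPE:PROPERTY_NAME:PROPERTY_VALUE strings, and it does not connect to an OMS or call any EM CLI procedure. The `build_property_records` helper, the sample target names, and the exact property name string "LifeCycle Status" are assumptions made for the example; check the property name used in your own environment.

```python
def build_property_records(targets, property_name, property_value,
                           target_type="oracle_database"):
    """Return one NAME:TYPE:PROPERTY:VALUE record per matching target."""
    records = []
    for target in targets:
        if target["TARGET_TYPE"] != target_type:
            continue  # mirror the filter on oracle_database targets
        records.append(":".join([target["TARGET_NAME"],
                                 target["TARGET_TYPE"],
                                 property_name,
                                 property_value]))
    return records


# Made-up rows shaped like the parsed output described above.
sample_targets = [
    {"TARGET_NAME": "orcl_fin", "TARGET_TYPE": "oracle_database"},
    {"TARGET_NAME": "host01.example.com", "TARGET_TYPE": "host"},
    {"TARGET_NAME": "orcl_hr", "TARGET_TYPE": "oracle_database"},
]

# "LifeCycle Status" is an assumed spelling of the property name.
for record in build_property_records(sample_targets,
                                     "LifeCycle Status", "Production"):
    print(record)
```

Because the colon is the field separator in this format, a target name that itself contains a colon would need a different separator or escaping; the sketch does not handle that case.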

Implementing Service Level Agreements in Enterprise Manager 12c for Oracle Packaged Applications

Contributed by Eunjoo Lee, Product Manager, Oracle Enterprise Manager.
Service Level Management, or SLM, is a key tool in the proactive management of any Oracle Packaged Application (e.g., E-Business Suite, Siebel, PeopleSoft, JD Edwards E1, Fusion Apps, etc.). The benefits of SLM are that administrators can utilize representative Application transactions, which are constantly and automatically running behind the scenes, to verify that all of the key application and technology components of an Application are available and performing to expectations.
A single transaction can verify the availability and performance of the underlying Application Tech Stack in a much more efficient manner than by monitoring the same underlying targets individually.
In this article, we’ll be demonstrating SLM using Siebel Applications, but the same tools and processes apply to any of the Packaged Applications mentioned above. In this demonstration, we will log into the Siebel Application, navigate to the Contacts View, update a contact phone record, and then log out.
This transaction exposes availability and performance metrics of multiple Siebel Servers, multiple Components and Component Groups, and the Siebel Database - in a single unified manner. We can then monitor and manage these transactions like any other target in EM 12c, including placing proactive alerts on them if the transaction is either unavailable or is not performing to required levels. The first step in the SLM process is recording the Siebel transaction. The following screenwatch demonstrates how to record a Siebel transaction using an EM tool called “OpenScript”. A completed recording is called a “Synthetic Transaction”.
The second step in the SLM process is uploading the Synthetic Transaction into EM 12c, and creating Generic Service Tests. We can create a Generic Service Test to execute our synthetic transactions at regular intervals to evaluate the performance of various business flows. As these transactions are running periodically, it is possible to monitor the performance of the Siebel Application by evaluating the performance of the synthetic transactions. The process of creating a Generic Service Test is detailed in the next screenwatch. EM 12c provides a guided workflow for all of the key creation steps, including configuring the Service Test, uploading of the Synthetic Test, determining the frequency of the Service Test, establishing beacons, and selecting performance and usage metrics, just to name a few.
The third and final step in the SLM process is the creation of Service Level Agreements (SLA). Service Level Agreements allow Administrators to utilize the previously created Service Tests to specify expected service levels for Application availability, performance, and usage. SLAs can be created for different time periods and for different Service Tests. This last screenwatch demonstrates the process of creating an SLA, as well as highlights the Dashboards and Reports that Administrators can use to monitor Service Test results.
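To make the relationship between periodic Service Test runs and an SLA concrete, here is a minimal Python sketch. It does not reflect how EM 12c computes service levels internally; the `availability_pct` and `sla_met` helpers, the result fields, and the thresholds are all hypothetical. It only shows the basic idea of turning a series of synthetic transaction results into an availability figure and comparing it with a target.

```python
def availability_pct(results):
    """Percentage of service test runs that succeeded."""
    if not results:
        raise ValueError("no service test results to evaluate")
    passed = sum(1 for run in results if run["success"])
    return 100.0 * passed / len(results)


def sla_met(results, min_availability_pct, max_seconds):
    """Availability target met and no successful run slower than max_seconds."""
    too_slow = [run for run in results
                if run["success"] and run["seconds"] > max_seconds]
    return availability_pct(results) >= min_availability_pct and not too_slow


# Made-up results: 19 successful runs of 4.2 seconds and 1 failed run.
sample_runs = [{"success": True, "seconds": 4.2} for _ in range(19)]
sample_runs.append({"success": False, "seconds": 0.0})

print("Availability: %.1f%%" % availability_pct(sample_runs))
print("SLA met: %s" % sla_met(sample_runs, 99.0, 10.0))
```

With these made-up numbers, 19 of 20 runs succeed, so a hypothetical 99% availability target is missed even though every successful run is fast enough.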
Hopefully, this article provides you with a good starting point for creating Service Level Agreements for your E-Business Suite, Siebel, PeopleSoft, JD Edwards E1, or Fusion Applications. Enterprise Manager Cloud Control 12c, with the Application Management Suites, represents a quick and easy way to implement Service Level Management capabilities at customer sites.

Sending notification after an event has remained open for a specified period

Enterprise Manager (EM) 12c allows you to create an incident rule to send a notification and/or create an incident after an event has been open for a specified period. Such an incident rule will help prevent premature alerts on issues that may correct themselves within a certain amount of time.
For example, suppose some agents are in an unstable network area, and there are often communication failures between the agents and the OMS lasting three or four minutes at a time. In this scenario, you may only want to receive alerts after an agent in that area has been in the Agent Unreachable status for at least five minutes.
Note: Many non-target availability metrics allow users to specify the “number of occurrences” or the number of consecutive times metric values reach thresholds before a notification is sent. It is best to use the feature for such metrics.
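The effect of such a rule can be sketched in Python. This is only an illustration of the condition "the event is still open and has been open for at least five minutes", not EM's rule engine; the `should_notify` function and its parameters are hypothetical, and whether EM treats exactly five minutes as a match is an assumption of the sketch.

```python
from datetime import datetime, timedelta


def should_notify(opened_at, now, cleared_at=None,
                  min_open=timedelta(minutes=5)):
    """True when the event is still open and has been open for min_open."""
    if cleared_at is not None and cleared_at <= now:
        return False  # the event corrected itself; stay silent
    return now - opened_at >= min_open


opened = datetime(2014, 1, 15, 10, 0)

# A 3-minute network blip: too short to alert on.
print(should_notify(opened, opened + timedelta(minutes=3)))

# Still unreachable after 5 minutes: create the incident and notify.
print(should_notify(opened, opened + timedelta(minutes=5)))

# Cleared after 4 minutes, evaluated at 6 minutes: no alert.
print(should_notify(opened, opened + timedelta(minutes=6),
                    cleared_at=opened + timedelta(minutes=4)))
```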

This article provides a step-by-step guide for creating an incident rule set to cater for the above scenario, that is, to create an incident and send a notification after the Agent Unreachable event has remained open for a five-minute duration.

Steps to create the incident rule
1.     Log on to the console and navigate to Setup -> Incidents -> Incident Rules.
Note: A non-super user requires the Create Enterprise Rule Set privilege, which is a resource privilege, to create an incident rule.

The Incident Rules - All Enterprise Rules page displays.

2.     Click Create Rule Set …
The Create Rule Set page displays.

3.     Enter a name for the rule set (e.g. Rule set for agents in flaky network areas), optionally enter a description, and leave everything else at default values, and click + Add.
The Search and Select: Targets page pops up.

Note:  While you can create a rule set for individual targets, it is a best practice to use a group for this purpose.

4.     Select an appropriate group, e.g. the AgentsInFlakyNework group. The Select button becomes enabled; click it.
The Create Rule Set page displays.

5.     Leave everything at default values, and click the Rules tab.
The Create Rule Set page displays.

6.     Click Create…
The Select Type of Rule to Create page pops up.

7.     Leave the Incoming events and updates to events option selected, and click Continue.
The Create New Rule : Select Events page displays.

8.     Select Target Availability from the Type drop-down list.
The page shows more options for Target Availability.

9.     Select the Specific events of type Target Availability option, and click + Add.
The Select Target Availability events page pops up.

10.   Select Agent from the Target Type dropdown list.
The page expands.

11.   Click the Agent unreachable checkbox, and click OK.

Note: If you want to also receive a notification when the event is cleared, click the Agent unreachable end checkbox as well before clicking OK.

The Create New Rule : Select Events page displays.

12.   Click Next.
The Create New Rule : Add Actions page displays.

13.   Click + Add.
The Add Actions page displays.

14.   Do the following:
a.     Select the Only execute the actions if specified conditions match option (You don’t want the action to trigger always).
The following options appear in the Conditions for Actions section.
b.     Select the Event has been open for specified duration option.
The Conditions for actions section expands.
c.     Change the values of Event has been open for to 5 Minutes as shown below.
d.     In the Create Incident or Update Incident section, click the Create Incident checkbox as follows:
e.     In the Notifications section, enter an appropriate EM user or email address in the E-mail To field.
f.     Click Continue (in the top right hand corner).
The Create New Rule : Add Actions page displays.

15.   Click Next.
The Create New Rule : Specify name and Description page displays.

16.   Enter a rule name, and click Next.
The Create New Rule : Review page appears.

17.   Click Continue, and proceed to save the rule set.
The incident rule set creation completes.
After one of the agents in the group specified in the rule set is stopped for over 5 minutes, EM will send a mail notification and create an incident as shown in the following screenshot.

In conclusion, you have seen the steps to create an example incident rule set that only creates an incident and triggers a notification after an event has been open for a specified period. Such an incident rule can help prevent unnecessary incidents and alert notifications, leaving EM administrators time for more important tasks.
- Loc Nhan 


Opinions expressed in this blog are entirely the opinions of the writers of this blog, and do not reflect the position of Oracle Corporation. No responsibility will be taken for any resulting effects if any of the instructions or notes in the blog are followed. It is at the reader's own risk and liability.
