DCIM: what are the complementarities with ITSM practice? (Part 4)

By Laurent Duenas, October 14th, 2013

 

Part 3 began to present the intersections between the DCIM and ITSM approaches. It introduced shared concerns about infrastructure mapping (cartography) and the industrialization of monitoring. This article continues with other aspects of industrialization, and with change management.

 

What are the links between DCIM and ITSM? (continued)

 

To industrialize IT operations

The capabilities of Data Center automation were discussed in the previous part. They address several day-to-day needs:

 

► The automation of operations relating to activity variation (such as requests for additional capacity, or scaling a workload up or down)

► The automation of operations relating to disruptions

 

In both cases, the DCIM tooling automatically provisions (or decommissions) facility resources, or sends provisioning (or decommissioning) calls to the VM hypervisors, significantly modifying the active infrastructure configuration at a given time.

Examples: activating additional blades, adding power through a scalable UPS, adjusting fan speed to adapt the distribution of fresh air, activating additional cooling units. All of these are operations orchestrated by the DCIM tool through daily automatic decisions, without any human intervention.

On simply detecting a power or cooling fault, the DCIM tool, which is in permanent communication with the Virtual Machine hypervisors and their management platforms (VMware vSphere, Microsoft System Center), triggers the migration of the VMs and their data to other blades or cabinets, without any risk of disruption.

 

These automated decisions can be aligned with IT Service objectives only if the rules that trigger them are consistent with the policies defined in Service Level Management activities, which are at the core of the ITSM culture.
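
As an illustration of this consistency check, here is a minimal sketch in Python. It is not taken from any DCIM or ITSM product: the class names, the fields (allow_automatic_migration, min_spare_capacity_pct) and the sample values are all assumptions made for the example. It only shows the idea of validating each automation rule against the policy agreed for the service before the rule is activated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelPolicy:
    """Hypothetical policy produced by Service Level Management for one service."""
    service: str
    allow_automatic_migration: bool
    min_spare_capacity_pct: float  # headroom that automation must never consume


@dataclass(frozen=True)
class AutomationRule:
    """Hypothetical DCIM automation rule, as it would be configured in the tool."""
    name: str
    service: str
    action: str                      # e.g. "migrate_vm", "decommission_blade"
    spare_capacity_after_pct: float  # headroom left once the rule has fired


def rule_violations(rule, policy):
    """Return the reasons why a rule conflicts with the policy (empty list if none)."""
    problems = []
    if rule.action == "migrate_vm" and not policy.allow_automatic_migration:
        problems.append("automatic migration not permitted for this service")
    if rule.spare_capacity_after_pct < policy.min_spare_capacity_pct:
        problems.append("rule leaves less spare capacity than the policy requires")
    return problems


policy = ServiceLevelPolicy("billing", allow_automatic_migration=False,
                            min_spare_capacity_pct=20.0)
rules = [
    AutomationRule("scale-down-at-night", "billing", "decommission_blade", 25.0),
    AutomationRule("evacuate-on-cooling-fault", "billing", "migrate_vm", 30.0),
]

for rule in rules:
    print(rule.name, rule_violations(rule, policy))
```

In this sketch the second rule is rejected because the policy forbids automatic migration for the service. In practice the policy attributes would come from the SLAs themselves rather than from hard-coded values.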

 

To secure operations

Uncontrolled changes and human error are the principal causes of disruptions. Data Centers are no exception to this rule. The more complex the Data Center, the more its operations must be guided by rigorous procedures. In this field too, DCIM tools make an important contribution.

 

Example: what characterizes the Data Center world is that the equipment is all standardized and fully pooled. Looking at a blade or a box, nobody knows at a given moment what is running on it. A handling error can stop a critical business application and cause losses for customers. This risk makes it necessary to follow standardized procedures and to accept permanent controls.

 

DCIM tools help establish execution models for operations (automation, workflows) that reduce the risk of operational error. By supporting these procedures, the DCIM tool is fully part of the industrialization of IT service operations. Here too, the objectives of these procedures must be aligned with the service commitments defined in SLAs.

 

To plan and schedule changes

Change planning and scheduling specific to Data Center equipment are among the functionalities provided by DCIM tools. Changes are scheduled according to Data Center management decisions (e.g. the replacement of a cooling unit) or are directly linked to business changes (e.g. the implementation of a new IT Service, or an evolution in the consumption of an IT Service by the business). For reasons of interdependency and compatibility, Data Center changes have to be synchronized with the other changes implemented in the Information System.

To make it possible to analyze dependencies and potential incompatibilities, and to respect the change windows permitted in the SLAs, the Change Planning from the DCIM tooling must be integrated into the Central Change Planning*.

 

Most companies coordinate the release of changes to the live environment fairly well**. But the preliminary steps of a change (such as design, build and test) are less well coordinated, and the coordination of different changes that relate to one another*** is far from being a reality. An example is the synchronization of an application change with the added capacity to be implemented in the Data Center, both being parts of the same overall change to an IT Service.

 

* Known in the ITIL framework as the FSC, for Forward Schedule of Change.

** In ITIL terms, at the deployment stage.

*** We call this a "Parent change", to which dependent "Child changes" are attached.
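
The two checks implied above (does a DCIM change respect the permitted change window, and does it coincide with other changes on the same IT Service?) can be sketched as follows. This is only an illustrative Python sketch: the PlannedChange structure, the identifiers and the dates are invented for the example and do not reflect any particular tool's data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlannedChange:
    """Hypothetical entry of the central change schedule."""
    change_id: str
    source: str     # "DCIM" or "ITSM"
    service: str    # IT Service affected by the change
    start: datetime
    end: datetime


def within_window(change, window):
    """True if the change fits entirely inside the permitted (start, end) window."""
    win_start, win_end = window
    return win_start <= change.start and change.end <= win_end


def overlapping(change, schedule):
    """Other scheduled changes touching the same service during the same period."""
    return [
        other for other in schedule
        if other.service == change.service
        and other.change_id != change.change_id
        and other.start < change.end and change.start < other.end
    ]


# Change window assumed to be agreed in the SLA of the "billing" service.
windows = {"billing": (datetime(2013, 10, 19, 22, 0), datetime(2013, 10, 20, 4, 0))}

central_schedule = [
    PlannedChange("CHG-101", "ITSM", "billing",
                  datetime(2013, 10, 19, 23, 0), datetime(2013, 10, 20, 1, 0)),
]
dcim_change = PlannedChange("DC-7", "DCIM", "billing",
                            datetime(2013, 10, 20, 0, 0), datetime(2013, 10, 20, 2, 0))

in_window = within_window(dcim_change, windows[dcim_change.service])
to_review = overlapping(dcim_change, central_schedule)
print(in_window, [c.change_id for c in to_review])
```

An overlap is not necessarily a conflict: the two changes may belong to the same overall change and need to be executed together. The point of the sketch is that this analysis is only possible once both schedules are visible in one place.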

 

To track and control changes 

Change tracking and control are among the functionalities provided by DCIM solutions, which act as genuine Change Management tools. They make it possible to identify changes (with RFCs*), to approve them (through the CAB**), and then to track their preparation and validation through workflows. But the mere presence of these functionalities does not guarantee that all due care is taken to secure the change.

 

* Request For Change

** Change Advisory Board

 

The level of control required depends on the risk involved in executing the change. A set of ITSM best practices should therefore also be applied in the DCIM world, particularly for impact analysis (before change approval). Here, DCIM tools can compensate for the difficulty of testing the impact of a change before actually performing it (a difficulty due to its physical nature). The simulation functionalities provided by DCIM tools are useful for anticipating risks.

 

Example: In the past, adding a server posed no difficulty. Equipment density was low and the capacity of the computer room was sufficient. Today, adding blades without a global view of the power and cooling requirements means taking the risk of destabilizing the rack and, as a consequence, the availability of the IT service. The difficulty in a Data Center is that nothing can be tested for real, because everything runs in live conditions. Only the simulation functions supplied by DCIM tools can virtually appraise the impact of a change, for example on the power supply load or on the cooling units. Workflows integrated into DCIM tools can require that simulation results be verified before a change is approved.
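
To make the idea of such a simulation concrete, here is a deliberately simplified Python sketch. Real DCIM simulations model airflow, redundancy and power chains in far more detail. The Rack fields, the 80% safety margin, the wattage figures and the assumption that all electrical power drawn becomes heat to be removed are illustrative assumptions, not values from any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    """Hypothetical, highly simplified view of a rack."""
    name: str
    power_budget_w: int    # power the rack's feeds can deliver
    cooling_budget_w: int  # heat the cooling can remove for this rack
    current_load_w: int    # power drawn today


def simulate_addition(rack, blades, watts_per_blade, safety_margin=0.8):
    """Project the load after adding blades and compare it with both budgets.

    Simplifying assumption: every watt drawn becomes a watt of heat to remove.
    """
    projected = rack.current_load_w + blades * watts_per_blade
    power_ok = projected <= rack.power_budget_w * safety_margin
    cooling_ok = projected <= rack.cooling_budget_w * safety_margin
    return {
        "projected_load_w": projected,
        "power_ok": power_ok,
        "cooling_ok": cooling_ok,
        "approvable": power_ok and cooling_ok,
    }


rack = Rack("R12", power_budget_w=10000, cooling_budget_w=8000, current_load_w=5000)
result = simulate_addition(rack, blades=4, watts_per_blade=450)
print(result)
```

With these sample figures the power supply could absorb the four blades, but the cooling could not, so the change would not be approvable as it stands. This is the kind of result a workflow could require before the approval step.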

 

Even small, seemingly insignificant changes that are managed as simple requests are still changes. Their status as Standard Changes* means that they are managed and tracked through simplified workflows. Some of them may be performed without any human intervention, but all remain fully compliant with Change Management best practices. This secures the execution of the change and later makes it easier to diagnose any incidents that may follow.

 

* A Standard Change is a change industrialized to the point of being managed as a “service request” (relying on the same automation tooling: self-service portals and their associated workflows).

 

For DCIM workflows to embed Change Management best practices (as defined in ITSM), it is essential that Data Center operations design “Change Models”* that merge the Data Center best practices drawn from field experience with those drawn from ITSM frameworks. This should result in a unified approach that shares know-how in pursuit of the main goal: respecting SLAs.

 

* A Change Model is a standardized workflow of tasks that delivers a change. This workflow is generally implemented in an ITSM tool.
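
A Change Model of this kind can be pictured as an ordered list of mandatory steps that every change of a given type must go through. The Python sketch below is only an illustration of that idea. The class names, the step names and the "add-blades" model are hypothetical, and a real ITSM or DCIM tool would add roles, approvals and audit trails.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChangeModel:
    """Hypothetical change model: an ordered list of mandatory steps."""
    name: str
    steps: tuple
    standard_change: bool = False  # pre-approved, handled like a service request


@dataclass
class ChangeRecord:
    """One change being executed according to a model."""
    model: ChangeModel
    completed: list = field(default_factory=list)

    def complete(self, step):
        """Record a step, refusing any step taken out of the model's order."""
        if len(self.completed) >= len(self.model.steps):
            raise ValueError("all steps of the model are already completed")
        expected = self.model.steps[len(self.completed)]
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.completed.append(step)

    @property
    def done(self):
        return tuple(self.completed) == self.model.steps


add_blades = ChangeModel(
    name="add-blades",
    steps=("impact_simulation", "cab_approval", "implementation", "review"),
)

record = ChangeRecord(add_blades)
record.complete("impact_simulation")
record.complete("cab_approval")
print(record.completed, record.done)
```

Because the order is enforced by the model, an attempt to record the approval before the impact simulation raises an error. This is how field experience (always simulate first) and ITSM practice (approve before implementing) can be merged into a single workflow.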

     

To apply consistent Norms and Standards 

The teams in charge of a Data Center define (or select, depending on the case) Norms and Standards specific to their environment and context: technology standards, sizing norms, power consumption norms, connectivity standards, etc. These Norms and Standards pursue several standardization and optimization objectives, and underpin the guarantees (availability, security, etc.) that the Data Center department commits to providing as an internal provider.

It often happens that architecture choices or market software standards coming from other IT entities (such as the Development department or Application Design) are not compatible with the Data Center's Norms and Standards. This is all the more likely when norms are not shared enough and departments drift apart because of their specialization. These situations are generally resolved by bending the principles and customizing procedures, which results in extra operating costs and additional risks.

Here again, tight collaboration between ITSM and Data Center (or DCIM) specialists could add significant value by making the implementation and application of Norms & Standards consistent, and fully aligned with the same quality objectives and SLAs. From the ITSM perspective, Norms & Standards are governed by a fairly well-structured ecosystem:

► On one side, the Norms & Standards that result from the design work of clearly identified processes (see examples below)

► On the other side, Change Management, which controls that Norms & Standards are properly applied

  

Examples of processes that are sources of Norms & Standards:

► Availability Management defines architecture norms for applications and infrastructure aiming at building more resilient IT Services

► Capacity Management defines the "resource management settings" related to performance, prioritization and resource allocation, for everything to do with storage, computing power and network bandwidth, guaranteeing optimal resource management

► Security Management defines norms relating to data and application protection

► Release and Deployment Management defines norms for planning, validating and deploying releases that carry changes to the live environment

► And many others exist in other domains such as support, supplier management, and financial management

 

* A Service Model results from strategic thinking about the approaches and choices for the design, build, deployment, operation and support of IT Services.

 

Role of Change Management in applying Norms & Standards:

► Creation of "Change Models" for each type of change, including the Norms & Standards to be applied at the design, development and testing stages,

► Integration into the "models" of all the controls and checks to be carried out to verify that the Norms & Standards are correctly applied,

► Industrialization of the "models", based on the workflows and automated functions provided by ITSM tools as well as by DCIM tools

 

Without deep insight into both domains, an IT organization cannot have at its disposal all the Norms & Standards required to provide resilient IT Services and to meet all customer quality expectations. As long as each domain lacks the other's perspective, uncertainty remains about the organization's ability to keep its promise of alignment with company strategy. Only a close dialogue between the designers of Norms & Standards and the designers of the change models (which are in charge of applying them) can ensure that DCIM is no longer one of these gaps.

 

 

Copyright © 2013 - PRACT Publishing