About the Paper “DCIM Policies: Automating Data Center Standard Operating Procedures”
Recent high-profile data center failures have shown the pitfalls of keeping Data Center Standard Operating Procedures (SOPs) in a manual rather than automating them. The logical home for automated procedures is DCIM. Operational procedures are packaged into a “DCIM Policies” framework that links into the different modules of the DCIM software, so that the DCIM detects any potential violation and sends alerts. It can also prevent an accidental mishap, such as overloading a rack or taking an electrical device offline for preventive maintenance without providing a backup.
This paper outlines twelve key operating procedures that should be part of “DCIM Policies”, grouped into three areas:
- Risk Management: These policies aim to mitigate a Data Center Manager’s nightmare: unplanned downtime or, worse, an extended outage that disrupts business application availability, causes massive financial loss and damages the organization’s reputation. [Alarm Policy, Escalation Policy, Redundancy Policy, Disaster Recovery Policy]
- Governance: Streamlined governance with a chain of command, a system of checks and balances, and audit trails is among the universal best practices that organizations adopt to meet voluntary or statutory compliance requirements. This applies to data centers as well. [Security Policy, Data Retention Policy, Approval Policy, SLA Policy]
- Efficiency Management: The Green Grid, ASHRAE and Uptime Institute have defined a number of KPIs for an energy-efficient and operationally efficient data center. It is up to each organization, based on its own business priorities, to decide which KPIs matter, which benchmarks it wishes to maintain, and which policies will help it get there. [PUE Policy, Rack Load Policy, Replacement Policy, Preventive Maintenance Policy]
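To make the idea of an automated policy concrete, the following is a minimal Python sketch of how a Rack Load Policy might block an overload before it happens. It is not taken from the paper or from any DCIM product: the `Rack` and `RackLoadPolicy` classes, the `add_device` helper and the 80% utilization threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rack:
    """Hypothetical rack record; a real DCIM would supply its own model."""
    name: str
    rated_kw: float                                   # rated power capacity
    device_loads_kw: List[float] = field(default_factory=list)

    @property
    def load_kw(self) -> float:
        """Current total load of all devices in the rack."""
        return sum(self.device_loads_kw)


@dataclass
class RackLoadPolicy:
    """Illustrative policy: keep rack load within a fraction of rated power."""
    max_utilization: float = 0.8                      # assumed threshold

    def check_add(self, rack: Rack, new_device_kw: float) -> List[str]:
        """Return violation messages for a proposed device addition."""
        projected = rack.load_kw + new_device_kw
        limit = rack.rated_kw * self.max_utilization
        if projected > limit:
            return [
                f"{rack.name}: projected load {projected:.1f} kW "
                f"exceeds policy limit {limit:.1f} kW"
            ]
        return []


def add_device(rack: Rack, new_device_kw: float, policy: RackLoadPolicy) -> List[str]:
    """Add the device only if the policy reports no violation."""
    violations = policy.check_add(rack, new_device_kw)
    if not violations:
        rack.device_loads_kw.append(new_device_kw)
    return violations                                 # non-empty means blocked
```

In this sketch the caller would forward any returned messages to whatever alerting mechanism is in place; the same check-before-change pattern could plausibly apply to other policies in the list, such as verifying redundancy before a maintenance shutdown.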
For more details, please download the paper by completing the form on this page.
You may also refer to another paper, “Causes of Data Center Failures: Can DCIM Prevent Them?”