The goals of an IT department include aligning IT plans with business objectives, establishing measures of IT effectiveness, directing employee efforts toward IT objectives, improving the performance of technology, and achieving balanced results across stakeholder groups. As the IT Director of an organization, you should carry out an ongoing assessment of the current state of your IT department from the perspective of reliability, availability, and Disaster Recovery (DR) readiness. You must first understand the entire system and evaluate how each component affects the overall system's reliability, recoverability, serviceability, performance, security, and manageability. The critical elements of the system should then be investigated and evaluated in priority order (from more critical to less critical), because a failure in any of these components affects the entire system.
System reliability is the probability that a system will be operational over a specified time in a given environment for a given purpose without failing. That is, it’s a form of guarantee that a given system's services will be delivered as specified. System availability, by contrast, is the probability that a system will be operational and able to deliver the requested services at a given point in time (without unplanned outages). Reliability and availability are related but distinct: availability takes into account the time that the system is out of service, so an unreliable system can still have high availability if its restart time is short. System reliability and availability also go hand in hand with disaster recovery and business continuity plans. For example, a system failure during working hours that might take several hours to rectify calls for a disaster recovery action; keeping the system functional (available) while the error is being corrected, however, is a form of business continuity planning.
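The distinction between reliability and availability can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a constant failure rate (so reliability over a period is exp(-t / MTBF)) and uses the common steady-state formula availability = MTBF / (MTBF + MTTR). The two systems and all their figures are hypothetical and are not taken from any real environment.

```python
import math


def reliability(period_hours: float, mtbf_hours: float) -> float:
    """Probability of running for period_hours without a failure.

    Assumes a constant failure rate (exponential model).
    """
    return math.exp(-period_hours / mtbf_hours)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state fraction of time the system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system A: fails about every 10 hours, restarts in one minute.
a_reliability = reliability(24, mtbf_hours=10)
a_availability = availability(mtbf_hours=10, mttr_hours=1 / 60)

# Hypothetical system B: fails about every 1000 hours, takes a day to repair.
b_reliability = reliability(24, mtbf_hours=1000)
b_availability = availability(mtbf_hours=1000, mttr_hours=24)

print(f"A: 24h reliability={a_reliability:.3f}, availability={a_availability:.4f}")
print(f"B: 24h reliability={b_reliability:.3f}, availability={b_availability:.4f}")
```

Under these assumed numbers, system A is far less reliable over a 24-hour period than system B, yet its short restart time gives it the higher availability, which is the situation described above.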
Therefore, evaluate the reliability goals and assess the efficacy of the measurement effort so that the required corrective actions can be carried out. The following analysis will identify whether the IT reliability goals are being met. Perform a reliability assessment to: