Data Center Cooling Infrastructure

The high-density deployments of modern data centers have intensive power demands, but much of that power doesn’t actually go to the servers and computing equipment itself. Instead, it goes to the cooling equipment that prevents those systems from overheating. Data center cooling refers to the collective equipment, tools, techniques and processes that maintain an ideal operating temperature within a data center facility. In general, server room temperatures should not fall below 50°F (10°C) or rise above 82°F (27.8°C), while optimal temperatures range between 68°F and 71°F (20°C to 21.7°C). To ensure reliable operation and the longest possible life from components, you need to keep the temperature within that band. Even a few degrees too hot can destroy a server chip, and the cost of a catastrophic server failure can be considerable. Think how much money you would lose if your servers went down: there is the cost of replacement, but also lost e-commerce business, lost customer data, wasted staff time, and all the other associated costs.
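As a rough illustration, here is a minimal Python sketch (not from any particular monitoring product) that classifies a temperature reading against the bands above. The thresholds are the figures quoted in this article and should be adapted to your own facility’s policy:

```python
# Minimal sketch: classify a server-room temperature reading against the
# bands quoted above. Thresholds are the article's figures, not a standard.

ALLOWABLE_F = (50.0, 82.0)   # outside this band risks equipment damage
OPTIMAL_F = (68.0, 71.0)     # the recommended operating window

def f_to_c(temp_f: float) -> float:
    return (temp_f - 32.0) * 5.0 / 9.0

def classify(temp_f: float) -> str:
    low, high = ALLOWABLE_F
    if temp_f < low or temp_f > high:
        return "ALARM: outside allowable band"
    opt_low, opt_high = OPTIMAL_F
    if opt_low <= temp_f <= opt_high:
        return "OK: within optimal band"
    return "WARN: allowable but not optimal"

for reading in (65.0, 70.0, 85.0):
    print(f"{reading:.0f}°F ({f_to_c(reading):.1f}°C): {classify(reading)}")
```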

IT equipment consumes electricity and leaves behind heat as its “waste product”. In my previous article we saw that power is the biggest cost element in a data center. Did you know that your cooling infrastructure is the biggest consumer of that power? It can take up to 40 percent of all the power going into your data center, and the heat generated by servers in data centers today is roughly ten times greater than the heat they generated 10 years ago. It’s huge, as you can imagine.
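To put that 40 percent figure in perspective, here is a back-of-the-envelope sketch showing how the cooling share feeds into PUE (Power Usage Effectiveness), the standard efficiency metric. The facility size and the “other overhead” figure are illustrative assumptions, not data from this article:

```python
# Back-of-the-envelope sketch with illustrative numbers: if cooling takes
# 40% of total facility power, only what remains after cooling and other
# overheads reaches the IT load - which is exactly what PUE captures.

total_kw = 1000.0             # hypothetical total facility draw
cooling_kw = 0.40 * total_kw  # the "up to 40 percent" figure above
other_overhead_kw = 100.0     # assumed UPS losses, lighting, etc.
it_kw = total_kw - cooling_kw - other_overhead_kw

pue = total_kw / it_kw        # PUE = total facility power / IT power
print(f"IT load: {it_kw:.0f} kW, PUE: {pue:.2f}")  # IT load: 500 kW, PUE: 2.00
```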

The simplest way to define cold is nothing but the absence of heat. Generally, there are two types of heat removal methods used in a data center environment:

·      Air Cooling
·      Liquid Cooling (typically water or some form of refrigerant)

Air cooling is the standard method of system cooling used to dissipate heat. The object being cooled has a flow of air moving over its surface. Most air cooling systems use a combination of fans and heat sinks, which exchange heat with the air.
The first method within liquid-based cooling is “water-cooled racks.” This method uses water running along the hot side of the cabinet to bring the temperature down. The water is confined within basins and flows from tower pumps through pipes alongside the servers, but never touches the server components. The water-cooled rack system works very well, but still carries the dangerous possibility of leaks. This is not a widely adopted cooling solution - do you know why?
The answer is pretty simple: Water + Electricity = Disaster.
However, it is used at some industrial scale, and we will look into those details in the next article.
Immersion Cooling
The last method we will go over is the “liquid immersion cooling” system. It is the practice of submerging computer components (or full servers) in a thermally, but not electrically, conductive liquid (a dielectric coolant). Green Revolution Cooling (GRC) has patented this method of cooling technology.

Before proceeding further, I want you to understand some of the fundamental principles of cooling, which are as below:

Compression
·      Gases are compressible - liquids are not
·      Gases get hot when compressed
Heat flow
·      Heat flows from high temperature to low
·      When a liquid is heated it vaporizes into a gas
·      When a gas is cooled it condenses into a liquid
·      Lowering the pressure above a liquid reduces its boiling point, and increasing the pressure raises it
Fluid flow
·      Fluids flow from high pressure to low

Cooling System Components in a Data Center

As you know, the output from these machines is heat, and we need to supply cold as their input. Let us look at the equipment used in a chiller-based cooling system and understand the flow.
The Chiller

Chillers are the cooling machines used to lower the temperature of equipment in data center environments, using the basic refrigeration cycle. A chiller is a machine that removes heat from a liquid via a vapor-compression or absorption refrigeration cycle. This liquid can then be circulated through a heat exchanger to cool air or equipment as required. The function of a chiller is to produce chilled water (water refrigerated to about 8-15°C [46-59°F]). Chilled water is pumped in pipes from the chiller to the Computer Room Air Handler (CRAH) units located in the IT environment.
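To get a feel for the quantities involved, here is a hedged sketch of the underlying sizing arithmetic: the water flow needed to carry away a heat load Q at a temperature rise ΔT is ṁ = Q / (cp × ΔT). The load and return temperature below are illustrative assumptions, not design figures:

```python
# Sizing sketch for a chilled water loop (illustrative assumptions):
# required water flow = heat load / (specific heat * temperature rise).

q_kw = 500.0      # assumed IT heat load to remove, in kW
cp = 4.186        # specific heat of water, kJ/(kg*K)
supply_c = 8.0    # chilled water supply temperature (article's 8-15°C range)
return_c = 15.0   # assumed return temperature after the CRAH coils

delta_t = return_c - supply_c
flow_kg_s = q_kw / (cp * delta_t)   # kg/s; for water, ~1 kg/s = 1 L/s
print(f"Required flow: {flow_kg_s:.1f} L/s for {q_kw:.0f} kW at dT={delta_t:.0f} K")
```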


Chillers use either a vapor-compression or absorption refrigerant cycle to cool a fluid for heat transfer, relying on three basic principles:
      I.         When a liquid is heated, it vaporizes into a gas, and when a gas is cooled, it condenses into a liquid
    II.         Lowering the pressure above a liquid reduces its boiling point, and increasing the pressure raises it
  III.         Heat always flows from hot to cold

There are three main types of chillers, distinguished by how they reject heat:
·      Water-cooled chillers – These models use a cooling tower, so they achieve higher efficiency than air-cooled chillers. Water-cooled chillers are more efficient because they condense based on the ambient wet bulb temperature, which is lower than the ambient dry bulb temperature. The lower the temperature at which a chiller condenses, the more efficient it is.
·      Glycol-cooled chillers – Glycol chillers are industrial refrigeration systems that use a type of antifreeze called glycol, mixed with water, to lower the freezing point of the working fluid in the chilling system.
·      Air-cooled chillers – Air-cooled and water-cooled chillers work in a rather similar manner. Both have an evaporator, compressor, condenser and expansion valve. The main difference is that one uses air to drive condenser cooling and the other uses water.
If anyone is interested in understanding this concept in more depth, please review the article by Carrier.

Cooling Tower

A cooling tower is a heat rejection device that rejects waste heat to the atmosphere through the cooling of a water stream to a lower temperature. The cooling tower’s heat rejection process is termed “evaporative cooling”. The heat transferred from the water stream to the air stream raises the air’s temperature and its relative humidity to 100%, and this air is then discharged to the atmosphere.

Cooling towers minimize the thermal pollution of natural water heat sinks and allow the reuse of circulating water. A cooling tower extracts waste heat from the condenser water loop and rejects it to the atmosphere by cooling a water stream to a lower temperature.
The cooler water at the bottom of the tower is collected and sent back into the condenser water loop via a pump package. Evaporative heat rejection devices such as cooling towers are commonly used to provide significantly lower water temperatures than achievable with “air cooled” or “dry” heat rejection devices, thereby achieving more cost-effective and energy-efficient system operation.
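A rough sketch of why evaporative cooling consumes water: essentially all the rejected heat leaves the tower as the latent heat of the evaporated water. The heat load below is an assumption for illustration:

```python
# Rough sketch (assumed numbers): evaporation needed to reject a heat load,
# since nearly all the heat leaves as latent heat of vaporization.

q_kw = 500.0    # heat rejected by the tower, kW (illustrative)
h_fg = 2450.0   # latent heat of vaporization of water, kJ/kg (approx.)

evap_kg_s = q_kw / h_fg
print(f"Evaporation: {evap_kg_s * 3600:.0f} L/h "
      f"(~{evap_kg_s * 86400 / 1000:.1f} m3/day) to reject {q_kw:.0f} kW")
```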

Computer Room Air Conditioning (CRAC)

A computer room air conditioning (CRAC) unit is a device that monitors and maintains the temperature, air distribution and humidity in a network room or data center. A CRAC is also known as a Precision Air Conditioner, Close Control Air Conditioner, Close Control Unit (CCU) or server room air conditioner. A CRAC unit works much like the air conditioner at your home: it has a direct expansion (DX) refrigeration cycle built into the unit. This means that the compressors required to power the refrigeration cycle are also located within the unit. The temperature of the coil is maintained with the help of a refrigerant. A CRAC is therefore often thought of as having an internal compressor and thus not needing the support of a centralized chilled water system.
There are a variety of ways that CRAC units can be situated. One CRAC setup that has been successful is to cool air and dispense it through an elevated floor. The air rises through the perforated floor sections, forming cold aisles. The cold air flows through the racks, where it picks up heat before exiting from the rear of the racks. The warm exhaust air forms hot aisles behind the racks, and the hot air returns to the CRAC intakes, which are positioned above the floor.
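The airflow a rack needs follows directly from its heat load and the cold-aisle-to-hot-aisle temperature rise. A minimal sketch with assumed numbers:

```python
# Airflow sketch for hot/cold aisle design (assumed numbers):
# volume flow = heat load / (air density * specific heat * temperature rise).

q_kw = 10.0      # assumed heat load of one rack, kW
rho = 1.2        # air density, kg/m^3 (approx., sea level)
cp_air = 1.005   # specific heat of air, kJ/(kg*K)
delta_t = 10.0   # assumed cold-aisle to hot-aisle temperature rise, K

v_m3_s = q_kw / (rho * cp_air * delta_t)
print(f"{q_kw:.0f} kW rack needs ~{v_m3_s:.2f} m3/s "
      f"(~{v_m3_s * 2119:.0f} CFM) at dT={delta_t:.0f} K")
```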

Computer Room Air Handler (CRAH)

A computer room air handler (CRAH) is a device used frequently in data centers to deal with the heat produced by equipment. Unlike a computer room air conditioning (CRAC) unit, which uses mechanical refrigeration to cool the air introduced to a data center, a CRAH uses fans, cooling coils and a water-chiller system to remove heat. A CRAH is similar to the chilled water air handling units found in most high-rise commercial office buildings. Here, cooling is accomplished by blowing air over a cooling coil filled with chilled water. The chilled water, in turn, is supplied to the CRAHs by a chilled water plant, otherwise known as the chiller. CRAHs can have VFDs (variable frequency drives) that modulate fan speed to maintain a set static pressure either under the floor or in the overhead ducts.
Now that we understand how air circulates in the data center, we must understand the difference between air handling and air conditioning. Anyone who pays their power bill at home knows that cooling and heating a home is one of the largest power costs. The same holds true for data centers. Cooling a data center, which houses equipment that generates substantial heat, is very expensive. Because of this, data centers have turned to what is called air-side economization. What that means is that when the ambient outside temperature is within a certain threshold, the air is ‘handled’ - pulled into the facility and pushed down into the pressurized sub-floor. The handled air flows through the data center as traditionally air-conditioned air would. However, in this scenario, facilities with CRAH units are actually using ‘free’ air, and the cost to cool the facility drops significantly.

The efficiency of this method obviously depends on climate. In many European and US markets this technology can be used for about 60% of the year, mostly during the cooler months. This saves immense power cost, which in turn lowers PUE and drives the cost per kilowatt down for you. Overall, the use of air handling and air-side economization rather than traditional air conditioning is an all-around innovation win for the industry and its clients.

Water Side Economizer

With this economizer, there are no noticeable changes on the data center floor. The same collection of air handlers, raised floors, and fans moves air as it normally would. The change occurs behind the scenes, in the production of the chilled air and the removal of waste heat. Users with an existing chilled water infrastructure can accomplish “free cooling” via a supplemental heat exchanger called a water-side economizer.


For data centers with water- or air-cooled chilled water plants, a water-side economizer uses the evaporative cooling capacity of a cooling tower to produce chilled water and can be used instead of the chiller during the winter months. Water-side economizer operation depends on ambient conditions. The outside air must sufficiently cool the condenser water to allow for proper heat exchange between the two loops.

Water-side economizers can be integrated with the chiller or non-integrated. Integrated water-side economizers are the better option because they can pre-cool water before it reaches the chiller. Non-integrated water-side economizers run in place of the chiller when conditions allow. Water-side economizers offer cooling redundancy because they can provide chilled water in the event that a chiller goes offline, reducing the risk of data center downtime. During water-side economizer operation, the operating costs of a chilled water plant can be reduced by up to 70%.

Air-Side Economizer

An air-side economizer (see the picture below) brings outside air into a building and distributes it to the servers. Instead of being re-circulated and cooled, the exhaust air from the servers is simply directed outside. If the outside air is particularly cold, the economizer may mix it with the exhaust air so its temperature and humidity fall within the desired range for the equipment. The air-side economizer is integrated into a central air handling system with ducting for both intake and exhaust; its filters reduce the amount of particulate matter, or contaminants, that are brought into the data center.

Because data centers must be cooled 24/7, 365 days per year, air-side economizers may even make sense in hot climates, where they can take advantage of cooler evening or winter air temperatures. For various regions of the United States, Figure 14 shows the number of hours per year with ideal conditions for an air-side economizer.

Intel IT conducted a proof-of-concept test that used an air-side economizer to cool servers with 100% outside air at temperatures of up to 90°F. Intel estimates that a 500 kW facility will save $144,000 annually and that a 10 MW facility will save $2.87 million annually. The company also found no significant difference in failure rates between outside air and an HVAC system. A few things have to be considered: control systems are central to the operation of an air-side economizer and must be properly maintained, and excessive humidity control can cut into the savings achieved by the economizer. In certain geographic locations, for example, air can be very cool but very dry, and the system may spend excessive energy humidifying it. Users will need to consider ASHRAE’s recommendations, studies of their ambient climate, and their humidity preferences before considering implementation. If the desired humidity ranges are too restrictive, net energy savings from an economizer can be limited. Proper management and controls are imperative to ensure that the correct air volume, temperature and humidity are introduced.
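As a quick sanity check on those figures: the savings scale almost linearly with facility size. A minimal sketch (the dollar figures are the ones quoted above; the arithmetic is mine):

```python
# Sanity check on the quoted Intel estimates: annual savings per kW should
# be roughly constant across facility sizes if the savings scale linearly.

savings_500kw = 144_000       # USD/year for a 500 kW facility (quoted above)
per_kw = savings_500kw / 500  # ~$288 per kW per year
print(f"~${per_kw:.0f}/kW-yr -> 10 MW facility: "
      f"~${per_kw * 10_000 / 1e6:.2f}M/yr")
# ~$2.88M/yr, consistent with the $2.87M figure quoted above
```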

Air-side and water-side economizers offer the potential for significant energy savings by using the ambient environment to augment or replace mechanical air conditioning. Depending on your location, environment, data center design and existing infrastructure, implementing a “free cooling” strategy can be extremely challenging. The objective of economization is to run the mechanical cooling systems less, or to run them with less load.


What allows more economizer hours?

The basic function of a chiller is to remove heat energy from a data center by compressing and expanding a refrigerant to keep chilled water at a set supply temperature, typically 45°F/7°C. When the outdoor temperature is about 19°F/11°C colder than the chilled water temperature, the chiller can be turned off. The cooling tower now bypasses the chiller and removes the heat directly from the data center.

By increasing the chilled water supply temperature, the number of hours that the chiller can be turned off (economizer hours) increases. For example, there may be 1000 hours per year when the outdoor temperature is at least 19°F/11°C below the 45°F/7°C chilled water temperature. But if the chilled water is increased to 55°F/13°C, the economizer hours increase to 3,700.
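To make that relationship concrete, here is a minimal Python sketch of how you might count economizer hours from an hourly weather file. The temperature data here is synthetic (a random stand-in, not a real climate); the 19°F approach is the figure quoted above:

```python
# Minimal sketch: count the hours in a year when the outdoor temperature is
# at least the required approach below the chilled water setpoint. The
# "weather" here is synthetic - substitute a real 8,760-hour weather file.

import random

random.seed(0)
hourly_temps_f = [random.gauss(55, 18) for _ in range(8760)]  # fake climate

APPROACH_F = 19.0  # approach needed to bypass the chiller (figure above)

def economizer_hours(chilled_water_f: float) -> int:
    cutoff = chilled_water_f - APPROACH_F
    return sum(1 for t in hourly_temps_f if t <= cutoff)

for setpoint in (45.0, 55.0):
    print(f"{setpoint:.0f}°F chilled water -> "
          f"{economizer_hours(setpoint)} economizer hours/yr (synthetic)")
```

With real weather data, raising the setpoint from 45°F to 55°F produces the kind of jump described above (e.g. 1,000 to 3,700 hours).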

With this, we have covered all the critical mechanical equipment involved in a data center cooling infrastructure.

Knowledge Credits: www.energystar.gov and www.42u.com


Have a comment or points to be reviewed? Let us grow together. Feel free to comment.


Data Center Power Infrastructure

Typically there are four key physical pillars that are the major factors in the successful operation of a data center, and those are

·      Power
·      Cooling
·      Connectivity
·      Space

Each of these pillars plays its own indispensable role in the success of our data center. Let’s have a look into each of these in detail, and by the end of this article, you will definitely understand their importance.

Power Infrastructure

When you consider the importance of electric power, you realize that every mechanical, electrical and operational part requires power in order to function properly. Even though the utilization rate varies across areas and devices, the electrical infrastructure is the most important of these four pillars. We can say that the power distribution system is the heart of any critical facility, and it’s vital that everyone working in and around critical sites knows at least the basics of the power distribution system.

There are multiple electrical components involved in a data center. The electric power distribution structure from the utility power to the rack PDU (power strip) is generally known as the power train. Let us see this in picture form below.



So let me tell you: a data center with an N+1 configuration will have the below electric components to support its electricity infrastructure.

·      Main power input
·      Medium-voltage switchgear including MV/LV transformer
·      Low-voltage switchgear/switchboard
·      UPS system with input/output switchboard and UPS distribution switchboard
·      Switchgear
·      Motor Control Centre (MCC)
·      Panelboard
·      Power Distribution Units (PDUs) and Remote Power Panels (RPPs)
·      Rack power strips

Remember to verify the same when you get a chance to do a data center tour. 🙂
Let us have a look into each component that makes up the power train (picture on top) to understand its function.

Utility Power

Here is the source of your electric energy; without utility power we would never achieve a power solution in a data center. Power from the utility is always consumed during normal operation, with the generator set bypassed. In any circumstance where utility power is lacking, the generator set takes control of the operation. The power supply of every larger data center starts with a connection to the main grid, which is provided by the local utility company.
A data center’s main power supply is three-phase distribution, which is commonly used in industry to drive motors and other devices. The current distributed in the data center is mainly alternating current (AC). If your region follows UK standards, the main electricity supply is at 50 cycles per second, hence 50 Hz; under US standards it is 60 Hz.
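For reference, the power delivered by a three-phase feed follows the standard formula P = √3 × V(line-to-line) × I × power factor. A small sketch with illustrative values (the voltage, current and power factor below are assumptions):

```python
# Three-phase power sketch: P = sqrt(3) * V_LL * I * PF. Illustrative values.

import math

v_ll = 400.0     # line-to-line voltage, V (typical European LV distribution)
current = 100.0  # line current, A (assumed)
pf = 0.95        # power factor (assumed)

p_kw = math.sqrt(3) * v_ll * current * pf / 1000.0
print(f"Three-phase load: {p_kw:.1f} kW")  # ~65.8 kW
```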

Generator Set

If utility power becomes unavailable, the ATS (automatic transfer switch) triggers the emergency source. In most data centers, that means the on-site generators. The function of the backup generator is to provide power when there is an interruption of main power. Data center components do not easily tolerate the power spikes caused by switching from the normal to the emergency power supply. When these components lose power (even for a fraction of a second), a total restart is required. This can lead to system downtime, startup issues and loss of in-process information.
When utility power is lost in the facility, the following chain of events occurs:

·      UPS supplies power to security and data center
·      Emergency generator starts and automatic transfer switch transfers to emergency power
·      Switchgear routes power to Critical and Non-Critical loads
·      UPS returns to normal operation; the data center and security are powered by the emergency generator
·      When normal power resumes, the automatic transfer switch routes power back to the utility, and critical and non-critical loads are powered normally

The data center and security see no power interruption and continue to operate normally through the power loss. Components such as HVAC and workstations may need to be reset to regain normal operation. Often workstations contain individual UPS backup to keep computers powered for a short amount of time.
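Purely as an illustration of the sequence (not any vendor’s transfer logic), here is a toy sketch of that event chain in code:

```python
# Toy sketch of the transfer sequence described above - illustrative only.
# The UPS battery bridges the gap while the generator starts, so the
# critical load never sees the interruption.

class Facility:
    def __init__(self):
        self.source = "utility"

    def on_utility_loss(self):
        print("Utility lost -> UPS battery carries the critical load")
        print("Generator starting... ATS transfers to emergency source")
        self.source = "generator"
        print("UPS returns to normal operation on generator power")

    def on_utility_restore(self):
        print("Utility restored -> ATS transfers back; generator shuts down")
        self.source = "utility"

plant = Facility()
plant.on_utility_loss()
print(f"Current source: {plant.source}")
plant.on_utility_restore()
print(f"Current source: {plant.source}")
```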

Main switchboard (Main PDU or Switchgear)

As you can see from the power train diagram, the power from the utility runs directly to the main switchboard; we can say that electrical power is presented to the data center through a switchboard (or switchboards).

The incoming circuits are split into a number of outgoing circuits to feed different areas or loads within the data center. Each outgoing cable is protected by an appropriate fuse or circuit breaker, such that a fault on any circuit will only affect that circuit and not trip out the entire facility. The cables feeding loads may be single- or three-phase, and will also contain neutral and earth conductors.

Note: The feeds from the utility provider are high voltage and low current, which allows them to utilize smaller conductors for greater distances. 

UPS (Uninterruptible Power Supply)

Uninterruptible Power Supply (UPS) systems ensure safety, security and continuity of operations in harsh environments. The usage and capacity of a UPS in a data center environment depend purely on the data center’s design and operation. Some data centers do not use a static UPS at all; instead, they design their diesel generators with DRUPS (a diesel rotary UPS, which stores kinetic energy to ride through the gap and kick-start the generator). Most data centers, however, use a static UPS as backup instead of DRUPS.


UPSs vary greatly in physical size, weight, form factor (e.g. standalone vs. rack-based), capacity, supported input power source (e.g. single-phase vs. three-phase), technological design, and cost. There are a number of design decisions to make when selecting a UPS for a data center (or other mission-critical facility), such as:

·      The size of the load to be protected
·      The battery runtime required
·      The proper input and output voltages
·      The right type of system (on-line, line-interactive)
·      Pricing and performance seen within manufacturer product portfolios
·      The advances in technologies
·      The ideal level of redundancy (i.e., N, N+1, 2N, 2N+1, etc)
·      The required output distribution

Backup time of a UPS - This is the time during which the UPS can supply the rated load with power from its battery under nominal conditions when the normal AC source fails. This time depends on the battery capacity and the load. Typical backup times are designed to last up to 30 minutes. Industrial UPS systems often include specially designed Ni-Cd battery arrays that withstand harsh conditions and provide long hours of reserve power when necessary, especially in remote areas subject to extreme temperatures.
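As a rough illustration of how backup time falls out of battery capacity and load, here is a back-of-the-envelope sketch; every value in it is an assumption, not a sizing recommendation:

```python
# Back-of-the-envelope UPS runtime sketch (assumed values, not a battery
# model): runtime = stored energy / load, derated for inverter efficiency
# and the depth of discharge you allow.

battery_kwh = 50.0        # assumed usable battery energy
load_kw = 80.0            # assumed critical load on the UPS
inverter_eff = 0.94       # assumed inverter efficiency
depth_of_discharge = 0.8  # assumed usable fraction of capacity

runtime_min = battery_kwh * depth_of_discharge * inverter_eff / load_kw * 60
print(f"Estimated runtime: {runtime_min:.0f} minutes")  # ~28 minutes
```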

Another UPS system benefit is the ability to clean the incoming utility power. Normal utility power voltages vary wildly depending on what other loads the service is supplying. These voltage fluctuations are detrimental to the power supplies in servers and can shorten their life spans or, worse, destroy them. UPS units clean electrical power by converting utility power from AC to DC and back to AC again; this process is referred to as “dual conversion”.

Power Distribution Unit (PDU)

PDUs are not clearly defined by standards, and they come in many configurations, although they all have common features and functions. Their basic function is to distribute power to the racks, either through cables and sockets or via an overhead bus-track system. There are three main types of PDU:

·      Dumb (No instrumentation - not manageable)
·      Metered (Equipped with a display showing current load on each phase)
·      Switched (Receptacles can be individually switched on or off remotely)

One challenge in selecting PDUs is to balance the relatively high cost, great functionality and low risk of a switched PDU against the relatively low cost (but higher risk and lack of manageability) of a dumb PDU.

With a dumb PDU, a data center power supply runs the risk of phase imbalance; devices may be unexpectedly plugged in, possibly tripping a circuit breaker. This could result in an unplanned “emergency” shutdown of critical equipment, causing data loss or corruption, and costly hardware damage.
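Here is a minimal sketch of the per-phase balance check that metered data makes possible (and that a dumb PDU hides); the currents and breaker rating are hypothetical:

```python
# Phase-balance sketch with hypothetical per-phase currents on a
# three-phase strip. A metered PDU exposes this data; a dumb PDU does not.

phase_amps = {"L1": 14.0, "L2": 9.0, "L3": 6.5}  # assumed per-phase draw, A
BREAKER_A = 16.0                                  # assumed breaker rating, A

avg = sum(phase_amps.values()) / 3
for phase, amps in phase_amps.items():
    imbalance = (amps - avg) / avg * 100
    flag = " <-- near breaker limit" if amps > 0.8 * BREAKER_A else ""
    print(f"{phase}: {amps:.1f} A ({imbalance:+.0f}% vs avg){flag}")
```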

Depending on the quality of the incoming utility supply, PDUs may be fitted with internal transformers where the power quality is prone to fluctuations.

Remote Power Panel (RPP)

Remote Power Panels (RPPs) are like PDUs without a transformer and are therefore smaller (about the size of a standard raised floor tile). RPPs may contain up to four panelboards and a monitoring system, and distribute power to the IT racks. RPPs are most often fed from one or more PDU sub-feed breakers. Usually, RPPs are located in the IT space (the white space area) to distribute, control, and monitor the critical power from the upstream UPS system to the IT racks.

Some of the advantages of using Remote Power Panels are that they:

·      Reduce the length of cable runs between your PDU and the individual loads
·      Optimize usable floor space
·      Simplify server consolidation plans
·      Meet growth demands
·      Retrofit to any existing distribution system
·      Come with an integrated energy management system

Rack Power Strips

Rack power strips are installed in IT racks and are powered from the mating connector of the upstream PDU or RPP.

A wide range of options is available here: rack power strips can be single-phase or three-phase, horizontal or vertical in form, metered or unmetered, and IP-addressable for remote management and monitoring.



Importance of power protection

So we have seen the major components included in a data center’s electrical infrastructure. Now, how important is it to keep power highly available in a data center?

Let’s look at some of the consequences a power outage can have for IT systems.

·      More than 33% of companies require more than one day to recover
·      10% of companies take more than one week
·      It can take up to 48 hours to reconfigure a network
·      It can take days or weeks to re-enter lost data

What are the causes of data center power outages, and how can we prevent them? There are four main areas responsible for power outages at data centers.

·      Insufficient/inconsistent operational testing and monitoring.
·      Lack of redundancy in design and implementation.
·      Lack of system-level power control management.
·      Lack of proper preventative maintenance.
And most frequently - Human Error! 🙁
With the above notes, we have covered the major power infrastructure involved in a data center. We will discuss the physical infrastructure of the cooling system in the next article.

Have a comment or points to be reviewed? Let us grow together. Feel free to comment.