Wednesday, July 9, 2008

Data center disaster recovery considerations checklist

Looking back on my data center disaster recovery experiences as both an IT director and consultant, I regularly encountered organizations in various stages of defining, developing, implementing and improving their disaster recovery capabilities. Disaster recovery policies and architectures, especially in larger organizations, are complex. There are lots of moving parts: standards and procedures to be defined, people to be organized, technology functions to be coordinated and application dependencies to be identified and prioritized. Add to this mix the challenge of grappling with the inherent uncertainties (chaos) associated with a disaster – whatever the event might be – and the complex becomes even more convoluted.

It is critical to agree on some fundamental assumptions up front in order to establish both internal and external confidence (think stakeholders and stockholders) and to recognize the need to address the many facets of disaster recovery development. Failure to do so will only lead to significant problems down the road.

I've given many presentations that address the "DR Expectations Gap," in which business assumptions concerning recoverability are often misaligned with actual IT capabilities. It's a fact that without explicit assumptions being clearly identified and communicated, yesterday's disaster recovery heroes will become tomorrow's scapegoats.

Key among these assumptions, of course, is establishing classes of recovery in terms of RTO and RPO, but there are also a number of fundamental considerations that need to be measured, weighed and incorporated into the disaster recovery planning process. Here are a few practical planning items whose assumptions must be stated explicitly in order to drive an effective disaster recovery design and plan:

  1. Staff: Will the IT staff be available and able to execute the disaster recovery plan? How will they get to the alternate disaster recovery site? Are there accommodations that need to be made to ensure this? When a disaster recovery event hits, you had better understand that some of your staff will stay with their families rather than immediately participate in data center recovery.
  2. Infrastructure: What communications and transportation infrastructure is required to support the plan? What if the planes aren't flying or the cell phones aren't working or the roads are closed?
  3. Location: Based on the distance of the disaster recovery site, what categories of disaster will or will not be addressed? Best practices say that site should be far enough away to not be affected by the same disaster event – is yours?
  4. Disaster declaration: How does a disaster get declared, and who can declare it? When does the RTO "clock" actually start?
  5. Operation of the disaster recovery site: How long must it be operational? What will be needed to support it? This is even more important if you're using a third party. (e.g. what's in my contract?)
  6. Performance expectations: Will applications be expected to run at full performance in a disaster recovery scenario? What level of performance degradation is tolerable and for how long?
  7. Security: Are security requirements in a disaster scenario expected to be on par with pre-disaster operation? In some specific cases, you may require even more security than you originally had in production.
  8. Data protection: What accommodations will be made for backup or other data protection mechanisms at the disaster recovery site? Remember, after day one at your recovery site, you'll need to do backups.
  9. Site protection: Will there be a disaster recovery plan for the disaster recovery site? And if not immediately, then who's responsible and when?
  10. Plan location: Where will the disaster recovery plan be located? (It better not be in your primary data center). Who maintains this? How will it be communicated?
Obviously, there are many more considerations that are critical to identify and address for successful disaster recovery, but hopefully this tip helped to point you in the right direction.

Sunday, July 6, 2008

UPS Apparent & Real Power

"80%" figures come from several different things and, since the same percentage number is coincidentally used, it can be confusing as to what is "required" versus what is "recommended". Two of these "80%" figures are strictly engineering-related, but are not generally understood. If they are well known to you, my apologies, but I think they bear explanation.

First is Power Factor ("pf"), which is the way engineers deal with the difference between "Real" and "Apparent" power. In our industry, the pf is usually created by reactive devices such as motors and transformers. "Apparent Power" is Volts x Amps. This is the "VA" rating of the equipment (or "kVA" if it's divided by 1,000). "Real Power" is Watts or kW – the "useful work" you get from electricity. Since you have stated that you are using Liebert hardware, we'll assume the 0.8 Power Factor on which their designs (and most UPS designs) are based. With this in mind, what's important to understand is that there are really two UPS ratings you can't exceed: 100% load means 400 kVA (kiloVolt-Amperes), but it also means 320 kW (kiloWatts). That comes from the formula kW = kVA x pf. In years past, computer devices with a 0.8 pf were common, so both the kW and kVA ratings of the UPS were reached essentially simultaneously. However, since most of our computers today are designed with much better Power Factors (between 0.95 and 0.99), it is virtually certain that the kW rating will be your limiting factor, not the kVA rating.

(For example, a device measuring 10 Amps at 120 Volts draws 1,200 VA or 1.2 kVA of "Apparent Power." If the Power Factor is 0.8, it consumes only 960 Watts or 0.96 kW of "Real Power" – the same ratio as your UPS ratings. However, with a 0.95 pf, the "Real Power" is 1,140 Watts or 1.14 kW, so the kW capacity of the UPS will be reached before the kVA limit.)
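The kW/kVA arithmetic above can be sketched in a few lines (an illustrative calculation only; the 400 kVA rating, the 0.8 design pf, and the 10 A / 120 V example device are taken from the discussion above):

```python
# Apparent vs. real power for a UPS rated 400 kVA at a 0.8 design power factor.
UPS_KVA = 400.0            # apparent-power rating (kVA)
UPS_PF = 0.8               # design power factor assumed by the UPS
UPS_KW = UPS_KVA * UPS_PF  # real-power rating: 320 kW

def real_power_kw(volts, amps, pf):
    """Real power (kW) drawn by a device: kW = V x A x pf / 1000."""
    return volts * amps * pf / 1000.0

# The 10 A / 120 V example device from the text:
apparent_kva = 120 * 10 / 1000.0          # 1.2 kVA of "Apparent Power"
legacy_kw = real_power_kw(120, 10, 0.8)   # 0.96 kW at the old 0.8 pf
modern_kw = real_power_kw(120, 10, 0.95)  # 1.14 kW at a modern 0.95 pf
```

At a 0.95 pf the device draws more real power for the same apparent power, which is exactly why the kW limit of the UPS is reached before the kVA limit.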

The second "80%" number is the National Electric Code (NEC) requirement for Circuit Breaker Ratings. NEC states that you can't load any circuit to more than 80% of the Breaker Rating. This means, for example, that a 20-Amp Breaker on a 120-Volt circuit, running light bulbs or heating appliances which have a pf of 1.0, cannot be continuously loaded to more than 1,920 Watts (120 x 20 x 80%). A 20-Amp breaker can handle a full 20-Amp load for a short time, such as when a motor starts, but running a sustained current will eventually cause it to trip. That's the way they're designed. This, however, has little to do with how a UPS can be loaded, since all the circuit breakers are designed to operate within legal range when the UPS is at capacity. I explain it only because some people have thought this limited the total UPS loading. It does not.

Now to the third "80%" consideration. Any piece of electrical equipment generates heat when it operates, and the more power it handles, the more heat it produces. That's where the "Real Power" goes; it is converted to heat. Industrially rated devices, such as large UPS systems, are designed to withstand this heat – at least so long as proper cooling and ventilation are provided to remove it in the manner for which the equipment was designed. However, heat eventually causes a breakdown of electrical insulation and shortens the life of components, especially if it's applied over a long period of time. Therefore, "good practice" has always been to operate electrical equipment at 80% or less of rated capacity simply to ensure longer life, as well as to compensate for the fact that virtually nothing actually gets cooled and ventilated in the field as well as it does in the lab, or as perfectly as the specifications call for. But top-quality equipment (and any 400 kVA UPS is bound to be from a manufacturer making high-quality goods) is designed with enough "headroom" to run its entire rated life at 100% loading. If there's a weak point, of course, full-level operation will expose it, and it can then be fixed, but that's not what we're talking about here. Although there is no "written rule" I'm aware of, 80% continuous loading is the generally accepted "rule-of-thumb" for maximizing the service life and reliability of electrical equipment.

Now let's discuss the "Parallel Redundant" or "Power Tie" configuration, because that sheds a different light on the situation. In this configuration, as you obviously are aware, you must manage your power so that neither UPS is loaded beyond 50% of its continuous load rating. (Again, both kW and kVA readings must be examined, with the kW reading likely to be the governing one.) This is so the loss of either UPS, whether due to failure or to an intentional maintenance shutdown, will not load the remaining UPS beyond 100% of its rated load. Should we be concerned that one UPS is running at 100% when the other is shut down? Not at all. Even if this load continues for some number of days, the UPS is only operating as it was designed. It is highly unlikely that a few hours, or even a few days, of full-load operation will shorten its life (again, hidden flaws notwithstanding, in which case any load may cause a failure at some point in time). What we should consider here is normal conditions, where each UPS is operating at less than 50% of its capacity. This is obviously well below the 80% rule-of-thumb, so the UPS is literally coasting. Under this loading, it should run for many years beyond its expected life span if kept clean and the batteries remain in good condition.

Therefore, your stated limit of 40% on each UPS is ultra-conservative. Obviously, you don't want to be so close to the 50% level on either UPS that someone plugging in a temporary device runs you over, but in your situation you have some 32,000 Watts of "2N redundant headroom" at the 40% level, and that's a lot of expensive cushion. (Estimate $1,200 per kW for each UPS, and you're at more than $76,000 in "insurance").
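A quick sketch of that headroom arithmetic (the $1,200-per-kW figure is the rough estimate quoted above, not a firm price):

```python
# Dollar value of the extra headroom from a 40% (vs. 50%) per-UPS load cap.
UPS_KW = 320.0        # kW rating of each 400 kVA UPS at a 0.8 design pf
COST_PER_KW = 1200.0  # estimated installed cost per kW of UPS capacity
N_UPS = 2             # parallel-redundant pair

headroom_kw = (0.50 - 0.40) * UPS_KW             # 32 kW of unused margin per UPS
headroom_watts = headroom_kw * 1000              # the "32,000 Watts" in the text
cushion_cost = headroom_kw * COST_PER_KW * N_UPS  # over $76,000 of "insurance"
```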

As you observed, any good UPS can sustain a little over 100% rating for a short time, so if you happened to exceed 50% temporarily, and a failure were to occur, the second UPS might Alarm, but it should continue to function at least long enough for you to accomplish some manual load-shedding. You have also mentioned, however, allowing capacity for parallel installations and change-outs, which is a valid reason to operate below the 50% level. Only you can determine how much margin you really need for that purpose, but since redundant UPS capacity is expensive, as noted above, it might be more cost-effective to run test setups on lesser, rack-mounted UPS's (full-time, not "line interactive") than to maintain a high level of headroom on a parallel-redundant system.

Surges should not be a particular concern. There's not much in the data center that can cause a significant surge. Those tend to occur on the input side, and are part of what the UPS is supposed to get rid of. (Hopefully, you have good surge protection on your bypass feeders.) The one big thing that must be evaluated in choosing UPS systems for redundant operation is the "step loading function." In the event of a failure, the resulting sudden load shift can be 100%, literally doubling the load on one UPS virtually instantaneously. The UPS must be able to sustain this rapid change, and maintain stable output voltage, current, frequency and waveform, to be suitable for redundant service. This is an easy performance item to verify with, and to compare among, manufacturers.

Regarding your PDU's, the kW capacity is dependent on the pf of your data center loads. Liebert's PDU Technical Data Manual instructs you to assume a 0.8 pf if the actual pf is unknown. This would mean each of your 225 kVA PDU's could deliver only 180 kW of power. Today, it is more probable that the pf is in the order of 0.95, as discussed above, which would mean that each 225 kVA PDU could deliver 214 kW or more. (Incidentally, you should be able to read the kW, kVA and pf from your PDU metering systems.)

If we understand your PDU configuration, you have only two 225 kVA units connected to your 400 kVA parallel redundant UPS, rather than two per UPS, which would be a total of four. If this is correct, then with 0.8 pf loads it would be possible to run each PDU at 89% of capacity without exceeding the kW rating of your redundant UPS (89% x 180 kW x 2 PDU's = 320 kW). If the loads are closer to a 0.95 pf, which is likely, then you could load each PDU to only about 75% of maximum before reaching the limit of your UPS capacity (75% x 214 kW x 2 PDU's = 321 kW). This is obviously well below the 80% "rule of thumb" for both the PDU's and the UPS's. In most data centers, because we like to minimize the number of devices on a single circuit, branch circuits are rarely loaded to more than a fraction of capacity, so the 80% breaker maximum is rarely a consideration, and total PDU loadings are often far less than maximum.
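Those PDU loading percentages can be reproduced with a small calculation (illustrative; it assumes the two-PDU, 320 kW redundant configuration described above):

```python
# Maximum per-PDU loading before the redundant UPS kW ceiling is reached.
UPS_REDUNDANT_KW = 320.0  # kW limit of the parallel-redundant 400 kVA pair
PDU_KVA = 225.0
N_PDU = 2

def max_pdu_load_fraction(load_pf):
    """Fraction of each PDU's kW capacity usable before hitting the UPS limit."""
    pdu_kw = PDU_KVA * load_pf
    return UPS_REDUNDANT_KW / (pdu_kw * N_PDU)

frac_080 = max_pdu_load_fraction(0.80)  # ~89% at 0.8 pf loads
frac_095 = max_pdu_load_fraction(0.95)  # ~75% at 0.95 pf loads
```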

But this is another place where redundancy must be considered. If you have only two PDU's, and you are connecting dual-corded equipment plus, as you indicate, single-corded equipment with Static Transfer Switches (STS), then you must maintain the total load on both PDU's at no more than 100% of one PDU's capacity. The easiest way to ensure this is to keep the loads fairly evenly balanced, and below 50% of capacity on each PDU. In your case, assuming (2) 225 kVA PDU's and equipment with a pf of 0.95, the load on each PDU should be no more than 107 kW or 113 kVA. This would result in a total maximum load on your UPS of only 214 kW (67% of the 320 kW redundant capacity) or just 33.5% capacity on each UPS. This is gross under-utilization of the UPS. But if you load either of your 225 kVA PDU's to more than 50% of capacity (assuming the total now exceeds 100%), and either PDU must be shut down for service, or its main breaker trips, then the total load will instantly shift to the remaining PDU, which will now be overloaded and will also shut down – if not immediately, then in a short time.
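The redundancy constraint above can be expressed as a simple check (illustrative, using the 225 kVA PDU and 0.95 pf figures from the text):

```python
# With two PDUs backing dual-corded (or STS-fed) loads, the combined load
# must fit on a single PDU, so the safe rule is no more than 50% on each.
PDU_KVA = 225.0
LOAD_PF = 0.95
UPS_REDUNDANT_KW = 320.0  # kW ceiling of the redundant UPS pair

pdu_kw = PDU_KVA * LOAD_PF       # ~213.75 kW deliverable per PDU
max_per_pdu_kw = 0.5 * pdu_kw    # ~107 kW allowed on each PDU
max_per_pdu_kva = 0.5 * PDU_KVA  # 112.5 kVA (~113 kVA) on each PDU

total_kw = 2 * max_per_pdu_kw                       # ~214 kW on the UPS pair
ups_utilization = total_kw / UPS_REDUNDANT_KW       # ~67% of redundant capacity
per_ups_utilization = total_kw / (2 * UPS_REDUNDANT_KW)  # ~33.5% per UPS
```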

My preference is to use more, smaller PDU's in order to maintain redundancy, as well as to gain as many discrete circuits as possible. One can then configure data hardware connections to minimize PDU vulnerability, as well as for UPS redundancy, make better use of each PDU, and realize the full usable capacity of the UPS. Without an actual diagram of your installation, of course, we are just speculating as to how you are configured. There are several ways to connect an installation, and our response is based on the way we read your question.

You seem to be running your UPS very conservatively. We are not so sure about how you are running your PDU's, but it seems you have a reasonably good understanding of the principles. We hope this answers your questions, gives you a little more insight, and perhaps lets you confidently get more from your valuable UPS.

Saturday, June 28, 2008

What is the technology difference between modular and traditional centralized uninterruptible power supplies (UPS)? Can we have a mix of both at different zones of the data center?

Either approach, traditional centralized or modular UPS, is perfectly appropriate for data centers – provided, of course, it is configured correctly for the application and is properly sized and power-managed. (Read my article UPS -- It's Not Uninterruptible).

All UPS systems (at least those that should be used in data centers) use some form of double-conversion design that takes alternating current (AC) in, changes it to direct current (DC), which charges the batteries, then re-converts it back to AC.

Many traditional UPS systems have been built this way for years, using rather large "modules", either to create higher capacity systems than were considered practical with single module designs, or to obtain "N+1" redundancy. Three 500 kVA UPSs, for example, could be intended to deliver a maximum of 1,000 kVA, so if any one unit fails or is shut down for service, the full design capacity is still available.

What has occurred in recent years is the use of much smaller modules (10 kVA to 50 kVA) to make up larger UPS systems (from 40 kVA to 1,000 kVA from American Power Conversion Corp., for example). As with anything in engineering, there are advantages and disadvantages.

The principal advantages touted for the modular approach are the ability to grow capacity as needed (assuming the right frame size is installed initially) and reduced maintenance cost, since the modules are hot-swappable and can be returned to the factory by the user for exchange or repair. I find a third potential advantage, which I'll explain later.

Modular systems are also generally designed to accept one more module than is required for their rated capacity, making them inherently "N+1" capable at much lower cost than would be possible with the very large system.

Disadvantages to the modular approach

The disadvantages to modular systems depend on several factors, so my descriptions will tend to be conditional.

The smaller modular systems (up to about 120 kVA) tend to be installed "in-row", as additional cabinets. This means added space and weight in the machine room. Depending on how many cabinet rows are thus equipped, and how their distribution circuits are wired, there may also be a loss of economy of scale because extra capacity in one UPS may not be readily available to another part of the floor that needs it. This can be offset to some extent by moving UPS modules to where they're needed, assuming the frame size is adequate, but over-building with an 80 kVA frame in a row that will never need more than 30 kVA is just not cost-effective.

Next are batteries. Batteries used inside the data center must be valve-regulated lead acid (VRLA). This type of battery is used in the majority of UPS's today, but it carries certain failure risks and lifetime limitations that can add up in replacement costs over time. If you are running a large data center, and prefer the long-term reliability of wet cell lead acid batteries despite their initial cost, construction, and maintenance requirements, then locating smaller UPS's among your cabinet rows will not likely be practical -- running DC any distance requires huge copper wiring that quickly becomes very costly as well as space-consuming. Today, however, you can install large, central systems as either traditional or modular, using whichever type of battery you prefer.

Another factor involves the built-in redundancy of most modular systems. If the frame is fully populated (nine 10 kVA modules in an 80 kVA frame, for example), then there is essentially no problem. If you load beyond 80 kVA you will either receive warning alerts or trip the main circuit breaker, so you should always have the redundancy you bought. However, if the frame is not fully populated, it is your responsibility to manage power so there is always at least one module's worth of unused capacity. Otherwise your redundancy is lost. This is also true when large traditional UPS systems are configured for redundancy, but large system modules don't move around, so their alarm and protection circuits are always properly set.
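The power-management rule for a partially populated frame can be sketched as a simple check (the 10 kVA modules and 80 kVA frame echo the example above; the load figures are hypothetical):

```python
# N+1 check for a partially populated modular frame: keep at least one
# module's worth of capacity unused, or the redundancy you bought is gone.
def is_n_plus_1(load_kva, module_kva, modules_installed):
    """True if the load still fits after losing any single module."""
    return load_kva <= module_kva * (modules_installed - 1)

# An 80 kVA frame populated with only six 10 kVA modules (60 kVA installed):
ok = is_n_plus_1(load_kva=48, module_kva=10, modules_installed=6)    # redundant
lost = is_n_plus_1(load_kva=55, module_kva=10, modules_installed=6)  # not redundant
```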

The biggest debate regarding modular UPS is reliability. It is well known that the more parts in any system, the greater the chance that something will fail. Proponents of traditional UPS will dwell on this factor, but the manufacturers of the newer modular systems have had highly regarded experts run statistical analyses on their systems and can show you both theoretical and field data countering the conventional wisdom. The fact is, today's mainline brand UPS's are all highly reliable. You should probably be weighing other factors more heavily in making a choice.

The last potential advantage to modular UPS systems, to which I alluded earlier, is a relatively new consideration: efficiency. A UPS system runs at highest efficiency when it is near its maximum rated capacity. As load level drops, so does efficiency. The losses may not seem great on the surface, but they add up, and as we become increasingly concerned about energy waste and cost, this starts to become a consideration.

Modular UPS systems can be configured, and readily re-configured, so they are running close to capacity. Large, traditional UPS systems are usually purchased with all the capacity anticipated for the future, so they often run well below capacity for a number of years, if not forever. Redundancy, however, always means running below capacity which also means reduced efficiency. This can be minimized in an "N+1" modular system through careful power management.

However, with any "2N" redundant configuration, regardless of type, it is always necessary to manage power so that no system is loaded beyond 50% of its capacity, otherwise it will overload if the duplicate, load-sharing system fails. As a result, every UPS running in a "2N" mode operates at less than maximum efficiency. Again, with very careful management, a modular UPS may be configured more closely than a larger, fixed-capacity system, and this might result in some long-term power savings. There are many "ifs", "coulds", and "mays" in this scenario.

To quickly answer your last questions: Both UPS types can be mixed, in either the same or in different zones of the data center. Some people use a traditional UPS as their main source, but use smaller, modular systems as the second source for their most critical dual corded hardware to give it "2N" redundancy without incurring that cost for the entire enterprise.

Monday, January 28, 2008

Ceiling Tiles in Data Center

What is your opinion on ceiling tiles in the data center?
We are building new facilities and there have been questions regarding the usefulness of a drop ceiling in the data center. Other than reducing the cost of gas fire suppression, are there other reasons a drop ceiling should be used? Assume the raised floor is 18 inches and there will be overhead cable tray and gas will be used for fire suppression. The structural height is about 14 feet.


We have worked on multiple projects with data centers that use ceiling tiles; here are some comments:

1) The finished floor-to-ceiling height needs to be taken into account; as you correctly note, you will severely limit your infrastructure placement.
2) The ceiling tile creates an air return plenum, just as the space below the raised floor creates the air supply plenum.
3) Keep in mind the type of tile; some get damaged pretty easily and release particles into the air stream.
4) By having raised floor perfs and ceiling grid return air grilles, you can better distribute and control your supply and return air flows.
5) Ceiling tile management is required, just as with raised floor tiles.
6) Cleanliness is critical.
7) I would not use a gas fire suppression system in projects with ceiling tile; however, I have seen this approach used less often due to cost.




Our current data center has tile ceilings, but it is a nightmare... we can't get our gas suppression pressure-tested because the tiles leak. We have retainer clips on them, but at least one or more tiles always pop loose or the corners crack.

Besides, I don't think you have tall enough ceilings to use tiles anyway. We are currently building a new data center, and our engineers did some research: the minimum ceiling height is 14 feet. If you put in tiles you are lowering that, and you won't be able to get the heat far enough away from the racks.

Good point. We have ceiling tile in our DC, and we had to install an exhaust to pump the hot air out of the space above the tile canopy. If I could do it all over again, I'd nix the tile and go with an open ceiling of about 20-25 ft with a raised floor.

Monday, January 21, 2008

Focus on Physical Layer

The data center is the most critical resource of any business, providing the means for storage, management and dissemination of data, applications and communications. Within the data center, large amounts of information are transmitted to and from servers, switches, routers and storage equipment via the physical layer’s low-voltage cabling infrastructure. The design and deployment methods of the cabling infrastructure have a direct impact on data center space savings, proper cooling and reliability and uptime.

Space Savings
Business environments are constantly evolving, and as a result, data center requirements continuously change. Providing plenty of empty floor space when designing your data center enables the flexibility of reallocating space to a particular function, and adding new racks and equipment as needed.

As connections, bandwidth and storage requirements grow, so does the amount of data center cabling connecting key functional areas and equipment. Maximizing space resources is one of the most critical aspects of data center design. Choosing the right mix of cabling and connectivity components can have a direct impact on the amount of real estate required in your data center. Fundamentally, you cannot use the same cabling components designed for low-density LANs and expect them to perform to the level required in a data center. To properly design your data center for space savings:

• Ensure ample overhead and underfloor cable pathways for future growth.
• Select high-density patching solutions that require less rack and floor space.
• Consider higher port-density solutions like 12-fiber MPO cables and cassettes.
• Look for smaller diameter cables that take up less pathway space.

Expanding the physical space of a data center requires construction, movement of people and equipment, recabling and downtime. Expansion can cost more than the original data center build itself. Given these consequences, properly designing the data center for space savings at the start is essential. TIA-942 Telecommunications Infrastructure Standard for Data Centers, which was published in 2005 and specifies requirements and guidelines for data center infrastructures, covers cabling distances, pathways, site selection, space and layout. This standard is a valuable tool in designing your data center infrastructure for maximum space savings.

Proper Cooling
The reliability of data center equipment is directly tied to proper cooling. Servers and equipment are getting smaller and more powerful, which concentrates an enormous amount of heat into a smaller area. Proper cooling equipment is a must, as well as the use of hot aisle/cold aisle configuration where equipment racks are arranged in alternating rows of hot and cold aisles. This practice, which is recommended in the TIA-942 standard, allows cold air from the cold aisle to wash over the equipment where it is then expelled out the back into the hot aisle (see Figure 1).

Figure 1: Hot Aisle/Cold Aisle Cooling

Good cable management solutions are also necessary for proper cooling. Cables that are not properly stored and organized can block air inlets and exits, which can raise the temperature of switches and servers. Other considerations for cooling include the following:

• Increase airflow by removing obstacles to air movement, blocking unnecessary air escapes, and/or increasing the height of the raised floor.
• Spread equipment out over unused portions of the raised floor, space permitting.
• Use open racks instead of cabinets when security is not a concern, or use cabinets with mesh fronts and backs.
• Choose components that manage fiber overhead, reducing the need to store it in the raised floor and helping to increase airflow.
• Use perforated tiles with larger openings.

Reliability & Uptime
When employees and customers are unable to access the servers, storage systems and networking devices that reside in the data center, your entire organization can shut down, and millions of dollars can be lost in a matter of minutes. With 70 percent of network downtime attributed to physical layer problems, specifically cabling faults, it’s paramount that more consideration is given to the cabling infrastructure design and deployment.

As information is sent back and forth within your facility and with the outside world, huge streams of data are transferred to and from equipment areas at extremely high data rates. The low-voltage cabling deployed in the data center must consistently support the flow of data without errors that cause retransmission and delays. A substandard performing data center can be just as costly and disruptive to your business as total downtime.

As networks expand and bandwidth demands increase, the cabling should be selected to support current needs while enabling migration to higher network speeds. In fact, the cabling chosen for the data center should be designed and implemented to outlast the applications and equipment it supports by at least 10 to 15 years. With 10 Gigabit Ethernet already a reality, that means implementing the highest-performing cable available, such as augmented category 6 copper cabling and laser-optimized 50µm multimode fiber. These types of copper and fiber cabling will support bandwidth requirements for the future and ensure reliability of your data center for many years to come.

The protection of cabling and connections is a key factor in ensuring data center reliability and uptime. When cabling is bent beyond its specified minimum bend radius, it can cause transmission failures, and as more cables are added to routing paths, the possibility of bend radius violation increases (see Figure 2). The separation of cable types in horizontal pathways and physical protection of both cable and connections should also be implemented to prevent possible damage.

Figure 2: Care must be taken to avoid violating minimum
bend radius when adding fibers

Manageability is also key to maintaining uptime, and it starts with strategic, unified cable management that keeps cabling and connections properly stored and organized, easy to locate and access, and simple to reconfigure. Infrastructure components that offer proper cable management reduce the time required for identifying, routing and rerouting cables during upgrades and changes, thereby reducing downtime.

The use of a central patching location in a cross-connect scenario is the optimum solution for enhanced manageability in the data center, providing a logical and easy-to-manage infrastructure whereby all network elements have permanent equipment cable connections that once terminated, are never handled again. In this scenario, all modifications, rerouting, upgrades and maintenance activities are accomplished using semi-permanent patch cord connections on the front of the cross-connect systems (see Figure 3).

Figure 3: Interconnect vs. Cross-Connect

To improve the reliability and uptime of the data center:

• Choose the highest performing cabling and connectivity backed by a reputable manufacturer and engineered for uptime with guaranteed error-free performance.
• Select components that maintain proper bend radius, efficiently manage cable slack, and provide separation of cable types and physical protection.
• Deploy common rack frames with ample cable management that simplify cable routing and ensure paths are clearly defined and intuitive to follow.
• Use connectivity components that ensure connectors are easily defined and accessed with minimal disruption to adjacent connections.
• Deploy plug-and-play cabling solutions for faster configuration and upgrades.
• Use a central patching location in a cross-connect scenario.


Summary
The enterprise network is made up of layers with each layer supporting the one above it. When transmitting information across the network, control starts at the application layer and is moved from one layer to the next until it reaches the physical layer at the bottom where low-voltage cabling and components provide the means for sending and receiving the data. Since the total cost for low-voltage cabling components of the physical layer is but a fraction of the entire data center cost, decisions for selecting that physical layer are often taken lightly. But the fact remains that the cabling infrastructure is the core foundation upon which everything else depends – failure at the physical layer affects the entire network.

By recognizing the value of the data center cabling infrastructure, you can ensure that employees and customers have access to the servers, storage systems and networking devices they need to carry out daily business transactions and remain productive. Selecting fiber and copper cable, connectivity and cable management components that work together to satisfy space-savings, reliability and uptime requirements lowers the total cost of ownership. This is the ultimate means to a thriving data center and an overall successful business.

About the Author
John Schmidt is the Senior Product Manager for Structured Cabling at ADC. John has been with ADC for 10 years in a variety of design engineering and product management roles. He is the author of several articles, white papers and presentations related to the design of telecommunications and data networks. John holds a Bachelor of Science degree in Engineering from the University of Minnesota and 10 patents for telecommunications and network equipment design.

About ADC
Founded in 1935, ADC provides the connections for wireline, wireless, cable, broadcast and enterprise networks around the world. ADC’s network infrastructure equipment and professional services enable high-speed Internet, data, video, and voice services to residential, business, and mobile subscribers. The company sells products and services in more than 130 countries. Today, ADC is focused on serving the converged network, carrying simultaneous voice, data, and video services over wireline and wireless connections via products engineered for uptime. For more information about ADC, call 1-800-366-3891 or visit www.adc.com.

Saturday, January 19, 2008

Isolated Ground in Data Center

Do I need isolated grounds in my data center?

31 Jan 2007 | Robert McFarlane, Contributor

I've been advocating against isolated grounds in data centers for years. The fact is, unless you use very special mounting hardware on everything and take an unrealistic level of care with the installation of each piece of equipment, you will corrupt the "IG" with the first device you mount.

Why? Because each device has a metal chassis with a built-in safety ground (required by code), and that chassis is screwed into a metal cabinet that had better also be grounded. You now have two ground paths: one to the standard power ground, and one to your so-called "IG." Each piece of installed equipment creates another dual-ground path, so the whole "IG" system is no longer isolated.

"Isolated grounds" were developed for early, sensitive computers. Those computers were installed in an office environment where all sorts of other equipment were also connected, which put electrical noise on the line.

Today's boxes are much more stable, as evidenced by the fact that nearly every home has one, and power corruption problems are rarely seen. The much more sophisticated servers and storage we install in data centers do need good grounding, but that does not mean a true "isolated ground."

Dual Power Supplies

More on power supplies

Robert McFarlane

Following on an earlier question about dual power cords, do the power supplies have to be on the same phase? Or can they be fed from entirely separate power grids? (We will effectively have two supply systems with inline diesel rotary UPS.) Also, don't you create large fault currents if you parallel two electrical supplies?

EXPERT RESPONSE

In answering this question, we must assume that the computing hardware you are using is of true "dual corded" design, in which each power cord connects internally to a totally separate power supply. In a true "dual corded" device, the only thing that should be common to the two power cords is the safety grounding conductor that connects to the computing device chassis. Unfortunately, there have been some "fly-by-night" products on the market, thankfully rare, that went so far as to splice the "dual cords" together inside the equipment and connect them to a single power supply. This is illegal, dangerous, and completely unethical. You should have no concerns about major name-brand products, but if you buy some interesting, off-brand "garage shop" device, perhaps you should look inside before plugging it in, because the answer that follows doesn't apply to equipment built this way.

Understand that the purpose of the power supply inside any computing device is to convert line-voltage alternating current (AC) to the low-voltage direct current (DC) required to run the computing circuitry. The two independent power supplies, each connected to a different incoming AC line, therefore completely isolate one AC line from the other. Power is paralleled only on the DC side, where positive and negative are clearly defined and "phase" is no longer an issue. Furthermore, DC paralleling is generally done through isolation diodes, so the two supplies "load share" and neither supply can back-feed and affect the other. Therefore, you should be able to operate any truly "dual corded, dual power supply" device from any two power sources. One could even be the utility company AC line and the other a local generator, with neither having any reference to the other. So long as both sources are within the operating voltage range of the computing device, each has sufficient current capacity, and the entire system is properly grounded, there should be no concern. (Grounding is most often the thing that gets done wrong, and it is worth examining closely in any complex power system.)
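The diode "load share" behavior described above can be sketched numerically. This is an illustrative model only, not anything from the original article: two DC outputs OR'd through diodes into a common load, with the 12 V rail voltage, 0.7 V diode drop and resistance values all assumed for demonstration.

```python
def diode_or_currents(v1, v2, r_int=0.05, v_d=0.7, r_load=1.0):
    """Return (i1, i2, v_bus) for two DC supplies OR'd through diodes.

    v1, v2   : supply output voltages (V)
    r_int    : each supply's internal resistance (ohms, assumed)
    v_d      : diode forward drop (V, assumed)
    r_load   : shared load resistance (ohms, assumed)
    """
    g = 1.0 / r_int
    # First assume both diodes conduct and solve the bus-node equation:
    #   (v1 - v_d - vb)/r_int + (v2 - v_d - vb)/r_int = vb/r_load
    vb = ((v1 - v_d) * g + (v2 - v_d) * g) / (2.0 * g + 1.0 / r_load)
    i1 = max((v1 - v_d - vb) / r_int, 0.0)
    i2 = max((v2 - v_d - vb) / r_int, 0.0)
    # If one diode is actually reverse-biased (current clamped to zero),
    # re-solve with only the healthy supply carrying the whole load --
    # this is the "no back-feed" property the diodes provide.
    if i1 == 0.0 or i2 == 0.0:
        v_on = max(v1, v2)
        vb = (v_on - v_d) * g / (g + 1.0 / r_load)
        i_on = (v_on - v_d - vb) / r_int
        return (i_on, 0.0, vb) if v1 >= v2 else (0.0, i_on, vb)
    return i1, i2, vb
```

With two healthy, equal supplies the model splits the load current evenly; fail one supply (set its voltage to zero) and its diode blocks, the bus voltage sags slightly, and the survivor picks up the full load with no current flowing back into the dead unit.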

Paralleling actual power sources, such as two generators or two UPSs, can only be done through proper paralleling gear. In simple terms, the paralleling gear keeps the two or more sources in phase synchronization, and also provides isolation against back-feeds and fault currents so that neither power source "sees" current from the other. But if you simply connect two sources together, without regard to phase or anything else, then yes, you're certainly going to have major problems, on top of violating code.
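The scale of that problem is easy to see with a little trigonometry. As a rough, hypothetical illustration (the 480 V level and the tie impedance below are assumed values, not from the article): the RMS voltage appearing between two equal-magnitude sources that are delta degrees out of phase is 2 * V * sin(delta/2), and dividing that by the very small impedance between them gives the prospective fault current.

```python
import math

def rms_difference(v_rms, phase_deg):
    """RMS voltage across two equal-magnitude AC sources phase_deg apart."""
    return 2.0 * v_rms * math.sin(math.radians(phase_deg) / 2.0)

# Two 480 V sources tied together 120 degrees out of phase (assumed values):
v_fault = rms_difference(480.0, 120.0)  # ~831 V driving the tie
i_fault = v_fault / 0.05                # with only 0.05 ohm between them
print(round(v_fault, 1), round(i_fault))
```

Even a modest phase error produces hundreds of volts across the tie, and with milliohm-to-centiohm impedances the resulting current runs into the thousands of amperes, which is exactly why paralleling gear must hold the sources in synchronism before closing the tie.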