Electronic Packaging: Everything a system needs
Cooling strategies: appropriate, effective and reliable
In the third part of this series we consider cooling, that is, the removal of waste heat, in electronic systems such as VMEbus, CompactPCI, MicroTCA and AdvancedTCA. Defined airflows and the parallel arrangement of the PCBs in these systems allow a level of air cooling that is optimised and generally sufficient. The thermal requirements for cooling the boards and the overall system are set out in the PICMG specifications for CompactPCI, MicroTCA and AdvancedTCA and in the VITA specifications for VMEbus.
Removal of waste heat in electronic systems
Heat dissipation in electronics packaging systems has meanwhile reached a level at which active cooling is invariably necessary. As a rule this takes the form of air cooling, which at present can be provided in three ways:
• With unregulated fans: the fans run continuously at a fixed speed. This very inexpensive solution is particularly suited to situations in which thermal and noise considerations are non-critical.
• With simple temperature-controlled fans: this efficient solution offers better noise performance and is suitable for more demanding systems.
• With a shelf manager: this controls the fan speed according to a predetermined algorithm and also exchanges information on the current thermal status of the system. This allows highly effective control of the air cooling, adapted precisely to the current operating point of the system. This solution is best suited to high-performance systems and others in which thermal conditions are critical. Because the expected fan speeds are established and set in advance in relation to the temperature gradients, the effect of this form of control on the system temperature is predictable.
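The second option in the list above, a simple temperature-controlled fan, can be pictured as a clamped linear ramp from a minimum to a maximum fan duty. The following is only a minimal sketch: the function name, the temperature limits and the duty values are hypothetical and are not taken from any specification or product.

```python
def fan_duty_percent(temp_c: float,
                     t_min: float = 30.0,     # hypothetical: below this, fans idle
                     t_max: float = 55.0,     # hypothetical: above this, fans run flat out
                     duty_min: float = 30.0,  # hypothetical idle duty in percent
                     duty_max: float = 100.0) -> float:
    """Map a measured temperature to a fan duty using a clamped linear ramp."""
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    if temp_c <= t_min:
        return duty_min
    if temp_c >= t_max:
        return duty_max
    fraction = (temp_c - t_min) / (t_max - t_min)
    return duty_min + fraction * (duty_max - duty_min)


if __name__ == "__main__":
    for temperature in (20.0, 42.5, 60.0):
        print(temperature, fan_duty_percent(temperature))
```

An unregulated fan corresponds to a constant duty regardless of temperature; a shelf manager would replace the fixed ramp with an algorithm that takes several sensors into account.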
Under certain conditions, where packing densities are high and power consumption is close to the maximum, the boards in the system may be cooled more or less directly by water. This method of heat removal is, however, not yet mature enough in electronics packaging systems to be applied as standard in ordinary applications; for now it remains restricted to specific projects.
Higher power consumption causes more dissipated heat
In VMEbus and CompactPCI the maximum heat loss per board is not precisely specified; it is limited ultimately by the maximum current-carrying capacity of the pins. In practice the power dissipation lies between 30 and 100 W, of which some 10 % falls to the RTM (rear transition module) boards. These relatively low heat losses allow the use of low-cost, unregulated 12 VDC fans, which have become particularly established in CompactPCI systems. Such systems thus remain relatively low in cost overall. The costlier VMEbus systems, on the other hand, with heat dissipation similar to that of CompactPCI, generally use controlled fans. Here the fan speed is controlled on the basis of temperature, for example by a fan control module. This keeps fan noise to a minimum and at the same time increases the service life of the fans. In MicroTCA systems heat dissipation can lie anywhere between 20 and 80 W per module, depending on the size of the AdvancedMC modules employed. AdvancedTCA systems, however, with 14 to 16 boards producing some 230 W of heat each (of which about 30 W falls to the RTM), can produce a total of up to 3680 W, and providing the requisite cooling poses a tough challenge. The fans used for cooling the boards are controlled and monitored by a shelf manager as described above.
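The arithmetic behind the 3680 W figure, and a rough idea of the air volume needed to carry that heat away, can be sketched with the steady-state heat balance P = ρ · cp · Q · ΔT. The air properties and the 10 K temperature rise used below are assumptions chosen for illustration; they are not values from the AdvancedTCA specification.

```python
AIR_DENSITY = 1.2          # kg/m^3, assumed (air at roughly 20 °C, sea level)
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K), assumed


def shelf_heat_w(boards: int, watts_per_board: float) -> float:
    """Total heat dissipated by a fully populated shelf."""
    return boards * watts_per_board


def required_airflow_m3h(heat_w: float, delta_t_k: float) -> float:
    """Air volume per hour needed to remove heat_w at a given air temperature rise."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    flow_m3_per_s = heat_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_k)
    return flow_m3_per_s * 3600


if __name__ == "__main__":
    heat = shelf_heat_w(boards=16, watts_per_board=230)   # 3680 W, as in the text
    flow = required_airflow_m3h(heat, delta_t_k=10)       # 10 K rise is an assumption
    print(f"{heat:.0f} W needs about {flow:.0f} m^3/h at a 10 K rise")
```

Under these assumptions a fully loaded shelf needs roughly 1100 m³/h of air. The estimate ignores leakage and uneven flow distribution across the slots, so it is suitable only for a first sizing.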
Fig. 1: The Varistar electronics cabinet platform with LHX 12 air/water heat exchanger
Where more than one AdvancedTCA system is housed in one cabinet, a realistic alternative is an air/water heat exchanger such as Schroff's Varistar LHX 12 (Fig. 1) or LHX 20 module. These units are capable of continuously maintaining an air temperature in AdvancedTCA systems of, say, 20 °C at heat dissipation levels of up to 12 or 20 kW respectively. In view of imminent developments in AdvancedTCA systems, whereby the maximum permitted dissipation per board may reach 350 W and thus 5.6 kW per system, the generous 20 kW capacity of an LHX 20 does not appear excessive.
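How many fully loaded shelves a heat exchanger of a given capacity can serve follows directly from the figures above. The sketch below is a simple division; it assumes that the shelves are the only heat source in the cabinet and that the rated capacity is fully available, both of which are simplifications.

```python
def shelves_per_exchanger(capacity_w: float, shelf_heat_w: float) -> int:
    """Whole number of shelves whose combined heat stays within the capacity."""
    if shelf_heat_w <= 0:
        raise ValueError("shelf_heat_w must be positive")
    return int(capacity_w // shelf_heat_w)


if __name__ == "__main__":
    today = 16 * 230    # 3680 W per shelf, as in the text
    future = 16 * 350   # 5600 W per shelf, as in the text
    for capacity in (12_000, 20_000):
        print(capacity,
              shelves_per_exchanger(capacity, today),
              shelves_per_exchanger(capacity, future))
```

With 5.6 kW per shelf, a 20 kW unit covers three shelves, which illustrates why that capacity does not appear excessive.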
Various cooling strategies
The choice of an appropriate cooling concept must be made early in the system design stage, as electronics packaging systems are constructed in a variety of versions. Small systems with only a few slots are generally based on horizontally-mounted component groups. This requires horizontal cooling of the electronics. In applications of this type, air is normally drawn in on the left-hand side and extracted on the right, or vice versa. The necessary lateral spaces must be made available for such an airflow pattern. Care should also be taken to ensure that the warm exhausted air of one system is not then drawn into others.
Larger systems with more slots usually consist of vertically arranged boards that are extracted and inserted from the front. Accordingly, the cooling air must flow vertically, in most cases from the bottom to the top. However, where multiple systems are mounted one above the other in the cabinet, the warm air from lower-positioned components may be drawn into units mounted higher up. In such situations the airflow is arranged from front to rear (Fig. 2). In special cases, where the front of a system is sealed by front panels and the rear by backplanes, an air entry channel is provided in the lower front area and an air extraction channel at the upper rear. The fan units fitted in these channels then provide an independent flow of air.
fa909, 08/2009
Fig. 2: Schematic representation of airflow from bottom to top and from front to rear
Where multiple systems are contained in a cabinet, additional measures may be necessary to boost the airflow to the required level. This may extend to the cooling of an entire, sealed cabinet. The first step in this direction is the use of additional fans in the rear door or top cover. In certain applications pressure fans can also be used to provide the necessary volume of cold air at the front of the cabinet. These are then often connected to an existing room air conditioner. In closed cabinets air/water heat exchangers are mostly used, and, less frequently, air conditioners. Decisive factors in the choice of cooling strategy are always the intended location of the system (e.g. office, laboratory or industrial environment) with its associated requirements for noise and IP protection, the facilities available at the site (e.g. false floor, cold water supply, room air conditioner) and environmental influences (e.g. additional heat radiation, air impurities, high humidity). The individual cooling components must always be matched to the desired overall result and to the other cooling components used. In a forced air-cooling arrangement, for example, the fans employed must not interfere with each other's operation. If multiple fans are to be used together, care should be taken to ensure that their performance characteristics are matched to one another.
Redundancy and hot swap
Where a specified availability or indeed high availability is required of an electronics packaging system, the redundancy provision of the system must extend to the cooling. This applies in particular to the fans, which are generally considered critical components because they contain mechanical moving parts. They must thus be arranged, in terms of number and circuit configuration, such that the overall system remains intact and fully functioning even if a fan fails. Where demands are this high, it must also be possible to replace faulty fans during system operation (hot swap), which is often achieved by using multiple, separate, overdimensioned fan units. At the next level down (no hot swap), the fans are positioned e.g. on an insertable fan plate, so that they can be swapped without excessive physical effort. Where redundancy and availability are not of the greatest importance, fans can be built into fixed positions. Where necessary, the system can then be shut down, or in some cases partly disabled, and the fans swapped.
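Whether a set of overdimensioned fan units still delivers enough air after a failure can be checked with a simple worst-case calculation. The sketch below assumes, for rough dimensioning only, that the airflows of parallel fans simply add up; the function name and the example figures are hypothetical.

```python
from typing import Sequence


def survives_fan_failures(fan_airflows: Sequence[float],
                          required_airflow: float,
                          failures: int = 1) -> bool:
    """Check whether the remaining fans cover the demand in the worst case.

    Worst case means that the fans with the largest airflow are the ones that fail.
    Airflows are simply summed, which is adequate only for rough dimensioning.
    """
    if failures < 0:
        raise ValueError("failures must not be negative")
    keep = max(len(fan_airflows) - failures, 0)
    remaining = sorted(fan_airflows)[:keep]   # drop the largest fans
    return sum(remaining) >= required_airflow


if __name__ == "__main__":
    # Hypothetical example: four fans of 300 m^3/h each, 900 m^3/h required.
    print(survives_fan_failures([300, 300, 300, 300], 900))  # one failure tolerated
    print(survives_fan_failures([300, 300, 300], 900))       # no reserve for a failure
```

In a real design the remaining airflow after a failure would be read from measured fan and system curves rather than summed, since a stopped fan can also open a leakage path.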
Choice of fans
Many parameters must be observed in selecting the appropriate fan, and these are closely interrelated:
• Positioning and dimensioning must ensure that the maximum airflow volume that the fans must support, including all redundancy scenarios, is obtained.
• The maximum current drawn by each individual fan must be matched to the power availability of the system, and the heat thereby generated must be included in the overall heat balance calculation.
• The size, geometry and airflow must be matched to the available space. Here the decision must often be taken between axial and radial fans.
• The fans must meet the specified conditions in regard to supply voltage (e.g. 230 VAC or 24 VDC, etc), the required alarm signals and the type of control (e.g. temperature-controlled fan speed).
• The fans selected must also meet the permitted noise emissions when at maximum operation and/or at the normal operating point. This requires consideration not only of the manufacturer's data but also of the actual noise created by the air movement, turbulence etc and the empirically observed noise level.
Manufacturers' tables normally give only free-blowing air volumes. In a real chassis this value is of lesser importance. What matters is, first, to determine the actual operating point of the fans in relation to the air impedance curve of the system and, second, to position the fans accordingly. This requires that the individual air impedance curves (Fig. 3) are plotted in advance; these are principally a function of changes in airflow direction, perforations, constrictions and widenings in the channels and, not least, of the PCBs to be cooled. The true fan performance characteristics should also be plotted, since simply adding air volumes together (for multiple fans in parallel) or adding static pressures (for fans in series) is adequate only for rough dimensioning. In tightly packed electronics packaging systems the fans are often positioned so closely together that they affect one another significantly. The point at which the system impedance curve crosses the fan performance curve is known as the operating point.
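Finding the operating point amounts to finding the airflow at which the static pressure delivered by the fan equals the pressure drop across the system. The sketch below locates that crossing by bisection. Both example curves are assumed idealisations (a parabolic fan curve and a quadratic impedance curve with hypothetical coefficients); in practice measured curves would be used instead.

```python
from typing import Callable, Tuple

Curve = Callable[[float], float]   # airflow in m^3/h -> pressure in Pa


def example_fan_curve(q: float) -> float:
    """Assumed fan curve: 100 Pa at zero flow, 200 m^3/h free-blowing."""
    return 100.0 * (1.0 - (q / 200.0) ** 2)


def example_system_curve(q: float) -> float:
    """Assumed system impedance: pressure drop grows with the square of the flow."""
    return 0.005 * q ** 2


def operating_point(fan: Curve, system: Curve, q_free: float,
                    tol: float = 1e-6) -> Tuple[float, float]:
    """Return (airflow, pressure) where the fan curve crosses the system curve."""
    lo, hi = 0.0, q_free
    if fan(lo) < system(lo) or fan(hi) > system(hi):
        raise ValueError("curves do not cross between zero and free-blowing flow")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fan(mid) > system(mid):
            lo = mid      # fan still has surplus pressure: crossing lies further right
        else:
            hi = mid
    q = (lo + hi) / 2.0
    return q, system(q)


if __name__ == "__main__":
    flow, pressure = operating_point(example_fan_curve, example_system_curve, 200.0)
    print(f"operating point: {flow:.1f} m^3/h at {pressure:.1f} Pa")
```

With these assumed curves the fan delivers about 115 m³/h in the chassis, well below its free-blowing 200 m³/h, which is why the table value alone says little about the real situation.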
Fig. 3: Impedance curve of a CompactPCI system linked to the fan performance characteristic
Influence of specifications
The most important characteristics of all electronics packaging systems are set out, in varying levels of detail, in the various specifications. It is therefore wise to check these for any requirements on power dissipation and cooling. The largest amount of data is found for AdvancedTCA and MicroTCA systems. The AdvancedTCA specification not only defines the maximum heat dissipation per board, including the division between the front and rear areas, but also prescribes in detail the minimum speeds of the moving air. Even the procedure to be followed on the failure of a given fan is documented in order to assure the highest possible availability of the overall system. To ensure the universal applicability of the system it is also specified that the cold air must always be drawn in at the front bottom and extracted at the rear top.
The MicroTCA specification likewise defines precisely the maximum heat dissipation of the individual modules (AdvancedMCs): 20 W for single compact AdvancedMCs and up to 80 W for double full-size modules. Here, forced air cooling is prescribed as the cooling method. Various cooling configurations are suggested and the air distribution in the slot is defined. The effect of changes in air pressure when systems are located at different altitudes above sea level is also covered. Air filters, thermal sensors and much else are also specified. In the VMEbus and CompactPCI specifications, on the other hand, only minimal information is provided on cooling requirements. There are also further standards in both the USA (NEBS, the Network Equipment-Building System) and Europe (ETSI, the European Telecommunications Standards Institute) that place very strict requirements on noise generation.
The system designer now has the task, within the latitude available, of providing an adequate air supply to the individual component groups without violating the requirements of the applicable specification. One choice to be made is, for example, between the push and the pull principle. In the push principle (Fig. 4) the fan is positioned in front of the components and pushes the air through the slot. The advantage of this principle is that the fan, its motor and associated electronics also benefit from the cold airstream entering the system, resulting in a longer operating life. Its disadvantage is that the heat generated by the fan motor is blown into the board cage and thus raises the baseline temperature there.
Fig. 4: Schematic representation of a push system
With the pull principle (Fig. 5), the fan sucks air from behind the board cage. The heat generated by the fan motor thus does not affect the temperature in the board cage. The disadvantage, however, is that the fan motor and its electronics are exposed to the waste heat from the component groups, which can in some cases considerably shorten their operating life.
Fig. 5: Schematic representation of a pull system
Cooling management systems
A particular feature of AdvancedTCA and MicroTCA is the management strategy, which also affects the ventilation.
AdvancedTCA:
In AdvancedTCA systems the shelf manager assumes the task of fan monitoring and controls the fan speed in relation to the cooling requirements. More specifically, the shelf manager processes the information provided by the sensors and converts it into fan control commands. Every AdvancedTCA board has one or more temperature sensors that are positioned at the critical points of the board and communicate with the shelf manager via the IPMB (intelligent platform management bus). The sensors are programmed with three threshold values at the upper end of the temperature range. The first of these is known as the 'non-critical threshold'. If the temperature of the sensor exceeds this value, the sensor requests an appropriate increase in fan speed from the shelf manager. Should the temperature exceed the second value, the 'critical threshold', the fans are set to maximum speed; additionally, the shelf manager requests the AdvancedTCA boards to reduce their power consumption and the Telco alarm status is set to 'major'. In the event that the sensor temperature exceeds the third value, the 'non-recoverable threshold', the board concerned is disabled in order to avoid damage to the board or indeed the risk of fire. In this situation the Telco alarm status is set to 'critical'.
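The three-threshold behaviour described above can be summarised as a small decision function. This is a sketch of the logic as described in this article, not of a real shelf manager or IPMI interface: the class and function names and the threshold temperatures are hypothetical, and the behaviour of the fans and the power request at the non-recoverable level is an assumption, since the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    non_critical: float
    critical: float
    non_recoverable: float


@dataclass(frozen=True)
class Response:
    fan_action: str              # "normal", "increase" or "maximum"
    reduce_board_power: bool
    disable_board: bool
    telco_alarm: Optional[str]   # None, "major" or "critical"


def thermal_response(temp_c: float, t: Thresholds) -> Response:
    """Map a sensor temperature to the reaction described in the text."""
    if temp_c > t.non_recoverable:
        # Board is disabled; keeping fans at maximum and the power request
        # in force is an assumption, not stated in the text.
        return Response("maximum", True, True, "critical")
    if temp_c > t.critical:
        return Response("maximum", True, False, "major")
    if temp_c > t.non_critical:
        # The text mentions no alarm at this level.
        return Response("increase", False, False, None)
    return Response("normal", False, False, None)


if __name__ == "__main__":
    limits = Thresholds(non_critical=60.0, critical=75.0, non_recoverable=90.0)  # hypothetical
    for temperature in (45.0, 65.0, 80.0, 95.0):
        print(temperature, thermal_response(temperature, limits))
```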
MicroTCA:
In a MicroTCA system each AdvancedMC module, power module and even the ventilation must be equipped with a management controller. The MicroTCA carrier hub (MCH) controls the individual management components. MicroTCA defines so-called cooling units (CUs). A system may contain one or two of these cooling units. Each cooling unit contains one or more fans. One cooling unit should be sufficient for the system's cooling needs, and the optional second unit then acts as a redundancy component. With such redundant cooling it is good practice to place one cooling unit in front of the AdvancedMC modules and the other behind them. Such a solution is known as the push/pull principle. To facilitate easy exchange of the cooling units, these are realised as fan units that can be simply swapped over from the front.
In MicroTCA, as with AdvancedTCA, the shelf manager monitors the system temperature. The AdvancedMC modules (e.g. CPUs) contained in a MicroTCA system are provided with their own temperature sensors and logic. If a temperature exceeds the preset threshold value, this information is forwarded as an 'event' to the shelf manager. The shelf manager then instructs the cooling unit to adjust the fan speed accordingly. Similarly, the cooling unit reports the status of the fans and any fault or malfunction to the shelf manager. An intelligent fan control system of this type provides many benefits. The service life of the fans is increased, since the fan speed is matched at all times to the actual cooling requirement of the system. Reduced fan speed also means less noise and lower electrical energy consumption.
Energy efficiency issues
Energy efficiency can be understood as obtaining a desired outcome with the minimum possible consumption of energy. In relation to cooling in electronics packaging systems, it means using only the amount of energy that is absolutely necessary to obtain the required cooling. One factor in assuring energy efficiency is thus the use of speed-controlled fans. Equally important is the matching of temperatures and air volumes. The temperature difference between incoming and outgoing air should be as great as possible: the greater this difference, the less air has to be moved to remove a given amount of heat, and the more efficient the cooling. Also, the higher the incoming air temperature may be set, the less cooling capacity is required. Manufacturers have already recognised this in the area of server cooling and in some cases have allowed the incoming air temperature to approach 30 °C; the air exit temperature has been modified accordingly. As a result, merely by changing the operating temperature range, efficiency has increased and costs have fallen. This development is certain to spread to electronics packaging applications. Most of today's components are already designed to operate at higher temperatures. It will be a significant task, however, to find the optimum balance between the protection of components and low energy costs.
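Why a larger temperature difference pays off so clearly can be illustrated with the idealised fan affinity laws, under which airflow is proportional to fan speed and fan power to the cube of the speed. These laws are not part of the text above and hold only approximately for real fans; the function names below are hypothetical.

```python
def relative_fan_power(speed_fraction: float) -> float:
    """Fan power relative to full speed, using the idealised cube law."""
    if speed_fraction < 0:
        raise ValueError("speed_fraction must not be negative")
    return speed_fraction ** 3


def fan_power_ratio_for_delta_t(delta_t_old_k: float, delta_t_new_k: float) -> float:
    """Relative fan power after changing the air temperature rise at constant heat load.

    For a fixed heat load the required airflow is inversely proportional to the
    temperature rise, and airflow is assumed proportional to fan speed.
    """
    if delta_t_old_k <= 0 or delta_t_new_k <= 0:
        raise ValueError("temperature differences must be positive")
    return relative_fan_power(delta_t_old_k / delta_t_new_k)


if __name__ == "__main__":
    print(relative_fan_power(0.8))                   # fans at 80 % speed
    print(fan_power_ratio_for_delta_t(10.0, 20.0))   # doubling the temperature rise
```

Under these assumptions, doubling the permitted temperature rise halves the required airflow and cuts the fan power to about one eighth, which is the reasoning behind both speed-controlled fans and a wider operating temperature range.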
Experience and simulation - the key to an optimal cooling strategy
Cooling designs for electronics packaging systems can be realised in many different ways. Nevertheless, the best design for each situation is always obtained through consideration of the specific application and conditions. Thanks to the many years of experience of Schroff's cooling specialists with VMEbus, CompactPCI, AdvancedTCA and MicroTCA systems, the company can competently implement the cooling requirements of users' individual applications. The use of simulation software (Fig. 6) allows the cooling to be optimised prior to the time-consuming design and construction of prototypes. The result is a considerable reduction in development costs. The prototype can then be tested in Schroff's climate bay or wind tunnel in detail.
Fig. 6: Simulation of heat patterns in a CompactPCI system
Notes on the author:

