APPLYING “SANDY BRIDGE” TO EMBEDDED APPLICATIONS

Overview

Market demand continues to grow for high-performance imaging, communications and security systems. Solutions need to contain scalable power-efficient processors with better integrated graphics capabilities than ever before. For example, medical professionals need to view and interpret the results of scans correctly and efficiently. In other cases, remote workers need fast VPN access to servers through the same interfaces that must be protected from viruses and hackers by wire-speed deep packet inspection.

To provide these solutions, Intel’s latest embedded platform, “Sandy Bridge”, introduces a number of new and upgraded technologies for graphics and security.


Sandy Bridge Defined

Sandy Bridge is the code name for the second-generation microarchitecture for the Intel® Core™ i-series and Xeon® series processors, following Nehalem. Sandy Bridge is also used as the platform name for two-chip solutions for mobile-class, desktop-class and server-class computing. As a platform, it succeeds the first-generation Core i-series platforms based on Westmere; Calpella is the mobile Westmere platform.

Within the platform, certain dual-core and quad-core processors are selected by Intel to be placed on the embedded roadmap for 7-year availability prior to end-of-life (EOL). Refer to Appendix A for a list of these processors. Pentium- and Celeron-branded parts are entry-level dual-core chips. The processors are then paired with specific Platform Controller Hubs (PCHs) to form the two-chip platforms as follows.

Series          Processors                  TDP Range   Package               Chipset
Mobile-based    Core i7, i5, i3, Celeron    17-45W      PGA 988 or BGA 1023   QM67 Express
Desktop-based   Core i7, i5, i3, Pentium    65-95W      LGA 1155              Q67 Express
Server-based    Xeon E3, Core i3            20**-95W    LGA 1155              C206 Express

** 20W with Xeon E3-1220L, and 45W with Xeon E3-1260L.

Per the table above, all desktop-based and server-based processors are socketed. For the first time, Sandy Bridge embedded processors are socket-compatible and pin-compatible between desktop-based and server-based platforms, and the Q67 and C206 PCHs (aka chipsets) are pin-compatible as well. Furthermore, the Xeon models work with the Q67 chipset, and Core i-series processors work with the C206 chipset. New low-voltage Xeon models have been introduced recently, such as the Xeon E3-1220L with a Thermal Design Power (TDP) rating of only 20 Watts. The benefit of all of this compatibility is unprecedented scalability within a single circuit board design.

For mobile-based and desktop-based platforms, alternate entry-level PCHs are available, but these are generally not featured on embedded boards. The C206 chipset has a TDP rating of 6.5W, the Q67 is rated at 6.1W, and the mobile QM67 at 3.9W.

How does Sandy Bridge Compare to Calpella?

Publicly available benchmarks from Passmark, summarized in Figure 1 below, show substantial performance gains from Calpella platforms to Sandy Bridge platforms. Benchmarks are itemized in Appendix A.

Figure 1 – Passmark benchmarks show improvements over the previous generations

Performance Enhancements

With its powerful built-in graphics engine, Sandy Bridge can be the perfect solution to meet high-end processing needs. Its new microarchitecture embodies a substantial performance benefit over previous generations and can provide a seamless upgrade for the installed base. The new second-generation i-series platforms include several key features that benefit imaging without increasing the power / thermal envelope.

  • Desktop chipset (Q67) and server chipset (C206) options on the same Single Board Computer (SBC) or System Host Board (SHB) design. Before Sandy Bridge, a manufacturer had to choose among the server-class route, the cost-saving desktop route or the low-power mobile route. Imaging generates an enormous amount of graphics data that needs to be retrieved and displayed quickly. The performance gains over the last three chipset generations, including the architectural improvement of integrating the graphics controller and memory controller into the processor chip, mean that the server chipset now offers a superset of the desktop chipset’s features, including graphics. With the C206 chipset, the PCIe x16 interface can be bifurcated into two PCIe x8 links. Furthermore, all PCIe interfaces for graphics and I/O are revision 2.0.

  • Expedient storage and retrieval. Fast storage is normally a facet of the server chipset, which also supports redundancy using RAID technology. The increased bandwidth and optional redundancy help data to be stored with accuracy and integrity, both essential attributes.

  • Error-correcting code (ECC) memory. Previously a feature only of server chipsets, ECC is now integrated into the processor and is available for server-, desktop- and even some mobile-class processor models. ECC is important for the high-accuracy and high-reliability environments found in imaging and security applications. The ECC support for data integrity combines with SATA 600 rapid storage and RAID redundancy to provide multi-pronged protection from accidental corruption caused by power spikes or simple bit errors.

  • Tight integration. Sandy Bridge unifies processor cores, memory controller, last-level cache (LLC) and graphics and media processing. This tight integration improves performance and efficiency in a variety of ways, all of which benefit medical imaging applications. Fast access by the cores and graphics to shared data in the LLC accelerates graphics processing. Fewer buses over which signaling and data must travel, such as the old Front Side Bus (FSB), result in faster processing. More memory bandwidth for the cores boosts overall system performance.

  • True 32nm performance. Sandy Bridge’s 32nm geometry has a smaller footprint with greater performance than Intel’s first-generation Nehalem microarchitecture (a 20 to 30 percent improvement), which allows for greater integration by combining the processor with the graphics controller and memory controller. As an example, the previous mobile-series Calpella platform used the Nehalem architecture with a dual-die Arrandale Core i7 processor family. That multi-chip module (MCM), consisting of a 32nm processor die and a 45nm Northbridge die, has now been replaced by a monolithic 32nm die, directly resulting in space reduction and performance gain. Refer to Figure 2 below.

Figure 2 – Sandy Bridge processors use a single die (right) rather than an MCM (left)

  • Graphics acceleration. A powerful graphics engine speeds image processing, with hardware-based media accelerators, graphics execution units, and 256-bit Advanced Vector Extensions (AVX) to improve 3D rendering. The performance surpasses entry-level graphics cards. The processor cores are freed up to accomplish other workloads.
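
To make the ECC point above concrete, the following is a minimal Python sketch of how an error-correcting code repairs a single flipped bit. It uses a toy Hamming(7,4) code chosen purely for illustration; ECC memory modules use a wider code of the same family (8 check bits protecting 64 data bits), and the function names here are hypothetical rather than part of any Intel or Portwell software.

```python
# Toy Hamming(7,4) code: 4 data bits protected by 3 parity bits.
# Illustration only; real ECC DIMMs use a wider code (64 data + 8 check bits).
# Codeword layout (1-based positions): p1 p2 d1 p4 d2 d3 d4

def encode(data):
    """Return the 7-bit codeword for a list of 4 data bits."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome = 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position

if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = encode(word)
    stored[4] ^= 1  # simulate a single-bit upset in memory
    recovered, position = decode(stored)
    print("recovered intact:", recovered == word, "| corrected position:", position)
```

The syndrome directly names the position of the faulty bit, which is why the correction can be done in hardware on every memory read without software involvement.
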

New and Upgraded Technologies

Sandy Bridge offers benefits to communications applications through new and upgraded features in the processors and chipsets. These technologies include new instruction sets, frequency scaling, memory access improvements, and security features under the vPro® umbrella (AMT, VT and TXT).

Feature                                               Benefit
Turbo Boost Technology 2.0                            Graphics or scalar workload acceleration
Hyper-Threading Technology                            Parallel execution to reduce elapsed time
Active Management Technology                          Allows remote management regardless of system power or OS state
Virtualization Technology for Processor (VT-x)        Protects separate OS and application domains with hardware-assisted acceleration
Virtualization Technology for Directed I/O (VT-d)     Allows I/O to be shared among OS/app domains
Trusted Execution Technology (TXT)                    Prevents unauthorized code execution in hardware
Advanced Encryption Standard (AES) New Instructions   Accelerates encryption algorithms
Intel 64                                              More data processed per instruction
Advanced Vector Extensions (AVX)                      Doubles vector register size to 256 bits for up to 2X floating-point operations per second, processing more data with fewer instructions
Enhanced Intel SpeedStep® Technology                  Scales voltage and frequency with load to reduce power and heat
Thermal Monitoring Technologies                       Prevents overheating
Execute Disable Bit                                   Prevents unauthorized code execution

A good example of these features is the updated Turbo Boost, which achieves additional performance gains without exceeding TDP. Turbo Boost Technology 2.0 dynamically controls the performance and power of the processor cores and graphics, shifting headroom to whichever needs it for the current software workload. Its algorithms boost performance where and when needed by constantly monitoring the on-die thermal sensors to determine the headroom available for raising frequency. This suits applications such as imaging and scalar processing (for example, encryption algorithms) that require a large amount of data crunching.

In other words, Turbo Boost allows the processor and graphics cores to run above their base frequency, within thermal limits, to process data sooner without risk of overheating. For example, the Core i7-2600 has a base frequency of 3.4GHz and a maximum turbo frequency of 3.8GHz (the frequency actually reached depends on how many cores are active), and its graphics clock can increase from a Graphics Base Frequency of 850MHz to a Graphics Max Dynamic Frequency of 1.35GHz. Because the added performance depends on thermal headroom, it’s critical to design the thermal solution properly.
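
The frequency figures above translate into headroom as follows. This is a small Python sketch under stated assumptions: `uplift_percent` is plain arithmetic on the quoted clock speeds, while `next_frequency` is a deliberately simplified, hypothetical stepping rule (one 100MHz step up when below a temperature limit, one step down otherwise). It is not Intel’s actual Turbo Boost algorithm, which also weighs power and current limits.

```python
def uplift_percent(base_mhz, turbo_mhz):
    """Percentage gain of the turbo frequency over the base frequency."""
    return (turbo_mhz - base_mhz) / base_mhz * 100.0

def next_frequency(current_mhz, temp_c, *, base_mhz, max_mhz, limit_c, step_mhz=100):
    """Toy governor (hypothetical): step up while cool, step down when hot."""
    if temp_c >= limit_c:
        return max(base_mhz, current_mhz - step_mhz)  # never drop below base
    return min(max_mhz, current_mhz + step_mhz)       # never exceed max turbo

if __name__ == "__main__":
    # Clock speeds quoted in the text for the Core i7-2600.
    print(f"CPU uplift:      {uplift_percent(3400, 3800):.1f}%")
    print(f"Graphics uplift: {uplift_percent(850, 1350):.1f}%")

    # Walk the toy governor through a made-up temperature trace (degrees C).
    freq = 3400
    for temp in (55, 60, 70, 85, 95, 92, 80):
        freq = next_frequency(freq, temp, base_mhz=3400, max_mhz=3800, limit_c=90)
        print(f"{temp:>3} C -> {freq} MHz")
```

The CPU headroom works out to roughly 12 percent and the graphics headroom to roughly 59 percent, which illustrates why graphics-heavy workloads stand to gain the most when the thermal design leaves room for it.
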

A Simple Upgrade

As the COM Express® form factor gains popularity due to the pluggable CPU core concept, the installed base of carrier boards is growing rapidly. The original Intel 915 chipset for which it was architected dates back to 2004 and the chipset is nearing the end of its production lifecycle. In addition, chipsets like the 945 and 965 are quite mature by now, and the benefits of the Sandy Bridge platform within the same power envelope make a drop-in upgrade appealing for the large installed base of Type 2 pinout carrier boards.

Furthermore, brand new carrier board designs are following the path of the newly introduced Type 6 pinout which replaces legacy IDE and PCI interfaces with digital display interfaces and USB 3.0, fully aligned with the QM67 chipset and roadmap. Portwell offers both pinout types – PCOM-B217VG (Type 2) and PCOM-B217VG-VI-ECC (Type 6) – to support upgrades and new carrier designs. See Figure 3.

Figure 3 – COM Express Type 2 PGA (left) and Type 6 BGA with ECC RAM (right)


Protecting Legacy PICMG-Based Investments

Many board manufacturers have abandoned the legacy PICMG 1.x form factors in favor of ATCA and MicroTCA for telecom applications. However, Portwell continues its long-lifecycle heritage by introducing server-class Sandy Bridge support with the C206 chipset in the PICMG 1.3 System Host Board (SHB) form factor. Portwell provides a longevity commitment to the legacy systems that have served medical and industrial OEMs well for 15+ years. In a time of tighter budgets, this reassures medical OEMs that they can keep their existing system designs while still adopting the latest platforms on the older form factors.

For example, the form factor of Portwell’s ROBO-8110VG2AR shown in Figure 4 is special. It’s short (126.39mm) and wide (338.5mm), so it can stand up in a horizontal backplane within Portwell’s standard 4U chassis or plug laterally into a vertical backplane to squeeze into Portwell’s 2U chassis.

Figure 4 – A high-performance PICMG SBC example based upon Sandy Bridge with C206

The ROBO-8110 SHB supports both the Core i7/i5/i3 and Xeon processors in an LGA 1155 package along with up to 16GB DDR3 1333 RAM in two DIMM sockets with ECC support. Two SATA 600 ports and two SATA 300 ports are supported with RAID levels 0, 1, 5 and 10. Due to this superset of benefits, the C206 server-class chipset was chosen over the Q67 desktop-class chipset. Its rich I/O includes high-performance interfaces (two Gigabit Ethernet and USB 2.0 ports) as well as the legacy interfaces needed in this market: serial ports, a parallel port and even a floppy disk drive (FDD) interface. In addition, iAMT 7.0 is supported with all processors except the Core i3 models. Finally, because old analog monitors and new digital monitors are both prevalent, multiple display types are supported, with a DVI-I connector (DVI-D digital plus analog VGA signals on some of the pins) and HDMI on a header.
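
The RAID levels listed above trade capacity for redundancy in different ways. The Python sketch below shows the usual capacity arithmetic for each level, assuming equal-size drives; the function name and the 500GB drive size are illustrative assumptions, not Portwell or Intel specifications.

```python
def usable_capacity_gb(level, drives, size_gb):
    """Usable capacity of an array of equal-size drives for RAID 0, 1, 5 or 10."""
    if level == 0:                      # striping, no redundancy
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * size_gb
    if level == 1:                      # mirroring: every drive holds the same data
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb
    if level == 5:                      # striping with one drive's worth of parity
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb
    if level == 10:                     # striped mirrors: half the drives hold copies
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives // 2 * size_gb
    raise ValueError(f"unsupported RAID level: {level}")

if __name__ == "__main__":
    # Four SATA ports populated with hypothetical 500GB drives.
    for level, drives in ((0, 4), (1, 2), (5, 4), (10, 4)):
        print(f"RAID {level:>2} with {drives} drives: "
              f"{usable_capacity_gb(level, drives, 500)} GB usable")
```

With four drives, RAID 5 keeps three drives’ worth of capacity while surviving one drive failure, whereas RAID 10 keeps two drives’ worth in exchange for simpler rebuilds.
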

The 2U chassis shown in Figure 5 below includes a low profile backplane with one PCIe x16 graphics slot to display high-resolution images, one PCIe x4 slot, and three PCI slots to support system OEMs’ legacy cards.


Figure 5 – A 2U Rackmount Chassis is accomplished with a vertical backplane (midplane)

Sandy Bridge for Network Appliances

For small to medium size businesses, Sandy Bridge can be implemented cleanly in a 1U rackmount network appliance with up to 16 Gigabit Ethernet ports. For example, the CAR-4010 from Portwell features a choice of all copper or copper+fiber LAN ports for applications such as internet security, e-mail server, access control, and threat management. Refer to Figure 6 below.


Figure 6 – The CAR-4010 architecture delivers essential features, expansion, and three exhaust fans

Close examination of the front and rear panels reveals a rich set of features as well as careful system layout. The front panel includes four copper GbE ports and four fiber GbE ports (which can be substituted for four more copper GbE ports); refer to the top half of Figure 7 below. Two of these ports include bypass segments with configurability to fail open or fail closed so that communications continue upon failure. Bypass segments are activated by hardware trigger, such as a power failure, or software watchdog timer (WDT) to protect against application or OS lockup (hangs).
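
The interaction between the watchdog timer and a bypass segment can be sketched as a small state machine. The Python model below is hypothetical: the class, method and mode names are invented for illustration and do not correspond to Portwell’s driver or firmware interface. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from enum import Enum

class BypassMode(Enum):
    """What a port pair does once bypass triggers (names are illustrative)."""
    PASS_THROUGH = "pass-through"  # ports are bridged; traffic skips the appliance
    DISCONNECT = "disconnect"      # link is dropped; no uninspected traffic passes

class WatchdogBypass:
    """Toy model of a bypass segment guarded by a watchdog timer."""

    def __init__(self, timeout_s, mode):
        self.timeout_s = timeout_s
        self.mode = mode
        self.last_kick = 0.0
        self.bypassed = False

    def kick(self, now):
        """Called periodically by a healthy application to reset the timer."""
        self.last_kick = now

    def poll(self, now, power_ok=True):
        """Return the segment state at time `now` (seconds)."""
        if not power_ok or now - self.last_kick > self.timeout_s:
            self.bypassed = True   # latch until explicitly re-armed
        return self.mode.value if self.bypassed else "inline"

    def rearm(self, now):
        """Return the segment to inline operation after recovery."""
        self.bypassed = False
        self.last_kick = now

if __name__ == "__main__":
    segment = WatchdogBypass(timeout_s=5.0, mode=BypassMode.PASS_THROUGH)
    segment.kick(0.0)
    print(segment.poll(3.0))   # application still alive
    print(segment.poll(6.0))   # no kick for more than 5 s: bypass engages
    segment.rearm(8.0)
    print(segment.poll(8.5))   # back inline after recovery
```

Latching the bypass until it is re-armed is a design choice in this sketch: it prevents a flapping application from repeatedly interrupting traffic while it restarts.
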

An IPMI port (remote system management, upgrade and security) and a console port are included as well. To the left, the system can be installed with zero, one or two additional 4-port Portwell fiber or copper Network Interface Cards (NICs), to bring the total number of Gigabit LAN ports to 16. An LCD display is provided along with buttons for easy operation without a console. The “EZIO” series of LCD modules comes with a variety of 16x2 character or 128x32 or x64 graphics displays and 4-7 buttons, all of which appear as a serial port to the system for easy communication.


Figure 7 – The CAR-4010 architecture delivers 4 copper and 4 optical fiber high-speed LAN ports, with expansion via Portwell NIC cards for up to 16 GbE ports total along with rear expansion I/O



A “Mother” of a Board

Thanks to the pin compatibility of Sandy Bridge’s processors and chipsets, Portwell offers a desktop-class and a server-class industrial ATX motherboard. RUBY-D712VG2AR features the Q67 chipset with dual core Core i5 and quad core Core i7 processors, while RUBY-D711VG2AR features the C206 chipset with dual core Core i3 and quad core Xeon E3-1200 family processors. Both models use the LGA 1155 socket for processor scalability.

Both products offer rich features, including four DIMM sockets (2x dual channel) for ECC-capable DDR3 1333 RAM (16GB total) and dual independent video displays which can be chosen among the three rear I/O connectors: analog VGA, digital DVI-D and HDMI. Expansion slots include one PCIe x16, one PCIe x8, one PCIe x4, and four PCI slots for high-end graphics and industrial I/O cards. Six SATA devices can be attached, two at SATA 600 speed. For high data integrity and storage redundancy, RAID levels 0, 1, 5 and 10 are supported, and for security, iAMT 7.0 is available. The I/O is completed by 6 serial ports, several with RS-232/422/485 selectability, 16 GPIO pins (8 in, 8 out), 4 USB ports, high definition audio (line in, out, mic), and an LPC bus header.
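
To put the slot and storage interfaces above in perspective, the following Python sketch estimates their raw payload bandwidth. It assumes the slots are electrically as wide as named and run at PCIe revision 2.0 (5 GT/s per lane), and that SATA 600 and SATA 300 signal at 6 and 3 Gbit/s; all of these links use 8b/10b encoding. The figures are per direction and ignore protocol overhead, so real throughput will be lower.

```python
def payload_mb_per_s(line_rate_gbit, lanes=1):
    """Usable MB/s per direction for an 8b/10b-encoded serial link."""
    data_bits_per_s = line_rate_gbit * 1e9 * 8 / 10  # 10 wire bits carry 8 data bits
    return data_bits_per_s / 8 / 1e6 * lanes         # bits -> bytes -> MB, times lanes

PCIE2_GT_PER_LANE = 5.0  # PCIe revision 2.0 signalling rate per lane

if __name__ == "__main__":
    for lanes in (16, 8, 4):
        mb = payload_mb_per_s(PCIE2_GT_PER_LANE, lanes)
        print(f"PCIe 2.0 x{lanes:<2}: {mb:.0f} MB/s")
    print(f"SATA 600    : {payload_mb_per_s(6.0):.0f} MB/s")
    print(f"SATA 300    : {payload_mb_per_s(3.0):.0f} MB/s")
```

The same arithmetic explains the naming of the SATA generations: the numbers 600 and 300 are the payload rates in MB/s that remain after encoding overhead.
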

Figure 8 – The ATX form factor motherboard features a trifecta of display interfaces, LANs and UARTs.

An Architectural Simplification

Mini-ITX has gained enormous popularity as a Small Form Factor (SFF) motherboard in commercial/consumer and embedded markets alike. Due to the horsepower of the Q67 and C206 platforms, it is now possible for the first time to eliminate co-processors such as DSPs, FPGAs and other offload engines for many applications. Accomplishing this in Mini-ITX will lead many system OEMs to rethink their custom hardware strategy completely and to focus on algorithm conversion and software core competencies instead. Portwell offers both WADE-8012 (Q67) and WADE-8011 (C206) models.

Figure 9 – The WADE-8011 is unusual in featuring server-class C206 chipset in the tiny Mini-ITX SFF

WADE-8011 is the world’s first server-class Mini-ITX board, pairing a Xeon E3-1225 processor in the LGA 1155 socket with the C206 server PCH. Two 240-pin DIMM sockets support dual channel DDR3 SDRAM up to 8GB, and dual independent display is available on VGA / DVI / HDMI ports. Intel’s AMT 7.0 is also included. Besides dual GigE LAN and four USB ports, two serial ports and audio connectors are included on the rear I/O block.

Lower Power Mini-ITX

For applications that prioritize low power ahead of raw computational performance, the WADE-8110 features the QM67 mobile chipset. The chipset’s TDP rating is 3.9W instead of 6.1W for Q67, and the PGA socket-type processors have TDPs of 35W and 45W for the Core i5-2510E and Core i7-2710QE respectively, well beneath the 65W and 95W desktop-based processor TDP ratings.


Figure 10 – The WADE-8110 features the traditional mobile-based platform for low-power computing

Taking Mini-ITX to the Next Level

Portwell’s WADE-8011, WADE-8012 and WADE-8110 SBCs bring Sandy Bridge performance to Portwell’s small form factor Mini-ITX enclosures, for complete system solutions with room for I/O expansion and rear panel customization. Refer to Figure 11 below.


Figure 11 – The WADE-2xxx series of enclosures supports the WADE-801x Mini-ITX SBCs

Code Optimizations

To get the most performance out of the Sandy Bridge products above, it is essential to use the best available compiler, performance primitives, math kernel libraries, DSP libraries, and profiler tools. Intel offers development tools and libraries tuned for its own processors. Tools like C++ Composer XE, Parallel Inspector, Trace Analyzer and Collector, and VTune™ Amplifier XE are available for Windows® and Linux® users. Integrated Performance Primitives (IPP), Math Kernel Library (MKL) and Threading Building Blocks (TBB) help to extract the most out of the platform. The performance improvements can be substantial and are well worth the time to download and test.

For more information and to download libraries and tools for free 30-day evaluations, please visit: 
http://software.intel.com/en-us/articles/intel-software-evaluation-center/
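
As a rough illustration of why vectorized compilers and libraries matter on this platform, the Python sketch below estimates theoretical peak floating-point throughput with and without 256-bit AVX. It assumes each core can issue one vector add and one vector multiply per clock cycle, and it uses the 3.4GHz base frequency of the quad-core Core i7-2600 quoted earlier; the function names are illustrative, and sustained performance in real code will be well below these peaks.

```python
def simd_lanes(register_bits, element_bits):
    """How many elements of a given width fit in one vector register."""
    return register_bits // element_bits

def peak_gflops(cores, ghz, lanes, vector_ops_per_cycle=2):
    """Theoretical peak: cores x clock x elements per op x vector ops per cycle.

    vector_ops_per_cycle=2 is an assumption (one add plus one multiply).
    """
    return cores * ghz * lanes * vector_ops_per_cycle

if __name__ == "__main__":
    for name, bits in (("128-bit SSE", 128), ("256-bit AVX", 256)):
        sp = simd_lanes(bits, 32)  # single-precision floats per register
        dp = simd_lanes(bits, 64)  # double-precision floats per register
        print(f"{name}: {sp} SP lanes ({peak_gflops(4, 3.4, sp):.1f} GFLOPS peak), "
              f"{dp} DP lanes ({peak_gflops(4, 3.4, dp):.1f} GFLOPS peak)")
```

Doubling the register width doubles the number of elements per instruction, which is where the "up to 2X" figure for AVX comes from; code only sees that gain if it is compiled or linked against libraries that actually emit AVX instructions.
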

 

Conclusion

Sandy Bridge offers unprecedented performance, features and scalability at the board-level or system-level, whether fully off-the-shelf or customized. The entire range of mobile-based, desktop-based and server-based processors and chipsets with a unified architecture and 7-year production lifecycle presents vast opportunities to embedded system OEMs. Most importantly, all Portwell products comply with ISO 13485 “medical ISO” in Taiwan and in the US office, which is more stringent than basic ISO 9001 quality system requirements. Even for OEMs outside of medical markets, this level of commitment to product breadth and quality makes Portwell an excellent choice for your next design project.

Appendix A

The tables below show the key specifications and product technologies for the three Sandy Bridge platforms. Each platform has its own code name, such as “Bromolow” for the server-based platform.

Server-Based: “Bromolow” Processors with C206 Chipset (E3-12x5 = quad core, all others dual core):

[Table image: Bromolow processor specifications]


Desktop-Based: "Sugar Bay" Processors with Q67 Chipset (24/2600 = quad core, all others dual core):

[Table image: Sugar Bay processor specifications]


Mobile-Based: "Huron River" Processors with QM67 Chipset (27xx = quad core, all others dual core):

[Table image: Huron River processor specifications]


Note: Most mobile processors don't have benchmarks at www.cpubenchmark.net, which mainly covers socketed processors, so the benchmark score for the next closest processor is substituted above as follows:

Core i7-2720QM in lieu of i7-2710QE
Core i7-2620M in lieu of i7-2655LE
Core i3-2310M in lieu of i3-2310E

All data contained herein is for informational purposes only and is not guaranteed for legal purposes. All information is subject to change without notice. All brand and product names may be trademarks or registered trademarks of their respective companies or mark holders.
