
Xilinx-SK Telecom Held Press Conference to Present FPGA-Based AI Accelerator Implementation Cases

Xilinx and SK Telecom held a press conference at the Grand InterContinental Seoul Parnas Hotel in Gangnam-gu, Seoul on November 1st, where SK Telecom announced the adoption of Xilinx Alveo data center accelerator cards to strengthen its real-time AI-based physical intrusion detection service. SK Telecom's AI inference accelerator (AIX), implemented on Xilinx Alveo cards, provides efficient and accurate physical intrusion detection using deep neural networks. ADT CAPS CO., LTD. licenses and commercially deploys the service.

SK Telecom's AI-based physical intrusion detection service, T view™, monitors hundreds of thousands of customers' commercial and home camera systems in real time and dispatches security guards when a physical intrusion occurs. Processing the large volume of data from thousands of cameras with deep neural networks requires a powerful AI accelerator with sufficient throughput and accuracy. T view uses SK Telecom's AI inference accelerator (AIX), implemented on Xilinx Alveo U250 cards. Running on servers in SK Telecom's data center, the Alveo U250 cards have demonstrated high throughput and accuracy in theft detection services.

Built on the Xilinx 16nm UltraScale+ architecture, Alveo accelerators are adaptable to changing algorithms and acceleration requirements, enabling domain-specific architectures that optimize performance for a range of workloads without hardware changes while reducing total cost of ownership. Xilinx Alveo accelerator cards are designed to meet the performance and flexibility needs of data center AI workloads, providing 10X higher performance for AI-based speech translation and over 3X higher throughput for video analytics pipelines compared to GPUs.
▲ Sam Rogan, Vice President of APAC Regional Sales at Xilinx ▲ Jin-hyo Park, Head of ICT Technology Center at SK Telecom

Sam Rogan, vice president of APAC regional sales at Xilinx, explained that the key to meeting the growing demand for computing is 'performance', and that there are many ways to define it; both the definition of performance and the way it is improved are changing. The industry started by competing on raw processor speed, then moved through parallel processing and combinations of multicore and heterogeneous cores, gaining considerable performance along the way. Now, however, those approaches are again hitting their limits, and Xilinx has proposed the 'domain-specific architecture' as the next-generation system configuration.

The domain-specific architecture is drawing particular attention in hot areas such as artificial intelligence and machine learning. Since 2012, roughly 20 to 30 neural network models have been invented, and compared with the earlier models, current models can exploit parallel or non-sequential processing, which greatly increases processing performance. These neural network models require an optimal architecture to boost performance, along with customized precision, datapaths, and memory hierarchies to extract the highest performance from that architecture. The hardware options for implementing such features are GPUs, ASICs, and FPGAs. Of these, the GPU is flexible but suffers from high power consumption and latency. ASICs/ASSPs, which fix the logic in hardware to overcome the GPU's drawbacks, take a long time from design to production, and by the time production starts, the targeted neural network model may already be outdated. FPGAs overcome the challenges of both GPUs and ASICs, providing an environment that can quickly keep pace with ever-changing neural network models.
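The "customized precision" mentioned above usually means running inference at reduced numeric precision, such as int8 instead of float32. The following is a minimal, illustrative sketch of symmetric int8 quantization in plain Python; real FPGA/NPU toolchains are far more sophisticated, and none of these function names come from Xilinx or SK Telecom tooling.

```python
# Minimal sketch of reduced-precision inference: map float32 weights to
# int8 codes with one scale factor, so hardware can use narrow datapaths.
def quantize_int8(weights):
    """Symmetric quantization: the largest-magnitude weight maps to 127."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights, to check the accuracy loss."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.08, 0.93, -0.55]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                      # integer codes stored on the accelerator
print(max_err <= scale / 2)   # rounding error is at most half a step
```

The point of the sketch is why this works well for inference: the error per weight is bounded by half the quantization step, which deep networks typically tolerate with little accuracy loss.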
Xilinx highlighted that it develops tools and libraries that users can easily access, and provides ways to use high-level languages so that existing model development flows can target FPGAs.

Jin-hyo Park, Head of ICT Technology Center at SK Telecom, explained that as a telecommunications company expanding beyond telecom into 'services', SK Telecom saw artificial intelligence as necessary in areas where it could differentiate itself from existing operators. SK Telecom operates an artificial intelligence service called 'NUGU', which requires both the development of AI algorithms and software and an appropriate infrastructure to run them well. Hence, SK Telecom has focused on FPGA-based accelerators in its research on infrastructure for AI services, the third such application area for the company. He added that AI and accelerators will be applied to MEC in the future, and that cooperation with Xilinx is important for creating new converged services.

▲ Kang-Won Lee, Head of Cloud Labs at SK Telecom ▲ AIX, an accelerator for high-performance AI inference environments

Kang-Won Lee, Head of Cloud Labs at SK Telecom, introduced 'AIX' as the project name for SK Telecom's inference accelerator for AI services. SK Telecom is currently the largest mobile network operator in Korea and conducts business in areas as diverse as media, e-commerce, security, and semiconductors. In all of those areas, AI is used for various purposes: AI is not a technology for one specific domain but one that can provide value to customers and society everywhere. Within these applications, there are 'core' AI technologies such as image analysis, data analysis, and natural language processing.
SK Telecom believes these technologies will be a core competency of its business and has been developing accelerators accordingly. In general, the lifecycle of an AI service consists of 'training' and 'inference'. Training takes place in a specific part of the data center during the development phase; it is not latency sensitive and is primarily deployed in GPU farm environments. Inference is what the deployed AI services actually perform, and as more products and services incorporate AI, the market opportunity around inference is expected to grow. He noted that the service infrastructure for AI inference targets large user bases, and since cost and power consumption matter as much as performance, an accelerator that delivers both high performance and high efficiency will be effective.

For AIX, SK Telecom designed a Neural Processing Unit (NPU) for AI inference and implemented it on Xilinx FPGAs for use in actual services. The AIX unit implemented in the FPGA can deliver high performance together with high-performance memory such as HBM, and the FPGA is deployed in commercial servers as a PCIe card. Software stacks such as compilers, libraries, and a runtime environment for AIX are also provided. On the software side, AIX supports widely used frameworks, with performance-optimization modules underneath, and offers a runtime environment that improves user convenience.

▲ An example of speeding up the intrusion detection process with AIX, based on Alveo U250, was introduced.

SK Telecom stressed that its AI services were not created at the laboratory level but have been repeatedly used and improved in actual services. AIX has been used for voice recognition in 'NUGU' speakers, achieving higher performance and cost-effectiveness than the previous GPU-based systems.
SK Telecom also provides real-time speech-to-text (STT) through the AIX accelerator for its call center STT service, Vanessa Speech Note. In the case of 'T view', AIX has expanded into video analysis. Where intrusion detection has so far suffered from many false positives and inefficiencies, AI-based detection provides more accurate results at higher cost efficiency. Two hardware configurations are used for T view: the Alveo U250 and a custom card with two Virtex-based FPGAs. AIX on FPGAs is expected to be more than twice as cost-efficient as GPUs, handling heavy processing loads while reducing latency; even with some latency constraints removed, AIX still outperforms the GPU. By applying AIX to intrusion detection, the inefficiency of on-site dispatches caused by false positives was greatly reduced, and accurate, fast detection made it possible for the AI itself to issue dispatch requests.

Meanwhile, with the rise of 5G networks, the 'edge cloud' can place infrastructure and services closer to users, promising lower latency. By leveraging the edge cloud on 5G networks, tasks that devices previously had to handle locally because of latency can move to the network and the edge cloud. This is expected to reduce dependence on device performance and spread the use of 'smart' devices. In particular, AI will also be handled at the network level, offering a wider variety of AI services to more users and adding new value for customers.

▲ Adam Scraba, Director of Product Marketing, Data Center and AI at Xilinx

Adam Scraba, director of product marketing, data center and AI at Xilinx, introduced the 'Adaptive Platform' and the Vitis Unified Software Platform.
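Serving thousands of camera feeds through one accelerator, as T view does, typically means grouping pending frames into batches to keep throughput high without letting any one camera's latency grow. The sketch below is purely hypothetical (it is not SK Telecom's actual AIX pipeline; the batch size and naming are illustrative) but shows the basic round-robin batching idea.

```python
from collections import deque

# Hypothetical sketch: round-robin batching of frames from many camera
# streams into fixed-size batches for an inference accelerator.
# BATCH_SIZE and the queue layout are illustrative assumptions.
BATCH_SIZE = 8

def batch_frames(camera_queues):
    """Take at most one pending frame per camera, in order, until a
    full batch is assembled or no frames remain. Round-robin keeps any
    single camera from monopolizing the accelerator."""
    batch = []
    for q in camera_queues:
        if q and len(batch) < BATCH_SIZE:
            batch.append(q.popleft())
    return batch

# Ten cameras, each with three frames waiting.
queues = [deque(f"cam{i}-frame{j}" for j in range(3)) for i in range(10)]
first = batch_frames(queues)
print(len(first), first[0])
```

In a real deployment the batch would be handed to the accelerator's runtime, and cameras skipped in one round would be served first in the next; the fairness/latency trade-off is the reason batching is done per-camera rather than draining one queue at a time.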
He stated that demand for 'real-time' AI services is expected to grow in voice-activated AI assistants, home security, and commercial environments, but that achieving this is not easy. Processor-centric computing architectures will have to be transformed into workload-specific forms, and computing will have to evolve toward tighter integration with storage and networks. To meet the high throughput, low latency, power efficiency, and fast algorithm adoption that future services will demand, next-generation computing environments must become 'adaptive platforms' that can be defined in both software and hardware.

Xilinx stressed that the Alveo data center accelerator card enables companies such as SK Telecom to respond to these problems, combining the latest architecture, performance, and connectivity, and that it can be used in areas ranging up to cloud services. Above all, Xilinx highlighted that it provides 'adaptive' capability in both hardware and software terms. The company provides solution stacks for its Alveo accelerator cards and platforms, and Alveo can be deployed both on premises and in the cloud, reaching the market through channel partners, cloud providers, and solution providers. In AI inference, Alveo is an effective means of achieving performance goals across diverse algorithms and workflows. Even in HPC, Alveo can offer severalfold to tens-of-times performance improvements over traditional CPUs or GPUs, for example in financial risk modeling.
For leveraging 'adaptive hardware' in these new workloads, the Xilinx Vitis unified software platform lets developers work at a higher level of abstraction, build applications on industry-standard frameworks, and use Xilinx hardware in an optimal form, speeding up development. Furthermore, Vitis AI integrates a 'domain-specific architecture' development environment so that customers' AI models can run optimally on Xilinx hardware through industry-standard frameworks and pre-trained AI models.

Xilinx Developer Forum 2019 – Keynote: Victor Peng

Xilinx held the Xilinx Developer Forum (XDF) 2019 at the Fairmont San Jose Hotel in California, USA, on October 1 and 2. At XDF 2019, Xilinx announced a variety of innovations and directions for its future strategy, along with the new Vitis software platform. XDF is where Xilinx experts, partners, and industry leaders gather to inspire new innovation. Opening with a keynote from Xilinx CEO Victor Peng, XDF 2019 drew 1,300 customers, partners, developers, and Xilinx employees. The event included keynote presentations, more than 75 sessions on a variety of topics, over 40 hours of hands-on developer labs, and a range of use-case exhibits from partners; talks by key industry figures offered new perspectives and insights.

At XDF 2019, Xilinx unveiled Vitis, a unified software platform that offers a new design experience. Vitis makes it easy for software engineers, AI scientists, and developers in new fields to take advantage of hardware adaptability. Keynotes and exhibitions also introduced innovations using Xilinx FPGAs and ACAPs in areas such as 5G, cloud-to-edge data centers, AI, and autonomous vehicles, and the various sessions provided a place to share and discuss new information and implementation examples.

▲ Xilinx CEO Victor Peng ▲ Three areas were named as the major growth-engine markets.

Victor Peng, CEO of Xilinx, stated that Xilinx technologies and products have served as core components of a variety of innovative applications to date, and presented a vision of an adaptable, intelligent, and connected future. As Xilinx's strategy, he pointed to three pillars: data center first, core market acceleration, and adaptive computing initiatives.
These will be implemented through a combination of hardware, software developers, and platforms. He also named three current major growth engines: the data center, 5G, and automobiles.

In 5G infrastructure, Xilinx's UltraScale+ platform is used effectively in 5G infrastructure equipment, and Wonil Roh, Vice President of Samsung Electronics, introduced Samsung's implementation of 5G infrastructure equipment in cooperation with Xilinx during the XDF 2019 keynote. He noted that 5G is spreading faster than any previous generation of communication technology: Korea and the United States are leading, while Japan and India are preparing full-scale commercial 5G services. On the journey to commercial 5G in the United States, Samsung Electronics is collaborating with major telecom operators on the supply of 5G equipment, and in Korea it holds a leading position in supplying 5G infrastructure such as RAN and core equipment, supporting the rapidly growing Korean 5G market. One reason for the positive market evaluation of its high-performance 5G equipment is that Samsung could quickly deliver solutions that meet customers' needs. He explained that Xilinx product characteristics such as low power consumption and heat generation, larger memory and higher performance, and flexibility enabled equipment that is smaller, more energy-efficient, and easier to install. Wonil Roh also revealed that Samsung Electronics is already planning a product that uses the Xilinx Versal ACAP.

▲ Xilinx's hardware platform has now reached the ACAP concept. ▲ 'SageMaker Neo' enables optimized machine learning models to move quickly from cloud to edge.
Xilinx's hardware platforms now range from FPGAs and SoCs to the Adaptive Compute Acceleration Platform (ACAP) and the Alveo platform for data centers. In particular, Alveo, optimized for adaptive accelerators in the data center, can be configured with HBM to provide a high-performance environment for a variety of data center applications. Xilinx also presented a 7nm Versal ACAP chip and development board. For the data center ecosystem, one of its major future growth engines, Xilinx highlighted that the Alveo partner and ISV ecosystem spans more than 5,800 companies and academic institutions, 725 accelerator programs, 85 apps, and more across various areas.

For using FPGAs as 'adaptive accelerators' in the data center, Amazon offers an 'FPGA as a service' model on AWS through the EC2 F1 service. EC2 F1 is now deployed and operating globally and continues to expand; at the event, Amazon announced that it is extending the F1 service, based on Virtex UltraScale+ FPGAs, to the Canada Region. Amazon stressed that developers using FPGAs can scale their applications through the cloud and rapidly deploy to more regions. Many customers around the world use F1 to handle compute-intensive applications efficiently. AstraZeneca combined the EC2 F1, Batch, and S3 services to build a genomic sequence data processing pipeline on AWS; the pipeline can analyze more than 100,000 genomes in 100 hours, with room for further improvement. In security, Trend Micro deploys high-performance network security virtual appliances on FPGA instances and offers them on AWS Marketplace, gaining performance, efficiency, and operational simplicity.
Amazon also introduced the 'SageMaker Neo' service, which provides an environment for building and deploying machine learning models with optimized performance and efficiency. The service lets users deploy optimized models across a variety of environments, including edge environments and environments using Xilinx solutions. With SageMaker Neo, machine learning models built and run on Xilinx Alveo cards on premises can be migrated seamlessly to F1 in the cloud and to edge devices based on Xilinx technology.

▲ As a future data center model, a 'distributed adaptive computing environment' was proposed. ▲ Microsoft also introduced an Alveo-based FPGA-accelerated VM service on Azure.

Xilinx presented its FPGAs and ACAPs as 'accelerators' in computing environments, delivering efficiency gains of tens of times in video transformation, data analysis, and real-time machine learning inference. In storage, FPGAs used as accelerators for compression, encryption, and deduplication can greatly increase performance in processing rapidly growing data volumes. And a SmartNIC that incorporates an FPGA in the network can greatly improve packet-processing throughput and provide a more secure connection environment through better encryption performance.

As a future data center model, Xilinx proposed 'Distributed Adaptive Computing', in which FPGAs and ACAPs are actively combined with existing cloud resource pools. In this overall cloud environment, expanding to hybrid and edge, processors, SmartNICs, accelerator-equipped storage, and ACAP-based compute accelerators are all connected as resource pools and optimized per workload, allowing flexible deployment as needed. In such an environment, proper resource allocation enables high performance, low latency, and optimized cost.
At the same time, reconfigurable silicon devices such as FPGAs and ACAPs can respond to rapid innovation cycles while minimizing the introduction of new silicon. Microsoft stated that the Azure cloud offers a variety of specialized service models, including high-performance VMs, accelerator-equipped VMs, HPC services, and high-performance-storage or large-memory VMs. Among these, the Azure NP VM series provides Alveo U250 FPGA accelerators, scaling from 10 to 40 CPU cores, 168GB to 672GB of memory, and one to four Alveo U250 accelerators depending on the required performance level. The service will soon be available in the US East, US West, Southeast Asia, and Western Europe regions, making software products currently available for the Alveo environment available in Azure as well.

▲ The Vitis unified software platform enables FPGAs to be handled with software skills. ▲ Hitachi announced that it achieved results in just two months by using Vitis to implement an entire ADAS system.

Xilinx has continued to evolve its tooling in the software direction since the Vivado FPGA design tool. With the Vitis unified software platform, Xilinx plans to clean up the fragmented environments that had accumulated and organize its development environment around two major axes, Vivado and Vitis. Xilinx stressed that in response to heterogeneous computing, edge-to-cloud environments, and the use of AI, the Vitis platform lets developers exploit Xilinx's hardware architecture without hardware expertise. The Vitis platform includes a rich set of optimized open source libraries and implementations of domain-specific architectures, providing optimized hardware environments from AI and video to big data analytics.
Xilinx announced that it plans to offer the Vitis platform free of charge for Xilinx boards, based on an open source model, and emphasized that the journey to open source is now one of its key strategies. Xilinx has recently been contributing to the OpenAMP project, and in the future it will also contribute Xilinx technology to the TensorFlow and Kubernetes projects.

In the automotive sector, Hitachi Automotive Systems supplies ADAS systems to a wide range of automotive manufacturers and uses Xilinx FPGAs and SoCs to optimize the functionality and performance required for level 2 and higher ADAS systems. In implementing such a modern ADAS system, Hitachi reported that it created an end-to-end automotive design including a deep-learning-based object recognition system in just two months using Versal boards and the Vitis platform.

▲ Vitis AI integrates a 'domain-specific architecture' to give AI scientists even greater capability. ▲ Using FPGAs in autonomous driving platforms makes it possible to meet a variety of demanding conditions.

Incorporating a domain-specific architecture (DSA), Vitis AI allows developers to optimize and program Xilinx hardware using industry-leading frameworks such as TensorFlow, Caffe, and PyTorch. Vitis AI provides tools to optimize, compress, and compile trained AI models and run them on Xilinx devices in about one minute, and it supports specialized APIs for building edge-to-cloud applications with best-in-class inference performance and efficiency. Xilinx emphasized that this makes it a useful tool for startups that want to take advantage of AI, since it enables optimized hardware use without hardware expertise. An example was also presented of using Xilinx FPGAs and the Versal ACAP to implement the level 4 autonomous driving system 'PonyPilot'.
An autonomous driving system must keep driving through complicated traffic on real roads, evaluate data from a variety of sensors, and make near-real-time judgments in complex situations. In meeting the difficult constraints of low latency and limited power consumption, the use of FPGAs reduced latency by a factor of 12 compared with the previous system and realized the system within about 30W of power consumption.

Kevin Brown - From Buzzwords to Reality: Managing Edge Data Centers

Schneider Electric introduced its strategies and solutions for edge infrastructure at the 'Edge Press Conference 2019' on September 19 at the Marina Bay Sands Hotel in Singapore. Under the theme 'Life at the Edge', Schneider Electric presented its 'EcoStruxure™' platform and various solutions for building and operating efficient edge infrastructure. With the proliferation of IoT, 'edge' infrastructure is becoming increasingly important within overall IT infrastructure; by 2025, 75% of data generated by companies around the world is expected to be created and processed at the edge. Against this backdrop, the event suggested new trends in digital transformation and business profitability improvement using cloud-based software, AI, and micro data center infrastructure, and introduced Schneider Electric's strategies, solutions, and collaborations for realizing these opportunities.

Kevin Brown, Senior Vice President of Innovation and CTO of the Secure Power Division, presented practical considerations for implementing an edge data center. Although infrastructure at the edge is more important than generally assumed, it is hard to put resources in place there to increase availability, and availability must be approached differently than in traditional data centers. Schneider Electric pointed to cloudification of the management system, and to AI technology that leverages vast amounts of data, as ways to effectively maximize edge resiliency.

▲ Kevin Brown, Senior Vice President of Innovation and CTO of the Secure Power Division ▲ Considering the availability of the entire infrastructure, the availability of local edges must also be higher than commonly expected.
Kevin Brown explained that while there are many views on edge infrastructure and hybrid edge architectures, they fall broadly into three categories: the central large data center; the 'regional edge' by region; and the 'local edge', which creates and consumes data closest to the user. Within this architecture, the model varies depending on how much computing power the local edge has and which devices it connects to and operates with, but the local edge is where devices connect first.

The first thing to consider for local edge availability and resiliency is that people's 'expectations' of IT have changed: a service outage may barely affect one user yet be a disaster for another. Traditional data center availability is tiered, ranging from Tier 1 at 99.67% availability to Tier 4 at 99.995%. The percentages differ only in decimal places, but in fault-tolerance terms, Tier 1 permits up to 28.8 hours of service stoppage per year, while Tier 4 allows only about 26 minutes. Typically, a data center targets Tier 3 with 99.98% availability, or about 1.6 hours of annual downtime. Efforts continue to push availability up, but he cautioned that focusing only on downtime from an IT perspective can distort the picture regardless of user experience: combining a Tier 3 central data center with a Tier 1 local edge drops the overall availability to 99.65%, or 30.7 hours of downtime per year. Kevin stressed that the edge, too, must be built as a mission-critical data center, beyond common expectations.
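The Tier arithmetic above can be verified directly: annual downtime is the unavailable fraction of the year, and a service that depends on both a central data center and a local edge is only up when both are up, so (assuming independent failures) the availabilities multiply. This sketch uses a flat 8,760-hour year, so it yields 28.9 hours for Tier 1 rather than the commonly quoted 28.8; the difference is only the year-length convention.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours (ignoring leap years)

def downtime_hours(availability_pct):
    """Annual downtime implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

# Tier 1 on its own: 99.67% availability.
print(round(downtime_hours(99.67), 1))        # roughly 28.9 h/year

# Tier 3 central data center (99.98%) chained with a Tier 1 local edge
# (99.67%): the service is up only when BOTH are up, so multiply.
combined = 99.98 / 100 * 99.67 / 100 * 100
print(round(combined, 2))                     # combined availability, %
print(round(downtime_hours(combined), 1))     # combined downtime, h/year
```

The multiplication is the key step: two individually respectable availabilities chain into roughly 99.65%, about 30.7 hours of downtime per year, which is why the edge site cannot be treated as a second-class facility.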
Implementing an edge site to this mission-critical standard requires things such as facility security, redundant configurations, independent cooling, monitoring, management, and local support staff. To meet these requirements, an integrated ecosystem, appropriate management tools, and the use of analytics and AI were cited as ways to improve edge resilience. These enable monitoring of the physical environment, access control to equipment, remote configuration, and proactive response before problems occur; AI can also reduce the burden on management resources.

▲ In the edge environment, the 'integrated ecosystem' built by multiple players becomes especially important. ▲ A cloud-based implementation is effective for managing distributed edge environments.

The core qualities of an 'integrated ecosystem' built through active cooperation are standardization, robustness, and simplicity, which can overcome the defining characteristics of the edge environment: many distributed installations and a lack of on-site management staff. At the edge, a failure in the user's environment halts all the surrounding consumption and distribution activity, so management and security must be considered from construction onward. The edge environment should be monitorable and manageable from anywhere, and since systems are built and distributed through partners, staff training must be considered so that diverse user-specific deployments and failure situations can be handled quickly. Staff training is a very important part of system construction, and here too the importance of the integrated ecosystem is emphasized.
For example, if a Schneider Electric customer operating as a 'managed service provider' serves locations it cannot reach by dispatching its own staff, it may need to entrust the work to a local service provider; depending on the customer's application, deploying computing power at the local edge may also be possible. Schneider Electric added that it works closely with a variety of partners, including HPE, Dell EMC, and Cisco, to address these diverse needs.

As for management tools for edge infrastructure deployed in physically remote locations, traditional tools were not suited to the edge in terms of access control, alarm handling, and management cycles. Schneider Electric therefore proposed a cloud-based edge management environment. A cloud-based management environment lets you access and monitor remote devices anytime, anywhere, without worrying about device scalability, and Kevin explained that it also offers advantages in flexible payment models, software updates, and the use of new technologies such as AI.

▲ The use of cloud and AI can provide more accurate insights.

In managing edge infrastructure, AI technology is expected to save human effort and enable efficient data-driven management, but some preparation is required to use it properly. The primary requirements are a secure, scalable, and robust cloud architecture, together with a 'data lake' holding large amounts of normalized data. Kevin Brown added that experts with a deep understanding of the system's behavior and access to machine-learning expertise are also needed. In deriving insights from data, it is important to clarify which problems you are analyzing with which data, and to collect and refine the data accordingly.
Simply pouring data into an AI system does not yield results; most of data analysis is about obtaining and refining data that can be analyzed. Of course, it is not easy for customers to identify and handle these points precisely, and the problems customers face are becoming more complex, so Kevin noted that the difficulty keeps increasing. Schneider Electric introduced the 'UPS Score' as a way to simplify the data analysis and insight aspects of UPS management. The UPS Score algorithmically analyzes information from the hundreds of UPSs installed across various infrastructures, giving a clear picture of each UPS's current condition and prompting a series of steps before a failure occurs. The scoring criteria include the device's service life, battery life, temperature, phase balance, alarm data, and sensor data. This provides a more intuitive view of the current state and allows proactive response before serious problems arise.
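A health score like this typically condenses several telemetry factors into one number. Schneider Electric has not published the UPS Score algorithm, so the sketch below is entirely hypothetical: the factors mirror those named in the article (battery life, temperature, phase balance, alarms), but the weights and thresholds are invented for illustration.

```python
# Hypothetical sketch of a UPS health score. The real 'UPS Score'
# algorithm and weights are not public; everything here is illustrative.
def ups_score(battery_life_pct, temp_c, phase_imbalance_pct, alarms_30d):
    """Start at 100 and subtract penalties per risk factor."""
    score = 100.0
    score -= (100 - battery_life_pct) * 0.5   # aging battery
    score -= max(0, temp_c - 30) * 2.0        # running hotter than 30 °C
    score -= phase_imbalance_pct * 1.5        # unbalanced load phases
    score -= alarms_30d * 5.0                 # alarms in the last 30 days
    return max(0.0, min(100.0, score))

healthy = ups_score(battery_life_pct=95, temp_c=25,
                    phase_imbalance_pct=1, alarms_30d=0)
at_risk = ups_score(battery_life_pct=40, temp_c=38,
                    phase_imbalance_pct=6, alarms_30d=3)
print(healthy, at_risk)
```

The value of such a score is exactly what the article describes: a fleet of hundreds of UPSs can be ranked by a single number, so the units drifting toward failure surface before an outage rather than after.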