SK Telecom's AI-based physical intrusion detection service, T view™, monitors the commercial and home camera systems of hundreds of thousands of customers in real time and dispatches security guards when a physical intrusion is detected. Processing a large volume of data from thousands of cameras with deep neural networks requires a powerful AI accelerator that can deliver sufficient throughput and accuracy. T view uses SK Telecom's AI inference accelerator (AIX), implemented on Xilinx Alveo U250 cards. Running on servers in SK Telecom's data center, Alveo U250 cards have demonstrated high throughput and accuracy in theft detection services.
Built on the Xilinx 16nm UltraScale+ architecture, Alveo accelerators are adaptable to changing algorithms and acceleration requirements, enabling domain-specific architectures that optimize performance for a range of workloads without changing hardware while reducing overall cost of ownership. Xilinx Alveo accelerator cards are designed to meet the performance and flexibility needs of data center AI workloads, providing 10X higher performance for AI-based speech translation and over 3X higher throughput for video analytics pipelines compared to GPUs.
|▲ Sam Rogan, Vice President of APAC Regional Sales at Xilinx|
|▲ Jin-hyo Park, Head of ICT Technology Center at SK Telecom|
Sam Rogan, vice president of APAC regional sales at Xilinx, said that the key to meeting the growing demand for computing is 'performance', which can be defined in many ways, and that both its definition and the ways to improve it are changing. The industry began by competing on raw processor speed, then moved to parallel processing, multicore designs, and combinations of heterogeneous cores, gaining considerable performance along the way. Now, however, performance improvements are again running into limits. In response, Xilinx has proposed a 'domain-specific architecture' as a next-generation system configuration.
This 'domain-specific architecture' is drawing particular attention in hot areas such as artificial intelligence and machine learning. Since 2012, about 20 to 30 neural network models have been invented, and compared to the earlier models, current models can use parallel or non-sequential processing, which greatly increases processing performance. Each of these neural network models also needs an architecture suited to it, with customized precision, datapaths, and memory hierarchies, to get the highest performance.
The hardware options for implementing these features are GPUs, ASICs, and FPGAs. Of these, the GPU is flexible but suffers from high power consumption and latency. ASICs/ASSPs, which fix the logic in hardware to overcome the drawbacks of the GPU, take a long time to go from design to production, and sometimes the neural network model is already outdated by the time production starts. FPGAs, however, overcome the challenges of both GPUs and ASICs and provide an environment that can quickly keep up with ever-changing neural network models. Xilinx highlighted that it develops tools and libraries that users can easily access, and that it lets developers target FPGAs with the same high-level languages they already use for model development.
Jin-hyo Park, Head of ICT Technology Center at SK Telecom, explained that as SK Telecom expands beyond telecommunications into services, it sees 'artificial intelligence' as necessary in the areas where it can differentiate itself from other operators. He noted that SK Telecom has an artificial intelligence service called 'NUGU', which requires both the development of AI algorithms and software and an appropriate infrastructure to run them well. Hence, in researching infrastructure for artificial intelligence services, SK Telecom has focused on FPGA-based accelerators, and this is SK Telecom's third such application. In addition, he stated that AI and accelerators will be used for MEC (multi-access edge computing) in the future, and that cooperation with Xilinx is important to create new converged services.
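To make "customized precision" concrete, here is a minimal sketch of symmetric 8-bit quantization, one common way inference hardware trades numeric precision for smaller and cheaper arithmetic. It is a generic illustration, not the scheme used by Xilinx or SK Telecom; the function names and the weight values are hypothetical.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with a single shared scale (symmetric quantization)."""
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude to 127; guard against an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -0.41, 0.05, -1.27, 0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in 8 bits instead of 32, and the rounding error per value is bounded by half the scale. Hardware whose datapath width can be chosen per model can exploit this kind of trade-off directly.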
|▲ Kang-Won Lee, Head of Cloud Labs at SK Telecom|
|▲ AIX, an accelerator for high-performance AI inference environments|
Kang-Won Lee, Head of Cloud Labs at SK Telecom, explained that 'AIX' is SK Telecom's project name for its inference accelerator for AI services. He noted that SK Telecom is currently the largest mobile network operator in Korea and does business in various areas such as media, e-commerce, security, and semiconductors. AI is used for various purposes in all of those areas; it is not a technology for one specific area but one that can provide value to customers and society in every area. Moreover, some AI technologies, such as image analysis, data analysis, and natural language processing, serve as the 'core' of these applications. SK Telecom believes that these technologies will be core competencies of its business, and has been developing accelerators for them.
In general, the lifecycle of an AI service consists of 'training' and 'inference'. Training takes place in a dedicated part of the data center during the development phase; it is not latency sensitive and is primarily deployed in GPU farm environments. Inference is what the deployed AI service actually performs for users, and as more functions come to rely on AI, market opportunities related to inference are expected to grow. Furthermore, he stated that the service infrastructure for AI inference must serve large user bases, and since cost and power consumption matter as much as performance, an accelerator that delivers both high performance and high efficiency will be effective.
For AIX, SK Telecom designed a Neural Processing Unit (NPU) for AI inference and deploys it on Xilinx FPGAs for use in actual services. The AIX unit implemented in the FPGA can deliver high performance when paired with high-performance memory such as HBM, and the FPGA is installed in commercial servers in the form of a PCIe card. A software stack for AIX, including compilers, libraries, and a runtime environment, is also provided. On the software side, AIX supports widely used frameworks, with performance optimization modules underneath them, and its runtime environment is designed for user convenience.
|▲ An example of speeding up the intrusion detection process with AIX, based on Alveo U250, was introduced.|
SK Telecom stressed that its AI services were not created only at the laboratory level but have been repeatedly used and improved in actual services. AIX has been used for voice recognition in 'NUGU' speakers, achieving higher performance and cost-effectiveness than the existing GPU-based systems. SK Telecom also said that it uses the AIX accelerator to provide real-time speech-to-text (STT) for the call center STT service, Vanessa Speech Note. With 'T view', AIX has expanded into video analysis. While intrusion detection has so far suffered from many false positives and inefficiencies, AI-based intrusion detection technology will provide more accurate detection and higher cost efficiency.
Two hardware platforms are used for 'T view': the Alveo U250 and a custom card with two Virtex-based FPGAs. FPGA-based AIX is expected to be more than twice as cost-efficient as GPUs, and it can handle a large processing load while reducing latency. Even when some of the latency constraints are relaxed, AIX delivers more performance than GPUs. In addition, using AIX for intrusion detection greatly reduced the inefficiency of on-site dispatches caused by false positives, and accurate, quick intrusion detection made it possible for the AI itself to send out dispatch requests.
Meanwhile, with the rise of 5G networks, the 'edge cloud' can place infrastructure and services closer to users and thereby lower latency. By utilizing the edge cloud in 5G networks, the network and the edge cloud can take over tasks that devices have had to handle themselves because of latency. Consequently, this is expected to reduce the dependence on device performance and spread the use of 'smart' devices. In particular, AI will also be handled at the network level, offering a wider variety of AI services to more users and adding new value for customers.
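Why latency constraints matter when comparing accelerators can be illustrated with a simple batching model. The sketch below assumes that each inference call has a fixed overhead plus a per-frame cost, so larger batches raise throughput but also raise latency. This is a generic model, not SK Telecom's measurement method, and all function names and numbers are hypothetical.

```python
def batch_latency_ms(batch_size, overhead_ms, per_item_ms):
    """Time to process one batch: a fixed overhead plus a per-frame cost (assumed linear model)."""
    return overhead_ms + per_item_ms * batch_size


def throughput_fps(batch_size, overhead_ms, per_item_ms):
    """Frames per second when batches of this size run back to back."""
    return batch_size * 1000 / batch_latency_ms(batch_size, overhead_ms, per_item_ms)


def best_batch_under_budget(budget_ms, overhead_ms, per_item_ms, max_batch=64):
    """Largest batch size whose latency fits the budget, or None if even one frame does not fit."""
    feasible = [
        b for b in range(1, max_batch + 1)
        if batch_latency_ms(b, overhead_ms, per_item_ms) <= budget_ms
    ]
    return max(feasible) if feasible else None


# Hypothetical device: 8 ms fixed overhead per call, 1 ms per frame.
for budget in (16, 40):
    batch = best_batch_under_budget(budget, overhead_ms=8, per_item_ms=1)
    print(budget, batch, throughput_fps(batch, 8, 1))
```

Under this model, a tight latency budget forces small batches and lower throughput, while a relaxed budget allows larger batches. A device with low fixed overhead loses little at small batch sizes, which is one reason comparisons are usually reported both with and without a latency limit.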
|▲ Adam Scraba, Director of Product Marketing, Data Center and AI at Xilinx|
Adam Scraba, director of product marketing, data center and AI at Xilinx, introduced the 'Adaptive Platform' and the Vitis Unified Software Platform. He stated that demand for 'real-time' AI services is expected to increase in voice-activated AI assistants, home security, and commercial environments, but that meeting it is not easy. To do so, processor-centric computing architectures will have to become workload-specific, and computing will have to evolve toward closer integration with storage and networks.
To deliver the high throughput, low latency, power efficiency, and fast adoption of modern algorithms that future services will demand, next-generation computing environments also need to become 'adaptive platforms' that can be defined in both software and hardware. In addition, Xilinx stressed that the Alveo data center accelerator card enables companies such as SK Telecom to respond appropriately to these problems, combines the latest architecture, performance, and connectivity, and can be used in various areas such as cloud services. Above all, Xilinx highlighted that it provides 'adaptive' features in both hardware and software.
Xilinx said that it provides solution stacks for its Alveo accelerator cards and platforms, and that Alveo can be used both on premises and in the cloud. Accordingly, it can be supplied through various routes such as channel partners, cloud providers, and solution providers. Moreover, for AI inference, Alveo can be an effective means of achieving performance goals across a variety of algorithms and workflows. Even in HPC workloads such as financial risk modeling, Alveo can offer performance improvements of several times to tens of times over traditional CPUs or GPUs.
For applying 'adaptive hardware' to these new workloads, the Xilinx Vitis unified software platform lets developers work at a higher level of abstraction, build applications on industry-standard frameworks, and make optimal use of Xilinx hardware, providing an environment that can speed up development. Furthermore, Vitis AI integrates the 'Domain Specific Architecture' development environment so that customers' AI models can run optimally on Xilinx hardware through industry-standard frameworks and pre-trained AI models.