Cloud Deep Learning Chips
Training & Inference
Created: 2019.12.07

Updated: 2019.12.15/17/25

@Vengineer



This is a summary of training and inference
chips for deep learning in the cloud.
Each company's chip and product photos are
borrowed from the URLs listed on the same page.
Inference
Habana Labs: Goya (DRAM)
Intel Nervana: NNP-I (DRAM)
Google: TPU v1 (SRAM)
Groq (SRAM)
Alibaba: Hanguang (含光) 800 (SRAM)
Inference chips do not have their own interconnect.

Training
Google: TPU v2/v3 (HBM2)
Intel Nervana: NNP-T (HBM2)
Habana Labs: Gaudi (HBM2)
Alphaics: RAP (HBM2 option)
Huawei: Ascend 910 (HBM2)
Graphcore: GC2 (SRAM)
Cerebras: CS-1 (SRAM)
Training chips have their own interconnect.
https://vengineer.hatenablog.com/entry/2019/11/05/060000
https://github.com/basicmi/AI-Chip
https://twitter.com/jwangARK/status/1189560904872058880
Training Chips
Google Cloud TPU v2/v3
https://cloud.google.com/tpu/docs/system-architecture
Cloud TPU v2
180 TFLOPS
64 GB HBM
Cloud TPU v3
420 TFLOPS
128 GB HBM
Intel Nervana NNP-T
https://www.intel.ai/nervana-nnp/
nGraph
https://github.com/NervanaSystems/ngraph
Habana Gaudi
https://habana.ai/
2019.12.17(16): Acquired by Intel for $2B (approx. ¥220 billion)
Alphaics: Real AI Processing (RAP)
https://www.alphaics.ai/
Huawei: Ascend 910
https://ascend.huawei.com/
Enflame Technology: 邃思 (Suisi) DTU
http://www.enflame-tech.com/products.html
CloudBlazer
 PCIe Gen4 x16
 HBM2 16GB (512GB/s)
 T10: PCIe card (255W)
   FP32: 20 TFLOPS
   FP16/BF16: 80 TFLOPS
 T11: OAM (300W)
   FP32: 22 TFLOPS
   FP16/BF16: 86 TFLOPS
Baidu: Kunlun (advanced XPU)
https://news.samsung.com/global/baidu-and-samsung-electronics-ready-for-production-of-leading-edge-ai-chip-for-early-next-year
Samsung 14nm
 HBM2 16GB (512GB/s)
 260 TOPS / 150W
Graphcore GC2
https://www.graphcore.ai/
Cerebras Systems: Cerebras CS-1
https://www.cerebras.net/
Inference Chips
Google TPU v1
https://cloud.google.com/blog/products/gcp/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
Intel Nervana NNP-I
https://www.intel.ai/nervana-nnp/
Glow
https://github.com/pytorch/glow
NNP-I ×32 in 1U
Habana Goya
2019.12.17(16): Acquired by Intel for $2B (approx. ¥220 billion)
https://habana.ai/
Glow
https://github.com/pytorch/glow
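
Both NNP-I and Goya list Glow as their compiler stack. The pytorch/glow repository ships ONNX loaders and a model-compiler tool, so one plausible path from PyTorch to these chips is an ONNX export, as in the minimal sketch below (the ResNet-18 model and file name are only placeholders, not from the slides):

  import torch
  import torchvision

  # Any traceable model works; torchvision's ResNet-18 is only an example.
  model = torchvision.models.resnet18(pretrained=True).eval()
  dummy_input = torch.randn(1, 3, 224, 224)  # example NCHW input used for tracing

  # Write an ONNX file that a Glow-based toolchain can then load and compile.
  torch.onnx.export(model, dummy_input, "resnet18.onnx", opset_version=9)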
Groq: Tensor Streaming Architecture (TSA)
https://groq.com/
https://groq.com/groq-announces-worlds-first-architecture-capable-of-petaops-on-a-single-chip/
https://www.electronicdesign.com/industrial-automation/groq-s-ai-accelerator-eyes-hyperscalers-autonomous-vehicles
Alibaba: Hanguang (含光) 800
https://www.alibabacloud.com/blog/announcing-hanguang-800-alibabas-first-ai-inference-chip_595482
https://en.wikichip.org/wiki/t-head/hanguang_800
Collaboration
Glow: A community-driven approach to AI infrastructure
https://engineering.fb.com/ml-applications/glow-a-community-driven-approach-to-ai-infrastructure/
https://github.com/pytorch/glow
Intel and Baidu Continue Collaboration
across AI, AD and 5G
https://newsroom.intel.com/articles/intel-baidu-continue-collaboration-across-ai-ad-5g/
● BaiduBrain* (Baidu’s AI platform),
● PaddlePaddle* (Baidu’s deep learning platform)
● DuerOS* (Baidu’s AI-powered voice assistant platform)
● Apollo* (Baidu’s autonomous driving platform)
● Intel® Xeon® Scalable platform
● Intel® Optane™ DC Persistent Memory
● Intel® Optane™ DC SSD
● Silicon photonics
● Ethernet
● Intel AI accelerators and Intel software stack
Training PyTorch models on Cloud TPU
Pods
https://cloud.google.com/tpu/docs/tutorials/pytorch-pod
GitHub: https://github.com/pytorch/xla
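
As a rough single-device sketch of the pytorch/xla flow referenced above (the toy model and random batches are placeholders; the tutorial URL covers the full multi-host Pod setup):

  import torch
  import torch.nn as nn
  import torch_xla.core.xla_model as xm

  device = xm.xla_device()  # grab an XLA device (a TPU core on Cloud TPU)

  model = nn.Linear(10, 2).to(device)  # toy stand-in for a real network
  optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
  loss_fn = nn.CrossEntropyLoss()

  for step in range(10):
      data = torch.randn(8, 10, device=device)           # placeholder batch
      target = torch.randint(0, 2, (8,), device=device)  # placeholder labels
      optimizer.zero_grad()
      loss = loss_fn(model(data), target)
      loss.backward()
      # barrier=True forces the lazily recorded XLA graph to execute here.
      xm.optimizer_step(optimizer, barrier=True)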
MICROSOFT AND GRAPHCORE
COLLABORATE TO ACCELERATE
ARTIFICIAL INTELLIGENCE
https://www.graphcore.ai/posts/microsoft-and-graphcore-collaborate-to-accelerate-artificial-intelligence
Today we are very excited to share details of our collaboration with Microsoft,
announcing the preview of Graphcore® Intelligence Processing Units (IPUs) on
Microsoft Azure.
● Graphcore IPUs with Dell EMC DSS 8440 Server
● Graphcore also delivers a full training runtime for ONNX and is working closely with the ONNX
organisation to include this in the ONNX standard environment. Initial PyTorch support is
available in Q4 2019 with full advanced feature support becoming available in early 2020. (A hedged sketch of IPU PyTorch usage follows below.)
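
The PyTorch support in the last bullet was not yet public when these slides were written, so the following is only an illustration in the style of the PopTorch package Graphcore later shipped; every name here comes from that later API and is an assumption relative to this announcement:

  import torch
  import poptorch  # Graphcore's PyTorch-on-IPU package (released after this announcement)

  class ModelWithLoss(torch.nn.Module):
      # PopTorch expects the forward pass to also return the training loss.
      def __init__(self):
          super().__init__()
          self.net = torch.nn.Linear(10, 2)
          self.loss = torch.nn.CrossEntropyLoss()

      def forward(self, x, y):
          out = self.net(x)
          return out, self.loss(out, y)

  model = ModelWithLoss()
  optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

  # trainingModel compiles the graph for the IPU and runs update steps on-device.
  training_model = poptorch.trainingModel(model, optimizer=optimizer)

  x = torch.randn(8, 10)
  y = torch.randint(0, 2, (8,))
  out, loss = training_model(x, y)  # one compiled training step on the IPU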
Baidu, Facebook and Microsoft work together to define the OCP Accelerator Module specification
https://www.opencompute.org/blog/baidu-facebook-and-microsoft-work-together-to-define-the-ocp-accelerator-module-specification
https://146a55aca6f00848c565-a7635525d40ac1c70300198708936b4e.ssl.cf1.rackcdn.com/images/22fa829b159a4cea7b33aa12bc2c61909e52d077.pdf
Other than Apple and Amazon =>
I am a computer engineer,
not a deep learning craftsman




Thank you.
@Vengineer
Source code analysis craftsman
