Article Review - AdderNet: Do We Really Need Multiplications in Deep Learning? (Image Classification)

Using addition instead of multiplication for convolution gives lower latency than a standard CNN

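AdderNet convolution using addition, with no multiplication
This article reviews AdderNet: Do We Really Need Multiplications in Deep Learning? (AdderNet), from Huawei Noah's Ark Lab and collaborating institutions.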


  1. AdderNet

  2. Other Considerations: BN, Derivatives, Learning Rate

1. AdderNet


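  • In general, a filter F is applied to an input feature X, and the output Y is the similarity between them:

    $$Y(m,n,t) = \sum_{i=0}^{d}\sum_{j=0}^{d}\sum_{k=0}^{c_{in}} S\big(X(m+i,n+j,k),\, F(i,j,k,t)\big)$$

  • S is a predefined similarity measure.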


Standard convolution using multiplication
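  • With cross-correlation as the similarity measure, S(x, y) = x × y, the formulation becomes standard convolution.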

1.3. AdderNet

AdderNet ,
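  • In AdderNet, the similarity is instead the negative l1-distance between the filter F and the input feature X:

    $$Y(m,n,t) = -\sum_{i=0}^{d}\sum_{j=0}^{d}\sum_{k=0}^{c_{in}} \left|X(m+i,n+j,k) - F(i,j,k,t)\right|$$

  • The l1-distance is computed with additions only (subtraction and absolute value), so no multiplication is needed.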

, .

, , - , .
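The two similarity measures can be compared side by side in a minimal NumPy sketch. It assumes stride 1, no padding, and a channels-last layout; the function names (`similarity_filter`, `multiply`, `neg_abs_diff`) are hypothetical and chosen only for illustration. It is written for readability, not speed.

```python
import numpy as np

def similarity_filter(x, f, similarity):
    """Slide filters over x and sum an element-wise similarity S(x, f).

    x: input feature of shape (H, W, C_in)
    f: filters of shape (d, d, C_in, C_out)
    similarity: callable S(patch, filt), applied element-wise
    Returns y of shape (H - d + 1, W - d + 1, C_out).
    Assumption: stride 1 and no padding.
    """
    h, w, _ = x.shape
    d, _, _, c_out = f.shape
    y = np.zeros((h - d + 1, w - d + 1, c_out))
    for m in range(h - d + 1):
        for n in range(w - d + 1):
            patch = x[m:m + d, n:n + d, :]  # window of shape (d, d, C_in)
            for t in range(c_out):
                y[m, n, t] = similarity(patch, f[..., t]).sum()
    return y

def multiply(patch, filt):
    # Cross-correlation: the similarity used by standard convolution.
    return patch * filt

def neg_abs_diff(patch, filt):
    # Negative l1-distance: the similarity used by an adder layer.
    return -np.abs(patch - filt)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5, 3))
f = rng.standard_normal((3, 3, 3, 2))
y_cnn = similarity_filter(x, f, multiply)        # multiplications and additions
y_adder = similarity_filter(x, f, neg_abs_diff)  # subtractions and additions only
```

Every entry of `y_adder` is non-positive, since it is a negated sum of absolute values. This is one reason a normalization step after the adder layer is useful.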

2. Other Considerations: BN, Derivatives, Learning Rate

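2.1. Batch Normalization (BN)

  • After the adder layer, batch normalization (BN) is applied to normalize Y to an appropriate range, so that the activation functions used in conventional CNNs can also be used in AdderNets.

  • BN does involve multiplications, but its computational cost is much lower than that of the convolutional layers and can be neglected.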

  • ( - BN, ?)


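  • The derivative of the l1-distance with respect to the filter is a sign function, sgn(X - F), which takes only the values +1, 0, or -1. Instead, the derivative of the l2-norm is used:

    $$\frac{\partial Y(m,n,t)}{\partial F(i,j,k,t)} = X(m+i,n+j,k) - F(i,j,k,t)$$

  • This is the full-precision gradient.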

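  • To keep gradient magnitudes bounded, the gradient with respect to X is clipped to [-1, 1].

  • The partial derivative of Y with respect to X is then:

    $$\frac{\partial Y(m,n,t)}{\partial X(m+i,n+j,k)} = \mathrm{HT}\big(F(i,j,k,t) - X(m+i,n+j,k)\big)$$

  • where HT is the HardTanh function:

    $$\mathrm{HT}(x) = \begin{cases} x, & -1 < x < 1 \\ 1, & x > 1 \\ -1, & x < -1 \end{cases}$$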
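The derivative rules for an adder layer can be sketched element-wise in NumPy. The helper name `adder_grads` and the example values are hypothetical. The sketch assumes the adder output is the negative l1-distance, Y = -Σ|X - F|, so the exact derivative with respect to F is sgn(X - F).

```python
import numpy as np

def hard_tanh(z):
    # HardTanh: identity inside [-1, 1], saturating at -1 and 1 outside.
    return np.clip(z, -1.0, 1.0)

def adder_grads(patch, filt):
    """Element-wise derivatives for Y = -sum(|patch - filt|)."""
    sign_grad_f = np.sign(patch - filt)  # exact derivative w.r.t. F: only -1, 0, or +1
    full_grad_f = patch - filt           # full-precision replacement (l2-style)
    grad_x = hard_tanh(filt - patch)     # derivative w.r.t. X, clipped to [-1, 1]
    return sign_grad_f, full_grad_f, grad_x

patch = np.array([0.5, -2.0, 3.0])
filt = np.array([0.0, 1.0, 0.5])
sign_grad_f, full_grad_f, grad_x = adder_grads(patch, filt)
# sign_grad_f -> [ 1., -1.,  1.]
# full_grad_f -> [ 0.5, -3. ,  2.5]
# grad_x      -> [-0.5,  1. , -1. ]
```

The sign gradient discards how far each weight is from the input, while the full-precision version keeps that magnitude. Clipping is applied only to the gradient with respect to X.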


l2-norms of gradients in LeNet-5-BN
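  • The l2-norms of the filter gradients in AdderNets are much smaller than those in CNNs, which can slow down the update of filters in AdderNets.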

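  • An adaptive learning rate is used for the different layers of AdderNets:

    $$\Delta F_l = \gamma \times \alpha_l \times \Delta L(F_l)$$

  • where γ is the global learning rate (e.g., for adder filters and BN layers), ΔL(F_l) is the gradient of the filter in layer l, and α_l is the corresponding local learning rate.

  • The local learning rate is defined as:

    $$\alpha_l = \frac{\eta \sqrt{k}}{\lVert \Delta L(F_l) \rVert_2}$$

  • where k is the number of elements in F_l and η is a hyperparameter that controls the learning rate of the adder filters.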
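A layer-wise adaptive learning rate of this kind can be sketched as follows, assuming the local rate is α_l = η·√k / ‖ΔL(F_l)‖₂ and the update is γ·α_l·ΔL(F_l). The function name `adaptive_update` and the small `eps` guard against a zero gradient are assumptions added for the sketch, not part of the reviewed method.

```python
import numpy as np

def adaptive_update(grad, gamma, eta, eps=1e-12):
    """Scale one layer's filter gradient by a local learning rate.

    grad:  gradient of the loss w.r.t. the layer's filters (any shape)
    gamma: global learning rate
    eta:   hyperparameter controlling the adder-filter learning rate
    eps:   assumed numerical guard against division by zero
    """
    k = grad.size                          # number of elements in the filter
    norm = np.linalg.norm(grad.ravel())    # l2-norm of the flattened gradient
    alpha = eta * np.sqrt(k) / (norm + eps)
    return gamma * alpha * grad

small = np.array([[3.0, 4.0]])
update_small = adaptive_update(small, gamma=0.1, eta=0.5)
update_large = adaptive_update(1000.0 * small, gamma=0.1, eta=0.5)
# Both updates have (almost exactly) the same l2-norm, gamma * eta * sqrt(k).
```

Because the gradient is divided by its own norm, each layer's step size depends only on γ, η, and the filter size k. This compensates for layers whose raw gradients are very small.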


3.1. MNIST

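  • LeNet-5-BN is used for the MNIST experiments.

  • The CNN achieves 99.4% accuracy with about 435K multiplications and 435K additions.

  • By replacing the multiplications in convolution with additions, AdderNet reaches 99.4% accuracy, the same as the CNN, with about 870K additions.

  • In addition, multiplication has a higher theoretical latency than addition on CPUs.

  • For example, on a VIA Nano 2000 CPU, multiplication and addition take 4 and 2 cycles respectively. AdderNet with LeNet-5 then has a latency of about 1.7M cycles, versus about 2.6M for the CNN on the same CPU.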
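The latency figures can be checked with simple arithmetic, assuming latency is the operation count times the per-operation cycle cost (4 cycles per multiplication, 2 per addition) and ignoring everything else. The function name `latency_cycles` is hypothetical.

```python
MUL_CYCLES = 4  # cycles per multiplication on the CPU quoted above
ADD_CYCLES = 2  # cycles per addition on the same CPU

def latency_cycles(n_mul, n_add):
    """Theoretical latency as a weighted count of operations."""
    return n_mul * MUL_CYCLES + n_add * ADD_CYCLES

cnn_latency = latency_cycles(n_mul=435_000, n_add=435_000)  # 2_610_000, about 2.6M
adder_latency = latency_cycles(n_mul=0, n_add=870_000)      # 1_740_000, about 1.7M
```

Under this simple model AdderNet needs about two thirds of the CNN's cycles, even though the total number of operations is the same.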

3.2. CIFAR

Classification results on the CIFAR-10 and CIFAR-100 datasets
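BNN: XNOR-Net convolution using XNOR Boolean operations
  • Binary neural networks (BNN), which replace convolution with XNOR operations, are included in the comparison.

  • With VGG-small, AdderNets (93.72% on CIFAR-10 and 72.64% on CIFAR-100) achieve nearly the same results as CNNs (93.80% on CIFAR-10 and 72.73% on CIFAR-100).

  • The BNN model is much smaller than AdderNet and the CNN, but its accuracy is much lower (89.80% on CIFAR-10 and 65.41% on CIFAR-100).

  • With ResNet-20, the CNN achieves the highest accuracy (i.e. 92.25% on CIFAR-10 and 68.14% on CIFAR-100), but with a large number of multiplications (41.17M).

  • AdderNets achieve 91.84% on CIFAR-10 and 67.60% on CIFAR-100 without multiplications, which is comparable to the CNN.

  • In contrast, the BNN reaches only 84.87% and 54.14% on CIFAR-10 and CIFAR-100.

  • The ResNet-32 results likewise show AdderNets performing similarly to CNNs.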

3.3. ImageNet

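Classification results on the ImageNet dataset
  • The CNN achieves 69.8% top-1 and 89.1% top-5 accuracy with ResNet-18, at the cost of 1.8G multiplications.

  • AdderNet achieves 66.8% top-1 and 87.4% top-5 accuracy with ResNet-18.

  • In contrast, the BNN reaches only 51.2% top-1 and 73.2% top-5 accuracy with ResNet-18.

  • Similar results are observed for ResNet-50.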


Visualization of features in AdderNets and CNNs. The features of CNNs in different classes are divided by their angles.
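  • LeNet++ is trained on MNIST to extract 3D features for visualization.

  • The convolutional layers have 32, 32, 64, 64, 128, and 128 neurons, and the fully-connected layer has 2.

  • AdderNets use the l1-norm to distinguish different classes. The features tend to cluster around different class centers.

  • The visualization shows that AdderNets can have a discrimination ability similar to that of CNNs.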

Visualization of filters in the first layer of LeNet-5-BN on MNIST
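  • The filters of AdderNets still share some patterns with convolution filters.

  • The visualization shows that the filters of AdderNets can effectively extract useful information from the input.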

Histograms of the weights in AdderNet (left) and CNN (right).
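  • The weights of AdderNets follow a Laplace distribution, while those of CNNs follow a Gaussian distribution. In fact, the prior distribution of the l1-norm is the Laplace distribution.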


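Learning curves of AdderNets using different optimization schemes
  • With the sign gradient, AdderNets using the adaptive learning rate (ALR) and the increased learning rate (ILR) achieve 97.99% and 97.72% accuracy respectively, which is much lower than the CNN (99.40%).

  • Therefore, the full-precision gradient is used to update the weights in AdderNets.

  • With the full-precision gradient, AdderNet using ILR achieves 98.99% accuracy. With the adaptive learning rate (ALR), AdderNet reaches 99.40%, matching the CNN.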

