To achieve better accuracy, modern models have become increasingly large, and the resulting exponential increase in computational load makes them challenging to deploy in edge computing. Binary neural networks (BNNs) quantize both filter weights and activations to 1 bit, which makes them well suited to small chips such as advanced RISC machines (ARMs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), systems-on-chip (SoCs), and other edge computing devices. To make a model friendlier to such devices, it is crucial to reduce its floating-point operations (FLOPs). Batch normalization (BN) is an essential tool for BNNs; however, once the convolution layers are quantized to 1 bit, the floating-point computation cost of the BN layers becomes comparatively significant. This paper reduces the floating-point operations by removing the BN layers from the model and introducing scaled weight standardization convolution (WS-Conv) to avoid the significant accuracy drop caused by the absence of BN layers. Model performance is further improved through a series of optimizations, namely adaptive gradient clipping (AGC) and knowledge distillation (KD). As a result, our model maintains competitive computational cost and accuracy even without BN layers. Furthermore, with these training methods, the model’s accuracy on CIFAR-100 is 0.6% higher than that of the baseline model, fractional activation BNN (FracBNN), while its total computational load is only 46% of the baseline's. With binary operations (BOPs) unchanged, the FLOPs are reduced to nearly zero, making the model more suitable for embedded platforms such as FPGAs and other edge computers.
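To make the two BN-free training ingredients named above more concrete, the following is a minimal sketch of scaled weight standardization and adaptive gradient clipping as they are commonly formulated in the normalizer-free network literature. It is not the paper's implementation: the function names, the flat-list filter layout, and the default values of `gain`, `clip`, and `eps` are all assumptions chosen for illustration, and the paper's WS-Conv may differ in detail (for example in how it interacts with binarization).

```python
import math


def scaled_weight_standardize(filters, gain=1.0, eps=1e-5):
    """Standardize each filter over its fan-in (assumed formulation).

    `filters` is a list of filters, one per output channel, each given as a
    flat list of its fan-in weights. Every filter is shifted to zero mean and
    rescaled so that its variance is roughly gain**2 / fan_in.
    """
    standardized = []
    for w in filters:
        fan_in = len(w)
        mean = sum(w) / fan_in
        var = sum((x - mean) ** 2 for x in w) / fan_in
        # eps guards against division by zero for constant filters.
        scale = gain / math.sqrt(fan_in * (var + eps))
        standardized.append([(x - mean) * scale for x in w])
    return standardized


def adaptive_gradient_clip(weights, grads, clip=0.01, eps=1e-3):
    """Unit-wise adaptive gradient clipping (assumed formulation).

    For each filter, the gradient is rescaled whenever its norm exceeds
    `clip` times the norm of the matching weights. `eps` keeps filters
    with near-zero weights from having their gradients clipped to zero.
    """
    clipped = []
    for w, g in zip(weights, grads):
        w_norm = max(math.sqrt(sum(x * x for x in w)), eps)
        g_norm = math.sqrt(sum(x * x for x in g))
        max_norm = clip * w_norm
        if g_norm > max_norm:
            g = [x * (max_norm / g_norm) for x in g]
        clipped.append(list(g))
    return clipped
```

Because the standardization acts only on the weights, it can in principle be applied during training and folded into the stored weights afterwards. This is one way to read the abstract's claim that inference FLOPs drop to nearly zero once the BN layers are removed.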