Effective Combination of DenseNet and BiLSTM for Keyword Spotting-Deep Learning Project

Keyword spotting (KWS) is a major component of human-computer interaction for smart on-device terminals and service robots. Its goal is to maximize detection accuracy while keeping the model footprint small. In this project, building on DenseNet's strength at extracting local feature maps, we propose a new network architecture for KWS, DenseNet-BiLSTM. In DenseNet-BiLSTM, the DenseNet primarily extracts local features, while the BiLSTM captures time series features. DenseNet is generally used for computer vision tasks, and applied unchanged it may corrupt the contextual information in speech audio. To make DenseNet suitable for KWS, we propose a variant called DenseNet-Speech, which removes pooling along the time dimension in the transition layers to preserve the time series information of speech. In addition, DenseNet-Speech uses fewer dense blocks and filters to keep the model small, thereby reducing inference time on mobile devices. The experimental results show that the feature maps from DenseNet-Speech preserve time series information well. Our method outperforms state-of-the-art methods in terms of accuracy on the Google Speech Commands dataset. DenseNet-BiLSTM achieves an accuracy of 96.6% on the 20-commands recognition task with 223K trainable parameters.
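
The effect of removing time-dimension pooling from the transition layers can be illustrated with a small sketch. This is not the project's implementation: the helper names `avg_pool2d` and `shape`, the 8 x 8 input size, the use of average pooling, and the choice of two transition layers are all assumptions made for illustration. It only shows that pooling along frequency alone keeps every time frame available for a later recurrent layer such as the BiLSTM, whereas vision-style pooling shrinks the time axis as well.

```python
def avg_pool2d(feature_map, pool_time, pool_freq):
    """Average-pool a 2D map indexed as [time][frequency], non-overlapping windows."""
    n_time = len(feature_map) // pool_time
    n_freq = len(feature_map[0]) // pool_freq
    pooled = []
    for t in range(n_time):
        row = []
        for f in range(n_freq):
            # Collect the values covered by this pooling window.
            window = [
                feature_map[t * pool_time + dt][f * pool_freq + df]
                for dt in range(pool_time)
                for df in range(pool_freq)
            ]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def shape(feature_map):
    """Return (time_frames, frequency_bins) of a 2D map."""
    return (len(feature_map), len(feature_map[0]))


# Hypothetical input: 8 time frames x 8 frequency bins (sizes chosen for illustration).
frames = [[float(t * 8 + f) for f in range(8)] for t in range(8)]

# Two vision-style transition layers: pool both time and frequency by 2.
vision_style = avg_pool2d(avg_pool2d(frames, 2, 2), 2, 2)

# Two speech-style transition layers: pool frequency only, leave time untouched.
speech_style = avg_pool2d(avg_pool2d(frames, 1, 2), 1, 2)

print(shape(vision_style))  # (2, 2): only 2 of the 8 time steps remain
print(shape(speech_style))  # (8, 2): all 8 time steps remain
```

In the speech-style case the output still has one feature vector per input frame, so a sequence model placed after it sees the full time resolution.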