Defining properties of speech spectrogram images to allow effective pre-processing prior to pattern recognition

Authors
Mohammed, Aldarkazali [1 ]
Rupert, Young [1 ]
Chris, Chatwin [1 ]
Philip, Birch [1 ]
Affiliation
[1] Univ Sussex, Sch Engn & Design, Ind Informat Res Grp, Brighton BN1 9QT, E Sussex, England
Source
OPTICAL PATTERN RECOGNITION XXIV | 2013 / Vol. 8748
DOI
10.1117/12.2014511
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The speech signal of a word is a combination of frequencies that can produce specific transition frequency shapes. These can be regarded as written text in some unknown 'script'. Before attempting to read the speech spectrogram image using image processing techniques, we first need to define the properties of the image, reduce its clutter, and select the methods to be employed for image matching. Thus, methods to convert the speech signal to a spectrogram image are initially employed, followed by reduction of the noise in the signal by capturing the energy associated with the formants of the speech signal. This is followed by normalisation of the size of the image and of its resolution in both the frequency and time axes. Finally, template matching methods are employed to recognise portions of text and isolated words. The paper describes the pre-processing methods employed and outlines the use of normalised grey-level correlation for the recognition of words.
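The pipeline summarised in the abstract (signal to spectrogram image, then template matching by normalised grey-level correlation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Hann window, the frame length of 256 samples, the hop of 128 samples, and the synthetic rising tone are all assumptions chosen for illustration, and the formant-energy noise reduction and size normalisation steps are not reproduced here.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Log-magnitude STFT as a 2-D array (frequency rows, time columns).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)


def normalised_correlation(patch, template):
    """Zero-mean normalised correlation of two equal-sized grey-level arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0


def match_along_time(image, template):
    """Slide the template along the time axis; return (best_offset, best_score)."""
    width = template.shape[1]
    scores = [normalised_correlation(image[:, k:k + width], template)
              for k in range(image.shape[1] - width + 1)]
    best = int(np.argmax(scores))
    return best, scores[best]


# Hypothetical test signal: a one-second rising tone sampled at 8 kHz.
rate = 8000
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * (300 * t + 600 * t ** 2))

image = spectrogram_image(signal)
template = image[:, 10:20].copy()  # a patch cut from the image itself
offset, score = match_along_time(image, template)
```

Because the template is cut from the image at column 10, the search should recover that offset with a score close to 1. Subtracting the mean and dividing by the energies makes the score insensitive to overall brightness and contrast changes in the grey-level image, which is the usual motivation for a normalised correlation measure.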
Pages: 11