FeatureExtraction Class Reference

This class handles all feature extraction issues.

Public Member Functions

 FeatureExtraction (const char *audioIn, const char *audioOut, int mfccSize, int useEnergy, int useZeroCross, int useDeltas, double vtln, bool useSqrt10=false, bool offs=false, bool onlyPrepare=false, bool onlyMEL=false, bool pCTS=false)
 ~FeatureExtraction ()
int getVectorSize () const
int getPoolSize () const
Vector ** getPool () const
Vector * getOnlineVector (double *silThreshold, bool isSil)
void setVTLN (double vtln)
void setBackgroundHistNorm (Vector **hist, int numBins)
bool createFeaturesUntilFrame (int lastFrame, int normID=0)
void finishExtraction ()
void doFeatureExtraction (const char *audioIn, const char *audioOut, bool onlyPrepare=false)
void setNormalization (FEATURENORM_TYPE normT, int nrClusters, bool doMean, bool doVar, Vector **hist)
void finishClusterCMVN ()
void performClusterCMVNUntilFrame (int lastFrame, int cmvnID)
void performHistNormUntilFrame (int lastFrame, int histID)
void initializeHistNorm ()
void setHistNormModel (int normID, Vector **h)
void setAudioFileOut (const char *fileName)
void performPCA (Vector **pca, int len)
void setOnlyMEL ()
void storeFeatureVectors (FILE *file)

Public Attributes

Vector ** inverseHistogram

Protected Member Functions

bool readAudioWindow (short int *audioWindow)
void calculateDelta (int time, int offset)
Vector * createMfccFrame (int normID, int spectralSubtraction=-1)

Protected Attributes

int melIndexAboveCTS
FILE * audioFile
FILE * audioFileOut
bool offset
bool onlyMEL
float fea_delta_noemer
float hammingTemplate [FEA_WINDOWSIZE]
short int rememberBuffer [FEA_REMEMBERWINSIZE]
int melTemplate [LARGE_MEL_BANKLENGTH+2]
float fftWindowSpectralSubtract [FEA_FREQWINDOWSIZE]
Vector ** featureVector
Vector ** gaussianization
int nrFrames
int mfccSize
int vectorSize
int useDeltas
int useZeroCross
int useEnergy
int nonDeltaSize
int numberOfFeaturesProcessed
int nrNormClusters
bool doCMN
bool doCVN
bool doSqrt10
bool performCTS

Detailed Description

This class handles all feature extraction issues.

The following feature extraction steps are taken:

  • Create overlapping windows (default: 32 ms windows, one every 10 ms)
  • DC offset removal. (Is this needed? Does the pre-emphasis solve this?)
  • Pre-emphasis
  • Apply Hamming window
  • Calculate energy
  • Fast Fourier Transform (magnitude)
  • Vocal Tract Length Normalization
  • Mel bank filtering (default 20 banks, logarithmic)
  • Discrete cosine transform (DCT), producing the MFCC coefficients
  • Cepstrum liftering
  • Normalize the coefficients: mean subtraction
  • Calculate deltas and delta-deltas
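The first steps of the list (windowing, DC offset removal, pre-emphasis, Hamming window, energy) can be sketched as follows. This is a minimal illustration, not the class's actual code: the names `Frame`, `countFrames` and `prepareFrame`, the pre-emphasis factor 0.97, the 16 kHz sample rate used to turn 32 ms and 10 ms into 512 and 160 samples, and the order in which energy is taken are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed front-end constants: 16 kHz audio, 32 ms windows, 10 ms shift.
constexpr std::size_t kWindowSize = 512;  // 32 ms at 16 kHz
constexpr std::size_t kFrameShift = 160;  // 10 ms at 16 kHz
constexpr double kPi = 3.14159265358979323846;

struct Frame {
    std::vector<double> samples;  // windowed samples, ready for the FFT
    double logEnergy;             // log of the summed squared samples
};

// Number of complete windows that fit into numSamples samples.
std::size_t countFrames(std::size_t numSamples) {
    if (numSamples < kWindowSize) return 0;
    return 1 + (numSamples - kWindowSize) / kFrameShift;
}

// Hypothetical helper: prepares window number frameIndex of a 16-bit signal.
Frame prepareFrame(const std::vector<short>& audio, std::size_t frameIndex,
                   double preEmphasis = 0.97) {
    const std::size_t start = frameIndex * kFrameShift;
    assert(start + kWindowSize <= audio.size());

    Frame frame;
    frame.samples.resize(kWindowSize);

    // DC offset removal: subtract the mean of the window.
    double mean = 0.0;
    for (std::size_t i = 0; i < kWindowSize; ++i) mean += audio[start + i];
    mean /= static_cast<double>(kWindowSize);
    for (std::size_t i = 0; i < kWindowSize; ++i)
        frame.samples[i] = audio[start + i] - mean;

    // Pre-emphasis: y[i] = x[i] - a * x[i-1], done back to front so it runs in place.
    for (std::size_t i = kWindowSize - 1; i > 0; --i)
        frame.samples[i] -= preEmphasis * frame.samples[i - 1];

    // Hamming window, followed by the energy of the windowed frame.
    double energy = 0.0;
    for (std::size_t i = 0; i < kWindowSize; ++i) {
        const double w = 0.54 - 0.46 * std::cos(2.0 * kPi * i / (kWindowSize - 1));
        frame.samples[i] *= w;
        energy += frame.samples[i] * frame.samples[i];
    }
    frame.logEnergy = std::log(energy + 1e-10);  // small floor avoids log(0)
    return frame;
}
```

With these assumed constants, one second of audio (16000 samples) yields 97 complete windows. The remaining steps (FFT, VTLN, Mel filtering, DCT, liftering) would operate on `Frame::samples`.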

This class can do feature extraction for a single audio file (16 kHz, 16-bit raw PCM audio) or for a batch file.

Constructor & Destructor Documentation

FeatureExtraction::FeatureExtraction ( const char *  audioIn,
const char *  audioOut,
int  mfccS,
int  useE,
int  useZC,
int  useD,
double  vtln,
bool  useSqrt10 = false,
bool  offs = false,
bool  onlyPrepare = false,
bool  onlyM = false,
bool  pCTS = false 
)

Member Function Documentation

void FeatureExtraction::calculateDelta ( int  time,
int  offset 
) [protected]

References fea_delta_noemer, featureVector, Vector::getValue(), nonDeltaSize, and Vector::setValue().

Referenced by doFeatureExtraction(), finishExtraction(), and getOnlineVector().
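For orientation, a common way to compute delta features is a linear regression over neighbouring frames with a constant denominator. The member name fea_delta_noemer ("noemer" is Dutch for "denominator") hints at such a precomputed value, but that reading is an assumption. The sketch below uses the standard regression formula; the function `computeDeltas`, its `regressionWidth` parameter and the edge clamping are illustrative choices, not the class's documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Regression-based delta of one coefficient track over time.
// regressionWidth (N) is an assumed parameter; frames beyond the edges are clamped.
std::vector<double> computeDeltas(const std::vector<double>& track,
                                  int regressionWidth = 2) {
    assert(regressionWidth >= 1);
    const int numFrames = static_cast<int>(track.size());
    std::vector<double> delta(track.size(), 0.0);
    if (numFrames == 0) return delta;

    // Denominator 2 * sum(n^2): constant, so it is computed only once.
    double denominator = 0.0;
    for (int n = 1; n <= regressionWidth; ++n) denominator += 2.0 * n * n;

    for (int t = 0; t < numFrames; ++t) {
        double numerator = 0.0;
        for (int n = 1; n <= regressionWidth; ++n) {
            const int next = std::min(t + n, numFrames - 1);
            const int prev = std::max(t - n, 0);
            numerator += n * (track[next] - track[prev]);
        }
        delta[t] = numerator / denominator;
    }
    return delta;
}
```

Applying the same function to its own output gives delta-deltas. Away from the edges, a linear ramp yields a constant delta equal to its slope.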

Vector * FeatureExtraction::createMfccFrame ( int  normID,
int  spectralSubtraction = -1 
) [protected]

Vector * FeatureExtraction::getOnlineVector ( double *  silThreshold,
bool  isSil 
)

Vector** FeatureExtraction::getPool (  )  const [inline]

int FeatureExtraction::getPoolSize (  )  const [inline]

int FeatureExtraction::getVectorSize (  )  const [inline]

void FeatureExtraction::initializeHistNorm (  ) 

References numberOfFeaturesProcessed.

void FeatureExtraction::performClusterCMVNUntilFrame ( int  lastFrame,
int  normID 
)

References Vector::divideElements(), doCMN, doCVN, featureVector, normData, numberOfFeaturesProcessed, and Vector::substractElements().

Referenced by FeaturePool::createNewPool().
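The references to doCMN, doCVN, Vector::substractElements() and Vector::divideElements() suggest cepstral mean and variance normalization (CMVN). A minimal sketch of the idea follows, assuming statistics are taken over all frames passed in. The per-cluster bookkeeping implied by the method name is omitted, and `FeatureMatrix` and `applyCMVN` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using FeatureMatrix = std::vector<std::vector<double>>;  // frames x dimensions

// Normalizes each dimension to zero mean (doMean) and unit variance (doVar).
void applyCMVN(FeatureMatrix& frames, bool doMean, bool doVar) {
    if (frames.empty()) return;
    const std::size_t dim = frames[0].size();
    const double n = static_cast<double>(frames.size());

    for (std::size_t d = 0; d < dim; ++d) {
        double mean = 0.0;
        for (const auto& f : frames) {
            assert(f.size() == dim);
            mean += f[d];
        }
        mean /= n;

        double variance = 0.0;
        for (const auto& f : frames) variance += (f[d] - mean) * (f[d] - mean);
        variance /= n;
        const double stdDev = std::sqrt(variance);

        for (auto& f : frames) {
            if (doMean) f[d] -= mean;
            if (doVar && stdDev > 0.0) f[d] /= stdDev;  // skip constant dimensions
        }
    }
}
```

The two flags mirror the doCMN and doCVN attributes, so mean and variance normalization can be switched on independently.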

void FeatureExtraction::performHistNormUntilFrame ( int  lastFrame,
int  normID 
)

References featureVector, Vector::getValue(), NormData::histogram, inverseHistogram, Vector::len(), normData, numberOfFeaturesProcessed, and Vector::setValue().

void FeatureExtraction::performPCA ( Vector **  pca,
int  len 
)

References featureVector, nrFrames, Vector::setValue(), and vectorSize.

Referenced by FeaturePool::createNewPool().

bool FeatureExtraction::readAudioWindow ( short int *  audioWindow  )  [protected]

References audioFile, audioFileOut, WriteFileLittleBigEndian::freadEndianSafe(), and rememberBuffer.

Referenced by createMfccFrame(), and doFeatureExtraction().
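Reading 16-bit PCM in an endian-safe way means assembling each sample from its bytes rather than relying on the host byte order. The sketch below assumes little-endian input data. It does not show what WriteFileLittleBigEndian::freadEndianSafe() actually does, and `decodeLittleEndian16` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes raw bytes as little-endian signed 16-bit samples,
// independent of the byte order of the host machine.
std::vector<std::int16_t> decodeLittleEndian16(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % 2 == 0);
    std::vector<std::int16_t> samples(bytes.size() / 2);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        // Low byte first, then high byte: value is in the range 0..65535.
        const int value = bytes[2 * i] | (bytes[2 * i + 1] << 8);
        // Map the upper half of the range onto negative sample values.
        samples[i] = static_cast<std::int16_t>(value >= 0x8000 ? value - 0x10000 : value);
    }
    return samples;
}
```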

void FeatureExtraction::setAudioFileOut ( const char *  fileName  ) 

References audioFileOut.

Referenced by ShoutOnline::ShoutOnline().

void FeatureExtraction::setBackgroundHistNorm ( Vector **  hist,
int  numBins 
)

void FeatureExtraction::setHistNormModel ( int  normID,
Vector **  h 
)

References NormData::histogram, and normData.

void FeatureExtraction::setNormalization ( FEATURENORM_TYPE  normT,
int  nrClusters,
bool  doMean,
bool  doVar,
Vector **  hist 
)

void FeatureExtraction::setOnlyMEL (  ) 

References onlyMEL.

Referenced by FeaturePool::setOnlyMEL().

void FeatureExtraction::setVTLN ( double  vtln  ) 

void FeatureExtraction::storeFeatureVectors ( FILE *  file  ) 



References featureVector, MEL_BANKLENGTH, nrFrames, onlyMEL, and Vector::storeFloatData().

Referenced by FeaturePool::storeFeatureVectors().

Member Data Documentation

Referenced by FeatureExtraction().

float FeatureExtraction::fftWindowSpectralSubtract[FEA_FREQWINDOWSIZE] [protected]

Referenced by createMfccFrame().

float FeatureExtraction::hammingTemplate[FEA_WINDOWSIZE] [protected]

int FeatureExtraction::melTemplate[LARGE_MEL_BANKLENGTH+2] [protected]

Referenced by createMfccFrame(), and setVTLN().
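The array length LARGE_MEL_BANKLENGTH+2 is consistent with storing the edge positions of triangular Mel filters, since B filters need B+2 edges. That interpretation is an assumption. The sketch below shows one conventional way to compute such edges as FFT bin indices, using the common 2595·log10(1 + f/700) Mel formula. The function names and parameters are hypothetical, and no VTLN warping is applied.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Common Mel-scale conversion (one of several variants in use).
double hzToMel(double hz) { return 2595.0 * std::log10(1.0 + hz / 700.0); }
double melToHz(double mel) { return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0); }

// Returns numBanks + 2 FFT bin indices, equally spaced on the Mel scale
// between 0 Hz and the Nyquist frequency. Filter b would span
// edges[b] .. edges[b + 2] with its peak at edges[b + 1].
std::vector<int> melBankEdges(int numBanks, int fftSize, double sampleRate) {
    assert(numBanks > 0 && fftSize > 0 && sampleRate > 0.0);
    const double nyquist = sampleRate / 2.0;
    const double melMax = hzToMel(nyquist);
    std::vector<int> edges(numBanks + 2);
    for (int i = 0; i < numBanks + 2; ++i) {
        const double hz = melToHz(melMax * i / (numBanks + 1));
        edges[i] = static_cast<int>(std::lround(hz / nyquist * (fftSize / 2)));
    }
    return edges;
}
```

For example, `melBankEdges(20, 512, 16000.0)` returns 22 non-decreasing bin indices running from 0 to 256.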

bool FeatureExtraction::offset [protected]

Referenced by FeatureExtraction().

Referenced by FeatureExtraction(), and setVTLN().

short int FeatureExtraction::rememberBuffer[FEA_REMEMBERWINSIZE] [protected]