Audio Decoders¶
-
struct
musher::core
::
AudioDecoded
¶ Decoded audio file information that is common across all audio files.
Subclassed by musher::core::Mp3Decoded, musher::core::WavDecoded
Public Members
-
uint32_t
sample_rate
¶ Sampling rate of the audio signal [Hz].
-
int
channels
¶ Number of audio channels in the buffer.
-
bool
mono
¶ True if audio is mono.
-
bool
stereo
¶ True if audio is stereo.
-
int
samples_per_channel
¶ Number of samples per channel.
-
double
length_in_seconds
¶ Length of the audio in seconds, based on the number of samples and the sample rate.
-
std::string
file_type
¶ Type of the file decoded.
-
int
avg_bitrate_kbps
¶ Average bitrate of the buffer [kbps].
-
std::vector<std::vector<double>>
normalized_samples
¶ Normalized samples of the audio file.
normalized_samples[0] holds channel 1
normalized_samples[1] holds channel 2 (Will not exist if mono audio)
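The relationship between the members above can be sketched as follows. This is a minimal illustration, not the library's decoder: `DecodedSummary` and `Summarize` are hypothetical names, and the sketch assumes every channel holds the same number of samples.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down stand-in for the decoded-audio members above.
struct DecodedSummary {
    uint32_t sample_rate = 0;
    int channels = 0;
    bool mono = false;
    bool stereo = false;
    int samples_per_channel = 0;
    double length_in_seconds = 0.0;
};

// Derive the summary members from per-channel samples and a sample rate.
DecodedSummary Summarize(const std::vector<std::vector<double>> &normalized_samples,
                         uint32_t sample_rate) {
    assert(sample_rate > 0 && !normalized_samples.empty());
    DecodedSummary out;
    out.sample_rate = sample_rate;
    out.channels = static_cast<int>(normalized_samples.size());
    out.mono = (out.channels == 1);
    out.stereo = (out.channels == 2);
    out.samples_per_channel = static_cast<int>(normalized_samples[0].size());
    // Duration follows from the number of samples and the sample rate.
    out.length_in_seconds = static_cast<double>(out.samples_per_channel) / sample_rate;
    return out;
}
```

For example, 22050 samples per channel at 44100 Hz give a length of half a second.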
-
struct
musher::core
::
WavDecoded
: public musher::core::AudioDecoded¶ Decoded WAV file information.
Contains same attributes as AudioDecoded.
Public Members
-
int
bit_depth
¶ Bit depth of each sample.
-
struct
musher::core
::
Mp3Decoded
: public musher::core::AudioDecoded¶ Decoded MP3 file information.
Contains same attributes as AudioDecoded.
-
std::vector<uint8_t>
musher::core
::
LoadAudioFile
(const std::string &file_path)¶ Load the data from an audio file.
- Return
std::vector<uint8_t> Audio file data.
- Parameters
file_path
: File path to a .wav file.
-
WavDecoded
musher::core
::
DecodeWav
(const std::vector<uint8_t> &file_data)¶ Decode a wav file.
- Return
WavDecoded .wav file information.
- Parameters
file_data
: WAV file data.
-
WavDecoded
musher::core
::
DecodeWav
(const std::string &file_path)¶ Overloaded wrapper around DecodeWav that accepts a file path to a .wav file.
- Return
WavDecoded .wav file information.
- Parameters
file_path
: File path to a .wav file.
-
Mp3Decoded
musher::core
::
DecodeMp3
(const std::string file_path)¶ Decode an mp3 file.
- Return
Mp3Decoded .mp3 file information.
- Parameters
file_path
: File path to an .mp3 file.
FFT Convolve¶
-
std::vector<double>
musher::core
::
CenterVector
(const std::vector<double> &vec, size_t new_shape)¶ Center a vector with respect to the full discrete linear convolution of the input.
- Return
std::vector<double> Centered vector.
- Parameters
vec
: Vector.
new_shape
: New shape of the vector.
-
std::vector<double>
musher::core
::
FFTConvolve
(const std::vector<double> &vec1, const std::vector<double> &vec2)¶ Perform ‘same’ convolve of two 1-dimensional arrays using FFT.
Convolve vec1 and vec2 using the fast Fourier transform method. The output is the same size as vec1, centered with respect to the full discrete linear convolution of the inputs.
This function was heavily inspired by:
https://github.com/scipy/scipy/blob/12fa74e97d3d18ca3a4e6991327663e88462f238/scipy/signal/signaltools.py#L551
https://github.com/scipy/scipy/blob/master/scipy/fft/_pocketfft/pypocketfft.cxx
- Return
std::vector<double> A 1-dimensional array containing a subset of the discrete linear convolution of vec1 with vec2.
- Parameters
vec1
: Vector 1.
vec2
: Vector 2.
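To make the 'same' output shape concrete, here is a minimal sketch that computes the full linear convolution by direct summation (not by FFT) and then keeps the centered slice of length vec1.size(). The names `FullConvolve`, `CenterSlice` and `SameConvolve` are hypothetical, and the centering rule (start index = (full size - new size) / 2) is an assumption based on the SciPy code linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Full discrete linear convolution by direct summation (length n + m - 1).
std::vector<double> FullConvolve(const std::vector<double> &a, const std::vector<double> &b) {
    assert(!a.empty() && !b.empty());
    std::vector<double> out(a.size() + b.size() - 1, 0.0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            out[i + j] += a[i] * b[j];
    return out;
}

// Keep the middle new_size elements of vec.
std::vector<double> CenterSlice(const std::vector<double> &vec, std::size_t new_size) {
    assert(new_size <= vec.size());
    const std::size_t start = (vec.size() - new_size) / 2;
    return std::vector<double>(vec.begin() + start, vec.begin() + start + new_size);
}

// 'Same' convolution: the result has the size of vec1.
std::vector<double> SameConvolve(const std::vector<double> &vec1, const std::vector<double> &vec2) {
    return CenterSlice(FullConvolve(vec1, vec2), vec1.size());
}
```

An FFT-based implementation should agree with this sketch up to floating-point rounding; the direct form is only practical for short inputs.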
Framecutter¶
-
class
musher::core
::
Framecutter
¶ This class should be treated like an iterator.
Framecutter framecutter(audio_signal);
for (const std::vector<double> &frame : framecutter) {
    perform_work_on_frame(frame);
}
Public Functions
-
Framecutter
(const std::vector<double> buffer, int frame_size = 1024, int hop_size = 512, bool start_from_center = true, bool last_frame_to_end_of_file = false, double valid_frame_threshold_ratio = 0.)¶ Construct a new Framecutter object.
- Parameters
buffer
: Buffer from which to read data.
frame_size
: Output frame size.
hop_size
: Hop size between frames.
start_from_center
: If true, start from the center of the buffer (zero-centered at -frame_size/2); if false, the first frame starts at time 0 (centered at frame_size/2).
last_frame_to_end_of_file
: Whether the beginning of the last frame should reach the end of file. Only applicable if start_from_center is false.
valid_frame_threshold_ratio
: Frames smaller than this ratio will be discarded; those larger will be zero-padded to a full frame (i.e. a value of 0 will never discard frames and a value of 1 will only keep frames that are of length frame_size).
-
std::vector<double>
operator*
() const¶ Each iteration returns a frame.
- Return
std::vector<double> Cut frame.
-
std::vector<double>
compute
()¶ Computes the actual slicing of the frames, this function is run on each iteration to calculate the next frame.
This function should not be called by the user, it will be called internally while iterating.
- Return
std::vector<double> Sliced frame.
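The basic idea of frame cutting can be sketched as below. This is a simplified illustration under stated assumptions: it always starts at sample 0, never discards short frames, and zero-pads any frame that runs past the end of the buffer. `CutFrames` is a hypothetical helper and does not model start_from_center, last_frame_to_end_of_file or valid_frame_threshold_ratio.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cut buffer into frames of frame_size samples, advancing hop_size samples
// each time. Frames that run past the end of the buffer are zero-padded.
std::vector<std::vector<double>> CutFrames(const std::vector<double> &buffer,
                                           std::size_t frame_size, std::size_t hop_size) {
    assert(frame_size > 0 && hop_size > 0);
    std::vector<std::vector<double>> frames;
    for (std::size_t start = 0; start < buffer.size(); start += hop_size) {
        std::vector<double> frame(frame_size, 0.0);  // zero-filled by default
        for (std::size_t i = 0; i < frame_size && start + i < buffer.size(); ++i)
            frame[i] = buffer[start + i];
        frames.push_back(frame);
    }
    return frames;
}
```

With a hop size smaller than the frame size, consecutive frames overlap, which is the usual setup for short-time spectral analysis.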
-
HPCP¶
-
int
musher::core
::
ArgMax
(const std::vector<double> &input)¶ Get the arg max of a vector.
Checks if the vector is empty first.
- Return
int Arg max
- Parameters
input
: Vector
-
template<typename
T
>
void
musher::core
::
NormalizeInPlace
(std::vector<T> &vec)¶ Normalize a vector so its largest value gets mapped to 1.
If the largest value is zero, the vector isn’t touched.
- Template Parameters
T
:
- Parameters
vec
: Vector to normalize.
-
template<typename
T
>
void
musher::core
::
NormalizeSumInPlace
(std::vector<T> &vec)¶ Normalize a vector so its sum is equal to 1.
The vector is not touched if it contains negative elements or the sum is zero.
- Template Parameters
T
:
- Parameters
vec
: Vector to normalize.
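The two in-place normalizations described above can be sketched as follows. The names `NormalizeMaxSketch` and `NormalizeSumSketch` are hypothetical; the behavior for empty vectors (left untouched) is an assumption, while the zero and negative-element cases follow the descriptions above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Scale vec so that its largest value becomes 1.
template <typename T>
void NormalizeMaxSketch(std::vector<T> &vec) {
    if (vec.empty()) return;  // assumption: nothing to do for an empty vector
    const T max_val = *std::max_element(vec.begin(), vec.end());
    if (max_val == T(0)) return;  // largest value is zero: leave untouched
    for (T &x : vec) x /= max_val;
}

// Scale vec so that its elements sum to 1.
template <typename T>
void NormalizeSumSketch(std::vector<T> &vec) {
    T sum = T(0);
    for (const T &x : vec) {
        if (x < T(0)) return;  // negative element: leave untouched
        sum += x;
    }
    if (sum == T(0)) return;  // zero sum: leave untouched
    for (T &x : vec) x /= sum;
}
```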
-
void
musher::core
::
AddContributionWithWeight
(double freq, double mag_lin, double reference_frequency, double window_size, WeightType weight_type, double harmonic_weight, std::vector<double> &hpcp)¶ Add contribution to the HPCP with weight.
- Parameters
freq
: Frequency [Hz].
mag_lin
: Magnitude.
reference_frequency
: Reference frequency for semitone index calculation, corresponding to A3 [Hz].
window_size
: Size, in semitones, of the window used for the weighting.
weight_type
: Type of weighting function for determining frequency contribution.
harmonic_weight
: Strength/weight of the harmonic.
hpcp
: Harmonic pitch class profile.
-
void
musher::core
::
AddContributionWithoutWeight
(double freq, double mag_lin, double reference_frequency, double harmonic_weight, std::vector<double> &hpcp)¶ Add contribution to the HPCP without weight.
- Parameters
freq
: Frequency [Hz].
mag_lin
: Magnitude.
reference_frequency
: Reference frequency for semitone index calculation, corresponding to A3 [Hz].
harmonic_weight
: Strength/weight of the harmonic.
hpcp
: Harmonic pitch class profile.
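One plausible way to map a frequency onto an HPCP bin without a weighting window is sketched below. It is an illustration under assumptions, not the library's code: the bin is the rounded distance from the reference frequency in units of 1/size of an octave, wrapped into the profile, and the accumulated value is the squared magnitude times the squared harmonic weight. `AddUnweightedContribution` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

void AddUnweightedContribution(double freq, double mag_lin, double reference_frequency,
                               double harmonic_weight, std::vector<double> &hpcp) {
    assert(!hpcp.empty() && reference_frequency > 0.0);
    if (freq <= 0.0) return;  // log2 is undefined for non-positive frequencies
    const int size = static_cast<int>(hpcp.size());
    // Distance from the reference in HPCP bins (size bins per octave).
    const double bin_f = std::log2(freq / reference_frequency) * size;
    int bin = static_cast<int>(std::lround(bin_f)) % size;
    if (bin < 0) bin += size;  // wrap frequencies below the reference
    // Assumption: accumulate energy (squared magnitude, squared weight).
    hpcp[bin] += mag_lin * mag_lin * harmonic_weight * harmonic_weight;
}
```

Because the bin index is taken modulo the profile size, frequencies an octave apart land in the same bin, which is what makes the profile a pitch class profile.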
-
void
musher::core
::
AddContribution
(double freq, double mag_lin, double reference_frequency, double window_size, WeightType weight_type, std::vector<HarmonicPeak> harmonic_peaks, std::vector<double> &hpcp)¶ Adds the magnitude contribution of the given frequency as the tonic semitone.
It also adds its possible contribution as a harmonic of another pitch.
- Parameters
freq
: Frequency [Hz].
mag_lin
: Magnitude.
reference_frequency
: Reference frequency for semitone index calculation, corresponding to A3 [Hz].
window_size
: Size, in semitones, of the window used for the weighting.
weight_type
: Type of weighting function for determining frequency contribution.
harmonic_peaks
: Weighting table of harmonic contribution.
hpcp
: Harmonic pitch class profile.
-
std::vector<HarmonicPeak>
musher::core
::
InitHarmonicContributionTable
(int harmonics)¶ Builds a weighting table of harmonic contribution.
Higher harmonics contribute less and the fundamental frequency has a full harmonic strength of 1.0.
- Return
std::vector<HarmonicPeak> Weighting table of harmonic contribution.
- Parameters
harmonics
: Number of harmonics for frequency contribution, 0 indicates exclusive fundamental frequency contribution.
-
std::vector<double>
musher::core
::
HPCP
(const std::vector<double> &frequencies, const std::vector<double> &magnitudes, unsigned int size = 12, double reference_frequency = 440.0, unsigned int harmonics = 0, bool band_preset = true, double band_split_frequency = 500.0, double min_frequency = 40.0, double max_frequency = 5000.0, std::string _weight_type = "squared cosine", double window_size = 1.0, bool max_shifted = false, bool non_linear = false, std::string _normalized = "unit max")¶ Computes a Harmonic Pitch Class Profile (HPCP) from the spectral peaks of a signal.
HPCP is a k*12 dimensional vector which represents the intensities of the twelve (k==1) semitone pitch classes (corresponding to notes from A to G#), or subdivisions of these (k>1).
- Return
std::vector<double> Resulting harmonic pitch class profile.
- Parameters
frequencies
: Frequencies (positions) of the spectral peaks [Hz].
magnitudes
: Magnitudes (heights) of the spectral peaks.
size
: Size of the output HPCP (must be a positive nonzero multiple of 12).
reference_frequency
: Reference frequency for semitone index calculation, corresponding to A3 [Hz].
harmonics
: Number of harmonics for frequency contribution, 0 indicates exclusive fundamental frequency contribution.
band_preset
: Whether to use a band preset.
band_split_frequency
: Split frequency for low and high bands, not used if band_preset is false [Hz].
min_frequency
: Minimum frequency that contributes to the HPCP [Hz] (the difference between the min and split frequencies must not be less than 200.0 Hz).
max_frequency
: Maximum frequency that contributes to the HPCP [Hz] (the difference between the max and split frequencies must not be less than 200.0 Hz).
_weight_type
: Type of weighting function for determining frequency contribution.
window_size
: Size, in semitones, of the window used for the weighting.
max_shifted
: Whether to shift the HPCP vector so that the maximum peak is at index 0.
non_linear
: Apply non-linear post-processing to the output (use with _normalized=‘unit max’). Boosts values close to 1, decreases values close to 0.
_normalized
: Normalization to apply to the HPCP vector (default ‘unit max’).
-
std::vector<double>
musher::core
::
HPCP
(const std::vector<std::tuple<double, double>> &peaks, unsigned int size = 12, double reference_frequency = 440.0, unsigned int harmonics = 0, bool band_preset = true, double band_split_frequency = 500.0, double min_frequency = 40.0, double max_frequency = 5000.0, std::string _weight_type = "squared cosine", double window_size = 1.0, bool max_shifted = false, bool non_linear = false, std::string _normalized = "unit max")¶ Overloaded function for HPCP that accepts a vector of peaks.
Refer to original HPCP function for more details.
- Return
std::vector<double> Resulting harmonic pitch class profile.
- Parameters
peaks
: Vector of spectral peaks, each peak being a tuple (frequency, magnitude).
size
: Size of the output HPCP (must be a positive nonzero multiple of 12).
reference_frequency
: Reference frequency for semitone index calculation, corresponding to A3 [Hz].
harmonics
: Number of harmonics for frequency contribution, 0 indicates exclusive fundamental frequency contribution.
band_preset
: Whether to use a band preset.
band_split_frequency
: Split frequency for low and high bands, not used if band_preset is false [Hz].
min_frequency
: Minimum frequency that contributes to the HPCP [Hz] (the difference between the min and split frequencies must not be less than 200.0 Hz).
max_frequency
: Maximum frequency that contributes to the HPCP [Hz] (the difference between the max and split frequencies must not be less than 200.0 Hz).
_weight_type
: Type of weighting function for determining frequency contribution.
window_size
: Size, in semitones, of the window used for the weighting.
max_shifted
: Whether to shift the HPCP vector so that the maximum peak is at index 0.
non_linear
: Apply non-linear post-processing to the output (use with _normalized=‘unit max’). Boosts values close to 1, decreases values close to 0.
_normalized
: Normalization to apply to the HPCP vector (default ‘unit max’).
Key¶
-
std::vector<std::vector<double>>
musher::core
::
SelectKeyProfile
(const std::string profile_type)¶ Select a key profile given the type.
About the Key Profiles:
Diatonic - Binary profile with diatonic notes of both modes. Could be useful for ambient music or diatonic music which is not strictly ‘tonal functional’
Tonic Triad - Just the notes of the major and minor chords. Exclusively for testing.
Krumhansl - Reference key profiles after cognitive experiments with users. They should work generally fine for pop music.
Temperley - Key profiles extracted from corpus analysis of euroclassical music. Therefore, they perform best on this repertoire (especially in minor).
Shaath - Profiles based on Krumhansl’s specifically tuned to popular and electronic music.
Noland - Profiles from Bach’s ‘Well Tempered Klavier’.
Edma - Automatic profiles extracted from corpus analysis of electronic dance music [3]. They normally perform better than Shaath’s.
Edmm - Automatic profiles extracted from corpus analysis of electronic dance music and manually tweaked according to heuristic observation. It will report major modes (which are poorly represented in EDM) as minor, but improve performance otherwise [3].
Braw - Profiles obtained by calculating the median profile for each mode from a subset of BeatPort dataset. There is an extra profile obtained from ambiguous tracks that are reported as minor [4].
Bgate - Same as braw but zeroing the 4 less relevant elements of each profile [4].
References:
[1] E. Gómez, “Tonal Description of Polyphonic Audio for Music Content Processing,” INFORMS Journal on Computing, vol. 18, no. 3, pp. 294–304, 2006.
[2] D. Temperley, “What’s key for key? The Krumhansl-Schmuckler key-finding algorithm reconsidered,” Music Perception, vol. 17, no. 1, pp. 65-100, 1999.
[3] Á. Faraldo, E. Gómez, S. Jordà, P. Herrera, “Key Estimation in Electronic Dance Music,” Proceedings of the 38th International Conference on Information Retrieval, pp. 335-347, 2016.
[4] Á. Faraldo, S. Jordà, P. Herrera, “A multi-profile method for key estimation in EDM,” in Audio Engineering Society Conference: 2017 AES International Conference on Semantic Audio, Audio Engineering Society, June 2017.
essentia: https://github.com/MTG/essentia/blob/master/src/algorithms/tonal/key.cpp
- Return
std::vector<std::vector<double>> Key profile
- Parameters
profile_type
: Key profile type.
-
std::vector<double>
musher::core
::
AddContributionHarmonics
(const std::vector<double> &M_chords, const int pitch_class, const double contribution, const int num_harmonics, const double slope)¶ Add contribution harmonics to chords. Each note contributes to the different harmonics:
1. first harmonic f -> i
2. second harmonic 2*f -> i
3. third harmonic 3*f -> i+7
4. fourth harmonic 4*f -> i
..
The contribution is weighted depending on the slope.
- Return
std::vector<double> chords with added contribution harmonics
- Parameters
M_chords
: Chords.
pitch_class
: Pitch class.
contribution
: Harmonic contribution.
num_harmonics
: Number of harmonics that should contribute to the polyphonic profile (1 only considers the fundamental harmonic).
slope
: Value of the slope of the exponential harmonic contribution to the polyphonic profile.
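The harmonic-to-pitch-class mapping listed above (f -> i, 2*f -> i, 3*f -> i+7, 4*f -> i) can be reproduced with a short sketch. This is a simplification under assumptions: each harmonic is rounded to its nearest semitone, whereas an actual implementation may split the weight between neighboring semitones, and the weight is assumed to be multiplied by the slope for every successive harmonic. `AddHarmonicsSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<double> AddHarmonicsSketch(const std::vector<double> &chords, int pitch_class,
                                       double contribution, int num_harmonics, double slope) {
    assert(chords.size() == 12 && pitch_class >= 0 && pitch_class < 12);
    std::vector<double> out = chords;
    double weight = contribution;
    for (int h = 1; h <= num_harmonics; ++h) {
        // Harmonic h lies 12*log2(h) semitones above the fundamental;
        // round to the nearest semitone (a simplification).
        const int offset = static_cast<int>(std::lround(12.0 * std::log2(static_cast<double>(h))));
        out[static_cast<std::size_t>((pitch_class + offset) % 12)] += weight;
        weight *= slope;  // assumption: each higher harmonic is scaled by the slope
    }
    return out;
}
```

With four harmonics, the first, second and fourth fall on the pitch class itself and the third falls seven semitones above it, matching the list above.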
-
std::vector<double>
musher::core
::
AddMajorTriad
(const std::vector<double> &M_chords, const int root, const double contribution, const int num_harmonics, const double slope)¶ Adds the contribution of a chord with root note ‘root’ to its major triad. A major triad includes notes from three different classes of pitch: the root, the major 3rd and perfect 5th. This is the most relaxed, most consonant chord in all of harmony.
- See
http://www.songtrellis.com/directory/1146/chordTypes/majorChordTypes/majorTriad
The three notes of the chord have the same weight.
- Return
std::vector<double> Chords with contribution added to its major triad.
- Parameters
M_chords
: Chords.
root
: Root note.
contribution
: Harmonic contribution.
num_harmonics
: Number of harmonics that should contribute to the polyphonic profile (1 only considers the fundamental harmonic).
slope
: Value of the slope of the exponential harmonic contribution to the polyphonic profile.
-
std::vector<double>
musher::core
::
AddMinorTriad
(const std::vector<double> &M_chords, const int root, const double contribution, const int num_harmonics, const double slope)¶ Adds the contribution of a chord with root note ‘root’ to its minor triad. A minor triad includes notes from three different classes of pitch: the root, the minor 3rd and perfect 5th.
- See
http://www.songtrellis.com/directory/1146/chordTypes/minorChordTypes/minorTriadMi
The three notes of the chord have the same weight.
- Return
std::vector<double> Chords with contribution added to its minor triad.
- Parameters
M_chords
: Chords.
root
: Root note.
contribution
: Harmonic contribution.
num_harmonics
: Number of harmonics that should contribute to the polyphonic profile (1 only considers the fundamental harmonic).
slope
: Value of the slope of the exponential harmonic contribution to the polyphonic profile.
-
std::tuple<std::vector<double>, double, double>
musher::core
::
ResizeProfileToPcpSize
(const unsigned int pcp_size, const std::vector<double> &key_profile)¶ Resizes and interpolates the profiles to fit the pcp size.
- Return
std::tuple<std::vector<double>, double, double> Tuple of (resized profile, mean, standard deviation).
- Parameters
pcp_size
: Number of array elements used to represent a semitone times 12.
key_profile
: Key profile.
-
double
musher::core
::
StandardDeviation
(double mean, const std::vector<double> &vec)¶ Calculate the standard deviation of a vector.
- Return
double Standard deviation.
- Parameters
mean
: Mean (average).
vec
: Vector.
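As an illustration of a standard deviation computed from a precomputed mean, the sketch below uses the population form (dividing by N). Whether the library divides by N or N - 1 is not stated here, so treat that choice as an assumption; `PopulationStdDev` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Population standard deviation of vec around a precomputed mean.
double PopulationStdDev(double mean, const std::vector<double> &vec) {
    assert(!vec.empty());
    double sum_sq = 0.0;
    for (double x : vec) sum_sq += (x - mean) * (x - mean);
    return std::sqrt(sum_sq / static_cast<double>(vec.size()));
}
```

Passing the mean in avoids recomputing it when the caller already needs it, as is the case when a profile's mean and standard deviation are returned together.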
-
KeyOutput
musher::core
::
EstimateKey
(const std::vector<double> &pcp, const bool use_polphony = true, const bool use_three_chords = true, const unsigned int num_harmonics = 4, const double slope = 0.6, const std::string profile_type = "Bgate", const bool use_maj_min = false)¶ Computes key estimate given a pitch class profile (HPCP).
- Return
KeyOutput A struct containing the following:
key: Estimated key, from A to G.
scale: Scale of the key (major or minor).
strength: Strength of the estimated key.
first_to_second_relative_strength: The relative strength difference between the best estimate and second best estimate of the key.
- Parameters
pcp
: The input pitch class profile.
use_polphony
: Enables the use of polyphonic profiles to define key profiles (this includes the contributions from triads as well as pitch harmonics).
use_three_chords
: Consider only the 3 main triad chords of the key (T, D, SD) to build the polyphonic profiles.
num_harmonics
: Number of harmonics that should contribute to the polyphonic profile (1 only considers the fundamental harmonic).
slope
: Value of the slope of the exponential harmonic contribution to the polyphonic profile.
profile_type
: The type of polyphonic profile to use for correlation calculation.
use_maj_min
: Use a third profile called ‘majmin’ for ambiguous tracks [4]. Only available for the edma, bgate and braw profiles.
-
KeyOutput
musher::core
::
DetectKey
(const std::vector<std::vector<double>> &normalized_samples, double sample_rate = 44100., const std::string profile_type = "Bgate", const bool use_polphony = true, const bool use_three_chords = true, const unsigned int num_harmonics = 4, const double slope = 0.6, const bool use_maj_min = false, const unsigned int pcp_size = 36, const int frame_size = 4096, const int hop_size = 512, const std::function<std::vector<double>(const std::vector<double>&)> &window_type_func = BlackmanHarris62dB, unsigned int max_num_peaks = 100, double window_size = .5)¶ Computes key estimate given normalized samples.
- Return
KeyOutput A struct containing the following:
key: Estimated key, from A to G.
scale: Scale of the key (major or minor).
strength: Strength of the estimated key.
first_to_second_relative_strength: The relative strength difference between the best estimate and second best estimate of the key.
- Parameters
normalized_samples
: Normalized samples, either stereo or mono.
sample_rate
: Sampling rate of the audio signal [Hz].
profile_type
: The type of polyphonic profile to use for correlation calculation.
use_polphony
: Enables the use of polyphonic profiles to define key profiles (this includes the contributions from triads as well as pitch harmonics).
use_three_chords
: Consider only the 3 main triad chords of the key (T, D, SD) to build the polyphonic profiles.
num_harmonics
: Number of harmonics that should contribute to the polyphonic profile (1 only considers the fundamental harmonic).
slope
: Value of the slope of the exponential harmonic contribution to the polyphonic profile.
use_maj_min
: Use a third profile called ‘majmin’ for ambiguous tracks [4]. Only available for the edma, bgate and braw profiles.
pcp_size
: Number of array elements used to represent a semitone times 12.
frame_size
: Output frame size.
hop_size
: Hop size between frames.
window_type_func
: The window type function. Examples: BlackmanHarris92dB, BlackmanHarris62dB…
max_num_peaks
: Maximum number of returned peaks (set to 0 to return all peaks).
window_size
: Size, in semitones, of the window used for the weighting.
Mono Mixer¶
-
std::vector<double>
musher::core
::
MonoMixer
(const std::vector<std::vector<double>> &input)¶ Downmixes the signal into a single channel given a stereo signal.
If the signal is already monaural, it is left unchanged.
- Return
std::vector<double> Downmixed audio signal
- Parameters
input
: Stereo or mono audio signal
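A common way to downmix stereo to mono is to average the two channels sample by sample. The sketch below assumes that approach; the library may scale the channels differently. `DownmixSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Downmix a one- or two-channel signal to a single channel.
std::vector<double> DownmixSketch(const std::vector<std::vector<double>> &input) {
    assert(!input.empty() && input.size() <= 2);
    if (input.size() == 1) return input[0];  // already mono: return unchanged
    assert(input[0].size() == input[1].size());
    std::vector<double> mono(input[0].size());
    for (std::size_t i = 0; i < mono.size(); ++i)
        mono[i] = 0.5 * (input[0][i] + input[1][i]);  // assumption: plain average
    return mono;
}
```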
Peak Detect¶
-
std::tuple<double, double>
musher::core
::
QuadraticInterpolation
(double a, double b, double y, int middle_point_index)¶ Interpolate the peak of a parabola given 3 points on the parabola.
α(a) = left point value of parabola
β(b) = middle point value of parabola
γ(y) = right point value of parabola
Interpolated peak location is given in bins (spectral samples) by:
p = 1/2 ((α - γ) / (α - 2β + γ))
The peak magnitude estimate is:
y(p) = β - 1/4(α - γ)p
Smith, J.O. “Quadratic Interpolation of Spectral Peaks”, in Spectral Audio Signal Processing, https://ccrma.stanford.edu/~jos/sasp/Quadratic_Interpolation_Spectral_Peaks.html, online book, 2011 edition, accessed 12/18/2019.
- Return
std::tuple<double, double> Tuple of (location (position) of the peak, peak height estimate).
- Parameters
a
: Left point value of parabola.
b
: Middle point value of parabola.
y
: Right point value of parabola.
middle_point_index
: Position of the middle point in the parabola.
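The two formulas above translate directly into code. In this sketch, `ParabolicPeak` is a hypothetical name, and returning the peak location as middle_point_index + p (rather than the offset p alone) is an assumption based on the parameter description.

```cpp
#include <cassert>
#include <cmath>
#include <tuple>

// Returns (interpolated peak position, interpolated peak height).
std::tuple<double, double> ParabolicPeak(double a, double b, double y, int middle_point_index) {
    const double denom = a - 2.0 * b + y;
    assert(denom != 0.0);  // the three points must not lie on a straight line
    const double p = 0.5 * (a - y) / denom;        // offset from the middle point, in bins
    const double position = middle_point_index + p;
    const double height = b - 0.25 * (a - y) * p;  // peak magnitude estimate
    return std::make_tuple(position, height);
}
```

As a check, sampling the parabola f(x) = 5 - (x - 2.25)^2 at x = 1, 2, 3 gives 3.4375, 4.9375 and 4.4375, and the sketch recovers the true peak at position 2.25 with height 5.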
-
std::vector<std::tuple<double, double>>
musher::core
::
PeakDetect
(const std::vector<double> &inp, double threshold = -1000.0, bool interpolate = true, std::string sort_by = "position", int max_num_peaks = 0, double range = 0., int min_pos = 0, int max_pos = 0)¶ Detects local maxima (peaks) in a vector.
The algorithm finds positive slopes and detects a peak when the slope changes sign and the peak is above the threshold.
- Return
std::vector<std::tuple<double, double>> Vector of peaks, each peak being a tuple (position, height).
- Parameters
inp
: Input vector.
threshold
: Peaks below this given threshold are not outputted.
interpolate
: Enables interpolation.
sort_by
: Ordering type of the outputted peaks (ascending by position or descending by height).
max_num_peaks
: Maximum number of returned peaks (set to 0 to return all peaks).
range
: Input range.
min_pos
: Minimum position of the range to evaluate.
max_pos
: Maximum position of the range to evaluate.
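The core of the detection rule, a rising slope followed by a falling slope at a value above the threshold, can be sketched without interpolation, sorting or range limits. `LocalMaximaSketch` is a hypothetical helper; it ignores the first and last samples and treats plateaus as non-peaks, which are simplifying assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Returns (position, height) for every interior sample that is strictly
// greater than both neighbors and not below the threshold.
std::vector<std::tuple<double, double>> LocalMaximaSketch(const std::vector<double> &inp,
                                                          double threshold) {
    std::vector<std::tuple<double, double>> peaks;
    for (std::size_t i = 1; i + 1 < inp.size(); ++i) {
        const bool rising = inp[i] > inp[i - 1];
        const bool falling = inp[i] > inp[i + 1];
        if (rising && falling && inp[i] >= threshold)
            peaks.emplace_back(static_cast<double>(i), inp[i]);
    }
    return peaks;
}
```

The peaks come out in ascending position order because the input is scanned left to right.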
Spectral Peaks¶
-
std::vector<std::tuple<double, double>>
musher::core
::
SpectralPeaks
(const std::vector<double> &input_spectrum, double threshold = -1000.0, std::string sort_by = "position", unsigned int max_num_peaks = 100, double sample_rate = 44100., int min_pos = 0, int max_pos = 0)¶ Extracts peaks from a spectrum.
Note that the peak algorithm is independent of whether the input is linear or in dB, so the threshold must be adapted to the type of data fed to it. The algorithm relies on the PeakDetect algorithm, which is run with parabolic interpolation [1]. The exactness of the peak search depends heavily on the windowing type. It gives best results with dB input, a Blackman-Harris 92 dB window, and interpolation set to true. According to [1], spectral peak frequencies tend to be about twice as accurate when dB magnitude is used rather than linear magnitude. For further information about the peak detection, see the description of the PeakDetect algorithm.
References: [1] Peak Detection, http://ccrma.stanford.edu/~jos/parshl/Peak_Detection_Steps_3.html
- Return
std::vector<std::tuple<double, double>> Vector of spectral peaks, each peak being a tuple (frequency, magnitude).
- Parameters
input_spectrum
: Input spectrum.
threshold
: Peaks below this given threshold are not outputted.
sort_by
: Ordering type of the outputted peaks (ascending by frequency (position) or descending by magnitude (height)).
max_num_peaks
: Maximum number of returned peaks (set to 0 to return all peaks).
sample_rate
: Sampling rate of the audio signal [Hz].
min_pos
: Minimum frequency (position) of the range to evaluate [Hz].
max_pos
: Maximum frequency (position) of the range to evaluate [Hz].
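Peak positions in a spectrum are bin indices, and reporting them in Hz requires a conversion. The sketch below assumes the spectrum covers 0 Hz to the Nyquist frequency (sample_rate / 2) across its spectrum_size bins; `BinToFrequency` is a hypothetical helper, not a documented function.

```cpp
#include <cassert>
#include <cstddef>

// Convert a (possibly fractional) bin position to a frequency in Hz,
// assuming bin 0 is 0 Hz and the last bin is the Nyquist frequency.
double BinToFrequency(double bin, std::size_t spectrum_size, double sample_rate) {
    assert(spectrum_size > 1);
    return bin * (sample_rate / 2.0) / static_cast<double>(spectrum_size - 1);
}
```

For a 1024-sample frame the spectrum has 513 bins, so at 44100 Hz bin 256 corresponds to 11025 Hz.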
Spectrum¶
-
double
musher::core
::
Magnitude
(const std::complex<double> complex_pair)¶ Calculate the magnitude (absolute value or modulus) of a complex number.
- Return
double The magnitude of a complex number.
- Parameters
complex_pair
: Complex number. Contains 1 real and 1 imaginary number.
-
double
musher::core
::
NormFct
(int inorm, size_t N)¶
-
double
musher::core
::
NormFct
(int inorm, const pocketfft::shape_t &shape, const pocketfft::shape_t &axes, size_t fct, int delta)¶
-
size_t
musher::core
::
NextFastLen
(size_t n)¶ Calculate an efficient length to pad the inputs of the FFT.
Copied from Peter Bell. https://gdoc.pub/doc/e/2PACX-1vR6iXXG1uS9ds47GvDgQk6XtpYzVTtYepu5B8onBrMmoorfKHhnHbN0ArDoXgoA23nZrcrm_DSFMW45
- Return
size_t Efficient FFT input size.
- Parameters
n
: Original input size.
-
std::vector<double>
musher::core
::
ConvertToFrequencySpectrum
(const std::vector<double> &audio_frame)¶ Computes the frequency spectrum of an array of real numbers.
The resulting spectrum has a size which is half the size of the input array plus one. Bins contain raw (linear) magnitude values.
- Return
std::vector<double> Frequency spectrum of the input audio signal.
- Parameters
audio_frame
: Input audio frame.
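The output shape and content described above (N/2 + 1 bins of raw linear magnitudes) can be illustrated with a naive discrete Fourier transform. This sketch is O(N^2) and is not how an FFT-based implementation works internally; `MagnitudeSpectrumSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Magnitude spectrum of a real frame via a naive DFT (bins 0 .. N/2).
std::vector<double> MagnitudeSpectrumSketch(const std::vector<double> &frame) {
    assert(!frame.empty());
    const std::size_t n = frame.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<double> spectrum(n / 2 + 1);
    for (std::size_t k = 0; k < spectrum.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -two_pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += frame[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        spectrum[k] = std::abs(sum);  // raw (linear) magnitude, no scaling
    }
    return spectrum;
}
```

A constant frame puts all of its energy in bin 0, while one cycle of a cosine across the frame puts it in bin 1.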
Utilities¶
-
std::string
musher::core
::
Uint8VectorToHexString
(const std::vector<uint8_t> &v)¶ Convert uint8_t vector to hex string.
- Return
std::string string of hex
- Parameters
v
: Vector of uint8_t.
-
std::string
musher::core
::
StrBetweenSQuotes
(const std::string &s)¶ Get string between two single quotes.
NOTE: There must only be 2 quotes in the entire string.
- Return
string between single quotes
- Parameters
s
: String that contains 2 single quotes
-
bool
musher::core
::
IsBigEndian
(void)¶ Check if the architecture of the machine running the code is big endian.
- Return
true If big endian.
- Return
false If not big endian.
-
template<typename
T
>
std::vector<std::vector<T>>
musher::core
::
Deinterweave
(const std::vector<T> &interweaved_vector)¶ Deinterweave a vector in alternating order to form two vectors.
interweaved_vector = {1, 9, 2, 8, 3, 7, 4, 6}
deinterweaved_vector = { {1, 2, 3, 4}, {9, 8, 7, 6} }
- Return
std::vector<std::vector<T>> Deinterweaved vector.
- Parameters
interweaved_vector
: Interleaved vector.
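The example above can be reproduced with a few lines. `DeinterleaveSketch` is a hypothetical name, and requiring an even number of elements is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split {a0, b0, a1, b1, ...} into {{a0, a1, ...}, {b0, b1, ...}}.
template <typename T>
std::vector<std::vector<T>> DeinterleaveSketch(const std::vector<T> &interleaved) {
    assert(interleaved.size() % 2 == 0);  // assumption: complete pairs only
    std::vector<std::vector<T>> out(2);
    for (std::size_t i = 0; i < interleaved.size(); i += 2) {
        out[0].push_back(interleaved[i]);
        out[1].push_back(interleaved[i + 1]);
    }
    return out;
}
```

This is the layout conversion needed when stereo samples are stored as alternating left and right values.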
-
double
musher::core
::
Median
(std::vector<double> &inVec)¶ Compute the median of a vector.
- Return
double Median.
- Parameters
inVec
: Input vector.
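A median can be sketched as below. `MedianSketch` is a hypothetical name; it takes its argument by value so the caller's vector is not reordered, and it averages the two middle values for even-length input, both of which are assumptions rather than documented behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a non-empty vector (sorted copy; average of the two middle
// values when the length is even).
double MedianSketch(std::vector<double> values) {
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    if (values.size() % 2 == 1) return values[mid];
    return 0.5 * (values[mid - 1] + values[mid]);
}
```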
-
std::vector<double>
musher::core
::
OnePoleFilter
(const std::vector<double> &vec)¶ Compute a one pole filter on an audio signal.
- Return
std::vector<double> Filtered audio signal.
- Parameters
vec
: Audio signal.
Windowing¶
-
std::vector<double>
musher::core
::
Square
(const std::vector<double> &window)¶ Square windowing function.
- Return
std::vector<double> Square window.
- Parameters
window
: Audio signal window.
-
std::vector<double>
musher::core
::
BlackmanHarris
(const std::vector<double> &window, double a0, double a1, double a2, double a3)¶ Blackman-Harris windowing algorithm.
Window functions help control spectral leakage when doing Fourier Analysis.
- Return
std::vector<double> BlackmanHarris window.
- Parameters
window
: Audio signal window.
a0
: Constant a0.
a1
: Constant a1.
a2
: Constant a2.
a3
: Constant a3.
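The four constants correspond to the terms of the generalized cosine window w[n] = a0 - a1 cos(2πn/(N-1)) + a2 cos(4πn/(N-1)) - a3 cos(6πn/(N-1)). The sketch below only generates the window coefficients for a given length; `BlackmanHarrisSketch` is a hypothetical helper with a different signature from the function documented above, and the symmetric N-1 denominator is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generate a 4-term cosine-sum window of the given size.
std::vector<double> BlackmanHarrisSketch(std::size_t size, double a0, double a1,
                                         double a2, double a3) {
    assert(size > 1);
    const double pi = std::acos(-1.0);
    const double step = 2.0 * pi / static_cast<double>(size - 1);
    std::vector<double> window(size);
    for (std::size_t n = 0; n < size; ++n) {
        const double x = step * static_cast<double>(n);
        window[n] = a0 - a1 * std::cos(x) + a2 * std::cos(2.0 * x) - a3 * std::cos(3.0 * x);
    }
    return window;
}
```

With the widely quoted 92 dB coefficients from Harris's paper (0.35875, 0.48829, 0.14128, 0.01168), which are not taken from this page, the window peaks at 1 in the center and falls to 0.00006 at both ends.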
-
std::vector<double>
musher::core
::
BlackmanHarris62dB
(const std::vector<double> &window)¶ Blackman-Harris 62 dB windowing algorithm.
- Return
std::vector<double> Blackman-Harris 62 dB window.
- Parameters
window
: Audio signal window.
-
std::vector<double>
musher::core
::
BlackmanHarris92dB
(const std::vector<double> &window)¶ Blackman-Harris 92 dB windowing algorithm.
- Return
std::vector<double> Blackman-Harris 92 dB window.
- Parameters
window
: Audio signal window.
-
std::vector<double>
musher::core
::
Normalize
(const std::vector<double> &input)¶ Normalize a vector (to have an area of 1) and then scale by a factor of 2.
- Return
std::vector<double> normalized vector.
- Parameters
input
: Input vector.
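Normalizing to an area of 1 and then scaling by 2 amounts to multiplying every element by 2 / sum. A minimal sketch follows; `NormalizeAreaSketch` is a hypothetical name and leaving a zero-sum input unchanged is an assumption.

```cpp
#include <cassert>
#include <vector>

// Scale input so that its elements sum to 2 (area 1, then a factor of 2).
std::vector<double> NormalizeAreaSketch(const std::vector<double> &input) {
    double sum = 0.0;
    for (double x : input) sum += x;
    std::vector<double> out = input;
    if (sum == 0.0) return out;  // assumption: leave a zero-sum input untouched
    const double scale = 2.0 / sum;
    for (double &x : out) x *= scale;
    return out;
}
```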
-
std::vector<double>
musher::core
::
Windowing
(const std::vector<double> &audio_frame, const std::function<std::vector<double>(const std::vector<double>&)> &window_type_func = BlackmanHarris62dB, unsigned zero_padding_size = 0, bool zero_phase = true, bool _normalize = true)¶ Applies windowing to an audio signal.
It optionally applies zero-phase windowing and optionally adds zero-padding. The resulting windowed frame size is equal to the incoming frame size plus the number of padded zeros. By default, the available windows are normalized (to have an area of 1) and then scaled by a factor of 2.
References: [1] F. J. Harris, On the use of windows for harmonic analysis with the discrete Fourier transform, Proceedings of the IEEE, vol. 66, no. 1, pp. 51-83, Jan. 1978 [2] Window function - Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Window_function
- Return
std::vector<double> Windowed audio frame.
- Parameters
audio_frame
: Input audio frame.
window_type_func
: The window type function. Examples: BlackmanHarris92dB, BlackmanHarris62dB…
zero_padding_size
: Size of the zero-padding.
zero_phase
: Enables zero-phase windowing.
_normalize
: Specify whether to normalize windows (to have an area of 1) and then scale by a factor of 2.