Hello everyone. While making music, I've come across a lot of different virtual instruments and effects. One of the most interesting effects is the vocoder, which lets you modulate your voice so that it sounds like, for example, a robot. The vocoder was originally used to compress voice data, and later it came into use in the music industry. Because I had some free time, I decided to write one as an experiment and describe the stages of development in VB6 in detail.
So, let's look at the simplest vocoder scheme.
The signal from the microphone (speech) is fed to a bank of band-pass filters, each of which passes only a narrow part of the frequency band of the speech signal. The more filters there are, the better the speech intelligibility. At the same time, the carrier signal (for example, a sawtooth wave) is passed through an identical filter bank. The filter outputs for the speech signal go to envelope detectors, which control the modulators, while the filter outputs for the carrier go to the other input of each modulator. As a result, each band of the speech signal sets the level of the corresponding band of the carrier (modulates it). Finally, the output signals of all the modulators are mixed and sent to the output. To improve intelligibility, additional blocks are sometimes used as well, such as a detector of sibilant ("hissing") sounds.
To begin development, we need to decide where the source signals will come from. We could, for example, read the data from a file or process it in real time from a microphone or line input. A file is very convenient for testing, so we will support both. As the carrier we will use an external file played in a loop; to adjust the tone we simply add the ability to change the playback speed, which changes the pitch. To read audio from a file we will use the Audio Compression Manager (ACM), which makes conversion between formats very convenient (the file can be in any format, and otherwise we would have to write separate functions for different formats). It may happen that no suitable ACM driver is installed for converting to the desired format; in that case the file cannot be played (although you could try to do the conversion in two stages). As input files we will use wav files, because the system provides special functions that make it easy to retrieve data from them.
Let's examine the code in detail. The ReadWaveFile method opens a file; it takes the name of the wav file as its argument. A wav file is a RIFF file, which in turn is composed of blocks called chunks. We open the file with mmioOpen, which returns a handle that can be used with the functions for working with RIFF files. On success, we look for a RIFF chunk of type WAVE by calling mmioDescend, which fills an MMCKINFO structure with information about the chunk if it is found. A chunk is identified by a FOURCC, which is four ASCII characters packed into a 32-bit number (a Long in this case). We pass NULL as the parent chunk, because at the top level there is no parent chunk, and MMIO_FINDRIFF as the flag, which searches for a RIFF chunk with the given type (WAVE here). If mmioDescend succeeds, the file is a WAVE file and we can go on to get the data format. The format is stored in the fmt chunk, which is nested inside the WAVE chunk. For this chunk we call mmioDescend again, this time passing the WAVE chunk we just found as the parent and MMIO_FINDCHUNK as the flag, which searches for the specified chunk. If that succeeds, we check the size of the chunk, which must match the size of the WAVEFORMATEX structure, and if all is well we read the chunk data (which is a WAVEFORMATEX structure) by calling mmioRead. Next we need to make sure that ACM can convert data from this format to the one we want. To do this, we call acmStreamOpen with the ACM_STREAMOPENF_QUERY flag, which only asks whether ACM can convert between the two formats. If it can, we continue parsing. We are now inside the fmt chunk, so we need to go back up to the WAVE chunk in order to look for the data chunk. To do this, we call mmioAscend. Then we repeat the same sequence of steps for the data chunk, which holds the audio data itself in the format described by the fmt chunk.
The data is read into a buffer, the index of the current position in the data array (bufIdx) is reset to zero, and the structure describing the source format is filled in. The output format is set with the SetFormat method, which checks whether conversion to that format is possible if a file has already been opened. The main function of the clsTrickWavConverter class is Convert, which converts the data in the buffer, starting at offset bufIdx, to the required format. Let's see how it works. On the first call the conversion stream is not yet open (the mInit variable records whether the stream has been initialized), so we call the Init method, which opens the conversion stream with acmStreamOpen. The first parameter is a pointer to the stream handle (hStream); on success the function stores there the handle that we will use for the conversion. Once the stream is initialized, we determine how much source data needs to be converted. Because the caller passes a pointer to a buffer and its length in bytes, we must fill the buffer correctly without writing past its end. To do this we call acmStreamSize with the ACM_STREAMSIZEF_DESTINATION flag, which returns the size in bytes of the source data that corresponds to the given size of the output buffer. Then we clamp this size so that we do not read past the end of the source buffer, since the source file may be too short or we may be reading near the end of the buffer. Next we fill in the ACMSTREAMHEADER structure, which describes the conversion, and prepare it with acmStreamPrepareHeader. After that we call acmStreamConvert, which performs the conversion. The ACM_STREAMCONVERTF_BLOCKALIGN flag specifies that only whole blocks are converted; the block size here is mInpFmt.nBlockAlign. After the conversion we unprepare the header with acmStreamUnprepareHeader, return the number of bytes written to the output, and advance the position in the source buffer by the number of bytes consumed.
For capture and playback of audio we use the clsTrickSound class, which works with sound through winmm.
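To make the chunk layout more concrete, here is a small sketch of the same walk over a RIFF file. It is written in Python rather than VB6 and does not use the mmio functions at all; the helper names `fourcc`, `find_chunk` and `make_wav` are made up for this illustration, and the file is built in memory just to have something to parse.

```python
import struct

def fourcc(tag: str) -> int:
    """Pack 4 ASCII characters into a little-endian 32-bit number (a VB6 Long)."""
    return struct.unpack("<I", tag.encode("ascii"))[0]

def find_chunk(data: bytes, start: int, end: int, tag: str):
    """Scan sibling chunks in data[start:end]; return (payload_offset, size) or None."""
    pos = start
    while pos + 8 <= end:
        ck_id, ck_size = struct.unpack_from("<4sI", data, pos)
        if ck_id == tag.encode("ascii"):
            return pos + 8, ck_size
        pos += 8 + ck_size + (ck_size & 1)   # chunks are padded to an even size
    return None

def make_wav(samples, rate=44100):
    """Build a minimal 16-bit mono PCM WAVE file in memory (test data only)."""
    pcm = struct.pack("<%dh" % len(samples), *samples)
    fmt = struct.pack("<HHIIHH", 1, 1, rate, rate * 2, 2, 16)  # PCM, mono, 16 bit
    body = (b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(pcm)) + pcm)
    return b"RIFF" + struct.pack("<I", len(body)) + body

wav = make_wav([0, 1000, -1000, 0])

# Roughly what mmioDescend with MMIO_FINDRIFF checks: RIFF header plus form type.
riff_id, riff_size, form = struct.unpack_from("<4sI4s", wav, 0)
assert riff_id == b"RIFF" and form == b"WAVE"

# Roughly what mmioDescend with MMIO_FINDCHUNK does for "fmt " and then "data".
fmt_off, fmt_size = find_chunk(wav, 12, 8 + riff_size, "fmt ")
tag, channels, rate, avg_bytes, block_align, bits = struct.unpack_from("<HHIIHH", wav, fmt_off)
data_off, data_size = find_chunk(wav, 12, 8 + riff_size, "data")

print(hex(fourcc("WAVE")), channels, rate, bits, data_size)
```

Both chunks are searched from the same starting offset (just after the WAVE form type), which corresponds to ascending out of the fmt chunk before looking for the data chunk.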
' // Start the capture/playback
Public Function StartProcess() As Boolean
    Dim ret As MMRESULT
    If mActive And Not paused Then Exit Function
    If Not Init Then
        err.Raise Errors.ERROR_OBJECT_FAILED
        Exit Function
    End If
    If Not unavailable Then
        err.Raise Errors.NOT_INITIALIZE
        Exit Function
    End If
    If hWaveIn Then
        ret = waveInStart(hWaveIn)
        If ret Then
            err.Raise ERROR_STARTUP Or ret
            Exit Function
        End If
    Else
        Dim idx As Long
        If paused Then
            ret = waveOutRestart(hWaveOut)
            If ret Then
                err.Raise ERROR_STARTUP Or ret
                Exit Function
            End If
            paused = False
        Else
            For idx = 0 To bufCount - 1
                RaiseEvent NewData(Buffers(idx).Header.lpData, UBound(Buffers(idx).data) + 1)
                ret = waveOutWrite(hWaveOut, Buffers(idx).Header, Len(Buffers(idx).Header))
                If ret Then
                    err.Raise ERROR_STARTUP Or ret
                    Exit Function
                End If
            Next
        End If
    End If
    StartProcess = True
    mActive = True
End Function

' // Pause playback
Public Function PauseProcess() As Boolean
    Dim ret As MMRESULT
    If Not Init Then
        err.Raise Errors.ERROR_OBJECT_FAILED
        Exit Function
    End If
    If Not unavailable Then
        err.Raise Errors.NOT_INITIALIZE
        Exit Function
    End If
    If Not mActive Then Exit Function
    If hWaveOut Then
        paused = True
        waveOutPause hWaveOut
        mActive = False
        PauseProcess = True
    End If
End Function

' // Stop playback/capture
Public Function StopProcess() As Boolean
    Dim ret As Long
    If Not Init Then
        err.Raise Errors.ERROR_OBJECT_FAILED
        Exit Function
    End If
    If Not unavailable Then
        err.Raise Errors.NOT_INITIALIZE
        Exit Function
    End If
    If Not mActive Then Exit Function
    If hWaveIn Then
        ret = waveInStop(hWaveIn)
        If ret Then
            err.Raise ERROR_STOP Or ret
            Exit Function
        End If
    Else
        ret = waveOutReset(hWaveOut)
        If ret Then
            err.Raise ERROR_STOP Or ret
            Exit Function
        End If
    End If
    mActive = False
    paused = False
    StopProcess = True
End Function
' // Playback initialization
Public Function InitPlayback(ByVal NumOfChannels As Integer, _
                             ByVal SamplesPerSec As Long, _
                             ByVal BitsPerSample As Integer, _
                             ByVal BufferSampleCount As Long, _
                             Optional ByVal DeviceID As Long = WAVE_MAPPER, _
                             Optional ByVal BuffersCount As Byte = 4) As Boolean
    Dim ret As MMRESULT
    Dim idx As Long
    If Not Init Then
        err.Raise Errors.ERROR_OBJECT_FAILED
        Exit Function
    End If
    If unavailable Then
        err.Raise Errors.ERROR_UNAVAILABLE
        Exit Function
    End If
    If BuffersCount < 1 Then
        err.Raise Errors.INVALID_BUFFERS_COUNT
        Exit Function
    End If
    unavailable = True
    With mFormat
        .cbSize = 0
        .wFormatTag = WAVE_FORMAT_PCM
        .wBitsPerSample = BitsPerSample
        .nSamplesPerSec = SamplesPerSec
        .nChannels = NumOfChannels
        .nBlockAlign = .nChannels * .wBitsPerSample \ 8
        .nAvgBytesPerSec = .nSamplesPerSec * .nBlockAlign
    End With
    mSmpCount = BufferSampleCount - (BufferSampleCount Mod mFormat.nBlockAlign)
    ret = waveOutOpen(hWaveOut, DeviceID, mFormat, hwnd, 0, CALLBACK_WINDOW)
    If ret Then
        err.Raise ERROR_OPEN_DEVICE Or ret
        Exit Function
    End If
    bufCount = BuffersCount
    ReDim Buffers(BuffersCount - 1)
    For idx = 0 To BuffersCount - 1
        With Buffers(idx)
            ReDim .data(mSmpCount * mFormat.nBlockAlign - 1)
            .Header.lpData = VarPtr(.data(0))
            .Header.dwBufferLength = UBound(.data) + 1
            .Header.dwFlags = 0
            .Header.dwLoops = 0
            ret = waveOutPrepareHeader(hWaveOut, .Header, Len(.Header))
            .Status = ret = MMSYSERR_NOERROR
        End With
        If ret Then
            Clear
            err.Raise ERROR_PREPARE_BUFFERS Or ret
            Exit Function
        End If
    Next
    InitPlayback = True
End Function

' // Capture initialization
Public Function InitCapture(ByVal NumOfChannels As Integer, _
                            ByVal SamplesPerSec As Long, _
                            ByVal BitsPerSample As Integer, _
                            ByVal BufferSampleCount As Long, _
                            Optional ByVal DeviceID As Long = WAVE_MAPPER, _
                            Optional ByVal BuffersCount As Byte = 4) As Boolean
    Dim ret As MMRESULT
    Dim idx As Long
    If Not Init Then
        err.Raise Errors.ERROR_OBJECT_FAILED
        Exit Function
    End If
    If unavailable Then
        err.Raise Errors.ERROR_UNAVAILABLE
        Exit Function
    End If
    If BuffersCount < 1 Then
        err.Raise Errors.INVALID_BUFFERS_COUNT
        Exit Function
    End If
    unavailable = True
    With mFormat
        .cbSize = 0
        .wFormatTag = WAVE_FORMAT_PCM
        .wBitsPerSample = BitsPerSample
        .nSamplesPerSec = SamplesPerSec
        .nChannels = NumOfChannels
        .nBlockAlign = .nChannels * .wBitsPerSample \ 8
        .nAvgBytesPerSec = .nSamplesPerSec * .nBlockAlign
    End With
    mSmpCount = BufferSampleCount - (BufferSampleCount Mod mFormat.nBlockAlign)
    ret = waveInOpen(hWaveIn, DeviceID, mFormat, hwnd, 0, CALLBACK_WINDOW)
    If ret Then
        err.Raise ERROR_OPEN_DEVICE Or ret
        Exit Function
    End If
    bufCount = BuffersCount
    ReDim Buffers(BuffersCount - 1)
    For idx = 0 To BuffersCount - 1
        With Buffers(idx)
            ReDim .data(mSmpCount * mFormat.nBlockAlign - 1)
            .Header.lpData = VarPtr(.data(0))
            .Header.dwBufferLength = UBound(.data) + 1
            .Header.dwFlags = 0
            .Header.dwLoops = 0
            ret = waveInPrepareHeader(hWaveIn, .Header, Len(.Header))
            .Status = ret = MMSYSERR_NOERROR
        End With
        If ret Then
            Clear
            err.Raise ERROR_PREPARE_BUFFERS Or ret
            Exit Function
        End If
    Next
    For idx = 0 To BuffersCount - 1
        ret = waveInAddBuffer(hWaveIn, Buffers(idx).Header, Len(Buffers(idx).Header))
        If ret Then
            Clear
            err.Raise ERROR_PREPARE_BUFFERS Or ret
            Exit Function
        End If
    Next
    InitCapture = True
End Function
' // ------------------------------------------------------------------------------------------------------------
Private Function WndProc(ByVal hwnd As Long, ByVal Msg As Long, ByVal wParam As Long, ByVal lParam As Long) As Long
    Dim idx As Long
    Dim hdr As WAVEHDR
    If unavailable Then
        Select Case Msg
        Case MM_WIM_DATA
            memcpy hdr, ByVal lParam, Len(hdr)
            idx = GetBufferIndex(hdr.lpData)
            If idx = -1 Then Exit Function
            RaiseEvent NewData(hdr.lpData, mSmpCount * mFormat.nBlockAlign)
            waveInAddBuffer hWaveIn, Buffers(idx).Header, Len(Buffers(idx).Header)
            Exit Function
        Case MM_WOM_DONE
            memcpy hdr, ByVal lParam, Len(hdr)
            idx = GetBufferIndex(hdr.lpData)
            If idx = -1 Then Exit Function
            RaiseEvent NewData(hdr.lpData, mSmpCount * mFormat.nBlockAlign)
            waveOutWrite hWaveOut, Buffers(idx).Header, Len(Buffers(idx).Header)
            Exit Function
        End Select
    End If
    WndProc = DefWindowProc(hwnd, Msg, wParam, lParam)
End Function
Private Function CreateAsm() As Boolean
    Dim inIDE As Boolean
    Dim AsmSize As Long
    Dim ptr As Long
    Dim isFirst As Boolean
    Debug.Assert MakeTrue(inIDE)
    If lpAsm = 0 Then
        If inIDE Then AsmSize = &H2C Else AsmSize = &H20
        hHeap = GetPrevHeap()
        If hHeap = 0 Then
            hHeap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE Or HEAP_NO_SERIALIZE, 0, 0)
            If hHeap = 0 Then err.Raise 7: Exit Function
            If Not SaveCurHeap() Then HeapDestroy hHeap: hHeap = 0: err.Raise 7: Exit Function
            isFirst = True
        End If
        lpAsm = HeapAlloc(hHeap, HEAP_NO_SERIALIZE Or HEAP_ZERO_MEMORY, AsmSize)
        If lpAsm = 0 Then
            If isFirst Then HeapDestroy hHeap
            hHeap = 0
            err.Raise 7
            Exit Function
        End If
    End If
    ptr = lpAsm
    If inIDE Then
        CreateIDEStub (ptr): ptr = ptr + &HD
    End If
    CreateStackConv ptr
    CreateAsm = True
End Function

Private Function SaveCurHeap() As Boolean
    Dim i As Long
    Dim out As String
    out = Hex(hHeap)
    For i = Len(out) + 1 To 8: out = "0" & out: Next
    SaveCurHeap = SetEnvironmentVariable(StrPtr(SndClass), StrPtr(out))
End Function

Private Function GetPrevHeap() As Long
    Dim out As String
    out = Space(&H8)
    If GetEnvironmentVariable(StrPtr(SndClass), StrPtr(out), LenB(out)) Then GetPrevHeap = Val("&H" & out)
End Function

Private Function CreateStackConv(ByVal ptr As Long) As Boolean
    Dim lpMeth As Long
    Dim vTable As Long
    GetMem4 ByVal ObjPtr(Me), vTable
    GetMem4 ByVal vTable + WNDPROCINDEX * 4 + &H1C, lpMeth
    GetMem4 &H5450C031, ByVal ptr + &H0: GetMem4 &H488DE409, ByVal ptr + &H4: GetMem4 &H2474FF04, ByVal ptr + &H8
    GetMem4 &H68FAE018, ByVal ptr + &HC: GetMem4 &H12345678, ByVal ptr + &H10: GetMem4 &HFFFFDAE8, ByVal ptr + &H14
    GetMem4 &H10C258FF, ByVal ptr + &H18: GetMem4 &H0, ByVal ptr + &H1C
    GetMem4 ObjPtr(Me), ByVal ptr + &H10 ' Push Me
    GetMem4 lpMeth - (ptr + &H14) - 5, ByVal ptr + &H14 + 1 ' Call WndProc
End Function

Private Function CreateIDEStub(ByVal ptr As Long) As Boolean
    Dim hInstVB6 As Long
    Dim lpEbMode As Long
    Dim hInstUser32 As Long
    Dim lpDefProc As Long
    hInstVB6 = GetModuleHandle(StrPtr("vba6"))
    If hInstVB6 = 0 Then Exit Function
    hInstUser32 = GetModuleHandle(StrPtr("user32"))
    If hInstUser32 = 0 Then Exit Function
    lpEbMode = GetProcAddress(hInstVB6, "EbMode")
    If lpEbMode = 0 Then Exit Function
    lpDefProc = GetProcAddress(hInstUser32, "DefWindowProcW")
    If lpDefProc = 0 Then Exit Function
    GetMem4 &HFFFFFBE8, ByVal ptr + &H0: GetMem4 &HFC8FEFF, ByVal ptr + &H4
    GetMem4 &H34566B85, ByVal ptr + &H8: GetMem4 &H12, ByVal ptr + &HC
    GetMem4 lpEbMode - ptr - 5, ByVal ptr + 1 + 0 ' Call EbMode
    GetMem4 lpDefProc - (ptr + &HD), ByVal ptr + &H9 ' JNE DefWindowProcW
    CreateIDEStub = True
End Function

Private Function MakeTrue(Value As Boolean) As Boolean
    Value = True
    MakeTrue = True
End Function
Private Sub Clear()
    Dim idx As Long
    unavailable = False
    If hWaveIn Then
        waveInReset hWaveIn
        For idx = 0 To bufCount - 1
            If Buffers(idx).Status Then
                waveInUnprepareHeader hWaveIn, Buffers(idx).Header, Len(Buffers(idx).Header)
            End If
        Next
        waveInClose hWaveIn
    Else
        waveOutReset hWaveOut
        For idx = 0 To bufCount - 1
            If Buffers(idx).Status Then
                waveOutUnprepareHeader hWaveOut, Buffers(idx).Header, Len(Buffers(idx).Header)
            End If
        Next
        waveOutClose hWaveOut
    End If
    hWaveIn = 0
    hWaveOut = 0
    paused = False
    mActive = False
    bufCount = 0
    Erase Buffers()
    ZeroMemory mFormat, Len(mFormat)
End Sub

Private Function GetBufferIndex(ByVal ptr As Long) As Long
    Dim idx As Long
    For idx = 0 To UBound(Buffers)
        If Buffers(idx).Header.lpData = ptr Then
            GetBufferIndex = idx
            Exit Function
        End If
    Next
    GetBufferIndex = -1
End Function

Private Sub Class_Initialize()
    Dim cls As WNDCLASSEX
    Dim hUser As Long
    cls.cbSize = Len(cls)
    If GetClassInfoEx(App.hInstance, StrPtr(SndClass), cls) = 0 Then
        hUser = GetModuleHandle(StrPtr("user32"))
        If hUser = 0 Then Exit Sub
        cls.hInstance = App.hInstance
        cls.lpfnwndproc = GetProcAddress(hUser, "DefWindowProcW")
        cls.lpszClassName = StrPtr(SndClass)
        If RegisterClassEx(cls) = 0 Then Exit Sub
    End If
    If Not CreateAsm() Then Exit Sub
    hwnd = CreateWindowEx(0, StrPtr(SndClass), 0, 0, 0, 0, 0, 0, HWND_MESSAGE, 0, App.hInstance, ByVal 0&)
    If hwnd = 0 Then Exit Sub
    SetWindowLong hwnd, GWL_WNDPROC, lpAsm
    Init = True
End Sub

Private Sub Class_Terminate()
    If Not Init Then Exit Sub
    Clear
    DestroyWindow hwnd
    UnregisterClass StrPtr(SndClass), App.hInstance
    If hHeap = 0 Then Exit Sub
    HeapFree hHeap, HEAP_NO_SERIALIZE, ByVal lpAsm
End Sub
Last edited by The trick; Apr 5th, 2016 at 04:51 AM.
Reason: Translation
I won't go into working with winmm in detail; I will only say that window messages are used for notifications. For each instance we create a window and pass it to the wave functions as the target for notification messages, and with the help of a small inline-assembly stub, installed as the window procedure, we handle those messages in a dedicated method of the class. I also added an EbMode check there, so that it does not behave like DirectSound, where you cannot set a normal breakpoint while a circular buffer is in use. The class raises the NewData event when it needs the next portion of audio data during playback, and when a buffer has been filled during capture. Playback is initialized with the InitPlayback method, which opens the playback device (DeviceID) with a given size and number of buffers in the queue. The list of devices is available through the PlaybackDevices property, a collection of playback devices. The device index (starting from 0) corresponds to the DeviceID. To let the system choose a default device for the given format, pass the constant WAVE_MAPPER. Capture is initialized in the same way with the InitCapture method; the list of capture devices is available through CaptureDevices. The StartProcess and StopProcess methods start and stop playback or recording, and PauseProcess pauses playback. The purpose of the remaining properties is clear from the comments in the code.
So, we now have the source and modulating signals. The next step is filtering. There are several ways to go: use a bank of filters (IIR or FIR), use the FFT (fast Fourier transform), or use a wavelet transform. For our implementation we take the windowed Fourier transform, because designing IIR filters is quite a complex task, and FIR filters are not very efficient computationally. (Frankly, my first implementation used 2nd-order IIR Butterworth filters, but I was not satisfied with the quality or the CPU load.) With the FFT everything turns out fairly simple. We decompose the speech signal into harmonics, where each element of the resulting vector carries information about a particular frequency (the result is something like a large number of band-pass filters). We also decompose the carrier signal and perform the modulation. Finally, we do the inverse Fourier transform and obtain the desired signal. So the FFT does two jobs at once: it splits the signal into frequency bands (see the diagram), and the IFFT mixes the bands back together. For our task we make the number of frequency bands adjustable, which lets you tune the timbre. For the FFT and its supporting routines I wrote the class clsTrickFFT.
The FFT method performs the transform; for the inverse transform, pass True as the second parameter. Complex numbers are stored in an array arr(1, x), where x is the index of the complex number: arr(0, x) is the real part and arr(1, x) is the imaginary part. I won't dwell on the FFT itself, because it is a very big topic, and anyone interested can find many articles online that explain its meaning and properties in accessible language; I will cover only the highlights. To transform a real signal, you put it into an array of complex numbers and zero the imaginary parts (in fact, using the properties of the FFT, you can speed things up by putting one part of the signal in the real parts and another in the imaginary parts, but I did not complicate things that way). After the transform we get a set of complex coefficients whose real parts are the coefficients of the cosines and whose imaginary parts are the coefficients of the sines. If you picture a coefficient on the complex plane, it is a vector whose length gives the amplitude of the signal at that frequency and whose angle gives the phase.
There is also a mirror effect (like moiré): the coefficients are mirrored about half the sampling frequency, equal in amplitude and opposite in phase. This happens because a sampled signal can represent frequencies correctly only up to half the sampling frequency; above that, aliasing occurs.
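As a quick numeric illustration of the "length is amplitude, angle is phase" reading, here is a short sketch in Python (not VB6, and not using clsTrickFFT). It computes a single coefficient directly by correlating a test signal with a cosine and a sine; the values of `N`, `k`, `amp` and `phase` are arbitrary choices for the example.

```python
import cmath
import math

N = 64                      # transform size (chosen for the example)
k = 5                       # index of the frequency bin being inspected
amp, phase = 0.8, math.pi / 3

# Real test signal: a cosine that fits exactly k periods into the window.
x = [amp * math.cos(2 * math.pi * k * n / N + phase) for n in range(N)]

# One DFT coefficient computed directly: the real part correlates the signal
# with a cosine, the imaginary part with a (negated) sine.
re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
im = -sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
coef = complex(re, im)

# Length of the vector -> amplitude, angle of the vector -> phase.
length, angle = cmath.polar(coef)
measured_amp = 2 * length / N     # 2/N undoes the scaling of the unnormalised sum
print(round(measured_amp, 6), round(angle, 6))
```

The recovered amplitude and phase match the ones used to build the signal, which is all the modulator needs from each coefficient.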
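The mirror property is easy to check numerically. The sketch below is Python rather than VB6 and uses a naive DFT instead of a real FFT; `dft` and `mirror_upper_half` are names invented for the example. It confirms that for a real signal the coefficient at N-k is the complex conjugate of the one at k, and that the upper half of the spectrum can therefore be rebuilt from the lower half.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT, enough for a demonstration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def mirror_upper_half(spec):
    """Rebuild the upper half of a spectrum as the complex conjugate of the lower half."""
    n = len(spec)
    out = list(spec)
    for k in range(1, n // 2):
        out[n - k] = spec[k].conjugate()
    return out

N = 16
x = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
     for t in range(N)]
X = dft(x)

# Mirror property of a real signal: X[N-k] equals the conjugate of X[k].
max_err = max(abs(X[N - k] - X[k].conjugate()) for k in range(1, N // 2))

# Wipe the upper half, then restore it from the lower half only.
half = X[:N // 2 + 1] + [0j] * (N // 2 - 1)
restored = mirror_upper_half(half)
rebuild_err = max(abs(a - b) for a, b in zip(restored, X))
print(max_err < 1e-9, rebuild_err < 1e-9)
```

This is why only the first half of the spectrum has to be processed before the inverse transform.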
In the illustration, the red sinusoid initially has a period equal to two sampling periods. As its frequency rises further, the frequency of the sampled signal falls, and when the frequency of the sinusoid equals the sampling frequency, the frequency of the sampled signal becomes 0 Hz. Because of this, the Fourier coefficients are mirrored about half the sampling frequency. Therefore, when working with the spectrum we can process only its first half and, before the IFFT, simply copy it into the second half in mirror order, taking the complex conjugate (that is, multiplying the imaginary parts by -1). The MakeMirror method does this.
When we modulate the signal, distortions appear, because when we transform a section of the signal we treat that section as one period that repeats indefinitely on both sides of the window. If we change anything in the spectrum, the signal may no longer match at the edges of the window, and discontinuities (in our case, clicks) will occur. To prevent this, the signal is multiplied by a window function that gradually reduces the amplitude toward the edges, and the blocks are taken with overlap. Because we do not need high sound quality, we will not apply the window before the transform (although we really should, because without it the frequencies are smeared); instead we transform the raw signal "head-on" and apply the window function only to the result of the IFFT. We will take blocks with 50% overlap, which sounds acceptable and is fast enough.
We take the original signal twice, with a shift, so that the second pass starts at the second half of the first block. After processing, we mix the two signals where they overlap and output the first part; the second half will later be mixed with the next block. As the window we will use the Hann window. The ApplyWindow method applies it.
As mentioned above, for the FFT processing we need to take the input data with overlap and also write the output data with overlap. To do this, I wrote a special class (clsTrickOverlappedBuffer), which hands out the data with the overlap taken into account.
The Init method initializes the internal buffers. The WriteInputData method writes data to the internal input buffer. We use it to write both the captured signal and the carrier signal. The WriteOutputData method mixes the data passed to it into the internal buffer with the data left over from the previous call. We use it to write the processed, already modulated signal. GetInputBuffer and GetOutputBuffer fill the supplied buffer with data, taking the overlap into account: GetInputBuffer returns data written by WriteInputData, and GetOutputBuffer returns data written by WriteOutputData. Now let's look at the modulator class itself, clsTrickModulator, which does the actual transformation of the spectrum.
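The 50% overlap scheme can be tried out in a few lines. This is a Python sketch, not the VB6 classes; `hann` and `overlap_add` are names made up for the illustration, and the spectral processing is replaced by a pass-through so that only the windowing and mixing are visible. As in the text, the window is applied only after the (pretend) processing.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def overlap_add(signal, size):
    """Cut `signal` into blocks of `size` with 50% overlap, window each block
    after (pretend) processing, and mix the overlapping halves back together."""
    hop = size // 2
    win = hann(size)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - size + 1, hop):
        block = signal[start:start + size]      # raw block, no window before the transform
        processed = block                       # FFT -> modify spectrum -> IFFT would go here
        for i in range(size):
            out[start + i] += processed[i] * win[i]
    return out

size = 8
signal = [math.sin(0.3 * n) for n in range(64)]
out = overlap_add(signal, size)

# Away from the first and last half-block every sample is covered by two
# windows whose weights add up to 1, so the signal passes through unchanged.
hop = size // 2
err = max(abs(out[n] - signal[n]) for n in range(hop, len(signal) - hop))
print(err < 1e-12)
```

The point of the Hann window here is that two copies shifted by half a block sum to one, so the mixing itself adds no level change; any remaining artifacts come from what is done to the spectrum.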
The class has a Volume property, which sets the output volume level. The Bands property specifies the number of bands into which the spectrum is divided for modulation. For example, at a sampling rate of 44100 Hz and an FFT size of 2048, the frequency resolution is 44100/2048 ≈ 21.53 Hz. With 64 frequency bands, each band covers 2048/2/64 = 16 spectral samples (about 344.5 Hz). The DryWet property sets the balance between the original signal and the output of the modulator. The SetLevels method sets an array of frequency-response coefficients by which the signal is multiplied. This equalizes the signal and improves the sound quality after processing. The main method is Process, which does the actual processing; let's go through it in detail. First we calculate the number of spectral samples per band from the Bands property, and then the gain of the output signal as a function of the number of bands; this formula was found experimentally. Then we go through the bands of the speech (modulating) signal and, from the coefficients belonging to each band, calculate the energy of that band. I wrote earlier that the amplitude of a spectral component is the length of its vector, so we simply sum the lengths of the vectors for the frequencies in the band; this sum is the energy in that frequency range. Next, for the same spectral samples of the carrier signal, we scale the signal according to the calculated energy and, in the same step, apply the output level and the equalization. Multiplying both components of a complex number (real and imaginary) by the energy value scales it. With all these manipulations we modulate the carrier signal with the speech, which is what we wanted.
So all the components are ready. Now we just need to assemble everything and test it. For the user interface I developed several controls specifically for the vocoder. I won't describe how each one works and was developed, because that would take a lot of time; I will just say a few words about each. ctlTrickKnob is a knob control, something as simple as a potentiometer. In essence it is an ordinary control similar to a Slider, only circular. ctlTrickCommand is a normal button with icon support, added only for appearance. ctlTrickEqualizer is the most interesting control. It lets you adjust the frequency response of the signal. Its panel has a logarithmic scale for both frequency and level, which makes changes to the parameters feel more natural to the ear. To add a point to the response curve, click the left mouse button on an empty spot; to remove a point, click it with the right button. When the frequency response changes, the control raises the Change event. All the controls were designed only for the vocoder, so their functionality is minimal.
Now we drop all of this onto the form and write the code.
When the form loads, we initialize all the components: capture, playback, the FFT size, the amount of overlap, the overlapped buffers, and the buffers for integer and complex data. Next, I give the form rounded corners, since I use a window without a frame (I had no desire to draw in the non-client area). Now the whole task comes down to handling two events, AudioPlayback_NewData and AudioCapture_NewData. The first occurs when the playback device needs the next portion of audio data; the second occurs when a capture buffer is full, and in it we simply copy the data into a temporary buffer, from which it is taken while handling AudioPlayback_NewData. The main method is Process, where the conversion actually happens. First we check whether we are capturing from a file or from a device. To do this, we check the variable mInpFile, which holds the name of the input file. If we are capturing from a file, we use the inpConv object, an instance of clsTrickWavConverter, to convert the data to the format we need. If the data has run out (the number of bytes read does not match the number requested), we have reached the end of the file and must start again from the beginning. We also check the carrier signal: if it is not set, we simply copy the input data to the output, and in that case we hear the unprocessed sound. Otherwise, we convert the data to complex form (the real part takes the signal sample and the imaginary part is zeroed) and put the resulting array into the overlapped buffer. Next we process the carrier signal. Because the carrier can be very short (you can use a single period of a wave), I optimized the case where the signal has to be repeated. Let me explain. If, for example, the carrier is 10 ms long and the buffer is 100 ms, we could simply call the ACM conversion repeatedly, moving the destination pointer each time, but that is not optimal.
As an optimization, we convert only once and then simply duplicate the data up to the end of the array, which is what I did. We just must not forget to update the position in the source file afterwards; otherwise the phase will not match at the next read and there will be clicks. We write the carrier to another buffer (rawBuffer). The length of this buffer depends on the pitch shift. For example, if we want to shift the pitch by a number of semitones, rawBuffer must be 2^(semitones / 12) times larger. Then we simply compress or stretch the buffer to mFFTSize samples, which speeds the signal up or slows it down and as a result raises or lowers the pitch. After all these manipulations we write the data into the overlapped buffer and start processing. To do this, we loop over the overlapping blocks and process each one; the clsTrickOverlappedBuffer objects return the correct data for us. The processing is clear from the code, since we have already looked at each class in detail. After processing all the overlapping blocks we take the output data and convert it to integers suitable for playback. Settings are made in the form frmSettings. The list of devices is a standard listbox, just passed through my drawing class. The devices are added to the list in the following order:
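The band arithmetic and the core of the modulation step can be sketched as follows. This is Python, not the author's clsTrickModulator: the function name `modulate` is hypothetical, and the experimentally found gain formula, the Volume, DryWet and equalization stages are left out because their exact form is not given in the text. Only the "sum the vector lengths per band, then scale the carrier" idea is shown.

```python
def modulate(speech, carrier, bands):
    """Scale each carrier band by the energy of the same speech band.
    `speech` and `carrier` hold the lower half of a spectrum as complex numbers."""
    per_band = len(speech) // bands              # spectral samples per band
    out = [0j] * len(carrier)
    for b in range(bands):
        lo, hi = b * per_band, (b + 1) * per_band
        # Energy of the band: sum of the vector lengths of its coefficients.
        energy = sum(abs(c) for c in speech[lo:hi])
        for i in range(lo, hi):
            out[i] = carrier[i] * energy         # scales both real and imaginary part
    return out

# The numbers from the text: 44100 Hz, FFT size 2048, 64 bands.
sample_rate, fft_size, bands = 44100, 2048, 64
resolution = sample_rate / fft_size              # Hz per spectral sample
per_band = fft_size // 2 // bands                # spectral samples per band
print(round(resolution, 2), per_band, round(resolution * per_band, 2))

# Tiny example: 8 spectral samples, 2 bands, speech energy only in the first band.
speech = [1 + 0j, 0j, 2j, 0j, 0j, 0j, 0j, 0j]
carrier = [1 + 1j] * 8
result = modulate(speech, carrier, 2)
print(result[0], result[7])
```

In the tiny example the first carrier band is scaled by 3 (lengths 1 and 2 summed) and the second band is silenced, because the speech has no energy there.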
Default device for the given format
Device 1
Device 2
...
Device n
Capturing from a file
To hit-test clicks on the last item, the LB_GETITEMRECT message is used; it retrieves the coordinates and size of an item in the list. Without this check, a click on the empty space below the items (if there is any) would be treated as a click on the last item. In the handler of the settings button on the main form frmTrickVocoder, we check the selected capture device and either open the file for conversion or initialize capture. The volume and mix controls use a logarithmic scale, because the sensitivity of the human ear is not linear.
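Coming back to the pitch shift of the carrier described earlier (rawBuffer sized by 2^(semitones / 12) and then squeezed or stretched to the FFT size), here is a small sketch of that step. It is Python, not the VB6 code; `raw_length` and `resample_linear` are invented names, and linear interpolation is my assumption, since the text does not say how the buffer is resampled.

```python
def raw_length(fft_size, semitones):
    """Number of carrier samples to read so that, after resampling to
    fft_size, the pitch moves by `semitones` (positive = higher)."""
    return max(2, round(fft_size * 2 ** (semitones / 12)))

def resample_linear(src, dst_len):
    """Stretch or compress `src` to `dst_len` samples by linear interpolation."""
    if dst_len == 1:
        return [src[0]]
    step = (len(src) - 1) / (dst_len - 1)
    out = []
    for i in range(dst_len):
        pos = i * step
        j = min(int(pos), len(src) - 2)          # left neighbour, kept inside the array
        frac = pos - j
        out.append(src[j] * (1 - frac) + src[j + 1] * frac)
    return out

fft_size = 2048
# One octave up, one octave down, no shift.
print(raw_length(fft_size, 12), raw_length(fft_size, -12), raw_length(fft_size, 0))

ramp = [float(n) for n in range(raw_length(8, 12))]   # 16 samples for an 8-sample block
squeezed = resample_linear(ramp, 8)                   # same span played twice as fast
print(len(squeezed), squeezed[0], squeezed[-1])
```

Reading twice as many samples and squeezing them into the same block plays the carrier twice as fast, which is one octave up; reading half as many does the opposite.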
That's basically all. Thank you for your attention.
Good luck!
Last edited by The trick; Apr 5th, 2016 at 04:56 AM.
Hi Trick, I just wanted to say that this is a really beautiful code example. Thank you for sharing this (and other projects) on vbForums. I've learned a lot from your code.
Your clsTrickFFT example is particularly nice. Out of curiosity, have you developed a version of the class that operates on Byte arrays? I might be interested in repurposing the class to work on image data (32-bit 2D RGBA data, ideally packing two channels into each real/imaginary pair, for performance reasons, and performing horizontal/vertical FFTs in separate passes).
I've looked at porting something like FFTReal over to VB, and would be interested in seeing how it compares performance-wise to your existing implementation. Since you have a lot of experience in this area, I wanted to ask in case you have already worked on something similar.
If not, no problem. Thanks again for your great work.
Hello Tanner_H. My class clsTrickFFT works with floating-point (complex) numbers. You can convert a byte array to a float array (Single in VB6) and use the class for image processing. You can also use a trick that speeds up the process: put the first part of the data into the real parts and the second part into the imaginary parts, then do one FFT. Because the spectrum has the mirror effect, you can then recover the sine and cosine amplitudes of each part by combining the mirrored halves.
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
LINK : warning LNK4010: invalid subsystem version number 4.0; default subsystem version assumed
LINK : fatal error LNK1101: incorrect MSPDB80.DLL version; recheck installation of this product
Strange, because all my other VB6 projects work!
Yes, LINK.EXE and MSPDB80.DLL are not the original VB6 ones; they are taken from Visual Studio 9.
(The files I added are taken from Visual Basic 2008 Express Edition [Microsoft Visual Studio 9.0].)
I also added to the VB98 folder
Link.exe.config
and
mspdbsrv.exe
and the error changed to this:
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
LINK : warning LNK4010: invalid subsystem version number 4.0; default subsystem version assumed
LINK : fatal error LNK1318: Unexpected PDB error; NOT_FOUND (4) ''msobj80.dll''
So I also added
msobj80.dll
and now it works.
This way one extra file is created: "TrickVocoder.pdb"