Nokia 5110 LCD initialization issue

I am trying to connect a Nokia 5110 LCD to a BeagleBone Black Rev C over SPI.
The connections are exactly as shown on the page 6 of:
Nokia5110-BeagleBone Black Connections
I wrote a C equivalent of the Arduino code for the Philips PCD8544 (Nokia 3310) driver, in which I export the required GPIO pins and send commands and data over the SPI interface.
I also successfully installed and ran Adafruit's Python library:
Adafruit Nokia LCD
My problem is a strange one: when I run the Python code first and then my C code, the C code works perfectly.
But if I run my C code before the Python code, I get no output. Logic says that the Python
code must be initializing something that I am missing in my code.
Here's how I initialize the LCD:
fd_spi_dev = open(device, O_RDWR);
//set mode
mode = SPI_MODE_0;
ioctl(fd_spi_dev, SPI_IOC_WR_MODE, &mode);
ioctl(fd_spi_dev, SPI_IOC_RD_MODE, &mode);
//set max bitrate
speed = 4000000;
ioctl(fd_spi_dev, SPI_IOC_RD_MAX_SPEED_HZ, &speed);
ioctl(fd_spi_dev, SPI_IOC_WR_MAX_SPEED_HZ, &speed);
// set MSB first
lsbsetting = 0;
ioctl(fd_spi_dev, SPI_IOC_WR_LSB_FIRST, &lsbsetting);
// set bits per word
bits = 8;
ioctl(fd_spi_dev, SPI_IOC_WR_BITS_PER_WORD, &bits);
ioctl(fd_spi_dev, SPI_IOC_RD_BITS_PER_WORD, &bits);
lcd_write_cmd(0x21); // LCD extended commands
lcd_write_cmd(0xB8); // set LCD Vop (contrast)
lcd_write_cmd(0x04); // set temp coefficient
lcd_write_cmd(0x14); // set bias mode 1:40
lcd_write_cmd(0x20); // LCD basic commands
lcd_write_cmd(0x09); // LCD all segments on
/* I am expecting to see all segments lit here */
lcd_write_cmd(0x0C); // LCD normal video
void lcd_write_cmd(uint8_t cmd) {
    uint8_t *tx = &cmd;
    uint8_t rx;
    uint32_t len = 1;
    struct spi_ioc_transfer tr = {
        .tx_buf = (uint32_t)tx,
        .rx_buf = (uint32_t)&rx,
        .len = len,
        .delay_usecs = delay,
        .speed_hz = speed,
        .bits_per_word = bits,
        .cs_change = 1,
    };
    size = write(fd_dc_val, "0", 1); // D/C low: the byte is a command
    size = write(fd_cs_val, "0", 1); // CS low: select the LCD
    ioctl(fd_spi_dev, SPI_IOC_MESSAGE(1), &tr);
    write(fd_cs_val, "1", 1); // CS high: deselect the LCD
}
I am a novice in embedded programming. I would greatly appreciate any help. Thank you.

If you're not missing an initialization step (and I haven't checked you against the 5110 datasheet), it must either be something wrong with your ioctls or a timing issue.
You could try using a library that abstracts away the ioctl calls to rule that out (I'm partial to my own: ;).
If it still doesn't work with that, then I'd say it's probably a timing issue. Python is a lot slower than C when it comes to file I/O, so your C code might not be giving the LCD controller enough time to update after some of the commands. Check the datasheet to see whether it needs some time after any of them.
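One timing-related step worth ruling out is the reset pulse. As I read the PCD8544 datasheet, it asks for a pulse on the RES pin after power-up, and the question does not show the C code driving that pin. If the Python library toggles it during setup and the C code does not, that would match the symptom. The following is only a sketch under that assumption: `fd_rst_val` is a hypothetical descriptor for the sysfs `value` file of the GPIO wired to RST, opened the same way as `fd_dc_val`, and the delays are deliberately generous guesses rather than datasheet figures.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std modes */
#endif
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Pulse an active-low reset line, then wait before the first command.
 * fd_rst_val is hypothetical: an open descriptor for the GPIO "value"
 * file connected to the LCD's RST pin.
 * Returns 0 on success, -1 if either write fails.
 */
int lcd_reset_pulse(int fd_rst_val, long low_ms, long settle_ms)
{
    assert(low_ms >= 0 && settle_ms >= 0);
    if (write(fd_rst_val, "0", 1) != 1)   /* hold the controller in reset */
        return -1;
    sleep_ms(low_ms);
    if (write(fd_rst_val, "1", 1) != 1)   /* release reset */
        return -1;
    sleep_ms(settle_ms);
    return 0;
}
```

Calling something like `lcd_reset_pulse(fd_rst_val, 100, 100)` before the first `lcd_write_cmd(0x21)` would show whether the missing step is the reset. Checking the return value of every `ioctl` and `write` in the init sequence covers the other possibility the answer mentions.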


STM32 - Reading I2S to record a .WAV file. Audio choppy, what is causing it?

I'm using an STM32 (STM32F446RE) to receive audio from two INMP441 MEMS microphones in a stereo setup via I2S and record it into a .WAV file on a microSD card, using the HAL library.
I wrote the firmware that records audio into a .WAV file with FreeRTOS, but the audio files that I record sound like Darth Vader. Here is a screenshot of the audio in Audacity:
If you zoom in, you can see a constant noise being inserted in between the real audio data:
I don't know what is causing this.
I have tried increasing the size of the message queue, but that doesn't seem to be the problem: the queue count stays at 0 most of the time. I've also tried different frame sizes and sampling rates, changing the number of channels, and using only one INMP441, all without any success.
Let me explain the firmware.
Here is a block diagram of the architecture for the RTOS that I have implemented:
It consists of three tasks. The first one receives a command via UART (with interrupts) that signals to start or stop recording. The second one is simply a state machine that walks through the steps to write a .WAV file.
Here is the code for the WriteWavFileTask:
sprintf(filename, "%saud_%03d.wav", SDPath, count++);
res = f_open(&file_ptr, filename, FA_CREATE_ALWAYS|FA_WRITE);
while(res != FR_OK);
res = fwrite_wav_header(&file_ptr, I2S_SAMPLE_FREQUENCY, I2S_FRAME, 2);
HAL_I2S_Receive_DMA(&hi2s2, aud_buf, READ_SIZE);
audio_state = STATE_RECORDING;
while(osMessageQueueGetCount(AudioQueueHandle)) osDelay(1000);
filesize = f_size(&file_ptr);
data_len = filesize - 44;
total_len = filesize - 8;
f_lseek(&file_ptr, 4);
f_write(&file_ptr, (uint8_t*)&total_len, 4, bw);
f_lseek(&file_ptr, 40);
f_write(&file_ptr, (uint8_t*)&data_len, 4, bw);
audio_state = STATE_IDLE;
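For reference, the two size fields patched at offsets 4 and 40 above belong to the standard 44-byte PCM WAV header. The implementation of `fwrite_wav_header` is not shown, so the following is only a sketch of what such a header looks like. `build_wav_header` and its helpers are hypothetical names, and the argument order simply mirrors the call above (sample rate, bits per sample, channels).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v at p in little-endian byte order, as WAV requires. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    put_le16(p, (uint16_t)(v & 0xFFFF));
    put_le16(p + 2, (uint16_t)(v >> 16));
}

/* Fill hdr with a canonical 44-byte header for integer PCM data. */
void build_wav_header(uint8_t hdr[44], uint32_t sample_rate,
                      uint16_t bits_per_sample, uint16_t channels,
                      uint32_t data_len)
{
    assert(bits_per_sample % 8 == 0 && channels > 0);
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36 + data_len);              /* file size - 8 */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);                        /* size of the fmt chunk */
    put_le16(hdr + 20, 1);                         /* format 1 = integer PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align); /* bytes per second */
    put_le16(hdr + 32, block_align);               /* bytes per frame */
    put_le16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_len);                  /* file size - 44 */
}
```

With this layout, `total_len = filesize - 8` and `data_len = filesize - 44` are exactly the values that belong at offsets 4 and 40 once recording has finished.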
Here are the macros used in the code for readability:
#define I2S_DATA_WORD_LENGTH (24) // industry-standard 24-bit I2S
#define I2S_FRAME (32) // bits per sample
#define READ_SIZE (128) // samples to read from I2S
#define WRITE_SIZE (READ_SIZE*I2S_FRAME/16) // half words to write
#define WRITE_SIZE_BYTES (WRITE_SIZE*2) // bytes to write
#define I2S_SAMPLE_FREQUENCY (16000) // sample frequency
The last task is responsible for processing the buffer received via I2S. Here is the code:
void convert_endianness(uint32_t *array, uint16_t Size) {
    for (int i = 0; i < Size; i++) {
        array[i] = __REV(array[i]);
    }
}

void HAL_I2S_RxCpltCallback(I2S_HandleTypeDef *hi2s)
{
    convert_endianness((uint32_t *)aud_buf, READ_SIZE);
    osMessageQueuePut(AudioQueueHandle, aud_buf, 0L, 0);
    HAL_I2S_Receive_DMA(hi2s, aud_buf, READ_SIZE);
}

void pvrWriteAudioTask(void *argument)
{
    /* USER CODE BEGIN pvrWriteAudioTask */
    static UINT bw; /* bytes written, filled in by f_write */
    static uint16_t aud_ptr[WRITE_SIZE];
    /* Infinite loop */
    for (;;) {
        osMessageQueueGet(AudioQueueHandle, aud_ptr, 0L, osWaitForever);
        res = f_write(&file_ptr, aud_ptr, WRITE_SIZE_BYTES, &bw);
    }
    /* USER CODE END pvrWriteAudioTask */
}
This task reads from a queue an array of 256 uint16_t elements containing the raw PCM audio data. f_write takes its size parameter in bytes, so it writes 512 bytes. The I2S receives 128 frames (with a 32-bit frame, that is 128 words).
The following is the configuration for the I2S and clocks:
Any help would be much appreciated!
As pmacfarlane pointed out, the problem was with the method used for buffering the audio data. The solution consisted of easing the overhead on the ISR and implementing a circular DMA for double buffering. Here is the code:
#define I2S_DATA_WORD_LENGTH (24) // industry-standard 24-bit I2S
#define I2S_FRAME (32) // bits per sample
#define READ_SIZE (128) // samples to read from I2S
#define BUFFER_SIZE (READ_SIZE*I2S_FRAME/16) // number of uint16_t elements expected
#define WRITE_SIZE_BYTES (BUFFER_SIZE*2) // bytes to write
#define I2S_SAMPLE_FREQUENCY (16000) // sample frequency
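uint16_t aud_buf[2*BUFFER_SIZE]; // Double buffering
static uint16_t *volatile BufPtr; // half that is ready to save; set in the ISRs, read in the task

void convert_endianness(uint32_t *array, uint16_t Size) {
    for (int i = 0; i < Size; i++) {
        array[i] = __REV(array[i]);
    }
}

void HAL_I2S_RxHalfCpltCallback(I2S_HandleTypeDef *hi2s)
{
    BufPtr = aud_buf; // first half is full
    osSemaphoreRelease(RxAudioSemHandle); // wake the writer task (release assumed; it is missing from the listing)
}

void HAL_I2S_RxCpltCallback(I2S_HandleTypeDef *hi2s)
{
    BufPtr = &aud_buf[BUFFER_SIZE]; // second half is full
    osSemaphoreRelease(RxAudioSemHandle); // wake the writer task (release assumed, as above)
}

void pvrWriteAudioTask(void *argument)
{
    /* USER CODE BEGIN pvrWriteAudioTask */
    static UINT bw; /* bytes written, filled in by f_write */
    /* Infinite loop */
    for (;;) {
        osSemaphoreAcquire(RxAudioSemHandle, osWaitForever);
        convert_endianness((uint32_t *)BufPtr, READ_SIZE);
        res = f_write(&file_ptr, BufPtr, WRITE_SIZE_BYTES, &bw);
    }
    /* USER CODE END pvrWriteAudioTask */
}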
I think the problem is your method of buffering the audio data - mainly in this function:
void HAL_I2S_RxCpltCallback(I2S_HandleTypeDef *hi2s)
{
    convert_endianness((uint32_t *)aud_buf, READ_SIZE);
    osMessageQueuePut(AudioQueueHandle, aud_buf, 0L, 0);
    HAL_I2S_Receive_DMA(hi2s, aud_buf, READ_SIZE);
}
The main problem is that you are re-using the same buffer each time. You have queued a message to save aud_buf to the SD-card, but you've also instructed the I2S to start DMAing data into that same buffer, before it has been saved. You'll end up saving some kind of mish-mash of "old" data and "new" data.
@Flexz pointed out that the message queue takes a copy of the data, so there is no issue with the I2S writing over the data that is being written to the SD card. However, taking the copy (in an ISR) adds overhead and delays the start of the new I2S DMA.
Another problem is that you are doing the endian conversion in this function (that is called from an ISR). This will block any other (lower priority) interrupts from being serviced while this happens, which is a bad thing in an embedded system. You should do the endian conversion in the task that reads from the queue. ISRs should be very short and do the minimum possible work (often just setting a flag, giving a semaphore, or adding something to a queue).
Lastly, while you are doing the endian conversion, what is happening to audio samples? The previous DMA has completed, and you haven't started a new one, so they will just be dropped on the floor.
Possible solution
You probably want to allocate a suitably big buffer, and configure your DMA to work in circular buffer mode. This means that once started, the DMA will continue forever (until you stop it), so you'll never drop any samples. There won't be any gap between one DMA finishing and a new one starting, since you never need to start a new one.
The DMA provides a "half-complete" interrupt, to say when it has filled half the buffer. So start the DMA, and when you get the half-complete interrupt, queue up the first half of the buffer to be saved. When you get the fully-complete interrupt, queue up the second half of the buffer to be saved. Rinse and repeat.
You might want to add some logic to detect if the interrupt happens before the previous save has completed, since the data will be overrun and possibly corrupted. Depending on the speed of the SD-card (and the sample rate), this may or may not be a problem.
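The overrun check described in the last paragraph can be sketched as a small tracker that remembers which half is still waiting to be saved. This is a host-testable illustration only: `half_tracker_t` and the `tracker_*` functions are hypothetical names, and on the target the read-and-clear in `tracker_done` would need a short critical section because the DMA interrupt can preempt the task.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    volatile int pending_half;  /* -1 = nothing pending, 0 or 1 = half not yet saved */
    volatile uint32_t overruns; /* times a half became ready before the last save finished */
} half_tracker_t;

void tracker_init(half_tracker_t *t)
{
    t->pending_half = -1;
    t->overruns = 0;
}

/* Call from the half-complete (half = 0) and complete (half = 1) callbacks. */
void tracker_mark_ready(half_tracker_t *t, int half)
{
    assert(half == 0 || half == 1);
    if (t->pending_half >= 0)
        t->overruns++;          /* the DMA is now overwriting unsaved data */
    t->pending_half = half;
}

/* Which half should the writer task save next? Returns -1 if none. */
int tracker_current(const half_tracker_t *t)
{
    return t->pending_half;
}

/* Call from the writer task once f_write for that half has returned. */
void tracker_done(half_tracker_t *t, int half)
{
    if (t->pending_half == half)
        t->pending_half = -1;   /* otherwise a newer half is already waiting */
}
```

If each half holds 128 32-bit words, as the macros in the question suggest, that is 64 stereo frames, or about 4 ms at 16 kHz. A non-zero `overruns` count would mean the SD card write sometimes takes longer than that, and a larger buffer would be the first thing to try.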

GpuMat to FFMPEG Encoder

I'm doing some image processing with OpenCV's CUDA module (cv::cuda), so what I end up with is a cv::cuda::GpuMat. I now want to encode it using FFmpeg (so I can choose whether the encoder is hardware accelerated or not). I wonder if I can somehow keep the data on the GPU for the encoder without downloading it, because the download seems to be the bottleneck in my application, which runs multiple threads.
I'm resizing the images with OpenCV CUDA so I have less to download (resizing with sws_scale makes no difference).
cv::cuda::GpuMat currentFrame;
cv::cuda::GpuMat resized;
cv::Mat frameEnc = cv::Mat(resized);
const int stride[] = { static_cast<int>(frameEnc.step[0]) };
sws_scale(swsctx, &frameEnc.data, stride, 0, frameEnc.rows, avframe->data, avframe->linesize);
ret = avcodec_send_frame(codec, avframe);
if(!ret) {
    /* rescale packet timestamp */
    pkt->duration = 1;
    av_packet_rescale_ts(pkt, codec->time_base, vstrm->time_base);
    /* write packet */
    av_write_frame(outctx, pkt);
}
Now this does work and performs ok, but I really wish I could do something like:
cv::cuda::GpuMat currentFrame;
ret = avcodec_send_frame(codec, avframe);
if(!ret) {
    /* rescale packet timestamp */
    pkt->duration = 1;
    av_packet_rescale_ts(pkt, codec->time_base, vstrm->time_base);
    /* write packet */
    av_write_frame(outctx, pkt);
}
where the avframe data is also on the GPU, so that I don't need any transfer between GPU and CPU.
I think the class cv::cudacodec::VideoWriter could help once an issue with OpenCV gets fixed. The class allows you to write a GpuMat directly. However, I believe that due to a bug in OpenCV, you currently can't build OpenCV with support for this class, so this isn't a great solution now but might be in the future.

Varispeed with Libsndfile, Libsamplerate and Portaudio in C

I'm working on an audio visualizer in C with OpenGL, libsamplerate, PortAudio, and libsndfile. I'm having difficulty using src_process correctly within my program. My goal is to use src_process to achieve vinyl-like varispeed in real time within the visualizer. Right now my implementation changes the pitch of the audio without changing the speed, and it does so with a lot of distortion from what sounds like missing frames: when I lower the speed with src_ratio, it sounds almost granular, like chopped-up samples. I keep experimenting with my buffering chunks, but 9 times out of 10 I get a libsamplerate error saying my input and output arrays overlap. I've also been looking at the speed-change example that comes with libsamplerate, and I can't find where I went wrong. Any help would be appreciated.
Here's the code I believe is relevant. Thanks and let me know if I can be more specific, this semester was my first experience in C and programming.
#define FRAMES_PER_BUFFER 1024
float src_inBuffer[ITEMS_PER_BUFFER];
float src_outBuffer[ITEMS_PER_BUFFER];
void initialize_SRC_DATA()
{
    data.src_ratio = 1; //Sets Default Playback Speed
    data.src_data.data_in = data.src_inBuffer; //Point to SRC inBuffer
    data.src_data.data_out = data.src_outBuffer; //Point to SRC OutBuffer
    data.src_data.input_frames = 0; //Start with Zero to Force Load
    data.src_data.output_frames = ITEMS_PER_BUFFER
        / data.sfinfo1.channels; //Number of Frames to Write Out
    data.src_data.src_ratio = data.src_ratio; //Sets Default Playback Speed
}
/* Open audio stream */
err = Pa_OpenStream( &g_stream,
&data );
/* Read FramesPerBuffer Amount of Data from inFile into buffer[] */
numberOfFrames = sf_readf_float(data->inFile, data->src_inBuffer, framesPerBuffer);
/* Looping of inFile if EOF is Reached */
if (numberOfFrames < framesPerBuffer)
sf_seek(data->inFile, 0, SEEK_SET);
numberOfFrames = sf_readf_float(data->inFile,
/* Inform SRC Data How Many Input Frames To Process */
data->src_data.end_of_input = 0;
data->src_data.input_frames = numberOfFrames;
/* Perform SRC Modulation, Processed Samples are in src_outBuffer[] */
if ((data->src_error = src_process (data->src_state, &data->src_data))) {
printf ("\nError : %s\n\n", src_strerror (data->src_error)) ;
exit (1);
/* Write Processed SRC Data to Audio Out and Visual Out */
for (i = 0; i < framesPerBuffer * data->sfinfo1.channels; i++)
// gl_audioBuffer[i] = data->src_outBuffer[i] * data->amplitude;
out[i] = data->src_outBuffer[i] * data->amplitude;
I figured out a solution that works well enough for me, and I'll explain it as best I can for anyone else with a similar issue. The way the API works is that you give it a certain number of input frames and it produces a certain number of output frames. So for an SRC ratio of 0.5, if you output 512 frames per loop, you need to feed in 512/0.5 = 1024 frames. When src_process runs, it compresses those 1024 frames into 512, which speeds up playback. I don't fully understand why this solved my issue, but the problem was that with a ratio of, say, 0.7, framesPerBuffer divided by the ratio is a fractional number, which doesn't work as an integer frame count. As a result, samples can go missing at the end of each block unless framesPerBuffer divides evenly by the SRC ratio. So what I did was read 2 extra frames whenever fmod(framesPerBuffer, src_ratio) != 0, and that seemed to fix 99% of the glitches.
/* This if Statement Ensures Smooth VariSpeed Output */
if (fmod((double)framesPerBuffer, data->src_data.src_ratio) == 0) {
    /* divides evenly: input frames = output frames / ratio, as explained above */
    numInFrames = framesPerBuffer / data->src_data.src_ratio;
} else {
    /* fractional result: pad with 2 extra frames */
    numInFrames = (framesPerBuffer/data->src_data.src_ratio) + 2;
}
/* Read numInFrames Amount of Data from inFile into buffer[] */
numberOfFrames = sf_readf_float(data->inFile, data->src_inBuffer, numInFrames);
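The frame arithmetic in this answer can also be written as a single rounding-up division, which avoids the special case. This is a sketch of an alternative, not the author's method: `input_frames_needed` is a hypothetical helper, and it assumes `ratio` has the same meaning as `src_ratio` (output rate divided by input rate).

```c
#include <assert.h>

/*
 * Smallest whole number of input frames that can yield out_frames
 * output frames at the given conversion ratio (output / input).
 */
long input_frames_needed(long out_frames, double ratio)
{
    assert(out_frames >= 0 && ratio > 0.0);
    long n = (long)((double)out_frames / ratio);  /* truncates toward zero */
    if ((double)n * ratio < (double)out_frames)
        n++;                                      /* round up when truncation fell short */
    return n;
}
```

For example, 512 output frames at a ratio of 0.5 need 1024 input frames, and 1024 output frames at 0.7 need 1463. The input buffer has to be large enough for that many frames, which the fixed-size `src_inBuffer` in the question may not be at low ratios. After each call, libsamplerate reports how much it really consumed and produced in the `input_frames_used` and `output_frames_gen` fields of `SRC_DATA`, so advancing the file position by `input_frames_used` is a more direct way to avoid dropped or repeated samples.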

LATCH and PORT, dsPIC33f

I am referring to a simple program (example 2) on
#include "p33Fxxxx.h"
#pragma config WDT = OFF
void main (void)
{
    TRISB = 0;
    /* Reset the LEDs */
    PORTB = 0;
    /* Light the LEDs */
    LATB = 0x005A; // tested with PORTB = 0x005A at first; no change of PORTB in watch
    while (1)
        ;
}
In the watch window, LATB changes to 0x5A successfully while PORTB remains 0x0000.
I wonder why that is.
If I were to connect PORTB to LEDs, would they light up?
Oh, I forgot to switch the analog-capable pins to digital with AD1PCFGL = 0xFFFF;

View GPU Memory / View Texture2D memory space for debugging

I've got a question about a PixelShader I am trying to implement, and what I currently do (this is just for debugging, and trying to figure stuff out):
int3 loc;
loc.x = (int)(In.TextureUV.x * resolution_XY.x);
loc.y = (int)(In.TextureUV.x * resolution_XY.x);
loc.z = 0;
float4 r = g_txDiffuse.Load(loc);
return float4(r.x, r.y, r.z, 1);
The problem is that the result is always (0, 0, 0, 1).
The texture buffer is created:
tDesc.Height = 480;
tDesc.Width = 640;
tDesc.Usage = D3D11_USAGE_DYNAMIC;
tDesc.MipLevels = 1;
tDesc.ArraySize = 1;
tDesc.SampleDesc.Count = 1;
tDesc.SampleDesc.Quality = 0;
tDesc.Format = DXGI_FORMAT_R8_UINT;
tDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
tDesc.MiscFlags = 0;
V_RETURN(pd3dDevice->CreateTexture2D(&tDesc, NULL, &g_pCurrentImage));
I upload the texture (which should be a live display at the end) via:
pd3dImmediateContext->Map(g_pCurrentImage, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
memcpy( resource.pData, g_Images.GetData(), g_Images.GetDataSize() );
pd3dImmediateContext->Unmap( g_pCurrentImage, 0 );
I've checked the resource.pData, the data in there is a valid 8bit monochrome image. I made sure the data coming from the camera is 8bit monochrome 640x480.
There's a few things I don't fully understand:
If I run the Map / memcpy / Unmap routine every frame, the driver eventually crashes and the system becomes unresponsive. Is there a different way to update a complete texture every frame?
The texture I uploaded is 8-bit, so why does Texture2D.Load() return a float4? Do I have to use a different method to access the texture data? I tried to .Sample it, but that didn't work either. Would I have to use an int buffer or something instead?
Is there a way to debug the GPU memory, to check whether the memcpy worked in the first place?
The Map, memcpy, Unmap really ought not to crash unless you are trying to copy too much data into the texture. It would be interesting to know what "GetDataSize()" returns. Does it equal 307,200? If it's more than that, then there lies your problem.
Texture2D returns a float4 because that's what you've asked for. You could instead write float r = g_txDiffuse.Load( ... ); the 8 bits get extended to a normalised float as part of the load process. Are you sure, by the way, that your calculation of "loc" is correct? As you have it now, loc.x and loc.y will always be the same.
You can debug what's going on with DirectX using PIX. It's a great tool and I highly recommend you familiarise yourself with it.
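Related to the size check in the first point: a mapped texture reports its row stride in the `RowPitch` field of `D3D11_MAPPED_SUBRESOURCE`, and that stride may be larger than the 640 bytes of image data per row. A single memcpy of 307,200 bytes would then stay inside the mapping but land on the wrong rows. The following is a plain C sketch of a row-by-row copy under that assumption. `copy_rows` is a hypothetical helper and uses no Direct3D types, so it can be tried on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `rows` rows of `row_bytes` bytes each from a source with stride
 * src_pitch to a destination with stride dst_pitch. Padding bytes at the
 * end of each destination row are left untouched.
 */
void copy_rows(uint8_t *dst, size_t dst_pitch,
               const uint8_t *src, size_t src_pitch,
               size_t row_bytes, size_t rows)
{
    assert(dst_pitch >= row_bytes && src_pitch >= row_bytes);
    for (size_t y = 0; y < rows; y++)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
}
```

In the upload code from the question, the destination would be `resource.pData` with `resource.RowPitch` as its stride, and the source would be the tightly packed camera image with a stride of 640, copying 640 bytes for each of the 480 rows.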
