STM32F767ZI External Interrupt Handling

1.3k Views Asked by At

I'm attempting to create a proper SPI slave interface for an AD7768-4 ADC. The ADC has a SPI interface, but it doesn't output the conversions via SPI. Instead, there are data outputs that are clocked out on individual GPIO pins. So I basically need to bit-bang data, and output to SPI to get a proper slave SPI interface. Please don't ask why I'm doing it this way, it was assigned to me.

The issue I'm having is with the interrupts. I'm using the STM32F767ZI processor - it runs at 216 MHz, and my ADC data MUST BE clocked out at 20MHz. I've set up my NMIs but what I'm not seeing is where the system calls or points to the interrupt handler.

I used the STMCubeMX software to assign pins and generate the setup code, and in the stm32F7xx.c file, it shows the NMI_Handler() function, but I don't see a pointer to it anywhere in the system files. I also found void HAL_GPIO_EXTI_IRQHandler() function in STM32F7xx_hal_gpio.c, which appears to check if the pin is asserted, and clears any pending bits, but it doesn't reset the interrupt flag, or check it, and again, I see no pointer to this function.

To more thoroughly complicate things, I have 10 clock cycles to determine which flag is set (1 of two at a time), reset it, incerment a variable, and move data from the GPIO registers. I believe this is possible, but again, I'm uncertain of what the system is doing as soon as the interrupt is tripped.

Does anyone have any experience in working with external interrupts on this processor that could shed some light on how this particular system handles things? Again - 10 clock cycles to do what I need to... moving data should only take me 1-2 clock cycles, leaving me 8 to handle interrupts...

EDIT:

We changed the DCLK speed to 5.12 MHz (20.48 MHz MCLK/4) because at 2.56 MHz we had exactly 12.5 microseconds to pipe data out and set up for the next DRDY pulse, and 80 kHz speed gives us exactly zero margin. At 5.12 MHz, I have 41 clock cycles to run the interrupt routine, which I can reduce slightly if I skip checking the second flag and just handle incoming data. But I feel I must use the DRDY flag check at least, and use the routine to enable the second interrupt otherwise I'll be constantly interrupting because DCLK on the ADC is always running. This allows me 6.12 microseconds to read in the data, and 6.25 microseconds to shuffle it out before the next DRDY pulse. I should be able to do that at 32 MHz SPI clock (slave) but will most likely do it at 50MHz. This is my current interrupt code:

void NMI_Handler(void)
{
    if(__HAL_GPIO_EXTI_GET_IT(GPIO_PIN_0) != RESET)         
     {
         count = 0;                                             
         __HAL_GPIO_EXTI_CLEAR_IT(GPIO_PIN_0);                  
         HAL_GPIO_EXTI_Callback(GPIO_PIN_0);                        
         //     __HAL_GPIO_EXTI_CLEAR_FLAG(GPIO_PIN_0);             

         HAL_NVIC_EnableIRQ(GPIO_PIN_1);                            
     }  
    else
  {  
        if(__HAL_GPIO_EXTI_GET_IT(GPIO_PIN_1) != RESET)         
          {
             data_pad[count] = GPIOF->IDR;                      
             count++;                                           
             if (count == 31)
              {
                data_send = !data_send;                     
                HAL_NVIC_DisableIRQ(GPIO_PIN_1);            
              }
        __  HAL_GPIO_EXTI_CLEAR_IT(GPIO_PIN_1);         
        HAL_GPIO_EXTI_Callback(GPIO_PIN_1); 
//      __HAL_GPIO_EXTI_CLEAR_FLAG(GPIO_PIN_0); 
          }
   }
}  

I am still concerned about clock cycles, and I believe I can get away with only checking the DRDY flag if I operate on the presumption that the only other EXTI flag that will trip is for the clock pin. Although I question how this will work if SYS_TICK is running in the background... I'll have to find out.

We're investigating a faster processor to handle the bit-banging, but right now, it looks like the PI3 won't be able to handle it if it's running Linux, and I'm unaware of too many faster processors that run either a very small reliable RTOS, or can be bare metal programmed in a pinch...

3

There are 3 best solutions below

0
On

There's a startup file generated before compile: startup_stm32f767xx.s - which contains all the pointers to functions.

Under the marker g_pfnVectors: is .word NMI_Handler pointing to a function for handling the non-masked interrupts, and two other pointers, .word EXTI0_IRQHandler and .word EXTI1_IRQHandler as vectors to the external interrupt handlers. Further down in the same file, is the following compiler directives:

.weak      NMI_Handler  
.thumb_set NMI_Handler,Default_Handler

.weak      EXTI0_IRQHandler         
.thumb_set EXTI0_IRQHandler,Default_Handler

.weak      EXTI1_IRQHandler         
.thumb_set EXTI1_IRQHandler,Default_Handler  

This was the info I was looking for to be able to control my interrupts with more precision and fewer clock cycles.

4
On

10 clock cycles to do what I need to... moving data should only take me 1-2 clock cycles, leaving me 8 to handle interrupts...

No way. Interrupt entry (pushing registers, fetching the vector and filling the pipeline) takes 10-12 cycles even on a Cortex-M7. Then consider a very simple interrupt handler, just moving the input data bits to a buffer and clearing the interrupt flag:

uint32_t *p;
void handler(void) {
    *p++ = GPIOA->IDR;
    EXTI->PR = 0x10;
}

it gets translated to something like this

handler:
    ldr     r0, .addr_of_idr // load &GPIOA->IDR
    ldr     r1, [r0]         // load GPIOA->IDR
    ldr     r2, .addr_ofr_p  // load &p
    ldr     r3, [r2]         // load p
    str     r1, [r3]         // store the value from IDR to *p
    adds    r3, r3, #4       // increment p
    str     r3, [r2]         // store p
    ldr     r0, .addr_of_pr  // load &EXTI->PR
    movs    r1, #0x10
    str     r1, [r0]         // store 0x10 to EXTI->PR
    bx      lr
.addr_of_p:
    .word   p
.addr_of_idr
    .word   0x40020010
.addr_of_pr
    .word   0x40013C14

So it's 11 instructions, each taking at least one cycle, after interrupt entry. That's assuming the code, vector table, and the stack are all in the fastest RAM region. I'm not sure whether literal pools work in ITCM at all, using immediate literals would add 3 more cycles. Forget it.

This has to be solved with hardware.

The controller has 6 SPI interfaces, pick 4 of them. Connect DRDY to all four NSS pins, DCLK to all SCK pins, and each DOUT pin to one MISO pin. Now each SPI interface handles a single channel, and can collect up to 32 bits in its internal FIFO.

Then I'd set an interrupt on a rising edge on one of the NSS pins (EXTI still works even if the pin is in alternate function mode), and read all data at once.

EDIT

It turns out that the STM32 SPI requres an inordinate amount of delay between NSS falling and SCK rising, which the AD7768 does not provide, so it will not work.

Sigma-Delta interface

The STM32F767 has a DFSDM peripheral, designed to receive data from external ADCs. It can receive up to 8 channels of serial data with 20 MHz, and it can even do some preprocessing that your application might need.

The problem is that the DFSDM has no DRDY input, I don't exactly know how could the data transfer be synchronized. It might work by asserting the START# singal to reset the communication.

If that doesn't work, then you can try starting the DFSDM channels using a timer and DMA. Connect DRDY to the external trigger of TIM1 or TIM8 (other timers won't work, because they are connected to the slower APB1 bus and the other DMA controller), start it on the rising edge of ETR, and let it generate a DMA request after ~20 ns. Then let the DMA write the value needed to start the channel to the DFSDM channel configuration register. Repeat for the oher three channels.

0
On

I readed AD7768 DS more carefully and found that it can srnd four channels data to one DOUT pin. So, I talking again about serial audio interface (SAI).

If you can lower DCLK frequency up to 2.5MHz than you can lower sample with ratio 1:8 (as ratio 2.5 MHz to 20 MHz) irt sample rate at full ADC clock.

If you route all 4 channels to one output DOUT0 you slow down sample rate just in ratio 1:4.

AD7768-4 DS

page 53

On the AD7768, the interface can be configured to output conversion data on one, two, or eight of the DOUTx pins. The DOUTx configuration for the AD7768 is selected using the FORMATx pins (see Table 33).

page 66 table 34: (for AD7768-4) page 67 figure 98:

FORMAT0 = 1 All channels output on the DOUT0 pin, in TDM output. Only DOUT0 is in use.

You can use SAI with FS = DRDY and four slots, 32 bits/slot