How to operate a CNC router without attracting attention ...

My CNC router served faithfully for two years, but then something went wrong and the firmware got wiped; it had been woodpecker 0.9.



At first I just wanted to re-flash it, and for that I downloaded the source code of the Grbl CNC project. But curiosity got the better of me and I plunged into studying those sources...



They are built very simply and logically, but what Russian doesn't love a fast ride, and how can you pass up a chance to improve something! This short Sunday post is based on what came of it.



The idea behind a CNC machine controller is actually quite simple and interesting. There are several processing threads: the first reads the data (gcode) and parses it, the second turns commands into execution blocks, and the third (the stepper) actually executes those blocks. This third thread is what this post is about.



The stepper works through a list of individual commands of the form: take (X, Y, Z) steps on all three (at least) stepper motors, within a specified time and in a specified direction (this is a simplification, of course). A stepper motor with its driver is quite a simple thing to control: you set the direction of rotation (0 or 1), and then the motor makes one step on each rising edge (0 -> 1) of the step input (there are usually 200 steps per revolution). The data is already prepared, so all that remains is to spread the three integer step counts over the given time.



In the original, the author used the atmega328p controller, but everything ports to ARM (for example, stm32) easily and practically unchanged. The algorithm itself, however, cannot help but raise questions.



On the one hand, the well-proven Bresenham algorithm is used, or rather its variant, Adaptive Multi-Axis Step-Smoothing. On the other hand, it all seems rather complicated and, most importantly, the smoothness of the stepper motors and the accuracy of the router depend directly on the timing accuracy of the control signals. Here that accuracy is limited by the timer frequency and the interrupt processing time, which gives 40-50 kHz at best and usually less. In other words, the control signals are placed with an accuracy of 20-50 microseconds.
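To make the limitation concrete, here is a rough sketch of fixed-tick Bresenham stepping. It is a simplified illustration and not Grbl's actual code: the names `block_t`, `block_start`, `block_tick` and `block_run` are hypothetical, and the step-smoothing part is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define N_AXIS 3

/* Hypothetical execution block: steps per axis for one straight segment. */
typedef struct {
    uint32_t steps[N_AXIS];     /* steps to take on each axis */
    uint32_t step_event_count;  /* ticks in the block = max(steps[]) */
} block_t;

/* Prepare the per-axis Bresenham accumulators for a new block. */
static void block_start(const block_t *b, uint32_t counter[N_AXIS]) {
    assert(b->step_event_count > 0);
    for (int i = 0; i < N_AXIS; i++) {
        assert(b->steps[i] <= b->step_event_count);
        counter[i] = b->step_event_count / 2;  /* start half-way to centre the steps */
    }
}

/* One timer tick. Returns a bitmask of the axes that step on this tick. */
static uint32_t block_tick(const block_t *b, uint32_t counter[N_AXIS]) {
    uint32_t mask = 0;
    for (int i = 0; i < N_AXIS; i++) {
        counter[i] += b->steps[i];
        if (counter[i] >= b->step_event_count) {
            counter[i] -= b->step_event_count;
            mask |= 1u << i;
        }
    }
    return mask;
}

/* Run a whole block and count the steps issued on each axis. */
void block_run(const block_t *b, uint32_t taken[N_AXIS]) {
    uint32_t counter[N_AXIS];
    block_start(b, counter);
    for (int i = 0; i < N_AXIS; i++)
        taken[i] = 0;
    for (uint32_t t = 0; t < b->step_event_count; t++) {
        uint32_t mask = block_tick(b, counter);
        for (int i = 0; i < N_AXIS; i++)
            if (mask & (1u << i))
                taken[i]++;
    }
}
```

In this scheme every step can only land on a tick boundary, so the tick period (the interrupt rate) sets the timing granularity for all axes.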



But it is quite obvious that when we process one command from the buffer, all we need to do is calculate the moments at which the signals on the output port must switch, and then switch them at exactly those moments.



Since I was considering a move to cortex-m (more precisely, to the stm32h750, which I love very much and which has become much cheaper), it turned out that this task can be solved without involving the CPU at all, using only two DMA channels and one 32-bit counter.



The idea is very simple. One channel writes new data to the port on counter overflow, and the second channel writes a new maximum counter value (it is reasonable to do this on the very first clock cycle of the counter). Then, to process a command from the list, you need to prepare an array of port values and an array of the delays between them.



It comes out something like this.



The interrupt handler switches to the new buffer (double buffering):



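#include <stdint.h>
/* Plus the STM32 HAL/CMSIS device header for your part (not shown in the original). */

#define MAX_PGM 32
#define BUF_RUNNING 1u  /* assumed value: the original does not show this definition */

typedef struct _pgm_buffer {
        uint32_t data[MAX_PGM];   /* words written to the GPIO port (BSRR) */
        uint32_t delta[MAX_PGM];  /* timer reload (ARR) values: the delays between port writes */
} pgm_buffer;

pgm_buffer buf[2];                  /* double buffer: one is played while the other is filled */
volatile uint32_t current_buf = 1;  /* index of the buffer being played; changed in the callback */
volatile uint32_t flags = 0;

/* DMA transfer-complete callback: switch to the other buffer and restart.
 * The DMA1_ChannelN registers (CCR, CNDTR, CMAR) belong to the channel-based
 * DMA controller; on parts with a stream-based DMA (the H7 among them) the
 * register names differ. */
void program_down(DMA_HandleTypeDef *_hdma) {
        (void) _hdma;
        TIM2->CR1 &= ~TIM_CR1_CEN;              /* stop the timer */
        if ((flags & BUF_RUNNING) == 0)
                return;
        current_buf ^= 1;                       /* swap buffers */
        DMA1_Channel5->CCR &= ~1;               /* disable both channels before reprogramming */
        DMA1_Channel2->CCR &= ~1;
        DMA1_Channel5->CNDTR = MAX_PGM;         /* reload the transfer counts */
        DMA1_Channel2->CNDTR = MAX_PGM;
        DMA1_Channel5->CMAR = (uint32_t) (buf[current_buf].delta);  /* source for TIM2->ARR */
        DMA1_Channel2->CMAR = (uint32_t) (buf[current_buf].data);   /* source for the port */
        DMA1_Channel5->CCR |= 1;                /* re-enable both channels */
        DMA1_Channel2->CCR |= 1;
        TIM2->CNT = 0;
        TIM2->ARR = 8;                          /* short first period */
        TIM2->EGR |= TIM_EGR_UG;                /* force an update event */
        TIM2->CR1 |= TIM_CR1_CEN;               /* start the timer */
}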


It can be initialized like this:



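        /* Port channel: on each timer update, copy the next data word to GPIOA->BSRR. */
        HAL_DMA_RegisterCallback(&hdma_tim2_up, HAL_DMA_XFER_CPLT_CB_ID,
                        program_down);
        HAL_DMA_Start_IT(&hdma_tim2_up, (uint32_t) buf[0].data,
                        (uint32_t) &GPIOA->BSRR, MAX_PGM);
        /* Reload channel: on compare 1, copy the next delta into TIM2->ARR. */
        DMA1_Channel5->CCR &= ~1;
        DMA1_Channel5->CPAR = (uint32_t) &TIM2->ARR;
        DMA1_Channel5->CCR |= 1;
        TIM2->CCR1 = 1;                               /* compare on the first tick of each period */
        TIM2->DIER |= TIM_DIER_UDE | TIM_DIER_CC1DE;  /* enable update and CC1 DMA requests */
        flags |= BUF_RUNNING;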


And it is started like this:



        program_down(NULL);


What does this give us? Let's work it out for the same stm32h750. The timer (TIM2) runs at 200 MHz there and the minimum latency is two clock cycles, but DMA cannot transfer data faster than 50 MHz. So, allowing for possible bus contention, two port-switching commands can be placed as close as 40 ns apart (25 MHz). Compared with the 20-50 microseconds of the original implementation, that is roughly 1000 times better!
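The arithmetic behind these figures can be checked with a couple of small helpers. This is only a sketch: the function names are hypothetical, and the 200 MHz timer clock is the figure quoted in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in nanoseconds to timer ticks (rounded down). */
uint32_t ns_to_ticks(uint32_t ns, uint32_t timer_hz) {
    return (uint32_t)((uint64_t)ns * timer_hz / 1000000000u);
}

/* Convert timer ticks to nanoseconds (rounded down). */
uint32_t ticks_to_ns(uint32_t ticks, uint32_t timer_hz) {
    assert(timer_hz != 0);
    return (uint32_t)((uint64_t)ticks * 1000000000u / timer_hz);
}
```

At 200 MHz, `ns_to_ticks(40, 200000000u)` gives 8 ticks, while the 20-50 microsecond granularity of the interrupt-driven version corresponds to 4000-10000 ticks, a ratio of 500 to 1250.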



In addition, the port is 16 bits wide, so with two signals per motor (step and direction) you can control 8 stepper motors simultaneously instead of 3, if only one knew what for...
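As an illustration of how 8 motors fit into one port, here is a sketch of building the 32-bit words that get written to the BSRR register, where the low 16 bits set pins and the high 16 bits reset them. The pin layout (motor m uses pin 2m for STEP and pin 2m+1 for DIR) and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MOTORS 8u  /* 16 port bits / 2 signals per motor */

/* Assumed layout: STEP of motor m on pin 2m, DIR on pin 2m+1. */
static unsigned step_pin(unsigned motor) { assert(motor < MAX_MOTORS); return 2u * motor; }
static unsigned dir_pin(unsigned motor)  { assert(motor < MAX_MOTORS); return 2u * motor + 1u; }

/* BSRR word that drives `pin` high (bits 0..15 set pins). */
uint32_t bsrr_high(unsigned pin) { assert(pin < 16u); return 1u << pin; }

/* BSRR word that drives `pin` low (bits 16..31 reset pins). */
uint32_t bsrr_low(unsigned pin)  { assert(pin < 16u); return 1u << (pin + 16u); }

/* Word for one edge of the STEP signal of a motor. */
uint32_t step_edge(unsigned motor, int rising) {
    return rising ? bsrr_high(step_pin(motor)) : bsrr_low(step_pin(motor));
}

/* Word that selects the direction of a motor. */
uint32_t dir_word(unsigned motor, int forward) {
    return forward ? bsrr_high(dir_pin(motor)) : bsrr_low(dir_pin(motor));
}
```

Words for different motors can be OR-ed together, so a single DMA write can switch several motors at the same instant without touching the pins that are not mentioned.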



Filling in the data itself causes no problems (with a resolution like this!): it is simple linear interpolation for each motor separately, combining (as an optimization) events that are closer together than 40 ns.
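A minimal sketch of such a fill routine might look like the following. It makes several assumptions that are not in the original: the names `edge_time` and `build_program`, a fixed STEP pulse width `PULSE_W`, STEP of motor m wired to pin 2m, and DIR lines that are already set. Here `delta[]` holds the tick distance between consecutive entries; how that maps onto the timer reload value is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_MOTORS 3
#define MIN_GAP  8u    /* ticks: 40 ns at a 200 MHz timer, as in the text */
#define PULSE_W  200u  /* ticks: assumed STEP pulse width (1 us at 200 MHz) */

/* Time in ticks of edge i of a motor making n steps in T ticks.
 * Even i is the rising edge of step i/2, odd i the matching falling edge.
 * Steps are spread evenly: step k rises at (2k+1)*T/(2n). */
static uint64_t edge_time(uint32_t i, uint32_t n, uint32_t T) {
    uint64_t rise = (2u * (uint64_t)(i / 2u) + 1u) * T / (2u * (uint64_t)n);
    return (i & 1u) ? rise + PULSE_W : rise;
}

/* Fill data[]/delta[] for one block; returns the number of entries used.
 * delta[j] is the number of ticks between entry j-1 and entry j. */
size_t build_program(const uint32_t steps[N_MOTORS], uint32_t T,
                     uint32_t data[], uint32_t delta[], size_t cap) {
    uint32_t next[N_MOTORS] = {0};  /* next edge index per motor */
    uint64_t last = 0;              /* time of the last emitted entry */
    size_t count = 0;

    for (int m = 0; m < N_MOTORS; m++)  /* edges of one motor must never merge */
        assert(steps[m] == 0 || T / steps[m] > PULSE_W + MIN_GAP);

    for (;;) {
        int best = -1;
        uint64_t best_t = 0;
        for (int m = 0; m < N_MOTORS; m++) {  /* find the earliest pending edge */
            if (next[m] / 2u >= steps[m])
                continue;
            uint64_t t = edge_time(next[m], steps[m], T);
            if (best < 0 || t < best_t) { best = m; best_t = t; }
        }
        if (best < 0)
            break;  /* all motors are done */

        /* Low half of the BSRR word sets the pin, high half resets it. */
        uint32_t bit = 1u << (2 * best);
        uint32_t word = (next[best] & 1u) ? bit << 16 : bit;
        next[best]++;

        if (count > 0 && best_t - last < MIN_GAP) {
            data[count - 1] |= word;  /* too close: fold into the previous write */
        } else {
            assert(count < cap);
            data[count] = word;
            delta[count] = (uint32_t)(best_t - last);
            last = best_t;
            count++;
        }
    }
    return count;
}
```

The merge step keeps every delay at or above `MIN_GAP`, at the cost of shifting a merged edge by less than 40 ns.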



Now, the conclusions.



In the workshop there is a finished CNC machine measuring 1.2 meters by 0.8 meters, with motors and drivers but no controller. It looks like I need to finish the job and see how epic it turns out. If I do, I will definitely write a sequel. In the meantime, I don't understand why controllers are built on the atmega, so that all the 3d printers and cnc routers squeal along on these coarse interrupts...



And of course, with the power of a Cortex-M7 you could probably implement smoother trajectory control that respects all the constraints, but that's a completely different article.



P.S. Apparently I need to give a hypothetical example of why such a fine time resolution matters.



Let's say the machine needs to move 100 mm in X and 11 mm in Y, and the software has split this into acceleration and constant-speed segments: there are many segments of 100 steps in X by 11 steps in Y, and they are traversed at maximum speed, say 10 kHz. Now, 10 steps in Y fit exactly into 100 steps in X, but the 11th can run into trouble: it may be skipped, since it requires doubling the frequency. As a result, the movement will be 100 mm in X and somewhere between 10 and 11 mm in Y. And this is for linear motion, where even the limits on permissible accelerations and speeds are simple. What if the path is a zigzag? For example, sweeping an area of 100 by 110 mm in 10 passes: then we miss by a lot...



The proposed algorithm is meant to eliminate this error, and not at all to build some kind of super mill.


