Consulting on my next project

NorthGuy · May 16, 2014

electroRF said:
Are you suggesting to trigger the DMA after every write?

Yes.

electroRF said:
It'd be a waste to trigger the DMA every time I write few bytes to the 10KB buffer.

My guess is that triggering DMA will take less CPU cycles than deciding whether to trigger it.

electroRF said:
My main problem is, how to efficiently decide when to trigger the DMA?
And base on what to make that decision?

How many cycles do you need to trigger DMA? Can you post DMA triggering code?

How many cycles do you need to evaluate the size of the buffer and decide that it is filled enough?

electroRF · May 16, 2014

Hi NorthGuy,

You may be right, it'd probably cost me the same cycles to trigger the DMA and evaluate the size of the used buffer - It's an important note which I'll check, thank you!

But I wouldn't want the uC to get an interrupt from the DMA on completion of the transfer every time I write a log.

Therefore I'd rather trigger the DMA on every certain amount of logs, e.g. 1KB.

Since I need to write into the 1MB buffer in Cyclic mode, I need to take into account knowing when I reached the end of the 1MB buffer.
If I'd write known-sized amount of data into the 1MB buffer, it'd make it more efficient to know when I reached the 1MB buffer's end.
What would you suggest please?

Another option would be using some other timer, to expire every 3ms for example, and then transfer a certain amount of logs into the 1MB buffer.
What do you think of this option?

Thank you NorthGuy

NorthGuy · May 16, 2014

electroRF said:
But I wouldn't want the uC to get an interrupt from the DMA on completion of the transfer every time I write a log.

You do not need to enable the DMA interrupt. It is a very bad idea to enable extra interrupts for debugging purposes because it'll disturb the flow.

What you do is very simple. After you write to your buffer, you check whether the DMA is still busy with the previous request. If it isn't, you request a DMA transfer. If it is still busy, you just do nothing - you'll take care of it after the next write. Any additional checking that you do is likely to make it less efficient.

Of course there could be idiosyncratic problems with MCU, such as you can't disable DMA interrupts or you need to do lots of work to initiate DMA transfer. That would change things.
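A minimal sketch of this "trigger after every write unless the DMA is busy" idea. The helpers `dma_busy()` and `dma_start()` and the `in_flight` counter are hypothetical stand-ins, not a real MCU API. The sketch assumes the record has already been copied into the small buffer, and it leaves out the wrap-around case.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMALL_MASK 0x1FFFu          /* assumes an 8 KB staging buffer */

static uint32_t head;               /* bytes written so far */
static uint32_t tail;               /* bytes whose transfer has completed */
static uint32_t in_flight;          /* length of the transfer in progress */

/* Hypothetical stand-ins for the MCU's DMA status and start registers. */
static bool     dma_is_busy;
static uint32_t dma_requests;

bool dma_busy(void) { return dma_is_busy; }

void dma_start(uint32_t src_off, uint32_t len)
{
    (void)src_off;
    (void)len;
    dma_is_busy = true;             /* real hardware clears this when done */
    dma_requests++;
}

/* Call after every log record has been copied into the small buffer. */
void after_write(uint32_t len)
{
    head += len;
    if (dma_busy()) {
        return;                     /* the next write will pick this up */
    }
    tail += in_flight;              /* previous transfer has finished */
    in_flight = head - tail;
    if (in_flight != 0) {
        dma_start(tail & SMALL_MASK, in_flight);
    }
}
```

No DMA interrupt is needed: completion is noticed on the next write, when the busy check comes back clear.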

misterT · May 16, 2014

electroRF said:
Since I need to write into the 1MB buffer in Cyclic mode, I need to take into account knowing when I reached the end of the 1MB buffer.
If I'd write known-sized amount of data into the 1MB buffer, it'd make it more efficient to know when I reached the 1MB buffer's end.
What would you suggest please?

If you want to keep the latest data, then the only option is to overwrite old data. Write some "book-keeping info" to the first bytes in the memory which tells where the latest data is located.

electroRF · May 18, 2014

Hi T and NorthGuy,
It's always a pleasure to read your posts.

NorthGuy,
I think I'll give up the calculations of whether I should trigger the DMA already or not, as you said, but will trigger the DMA periodically, and that way, I can use the DMA more efficiently and not just for transferring any few bytes.

T,
I indeed do book-keeping and I got a pointer which always points to the next available address in the 10KB Buffer.

Thanks to you, I think I'll use the following mechanism:

1. I write to the 10KB buffer at Max. Pace of 4KB / 1ms
2. Therefore, every 1ms (equal to 2KB of data written to 10KB Buffer), I'll trigger an interrupt (using Timer)
3. in the Timer's ISR, I'll have to:
3.1

C:

Size = (Last Address Written To 10KB Buffer - Last Address in 10KB Buffer that was Transferred by DMA to 1MB Buffer) //Size is Size to Transfer
if (Size < 0) Size += 10KB

3.2

C:

Size = (Size < (Last Address of 1MB Buffer - Next Address to Write to in 1MB Buffer)) ?
        Size : (Last Address of 1MB Buffer - Next Address to Write to in 1MB Buffer)

I'd appreciate your advice on the following:
1. How can I do Step 3.1 efficiently, with as few cycles as possible?
I'd like to avoid that if (Size < 0) Size += 10KB if possible.

2. Is it also possible to make 3.2 more efficient?

3. Did I miss anything?

Thank you very much dear friends.

misterT · May 18, 2014

Why is it so important to squeeze out one or two instructions.. the data transfer itself takes many times longer than the dma triggering itself. Think quality before speed. Optimize only when you really need to.

electroRF · May 18, 2014

Hi T,
Thank you again for your precious comments.

misterT said:
Why is it so important to squeeze out one or two instructions..

The reason I hope to optimize it cycles-wise, is that the TIMER's ISR works under the context of Disabled Interrupts Mode.
Therefore, I'd like it to be very short in order to not delay the handling of other interrupts which may occur at this point.

misterT said:
the data transfer itself takes many times longer than the dma triggering itself.

Yes, but it'd happen in background and therefore won't delay the uC

Addition (EDIT)
Oh T,
I found out that the Max Pace of writing to the 10KB Buffer is 4.5KB / 1ms.
Do you think it's enough to trigger the Timer (which in its ISR I'll trigger the DMA) every 1ms? (taking into account that the smaller of the Buffers is only 10KB)

My fear is the case when the DMA happens to be busy at the point which the Timer expires, what would you suggest to do at this case?

I can't wait until the DMA gets available again.

misterT · May 18, 2014

.. just start by writing a very simple test program and see how it works. I think you are over thinking this whole thing. You are trying to solve problems you do not have.

electroRF · May 18, 2014

Hi T,

You're right (as always I must add).

The thing is that I need to make a presentation in which we'd decide at its end on which way to go.

Since it'd take me some time to learn how to operate the DMA, I'll not be able to test it all this week, however a decision needs to be made.

I'm now trying to see how I can efficiently calculate the Size that needs to be transferred to the large 1MB Buffer, since the last transfer.

I think I got it, and would love your help.

What do you think of the following:
1.

Code:

Size =(Last Address Written To 10KB Buffer - Last Address in 10KB Buffer that was Transferred by DMA to 1MB Buffer)

2. Use Sign-Bit of Size Variable in order to XOR it with Size Variable, and Add it the Sign-Bit

#2 would take care of having the case of Size Negative, and will actually do nothing in case it is positive.

What do you think of #2?
You're the master in bits

Can it be done otherwise?

What I'm actually trying to do is an efficient Absolute Value: Size = |A-B|, where:
A = Last Address Written To 10KB Buffer
B = Last Address in 10KB Buffer that was Transferred by DMA to 1MB Buffer

NorthGuy · May 18, 2014

You cannot get off the hook with interrupts that easily, because your interrupt code will now interfere with your buffer-writing code, and you will have to spend quite a bit of your precious cycles to make sure they don't harm each other.

When you changed your task description and introduced the big buffer, using a fast circular buffer as interim storage stopped looking like a good solution. When you want to do something fast, the most important factor is speed. It is extremely important to know how many cycles it takes to write a byte (word, dword) to the slow memory (the target of the DMA transfer), how many cycles it takes to trigger DMA, how many cycles a simple instruction takes, and what the processor frequency is. How can you possibly design something without knowing these numbers?

On circular buffers: I have posted a response to one of your queries where I described how to organize circular buffers efficiently. You probably can find it.

electroRF · May 18, 2014

Hi NorthGuy,
Thanks a lot again!

NorthGuy said:
On circular buffers: I have posted a response to one of your queries where I described how to organize circular buffers efficiently. You probably can find it.

Yes, I remember well your comment

Here it is:

NorthGuy said:
With circular buffer I usually allocate a piece which has a size of power of 2 and never worry about circularity.

However, the copying into the large 1MB buffer is not done by SW, but by the DMA.
Therefore, correct me if I'm wrong, but you can't have the DMA to AND each address with a power of 2, right?
that is since the DMA which I work with, is given with Source Address, Destination Address, and Length (and of course Channel, and Interrupt Mode)

NorthGuy said:
You cannot get off the hook with interrupts that easily, because your interrupt code will now interfere with your buffer-writing code, and you will have to spend quite a bit of your precious cycles to make sure they don't harm each other.

Could you elaborate on that please?
How will triggering the DMA in the Timer's ISR harm the circular writing to the 10KB Buffer?

misterT · May 18, 2014

electroRF said:
Therefore, correct me if I'm wrong, but you can't have the DMA to AND each address with a power of 2, right?
that is since the DMA which I work with, is given with Source Address, Destination Address, and Length (and of course Channel, and Interrupt Mode)

The addresses are not powers of two. The size of the buffer is a power of two.

ISR functions always have entry and exit code (overhead).. Trying to save few cycles in dma triggering is useless.

electroRF · May 18, 2014

Hi T,

misterT said:
The addresses are not powers of two. The size of the buffer is a power of two.

I may be missing something, but in the case that the size of the big buffer is a power of 2, e.g. 1024KB = 1MB.
I still would give the DMA a Source Address (which belongs to the 10KB Buffer), and a destination address (that belongs to the 1MB Buffer),
and a size of bytes to transfer.

How does that help for efficient circularity in the 1MB Buffer, using the DMA?

-----------

I also face the following problem:

In this case, I'd need to tell the DMA to move Data of size of only 0.8KB from Last-Transferred-Address-to-1MB-Buffer.
How can one "see" it efficiently?
as it takes here to:
1. firstly check if (Last-Transferred-Address-to-1MB-Buffer) > (Last-Written-Address-to-10KB-Buffer)
2. If yes, then Size = (10KB-Buffer-Ending-Address) - (Last-Transferred-Address-to-1MB-Buffer)

misterT · May 18, 2014

electroRF said:
How does that help for efficient circularity in the 1MB Buffer, using the DMA?

It doesn't.. It only helps with the efficiency of the software buffer. DMA is designed to be efficient way to move data around.. don't try to make it any more efficient in software. That is impossible.

misterT · May 18, 2014

electroRF said:
In this case, I'd need to tell the DMA to move Data of size of only 0.8KB from Last-Transferred-Address-to-1MB-Buffer.

True.. no way around it.

NorthGuy · May 18, 2014

electroRF said:
that is since the DMA which I work with, is given with Source Address, Destination Address, and Length (and of course Channel, and Interrupt Mode)

You use the same Head and Tail for both the 1MB and the 10K buffer (which you can always re-make to 8K). To access the 8K buffer you do &0x1fff; to access the 1MB buffer you do &0xfffff. So you pass to DMA: Source - SmallBuffer[Tail&0x1fff], Destination - BigBuffer[Tail&0xfffff], Length - (Head-Tail). You will have some problems when it wraps around, but nothing too bad.
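As a rough sketch of the masking described above: one pair of free-running counters, two masks. The struct and function names are made up for illustration, and the sketch ignores the wrap-around problem mentioned at the end of the paragraph.

```c
#include <stdint.h>

#define SMALL_MASK 0x1FFFu      /* 8 KB buffer: offsets 0 .. 0x1FFF */
#define BIG_MASK   0xFFFFFu     /* 1 MB buffer: offsets 0 .. 0xFFFFF */

/* Hypothetical description of one DMA request (offsets, not addresses). */
typedef struct {
    uint32_t src_off;           /* offset into the 8 KB buffer */
    uint32_t dst_off;           /* offset into the 1 MB buffer */
    uint32_t len;               /* number of bytes to move */
} dma_request_t;

dma_request_t make_request(uint32_t head, uint32_t tail)
{
    dma_request_t r;
    r.src_off = tail & SMALL_MASK;  /* same counter, small mask */
    r.dst_off = tail & BIG_MASK;    /* same counter, big mask */
    r.len     = head - tail;        /* everything not yet transferred */
    return r;
}
```

For example, with Tail = 0x12345 the source offset is 0x345 and the destination offset is 0x12345.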

electroRF said:
Could you elaborate on that please?
How will triggering the DMA in the Timer's ISR harm the circular writing to the 10KB Buffer?

Timer ISR may interrupt you in the middle of writing into the small buffer. Say, you adjusted Head in the buffer, but didn't write the content - DMA may then copy some garbage.

I'll tell you again - you must get all the timing (and other) information before you do the design. You're already well into the design, but basic information is not known yet. That's a recipe for disaster.
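One common way to avoid that particular hazard is to write the bytes first and advance Head last, so an ISR that fires mid-write still sees the old Head. A minimal sketch, with `log_put` as a made-up name. The C11 `atomic_signal_fence` here only constrains the compiler; whether a hardware barrier is also needed depends on the MCU, which is an open question in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SMALL_SIZE 0x2000u
#define SMALL_MASK (SMALL_SIZE - 1u)

static uint8_t small_buf[SMALL_SIZE];
static volatile uint32_t head;      /* read by the timer ISR */

void log_put(const uint8_t *data, uint32_t len)
{
    uint32_t h = head;              /* the ISR keeps seeing this old value */

    for (uint32_t i = 0; i < len; i++) {
        small_buf[(h + i) & SMALL_MASK] = data[i];
    }

    /* Keep the compiler from moving the stores above past the update below. */
    atomic_signal_fence(memory_order_release);

    head = h + len;                 /* publish only once the bytes are in place */
}
```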

electroRF · May 18, 2014

NorthGuy said:
You use the same Head and Tail for both the 1MB and the 10K buffer (which you can always re-make to 8K). To access the 8K buffer you do &0x1fff; to access the 1MB buffer you do &0xfffff. So you pass to DMA: Source - SmallBuffer[Tail&0x1fff], Destination - BigBuffer[Tail&0xfffff], Length - (Head-Tail).

Thanks!
I got you on the Source = &SmallBuffer[Tail&0x1fff], and Destination = &BigBuffer[Tail&0xfffff].
I didn't get you on the Length = (Head-Tail).
How do you define the Head and Tail? (I didn't understand how you calculated the correct length please)

NorthGuy said:
You will have some problems when it wraps around, but nothing too bad.

Are you talking about a situation where we'd fail to calculate the correct length?
I'd appreciate it if you could elaborate on that please.

misterT said:
It doesn't.. It only helps with the efficiency of the software buffer. DMA is designed to be efficient way to move data around.. don't try to make it any more efficient in software. That is impossible

T,
I'm trying to make the triggering of the DMA efficient, not the DMA Copying operation itself.
As you know, I need to take care of the circularity myself (as NorthGuy's solution suggested), as the DMA will not do it itself.

Thank you guys! I love reading your posts!

NorthGuy · May 18, 2014

electroRF said:
How do you define the Head and Tail? (I didn't understand how you calculated the correct length please)

Head is the counter of what you wrote. It gets incremented for every byte you write, and it points to the place where you continue writing.

Tail is the counter of what you already took out with your DMA. It gets incremented for every byte you transfer with DMA, and it points to the place where untransferred data begins.

The size available for DMA transfers is the difference between the two. Size = Head - Tail;

electroRF said:
Are you talking on a situation where we'd fail the calculate the correct length?
I'd appreciate it if you could elaborate on that please.

Say, the Tail&0x1fff is 0x1ff8 and you need to transfer 20 bytes. Apparently, you cannot do this in a single transfer, because DMA won't wrap. You need to transfer 8 bytes, then 12 from the beginning of the buffer. You'll have to deal with that somehow. There are many ways to do that.
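One of those ways, sketched under the assumption of an 8 KB small buffer and a pending length no larger than the buffer: split the request at the end of the buffer. The names `split_t` and `split_at_wrap` are made up for illustration.

```c
#include <stdint.h>

#define SMALL_SIZE 0x2000u
#define SMALL_MASK (SMALL_SIZE - 1u)

/* Lengths of the (up to) two DMA transfers needed for one request. */
typedef struct {
    uint32_t first;     /* bytes from Tail up to the end of the buffer */
    uint32_t second;    /* remaining bytes from the start of the buffer */
} split_t;

split_t split_at_wrap(uint32_t tail, uint32_t len)
{
    uint32_t to_end = SMALL_SIZE - (tail & SMALL_MASK);
    split_t s;
    s.first  = (len < to_end) ? len : to_end;
    s.second = len - s.first;   /* zero when no wrap is involved */
    return s;
}
```

With Tail&0x1fff = 0x1ff8 and 20 bytes pending this gives 8 and 12. Because 0x2000 divides 0x100000 and both buffers are indexed from the same Tail, the 1MB buffer can only wrap at a point where the 8K buffer wraps too, so one split covers both.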

electroRF · May 18, 2014

Hi NorthGuy,
Thank you again and again my friend

NorthGuy said:
Head is the counter of what you wrote. It gets incremented for every byte you write, and it points to the place where you continue writing.

Tail is the counter of what you already took out with your DMA. It gets incremented for every byte you transfer with DMA, and it points to the place where untransferred data begins.

Got you!
Thanks! That's a beautiful implementation.

NorthGuy said:
The size available for DMA transfers is the difference between the two. Size = Head - Tail;

Yes, I see now.
That is of course in the case where Head > Tail. (as you said below).

NorthGuy said:
Say, the Tail&0x1fff is 0x1ff8 and you need to transfer 20 bytes. Apparently, you cannot do this in a single transfer, because DMA won't wrap. You need to transfer 8 bytes, then 12 from the beginning of the buffer. You'll have to deal with that somehow. There are many ways to do that.

Yes, you're right.
However, how would you conclude that 20B are needed.
I mean, in the case you described:
Tail&0x1FFF = 0x1FF8
Head&0x1FFF = 0x000C

C:

(Head&0x1FFF) - (Tail&0x1FFF) gives you: 0x000C - 0x1FF8 = 0xE014

How do you reach the 20B in that case?

Would you ask at each time the Timer Expires:

C:

if ((Head&0x1FFF) > (Tail&0x1FFF))   size = (Head&0x1FFF) - (Tail&0x1FFF)
else size = 0x2000 - (Tail&0x1FFF)

I'm trying to see if there's a smarter way to do it and recognize that Tail > Head

NorthGuy · May 18, 2014

electroRF said:
That is of course in the case where Head > Tail. (as you said below).

Head is always ahead of the Tail. That's why you call them Head and Tail. Therefore, Size = (Head - Tail) is always positive (could be zero, of course).
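A small sketch of why this works, assuming Head and Tail are kept as unsigned 32-bit free-running counters and are masked only when used as buffer offsets. The function names are made up for illustration.

```c
#include <stdint.h>

#define SMALL_MASK 0x1FFFu   /* 8 KB buffer, as in the thread */

/* Difference of the free-running counters: no branch, no sign test. */
uint32_t pending_bytes(uint32_t head, uint32_t tail)
{
    return head - tail;      /* unsigned arithmetic is modulo 2^32 */
}

/* If only the masked offsets are available, mask the difference again.
 * Valid as long as fewer bytes than the buffer size are pending. */
uint32_t pending_from_offsets(uint32_t head_off, uint32_t tail_off)
{
    return (head_off - tail_off) & SMALL_MASK;
}
```

With Head = 0x200C and Tail = 0x1FF8 the counter difference is 20. Starting from the masked offsets 0x000C and 0x1FF8, the raw difference has its upper bits set (the 0xE014 seen in 16 bits), and masking it with 0x1FFF gives 0x14, again 20.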

electroRF said:
Yes, you're right.
How would you deal with it?
What would you find to be an efficient way of handling it?

There are many ways. Starting from straightforward - transfer only min(Size,(0x2000 - (Tail&0x1fff))), but this will add few cycles to every write, probably too slow for you. The other would be to do fixed size transfers, say 2048 at a time (e.g. if (Size >= 0x800) { transfer 0x800 bytes}). This adds less cycles, but introduces delays in transfer.
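A sketch of the fixed-size variant, assuming Tail starts at a multiple of 0x800 and only ever advances by 0x800. Since 0x800 divides 0x2000, a block then never straddles the end of the 8K buffer. `next_block` is a made-up name, and a real version would probably advance Tail only once the transfer has completed.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK      0x800u       /* 2048 bytes per DMA transfer */
#define SMALL_MASK 0x1FFFu      /* 8 KB buffer */

/* Returns true and reports the source offset if a full block is ready. */
bool next_block(uint32_t head, uint32_t *tail, uint32_t *src_off)
{
    if (head - *tail < BLOCK) {
        return false;           /* not enough data yet: transfer is delayed */
    }
    *src_off = *tail & SMALL_MASK;
    *tail += BLOCK;
    return true;
}
```

The cost on each check is a single compare. The price is that up to one block of data can sit in the small buffer until more arrives.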
