

Re: [avr-chat] Missed Optimisation ?

From: Alex Eremeenkov
Subject: Re: [avr-chat] Missed Optimisation ?
Date: Thu, 03 Mar 2011 02:08:02 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; uk; rv: Gecko/20110221 Thunderbird/3.1.8

03.03.2011 1:45, Michael Hennebry writes:
On Wed, 2 Mar 2011, Alex Eremeenkov wrote:

02.03.2011 19:20, Michael Hennebry writes:
On Tue, 1 Mar 2011, Graham Davies wrote:

Michael Hennebry wrote:

On further examination, I did find a "volatile uint32_t result;".
In context, I would guess that it was a complete
statement in the same file as the ISR.
Note the absence of attributes.
How could result not be in internal SRAM?

You may know that 'result' is going to be in internal SRAM, but you know things that the compiler doesn't. The compiler just puts the variable in memory.

If the compiler doesn't know it, how would I?
The compiler knows the toolchain better than I do.

You *must* know, because you are the developer.
It is the developer's responsibility alone to know what each dedicated memory region means. The compiler doesn't know, and shouldn't need to know, what the sections the linker lays out correspond to in the real world.

Horse hockey.
The compiler is required to know because it is
required to emit the correct kinds of instructions.
volatile uint32_t result;
will go in the same memory space as
uint32_t consequence;

So you are sure that read/write access to internal SRAM, to external memory (via the CPU's memory interfaces), and to cached memory each use different instructions?
And that the instructions will differ according to the memory type?
Then please explain, for example, how I can compile a program once with the same compiler (producing a single object file) and link it twice: the first time to run with SRAM, the second time to run with external memory. The compiler output is the same, and the program works correctly in both variants, each time from its own location. Or do you disagree?

It doesn't need to.
It only needs
volatile uint32_t result;  // definition, not just a declaration


Your example covers only one node in the large tree of situations for which a compiler must generate code. You expect the compiler to generate correct code in all variants, agreed? So why should it generate code that could work incorrectly in *other* configurations?

The compiler should generate code that depends on the source.
That is rather the point of a compiler.
Here it does so correctly, as expected.

threads of execution that appear in two different translation units, the compiler cannot know that it must avoid certain optimizations unless the programmer "tells it" using the volatile storage qualifier.

PORTB |= _BV(3);
is usually compiled as an SBI instruction.
Following the volatile rules blindly
would require at least two accesses.
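
The "two accesses" can be spelled out in a short host-side sketch. It assumes a plain volatile variable, here called fake_portb, in place of the real PORTB, and a local SKETCH_BV macro in place of avr-libc's _BV; both names are made up for illustration. It only shows what the source expression asks for, not what avr-gcc actually emits.

```c
#include <stdint.h>

/* Hypothetical stand-in for PORTB so the sketch builds on a host;
   on a real AVR the register would come from <avr/io.h>. */
volatile uint8_t fake_portb;

/* Same idea as avr-libc's _BV(), renamed to mark it as local to this sketch. */
#define SKETCH_BV(bit) ((uint8_t)(1u << (bit)))

/* The compound form, as it would be written in the source. */
void set_bit3_compound(void)
{
    fake_portb |= SKETCH_BV(3);
}

/* The same operation spelled out: one volatile read, then one volatile write. */
void set_bit3_two_accesses(void)
{
    uint8_t tmp = fake_portb;              /* access 1: read  */
    tmp = (uint8_t)(tmp | SKETCH_BV(3));   /* modify in a temporary */
    fake_portb = tmp;                      /* access 2: write */
}
```

Both functions leave the same value in the variable; the question in this thread is whether the compiler may collapse the read and the write into a single instruction.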

Strictly speaking, this AVR-specific optimization violates the language specification, given that PORTB is volatile. The compiler should generate

I disagree.
The SBI instruction is equivalent to
uint8_t sreg=SREG;
uint8_t portb=PORTB | _BV(3);
Admittedly volatile implies "I know something you don't know."
That said, if the compiler understands the hardware,
there are limits on what that could be and the as-if rule applies.
Once more: the compiler is not a god. It is mostly just a state machine, and what it must do is work reasonably well. An inline optimization that the compiler might skip, at least theoretically, is not a huge price to pay for using high-level abstractions.

P.S. We may be descending into a personal argument. I would be happy at this point to say "we're both right, from our different points of view, but are perhaps not explaining it well to each other". OK?

I think we are close.
It seems to me that the sticking points are:
Can the compiler know that result is in internal SRAM?

It doesn't know. That is purely the linker's concern.

If it doesn't know, it can't emit correct code.
The code actually emitted, generally regarded as correct,
required just that knowledge.

Why does it need such knowledge?
The compiler knows that the load instruction is 'lds', for example. It needs nothing more than that. Where the data at a given address will actually live (SRAM, external RAM, cache, another memory interface) is the responsibility of the CPU and the linking process, and how it's

If you still disagree,
explain to me how the instruction below determines which actual memory segment the address is mapped to during execution, given a CPU with configurable memory mapping (any modern ARM CPU, for example):

lds     r10, 0xFF00025A
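
The same argument can be sketched in C. The example assumes a hypothetical helper load_byte and two ordinary arrays, region_a and region_b, standing in for "internal SRAM" and "external RAM"; none of these names come from the thread. It illustrates only the claim made in this message: a plain load is compiled the same way regardless of what backs the address. It does not cover every address space a target may have.

```c
#include <stdint.h>

/* Hypothetical helper: a plain load through a pointer. Nothing in the
   code generated for it records what kind of memory backs the address. */
uint8_t load_byte(const volatile uint8_t *addr)
{
    return *addr;
}

/* Two hypothetical regions standing in for "internal SRAM" and
   "external RAM"; on a host build they are just ordinary arrays. */
volatile uint8_t region_a[4] = { 0x11, 0x22, 0x33, 0x44 };
volatile uint8_t region_b[4] = { 0xAA, 0xBB, 0xCC, 0xDD };
```

Calling load_byte(&region_a[1]) and load_byte(&region_b[2]) goes through the same compiled function; where each array ends up in memory is decided at link time.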
