
Re: [Discuss-gnuradio] Scheduler Issues with Branch Prediction


From: Tom Rondeau
Subject: Re: [Discuss-gnuradio] Scheduler Issues with Branch Prediction
Date: Wed, 17 Jul 2013 12:59:54 -0400

On Mon, Jul 15, 2013 at 2:29 PM, Ron Sadlier <address@hidden> wrote:
> Hello everyone,
>
> I've been having the same issue with the scheduler for the past 2 months or
> so and I'm starting to unravel what I believe is occurring. Let me describe
> my flowgraph first, then describe what's happening when I run the flow
> graph, and then describe some REALLY interesting things I've noticed and my
> thoughts. I apologize for the length of this in advance! :) . This isn't as
> much of a "help me!" post as it is "what is happening here and is there a
> problem with the scheduler or my thinking?"
>
> My flowgraph consists of:
>
> File source -> Custom Encode -> Custom Server Query -> Custom Server Query
> -> Custom Decode -> Bit Error Rate Calculator
>
> The encode scheme I am using is rather simple: it splits a byte into 4
> "flags", each representing one of the four possible values of two bits
> (00, 01, 10, 11). The first "Custom Server Query" sends these flags to our
> server, which stores them and returns a reference number, which is the
> output of the block, with the second query block doing this process in
> reverse. Overall, the actual implementations of these blocks are all
> extremely simple, but unimportant, except that they all have if statement
> chains to perform different processes based on the incoming flag. (This is
> important later)
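
For concreteness, the encode stage described above amounts to a 1:4 interpolating block (one byte in, four 2-bit flags out). Below is a minimal sketch of that idea against the Python block gateway as exposed in the 3.7-era API (gr.interp_block; the 3.6 gateway is similar but may differ in detail). Block and signal names are illustrative, not Ron's actual code:

    import numpy
    from gnuradio import gr

    class encode_flags(gr.interp_block):
        """Split each input byte into four 2-bit flags (00, 01, 10, 11)."""
        def __init__(self):
            gr.interp_block.__init__(self,
                name="encode_flags",
                in_sig=[numpy.uint8],
                out_sig=[numpy.uint8],
                interp=4)  # one byte in -> four flags out

        def work(self, input_items, output_items):
            inp = input_items[0]
            out = output_items[0]
            for i, b in enumerate(inp):
                out[4*i + 0] = (b >> 6) & 0x3  # most-significant bit pair first
                out[4*i + 1] = (b >> 4) & 0x3
                out[4*i + 2] = (b >> 2) & 0x3
                out[4*i + 3] = b & 0x3
            return 4 * len(inp)

Declaring the 1:4 rate through the block type, rather than hand-managing it in a general block, lets the scheduler keep the input and output streams in lockstep.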
>
> Now I expect the entire file to transfer over with a 0 BER. What I find is
> that only a relatively short length transfers successfully before the data
> on the other end becomes completely jumbled. I cannot ascertain what sort of
> jumbling occurs, i.e., dropping, shifting, or delays, because it appears to change
> each run and happen at different file locations.
>
> I don't have a throttle after the file source because it does not work.
> Extreme example: I can set the throttle speed to 1 and it will still process
> at the fastest rate possible. If I put any other block in between the file
> source and the throttle, then the throttle works as it should. It doesn't
> fix the jumble issue, but the throttle acts as it was designed to by
> bursting with an average of the set value. That's weird thing #1: Why does
> throttle work if there is a block in between it and the file source block,
> but not if directly connected?
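
For reference, the placement in question (a throttle wired directly to the file source) would normally look something like this in a 3.6-era Python flowgraph; file names and the rate are placeholders:

    from gnuradio import gr

    tb = gr.top_block()
    src = gr.file_source(gr.sizeof_char, "input.dat", False)  # read bytes once, no repeat
    thr = gr.throttle(gr.sizeof_char, 32000)                  # cap the average rate at 32k items/s
    snk = gr.file_sink(gr.sizeof_char, "output.dat")
    tb.connect(src, thr, snk)                                  # throttle directly after the file source
    tb.run()

(In 3.7 and later, file_source, throttle, and file_sink live in the gnuradio.blocks module instead of the top-level gr namespace.)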

That should be a clue. The throttle block absolutely works right after
a file source. It always has in the thousands of cases I've tried and
seen. If it's not working that way for you, I would look to your own code
for the problem, possibly something interacting with the scheduler
incorrectly (are you consuming and producing at the right rates?).
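
To expand on that last point: a common way for a general (rate-changing) block to scramble a stream is consuming a different number of items than it actually used, or not calling consume() at all. A minimal sketch of the expected pattern in a Python general block (illustrative names, not Ron's code):

    import numpy
    from gnuradio import gr

    class query_block(gr.basic_block):
        """Toy pass-through that consumes exactly what it produces."""
        def __init__(self):
            gr.basic_block.__init__(self,
                name="query_block",
                in_sig=[numpy.uint8],
                out_sig=[numpy.uint8])

        def general_work(self, input_items, output_items):
            inp = input_items[0]
            out = output_items[0]
            n = min(len(inp), len(out))  # only handle what both buffers allow
            out[:n] = inp[:n]            # a real block would do the server query here
            self.consume(0, n)           # tell the scheduler how much input was used
            return n                     # and how many output items were produced

Consuming more than was actually processed silently drops input items; consuming less means items get seen again on the next call. Either mistake can surface downstream as exactly the kind of intermittent jumbling described above.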

> Next, I determined that the jumbling doesn't happen for small file sizes
> (21 kB) for random-content files, but does for larger files. Why is random
> important? Because if I put sequential data through (0x0 0x1 0x2 0x3 0x0 0x1
> 0x2 0x3 0x4...), it works *perfectly* with any file length at any flow graph
> speed. This is a little unexpected, but not unheard of, as it appears to be
> an artifact of branch prediction failure (see the following for a good
> discussion if you are unfamiliar:
> http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array).
> This would be a factor because my blocks have if statements for what type of
> input flag is being processed.
>
> After that, I put a sleep() in my encode block to more uniformly control
> execution speed for testing. Even then, the lowest sleep value I can use for
> random file data before jumbling is around 300 microseconds. For sequential
> data I can omit the sleep altogether, meaning it can process at the fastest
> possible rate.
>
> This is the opposite of what I would think would occur. If this was a buffer
> overrun issue, then why does the jumbling only happen when the flowgraph
> runs slower?
>
> After some thinking, I'm starting to become suspicious of the threading
> model. I'll admit that I haven't fully reviewed the threading code (thus, am
> I making any obvious blunders in thinking below?) My idea of the issue boils
> down to threads becoming unsynchronized. Executing with sequential data
> would perform all encoding and decoding tasks with optimal speed and ideally
> no branch prediction failures, such that the time slice of each thread for
> general work would ideally take the same time to execute. If running random
> data, branch prediction would surely fail almost every time. But the times it
> didn't fail would speed up those particular iterations greatly. Is it possible
> these threads are getting jumbled when this occurs a few times in a row,
> making a "bubble" of fast execution speed and the data NOT being guaranteed
> FIFO?
>
> Unless there is another feature of the scheduler I am unaware of and I am
> overthinking the importance of branch prediction here, I cannot reason why
> sequential data produces no problems. Even if I am completely wrong with my
> reasoning here, it doesn't explain away the fact that my jumbling still
> occurs. I've tried this on multiple machines with different versions of
> Linux, but always with GNU Radio version 3.6.4.1. I've used both built-from-source
> installs and the Ettus binary.
>
> Does anyone have any thoughts on this? Where are the guts of the scheduler
> located in the repo? I can find files for the thread for a block, but not
> the file that creates the threads and controls them. I'm just using sleep()
> in my encoding block as a temporary fix.
>
> Best,
> Ron Sadlier

Ron,

I've been over the thread-per-block scheduler code a lot in the past
few years. What you are suggesting here is, frankly, impossible. And I
don't think that branch prediction has anything to do with it. I
haven't looked at the link you sent, but branch prediction misses
should only slow down the processing; they have no effect on the data.

I think you'll have to look closer at your own blocks for the problem.
Have you written QA for each of the blocks? Best to verify that each
one works as expected on its own before trying to put them all in line
together.
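
For reference, the usual QA pattern is a gr_unittest test case that drives one block with a vector source and checks a vector sink. A sketch against the 3.6-era API (where the vector source/sink live in the top-level gr namespace); encode_flags here is just a stand-in for whichever custom block is under test:

    from gnuradio import gr, gr_unittest

    class qa_encode_flags(gr_unittest.TestCase):
        def setUp(self):
            self.tb = gr.top_block()

        def tearDown(self):
            self.tb = None

        def test_001_known_bytes(self):
            src_data = (0x1B,)               # 0x1B = 00 01 10 11
            expected = (0x0, 0x1, 0x2, 0x3)  # the four 2-bit flags, MSB pair first
            src = gr.vector_source_b(src_data)
            dut = encode_flags()             # the (placeholder) block under test
            snk = gr.vector_sink_b()
            self.tb.connect(src, dut, snk)
            self.tb.run()
            self.assertEqual(expected, snk.data())

    if __name__ == '__main__':
        gr_unittest.run(qa_encode_flags, "qa_encode_flags.xml")

A test like this pins down each block's input/output behavior in isolation, which is exactly what is needed before chaining the blocks together.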

Tom


