bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] SMP initialization: detection and enumeration


From: Almudena Garcia
Subject: Re: [PATCH] SMP initialization: detection and enumeration
Date: Tue, 11 Aug 2020 02:17:35 +0200

> That being said, instead of hardcoding the maximum number of CPUs to
> be 256, you can just let the user choose whatever value is preferred.
> That's what Linux does.

But this could cause coherency problems.
By example, if the user sets mach_ncpus=4 , and the processor has 8 cores, It can produce an out-of-index error in the access to the arrays which store the info about the cpus.

> Re-read what I wrote. I was talking about lapic being qualified
> volatile, and apic_get_lapic returning it as non-volatile. Aren't you
> getting a warning there?
Yes, I've just fixed It.

> I can't see how this declaration can't be in kern/smp.h. It's not even
> returning a struct or whatever. What is the actual error message when
> you put it in kern/smp.h?
I've just solved It declaring smp_info in kern/smp.c , and adding an extern declaration in its header.

> What for?
> num_cpus is something that you make returned by a smp_get_numcpus()
> function, so it's not actually useful to also expose it in a struct.
> What else would go into that structure?
By example, the master cpu (BSP in x86)

> But the function returns an APIC id. Really not much can be done with
> that without knowing details of the APIC.
It's true. I didn't remember this. I can search the kernel ID of this APIC ID, but cpu_number() already will do that.
Then, maybe it can be simpler to remove this function.
My idea was to avoid calling directly to the apic function when I will implement cpu_number(). But maybe this is not the best solution for this.
I have to rethink this.

> What for?
> num_cpus is something that you make returned by a smp_get_numcpus()
> function, so it's not actually useful to also expose it in a struct.
> What else would go into that structure? Won't we actually rather
> use functions to return such information rather than imposing that
> structure over all archs?

I will not use this struct to share the information, simply to ease the access to the functions which really return such information.

smp_get_numcpus() doesn't return NCPUS value, but returns the real number of cpus existent in the machine.
I simply use a struct to store this generic information instead access to apic's ApicInfo struct (which stores this value in x86).

> Re-read what I wrote. I do *not* think we want to print this as an
> error like you did. Not having SMP is not an error. Just do not print
> anything.
Really, the mistake is in the message. I wanted to advise that the processor detection has failed in any step (although maybe the cause is not that SMP doesn't exist in the machine...)
Then, I will have to change this message to a better explanation. But showing a different message for each error code can be very lazy... :(



El mar., 11 ago. 2020 a las 1:39, Samuel Thibault (<samuel.thibault@gnu.org>) escribió:
Almudena Garcia, le mar. 11 août 2020 01:23:49 +0200, a ecrit:
> > ? git diff produces the same as format-patch, in terms of formatting
> > mistakes... The disadvantage of git diff is that your patches then mix
> > things altogether, see the additions to Makefrag.am. I just cannot apply
> > the series as it is.
>
> Yes, The problem is that I didn't write each file in a single commit.

Then use git rebase to rewrite your patch history?

> > ? The whole point of the mach_ncpus variable is to contain the number
> > of processors. I don't see why we could want mach_ncpus to be defined to
> > a different value than NCPUS.
>
> As I explained in a previous mail, in Mach source code the are many structures
> (arrays) which size is defined by NCPUS.
> Added to this, all SMP special cases are controlled by preprocessor directives
> like "if NCPUS > 1". For this reason, removing NCPUS can be a very difficult
> task.

I didn't write about removing NCPUS. I wrote about your code defining
NCPUS to a value that is different from mach_ncpus, while it could just
define mach_ncpus, and let NCPUS inherit the value from mach_ncpus so
it's all coherent.

That being said, instead of hardcoding the maximum number of CPUs to
be 256, you can just let the user choose whatever value is preferred.
That's what Linux does.

> > Err, it's empty, so just drop the file.
> I prefer to keep this file, to use that as an arch-independent interface for
> SMP.

But there is nothing there to be compiled. Some compilation toolchains
would even emit a warning in such a case, or just plainly error out
because they cannot produce an empty .o file.

> > Does it really need to be a separate function? Will we ever want to call
> > it somewhere else than smp_init?  If not just fold it into smp_init, or
> > make it static, there is no point in making it extern.
>
> I put It in a separate function because It's possible that, in next steps, I
> will need to add more fields to the structure, which will need to be
> initialized together.

Ok, but since smp_init is almost empty, you can as well initialize the
data in there.

> > Is this function really useful?  I mean in the long run we will want a
> > CPU number from 0, which will have to be knowing the apic enumeration,
> > and thus that's probably acpi.c that will define it anyway, and that is
> > the function whose declaration will belong to kernel/smp.h
>
> The idea about this function is to know the number of cpus (this will be
> necessary to enable the cpus in next steps), without knowing the details about
> APIC.

But the function returns an APIC id. Really not much can be done with
that without knowing details of the APIC.

> > > +volatile ApicLocalUnit* lapic = NULL;
> > [...]
>
> If, by any reason, we can't find APIC structures, o we need to have a secure
> value for lapic pointer, to take notice that APIC search or parsing has failed.
> And I simply prefered set this value at start. When we find the APIC table, if
> the parser is successful, this pointer will take the real value of the common
> Local APIC memory address.

For global variables the default value is *already* asserted to be NULL.
But anyway I was not talking about that at all.

Re-read what I wrote. I was talking about lapic being qualified
volatile, and apic_get_lapic returning it as non-volatile. Aren't you
getting a warning there?

> > That one belongs to kern/smp.h: non-i386 code will probably want to use
> > it.
> Yes, this was my idea. But It was very difficult to declare smp_info structure
> in kern/apic.c , and then share this structure with i386/i386/smp.c to
> initialize its data.
> I had many compilation problems, so I had to put this function there to ease
> the compilation.

?????????????????????????????

I can't see how this declaration can't be in kern/smp.h. It's not even
returning a struct or whatever. What is the actual error message when
you put it in kern/smp.h?

> > ? Is this really useful to expose to the rest of the kernel? Isn't that
> > structure something specific to the i386 SMP implementation?
>
> Really, the smp_data structure might be generic for all architectures.

What for?
num_cpus is something that you make returned by a smp_get_numcpus()
function, so it's not actually useful to also expose it in a struct.
What else would go into that structure? Won't we actually rather
use functions to return such information rather than imposing that
structure over all archs?

> The only reason because I declared smp_info in i386 files is because I
> didn't get to compile the code declaring it in kern/smp.c.

Which is possibly another sign that it's generally simpler to keep it in
the arch-specific part, and only expose simple functions like
int smp_get_numcpus(void); which pose no declaration problems.

> > ? I don't think we want to print this as an error?
> Is there a function to print messages as errors? If yes, then I replace this
> print with It.

Re-read what I wrote. I do *not* think we want to print this as an
error like you did. Not having SMP is not an error. Just do not print
anything.

Samuel

reply via email to

[Prev in Thread] Current Thread [Next in Thread]