qemu-devel
Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode


From: Daniel P . Berrangé
Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Date: Wed, 26 Aug 2020 16:03:40 +0100
User-agent: Mutt/1.14.6 (2020-07-11)

On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 14:36:38 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >   
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > >     
> > > > > > > To support some of the complex topology, we introduced EPYC mode
> > > > > > > apicid decode. But, EPYC mode decode is running into problems.
> > > > > > > Also it can become quite a maintenance problem in the future. So,
> > > > > > > it was decided to remove that code and use the generic decode
> > > > > > > which works for majority of the topology. Most of the SPECed
> > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and
> > > > > > > use the generic apicid decode.
> > > > > 
> > > > > the main difference between EPYC and all other CPUs is that
> > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > 
> > > > >  if (epyc && !numa) 
> > > > >     error("EPYC cpu requires numa to be configured")    
> > > > 
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > as a requirement.
> > > > 
> > > > > Why do we need to force this ?  People have been successfully using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > 
> > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > obviously caused the world to come crashing down.  
> > > > So far it produces a warning in the Linux kernel (RHBZ1728166),
> > > (resulting performance might be suboptimal), but I haven't seen
> > > anyone reporting crashes yet.
> > > 
> > > 
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only,
> > > so old configs can keep broken topology (CPUID),
> > > while new ones would require -numa and produce correct topology.  
> > 
> > No, tying this to machine types is not viable either. That is still
> > going to break essentially every single management application that
> > exists today using QEMU.
> for that we have deprecation process, so users could switch to new CLI
> that would be required.

We could, but I don't find the cost/benefit tradeoff compelling.

There are so many places where we diverge from what bare metal would
do, that I don't see a good reason to introduce this breakage, even
if we notify users via a deprecation message. 

If QEMU wants to require NUMA for EPYC, then QEMU could internally
create a single NUMA node if none was specified for new machine
types, such that there is no visible change or breakage to any
mgmt apps.  
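
To illustrate the idea, here is a minimal sketch of what such an implicit
default could look like. It is not QEMU code: the struct machine_cfg type,
its fields, and maybe_add_implicit_numa_node() are hypothetical names made
up for this example, and real machine init code would look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_CPUS = 256 };

/* Hypothetical, simplified machine description (not QEMU's MachineState). */
struct machine_cfg {
    bool cpu_requires_numa;          /* e.g. the CPU model is EPYC */
    bool new_machine_type;           /* old machine types keep old behaviour */
    unsigned num_cpus;
    unsigned num_numa_nodes;         /* nodes the user configured explicitly */
    unsigned cpu_to_node[MAX_CPUS];  /* node id for each CPU index */
};

/*
 * If the CPU model needs NUMA, the machine type is new, and the user
 * configured no nodes, create one node holding every CPU.
 * Returns true if an implicit node was created, false otherwise.
 */
bool maybe_add_implicit_numa_node(struct machine_cfg *cfg)
{
    assert(cfg != NULL);
    assert(cfg->num_cpus <= MAX_CPUS);

    if (!cfg->cpu_requires_numa || !cfg->new_machine_type ||
        cfg->num_numa_nodes > 0) {
        return false;   /* nothing to do: leave the configuration as given */
    }

    cfg->num_numa_nodes = 1;
    for (unsigned i = 0; i < cfg->num_cpus; i++) {
        cfg->cpu_to_node[i] = 0;    /* all CPUs go to the single node */
    }
    return true;
}
```

The point of the sketch is that the decision is made internally and keyed on
the machine type, so an existing command line without any NUMA options is
still accepted unchanged.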


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



