Re: [OCLUG-Tech] How good is Linux at NUMA ?

On Tue, 2006-03-28 at 13:56 -0500, Martin Hicks wrote:
> On Tue, Mar 28, 2006 at 01:44:38PM -0500, Peter Sjoberg wrote:
> > On Tue, 2006-03-28 at 12:21 -0500, Martin Hicks wrote:
> > > On Tue, Mar 28, 2006 at 11:27:08AM -0500, Peter Sjoberg wrote:
> > > > I just wonder if anyone knows what state the Linux NUMA implementation
> > > > is in. I looked at some Linux NUMA sites but they seem very dated (many
> > > > from 2001, with 2004 the latest).
> > > > I had a discussion with a friend about how good NUMA is on Linux, and his
> > > > opinion is that we should run x86 Solaris on all Opterons or replace the
> > > > 4-way with UP systems, since NUMA is so immature under Linux.
> > > 
> > > NUMA works quite well under Linux.  There is room for improvement,
> > > perhaps, but companies like HP, AMD and SGI all have NUMA machines, and
> > > SGI's are even Linux *only* (and they're actually the biggest computers
> > > too, with up to 1024 CPUs in a single machine).
> > I'm interested in how far it goes. As I understand it, there is some
> > stickiness between a process and its CPU, but I don't know how strong it is.
> 
> It's pretty strong.  With cpusets you can limit it to a single CPU.
> 
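To make the "single CPU" point concrete, here is a minimal sketch. It is not cpusets itself: it uses the per-process affinity call (`sched_setaffinity`, exposed in Python as `os.sched_setaffinity`, Linux only), which is a different mechanism that shows the same kind of restriction. The helper name `pin_to_cpu` is my own invention for illustration.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling process) to one CPU.

    Hypothetical helper: returns the affinity mask the kernel reports
    after the change, so the caller can check that it took effect.
    """
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        # Asking for a CPU outside the current mask would fail in the
        # kernel anyway; fail early with a clearer message.
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    target = min(original)          # lowest-numbered CPU we may run on
    print("pinned to:", pin_to_cpu(target))
    os.sched_setaffinity(0, original)  # restore the previous mask
```

Once pinned, the scheduler should keep the process on that CPU until the mask is widened again, which is about as "sticky" as it gets.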
> > If a process starts to access a file, are the cache buffers allocated
> > randomly, or do they go to the same CPU (socket)? Is memory migration planned?
> 
> I think buffer cache is allocated round-robin, although you might be
> able to get it to allocate only in a specific set of nodes (cpuset)
> 
> Some parts of migration work already.  I'm not exactly sure where things
> are right now.  Ray Bryant was working on that for SGI and AMD.
> 
> > I've seen some talk about making it more aware of near/far/farther
> > memory, and making it work better with multi-core and multi-threading.
> 
> The locality of memory is part of ACPI.  Linux reads this information
> from ACPI tables and uses it to build zonelists appropriately.
> 
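As a rough illustration of the near/far idea: current kernels expose the node distances under `/sys/devices/system/node/nodeN/distance`, one line of space-separated integers per node. The sketch below parses such a line and orders nodes nearest first. It is only an illustration of the concept, not the kernel's actual zonelist construction, and it assumes node ids are contiguous from 0 so that a position in the line equals a node id. The function names are made up for this example.

```python
from pathlib import Path

# sysfs directory where the kernel exposes per-node information
NODE_ROOT = Path("/sys/devices/system/node")


def parse_distance_line(text):
    """Turn one `distance` file ("10 20 20 30") into a list of ints."""
    return [int(token) for token in text.split()]


def fallback_order(distances):
    """Node ids sorted nearest first; ties keep ascending node id.

    Assumes index i in `distances` is the distance to node i.
    """
    return sorted(range(len(distances)), key=lambda node: distances[node])


def read_node_distances(root=NODE_ROOT):
    """Map node id -> distance list for every node directory under `root`."""
    result = {}
    for node_dir in root.glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        result[node_id] = parse_distance_line((node_dir / "distance").read_text())
    return result


if __name__ == "__main__":
    for node_id, distances in sorted(read_node_distances().items()):
        print(f"node {node_id}: nearest first -> {fallback_order(distances)}")
```

On a machine without NUMA sysfs entries the loop simply prints nothing.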
> > Solaris has implemented at least some of it in the latest version (to
> > work better with their 8-core, 32-thread CPU).
> > Where does the Linux kernel stand on this?
> 
> Linux worked fine on 512p machines two years ago.
I guess I'm really asking about the difference between "works fine" and
"works perfectly".
Just look at
http://www.jroller.com/page/jaimec?entry=sparc_linux

"Linux has almost nothing to make it perform better on NUMA architectures
and, despite the efforts of companies like RedHat, it's still lagging
Solaris (and probably Windows) in this."

"Just to clear some confusions, AMD is NUMA, Intel and the T1 aren't,
Solaris is NUMA aware, Linux isn't and Microsoft is trying to introduce
NUMA awareness into Windows."
> 
> > > 
> > > There is libNUMA to do manual intervention on how you want memory
> > > policies to be enforced (local first, round robin, local only) as well
> > > as assigning jobs to run on certain CPUs.
> > I guess that works for programs written for it, but what about normal
> > cases?
> 
> numactl
> 
> mh
>
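For unmodified programs, numactl wraps the command and sets the policy from outside. Here is a minimal sketch that only builds the command line, without running it. The flags used (`--cpunodebind`, `--membind`, `--interleave`, `--preferred`) are from the numactl man page; the helper `numactl_argv` and the `./app` program are hypothetical. Roughly, `--preferred` corresponds to "local first", `--interleave` to "round robin" and `--membind` to "local only".

```python
import shlex


def numactl_argv(program, *, cpu_nodes=None, membind=None,
                 interleave=None, preferred=None):
    """Build an argv list that runs `program` under numactl.

    Hypothetical helper. `program` is a list such as ["./app", "-v"];
    node arguments are strings in numactl syntax ("0", "0-1", "all").
    """
    memory_options = [
        ("--membind", membind),        # allocate only on these nodes
        ("--interleave", interleave),  # round robin across these nodes
        ("--preferred", preferred),    # try this node first, then fall back
    ]
    chosen = [(flag, value) for flag, value in memory_options if value is not None]
    if len(chosen) > 1:
        raise ValueError("pick at most one memory policy")

    argv = ["numactl"]
    if cpu_nodes is not None:
        # run only on the CPUs that belong to these nodes
        argv.append(f"--cpunodebind={cpu_nodes}")
    for flag, value in chosen:
        argv.append(f"{flag}={value}")
    argv.extend(program)
    return argv


if __name__ == "__main__":
    # Keep both the CPUs and the memory of ./app on node 0.
    print(shlex.join(numactl_argv(["./app"], cpu_nodes="0", membind="0")))
```

The resulting list can be handed to `subprocess.run` on a machine that has numactl installed.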