Commit graph

2957 commits

Author SHA1 Message Date
Brandon Potter 3fa311e5ac syscall_emul: add many Linux kernel flags 2016-03-17 10:22:39 -07:00
Brandon Potter b8688346a5 syscall_emul: rename OpenFlagTransTable struct
The structure definition only had the open system call flag set in mind when
it was named, so we rename it here with the intention of using it to define
additional tables to translate flags for other system calls in the future.
2016-03-17 10:22:39 -07:00
Nathanael Premillieu f9cae4ae58 arm: Fix disasm printing
Fix the printDataInst function to properly print the immediate value.
2016-03-16 16:08:24 +00:00
Andreas Sandberg 5383e1ada4 base: Add support for changing output directories
This changeset adds support for changing the simulator output
directory. This can be useful when the simulation goes through several
stages (e.g., a warming phase, a simulation phase, and a verification
phase) since it allows the output from each stage to be located in a
different directory. Relocation is done by calling core.setOutputDir()
from Python or simout.setOutputDirectory() from C++.

This change affects several parts of the design of the gem5's output
subsystem. First, files returned by an OutputDirectory instance (e.g.,
simout) are of the type OutputStream instead of a std::ostream. This
allows us to do some more book keeping and control re-opening of files
when the output directory is changed. Second, new subdirectories are
OutputDirectory instances, which should be used to create files in
that sub-directory.

Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se>
[sascha.bischoff@arm.com: Rebased patches onto a newer gem5 version]
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2015-11-27 14:41:59 +00:00
Stephan Diestelhorst f703160e5a mem, cpu: Add assertions to snoop invalidation logic
This patch adds assertions that enforce that only invalidating snoops
will ever reach into the logic that tracks in-order load completion and
also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds
some comments to MSHR::replaceUpgrades().
2015-08-10 11:25:52 +01:00
Mitch Hayenga c0d19391d4 arm: Squash after returning from exceptions in v7
Properly done for the ERET instruction in v8, but not for v7.
Many control register changes are only visible after explicit
instruction synchronization barriers or exception entry/exit.
This means mode changing instructions should squash any
younger in-flight speculative instructions.
2016-02-29 19:13:13 -06:00
Andreas Hansson 4619f0ee8b scons: Add missing override to appease clang
Make clang happy...again.
2016-02-23 03:27:20 -05:00
Andreas Hansson 0d50979888 misc: Add missing overrides to appease clang
Since the last round of fixes a few new issues have snuck in. We
should consider switching the regression runs to clang.
2016-02-15 03:40:32 -05:00
Michael LeBeane 8d923c7380 syscall_emul: Implement clock_getres() system call
This patch implements the clock_getres() system call for arm and x86 in linux
SE mode.
2016-02-13 12:33:07 -05:00
Alexandru Dutu 0f27d70e90 x86: revamp cmpxchg8b/cmpxchg16b implementation
The previous implementation did a pair of nested RMW operations,
which isn't compatible with the way that locked RMW operations are
implemented in the cache models.  It was convenient though in that
it didn't require any new micro-ops, and supported cmpxchg16b using
64-bit memory ops.  It also worked in AtomicSimpleCPU where
atomicity was guaranteed by the core and not by the memory system.
It did not work with timing CPU models though.

This new implementation defines new 'split' load and store micro-ops
which allow a single memory operation to use a pair of registers as
the source or destination, then uses a single ldsplit/stsplit RMW
pair to implement cmpxchg.  This patch requires support for 128-bit
memory accesses in the ISA (added via a separate patch) to support
cmpxchg16b.
2016-02-06 17:21:20 -08:00
Steve Reinhardt 5200e04e92 arch, x86: add support for arrays as memory operands
Although the cache models support wider accesses, the ISA descriptions
assume that (for the most part) memory operands are integer types,
which makes it difficult to define instructions that do memory accesses
larger than 64 bits.

This patch adds some generic support for memory operands that are arrays
of uint64_t, and specifically a 'u2qw' operand type for x86 that is an
array of 2 uint64_ts (128 bits).  This support is unused at this point,
but will be needed shortly for cmpxchg16b.  Ideally the 128-bit SSE
memory accesses will also be rewritten to use this support.

Support for 128-bit accesses could also have been added using the gcc
__int128_t extension, which would have been less disruptive.  However,
although clang also supports __int128_t, it's still non-standard.
Also, more importantly, this approach creates a path to defining
256- and 512-byte operands as well, which will be useful for eventual
AVX support.
2016-02-06 17:21:20 -08:00
Steve Reinhardt f5343df1e1 arch: get rid of dummy var init
MemOperand variables were being initialized to 0
"to avoid 'uninitialized variable' errors" but these
no longer seem to be a problem (with the exception of
one use case in POWER that is arguably broken and
easily fixed here).

Getting rid of the initialization is necessary to
set up a subsequent patch which extends memory
operands to possibly not be scalars, making the
'= 0' initialization no longer feasible.
2016-02-06 17:21:20 -08:00
Steve Reinhardt 92b750d5ef syscall_emul: fix bug in aux vector initialization
Writing 16 bytes from an 8-byte source value is a bad idea.
This doesn't appear to have broken anything, but showed up
as spurious differences when tracediffing runs.
2016-02-06 17:21:20 -08:00
Steve Reinhardt f6b828d068 style: eliminate explicit boolean comparisons
Result of running 'hg m5style --skip-all --fix-control -a' to get
rid of '== true' comparisons, plus trivial manual edits to get
rid of '== false'/'== False' comparisons.

Left a couple of explicit comparisons in where they didn't seem
unreasonable:
invalid boolean comparison in src/arch/mips/interrupts.cc:155
>>        DPRINTF(Interrupt, "Interrupts OnCpuTimerINterrupt(tc) == true\n");<<
invalid boolean comparison in src/unittest/unittest.hh:110
>>            "EXPECT_FALSE(" #expr ")", (expr) == false)<<
2016-02-06 17:21:20 -08:00
Steve Reinhardt 2d91e741e8 x86: create function to check miscreg validity
In the process of trying to get rid of an '== false' comparison,
it became apparent that a slightly more involved solution was
needed.  Split this out into its own changeset since it's not
a totally trivial local change like the others.
2016-02-06 17:21:20 -08:00
Steve Reinhardt 5592798865 style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'.
2016-02-06 17:21:19 -08:00
Steve Reinhardt dc8018a5c3 style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'.
2016-02-06 17:21:18 -08:00
Tony Gutierrez 1a7d3f9fcb gpu-compute: AMD's baseline GPU model 2016-01-19 14:28:22 -05:00
Steve Reinhardt 1b6355c895 cpu. arch: add initiateMemRead() to ExecContext interface
For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.

This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.

This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().

For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level.  For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.
2016-01-17 18:27:46 -08:00
Steve Reinhardt e595d9cccb arch: don't call *Timing functions from *Atomic versions
The readMemAtomic/writeMemAtomic helper functions were calling
readMemTiming/writeMemTiming respectively.  This is functionally
correct, since the *Timing functions are doing the same access
initiation operation as the *Atomic functions (just that the
*Atomic versions also complete the access in line).  It also
provides for some (very minimal) code reuse.  Unfortunately,
it's potentially pretty confusing, since it makes it look like
the atomic accesses are somehow being converted to timing
accesses.  It also gets in the way of specializing the timing
interface (as will be done in a future patch).
2016-01-17 18:27:46 -08:00
Steve Reinhardt fb0383bc72 arch: get rid of unused LargestRead typedef 2016-01-17 18:27:46 -08:00
Steve Reinhardt 28a0e5a165 sim: don't ignore SIG_TRAP
By ignoring SIG_TRAP, using --debug-break <N> when not connected to
a debugger becomes a no-op.  Apparently this was intended to be a
feature, though the rationale is not clear.

If we don't ignore SIG_TRAP, then using --debug-break <N> when not
connected to a debugger causes the simulation process to terminate
at tick N.  This is occasionally useful, e.g., if you just want to
collect a trace for a specific window of execution then you can combine
this with --debug-start to do exactly that.

In addition to not ignoring the signal, this patch also updates
the --debug-break help message and deletes a handful of unprotected
calls to Debug::breakpoint() that relied on the prior behavior.
2016-01-17 18:27:46 -08:00
Andreas Hansson 12eb034378 scons: Enable -Wextra by default
Make best use of the compiler, and enable -Wextra as well as
-Wall. There are a few issues that had to be resolved, but they are
all trivial.
2016-01-11 05:52:20 -05:00
Gabor Dozsa e677494260 pseudo inst,util: Add optional key to initparam pseudo instruction
The key parameter can be used to read out various config parameters from
within the simulated software.
2016-01-07 16:33:47 -06:00
Boris Shingarov d765dbf22c arm: remote GDB: rationalize structure of register offsets
Currently, the wire format of register values in g- and G-packets is
modelled using a union of uint8/16/32/64 arrays.  The offset positions
of each register are expressed as a "register count" scaled according
to the width of the register in question.  This results in counter-
intuitive and error-prone "register count arithmetic", and some
formats would even be altogether unrepresentable in such model, e.g.
a 64-bit register following a 32-bit one would have a fractional index
in the regs64 array.
Another difficulty is that the array is allocated before the actual
architecture of the workload is known (and therefore before the correct
size for the array can be calculated).

With this patch I propose a simpler mechanism for expressing the
register set structure.  In the new code, GdbRegCache is an abstract
class; its subclasses contain straightforward structs reflecting the
register representation.  The determination whether to use e.g. the
AArch32 vs. AArch64 register set (or SPARCv8 vs SPARCv9, etc.) is made
by polymorphically dispatching getregs() to the concrete subclass.
The subclass is not instantiated until it is needed for actual
g-/G-packet processing, when the mode is already known.

This patch is not meant to be merged in on its own, because it changes
the contract between src/base/remote_gdb.* and src/arch/*/remote_gdb.*,
so as it stands right now, it would break the other architectures.
In this patch only the base and the ARM code are provided for review;
once we agree on the structure, I will provide src/arch/*/remote_gdb.*
for the other architectures; those patches could then be merged in
together.

Review Request: http://reviews.gem5.org/r/3207/
Pushed by Joel Hestness <jthestness@gmail.com>
2015-12-18 15:12:07 -06:00
Andreas Sandberg 6a05179e13 arm, config: Automatically discover available platforms
Add support for automatically discover available platforms. The
Python-side uses functionality similar to what we use when
auto-detecting available CPU models. The machine IDs have been updated
to match the platform configurations. If there isn't a matching
machine ID, the configuration scripts default to -1 which Linux uses
for device tree only platforms.
2015-12-04 00:19:05 +00:00
Andreas Sandberg a1aeff27ce arm: Add support for automatic boot loader selection
Add support for automatically selecting a boot loader that matches the
guest system's kernel. Instead of accepting a single boot loader, the
ArmSystem class now accepts a vector of boot loaders. When
initializing a system, the we now look for the first boot loader with
an architecture that matches the kernel.

This changeset makes it possible to use the same system for both
64-bit and 32-bit kernels.
2015-12-03 23:53:37 +00:00
Nathanael Premillieu bbdd7cecb9 arm: Fix fplib 128-bit shift operators
Appease clang.
2015-11-22 05:10:18 -05:00
Swapnil Haria 08cec03f8e x86: Invalidating TLB entry on page fault
As per the x86 architecture specification, matching TLB entries need to be
invalidated on a page fault. For instance, after a page fault due to inadequate
protection bits on a TLB hit, the TLB entry needs to be invalidated. This
behavior is clearly specified in the x86 architecture manuals from both AMD and
Intel.  This invalidation is missing currently in gem5, due to which linux
kernel versions 3.8 and up cannot be simulated efficiently. This is exposed by
a linux optimisation in commit e4a1cc56e4d728eb87072c71c07581524e5160b1, which
removes a tlb flush on updating page table entries in x86.

Testing: Linux kernel versions 3.8 onwards were booting very slowly in FS mode,
due to repeated page faults (~300000 before the first print statement in a
bash file). Ensured that page fault rate drops drastically and observed
reduction in boot time from order of hours to minutes for linux kernel v3.8
and v3.11
2015-11-16 05:08:54 -06:00
Bjoern A. Zeeb f50e92d2c7 x86: cpuid: add family to warn() message
doCpuid() has to identical warn messages about unimplemented functions.  Add
the family to the log message to make them distinguishable.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-16 04:58:39 -06:00
Bjoern A. Zeeb 5c49635f20 x86: pagetable walker: fix typo in comment 2015-11-16 04:58:39 -06:00
Palle Lyckegaard a95e8ab887 sparc: Make remote debugging with gdb work
Remove sparc V8 TBR register from list of registers since it is not part of
sparc V9. This brings the number of registers in sync with what gdb expects

Without this patch gdb complains about receoved packet too long.

with this patch gdb is able to work properly with gem5 for remote debugging.

Note: gdb is version 7.8
Note: gdb is configured with --target=sparc64-sun-solaris2.8

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-16 04:58:39 -06:00
Nathanael Premillieu e6a6d6445b arm: Add secure flag to TableWalker request when needed 2015-10-29 08:48:26 -04:00
Victor Garcia 8427d05daa kvm, arm: Fix compilation errors due to API changes
The checkpoint changes, along with the SMT patches have changed a
number of APIs. Adapt the ArmKvmCPU accordingly.
2015-10-29 08:48:23 -04:00
Boris Shingarov 58cb57bacc power: Implement Remote GDB 2015-10-25 16:01:52 -07:00
Andreas Hansson b48ed9b6c2 x86: Add missing explicit overrides for X86 devices
Make clang >= 3.5 happy when compiling build/X86/gem5.opt on OSX.
2015-10-23 09:51:12 -04:00
Andreas Hansson 2ac04c11ac misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.

As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).
2015-10-12 04:08:01 -04:00
Andreas Hansson 22c04190c6 misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.
2015-10-12 04:07:59 -04:00
Rekai Gonzalez Alberquilla d3d159749a isa: Add parameter to pick different decoder inside ISA
The decoder is responsible for splitting instructions in micro
operations (uops). Given that different micro architectures may split
operations differently, this patch allows to specify which micro
architecture each isa implements, so different cores in the system can
split instructions differently, also decoupling uop splitting
(microArch) from ISA (Arch). This is done making the decodification
calls templates that receive a type 'DecoderFlavour' that maps the
name of the operation to the class that implements it. This way there
is only one selection point (converting the command line enum to the
appropriate DecodeFeatures object). In addition, there is no explicit
code replication: template instantiation hides that, and the compiler
should be able to resolve a number of things at compile-time.
2015-10-09 14:50:54 -05:00
Steve Reinhardt 90c279e4b1 arch: clean up isa_parser error handling
Although some decent error messages were getting generated inside
isa_parser.py, they weren't always getting printed because of the
screwy way we were handling exceptions.  (Basically an inner
exception would get hidden by an outer exception, and the more
informative inner error message would not get printed.)

Also line numbers were messed up, since they were taken from the
lexer, which is typically a token (or more) ahead of the grammar
rule that's being matched.  Using the 'lineno' attribute that
PLY associates with the grammar production is more accurate.
The new LineTracker class extends lineno to track filenames as
well as line numbers.
2015-10-06 17:26:50 -07:00
Steve Reinhardt a2c875c746 x86: implement rcpps and rcpss SSE insts
These are packed single-precision approximate reciprocal operations,
vector and scalar versions, respectively.

This code was basically developed by copying the code for
sqrtps and sqrtss.  The mrcp micro-op was simplified relative to
msqrt since there are no double-precision versions of this operation.
2015-10-06 17:26:50 -07:00
Steve Reinhardt 57b9f53afa x86: implement fild, fucomi, and fucomip x87 insts
fild loads an integer value into the x87 top of stack register.
fucomi/fucomip compare two x87 register values (the latter
also doing a stack pop).
These instructions are used by some versions of GNU libstdc++.
2015-10-06 17:26:50 -07:00
Mitch Hayenga ccf4f6c3d7 arm: Change TLB Software Caching
In ARM, certain variables are only updated when a necessary change is
detected.  Having 2 SMT threads share a TLB resulted in these not being
updated as required.  This patch adds a thread context identifer to
assist in the invalidation of these variables.
2015-09-30 11:14:19 -05:00
Mitch Hayenga 9e07a7504c cpu,isa,mem: Add per-thread wakeup logic
Changes wakeup functionality so that only specific threads on SMT
capable cpus are woken.
2015-09-30 11:14:19 -05:00
Mitch Hayenga a5c4eb3de9 isa,cpu: Add support for FS SMT Interrupts
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
2015-09-30 11:14:19 -05:00
Mitch Hayenga e255fa053f arm: SMT MPIDR Setting
Changes assignment of the MPIDR for multi-threaded systems only.
2015-09-30 11:14:19 -05:00
Palle Lyckegaard 3de9def6c1 sparc: writing to tick_cmpr should not cause a panic
This register is writable according to UA2005

Tried to boot NetBSD which starts the kernel by writing to the tick_cmpr
register.  Without the patch gem5 crashes with a panic.  With the patch NetBSD
starts to boot normally (although sun4v support in NetBSD is not complete yet)

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-09-15 08:14:07 -05:00
Andreas Hansson 6eb434c8a2 arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to
MISCREG_LOCKFLAG and so the request flag is no longer needed. The
corresponding functionality in the cache tags is also removed.
2015-08-21 07:03:25 -04:00
Andreas Hansson ae06e9a5c6 cpu: Move invldPid constant from Request to BaseCPU
A more natural home for this constant.
2015-08-21 07:03:14 -04:00
David Hashe a2d9aae3c3 x86: x86 instruction-implementation bug fixes
Added explicit data sizes and an opcode type for correct execution.
2015-07-20 09:15:18 -05:00