Commit graph

1608 commits

Author SHA1 Message Date
Mitch Hayenga a5c4eb3de9 isa,cpu: Add support for FS SMT Interrupts
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
2015-09-30 11:14:19 -05:00
Mitch Hayenga fafa83ed32 cpu: Add per-thread monitors
Adds per-thread address monitors to support FullSystem SMT.
2015-09-30 11:14:19 -05:00
Mitch Hayenga 582a0148b4 config,cpu: Add SMT support to Atomic and Timing CPUs
Adds SMT support to the "simple" CPU models so that they can be
used with other SMT-supported CPUs. Example usage: this enables
the TimingSimpleCPU to be used to warmup caches before swapping to
detailed mode with the in-order or out-of-order based CPU models.
2015-09-30 11:14:19 -05:00
Mitch Hayenga 52d521e433 cpu: Change thread assignments for heterogenous SMT
Trying to run an SE system with varying threads per core (SMT cores + Non-SMT
cores) caused failures due to the CPU id assignment logic.  The comment
about thread assignment (worrying about core 0 not having tid 0) seems
not to be valid given that our configuration scripts initialize them in
order.

This removes that constraint so a heterogenously threaded sytem can work.
2015-09-30 11:14:19 -05:00
Andrew Lukefahr 543efd5ca6 cpu: pred: Local Predictor Reset in Tournament Predictor
When a branch gets squashed, it's speculative branch predictor state should get
rolled back in squash().  However, only the globalHistory state was being
rolled back.  This patch adds (at least some) support for rolling back the
local predictor state also.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-09-15 08:14:07 -05:00
Hongil Yoon fb0f9884e2 cpu, o3: consider split requests for LSQ checksnoop operations
This patch enables instructions in LSQ to track two physical addresses for
corresponding two split requests. Later, the information is used in
checksnoop() to search for/invalidate the corresponding LD instructions.

The current implementation has kept track of only the physical address that is
referenced by the first split request. Thus, for checksnoop(), the line
accessed by the second request has not been considered, causing potential
correctness issues.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-09-15 08:14:06 -05:00
Nilay Vaish 4727fc26f8 ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t.
2015-08-29 10:19:23 -05:00
Andreas Hansson daaae3744d mem: Reflect that packet address and size are always valid
This patch simplifies the packet, and removes the possibility of
creating a packet without a valid address and/or size. Under no
circumstances are these fields set at a later point, and thus they
really have to be provided at construction time.

The patch also fixes a case there the MinorCPU creates a packet
without a valid address and size, only to later delete it.
2015-08-21 07:03:27 -04:00
Andreas Hansson ae06e9a5c6 cpu: Move invldPid constant from Request to BaseCPU
A more natural home for this constant.
2015-08-21 07:03:14 -04:00
Nilay Vaish 2f44dada68 ruby: reverts to changeset: bf82f1f7b040 2015-08-19 10:02:01 -05:00
Nilay Vaish a6f3f38f2c ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t.
2015-08-14 19:28:43 -05:00
Nilay Vaish 91a84c5b3c ruby: replace Address by Addr
This patch eliminates the type Address defined by the ruby memory system.
This memory system would now use the type Addr that is in use by the
rest of the system.
2015-08-14 12:04:51 -05:00
Nilay Vaish 759fe30d9f ruby: drop some redundant includes 2015-08-11 11:39:23 -05:00
Andreas Sandberg 53e777d683 base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.
2015-08-07 09:59:13 +01:00
David Hashe 8f71e667b3 cpu: Fixed a bug on where to fetch the next instruction from
Figure out if the next instruction to fetch comes from the micro-op ROM
or not. Otherwise, wrong instructions may be fetched.
2015-07-20 09:15:18 -05:00
Andreas Sandberg f789d729b5 cpu: Update debug message from Fetch1 isDrained() in Minor
Fix a spurious %s and include the state of the Fetch1 stage in the
debug printout.
2015-07-31 17:04:59 +01:00
Andreas Sandberg f73b05431a cpu: Fix Minor drain issues when switched out
The Minor CPU currently doesn't drain properly when it is switched
out. This happens because Fetch 1 expects to be in the FetchHalted
state when it is drained. However, because the CPU is switched out, it
is stuck in the FetchWaitingForPC state. Fix this by ignoring drain
requests and returning DrainState::Drained from MinorCPU::drain() if
the CPU is switched out. This is always safe since a switched out CPU,
by definition, doesn't have any instructions in flight.
2015-07-31 17:04:59 +01:00
Andreas Sandberg ff8195235e cpu: Only activate thread 0 in Minor if the CPU is active
Minor currently activates thread 0 in startup() to work around an
issue where activateContext() is called from LiveProcess before the
process entry point is known. When activateContext() is called, Minor
creates a branch instruction to the process's entry point. The first
time it is called, the branch points to an undefined location (0). The
call in startup() updates the branch to point to the actual entry
point.

When instantiating a switched out Minor CPU, it still tries to
activate thread 0. This is clearly incorrect since a switched out CPU
can't have any active threads. This changeset adds a check to ensure
that the thread is active before reactivating it.
2015-07-30 10:15:50 +01:00
Andreas Sandberg 473a0dcc63 cpu: Fix drain issues in the Minor CPU
The drain refactor patches introduced a couple of bugs in the way
Minor handles draining. This patch fixes an incorrect assert and a
case of infinite recursion when the CPU signals drain done.
2015-07-30 10:15:50 +01:00
Andreas Hansson 85a44e24dc cpu: Fix issue identified by UBSan 2015-07-30 03:41:22 -04:00
Nilay Vaish aafa5c3f86 revert 5af8f40d8f2c 2015-07-28 01:58:04 -05:00
Nilay Vaish 608641e23c cpu: implements vector registers
This adds a vector register type.  The type is defined as a std::array of a
fixed number of uint64_ts.  The isa_parser.py has been modified to parse vector
register operands and generate the required code.  Different cpus have vector
register files now.
2015-07-26 10:21:20 -05:00
Nilay Vaish 6e354e82d9 cpu: o3: slight correction to identation in rename_impl.hh 2015-07-26 10:20:07 -05:00
Brandon Potter bfe7ee96ad ruby: replace global g_abs_controls with per-RubySystem var
This is another step in the process of removing global variables
from Ruby to enable multiple RubySystem instances in a single simulation.

The list of abstract controllers is per-RubySystem and should be
represented that way, rather than as a global.

Since this is the last remaining Ruby global variable, the
src/mem/ruby/Common/Global.* files are also removed.
2015-07-10 16:05:24 -05:00
Andreas Sandberg ed38e3432c sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.

This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.

Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
2015-07-07 09:51:05 +01:00
Andreas Sandberg e9c3d59aae sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
2015-07-07 09:51:04 +01:00
Andreas Sandberg 76cd4393c0 sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:

  * Add a set of APIs to serialize into a subsection of the current
    object. Previously, objects that needed this functionality would
    use ad-hoc solutions using nameOut() and section name
    generation. In the new world, an object that implements the
    interface has the methods serializeSection() and
    unserializeSection() that serialize into a named /subsection/ of
    the current object. Calling serialize() serializes an object into
    the current section.

  * Move the name() method from Serializable to SimObject as it is no
    longer needed for serialization. The fully qualified section name
    is generated by the main serialization code on the fly as objects
    serialize sub-objects.

  * Add a scoped ScopedCheckpointSection helper class. Some objects
    need to serialize data structures, that are not deriving from
    Serializable, into subsections. Previously, this was done using
    nameOut() and manual section name generation. To simplify this,
    this changeset introduces a ScopedCheckpointSection() helper
    class. When this class is instantiated, it adds a new /subsection/
    and subsequent serialization calls during the lifetime of this
    helper class happen inside this section (or a subsection in case
    of nested sections).

  * The serialize() call is now const which prevents accidental state
    manipulation during serialization. Objects that rely on modifying
    state can use the serializeOld() call instead. The default
    implementation simply calls serialize(). Note: The old-style calls
    need to be explicitly called using the
    serializeOld()/serializeSectionOld() style APIs. These are used by
    default when serializing SimObjects.

  * Both the input and output checkpoints now use their own named
    types. This hides underlying checkpoint implementation from
    objects that need checkpointing and makes it easier to change the
    underlying checkpoint storage code.
2015-07-07 09:51:03 +01:00
Nilay Vaish 11a48faeb4 o3: correct the number of cc registers in rename map 2015-07-04 10:43:46 -05:00
Andreas Sandberg 7c4eb3b4d8 kvm, arm: Add support for aarch64
This changeset adds support for aarch64 in kvm. The CPU module
supports both checkpointing and online CPU model switching as long as
no devices are simulated by the host kernel. It currently has the
following limitations:

   * The system register based generic timer can only be simulated by
     the host kernel. Workaround: Use a memory mapped timer instead to
     simulate the timer in gem5.

   * Simulating devices (e.g., the generic timer) in the host kernel
     requires that the host kernel also simulates the GIC.

   * ID registers in the host and in gem5 must match for switching
     between simulated CPUs and KVM. This is particularly important
     for ID registers describing memory system capabilities (e.g.,
     ASID size, physical address size).

   * Switching between a virtualized CPU and a simulated CPU is
     currently not supported if in-kernel device emulation is
     used. This could be worked around by adding support for switching
     to the gem5 (e.g., the KvmGic) side of the device models. A
     simpler workaround is to avoid in-kernel device models
     altogether.
2015-06-01 19:44:19 +01:00
Andreas Sandberg dbfd6effe0 kvm, arm, dev: Add an in-kernel GIC implementation
This changeset adds a GIC implementation that uses the kernel's
built-in support for simulating the interrupt controller. Since there
is currently no support for state transfer between gem5 and the
kernel, the device model does not support serialization and CPU
switching (which would require switching to a gem5-simulated GIC).
2015-06-01 19:44:17 +01:00
Andreas Sandberg 8e7c0575dc kvm: Handle inst events at the current instruction count
There are cases (particularly when attaching GDB) when instruction
events are scheduled at the current instruction tick. This used to
trigger an assertion error in kvm. This changeset adds a check for
this condition and forces KVM to do a quick entry that completes any
pending IO operations, but does not execute any new instructions,
before servicing the event. We could check if we need to enter KVM at
all, but forcing a quick entry is makes the code slightly cleaner and
does not hurt correctness (performance is hardly an issue in these
cases).
2015-06-01 19:43:41 +01:00
Andreas Sandberg 06cf5cc60b kvm, arm: Move ARM-specific files to arch/arm/kvm/
This changeset moves the ARM-specific KVM CPU implementation to
arch/arm/kvm/. This change is expected to keep the source tree
somewhat cleaner as we start adding support for ARMv8 and KVM
in-kernel interrupt controller simulation.

--HG--
rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py
rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc
rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh
2015-06-01 19:43:40 +01:00
Andrew Bardsley cea1d14a93 cpu: Fix a bug in counting issued instructions in MinorCPU
The MinorCPU would count bubbles in Execute::issue as part of
the num_insts_issued and so sometimes reach the instruction
issue limit incorrectly.

Fixed by checking for a bubble in one new place.
2015-05-26 03:21:37 -04:00
Andreas Sandberg 5435f25ec8 kvm: Fix dumping code for large registers
The register dumping code in kvm tries to print the bytes in large
registers (128 bits and larger) instead of printing them as hex. This
changeset fixes that.
2015-05-23 13:37:22 +01:00
Andreas Sandberg ed447bbff9 kvm, x86: Guard x86-specific APIs in KvmVM
Protect x86-specific APIs in KvmVM with compile-time guards to avoid
breaking ARM builds.
2015-05-23 13:37:20 +01:00
Andreas Hansson bd583d00f9 misc: Appease gcc 5.1
Three minor issues are resolved:

1. Apparently gcc 5.1 does not like negation of booleans followed by
   bitwise AND.

2. Somehow the compiler also gets confused and warns about
   NoopMachInst being unused (removing it causes compilation errors
   though). Most likely a compiler bug.

3. There seems to be a number of instances where loop unrolling causes
   false positives for the array-bounds check. For now, switch to
   std::array. Potentially we could disable the warning for newer gcc
   versions, but switching to std::array is probably a good move in
   any case.
2015-05-15 13:39:53 -04:00
Andreas Sandberg 48281375ee mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.

This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:

    * Speculation: CPU models are not allowed to issue speculative
      loads.

    * Write combining: CPU models and caches are not allowed to merge
      writes to the same cache line.

Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
2015-05-05 03:22:33 -04:00
Andreas Hansson 36f29496a0 mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable
accesses. We do not separate uncacheable memory from uncacheable
devices, and in cases where it is really memory, there are valid
scenarios where we need to snoop since we do not support cache
maintenance instructions (yet). On snooping an uncacheable access we
thus provide data if possible. In essence this makes uncacheable
accesses IO coherent.

The snoop filter is also queried to steer the snoops, but not updated
since the uncacheable accesses do not allocate a block.
2015-05-05 03:22:29 -04:00
Andreas Hansson d0d933facc cpu: Work around gcc 4.9 issues with Num_OpClasses
This patch fixes a recent issue with gcc 4.9 (and possibly more) being
convinced that indices outside the array bounds are used when
initialising the FUPool members.
2015-05-05 03:22:19 -04:00
Nilay Vaish 4333549575 cpu: o3: replace issueLatency with bool pipelined
Currently, each op class has a parameter issueLat that denotes the cycles after
which another op of the same class can be issued.  As of now, this latency can
either be one cycle (fully pipelined) or same as execution latency of the op
(not at all pipelined).  The fact that issueLat is a parameter of type Cycles
makes one believe that it can be set to any value.  To avoid the confusion, the
parameter is being renamed as 'pipelined' with type boolean.  If set to true,
the op would execute in a fully pipelined fashion. Otherwise, it would execute
in an unpipelined fashion.
2015-04-29 22:35:22 -05:00
Nilay Vaish 0dbd696aae cpu: o3: single cycle default div microop latency on x86
This patch sets the default latency of the division microop to a single cycle
on x86.  This is because the division instructions DIV and IDIV have been
implemented as loops of div microops, where each microop computes a single bit
of the quotient.
2015-04-29 22:35:22 -05:00
Brandon Potter a70a83155b cpu: remove conditional check (count > 0) on o3 IQ squashes
The o3 cpu instruction queue model uses the count variable to track the number
of unissued instructions in the queue. Previously, the squash method used
this variable to avoid executing the doSquash method when there were no
unissued instructions in the pipeline.  A corner case problem exists when
only issued instructions exist in the pipeline and a squash occurs; the
doSquash code is not invoked and subsequently does not clean up state properly.
2015-04-22 07:52:03 -07:00
Andreas Hansson cd76e34056 cpu: Remove the InOrderCPU from the tree
This patch takes the final step in removing the InOrderCPU from the
tree. Rest in peace.

The MinorCPU is now used to model an in-order microarchitecture, and
long term the MinorCPU will eventually be renamed InOrderCPU.
2015-04-20 12:46:35 -04:00
Malek Musleh 826f69b470 config, cpu: fix progress interval for switched CPUs
This patch ensures that the CPU progress Event is triggered for the new set of
switched_cpus that get scheduled (e.g. during fast-forwarding). it also avoids
printing the interval state if the cpu is currently switched out.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-14 11:01:10 -05:00
Dibakar Gope 34ad1123ee cpu: re-organizes the branch predictor structure.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13 17:33:57 -05:00
Nikos Nikoleris 305e29b98e cpu: fix system total instructions accounting
The totalInstructions counter is only incremented when the whole instruction is
commited and not on every microop. It was incorrectly reset in atomic and
timing cpus.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
2015-04-03 11:42:10 -05:00
Andreas Hansson a196dbe3bf cpu: Fix InstPBTrace inheritance
This patch fixes an issue that prevented gem5 to be built with C++
config and without Python.
2015-03-26 11:16:43 -04:00
Steve Reinhardt 6677b9122a mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from
LLSC operations.  Using "locked" by itself should be
obviously ambiguous now.
2015-03-23 16:14:20 -07:00
Wendy Elsasser 9b4d8030e6 cpu: Fix TrafficGen message format
Fix erroneous message format for fatal error.
Previously, code did not have type indicator (% instead of %d).

Also removed redundant fatal check.

Ran modified sweep.py with in range and out of range values to test.
2015-03-19 04:06:12 -04:00
Steve Reinhardt ee0b52404c mem: restructure Packet cmd initialization a bit more
Refactor the way that specific MemCmd values are generated for packets.
The new approach is a little more elegant in that we assign the right
value up front, and it's also more amenable to non-heap-allocated
Packet objects.

Also replaced the code in the Minor model that was still doing it the
ad-hoc way.

This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
2015-02-11 10:48:50 -08:00