The change in 20.parser is from new x87 instructions. The change to
pc-o3-timing is not clear to me. It seems that this test might be invoking
some undefined behavior.
This patch ensures that the CPU progress Event is triggered for the new set of
switched_cpus that get scheduled (e.g. during fast-forwarding). It also avoids
printing the interval state if the CPU is currently switched out.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not
working, because ruby and DRAMCtrl disagreed on the current tick during warmup.
Since there is no reason to do timing requests during warmup, use functional
requests instead.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
This patch adds a connector that allows gem5 to be used as a component
in SST (Structural Simulation Toolkit, sst-simulator.org). At a high
level, this allows memory traffic to pass between the two simulators.
SST Links are roughly analogous to gem5 Ports, although Links do not
have a notion of master and slave. This distinction is important to
gem5, so when connecting a gem5 CPU to an SST cache, an ExternalSlave
must be used, and similarly when connecting the memory side of an SST cache
to a gem5 port (for memory <-> I/O), an ExternalMaster must be used.
These connectors handle the administrative aspects of gem5
(initialization, simulation, shutdown) as well as translating SST's
MemEvents into gem5 Packets and vice-versa.
Restoring from a checkpoint fails if either the RTC or the RTC Timer
Interrupt event is disabled. The restored machine incorrectly tried
to schedule the next event with a negative offset.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Add a 32-bit access width for the PrimaryTiming register and a 16-bit access
width for the UDMAControl register, as required by FreeBSD.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
The totalInstructions counter is only incremented when the whole instruction is
committed and not on every microop. It was incorrectly reset in atomic and
timing cpus.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
When running with the Exec flag, the mwait instruction attempted
to print out its source registers, which were never actually
initialized. This led to sporadic assertion failures when the
value stored there was invalid.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs)
that it could handle. Since master IDs need to be unique per system, and
every core, cache etc. requires a separate master port, a static limit on
these does not make much sense.
Instead, this patch adds a small hash map that will map all master IDs to
the right prefetch state and dynamically allocates new state for new master
IDs.
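The idea of mapping master IDs to lazily allocated prefetch state can be
sketched as follows. This is an illustrative Python model, not the actual
prefetcher code: the class names, the per-PC entry layout and the observe()
helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StrideEntry:
    """Per-PC stride tracking state (hypothetical, heavily simplified)."""
    last_addr: int = 0
    stride: int = 0


@dataclass
class ContextState:
    """Prefetch state owned by a single master ID."""
    entries: dict = field(default_factory=dict)  # pc -> StrideEntry


class StridePrefetchTable:
    def __init__(self):
        # master ID -> ContextState, replacing a fixed-size array of contexts.
        self._contexts = {}

    def context_for(self, master_id):
        """Return the state for master_id, allocating it on first use."""
        state = self._contexts.get(master_id)
        if state is None:
            state = ContextState()
            self._contexts[master_id] = state
        return state

    def observe(self, master_id, pc, addr):
        """Record an access and return the stride seen for this PC."""
        entries = self.context_for(master_id).entries
        entry = entries.setdefault(pc, StrideEntry(last_addr=addr))
        entry.stride = addr - entry.last_addr
        entry.last_addr = addr
        return entry.stride
```

Because state is created on demand, an arbitrarily large master ID works
without any static limit.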
This patch changes the order of writeback allocation such that any
writebacks resulting from a tag lookup (e.g. for an uncacheable
access), are added to the writebuffer before any new MSHR entries are
allocated. This ensures that the writebacks logically precede the new
allocations.
The patch also changes the uncacheable flush to use proper timed (or
atomic) writebacks, as opposed to functional writes.
This patch simplifies the code dealing with uncacheable timing
accesses, aiming to align it with the existing miss handling. Similar
to what we do in atomic, a timing request now goes through
Cache::access (where the block is also flushed), and then proceeds to
ignore any existing MSHR for the block in question. This unifies the
flow for cacheable and uncacheable accesses, and for atomic and timing.
This patch changes how we search for matching MSHRs, ignoring any MSHR
that is allocated for an uncacheable access. By doing so, this patch
fixes a corner case in the MSHRs where incorrect data ended up being
copied into a (cacheable) read packet due to a first uncacheable MSHR
target of size 4, followed by a cacheable target to the same MSHR of
size 64. The latter target was filled with nonsense data.
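A minimal sketch of the matching rule described above, assuming a simplified
MSHR record. The Mshr fields and the find_match() helper are hypothetical
stand-ins for illustration, not the real C++ interface.

```python
from dataclasses import dataclass


@dataclass
class Mshr:
    """Simplified MSHR record (illustrative fields only)."""
    blk_addr: int
    is_secure: bool
    uncacheable: bool


def find_match(mshrs, blk_addr, is_secure):
    """Return the first cacheable MSHR for the block, or None.

    MSHRs allocated for uncacheable accesses are skipped, so a cacheable
    request is never attached as a target to an uncacheable MSHR.
    """
    for mshr in mshrs:
        if mshr.uncacheable:
            continue
        if mshr.blk_addr == blk_addr and mshr.is_secure == is_secure:
            return mshr
    return None
```

With this rule, a cacheable 64-byte read to a block that already has an
uncacheable 4-byte MSHR gets its own entry instead of sharing that one.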
This patch removes the no-longer-needed
allocateUncachedReadBuffer. Besides the checks it is exactly the same
as allocateMissBuffer and thus provides no value.
This patch aligns all MSHR queue entries to block boundaries to
simplify checks for matches. Previously there were corner cases that
could lead to existing entries not being identified as matches.
There are, rather alarmingly, a few regressions that change with this
patch.
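The block alignment used for matching can be sketched as below. This is a
minimal Python illustration, assuming a power-of-two block size; the helper
names are made up for the example.

```python
def block_align(addr, blk_size):
    """Round addr down to the start of its cache block."""
    if blk_size <= 0 or blk_size & (blk_size - 1):
        raise ValueError("block size must be a power of two")
    return addr & ~(blk_size - 1)


def same_block(addr_a, addr_b, blk_size):
    """Two addresses match if they fall in the same aligned block."""
    return block_align(addr_a, blk_size) == block_align(addr_b, blk_size)
```

Storing only aligned addresses means a match is a plain equality test, with
no need to reason about offsets or sizes within a block.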
This patch folds the PREFETCH_SNOOP_SQUASH flag into the more
generic BLOCK_CACHED flag. Future patches implementing cache eviction
messages can use the BLOCK_CACHED flag in almost the same manner as
hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The
PREFETCH_SNOOP_SQUASH flag is set if the prefetch target is found in the
tags or the MSHRs in any state, so we are simply replacing calls to
setPrefetchSquashed() with setBlockCached(). The case where the
prefetch target is found in the writeback MSHRs of upper level caches
continues to be covered by the MEM_INHIBIT flag.
Currently if there are shell special characters in a
command-line argument, you can't copy and paste the
echoed command line onto a shell prompt because the
characters aren't quoted properly. This patch fixes
that problem.
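One way to produce a paste-safe command line in Python is shlex.quote from
the standard library. The sketch below only illustrates the technique; it is
not necessarily how this patch implements it, and quote_command_line is a
name chosen for the example.

```python
import shlex


def quote_command_line(argv):
    """Join argv into a single string that a POSIX shell parses back to argv."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

Arguments made only of safe characters are left untouched, while anything
containing spaces or shell metacharacters is wrapped in single quotes.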
When using gem5 as a slave simulator, it will not advance the
clock on its own and depends on the master simulator calling
simulate(). This new option lets us use the Python scripts
to do all the configuration while stopping short of actually
simulating anything.
This patch accomplishes two things:
1. Makes simulate()'s GlobalSimLoopExitEvent a singleton reused
across calls. This is slightly more efficient than recreating
it every time.
2. Gives callers of simulate() (especially other simulators) a
foolproof way of knowing that the simulation period ended
successfully by hitting the limit event. They can call
getLimitEvent() and compare it to the return
value of simulate().
This change was motivated by an ongoing effort to integrate gem5
and SST, with SST as the master sim and gem5 as the slave sim.
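The two points above can be modelled with a toy sketch. This is not the gem5
implementation: the Python names below merely mirror simulate() and
getLimitEvent(), and the pending_exit parameter is a hypothetical stand-in
for any other exit event that might fire before the limit.

```python
class GlobalSimLoopExitEvent:
    """Toy exit event; only the fields needed for the example."""

    def __init__(self, cause):
        self.cause = cause
        self.when = None  # tick at which the event fires


_limit_event = None  # created once, then reused by every simulate() call


def get_limit_event():
    """Return the singleton limit event, creating it on first use."""
    global _limit_event
    if _limit_event is None:
        _limit_event = GlobalSimLoopExitEvent("simulate() limit reached")
    return _limit_event


def simulate(limit_tick, pending_exit=None):
    """Return whichever exit event fires first (toy model)."""
    limit = get_limit_event()
    limit.when = limit_tick  # reschedule the reused event
    if pending_exit is not None and pending_exit.when < limit.when:
        return pending_exit
    return limit
```

A master simulator can then test `simulate(n) is get_limit_event()` to tell
a normal end of the period from any other exit cause.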
A recent changeset of mine (http://repo.gem5.org/gem5/rev/4cfe55719da5)
inadvertently fixed a bug in the Minor CPU model which caused it to treat
software prefetches as regular loads. Prior to this changeset, Minor
did an ad-hoc generation of memory commands that left out the PF check;
because it now uses the common code that the other CPU models use,
it generates prefetches properly. These stat changes reflect the fact
that the Minor model now issues SoftPFReqs.
Add a set of scripts to automatically test checkpointing in the
regression framework. The checkpointing tests are similar to the
switcheroo tests, but instead of switching between CPUs, they
checkpoint the system and restore from the checkpoint again. This is
done at regular intervals, typically while booting Linux.
The implementation is fairly straightforward, with the exception that
we have to work around gem5's inability to restore from a checkpoint
after a system has been instantiated. We work around this by forking
off child processes that do the actual simulation and never
instantiate a system in the parent process unless a maximum checkpoint
count is reached (in which case we just simulate the system to
completion in the parent).
Checkpoint testing is currently only enabled for 32- and 64-bit ARM
systems using atomic CPUs.
Note: An unfortunate side-effect of forking is that every new process
will overwrite the stats and terminal output from the previous
process. This means that the output directory only contains data from
the last checkpoint.
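The fork-based workaround can be sketched roughly as follows. This is a
simplified illustration, not the regression scripts themselves: the exit code
CHECKPOINTED, the callbacks and the function names are all assumptions, and
the real scripts drive gem5 rather than plain Python callables.

```python
import os

CHECKPOINTED = 2  # hypothetical child exit code: checkpoint taken, more to do


def run_in_child(work):
    """Run work() in a forked child and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: do the work and leave without returning to the caller.
        try:
            code = work()
        except BaseException:
            code = 1
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1


def run_checkpoint_test(run_interval, run_to_completion, max_checkpoints):
    """Simulate each interval in a fresh child; the parent stays clean."""
    for index in range(max_checkpoints):
        code = run_in_child(lambda: run_interval(index))
        if code != CHECKPOINTED:
            return code  # child finished or failed
    # Checkpoint limit reached: simulate to completion in the parent.
    return run_to_completion()
```

Each child starts from a parent that has never instantiated a system, which
is what makes restoring from the previous checkpoint possible.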
This patch adds a random option to memtest.py which allows the user to
easily test valid random tree topologies. The patch also adds a
wrapper script to run soak tests using the newly introduced option.
We also adjust the progress interval and progress limit check to make
the output less noisy, and avoid false positives.
Bring on the pain.
This patch adds a new PIO-accessible GICv2m shim. This shim has a PIO
slave port on one side, and SPI 'wires' on the other. It accepts MSIs
from the system and triggers SPIs on the GIC. It is configurable with
a number of frames, each of which has a number of SPIs and a base SPI
offset.
A Linux driver for GICv2m is available upstream.
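The frame configuration can be pictured with a small model. This is only a
sketch under stated assumptions: the class and field names are invented for
the example, and it assumes the MSI payload carries the SPI number directly,
which the frame then checks against its own SPI window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gicv2mFrame:
    """One MSI frame: a contiguous window of SPIs (illustrative fields)."""
    spi_base: int  # first SPI number owned by this frame
    spi_len: int   # number of SPIs owned by this frame

    def owns(self, spi):
        return self.spi_base <= spi < self.spi_base + self.spi_len


def msi_to_spi(frame, msi_data):
    """Return the SPI to raise for an MSI write to this frame, or None."""
    return msi_data if frame.owns(msi_data) else None
```

For example, two frames of 32 SPIs each with bases 64 and 96 would cover
SPIs 64 to 127 without overlap.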
This patch removes the code that added this magic register. A
follow-up patch provides a GICv2m MSI shim that gives the same
functionality in a standard ARM system architecture way.
This patch enables users to specify --os-type on the command
line. This option is used to take specific actions for an OS type,
such as changing the kernel command line. This patch is part of the
Android KitKat enablement.
Fix erroneous message format for fatal error.
Previously, the format string lacked a type indicator (% instead of %d).
Also removed a redundant fatal check.
Ran a modified sweep.py with in-range and out-of-range values to test.
The CommMonitor by default only allows memory traces to be gathered in
timing mode. This patch allows memory traces to be gathered in atomic
mode if all one needs is a functional trace of memory addresses used
and timing information is of secondary concern.
For some reason we were checking mshr->hasTargets() even though
we had already called mshr->getTarget() unconditionally earlier
in the same function (which asserts if there are no targets).
Get rid of this useless check, and while we're at it get rid
of the redundant call to mshr->getTarget(), since we still have
the value saved in a local var.
Refactor the way that specific MemCmd values are generated for packets.
The new approach is a little more elegant in that we assign the right
value up front, and it's also more amenable to non-heap-allocated
Packet objects.
Also replaced the code in the Minor model that was still doing it the
ad-hoc way.
This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
The 'if (writebacks.size)' check was redundant, because
writeBuffer.findMatches() would return false if the
writebacks list was empty.
Also renamed 'mshr' to 'wb_entry' in this context since
we are pointing at a writebuffer entry and not an MSHR
(even though it's the same C++ class).