If multiple memory operations to the same page are miss the TLB they are
all inserted into the page table queue and before this change could result
in multiple uncessesary walks as well as duplicate enteries being inserted
into the TLB.
Replace the use of off_t in the various DiskImage related classes with
std::streampos. off_t is a signed 32 bit integer on most 32-bit
systems, whereas std::streampos is normally a 64 bit integer on most
modern systems. Furthermore, std::streampos is the type used by
tellg() and seekg() in the standard library, so it should have been
used in the first place. This patch makes it possible to use disk
images larger than 2 GiB on 32 bit systems with a modern C++ standard
library.
setMiscReg currently makes a new entry for each write to a misc reg without
checking for duplicates, this can cause a triggering of the assert if an
instruction get replayed and writes to the same misc regs multiple times.
This fix prevents duplicate entries and instead updates the value.
The rename can mis-handle serializing instructions (i.e. strex) if it gets
into a resource constrained situation and the serializing instruction has
to be placed on the skid buffer to handle blocking. In this situation the
instruction informs the pipeline it is serializing and logs that the next
instruction must be serialized, but since we are blocking the pipeline
defers this action to place the serializing instruction and
incoming instructions into the skid buffer. When resuming from blocking,
rename will pull the serializing instruction from the skid buffer and
the current logic will see this as the "next" instruction that has to
be serialized and because of flags set on the serializing instruction,
it passes through the pipeline stage as normal and resets rename to
non-serializing. This causes instructions to follow the serializing inst
incorrectly and eventually leads to an error in the pipeline. To fix this
rename should check first if it has to block before checking for serializing
instructions.
This patch merely adopts a more strict use of const for the cache
member functions and variables, and also moves a large portion of the
member functions from public to protected.
This patch adds two fuctions to m5.util, warn and inform, which mirror those
found in the C++ side of gem5. These are added in addition to the already
existing m5.util.panic and m5.util.fatal which already mirror the C++
functionality. This ensures that warning and information messages generated
by python are in the same format as those generated by C++.
Occurrences of
print "Warning: %s..." % name
have been replaced with
warn("%s...", name)
Fixes the tick used from rename:
- previously this gathered the tick on leaving rename which was always 1 less
than the dispatch. This conflated the decode ticks when back pressure built
in the pipeline.
- now picks up tick on entry.
Added --store_completions flag:
- will additionally display the store completion tail in the viewer.
- this highlights periods when large numbers of stores are outstanding (>16 LSQ
blocking)
Allows selection by tick range (previously this caused an infinite loop)
This patch moves the GIC interface to a separate base class and makes
all interrupt devices use that base class instead of a pointer to the
PL390 implementation. This allows us to have multiple GIC
implementations. Future implementations will allow in-kernel GIC
implementations when using hardware virtualization.
--HG--
rename : src/dev/arm/gic.cc => src/dev/arm/gic_pl390.cc
rename : src/dev/arm/gic.hh => src/dev/arm/gic_pl390.hh
Virtualized CPUs and the fastmem mode of the atomic CPU require direct
access to physical memory. We currently require caches to be disabled
when using them to prevent chaos. This is not ideal when switching
between hardware virutalized CPUs and other CPU models as it would
require a configuration change on each switch. This changeset
introduces a new version of the atomic memory mode,
'atomic_noncaching', where memory accesses are inserted into the
memory system as atomic accesses, but bypass caches.
To make memory mode tests cleaner, the following methods are added to
the System class:
* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
* isTimingMode() -- True if the memory mode is 'timing'.
* bypassCaches() -- True if caches should be bypassed.
The old getMemoryMode() and setMemoryMode() methods should never be
used from the C++ world anymore.
CPUs need to test that the memory system is in the right mode in two
places, when the CPU is initialized (unless it's switched out) and on
a drainResume(). This led to some code duplication in the CPU
models. This changeset introduces the verifyMemoryMode() method which
is called by BaseCPU::init() if the CPU isn't switched out. The
individual CPU models are responsible for calling this method when
resuming from a drain as this code is CPU model specific.
The default cache configuration script currently import the O3_ARM_v7a
model configuration, which depends on the O3 CPU. This breaks if gem5
has been compiled without O3 support. This changeset removes the
dependency by only importing the model if it is requested by the
user. As a bonus, it actually removes some code duplication in the
configuration scripts.
CPU switching consists of the following steps:
1. Drain the system
2. Switch out old CPUs (cpu.switchOut())
3. Change the system timing mode to the mode the new CPUs require
4. Flush caches if switching to hardware virtualization
5. Inform new CPUs of the handover (cpu.takeOverFrom())
6. Resume the system
m5.switchCpus() previously only did step 2 & 5. Since information
about the new processors' memory system requirements is now exposed,
do all of the steps above.
This patch adds automatic memory system switching and flush (if
needed) to switchCpus(). Additionally, it adds optional draining to
switchCpus(). This has the following implications:
* changeToTiming and changeToAtomic are no longer needed, so they have
been removed.
* changeMemoryMode is only used internally, so it is has been renamed
to be private.
* switchCpus requires a reference to the system containing the CPUs as
its first parameter.
WARNING: This changeset breaks compatibility with existing
configuration scripts since it changes the signature of
m5.switchCpus().
The CPUs supported by the configuration scripts used to be
hard-coded. This was not ideal for several reasons. For example, the
configuration scripts depend on all CPU models even though only a
subset might have been compiled.
This changeset adds a new module to the configuration scripts that
automatically discovers the available CPU models from the compiled
SimObjects. As a nice bonus, the use of introspection allows us to
automatically generate a list of available CPU models suitable for
printing. This list is augmented with the Python doc string from the
underlying class if available.
Checker CPUs currently don't inherit from the CheckerCPU in the Python
object hierarchy. This has two consequences:
* It makes CPU model discovery from the Python world somewhat
complicated as there is no way of testing if a CPU is a checker.
* Parameters are duplicated in the checker configuration
specification.
This changeset makes all checker CPUs inherit from the base checker
CPU class.
The configuration scripts currently hard-code the requirements of each
CPU. This is clearly not optimal as it makes writing new configuration
scripts painful and adding new CPU models requires existing scripts to
be updated. This patch adds the following class methods to the base
CPU and all relevant CPUs:
* memory_mode -- Return a string describing the current memory mode
(invalid/atomic/timing).
* require_caches -- Does the CPU model require caches?
* support_take_over -- Does the CPU support CPU handover?
The explict tests in the follwing fp comparison operations were
incorrect as they checked for only signaling NaNs and not quite-NaNs
as well. When compiled with gcc, the comparison generates a fp exception
that causes the FE_INVALID flag to be set and we check for it, so even
though the check was incorrect, the correct exception was set. With clang
this behavior seems to not occur. The checks are updated to test for nans and
the behavior is now correct with both clang and gcc.
Clang generated executables would enter the if condition when it wasn't
supposted to, resulting in the wrong simulated behavior.
Implementing the operation this way is a bit faster anyway.
Fix a case in the O3 CPU where the decode stage blocks and unblocks in a
single cycle sending both signals to fetch which causes an assert or worse.
The previous check could never work before since the status was set to Blocked
before a test for the status being Unblocking was executed.
Check if an instruction just enabled interrupts and we've previously had an
interrupt pending that was not handled because interrupts were subsequently
disabled before the pipeline reached a place to handle the interrupt. In that
case squash now to make sure the interrupt is handled.
IPython is used for the interactive gem5 shell if it exists. IPython
made API changes in version 0.11. This patch adds support for IPython
version 0.11 and above.
--HG--
extra : rebase_source : 5388d0919adb58d97f49a1a637db48cba61283a3
The transition for state MII and event Store was found missing during testing.
The transition is being added. The controller will not stall the Store request
in state MII
This patch allows ruby to have multiple clock domains. As I understand
with this patch, controllers can have different frequencies. The entire
network needs to run at a single frequency.
The idea is that with in an object, time is treated in terms of cycles.
But the messages that are passed from one entity to another should contain
the time in Ticks. As of now, this is only true for the message buffers,
but not for the links in the network. As I understand the code, all the
entities in different networks (simple, garnet-fixed, garnet-flexible) should
be clocked at the same frequency.
Another problem is that the directory controller has to operate at the same
frequency as the ruby system. This is because the memory controller does
not make use of the Message Buffer, and instead implements a buffer of its
own. So, it has no idea of the frequency at which the directory controller
is operating and uses ruby system's frequency for scheduling events.
This patch is as of now the final patch in the series of patches that replace
Time with Cycles.This patch further replaces Time with Cycles in Sequencer,
Profiler, different protocols and related entities.
Though Time has not been completely removed, the places where it is in use
seem benign as of now.
The patch started of with replacing Time with Cycles in the Consumer class.
But to get ruby to compile, the rest of the changes had to be carried out.
Subsequent patches will further this process, till we completely replace
Time with Cycles.
This patch modifies the Histogram class' add() function so that it can add
linear histograms as well. The function assumes that the left end point of
the ranges of the two histograms are the same. It also assumes that when
the ranges of the two histogram are changed to accomodate an element not in
the range, the factor used in changing the range is same for both the
histograms.
This function is then used in removing one of the calls to the global
profiler*. The histograms for recording the delays incurred in processing
different requests are now maintained by the controllers. The profiler
adds these histograms when it needs to print the stats.
This patch does several things. First, the counter for fully busy cycles for a
controller is now kept with in the controller, instead of being part of the profiler.
Second, the topology class no longer keeps an array of controllers which was only
used for printing stats. Instead, ruby system will now ask each controller to print
the stats. Thirdly, the statistical variable for recording how many different types
were created is being moved in to the controller from the profiler. Note that for
printing, the profiler will collate results from different controllers.
Prior to this changeset, we used to clear sys.argv before entering the
IPython shell. This caused some versions of IPython to crash because
they assume argv[0] to exist. The correct way of overriding the
arguments passed to IPython is to set the argv keyword argument when
initializing the shell.
The run() method in Simulation.py used to call sys.exit() when the
simulator exits. This is undesirable when user has requested the
simulator to be run in interactive mode since it causes the simulator
to exit rather than entering the interactive Python environment.
The number of bits required for an address was set to floorLog2(memory size).
This is correct under the assumption that the memory size is a power of 2,
which is not always true. Hence, floorLog2 is being replaced with ceilLog2.
This patch moves the default DRAM parameters from the SimpleDRAM class
to two different subclasses, one for DDR3 and one for LPDDR2. More can
be added as we go forward.
The regressions that previously used the SimpleDRAM are now using
SimpleDDR3 as this is the most similar configuration.
This patch adds two additional scheduling constraints to the DRAM
controller model, to constrain the activation rate. The two metrics
are determine the size of the activation window in terms of the number
of activates and the minimum time required for that number of
activates. This maps to current DDRx, LPDDRx and WIOx standards that
have either tFAW (4 activate window) or tTAW (2 activate window)
scheduling constraints.
This patch changes how the data bus busy time is calculated such that
it is delayed to the actual scheduling time of the request as opposed
to being done as soon as possible.
This patch changes a bunch of statistics, and the stats update is
bundled together with the introruction of tFAW/tTAW and the named DRAM
configurations like DDR3 and LPDDR2.