This patch moves the addition of network options into the Ruby module
to avoid the regressions all having to add it explicitly. Doing this
exposes an issue in our current config system though, namely the fact
that addtoPath is relative to the Python script being executed. Since
both example and regression scripts use the Ruby module we would end
up with two different (relative) paths being added. Instead we take a
first step at turning the config modules into Python packages, simply
by adding a __init__.py in the configs/ruby, configs/topologies and
configs/network subdirectories.
As a result, we can now add the top-level configs directory to the
Python search path, and then use the package names in the various
modules. The example scripts are also updated, and the messy
path-deducing variations in the scripts are unified.
The generic timer needs a pointer to an ArmSystem to wire itself to the
system register handler. This was previously specified as an instance
of System that was later cast to ArmSystem. Make this more robust by
specifying it as an ArmSystem in the Python interface and add a check
to make sure that it is non-NULL.
Change-Id: I989455e666f4ea324df28124edbbadfd094b0d02
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
This patch adds port direction names to the links during topology
creation, which can be used for better printed names for the links
or for users to code up their own adaptive routing algorithms.
It also adds support for every router to have an independent latency
value to support heterogeneous topologies with the subsequent
garnet2.0 patch.
This patch makes the internal links within the network topology
unidirectional, thus allowing any deadlock-free routing algorithms to
be specified from the topology itself using weights.
This patch also renames Mesh.py and MeshDirCorners.py to
Mesh_XY.py and MeshDirCorners_XY.py (Mesh with XY routing).
It also adds a Mesh_westfirst.py and CrossbarGarnet.py topologies.
networktest is essentially a collection of synthetic traffic patterns
for the network. The protocol name and the tester having the same name
led to multiple python configuration files with the same name, adding
confusion. This patch renames networktest to garnet_synthetic_traffic,
and also adds more synthetic traffic patterns.
Over the past 6 years, we realized that the protocol is essentially used
to run the garnet network in a standalone manner, and feed standard synthetic
traffic patterns through it.
Instead of scheduling another event, this patch adds a warning in case gdb
is attached multiple times and the first attachement event has not been
processed yet.
This patch adds a method to the Wavefront class to compute the actual workgroup
size. This can be different from the maximum workgroup size specified when
launching the kernel through the NDRange object. Current solution is still not
optimal, as we are computing these for each wavefront and the dispatcher also
needs to have this information and can't actually call
Wavefront::computeActuallWgSz before the wavefronts are being created. A long
term solution would be to have a Workgroup class that deals with all these
details.
Adjust the traffic generator time-out so that the script works out of
the box
Change-Id: I6b3b6b11f98b094ae3acdbe09488c26e4aeb0ab4
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
When loading a checkpoint, it's sometimes desirable to be able to test
whether an entry within a secion exists. This is currently done
automatically in the UNSERIALIZE_OPT_SCALAR macro, but it isn't
possible to do for arrays, containers, or enums. Instead of adding
even more macros, add a helper function (CheckpointIn::entryExists())
that tests for the presence of an entry.
Change-Id: I4b4646b03276b889fd3916efefff3bd552317dbc
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
The drain did not wait until stages were ready again. Therefore, as a
result of messages in the TimeBuffer being drain, the state after the
drain was not consistent and asserts fired in some places when the
draining happened after a stage got blocked, but before the notification
arrived to the previous stages.
Change-Id: Ib50b3b40b7f745b62c1eba2931dec76860824c71
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
The memtest and memcheck are not designed to test timing. Make them
functional only to make ref diffs less noisy in the future.
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
The switcheroo tests only really serve to check functional
correctness. Checking for stat differences in them just increases the
size of reference diffs.
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Align configuration with new SST change [1] requiring units for
memHierarchy's backend.mem_size parameter.
[1] c901abb4e7
Change-Id: I19fa09bec8aa453dc52d154598a4ebb20ea304d8
This patch adds methods to serialize the context of a particular wavefront
to the simulated system memory. Context serialization is used when a wavefront
is preempeted (i.e. context switch).
std::stack has no iterators, therefore the reconvergence stack can't be
iterated without poping elements off. We will be using std::list instead to be
able to iterate for saving and restoring purposes.
WFContext struct is currently unused and it has been rendered not useful in
saving and restoring the context of a Wavefront. Wavefront class should be
sufficient for that purpose and the runtime can figure out the memory size
it will need to allocate for a Wavefront through an IOCTL.
Switcheroo and checkpoint tests should generally be considered to be
successful if they run to completion. Remove all reference output
files from the switcheroo and checkopint tests to make them purely
functional.
Change-Id: I70b47853bd662b7a33716d9e0d2154b16077f9dc
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Modify the ClassicTest class to only emit a stat verification test
unit if there is a reference stat file. This makes it possible to
design tests that don't care about stat changes.
To generate purely functional tests, we need to be able to create
empty test reference directories. This does not work well with many
revision control systems. As a workaround, add a file named EMPTY to
the list of ignored files in the test harness. This file can be used
as a placeholder in otherwise empty test directories.
Change-Id: I583c8c4e55479f0d48fa99d0b0d1eac9221e6652
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
This change adds a Trace CPU param to exit simulation early,
i.e. when the first (any one) trace execution is complete. With
this change the user gets a choice to configure exit as either
when the last CPU finishes (default) or first CPU finishes
replay. Configuring an early exit enables simulating and
measuring stats strictly when memory-system resources are being
stressed by all Trace CPUs.
Change-Id: I3998045fdcc5cd343e1ca92d18dd7f7ecdba8f1d
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
This change subtracts the time offset present in the trace from
all the event times when nodes and request are sent so that the
replay starts immediately when the simulation starts. This makes
the stats accurate when the time offset in traces is large, for
example when traces are generated in the middle of a workload
execution. It also solves the problem of unnecessary DRAM
refresh events that would keep occuring during the large time
offset before even a single request is replayed into the system.
Change-Id: Ie0898842615def867ffd5c219948386d952af7f7
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
This change adds a simple feature to scale the frequency of
the Trace CPU.
The compute delays in the input traces provide timing. This
change adds a freqency multiplier parameter to the Trace CPU
set to 1.0 by default. The compute delay is manipulated to
effectively achieve the frequency at which the nodes become
ready and thus scale the frequency of the Trace CPU.
Change-Id: Iaabbd57806941ad56094fcddbeb38fcee1172431
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
This patch refactors the configuration file to use a more
object-oriented design.
Change-Id: I44ac2d063c2b5901f385544fb6ce3f259459cb05
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
This patch enables timing accesses for KVM cpu. A new state,
RunningMMIOPending, is added to indicate that there are outstanding timing
requests generated by KVM in the system. KVM's tick() is disabled and the
simulation does not enter into KVM until all outstanding timing requests have
completed. The main motivation for this is to allow KVM CPU to perform MMIO
in Ruby, since Ruby does not support atomic accesses.
Normal MMAPPED_IPR requests are allowed to execute speculatively under the
assumption that they have no side effects. The special case of m5ops that are
treated like MMAPPED_IPR should not be allowed to execute speculatively, since
they can have side-effects. Adding the STRICT_ORDER flag to these requests
blocks execution until the associated instruction hits the ROB head.
The quiesce family of magic ops can be simplified by the inclusion of
quiesceTick() and quiesce() functions on ThreadContext. This patch also
gets rid of the FS guards, since suspending a CPU is also a valid
operation for SE mode.
This patch introduces the DmaCallback helper class, which registers a callback
to fire after a sequence of (potentially non-contiguous) DMA transfers on a
DmaPort completes.
Connecting basic blocks would stop too early in kernels where ret was not the
last instruction. This patch allows basic blocks after the ret instruction
to be properly connected.