This patch hosts gem5 onto SystemC scheduler. There's already an upstream
review board patch that does something similar but this patch ...:
1) is less obtrusive to the existing gem5 code organisation. It's divided
into the 'generic' preparatory patches (already submitted) and this patch
which affects no existing files
2) does not try to exactly track the gem5 event queue with notifys into
SystemC and so doesn't requive the event queue to be modified for
anything other than 'out of event queue' scheduling events
3) supports debug logging with SC_REPORT
The patch consists of the files:
util/systemc/
sc_gem5_control.{cc,hh} -- top level objects to use to
instantiate gem5 Systems within
larger SystemC test harnesses as
sc_module objects
sc_logger.{cc,hh} -- logging support
sc_module.{cc,hh} -- a separated event loop specific to
SystemC
stats.{cc,hh} -- example Stats handling for the sample
top level
main.{cc,hh} -- a sample top level
On the downside this patch is only currently functional with C++
configuration at the top level.
The above sc_... files are indended to be compiled alongside gem5 (as a
library, see main.cc for a command line and util/systemc/README for
more details.)
The top-level system instantiation in sc_gem5_control.{cc,hh} provides
two classes: Gem5Control and Gem5System
Gem5Control is a simulation control class (from which a singleton
object should be created) derived from Gem5SystemC::Module which
carries the top level simulation control interface for gem5. This
includes hosting a system-building configuration file and
instantiating the Root object from that file.
Gem5System is a base class for instantiating renamed gem5 Systems
from the config file hosted by the Gem5Control object. In use, a
SystemC module class should be made which represents the desired,
instantiable gem5 System. That class's instances should create
a Gem5System during their construction, set the parameters of that
system and then call instantiate to build that system. If this
is all carried out in the sc_core::sc_module-derived classes
constructor, the System's external ports will become children of
that module and can then be recovered by name using sc_core::
sc_find_object.
It is intended that this interface is used with dlopen. To that
end, the header file sc_gem5_control.hh includes no other header
files from gem5 (and so can be easily copied into another project).
The classes Gem5System and Gem5Control have all their member
functions declared `virtual' so that those functions can be called
through the vtable acquired by building the top level Gem5Control
using dlsym(..., "makeGem5Control") and `makeSystem' on the
Gem5Control.
This patch adds the ability to load in config.ini files generated from
gem5 into another instance of gem5 built without Python configuration
support. The intended use case is for configuring gem5 when it is a
library embedded in another simulation system.
A parallel config file reader is also provided purely in Python to
demonstrate the approach taken and to provided similar functionality
for as-yet-unknown use models. The Python configuration file reader
can read both .ini and .json files.
C++ configuration file reading:
A command line option has been added for scons to enable C++ configuration
file reading: --with-cxx-config
There is an example in util/cxx_config that shows C++ configuration in action.
util/cxx_config/README explains how to build the example.
Configuration is achieved by the object CxxConfigManager. It handles
reading object descriptions from a CxxConfigFileBase object which
wraps a config file reader. The wrapper class CxxIniFile is provided
which wraps an IniFile for reading .ini files. Reading .json files
from C++ would be possible with a similar wrapper and a JSON parser.
After reading object descriptions, CxxConfigManager creates
SimObjectParam-derived objects from the classes in the (generated with this
patch) directory build/ARCH/cxx_config
CxxConfigManager can then build SimObjects from those SimObjectParams (in an
order dictated by the SimObject-value parameters on other objects) and bind
ports of the produced SimObjects.
A minimal set of instantiate-replacing member functions are provided by
CxxConfigManager and few of the member functions of SimObject (such as drain)
are extended onto CxxConfigManager.
Python configuration file reading (configs/example/read_config.py):
A Python version of the reader is also supplied with a similar interface to
CxxConfigFileBase (In Python: ConfigFile) to config file readers.
The Python config file reading will handle both .ini and .json files.
The object construction strategy is slightly different in Python from the C++
reader as you need to avoid objects prematurely becoming the children of other
objects when setting parameters.
Port binding also needs to be strictly in the same port-index order as the
original instantiation.
This patch adds a python script that processes the configuration and the
statistics file from a simulation run. Configuration and activity of network
routers and links obtained from this processing is fed to DSENT via its Python
interface. DSENT then computes the area and the power consumption of these
network components. The script outputs these quantities to the console.
Updated the stat_config.ini files to reflect new structure.
Moved to a more generic stat naming scheme that can easily handle
multiple CPUs and L2s by letting the script replace pre-defined #
symbols to CPU or L2 ids.
Removed the previous per_switch_cpus sections. Still can be used by
spelling out the stat names if necessary. (Resuming from checkpoints
no longer use switch_cpus. Only fast-forwarding does.)
Analogous to ee049bf (for x86). Requires a bump of the checkpoint version
and corresponding upgrader code to move the condition code register values
to the new register file.
This patch adds basic functionality to quickly visualise the output
from the DRAM efficiency script. There are some unfortunate hacks
needed to communicate the needed information from one script to the
other, and we fall back on (ab)using the simout to do this.
As part of this patch we also trim the efficiency sweep to stop at 512
bytes as this should be sufficient for all forseeable DRAMs.
There are some directories within the repository where we don't want
to enforce our coding style. Specifically, we don't want the style
hooks to warn whenever we update external code in the ext/ directory.
The 'hg m5style' command had some rather strange semantics. When
called without arguments, it applied the style checker to all added
files and modified regions of modified files. However, when providing
a list of files, it used that list as an ignore list instead of
specifically checking those files.
This patch makes the m5style command behave more like other Mercurial
commands where the arguments are used to specify which files to work
on instead of which files to ignore.
This patch adds a fix for older checkpoints before support for
multiple event queues were added in changeset 2cce74fe359e. The change
in checkpoint version should really hav ebeen part of the
aforementioned changeset.
There are cases where the state of a SortIncludes object gets messed
up and leaks between invocations/files. This typically happens when a
file ends with an include block (dump_block() gets called at the end
of __call__). In this case, the state of the class is not reset
between files. This bug manifests itself as ghost includes that leak
between files when applying the style hooks.
This changeset adds a reset at the beginning of the __call__ method
which ensures that the class is always in a clean state when
processing a new file.
This patch moves the code for opening an input protobuf packet trace into
a function defined in the protobuf library. This is because the code is
commonly used in decode scripts and is independent of the src protobuf
message.
This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).
The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).
Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.
Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.
Minor is faster than the o3 model. Sample results:
Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
This patch adds a first version of a script that processes the debug
output and generates a command trace for DRAMPower. This is work in
progress and is intended as a snapshot of ongoing work at this point.
The longer term plan is to link in DRAMPower as a library and have one
instance of the model per rank, and instantiate it based on a struct
passed from gem5. Each command will then be a call to the model and no
parsing of traces will be necessary.
The style checker used to traverse symlinks if they pointed to files, which can
result in style checker failure if the pointed-to file doesn't exist. This
style check is actually unnecessary, since symlinks either point to other files
that are already style checked, or files outside gem5, which shouldn't be
checked. Skip symlinks.
Upon aggregating records, serialize system's cache-block size, as the
cache-block size can be different when restoring from a checkpoint. This way,
we can correctly read all records when restoring from a checkpoints, even if
the cache-block size is different.
Note, that it is only possible to restore from a checkpoint if the
desired cache-block size is smaller or equal to the cache-block size
when the checkpoint was taken; we can split one larger request into
multiple small ones, but it is not reliable to do the opposite.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
1) fixes a typo for clean target libgemOpJni.so -> libgem5OpJni.so
2) addes jni_gem5Op.h to clean since it is added during make
3) links the m5 utility statically since it won't work on some images otherwise
This patch changes the decode script to output the optional fields of
the proto message Packet, namely id and flags. The flags field is set
by the communication monitor.
The id field is useful for CPU trace experiments, e.g. linking the
fetch side to decode side. It had to be renamed because it clashes
with a built in python function id() for getting the "identity" of an
object.
This patch also takes a few common function definitions out from the
multiple scripts and adds them to a protolib python module.
Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.
Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.
Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
The checkpoint aggregation script had become outdated due to numerous changes
to checkpoints over the past couple of years. This updates the script. It
now supports aggregation for x86 architecture instead of alpha. Also a couple
of new options have been added that specify the size of the memory file to be
created and whether or not the memory file should be compressed.
Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier
in arch/arm/kernel/thumbee.c. The Linux kernel code just seems
to be saving and restoring the register. This patch adds support
for the TEEHBR cp14 register. Note, this may be a special case
when restoring from an image that was run on a system that
supports ThumbEE.
Newer linux kernels and distros exercise more functionality in the IDE device
than previously, exposing 2 races. The first race is the handling of aborted
DMA commands would immediately report the device is ready back to the kernel
and cause already in flight commands to assert the simulator when they returned
and discovered an inconsitent device state. The second race was due to the
Status register not being handled correctly, the interrupt status bit would get
stuck at 1 and the driver eventually views this as a bad state and logs the
condition to the terminal. This patch fixes these two conditions by making the
device handle aborted commands gracefully and properly handles clearing the
interrupt status bit in the Status register.
Changes to make m5ops work under virtualization seemed to break them working
with non-virtualized systems and the recently added m5 fail command makes
the m5op binary not compile. For now remove the code for virtualization.
This Python script generates an ARM DS-5 Streamline .apc project based
on gem5 run. To successfully convert, the gem5 runs needs to be run
with the context-switch-based stats dump option enabled (The guest
kernel also needs to be patched to allow gem5 interrogate its task
information.) See help for more information.
In order to support m5ops in virtualized environments, we need to use
a memory mapped interface. This changeset adds support for that by
reserving 0xFFFF0000-0xFFFFFFFF and mapping those to the generic IPR
interface for m5ops. The mapping is done in the
X86ISA::TLB::finalizePhysical() which means that it just works for all
of the CPU models, including virtualized ones.
This patch makes it possible to once again build gem5 without any
ISA. The main purpose is to enable work around the interconnect and
memory system without having to build any CPU models or device models.
The regress script is updated to include the NULL ISA target. Currently
no regressions make use of it, but all the testers could (and perhaps
should) transition to it.
--HG--
rename : build_opts/NOISA => build_opts/NULL
rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts
rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh
rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
Using arm-linux-gnueabi-gcc 4.7.3-1ubuntu1 on Ubuntu 13.04 to compiled
the m5 binary yields the error:
m5op_arm.S: Assembler messages:
m5op_arm.S:85: Error: selected processor does not support ARM mode `bxj lr'
For each of of the SIMPLE_OPs. Apparently, this compiler doesn't like the
interworking of these code types for the default arch. Adding -march=armv7-a
makes it compile. Another alternative that I found to work is replacing the
bxj lr instruction with mov pc, lr, but I don't know how that affects the
KVM stuff and if bxj is needed.
This patch adds checkpointing support to x86 tlb. It upgrades the
cpt_upgrader.py script so that previously created checkpoints can
be updated. It moves the checkpoint version to 6.
This patch simplifies the usage of the packet trace encoder/decoder by
attempting to automatically generating the packet proto definitions in
case they cannot be found.
Changeset 5ca6098b9560 accidentally broke the m5 utility. This
changeset adds the missing co-processor call used to trigger the
pseudo-op in ARM mode and fixes an alignment issue that caused some
pseudo-ops to leave thumb mode.
This changeset adds support for m5 pseudo-ops when running in
kvm-mode. Unfortunately, we can't trap the normal gem5 co-processor
entry in KVM (it doesn't seem to be possible to trap accesses to
non-existing co-processors). We therefore use BZJ instructions to
cause a trap from virtualized mode into gem5. The BZJ instruction is
becomes a normal branch to the gem5 fallback code when running in
simulated mode, which means that this patch does not need to change
the ARM ISA-specific code.
Note: This requires a patched host kernel.
This patch adds a simple Python script that reads the protobuf-encoded
packet traces (not gzipped), and prints them to an ASCII trace file.
The script can also be used as a template for other packet output
formats.
This patch adds a simple Python script that reads a simple ASCII trace
format and encodes it as protobuf output compatible with the traffic
generator.
The script can also be used as a template for other packet input
formats that should be converted to the gem5 packet protobuf format.
Changeset 02321b16685f added m5_writefile to m5op_x86.S a second time,
which causes a compilation error on when compiling for x86. This
changeset reverts that changeset and fixes the error.
Fixes the tick used from rename:
- previously this gathered the tick on leaving rename which was always 1 less
than the dispatch. This conflated the decode ticks when back pressure built
in the pipeline.
- now picks up tick on entry.
Added --store_completions flag:
- will additionally display the store completion tail in the viewer.
- this highlights periods when large numbers of stores are outstanding (>16 LSQ
blocking)
Allows selection by tick range (previously this caused an infinite loop)
Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully.
For example, one can use:
/path/to/benchmark args || /sbin/m5 fail 1
and thus ensure gem5 will exit with an error if the benchmark fails.
The number of arguments specified when calling parse_int_args() in
do_exit() is incorrect. This leads to stack corruption since it causes
writes past the end of the ints array.
In order to see all registers independent of the current CPU mode, the
ARM architecture model uses the magic MISCREG_CPSR_MODE register to
change the register mappings without actually updating the CPU
mode. This hack is no longer needed since the thread context now
provides a flat interface to the register file. This patch replaces
the CPSR_MODE hack with the flat register interface.
After making the ISA an independent SimObject, it is serialized
automatically by the Python world. Previously, this just resulted in
an empty ISA section. This patch moves the contents of the ISA to that
section and removes the explicit ISA serialization from the thread
contexts, which makes it behave like a normal SimObject during
serialization.
Note: This patch breaks checkpoint backwards compatibility! Use the
cpt_upgrader.py utility to upgrade old checkpoints to the new format.