Commit graph

1198 commits

Author SHA1 Message Date
Mrinmoy Ghosh
9b05e96b9e BPred: Fix RAS to handle predicated call/return instructions.
Change RAS to fix issues with predicated call/return instructions.
Handled all cases in the life of a predicated call and return instruction.
2012-02-13 12:26:25 -06:00
Mrinmoy Ghosh
fd90c3676d BP: Fix several Branch Predictor issues.
1. Updates the Branch Predictor correctly to the state
   just after a mispredicted branch, if a squash occurs.
2. If a BTB does not find an entry, the branch is predicted not taken.
   The global history is modified to correctly reflect this prediction.
3. Local history is now updated at the fetch stage instead of
   execute stage.
4. In the Update stage of the branch predictor the local predictors are
   now correctly updated according to the state of local history during
   fetch stage.

This patch also improves performance by as much as 17% on some benchmarks
2012-02-13 12:26:24 -06:00
Andreas Hansson
5a9a743cfc MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
2012-02-13 06:43:09 -05:00
Anthony Gutierrez
542d0ceebc cpu: add separate stats for insts/ops both globally and per cpu model 2012-02-12 16:07:39 -06:00
Ali Saidi
8aaa39e93d mem: Add a master ID to each request object.
This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.
2012-02-12 16:07:38 -06:00
Nilay Vaish
6a7a6263e1 O3 CPU: Improve handling of delayed commit flag
The delayed commit flag is used in conjunction with interrupt pending flag to
figure out whether or not fetch stage should get more instructions. This patch
clears this flag when instructions are squashed. Also, in case an interrupt is
pending, currently it is not possible to access the instruction cache. This
patch allows accessing the cache in case this flag is set.
2012-02-10 08:37:31 -06:00
Nilay Vaish
cd765c23a2 O3 CPU: Strengthen condition for handling interrupts
The condition for handling interrupts is to check whether or not the cpu's
instruction list is empty. As observed, this can lead to cases in which even
though the instruction list is empty, interrupts are handled when they should
not be. The condition is being strengthened so that interrupts get handled only
when the last committed microop did not had IsDelayedCommit set.
2012-02-10 08:37:30 -06:00
Nilay Vaish
8f7e03d4cf O3 CPU: Provide the squashing instruction
This patch adds a function to the ROB that will get the squashing instruction
from the ROB's list of instructions. This squashing instruction is used for
figuring out the macroop from which the fetch stage should fetch the microops.
Further, a check has been added that if the instructions are to be fetched
from the cache maintained by the fetch stage, then the data in the cache should
be valid and the PC of the thread being fetched from is same as the address of
the cache block.
2012-02-10 08:37:28 -06:00
Nilay Vaish
0e597e944a O3 Fetch: Check if PC is pointing to Microcode ROM 2012-02-10 08:37:26 -06:00
Gabe Black
e80ebc308f SE/FS: Record the system pointer all the time for the simple CPU.
This pointer was only being stored in code that came from SE mode. The system
pointer is always meaningful and available, so it should always be stored.
2012-02-10 02:05:31 -08:00
Gabe Black
a6246bb047 Checker: Access workload element 0 only if there is an element 0. 2012-02-07 04:44:01 -08:00
Gabe Black
f2b46fdb85 Faults: Turn off arch/faults.hh
Because there are no longer architecture independent but specialized functions
in arch/XXX/faults.hh, code that isn't using the faults from a particular ISA
no longer needs to be able to include them through the switching header file
arch/faults.hh. By removing that header file (arch/faults.hh), the potential
interface between ISA code and non ISA code is narrowed.
2012-02-07 04:43:21 -08:00
Gabe Black
ea8b347dc5 Merge with head, hopefully the last time for this batch. 2012-01-31 22:40:08 -08:00
Koan-Sin Tan
7d4f187700 clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).

clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.
2012-01-31 12:05:52 -05:00
Andreas Hansson
4fdecae443 Thread: Use inherited baseCpu rather than cpu in SimpleThread
This patch is a trivial simplification, removing the cpu pointer from
SimpleThread and relying on the baseCpu pointer in ThreadState. The
patch does not add or change any functionality, it merely cleans up
the code.
2012-01-31 11:50:07 -05:00
Geoffrey Blake
af6aaf2581 CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5
Brings the CheckerCPU back to life to allow FS and SE checking of the
O3CPU.  These changes have only been tested with the ARM ISA.  Other
ISAs potentially require modification.
2012-01-31 07:46:03 -08:00
Gabe Black
e88165a431 Merge with main repository. 2012-01-30 21:07:57 -08:00
Andreas Hansson
ef9fc01073 MEM: Clean-up of Functional/Virtual/TranslatingPort remnants
This patch cleans up forward declarations and a member-function
prototype that still referred to the old FunctionalPort, VirtualPort
and TranslatingPort. There is no change in functionality.
2012-01-30 03:44:25 -05:00
Gabe Black
39f314cc15 Yet another merge with the main repository.
--HG--
rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini
rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/simout
rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt
rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal
rename : tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini => tests/long/se/00.gzip/ref/x86/linux/o3-timing/config.ini
rename : tests/long/00.gzip/ref/x86/linux/o3-timing/simout => tests/long/se/00.gzip/ref/x86/linux/o3-timing/simout
rename : tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt => tests/long/se/00.gzip/ref/x86/linux/o3-timing/stats.txt
rename : tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini => tests/long/se/10.mcf/ref/x86/linux/o3-timing/config.ini
rename : tests/long/10.mcf/ref/x86/linux/o3-timing/simout => tests/long/se/10.mcf/ref/x86/linux/o3-timing/simout
rename : tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt
rename : tests/long/20.parser/ref/x86/linux/o3-timing/config.ini => tests/long/se/20.parser/ref/x86/linux/o3-timing/config.ini
rename : tests/long/20.parser/ref/x86/linux/o3-timing/simout => tests/long/se/20.parser/ref/x86/linux/o3-timing/simout
rename : tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt => tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt
rename : tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini => tests/long/se/70.twolf/ref/x86/linux/o3-timing/config.ini
rename : tests/long/70.twolf/ref/x86/linux/o3-timing/simout => tests/long/se/70.twolf/ref/x86/linux/o3-timing/simout
rename : tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/70.twolf/ref/x86/linux/o3-timing/stats.txt
rename : tests/quick/00.hello/ref/x86/linux/o3-timing/config.ini => tests/quick/se/00.hello/ref/x86/linux/o3-timing/config.ini
rename : tests/quick/00.hello/ref/x86/linux/o3-timing/simout => tests/quick/se/00.hello/ref/x86/linux/o3-timing/simout
rename : tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt => tests/quick/se/00.hello/ref/x86/linux/o3-timing/stats.txt
2012-01-29 03:27:15 -08:00
Gabe Black
dc0e629ea1 Implement Ali's review feedback.
Try to decrease indentation, and remove some redundant FullSystem checks.
2012-01-29 02:04:34 -08:00
Nilay Vaish
5c2fc35e02 O3 CPU LSQ: Implement TSO
This patch makes O3's LSQ maintain total order between stores. Essentially
only the store at the head of the store buffer is allowed to be in flight.
Only after that store completes, the next store is issued to the memory
system. By default, the x86 architecture will have TSO.
2012-01-28 19:09:04 -06:00
Gabe Black
c3d41a2def Merge with the main repo.
--HG--
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-28 07:24:01 -08:00
Gabe Black
da2a4acc26 Merge yet again with the main repository. 2012-01-16 04:27:10 -08:00
Andreas Hansson
07cf9d914b MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.
2012-01-17 12:55:09 -06:00
Andreas Hansson
de34e49d15 MEM: Simplify ports by removing EventManager
This patch removes the inheritance of EventManager from the ports and
moves all responsibility for event queues to the owner. Eventually the
event manager should be the interface block, which could either be the
structural owner or a subblock like a LSQ in the O3 CPU for example.
2012-01-17 12:55:09 -06:00
Andreas Hansson
b3f930c884 CPU: Moving towards a more general port across CPU models
This patch performs minimal changes to move the instruction and data
ports from specialised subclasses to the base CPU (to the largest
degree possible). Ultimately it servers to make the CPU(s) have a
well-defined interface to the memory sub-system.
2012-01-17 12:55:08 -06:00
Andreas Hansson
f85286b3de MEM: Add port proxies instead of non-structural ports
Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort      > PortProxy
TranslatingPort     > SETranslatingPortProxy
VirtualPort         > FSTranslatingPortProxy

--HG--
rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-17 12:55:08 -06:00
Maximilien Breughe
a7394ad680 inorder: MDU deadlock fix 2012-01-12 10:15:00 -05:00
Nilay Vaish
9957035a42 DPRINTF: Improve some dprintf messages. 2012-01-10 10:15:02 -06:00
Anders Handler
b587d511c3 CPU: Remove Alpha-specific PC alignment check. 2012-01-09 20:05:07 -05:00
Ali Saidi
525d1e46dc O3: Remove some asserts that no longer seem to be valid. 2012-01-09 18:08:20 -06:00
Ali Saidi
d2c26f402c O3: Add support of function tracing with O3 CPU. 2012-01-09 18:08:20 -06:00
Andreas Hansson
c2dbfc1d6c MAC: Make gem5 compile and run on MacOSX 10.7.2
Adaptations to make gem5 compile and run on OSX 10.7.2, with a stock
gcc 4.2.1 and the remaining dependencies from macports, i.e. python
2.7,.2 swig 2.0.4, mercurial 2.0. The changes include an adaptation of
the SConstruct to handle non-library linker flags, and Darwin-specific
code to find the memory usage of gem5. A number of Ruby files relied
on ambigious uint (without the 32 suffix) which caused compilation
errors.
2012-01-09 18:08:20 -06:00
Gabe Black
241cc0c840 Another merge with the main repository. 2012-01-07 02:16:37 -08:00
Gabe Black
ec936364b7 Merge with the main repository again. 2012-01-07 02:15:35 -08:00
Gabe Black
36a822f08e Merge with main repository. 2012-01-07 02:10:34 -08:00
Nathan Binkert
6ef9691035 gcc: fix unused variable warnings from GCC 4.6.1
--HG--
extra : rebase_source : f9e22de341493a25ac6106c16ac35c61c128a080
2011-12-13 11:49:27 -08:00
Chris Emmons
5bde1d359f Output: Add hierarchical output support and cleanup existing codebase.
--HG--
extra : rebase_source : 3301137733cdf5fdb471d56ef7990e7a3a865442
2011-12-01 00:15:25 -08:00
Chander Sudanthi
61c14da751 O3: Remove hardcoded tgts_per_mshr in O3CPU.py.
There are two lines in O3CPU.py that set the dcache and icache
tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr.
This patch removes these hardcoded lines from O3CPU.py and sets the default
L1 cache mshr targets to 20.

--HG--
extra : rebase_source : 6f92d950e90496a3102967442814e97dc84db08b
2011-12-01 00:15:22 -08:00
Ali Saidi
946f7f0f55 ARM: Add support for having a TLB cache.
--HG--
extra : rebase_source : 7a5780ab74d7c294682738c7ccb3ce8d56c6fd63
2011-12-01 00:15:22 -08:00
Ali Saidi
1444103998 O3: Add stat that counts how many cycles the O3 cpu was quiesced.
--HG--
extra : rebase_source : 043b9307eef3c5b87f8e6370765641e016ed1fa7
2011-12-01 00:15:22 -08:00
Gabe Black
85424bef19 SE/FS: Get rid of includes of config/full_system.hh. 2011-11-18 02:20:22 -08:00
Gabe Black
de21bb93ea SE/FS: Get rid of FULL_SYSTEM in the CPU directory. 2011-11-18 01:33:28 -08:00
Nilay Vaish
a547cf34b9 Ruby: Remove some unused typedefs
This patch removes some of the unused typedefs. It also moves
some of the typedefs from Global.hh to TypeDefines.hh. The patch
also eliminates the file NodeID.hh.
2011-11-03 22:46:45 -05:00
Gabe Black
8b4a3f4070 SE/FS: Get rid of FULL_SYSTEM in sim. 2011-11-02 02:11:14 -07:00
Gabe Black
b6da5e2086 SE/FS: Get rid of uses of FULL_SYSTEM in Alpha. 2011-11-01 04:01:14 -07:00
Gabe Black
1268e0df1f SE/FS: Expose the same methods on the CPUs in SE and FS modes. 2011-11-01 04:01:13 -07:00
Gabe Black
8ad2b8c559 SE/FS: Make the functions available from the TC consistent between SE and FS. 2011-10-31 02:58:22 -07:00
Gabe Black
d735abe5da GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions.
2011-10-31 01:09:44 -07:00
Gabe Black
facb40f3ff SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs. 2011-10-30 00:33:02 -07:00