Commit graph

4977 commits

Author SHA1 Message Date
Gabe Black
2e4fb3f139 X86: Mark IO reads and writes as non-speculative. 2011-03-01 22:42:59 -08:00
Gabe Black
72d35701e9 X86: Mark prefetches as such in their instruction and request flags. 2011-03-01 22:42:18 -08:00
Nilay Vaish
3a10b200f7 Ruby: Fix DPRINTF bugs in PerfectSwitch and MessageBuffer
At a couple of places in PerfectSwitch.cc and MessageBuffer.cc, DPRINTF()
has not been provided with correct number of arguments. The patch fixes these
bugs.
2011-03-01 15:26:11 -06:00
Gabe Black
993e83ef80 Ruby: Mention that Ruby's bound checking option only applies to Ruby. 2011-03-01 02:59:09 -08:00
Gabe Black
d3214c5c5e X86: If PCI config space is disabled, pass through to regular IO addresses. 2011-02-27 16:25:06 -08:00
Gabe Black
0ce5d31159 X86: Use regular read requests in the walker instead of read exclusive. 2011-02-27 16:24:10 -08:00
Nathan Binkert
586564895f getopt: Remove GPL code.
This code is unused and should never have been committed
2011-02-26 21:43:11 -08:00
Nilay Vaish
a4c038764d Ruby: Remove store buffer
This patch removes the store buffer from Ruby. It is not in use currently.
Since libruby is being removed and the store buffer makes calls to libruby, it is not
possible to maintain it until substantial changes are made.
2011-02-25 17:55:20 -06:00
Nilay Vaish
e7edd270aa Ruby: Remove libruby
This patch removes libruby_internal.hh, libruby.hh and libruby.cc. It moves
the contents of libruby.hh to RubyRequest.hh and RubyRequest.cc files.
2011-02-25 17:54:56 -06:00
Nilay Vaish
6bf7153104 Ruby: Make Address.hh independent of RubySystem
This patch changes Address.hh so that it is not dependent on RubySystem.
This dependence seems unnecessary. All those functions that depend on
RubySystem have been moved to Address.cc file.
2011-02-25 17:51:56 -06:00
Nilay Vaish
80b3886475 Ruby: Make DataBlock.hh independent of RubySystem
This patch changes DataBlock.hh so that it is not dependent on RubySystem.
This dependence seems unnecessary. All those functions that depend on
RubySystem have been moved to DataBlock.cc file.
2011-02-25 17:51:02 -06:00
Timothy M. Jones
a10685ad1e O3CPU: Fix iqCount and lsqCount SMT fetch policies.
Fixes two of the SMT fetch policies in O3CPU that were returning the count
of instructions in the IQ or LSQ rather than the thread ID to fetch from.
2011-02-25 13:50:29 +00:00
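The bug class described in this commit can be illustrated with a minimal sketch. The names `ThreadID`, `InvalidThreadID`, and `leastOccupiedThread` are assumptions made for illustration and are not taken from the O3CPU source; the point is only that a count-based policy must return the ID of the winning thread, not the count that made it win.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using ThreadID = int;                      // hypothetical alias for this sketch
constexpr ThreadID InvalidThreadID = -1;   // hypothetical "no thread" marker

// Pick the thread with the fewest entries in some queue (IQ or LSQ).
// The return value is the thread's ID, never the winning count itself.
ThreadID leastOccupiedThread(const std::vector<unsigned> &countPerThread)
{
    ThreadID best = InvalidThreadID;
    for (std::size_t tid = 0; tid < countPerThread.size(); ++tid) {
        if (best == InvalidThreadID ||
            countPerThread[tid] < countPerThread[best]) {
            best = static_cast<ThreadID>(tid);
        }
    }
    return best;
}
```

With per-thread counts of 7, 3, and 9, the sketch selects thread 1; returning the count (3) instead would name a thread that does not exist in a three-thread configuration.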
Brad Beckmann
12a05c23b7 ruby: automate permission setting
This patch integrates permissions with cache and memory states, and then
automates the setting of permissions within the generated code.  No longer
does one need to manually set the permissions within the setState function.
This patch will facilitate easier functional access support by always correctly
setting permissions for both cache and memory states.

--HG--
rename : src/mem/slicc/ast/EnumDeclAST.py => src/mem/slicc/ast/StateDeclAST.py
rename : src/mem/slicc/ast/TypeFieldEnumAST.py => src/mem/slicc/ast/TypeFieldStateAST.py
2011-02-23 16:41:59 -08:00
Brad Beckmann
7842e95519 MOESI_hammer: cache probe address clean up 2011-02-23 16:41:58 -08:00
Brad Beckmann
3bc33eeaea ruby: cleaned up access permission enum 2011-02-23 16:41:58 -08:00
Brad Beckmann
c09a33e5d5 ruby: removed unsupported protocol files 2011-02-23 16:41:26 -08:00
Korey Sewell
0a74246fb9 inorder: InstSeqNum bug
Because int and not InstSeqNum was used in a couple of places, you can
overflow the int type and thus get weird bugs when the sequence number
is negative (or some weird value)
2011-02-23 16:35:18 -05:00
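A small sketch of why the narrower type is a problem, assuming for illustration that `InstSeqNum` is a 64-bit unsigned integer (the commit does not state its width). The helpers `fitsInInt` and `nextSeqNum` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption for this sketch: sequence numbers are 64-bit unsigned values.
using InstSeqNum = std::uint64_t;

// Hypothetical helper: true if the value can be stored in a 32-bit int
// without changing, i.e. it does not exceed INT32_MAX.
bool fitsInInt(InstSeqNum seq)
{
    return seq <=
        static_cast<InstSeqNum>(std::numeric_limits<std::int32_t>::max());
}

// Sequence numbers only ever grow by one per instruction.
InstSeqNum nextSeqNum(InstSeqNum seq) { return seq + 1; }
```

Once a long simulation passes 2^31 - 1 instructions, the next sequence number no longer fits in a 32-bit `int`, which is the point at which code holding it in an `int` starts to misbehave.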
Korey Sewell
3e1ad73d08 inorder: dyn inst initialization
remove constructors that weren't being used (it just gets confusing)
use initialization list for all the variables instead of relying on initVars()
function
2011-02-23 16:35:04 -05:00
Korey Sewell
e0a021005d inorder: cache packet handling
- use a pointer to CacheReqPacket instead of PacketPtr so correct destructors
get called on packet deletion
- make sure to delete the packet if the cache blocks the sendTiming request
or for some reason we don't use the packet
- don't overwrite memory requests since in the worst case an instruction will
be replaying a request so no need to keep allocating a new request
- we don't use retryPkt so delete it
- fetch code was split out already, so just assert that this is a memory
reference inst. and that the staticInst is available
2011-02-23 16:30:45 -05:00
Ali Saidi
057598843a Mem: Print out memory when access > 8 bytes 2011-02-23 15:10:50 -06:00
Ali Saidi
2eb19dac65 ARM: Set ITSTATE correctly after FlushPipe 2011-02-23 15:10:50 -06:00
Ali Saidi
916c7f162d ARM: This panic can be hit during misspeculation so it can't exist. 2011-02-23 15:10:50 -06:00
Ali Saidi
1201c5a134 ARM: Bad interworking warn way too noisy when running real code w/misspeculation. 2011-02-23 15:10:50 -06:00
Ali Saidi
f9d4d9df1b O3: When a prefetch causes a fault, don't record it in the inst 2011-02-23 15:10:50 -06:00
Giacomo Gabrielli
7ee2de31c4 ARM: NEON instruction templates modified to set the predicate flag to false when needed. 2011-02-23 15:10:50 -06:00
Ali Saidi
3de8e0a0d4 O3: If there is an outstanding table walk don't let the inst queue sleep.
If there is an outstanding table walk and no other activity in the CPU
it can go to sleep and never wake up. This change makes the instruction
queue always active if the CPU is waiting for a store to translate.

If Gabe changes the way this code works then the below should be removed
as indicated by the todo.
2011-02-23 15:10:49 -06:00
Ali Saidi
326191adc9 ARM: Squash state on FPSCR stride or len write. 2011-02-23 15:10:49 -06:00
Matt Horsnell
bb319a589e ARM: Mark store conditionals as such. 2011-02-23 15:10:49 -06:00
Ali Saidi
7391ea6de6 ARM: Do something for ISB, DSB, DMB 2011-02-23 15:10:49 -06:00
Ali Saidi
ae3d456855 ARM: Fix bug that let two table walks occur in parallel. 2011-02-23 15:10:49 -06:00
Ali Saidi
f05f35df99 Includes: Don't include isa_traits.hh and use the TheISA namespace unless really needed. 2011-02-23 15:10:49 -06:00
Ali Saidi
805ad4ba41 ARM: Make Noop actually decode to a noop and set its instflags. 2011-02-23 15:10:49 -06:00
Ali Saidi
68bd80794c O3: Fix bug when a squash occurs right before TLB miss returns.
In this case we need to throw away the TLB miss, not assume it was the
one we were waiting for.
2011-02-23 15:10:49 -06:00
Ali Saidi
e572cf93ee ARM: Delete OABI syscall handling.
We only support EABI binaries, so there is no reason to support OABI syscalls.
The loader detects OABI calls and fatal() so there is no reason to even check
here.
2011-02-23 15:10:48 -06:00
Ali Saidi
511c637ab0 CLCD: Fix some serialization bugs with the clcd controller. 2011-02-23 15:10:48 -06:00
Ali Saidi
e2a6275c03 ARM: Add support for read of 100MHz clock in system controller. 2011-02-23 15:10:48 -06:00
Ali Saidi
2157b9976b ARM: Reset simulation statistics when perf counters are reset.
The ARM performance counters are not currently supported by the model.
This patch interprets a 'reset performance counters' command to mean 'reset
the simulator statistics' instead.
2011-02-23 15:10:48 -06:00
Ali Saidi
d63020717c ARM: Adds dummy support for a L2 latency miscreg. 2011-02-23 15:10:48 -06:00
Korey Sewell
78c37b8048 ruby: extend dprintfs for RubyGenerated TraceFlag
"executing" isn't a very descriptive debug message and in going through the
output you get multiple messages that say "executing" but nothing to help
you parse through the code/execution.

So instead, at least print out the name of the action that is taking
place in these functions.
2011-02-23 00:58:42 -05:00
Korey Sewell
67cc52a605 ruby: cleaning up RubyQueue and RubyNetwork dprintfs
Overall, continue to progress Ruby debug messages to more of the normal M5
debug message style
- add a name() to the Ruby Throttle & PerfectSwitch objects so that the debug output
isn't littered w/"global:" everywhere.
- clean up messages that print over multiple lines when possible
- clean up duplicate prints in the message buffer
2011-02-23 00:58:40 -05:00
Brad Beckmann
63a25a56cc m5: merged in hammer fix 2011-02-22 11:16:40 -08:00
Nilay Vaish
77eed184f5 Ruby: Machine Type missing in MOESI CMP directory protocol
In certain actions of the L1 cache controller, while creating an outgoing
message, the machine type was not being set. This results in a
segmentation fault when trace is collected. Joseph Pusudesris provided
his patch for fixing this issue.
2011-02-19 17:32:43 -06:00
Nilay Vaish
293ccb7037 Ruby: clean MOESI CMP directory protocol
The L1 cache controller file contains references to foo and goo queues, which
are not in use at all. These have been removed.
2011-02-19 17:32:00 -06:00
Korey Sewell
66bb732c04 m5: merge inorder/release-notes/make_release changes 2011-02-18 14:35:15 -05:00
Korey Sewell
bc16bbc158 inorder: add names and slot #s to res. dprints 2011-02-18 14:31:31 -05:00
Korey Sewell
64d31e75b9 inorder: ignore nops in execution unit 2011-02-18 14:30:38 -05:00
Korey Sewell
0fe19836c7 inorder: update graduation unit
make sure instructions are able to commit before writing back to the RF
do not commit more than 1 non-speculative instruction per cycle
2011-02-18 14:30:05 -05:00
Korey Sewell
89335118a5 inorder: recognize isSerializeAfter flag
keep track of when an instruction needs the execution
behind it to be serialized. Without this, in SE Mode
instructions can execute behind a system call exit().
2011-02-18 14:29:48 -05:00
Korey Sewell
bbffd9419d inorder: update default thread size(=1)
a lot of structures get allocated based off that MaxThreads parameter so this is an
effort to not abuse it
2011-02-18 14:29:44 -05:00
Korey Sewell
a278df0b95 inorder: don't overuse getLatency()
resources don't need to call getLatency because the latency is already a member
in the class. If there is some type of special case where different instructions
impose a different latency inside a resource then we can revisit this and
add getLatency() back in
2011-02-18 14:29:40 -05:00
Korey Sewell
37df925953 inorder: update max. resource bandwidths
each resource has a certain # of requests it can take per cycle. update the #s here
to be more realistic based off of the pipeline width and if the resource needs to
be accessed on multiple cycles
2011-02-18 14:29:31 -05:00
Korey Sewell
91c48b1c3b inorder: cleanup in destructors
cleanup hanging pointers and other cruft in the destructors
2011-02-18 14:29:26 -05:00
Korey Sewell
8b4b4a1ba5 inorder: fix cache/fetch unit memory leaks
---
need to delete the cache request's data on clearRequest() now that we are recycling
requests
---
fetch unit needs to deallocate the fetch buffer blocks when they are replaced or
squashed.
2011-02-18 14:29:17 -05:00
Korey Sewell
72b5233112 inorder: remove events for zero-cycle resources
if a resource has a zero cycle latency (e.g. RegFile write), then don't allocate an event
for it to use
2011-02-18 14:29:02 -05:00
Korey Sewell
d5961b2b20 inorder: update pipeline interface for handling finished resource reqs
formerly, to free up bandwidth in a resource, we could just change the pointer in that resource
but at the same time the pipeline stages had visibility to see what happened to a resource request.
Now that we are recycling these requests (to avoid too much dynamic allocation), we can't throw
away the request too early or the pipeline stage gets bad information. Instead, mark when a request
is done with the resource altogether and then let the pipeline stage call back to the resource
that it's time to free up the bandwidth for more instructions
*** interface notes ***
- When an instruction completes and is done in a resource for that cycle, call done()
- When an instruction fails and is done with a resource for that cycle, call done(false)
- When an instruction completes, but isn't finished with a resource, call completed()
- When an instruction fails, but isn't finished with a resource, call completed(false)
* * *
inorder: tlbmiss wakeup bug fix
2011-02-18 14:28:37 -05:00
Korey Sewell
d64226750e inorder: remove request map, use request vector
take away all instances of reqMap in the code and make all references use the built-in
request vectors inside of each resource. The request map was dynamically allocating
a request per instruction. The request vector just allocates N number of requests
during instantiation and then the surrounding code is fixed up to reuse those N requests
***
setRequest() and clearRequest() are the new accessors needed to define a new
request in a resource
2011-02-18 14:28:30 -05:00
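The request-recycling idea in this series (a fixed vector of requests, a valid bit, and `setRequest()`/`clearRequest()` accessors) might look roughly like the following. Only the two accessor names come from the commit message; their signatures, the `RequestPool` class, and the `instId` field are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ResourceRequest {
    bool valid = false;   // slot is currently owned by an instruction
    int instId = -1;      // hypothetical payload identifying the owner
};

class RequestPool {
  public:
    // Allocate exactly as many slots as the resource's bandwidth.
    explicit RequestPool(std::size_t bandwidth) : reqs(bandwidth) {}

    // Claim a free slot; returns nullptr when the bandwidth is exhausted.
    ResourceRequest *setRequest(int instId)
    {
        for (auto &r : reqs) {
            if (!r.valid) {
                r.valid = true;
                r.instId = instId;
                return &r;
            }
        }
        return nullptr;
    }

    // Release a slot for reuse; no memory is freed.
    void clearRequest(ResourceRequest *r)
    {
        assert(r && r->valid);
        r->valid = false;
        r->instId = -1;
    }

  private:
    std::vector<ResourceRequest> reqs;   // sized once, never resized
};
```

Because the vector is sized once at construction, instructions contend for the existing slots instead of triggering a heap allocation per request.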
Korey Sewell
c883729025 inorder: add valid bit for resource requests
this will allow us to reuse resource requests within a resource instead
of always dynamically allocating
2011-02-18 14:28:22 -05:00
Korey Sewell
ff48afcf4f inorder: remove reqRemoveList
we are going to be getting away from creating new resource requests for every
instruction so no more need to keep track of a reqRemoveList and clean it up
every tick
2011-02-18 14:28:10 -05:00
Korey Sewell
991d0185c6 inorder: initialize res. req. vectors based on resource bandwidth
first change in an optimization that will stop InOrder from allocating new memory for every instruction's
request to a resource. This gets expensive since every instruction needs to access ~10 requests before
graduation. Instead, the plan is to allocate just enough resource request objects to satisfy each resource's
bandwidth (e.g. the execution unit would need to allocate 3 resource request objects for a 1-issue pipeline
since on any given cycle it could have 2 read requests and 1 write request) and then let the instructions
contend and reuse those allocated requests. The end result is a smaller memory footprint for the InOrder model
and increased simulation performance
2011-02-18 14:27:52 -05:00
Gabe Black
fde8b5c387 X86: Get rid of "inline" on the MicroPanic constructor in decoder.cc.
This was making certain versions of gcc omit the function from the object file
which would break the build.
2011-02-15 15:58:16 -08:00
Gabe Black
989138970e Info: Clean up some info files.
Get rid of RELEASE_NOTES since we no longer do releases, update some of the
information in README, and update the date in LICENSE.
2011-02-14 21:36:37 -08:00
Nilay Vaish
343e94a257 Ruby: Improve PerfectSwitch's wakeup function
Currently the wakeup function for the PerfectSwitch contains three loops -

loop on number of virtual networks
  loop on number of incoming links
    loop till all messages for this (link, network) have been routed

With an 8 processor mesh network and Hammer protocol, about 11-12% of the
total execution time was observed to have been spent in this function, which is the highest
amongst all the functions. It was found that the innermost loop is executed
about 45 times per invocation of the wakeup function, when each invocation
of the wakeup function processes just about one message.

The patch tries to do away with the redundant executions of the innermost
loop. Counters have been added for each virtual network that record the
number of messages that need to be routed for that virtual network. The
inner loops are only executed when the number of messages for that particular
virtual network > 0. This does away with almost 80% of the executions of the
innermost loop. The function now consumes about 5-6% of the total execution
time.
2011-02-14 16:14:54 -06:00
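The counter-based shortcut described above can be sketched as follows. This is an illustrative model only: `SwitchSketch`, its integer messages, and the `linkScans` instrumentation counter are assumptions and do not reflect the real PerfectSwitch class.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

class SwitchSketch {
  public:
    SwitchSketch(std::size_t vnets, std::size_t links)
        : queues(vnets, std::vector<std::deque<int>>(links)),
          pending(vnets, 0) {}

    void enqueue(std::size_t vnet, std::size_t link, int msg)
    {
        queues[vnet][link].push_back(msg);
        ++pending[vnet];              // per-virtual-network message counter
    }

    // Route everything that is queued; returns the number of messages routed.
    std::size_t wakeup()
    {
        std::size_t routed = 0;
        for (std::size_t v = 0; v < queues.size(); ++v) {
            if (pending[v] == 0)
                continue;             // skip the inner loops for idle vnets
            for (auto &q : queues[v]) {
                ++linkScans;
                while (!q.empty()) {
                    q.pop_front();    // stand-in for actually routing
                    --pending[v];
                    ++routed;
                }
            }
        }
        return routed;
    }

    std::size_t linkScans = 0;        // how many (vnet, link) pairs were visited

  private:
    std::vector<std::vector<std::deque<int>>> queues;
    std::vector<std::size_t> pending;
};
```

With 4 virtual networks, 8 links, and a single queued message, one wakeup visits 8 link queues rather than all 32, and a wakeup with nothing pending visits none.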
Gabe Black
77b4a37067 X86: Detect branches taking into account instruction size.
The size of the current instruction determines what the npc should be if
there's no branching.
2011-02-13 17:45:47 -08:00
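As a rough illustration of the rule in this commit, a variable-length ISA can only tell a sequential instruction from a taken branch by comparing the next PC against the current PC plus the instruction's size. `PCStateSketch` and its fields are hypothetical and are not the simulator's PC state class.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

struct PCStateSketch {
    Addr pc;             // address of the current instruction
    Addr npc;            // address of the next instruction to execute
    std::uint8_t size;   // length of the current instruction in bytes

    // Not branching means falling through to pc + size.
    bool branching() const { return npc != pc + size; }
};
```

A 3-byte instruction at 0x1000 that falls through has an npc of 0x1003; any other npc indicates a redirect.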
Gabe Black
bce2be525d X86: Put the result used for flags in an intermediate variable.
Using the destination register directly causes the ISA parser to treat it as a
source even if none of the original bits are used.
2011-02-13 17:45:12 -08:00
Gabe Black
4e1adf85f7 X86: Don't read in dest regs if all bits are replaced.
In x86, 32 and 64 bit writes to registers in which registers appear to be 32 or
64 bits wide overwrite all bits of the destination register. This change
removes false dependencies in these cases where the previous value of a
register doesn't need to be read to write a new value. New versions of most
microops are created that have a "Big" suffix which simply overwrite their
destination, and the right version to use is selected during microop
allocation based on the selected data size.

This does not change the performance of the O3 CPU model significantly, I
assume because there are other false dependencies from the condition code bits
in the flags register.
2011-02-13 17:44:24 -08:00
Gabe Black
399e095510 X86: On a bad micropc, return a microop that returns a fault that panics.
This way a bad micropc will have to get all the way to commit before killing
the simulation. This accounts for misspeculated branches.
2011-02-13 17:42:56 -08:00
Gabe Black
1aa9698fa0 X86: Define fault objects to carry debug messages.
These faults can panic/warn/warn_once, etc., instead of instructions doing
that themselves directly. That way, instructions can be speculatively
executed, and only if they're actually going to commit will their fault be
invoked and the panic, etc., happen.
2011-02-13 17:42:05 -08:00
Gabe Black
5ee94f4a3d X86: Only reset npc to reflect instruction length once.
When redirecting fetch to handle branches, the npc of the current pc state
needs to be left alone. This change makes the pc state record whether or not
the npc already reflects a real value by making it keep track of the current
instruction size, or if no size has been set.
2011-02-13 17:41:10 -08:00
Gabe Black
f036fd9748 O3: Fetch from the microcode ROM when needed. 2011-02-13 17:40:07 -08:00
Ali Saidi
7c763b34c9 O3: Fix GCC 4.2.4 complaint 2011-02-13 16:51:15 -05:00
Nilay Vaish
0cede15d6c Ruby: Reorder Cache Lookup in Protocol Files
The patch changes the order in which L1 dcache and icache are looked up when
a request comes in. Earlier, if a request came in for instruction fetch, the
dcache was looked up before the icache, to correctly handle self-modifying
code. But, in the common case, dcache is going to report a miss and the
subsequent icache lookup is going to report a hit. Given the invariant -
caches under the same controller keep track of disjoint sets of cache blocks,
we can move the icache lookup before the dcache lookup. In case of a hit in
the icache, using our invariant, we know that the dcache would have reported
a miss. In case of a miss in the icache, we know that icache would have
missed even if the dcache was looked up before looking up the icache.
Effectively, we are doing the same thing as before, though in the common case,
we expect reduction in the number of lookups. This was empirically confirmed
for MOESI hammer. The ratio of lookups to access requests is now about 1.1 to 1.
2011-02-12 11:41:20 -06:00
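The lookup reordering can be modelled with a small sketch that counts probes. It relies on the invariant stated in the commit (the two caches hold disjoint sets of blocks); `L1Sketch`, `probe`, and `fetchLookup` are hypothetical names, and the real change lives in the SLICC protocol files rather than in C++ like this.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

using Addr = std::uint64_t;

struct L1Sketch {
    std::unordered_set<Addr> icache;   // disjoint from dcache by invariant
    std::unordered_set<Addr> dcache;
    unsigned lookups = 0;              // total cache probes performed

    enum class Hit { ICache, DCache, Miss };

    bool probe(const std::unordered_set<Addr> &cache, Addr a)
    {
        ++lookups;
        return cache.count(a) != 0;
    }

    // Instruction fetch: probe the icache first and only fall back to the
    // dcache on an icache miss.
    Hit fetchLookup(Addr a)
    {
        if (probe(icache, a)) return Hit::ICache;
        if (probe(dcache, a)) return Hit::DCache;
        return Hit::Miss;
    }
};
```

In the common case of an icache hit, a fetch costs one probe instead of two, which is consistent with the lookup-to-request ratio approaching 1.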
Korey Sewell
470aa289da inorder: clean up the old way of inst. scheduling
remove remnants of old way of instruction scheduling which dynamically allocated
a new resource schedule for every instruction
2011-02-12 10:14:48 -05:00
Korey Sewell
e26aee514d inorder: utilize cached skeds in pipeline
allow the pipeline and resources to use the cached instruction schedule and resource
sked iterator
2011-02-12 10:14:45 -05:00
Korey Sewell
516b611462 inorder: define iterator for resource schedules
resource skeds are divided into two parts: front end (all insts) and back end (inst. specific)
each of those are implemented as separate lists, so this iterator wraps around
the traditional list iterator so that an instruction can walk its schedule but seamlessly
transfer from front end to back end when necessary
2011-02-12 10:14:43 -05:00
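An iterator that crosses from one list to another, as this commit describes, could be sketched like this. `SkedWalker` and the use of plain `int` entries are assumptions for illustration; the real schedule entries and iterator interface are not shown in the commit message.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical walker over a shared front-end list followed by an
// instruction-specific back-end list.
class SkedWalker {
  public:
    SkedWalker(const std::list<int> &front, const std::list<int> &back)
        : frontEnd(&front), backEnd(&back), it(front.begin()), inFront(true)
    {
        skipToBackIfNeeded();
    }

    bool done() const { return !inFront && it == backEnd->end(); }
    int operator*() const { return *it; }

    SkedWalker &operator++()
    {
        ++it;
        skipToBackIfNeeded();
        return *this;
    }

  private:
    // Hop to the back-end list once the front-end list is exhausted.
    void skipToBackIfNeeded()
    {
        if (inFront && it == frontEnd->end()) {
            it = backEnd->begin();
            inFront = false;
        }
    }

    const std::list<int> *frontEnd;
    const std::list<int> *backEnd;
    std::list<int>::const_iterator it;
    bool inFront;
};
```

A caller walks the schedule with a single loop, for example `for (SkedWalker w(front, back); !w.done(); ++w)`, without needing to know where the front end stops.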
Korey Sewell
ec9b2ec251 inorder: stage scheduler for front/back end schedule creation
add a stage scheduler class to replace InstStage in pipeline_traits.cc
use that class to define a default front-end, resource schedule that all
instructions will follow. This will also replace the back end schedule in
pipeline_traits.cc. The reason for adding this is so that we can cache
instruction schedules in the future instead of calling the same function
over/over again as well as constantly dynamically allocating memory on
every instruction to try to figure out its schedule
2011-02-12 10:14:40 -05:00
Korey Sewell
6713dbfe08 inorder: cache instruction schedules
first step in an optimization to not dynamically allocate an instruction schedule
for every instruction but rather use cached schedules
2011-02-12 10:14:36 -05:00
Korey Sewell
af67631790 inorder: comments for resource sked class 2011-02-12 10:14:34 -05:00
Korey Sewell
800e93f358 inorder: remove unused file
inst_buffer file isn't used , so remove it
2011-02-12 10:14:32 -05:00
Korey Sewell
e65c15e931 inorder: remove unused isa ops
pass/fail ops were used for testing but aren't part of isa
2011-02-12 10:14:26 -05:00
Ali Saidi
d4df9e763c VNC/ARM: Use VNC server and add support to boot into X11 2011-02-11 18:29:36 -06:00
Ali Saidi
d33c1d9592 VNC: Add VNC server to M5 2011-02-11 18:29:35 -06:00
Ali Saidi
ded4d319f2 Serialization: Allow serialization of stl lists 2011-02-11 18:29:35 -06:00
Giacomo Gabrielli
a05032f4df O3: Fix pipeline restart when a table walk completes in the fetch stage.
When a table walk is initiated by the fetch stage, the CPU can
potentially move to the idle state and never wake up.

The fetch stage must call cpu->wakeCPU() when a translation completes
(in finishTranslation()).
2011-02-11 18:29:35 -06:00
Giacomo Gabrielli
74eff1b71b O3: Fix a few bugs in the TableWalker object.
Uncacheable requests were set as such only in atomic mode.
currState->delayed is checked in place of currState->timing for resetting
currState in atomic mode.
2011-02-11 18:29:35 -06:00
Ali Saidi
1411cb0b0f SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk occurs.
This change fixes an issue where a DTLB fault occurs and redirects fetch to
handle the fault and the ITLB requires a walk which delays translation. In this
case the status of the cpu isn't updated appropriately, and an additional
instruction fetch occurs. Eventually this hits an assert as multiple instruction
fetches are occurring in the system and when the second one returns the
processor is in the wrong state.

Some asserts below are removed because it was always true (typo) and the state
after the initiateAcc() the processor could be in any valid state when a
d-side fault occurs.
2011-02-11 18:29:35 -06:00
Giacomo Gabrielli
e2507407b1 O3: Enhance data address translation by supporting hardware page table walkers.
Some ISAs (like ARM) rely on hardware page table walkers.  For those ISAs,
when a TLB miss occurs, initiateTranslation() can return with NoFault but with
the translation unfinished.

Instructions experiencing a delayed translation due to a hardware page table
walk are deferred until the translation completes and kept in the IQ.  In
order to keep track of them, the IQ has been augmented with a queue of the
outstanding delayed memory instructions.  When their translation completes,
instructions are re-executed (only their initiateAccess() was already
executed; their DTB translation is now skipped).  The IEW stage has been
modified to support such a 2-pass execution.
2011-02-11 18:29:35 -06:00
Ali Saidi
453dbc772d ARM: Fix timer calculations.
The timer calculations were a bit off so time would run faster than
it otherwise should
2011-02-11 18:29:35 -06:00
Ali Saidi
59bf0e7eb4 Timesync: Make sure timesync event is setup after curTick is unserialized
Setup initial timesync event in initState or loadState so that curTick has
been updated to the new value, otherwise the event is scheduled in the past.
2011-02-11 18:29:35 -06:00
Brad Beckmann
fbebe9a642 MOESI_hammer: fixed wakeup for SS->S transition 2011-02-10 13:28:23 -08:00
Brad Beckmann
06dfee5cea ruby: removed duplicate make response call 2011-02-09 16:02:09 -08:00
Nilay Vaish
488280e48b MESI CMP: Unset TBE pointer in L2 cache controller
The TBE pointer in the MESI CMP implementation was not being set to NULL
when the TBE is deallocated. This resulted in segmentation fault on testing
the protocol when the ProtocolTrace was switched on.
2011-02-08 07:47:02 -06:00
Tim Harris
44e5e7e053 X86: Obey the wp bit of CR0.
If cr0.wp ("write protect" bit) is clear then do not generate page faults when
writing to write-protected pages in kernel mode.
2011-02-07 15:18:52 -08:00
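The rule in this commit reduces to a small predicate. The sketch below is deliberately simplified and hypothetical: it ignores the present bit, execute permissions, and every other paging feature, and `writeFaults` is not a function from the simulator.

```cpp
#include <cassert>

// Returns true if a write to the page should raise a protection fault.
// Simplified model: only the page's writable bit, the privilege level,
// and CR0.WP are considered.
bool writeFaults(bool pageWritable, bool userMode, bool cr0Wp)
{
    if (pageWritable)
        return false;          // writable pages never fault on this check
    // Read-only page: user-mode writes always fault; kernel-mode writes
    // fault only when CR0.WP is set.
    return userMode || cr0Wp;
}
```

The case this commit addresses is the kernel-mode write to a read-only page with WP clear, which must succeed rather than fault.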
Tim Harris
6da83b8a1b X86: Use all 64 bits of the lstar register in the SYSCALL_64 macroop.
During SYSCALL_64, use dataSize=8 when handling new rip (ref
http://www.intel.com/Assets/PDF/manual/253668.pdf 5.8.8 IA32_LSTAR is a 64-bit
address)
2011-02-07 15:16:27 -08:00
Tim Harris
2ea1aa8a4f X86: Fix JMP_FAR_I to unpack a far pointer correctly.
JMP_FAR_I was unpacking its far pointer operand using sll instead of srl like
it should, and also putting the components in the wrong registers for use by
other microcode.
2011-02-07 15:12:59 -08:00
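To show why a right shift is the correct operation here, the sketch below unpacks a far pointer under the assumption of a 32-bit operand size, where the packed value carries the offset in bits 0-31 and the segment selector in bits 32-47. `FarPointer` and `unpackFarPointer` are hypothetical names, and the microcode itself operates on registers rather than on a C++ struct.

```cpp
#include <cassert>
#include <cstdint>

struct FarPointer {
    std::uint16_t selector;   // segment selector
    std::uint32_t offset;     // offset within the segment
};

// Assumption: 32-bit offset in the low bits, 16-bit selector above it.
FarPointer unpackFarPointer(std::uint64_t packed)
{
    FarPointer fp;
    fp.offset = static_cast<std::uint32_t>(packed & 0xFFFFFFFFULL);
    // Shift right to bring the selector down; a left shift would instead
    // push it further away from the low bits.
    fp.selector = static_cast<std::uint16_t>((packed >> 32) & 0xFFFF);
    return fp;
}
```

For the packed value 0x0008DEADBEEF, the selector is 0x0008 and the offset is 0xDEADBEEF.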
Tim Harris
5810ab121c X86: Read the LDT/GDT at CPL0 when executing an iret.
During iret access LDT/GDT at CPL0 rather than after transition to user mode
(if I'm reading the Intel IA-64 architecture spec correctly, the contents of
the descriptor table are read before the CPL is updated).
2011-02-07 15:05:28 -08:00
Nilay Vaish
10b4b364d9 Orion: Replace printf() with fatal()
The code for Orion 2.0 makes use of printf() at several places where there is
an error in configuration of the model. These have been replaced with fatal().
2011-02-07 12:42:23 -06:00
Korey Sewell
1b4e788407 ruby: add stdio header in SRAM.hh
missing header file caused RUBY_FS to not compile
2011-02-07 12:19:46 -05:00
Gabe Black
0c4b816d84 X86: Fix compiling vtophys.cc 2011-02-07 01:21:21 -08:00
Brad Beckmann
f5aa75fdc5 ruby: support to stallAndWait the mandatory queue
By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
significant pressure on the L1 cache controller (i.e. the ruby memtester).

--HG--
rename : src/mem/slicc/ast/WakeUpDependentsStatementAST.py => src/mem/slicc/ast/WakeUpAllDependentsStatementAST.py
2011-02-06 22:14:19 -08:00
Brad Beckmann
194a137498 ruby: minor fix to deadlock panic message 2011-02-06 22:14:19 -08:00
Joel Hestness
ebe563e531 garnet: Split network power in ruby.stats
Split out dynamic and static power numbers for printing to ruby.stats
2011-02-06 22:14:19 -08:00
Brad Beckmann
5c2f4937b3 MOESI_hammer: fixed dir bug counting received acks 2011-02-06 22:14:19 -08:00
Brad Beckmann
7edab47448 ruby: numa bit fix for sparse memory 2011-02-06 22:14:19 -08:00
Tushar Krishna
4fa690e8ff MOESI_CMP_token: removed unused message fields 2011-02-06 22:14:19 -08:00
Brad Beckmann
273e3d4924 mem: Added support for Null data packet
The packet now identifies whether static or dynamic data has been allocated and
is used by Ruby to determine whether to copy the data pointer into the ruby
request.  Subsequently, Ruby can be told not to update phys memory when
receiving packets.
2011-02-06 22:14:19 -08:00
Brad Beckmann
dfa8cbeb06 m5: added work completed monitoring support 2011-02-06 22:14:19 -08:00
Brad Beckmann
c41fc138e7 dev: fixed bugs to extend interrupt capability beyond 15 cores 2011-02-06 22:14:18 -08:00
Joel Hestness
3a2d2223e1 x86: Timing support for pagetable walker
Move page table walker state to its own object type, and make the
walker instantiate state for each outstanding walk. By storing the
states in a queue, the walker is able to handle multiple outstanding
timing requests. Note that functional walks use separate state
elements.
2011-02-06 22:14:18 -08:00
Joel Hestness
52b6119228 TimingSimpleCPU: split data sender state fix
In sendSplitData, keep a pointer to the senderState that may be updated after
the call to handle*Packet. This way, if the receiver updates the packet
senderState, it can still be accessed in sendSplitData.
2011-02-06 22:14:18 -08:00
Brad Beckmann
2da54d1285 ruby: Fix RubyPort to properly handle retries 2011-02-06 22:14:18 -08:00
Joel Hestness
dedb4fbf05 Ruby: Fix to return cache block size to CPU for split data transfers 2011-02-06 22:14:18 -08:00
Joel Hestness
82844618fd Ruby: Add support for locked memory accesses in X86_FS 2011-02-06 22:14:18 -08:00
Joel Hestness
16c1edebd0 Ruby: Update the Ruby request type names for LL/SC 2011-02-06 22:14:18 -08:00
Brad Beckmann
9782ca5def ruby: Assert for x86 misaligned access
This patch ensures only aligned accesses are passed to ruby and includes a fix
to the DPRINTF address print.
2011-02-06 22:14:18 -08:00
Brad Beckmann
1b54344aeb MOESI_hammer: Added full-bit directory support 2011-02-06 22:14:18 -08:00
Joel Hestness
62e05ed78a x86: Add checkpointing capability to devices
Add checkpointing capability to the Intel 8254 timer, CMOS, I8042,
PS2 Keyboard and Mouse, I82094AA, I8237, I8254, I8259, and speaker
devices
2011-02-06 22:14:18 -08:00
Joel Hestness
911ccef6c0 x86: Add checkpointing capability to arch components
Add checkpointing capability to the x86 interrupt device and the TLBs
2011-02-06 22:14:17 -08:00
Joel Hestness
38140b5519 x86: implements vtophys
Calls walker to look up virt. to phys. page mapping
2011-02-06 22:14:17 -08:00
Joel Hestness
eea78f968b IntDev: packet latency fix
The x86 local apic now includes a separate latency parameter for interrupts.
2011-02-06 22:14:17 -08:00
Joel Hestness
d9f0a8288e MessagePort: implement the virtual recvTiming function to avoid double pkt delete
Double packet delete problem is due to an interrupt device deleting a packet that the SimpleTimingPort also deletes. Since MessagePort descends from SimpleTimingPort, simply reimplement the failing code from SimpleTimingPort: recvTiming.
2011-02-06 22:14:17 -08:00
Joel Hestness
02b05bf9be MOESI_hammer: trigger queue fix. 2011-02-06 22:14:17 -08:00
Joel Hestness
b4c10bd680 mcpat: Adds McPAT performance counters
Updated patches from Rick Strong's set that modify performance counters for
McPAT
2011-02-06 22:14:17 -08:00
Tushar Krishna
a679e732ce garnet: added orion2.0 for network power calculation 2011-02-06 22:14:17 -08:00
Tushar Krishna
59163f824c garnet: separate data and ctrl VCs
Separate data VCs and ctrl VCs in garnet, as ctrl VCs have 1 buffer per VC,
while data VCs have > 1 buffers per VC. This is for correct power estimations.
2011-02-06 22:14:16 -08:00
Brad Beckmann
afd754dc0d x86: set IsCondControl flag for the appropriate microops 2011-02-06 22:14:16 -08:00
Gabe Black
aa62c217c5 Fault: Forgot to refresh to grab these header guard updates. 2011-02-03 22:07:34 -08:00
Korey Sewell
e396a34b01 inorder: fault handling
Maintain all information about an instruction's fault in the DynInst object rather
than any cpu-request object. Also, if there is a fault during the execution stage
then just save the fault inside the instruction and trap once the instruction
tries to graduate
2011-02-04 00:09:20 -05:00
Korey Sewell
e57613588b inorder: pcstate and delay slots bug
not taken delay slots were not being advanced correctly to pc+8, so for those ISAs
we 'advance()' the pcstate one more time for the desired effect
2011-02-04 00:09:19 -05:00
Korey Sewell
68d962f8af inorder: add a fetch buffer to fetch unit
Give fetch unit its own parameterizable fetch buffer to read from. Very inefficient
(architecturally and in simulation) to continually fetch at the granularity of the
wordsize. As expected, the number of fetch memory requests drops dramatically
2011-02-04 00:08:22 -05:00
Korey Sewell
56ce8acd41 inorder: overload find-req fn
no need to have separate function name findSplitRequest, just overload the function
2011-02-04 00:08:21 -05:00
Korey Sewell
ab3d37d398 inorder: implement separate fetch unit
instead of having one cache-unit class be responsible for both data and code
accesses, separate code that is just for fetch in its own derived class off the
original base class. This makes the code easier to manage as well as handle
future cases of special fetch handling
2011-02-04 00:08:20 -05:00
Korey Sewell
f80508de65 inorder: cache port blocking
set the request to false when the cache port blocks so we don't deadlock.
also, comment out the outstanding address list sanity check for now.
2011-02-04 00:08:19 -05:00
Korey Sewell
0c6a679359 inorder: stage width as a python parameter
allow the user to specify how many instructions a pipeline stage can process
on any given cycle (stageWidth, i.e. bandwidth) by setting the parameter through
the python interface rather than compile the code after changing the *.cc file.
(we always had the parameter there, but still used the static 'ThePipeline::StageWidth'
instead)
-
Since StageWidth is now dynamically defined, change the interstage communication
structure to use a vector and get rid of array and array handling index (toNextStageIndex)
since we can just make calls to the list for the same information
2011-02-04 00:08:18 -05:00
Korey Sewell
8ac717ef4c inorder: multi-issue branch resolution
Only execute (resolve) one branch per cycle because handling more than one is
a little more complicated
2011-02-04 00:08:17 -05:00
Korey Sewell
be17617990 inorder: pipe. stage inst. buffering
use skidbuffer as only location for instructions between stages. before,
we had the insts queue from the prior stage and the skidbuffer for the
current stage, but that gets confusing and this consolidation helps
when handling squash cases
2011-02-04 00:08:16 -05:00
Korey Sewell
050944dd73 inorder: change skidBuffer to list instead of queue
manage insertion and deletion like a queue but will need
access to internal elements for future changes
Currently, skidbuffer manages any instruction that was
in a stage but could not complete processing, however
we will want to manage all blocked instructions (from prev stage
and from cur. stage) in just one buffer.
2011-02-04 00:08:15 -05:00
Korey Sewell
7f937e11e2 inorder: activity tracking bug
Previous code was marking CPU activity on almost every cycle due to a bug in
tracking the status of pipeline stages. This prevents the CPU from sleeping
on long latency stalls and increases simulation time
2011-02-04 00:08:13 -05:00
Gabe Black
091a3e6cc0 Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh.
--HG--
rename : src/sim/fault.hh => src/sim/fault_fwd.hh
2011-02-03 21:47:58 -08:00
Gabe Black
00f24ae92c Config: Keep track of uncached and cached ports separately.
This makes sure that the address ranges requested for caches and uncached ports
don't conflict with each other, and that accesses which are always uncached
(message signaled interrupts for instance) don't waste time passing through
caches.
2011-02-03 20:23:00 -08:00
Gabe Black
869a046e41 O3: Fix a style bug in O3. 2011-02-02 23:34:14 -08:00
Gabe Black
cb22bead7d X86: Get rid of the stupd microop. 2011-02-02 19:57:12 -08:00
Gabe Black
eabbdbee63 X86: Replace the stupd microop with a store/update sequence. 2011-02-02 19:56:38 -08:00
Gabe Black
75d34c14fc Time: Add serialization functions to the Time class. 2011-02-02 18:05:03 -08:00
Gabe Black
119f5f8e94 X86: Add L1 caches for the TLB walkers.
Small L1 caches are connected to the TLB walkers when caches are used. This
allows them to participate in the coherence protocol properly.
2011-02-01 18:28:41 -08:00
Gabe Black
4b4cd0303e Fault: Move the definition of NoFault from faults.hh to fault.hh.
Moving the definition of NoFault into fault.hh doesn't bring any new
dependencies with it, and allows some files to include just fault.hh which has
less baggage. NoFault will still be available to everything that includes
faults.hh because it includes fault.hh.
2011-01-31 13:13:00 -08:00
Nathan Binkert
048b1e5843 refcnt: Change things around so that we handle constness correctly.
To use a non const pointer:
typedef RefCountingPtr<Foo> FooPtr;

To use a const pointer:
typedef RefCountingPtr<const Foo> ConstFooPtr;
2011-01-22 21:48:06 -08:00
Steve Reinhardt
5c99ae60b8 checkpointing: fix bug from curTick accessor conversion.
Regex replacement of curTick with curTick() accidentally
changed checkpoint key string for serialization but not
for unserialization.
2011-01-20 22:13:33 -08:00
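A hypothetical illustration of the failure mode described in 5c99ae60b8, written as a Python sketch. The `paramOut(...)` source line is an invented stand-in, not the actual M5 serialization code; the point is only that a word-boundary match on `curTick` also rewrites the name inside a string literal, so a key written under one name can end up being read back under another.

```python
import re


def naive_rename(source):
    """Blindly turn every bare `curTick` token into `curTick()`."""
    return re.sub(r"\bcurTick\b", "curTick()", source)


# Hypothetical line of C++ source, used only as input text.
before = 'paramOut(os, "curTick", curTick);'
after = naive_rename(before)
# Both the variable reference and the quoted key string are rewritten.
```

Excluding string literals from the match, or reviewing every hit that sits inside quotes, would avoid this class of bug.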
Gabe Black
ddeaf1252f TimeSync: Use the new setTick and getTick functions. 2011-01-19 16:22:23 -08:00
Gabe Black
23bab6783b Time: Add setTick and getTick functions to the Time class. 2011-01-19 16:22:15 -08:00
Gabe Black
a368fba7d4 Time: Add a mechanism to prevent M5 from running faster than real time.
M5 skips over any simulated time where it doesn't have any work to do. When
the simulation is active, the time skipped is short and the work done at any
point in time is relatively substantial. If the time between events is long
and/or the work to do at each event is small, it's possible for simulated time
to pass faster than real time. When running a benchmark that can be good
because it means the simulation will finish sooner in real time. When
interacting with the real world through, for instance, a serial terminal or
bridge to a real network, this can be a problem. Human or network response time
could be greatly exaggerated from the perspective of the simulation and make
simulated events happen "too soon" from an external perspective.

This change adds the capability to force the simulation to run no faster than
real time. It does so by scheduling a periodic event that checks to see if
its simulated period is shorter than its real period. If it is, it stalls the
simulation until they're equal. This is called time syncing.

A future change could add pseudo instructions which turn time syncing on and
off from within the simulation. That would allow time syncing to be used for
the interactive parts of a session but then turned off when running a
benchmark using the m5 utility program inside a script. Time syncing would
probably not happen anyway while running a benchmark because there would be
plenty of work for M5 to do, but the event overhead could be avoided.
2011-01-19 11:48:00 -08:00
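A minimal sketch of the time syncing idea in a368fba7d4, in Python rather than M5's C++. All names here (`TimeSync`, `on_period`, `stall_needed`) are assumptions made for illustration, and the sketch reads the description as: if less wall-clock time has passed than the simulated period covered, wait out the difference.

```python
import time


def stall_needed(sim_period_s, real_elapsed_s):
    """Seconds to wait so real time catches up with simulated time."""
    return max(0.0, sim_period_s - real_elapsed_s)


class TimeSync:
    """Periodic check that keeps a simulation from outrunning real time."""

    def __init__(self, period_s, clock=time.monotonic, sleep=time.sleep):
        self.period_s = period_s  # simulated time between checks, in seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def start(self):
        self._last = self._clock()

    def on_period(self):
        """Call once per simulated period; returns the stall applied."""
        elapsed = self._clock() - self._last
        stall = stall_needed(self.period_s, elapsed)
        if stall > 0:
            self._sleep(stall)
        self._last = self._clock()
        return stall
```

The clock and sleep functions are injectable so the check can be exercised without actually waiting. When the simulation is busy, `elapsed` exceeds the period and no stall is applied, which matches the note that syncing rarely triggers during a benchmark.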
Matt Horsnell
77853b9f52 O3: Fix itstate prediction and recovery.
Any change of control flow now resets the itstate to 0 mask and 0 condition,
except where the control flow alteration writes into the CPSR register. These
cases, for example return from an interrupt, require the predecoder to recover
the itstate.

As there is a window of opportunity between the return from an interrupt
changing the control flow at the head of the pipe and the commit of the update
to the CPSR, the predecoder needs to be able to grab the ITstate early. This
is now handled by setting the forcedItState inside a PCstate for the control
flow altering instruction.

That instruction will have the correct mask/cond, but will not have a valid
itstate until advancePC is called (note this happens to advance the execution).
When the new PCstate is copy constructed it gets the itstate cond/mask, and
upon advancing the PC the itstate becomes valid.

Subsequent advancing invalidates the state and zeroes the cond/mask. This is
handled in isolation for the ARM ISA and should have no impact on other ISAs.

Refer to arch/arm/types.hh and arch/arm/predecoder.cc for the details.
2011-01-18 16:30:05 -06:00
Matt Horsnell
b13a79ee71 O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA. 2011-01-18 16:30:05 -06:00
Matt Horsnell
c98df6f8c2 O3: Don't test misprediction on load instructions until executed. 2011-01-18 16:30:05 -06:00
Ali Saidi
1167ef19cf O3: Keep around the last committed instruction and use for squashing.
Without this change 0 is always used for the youngest sequence number if
a squash occurred and the ROB was empty (e.g. an instruction is marked
serializeAfter or a fetch stall prevents other instructions from issuing).
Using 0, there is a race to rename where an instruction that committed the
same cycle as the squashing instruction can have its renamed state undone
by the squash using sequence number 0.
2011-01-18 16:30:05 -06:00
Ali Saidi
ea058b14da O3: Don't try to scoreboard misc registers.
I'm not positive this is the correct fix, but it's working right now.
Either we need to do something like this, prevent the misc reg from being renamed at all,
or there is something else going on. We need to find the root cause as to why
this is only a problem sometimes.
2011-01-18 16:30:05 -06:00
Matt Horsnell
adbd84ab9f ARM: The ARM decoder should not panic when decoding undefined holes in arch.
This can abort simulations when the fetch unit runs ahead and speculatively
decodes instructions that are off the execution path.
2011-01-18 16:30:05 -06:00
Matt Horsnell
11bef2ab38 O3: Fix corner cases where multiple squashes/fetch redirects overwrite timebuf. 2011-01-18 16:30:05 -06:00
Matt Horsnell
62f2097917 O3: Fix mispredicts from non control instructions.
The squash inside the fetch unit should not attempt to remove them from the
branch predictor as non-control instructions are not pushed into the predictor.
2011-01-18 16:30:05 -06:00
Matt Horsnell
5ebf3b2808 O3: Fixes the way prefetches are handled inside the iew unit.
This patch prevents the prefetch being added to the instCommit queue twice.
2011-01-18 16:30:02 -06:00
Ali Saidi
ee9a331fe5 O3: Support timing translations for O3 CPU fetch. 2011-01-18 16:30:02 -06:00
Ali Saidi
0f9a3671b6 ARM: Add support for moving predicated false dest operands from sources. 2011-01-18 16:30:02 -06:00
Min Kyu Jeong
96375409ea O3: Fixes fetch deadlock when the interrupt clears before CPU handles it.
When this condition occurs the cpu should restart the fetch stage to fetch from
the original execution path. Fault handling in the commit stage is cleaned up a
little bit so the control flow is simpler. Finally, if an instruction is being
used to carry a fault it isn't executed, so the fault propagates appropriately.
2011-01-18 16:30:01 -06:00
Ali Saidi
965a01d913 ARM: Use an actual NOP instead of an instruction that happens to do nothing 2011-01-18 16:30:01 -06:00
Ali Saidi
a3232b534b ARM: fix mismatched new/delete. 2011-01-18 16:30:01 -06:00
Gabe Black
a39096a8c3 Unit tests: Convert the refcnttest unit test to use the new EXPECT macros. 2011-01-18 01:27:04 -08:00
Gabe Black
c04571d601 Unit tests: Define a header file for common unit testing functions/macros. 2011-01-18 01:26:55 -08:00
Nathan Binkert
318bfe9d4f time: improve time datastructure
Use posix clock functions (and librt) if it is available.
Inline a bunch of functions and implement more operators.
* * *
time: more cleanup
2011-01-15 07:48:25 -08:00
Nilay Vaish
c82a8979a3 Change interface between coherence protocols and CacheMemory
The purpose of this patch is to change the way CacheMemory interfaces with
coherence protocols. Currently, whenever a cache controller (defined in the
protocol under consideration) needs to carry out any operation on a cache
block, it looks up the tag hash map and figures out whether or not the block
exists in the cache. In case it does exist, the operation is carried out
(which requires another lookup). As observed through profiling of different
protocols, multiple such lookups take place for a given cache block. It was
noted that the tag lookup takes anything from 10% to 20% of the simulation
time. In order to reduce this time, this patch is being posted.

I have to acknowledge that many of the thoughts that went into this
patch belong to Brad.

Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to CacheMemory class now returns a pointer
to a cache block entry, instead of a reference. The pointer is NULL in case
the block being looked up is not present in the cache. Similar change has
been carried out in the lookup function of the TBETable class.
2. Functions for setting and getting the access permission of a cache block have
been moved from CacheMemory class to AbstractCacheEntry class.
3. The allocate function in CacheMemory class now returns pointer to the
allocated cache entry.

Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry,
if != NULL, must point to the cache entry for the address on which the action
is being carried out. Similarly, tbe should also point to the transaction
buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an
argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the
actions, are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry, set/unset_tbe are to
be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() have been made available for testing whether
a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers
always.
7. It is now possible for an object of the derived class to make calls to
a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is the same as the NULL token
used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now take an optional parameter that asks for casting the
given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a
pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is
set to the type which inherits from 'AbstractCacheEntry'. There can only be
one such type in the machine. TBEType is set to the type for which 'TBE' is
used as the name.

All the protocols have been modified to conform with the new interface.
2011-01-17 18:46:16 -06:00
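A rough Python sketch of the lookup change described in c82a8979a3. The real classes are C++ and driven by SLICC; the class and function names below are simplified stand-ins, and the `lookups` counter exists only to make the saving visible. The idea shown is that `lookup` returns the entry itself (or `None`, playing the role of NULL), so the caller operates on the entry directly instead of probing the tag map a second time, and that the permission lives on the entry rather than on the cache.

```python
class CacheEntry:
    """Stand-in for AbstractCacheEntry: the permission lives on the entry."""

    def __init__(self, address):
        self.address = address
        self.permission = "Invalid"


class CacheMemory:
    """Stand-in cache whose lookup returns an entry or None."""

    def __init__(self):
        self._tags = {}
        self.lookups = 0  # instrumentation for this sketch only

    def allocate(self, address):
        entry = CacheEntry(address)
        self._tags[address] = entry
        return entry  # the allocated entry is returned to the caller

    def lookup(self, address):
        self.lookups += 1
        return self._tags.get(address)  # None when the block is absent


def set_permission(cache, address, permission):
    """One tag lookup, then work on the returned entry."""
    entry = cache.lookup(address)
    if entry is None:
        return False
    entry.permission = permission
    return True
```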
Gabe Black
371603f12c SPARC: Adjust the "call" instruction so R15 doesn't get marked as a source. 2011-01-15 15:30:17 -08:00
Nilay Vaish
47ba26f6b3 Ruby: Fixes MESI CMP directory protocol
The current implementation of MESI CMP directory protocol is broken.
This patch, from Arkaprava Basu, fixes the protocol.
2011-01-13 22:17:11 -06:00
Korey Sewell
cd5a7f7221 inorder: fix RUBY_FS build
the current code was using incorrect dummy instruction in interrupts function
2011-01-12 11:52:29 -05:00
Nathan Binkert
bd18ac8287 ruby: get rid of ruby's Debug.hh
Get rid of the Debug class
Get rid of ASSERT and use assert
Use DPRINTFR for ProtocolTrace
2011-01-10 11:11:20 -08:00
Nathan Binkert
8e262adf4f stats: Add a histogram statistic type 2011-01-10 11:11:17 -08:00
Nathan Binkert
b9ddc1a726 stats: fix stat test from curTick change 2011-01-10 11:11:17 -08:00
Nathan Binkert
ff592e0ed1 stats: fix the distribution stat 2011-01-10 11:11:16 -08:00
Gabe Black
ae7e67f334 Root: Get rid of unnecessary includes in root.cc. 2011-01-10 04:53:34 -08:00
Gabe Black
df14312e08 Curtick: Fix mysql.cc build needing curTick. 2011-01-10 04:53:20 -08:00
Gabe Black
dc64732dee RefCount: Add a unit test for reference counting pointers.
This test exercises each of the functions in the reference counting pointer
implementation individually (except get()) and verifies they have some
minimally expected behavior. It also checks that reference counted objects
are freed when their usage count goes to 0 in some basic situations,
specifically a pointer being set to NULL and a pointer being deleted.
2011-01-10 03:56:42 -08:00
Steve Reinhardt
6f1187943c Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
c22be9f2f0 stats: rename StatEvent() function to schedStatEvent().
This follows the style rules and is more descriptive.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
94807214c4 sim: clean up CountedDrainEvent slightly.
There's no reason for it to derive from SimLoopExitEvent.
This whole drain thing needs to be redone eventually,
but this is a stopgap to make later changes to
SimLoopExitEvent feasible.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
030736a69b sim: delete unused CheckSwapEvent code.
There's no way to even create one of these anymore.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
df9f99567d pseudoinst: get rid of mainEventQueue references.
Avoid direct references to mainEventQueue in pseudo-insts
by indirecting through associated CPU object.
Made exitSimLoop() more flexible to enable some of these.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
d60c293bbc inorder: replace schedEvent() code with reschedule().
There were several copies of similar functions that looked
like they all replicated reschedule(), so I replaced them
with direct calls.  Keeping this separate from the previous
cset since there may be some subtle functional differences
if the code ever reschedules an event that is scheduled but
not squashed (though none were detected in the regressions).
2011-01-07 21:50:29 -08:00
Steve Reinhardt
214cc0fafc inorder: get rid of references to mainEventQueue.
Events need to be scheduled on the queue assigned
to the SimObject, not on the global queue (which
should be going away).
Also cleaned up a number of redundant expressions
that made the code unnecessarily verbose.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
d650f4138e scons: show sources and targets when building, and colorize output.
I like the brevity of Ali's recent change, but the ambiguity of
sometimes showing the source and sometimes the target is a little
confusing.  This patch makes scons typically list all sources and
all targets for each action, with the common path prefix factored
out for brevity.  It's a little more verbose now but also more
informative.

Somehow Ali talked me into adding colors too, which is a whole
'nother story.
2011-01-07 21:50:13 -08:00
Nilay Vaish
d36cc62c11 Ruby: Updates MOESI Hammer protocol
This patch changes the manner in which data is copied from L1 to L2 cache in
the implementation of the Hammer's cache coherence protocol. Earlier, data was
copied directly from one cache entry to another. This has been broken into
two parts. First, the data is copied from the source cache entry to a
transaction buffer entry. Then, data is copied from the transaction buffer
entry to the destination cache entry.

This has been done to maintain the invariant - at any given instant, multiple
caches under a controller are exclusive with respect to each other.
2011-01-04 21:40:49 -06:00
Gabe Black
498ea0bdab Params: Print the IP components in the right order. 2011-01-04 17:11:49 -05:00
Steve Reinhardt
89cf3f6e85 Move sched_list.hh and timebuf.hh from src/base to src/cpu.
These files really aren't general enough to belong in src/base.
This patch doesn't reorder include lines, leaving them unsorted
in many cases, but Nate's magic script will fix that up shortly.

--HG--
rename : src/base/sched_list.hh => src/cpu/sched_list.hh
rename : src/base/timebuf.hh => src/cpu/timebuf.hh
2011-01-03 14:35:47 -08:00
Steve Reinhardt
2f4c71968a Delete unused files from src/base directory. 2011-01-03 14:35:45 -08:00
Steve Reinhardt
c69d48f007 Make commenting on close namespace brackets consistent.
Ran all the source files through 'perl -pi' with this script:

s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;

Also did a little manual editing on some of the arch/*/isa_traits.hh files
and src/SConscript.
2011-01-03 14:35:43 -08:00
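For readers who do not use Perl, the three substitutions quoted in c69d48f007 can be approximated in Python as below. This is a sketch, not the script that was actually run: the function name is made up, and `count=1` is used to mirror a Perl `s|||` without the `g` flag. Like `perl -p`, it expects one line at a time with its trailing newline.

```python
import re

# (pattern, replacement) pairs transcribed from the commit message.
_RULES = [
    (re.compile(r"\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?"),
     r"} // namespace \3"),
    (re.compile(r"\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*"),
     r"} // namespace \2\n"),
    (re.compile(r"\s*};?\s*//\s*(\S+)\s*namespace\s*"),
     r"} // namespace \1\n"),
]


def normalize_namespace_comment(line):
    """Apply the three rules in order to a single source line."""
    for pattern, replacement in _RULES:
        line = pattern.sub(replacement, line, count=1)
    return line
```

Lines that do not carry a namespace-closing comment pass through unchanged, since every rule requires a comment containing the word `namespace`.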
Gabe Black
1a10ccc5e5 RefCount: Fix reference counting pointer == and != with a T* on the left.
These operators were expecting a const T& instead of a const T*, and were not
being picked up and used by gcc in the right places as a result. Apparently no
one used these operators before. A unit test which exposed these problems,
verified the solution, and checks other basic functionality is on the way.
2011-01-03 15:31:20 -05:00
Nathan Binkert
d6ad7419ff swig: use <> for system %includes instead of "" 2010-12-30 12:51:04 -05:00
Nilay Vaish
04f5bb34ce PerfectCacheMemory: Add return statements to two functions.
Two functions in src/mem/ruby/system/PerfectCacheMemory.hh, tryCacheAccess()
and cacheProbe(), end with calls to panic(). Both of these functions have
return type other than void. Any file that includes this header file fails
to compile because of the missing return statement. This patch adds dummy
values so as to avoid the compiler warnings.
2010-12-23 13:36:18 -06:00
Nilay Vaish
58fa2857e1 This patch removes the WARN_* and ERROR_* from src/mem/ruby/common/Debug.hh file. These statements have been replaced with warn(), panic() and fatal() defined in src/base/misc.hh 2010-12-22 23:15:24 -06:00
Steve Reinhardt
2c0e80f96b memtest: delete some crufty dead code 2010-12-21 22:57:29 -08:00
Steve Reinhardt
3e0ed66ff2 Get rid of unused file src/base/dbl_list.hh 2010-12-21 22:39:26 -08:00
Nathan Binkert
88033eb608 stats: allow stats to be reset even if no objects have been instantiated 2010-12-21 08:02:41 -08:00
Nathan Binkert
c24f1df343 importer: fix error message 2010-12-21 08:02:40 -08:00
Nathan Binkert
a7d9e5c9e0 scons: remove extra dependencies 2010-12-21 08:02:39 -08:00
Gabe Black
672d6a4b98 Style: Replace some tabs with spaces. 2010-12-20 16:24:40 -05:00
Gabe Black
89850d6370 Params: Fix a broken error message in verifyIp. 2010-12-20 04:20:58 -05:00
Gabe Black
2ff3e6b399 ARM: Take advantage of new PCState syntax. 2010-12-09 14:45:17 -08:00
Gabe Black
24c5b5925d ARM: Get rid of some unused FP operands. 2010-12-09 14:45:04 -08:00
Gabe Black
55978f0395 Merge. 2010-12-08 16:52:38 -08:00
Brad Beckmann
7e42b753e7 ruby: remove Ruby asserts for m5.fast
This diff is for changing the way ASSERT is handled in Ruby. m5.fast
compiles out the assert statements by using the macro NDEBUG. Ruby uses the
macro RUBY_NO_ASSERT to do so. This macro has been removed and NDEBUG has
been put in its place.
2010-12-08 11:52:02 -08:00
Gabe Black
5a895ab92c Alpha: Take advantage of new PCState syntax. 2010-12-08 10:55:33 -08:00
Gabe Black
f26051eb1a MIPS: Take advantage of new PCState syntax. 2010-12-08 10:45:14 -08:00
Gabe Black
7f3f90f71d POWER: Take advantage of new PCState syntax. 2010-12-08 10:33:03 -08:00
Gabe Black
f01d2efe8a SPARC: Take advantage of new PCState syntax. 2010-12-08 00:27:43 -08:00
Gabe Black
d3e021820e X86: Take advantage of new PCState syntax. 2010-12-08 00:27:23 -08:00
Gabe Black
4c9b023a7a ISA: Get the parser to support pc state components more elegantly. 2010-12-07 23:08:05 -08:00
Ali Saidi
42ba158479 O3: Allow a store entry to store up to 16 bytes (instead of TheISA::IntReg).
The store queue doesn't need to be ISA specific and architectures can
frequently store more than an int register's worth of data. 128 bits seems
more common, but even 256 bits may be appropriate. Pretty much anything less
than a cache line size is buildable.
2010-12-07 16:19:57 -08:00
Ali Saidi
e681c0f7b3 O3: Support squashing all state after special instruction
For SPARC ASIs are added to the ExtMachInst. If the ASI is changed simply
marking the instruction as Serializing isn't enough because that only
stops rename. This provides a mechanism to squash all the instructions
and refetch them
2010-12-07 16:19:57 -08:00
Giacomo Gabrielli
719f9a6d4f O3: Make all instructions that write a misc. register not perform the write until commit.
ARM instructions updating cumulative flags (ARM FP exceptions and saturation
flags) are not serialized.

Added aliases for ARM FP exceptions and saturation flags in FPSCR.  Removed
write accesses to the FP condition codes for most ARM VFP instructions: only
VCMP and VCMPE instructions update the FP condition codes.  Removed a potential
cause of seg. faults in the O3 model for NEON memory macro-ops (ARM).
2010-12-07 16:19:57 -08:00
Min Kyu Jeong
4bbdd6ceb2 O3: Support SWAP and predicated loads/store in ARM. 2010-12-07 16:19:57 -08:00
Ali Saidi
21bfbd422c ARM: Support switchover with hardware table walkers 2010-12-07 16:19:57 -08:00
Nilay Vaish
658849d101 ruby: Converted old ruby debug calls to M5 debug calls
This patch developed by Nilay Vaish converts all the old GEMS-style ruby
debug calls to the appropriate M5 debug calls.
2010-12-01 11:30:04 -08:00
Ali Saidi
0f039fe447 IGbE: return 0 on an invalid descriptor size instead of -1.
Asserts where descSize() gets called will assert if we end up returning
0.
2010-11-26 20:47:23 -05:00
Gabe Black
7f6ca0981f Copyright: Add AMD copyright to the param changes I just made. 2010-11-23 17:08:41 -05:00
Gabe Black
b3de4855c3 Params: Add parameter types for IP addresses in various forms.
New parameter forms are:
IP address in the format "a.b.c.d" where a-d are from decimal 0 to 255.
IP address with netmask which is an IP followed by "/n" where n is a netmask
length in bits from decimal 0 to 32 or by "/e.f.g.h" where e-h are from
decimal 0 to 255 and which is all 1 bits followed by all 0 bits when
represented in binary. These can also be specified as an integral IP and
netmask passed in separately.
IP address with port which is an IP followed by ":p" where p is a port index
from decimal 0 to 65535. These can also be specified as an integral IP and
port value passed in separately.
2010-11-23 15:54:43 -05:00
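The string formats listed in b3de4855c3 can be sketched as a small standalone parser. This is not the M5 params code; the function names are invented for illustration and only the textual forms from the commit message are handled ("a.b.c.d", "/n" or "/e.f.g.h", and ":p").

```python
def parse_ip(text):
    """Parse "a.b.c.d" (each part decimal 0..255) into a 32-bit integer."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated parts: %r" % text)
    value = 0
    for part in parts:
        if not part.isdigit() or int(part) > 255:
            raise ValueError("invalid IP component: %r" % part)
        value = (value << 8) | int(part)
    return value


def parse_netmask(text):
    """Parse "n" (0..32) or "e.f.g.h" into a netmask length in bits."""
    if "." not in text:
        if not text.isdigit() or int(text) > 32:
            raise ValueError("invalid netmask length: %r" % text)
        return int(text)
    mask = parse_ip(text)
    inverted = ~mask & 0xFFFFFFFF
    # All 1 bits followed by all 0 bits means the inverse is 2**k - 1,
    # which shares no set bits with inverted + 1.
    if inverted & (inverted + 1):
        raise ValueError("netmask bits are not contiguous: %r" % text)
    return bin(mask).count("1")


def parse_ip_netmask(text):
    """Parse "a.b.c.d/n" or "a.b.c.d/e.f.g.h" into (ip, length)."""
    ip, sep, mask = text.partition("/")
    if not sep:
        raise ValueError("missing '/': %r" % text)
    return parse_ip(ip), parse_netmask(mask)


def parse_ip_port(text):
    """Parse "a.b.c.d:p" with p in 0..65535 into (ip, port)."""
    ip, sep, port = text.partition(":")
    if not sep or not port.isdigit() or int(port) > 65535:
        raise ValueError("invalid port: %r" % text)
    return parse_ip(ip), int(port)
```

Both netmask spellings reduce to the same length in bits, so "192.168.1.0/24" and "192.168.1.0/255.255.255.0" parse to the same pair.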
Gabe Black
40d434d551 X86: Loosen an assert for x86 and connect the APIC ports when caches are used. 2010-11-23 06:11:50 -05:00
Gabe Black
3cd349f443 X86: Obey the PCD (cache disable) bit in the page tables. 2010-11-23 06:10:17 -05:00
Gabe Black
c8c921b9db X86: Mark IO space accesses as uncachable. 2010-11-22 05:49:03 -05:00
Gabe Black
6a00519e73 IDE,X86: Fix IDE controller BAR configuration for x86. 2010-11-22 02:33:47 -05:00
Nathan Binkert
4d9ff1954b random: small comment about our random number generator and its origin 2010-11-20 12:12:27 -08:00
Ali Saidi
34a8e37c13 SE: Fix simulating more than 4GB of RAM in SE mode
This change removes some dead code in PhysicalMemory, uses a 64 bit type
for the page pointer in System (instead of 32 bit) and cleans up some style.
2010-11-19 18:01:01 -06:00
Ali Saidi
e1b9a815dd SCons: Support building without an ISA 2010-11-19 18:00:39 -06:00
Gabe Black
92655b6399 O3: Fix fp destination register flattening, and index offset adjusting.
This change makes O3 flatten floating point destination registers, and also
fixes misc register flattening so that it's correctly repositioned relative to
the resized regions for integer and floating point indices.

It also fixes some overly long lines.
2010-11-18 13:11:36 -05:00
Gabe Black
8b9b85e92c O3: Make O3 support variable length instructions. 2010-11-15 19:37:03 -08:00
Ali Saidi
776c075917 O3: reset architectural state by calling clear() 2010-11-15 14:04:05 -06:00
Ali Saidi
5f59e195d6 ARM: Add comment about the organization of the IT state register 2010-11-15 14:04:05 -06:00
Giacomo Gabrielli
0058927190 CPU/ARM: Add SIMD op classes to CPU models and ARM ISA. 2010-11-15 14:04:04 -06:00
Min Kyu Jeong
745df74fe0 O3: prevent a squash when completeAcc() modifies misc reg through TC.
This happens on ARM instructions when they update the IT state bits.
Code and associated comment was copied from execute() and initiateAcc() methods
2010-11-15 14:04:04 -06:00
Ali Saidi
4a1814bd52 ARM: Return a FailUnimp instruction when an unimplemented CP15 register is accessed.
Just panicking in readMiscReg() doesn't work because a speculative access
in the o3 model can end the simulation.
2010-11-15 14:04:04 -06:00
Ali Saidi
d4767f440a SCons: Cleanup SCons output during compile 2010-11-15 14:04:04 -06:00
William Wang
6fbea15064 ARM: Add a Keyboard Mouse Interface controller 2010-11-15 14:04:03 -06:00
William Wang
fc1eeafc94 ARM: Implement a CLCD Frame buffer 2010-11-15 14:04:03 -06:00
William Wang
80db6a5ecb ARM: Add support for GDB on ARM
--HG--
rename : src/arch/alpha/remote_gdb.cc => src/arch/arm/remote_gdb.cc
2010-11-15 14:04:03 -06:00
Ali Saidi
06864386a1 ARM: Make utility.hh meet style guidelines 2010-11-15 14:04:03 -06:00
Ali Saidi
d7b8efa0df ARM: Add support for a dumb IDE controller 2010-11-15 14:04:03 -06:00
Ali Saidi
13931b9b82 ARM: Cache the misc regs at the TLB to limit readMiscReg() calls. 2010-11-15 14:04:03 -06:00
Ali Saidi
4c2e5c282b ARM: Add support for switching CPUs 2010-11-15 14:04:03 -06:00
Ali Saidi
08c5673d56 ARM: Use the correct delete operator for RFE 2010-11-15 14:04:03 -06:00
Ali Saidi
50431f4eab ARM: Fix SRS instruction to micro-code memory operation and register update.
Previously the SRS instruction attempted to writeback in initiateAcc() which
worked until a recent change, but was incorrect.
2010-11-15 14:04:03 -06:00
Ali Saidi
16f210da37 CPU: Fix bug when a split transaction is issued to a faster cache
In the case of a split transaction and a cache that is faster than a CPU we
could get two responses before next_tick expires. Add an event that is
scheduled in this case and return false rather than asserting.
2010-11-15 14:04:03 -06:00
Ali Saidi
265e145db2 ARM: Do something predictable for an UNPREDICTABLE branch. 2010-11-15 14:04:03 -06:00
Gabe Black
46472279c0 Params: Fix an off by one error and a misleading comment. 2010-11-11 11:58:09 -08:00
Gabe Black
3c237f44c9 SimObject: Add a comment near clear_child that it's unlikely to be called. 2010-11-11 11:41:13 -08:00
Gabe Black
cdc585e0e8 SPARC: Clean up some historical style issues. 2010-11-11 02:03:58 -08:00
Gabe Black
2fd9dc19cd SimObject: Use "self" when calling the clear_child method. 2010-11-09 10:45:02 -08:00
Gabe Black
388124492e X86: Fix X86_FS compilation. 2010-11-08 12:43:38 -08:00
Ali Saidi
057b451773 ARM: Add some TLB statistics for ARM 2010-11-08 13:58:25 -06:00
Ali Saidi
a1e8225975 ARM: Add checkpointing support 2010-11-08 13:58:25 -06:00
Ali Saidi
432fa0aad6 ARM: Add support for M5 ops in the ARM ISA 2010-11-08 13:58:24 -06:00
Ali Saidi
0f2bbe15dd ARM: Keep the warnings to a minimum.
These warnings still need to be addressed, but pages of them are
counterproductive.
2010-11-08 13:58:24 -06:00
Ali Saidi
c779af4e12 Mem: Finish half-baked support for mmaping file in physmem.
Physmem has a parameter to be able to mem map a file, however
it isn't actually used. This changeset utilizes the parameter
so a file can be mmapped.
2010-11-08 13:58:24 -06:00
Ali Saidi
ea1167dd9f Bus: Have the I/O devices that return address ranges print them out.
This way we actually get device names associated with the devices.
2010-11-08 13:58:24 -06:00
Ali Saidi
e6c31ceb2b ARM: Don't return the result of a table walk the same cycle it's completed.
The L1 cache may have been accessed to provide this data, which confuses
it if it ends up being accessed twice in one cycle. Instead wait 1 tick
which will force the timing simple CPU to forward to its next clock cycle
when the translation completes.

Also prevent multiple outstanding table walks from occurring at once.
2010-11-08 13:58:24 -06:00
Ali Saidi
cdacbe734a ARM/Alpha/Cpu: Change prefetches to be more like normal loads.
This change modifies the way prefetches work. They are now like normal loads
that don't writeback a register. Previously prefetches were supposed to call
prefetch() on the execution context, so they executed with execute() methods
instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs
are blank, meaning that they get executed, but don't actually do anything.

On Alpha dead cache copy code was removed and prefetches are now normal ops.
They count as executed operations, but still don't do anything and IsMemRef is
no longer set on them.

On ARM IsDataPrefetch or IsInstructionPrefetch is now set on all prefetch
instructions. The timing simple CPU doesn't try to do anything special for
prefetches now and they execute with the normal memory code path.
2010-11-08 13:58:22 -06:00
Ali Saidi
f4f5d03ed2 ARM: Make all ARM uops delayed commit. 2010-11-08 13:58:22 -06:00
Ali Saidi
0ea794bcf4 sim: Use forward declarations for ports.
Virtual ports need TLB data which means anything touching a file in the arch
directory rebuilds any file that includes system.hh which is everything.
2010-11-08 13:58:22 -06:00
Gabe Black
72b5262278 scons: Replace the build_dir parameter to SConscript with variant_dir.
The build_dir parameter name has been deprecated and replaced with
variant_dir. This change switches us over to avoid warning spew in newer
versions of scons.
2010-11-06 17:48:58 -07:00
Gabe Black
6f4bd2c1da ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARM's mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extent MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes is
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs, however it needs to, to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.
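
As a rough illustration of the interface described in this section (a sketch,
not the code from this change), a PC type for an ISA without delay slots or
microcode might look like the following. ToyPCState, ToyContext, and the
fixed 4-byte MachInst are made-up names and assumptions for the example; only
pcState(), instAddr(), nextInstAddr(), microPC(), and branching() come from
the description above.

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;
using MachInst = uint32_t;  // assumption: fixed 4-byte instructions

// Hypothetical PC state for an ISA with no delay slots and no microcode.
class ToyPCState
{
    Addr _pc = 0;
    Addr _npc = 0;

  public:
    // Reset to a particular pc, with npc = pc + sizeof(MachInst).
    void set(Addr val) { _pc = val; _npc = val + sizeof(MachInst); }

    // Read-only accessors named in the text above.
    Addr instAddr() const { return _pc; }
    Addr nextInstAddr() const { return _npc; }
    Addr microPC() const { return 0; }  // no microcode in this toy ISA

    // Redirect the next PC, e.g. for a taken branch.
    void npc(Addr val) { _npc = val; }

    // True when the next PC is not the sequential successor.
    bool branching() const { return _npc != _pc + sizeof(MachInst); }

    bool operator==(const ToyPCState &other) const
    { return _pc == other._pc && _npc == other._npc; }

    // Defined trivially in terms of ==.
    bool operator!=(const ToyPCState &other) const
    { return !(*this == other); }
};

// Hypothetical stand-in for a *Context class.
class ToyContext
{
    ToyPCState _pcState;

  public:
    // Reads or writes depending on whether there is an argument.
    const ToyPCState &pcState() const { return _pcState; }
    void pcState(const ToyPCState &val) { _pcState = val; }

    // Convenience accessors that forward to the PC state.
    Addr instAddr() const { return _pcState.instAddr(); }
    Addr nextInstAddr() const { return _pcState.nextInstAddr(); }
    Addr microPC() const { return _pcState.microPC(); }
};
```

A caller that wants to redirect control flow would read the whole state with
pcState(), modify the copy, and write it back with pcState(copy), so all the
PC elements move together in one call.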


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.
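
The idea can be sketched roughly as below. This is an illustration under
assumptions rather than the code in this change: ToyPCState, ToyStaticInst,
SimpleInst, and MicroOp are hypothetical names, and instructions are assumed
to be 4 bytes long. Only the advancePC virtual function itself comes from the
description above.

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Hypothetical PC state for a microcoded ISA with 4-byte instructions.
struct ToyPCState
{
    Addr pc = 0, npc = 4;
    Addr upc = 0, nupc = 1;

    void set(Addr val) { pc = val; npc = val + 4; upc = 0; nupc = 1; }

    // Step to the next instruction.
    void advance() { pc = npc; npc += 4; }
    // Step to the next microop inside the same macroop.
    void uAdvance() { upc = nupc; nupc++; }
    // Leave the macroop: next instruction, micro PCs reset.
    void uEnd() { advance(); upc = 0; nupc = 1; }
};

// Hypothetical stand-in for StaticInst.
struct ToyStaticInst
{
    virtual ~ToyStaticInst() = default;
    // Each kind of instruction knows which PC element to move.
    virtual void advancePC(ToyPCState &pc) const = 0;
};

// A plain instruction just assigns NPC to PC.
struct SimpleInst : ToyStaticInst
{
    void advancePC(ToyPCState &pc) const override { pc.advance(); }
};

// A microop moves the micro PC unless it is the last one in its macroop.
struct MicroOp : ToyStaticInst
{
    bool lastMicroop;
    explicit MicroOp(bool last) : lastMicroop(last) {}

    void advancePC(ToyPCState &pc) const override
    {
        if (lastMicroop)
            pc.uEnd();
        else
            pc.uAdvance();
    }
};
```

The CPU model then only calls inst.advancePC(pc) and never has to decide for
itself which dimension of the PC to increment.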

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.
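
A minimal sketch of that arrangement, assuming a made-up encoding in which
the low two bits of the first byte give the instruction length minus one.
ToyPredecoder, ToyPCState, ToyExtMachInst, and moreBytes() as written here
are hypothetical; the only part taken from the description above is that
getExtMachInst receives the PC by reference and sets NPC from the length.

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Hypothetical PC state with just the two elements used here.
struct ToyPCState
{
    Addr pc = 0;
    Addr npc = 0;
};

// Hypothetical decoded form of a variable length instruction.
struct ToyExtMachInst
{
    uint8_t opcode;
    unsigned length;  // in bytes
};

class ToyPredecoder
{
    uint8_t firstByte = 0;

  public:
    // Feed the predecoder the first byte of the instruction.
    void moreBytes(uint8_t byte) { firstByte = byte; }

    // The PC comes in by reference so NPC can be set to PC + length.
    ToyExtMachInst getExtMachInst(ToyPCState &pc) const
    {
        // Toy encoding (an assumption): low two bits hold length - 1.
        ToyExtMachInst inst{firstByte, (firstByte & 0x3u) + 1u};
        pc.npc = pc.pc + inst.length;
        return inst;
    }
};
```

Because the reference is non-const, the caller cannot tell whether the PC
changed, which is why the CPU always writes it back.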


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons: this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder was adjusted so that it always stores the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.
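
To make the microop case concrete, here is a simplified sketch of two
possible versions of such a helper. The single-argument signature, the
SimplePCState and MicroPCState types, and the 4-byte instruction size are
assumptions for illustration; the delay slot variant that merges the return
PC and the call PC is not shown.

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Hypothetical PC state for an ISA without microcode (4-byte instructions).
struct SimplePCState
{
    Addr pc, npc;
    void advance() { pc = npc; npc += 4; }
};

// Hypothetical PC state for a microcoded ISA.
struct MicroPCState
{
    Addr pc, npc, upc, nupc;
    // Jump to the start of the next macroop. npc is left for the
    // predecoder to recompute once that instruction is seen.
    void uEnd() { pc = npc; upc = 0; nupc = 1; }
};

// No microcode: the return PC is simply the instruction after the call.
SimplePCState buildRetPC(const SimplePCState &callPC)
{
    SimplePCState retPC = callPC;
    retPC.advance();
    return retPC;
}

// Microcode: skip any remaining microops of the call's macroop.
MicroPCState buildRetPC(const MicroPCState &callPC)
{
    MicroPCState retPC = callPC;
    retPC.uEnd();
    return retPC;
}
```

Since the RAS stores the call's own PC, the return target is computed only
when it is needed, in whichever way suits the ISA.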


Change in stats:

There was no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.
2010-10-31 00:07:20 -07:00
Gabe Black
373154a25a X86: Fault on divide by zero instead of panicking. 2010-10-29 02:20:47 -07:00
Gabe Black
7378424b14 X86: Make syscalls also serialize after. 2010-10-29 02:20:46 -07:00
Gabe Black
d5dbd91f3d O3: Get rid of a bunch of commented out lines. 2010-10-24 00:43:32 -07:00
Gabe Black
2eae11be64 X86: Make nop a regular, non-microcoded instruction.
Code in the CPUs that need a nop to carry a fault can't easily deal with a
microcoded nop. This instruction format provides for one that isn't.

--HG--
rename : src/arch/x86/isa/formats/syscall.isa => src/arch/x86/isa/formats/nop.isa
2010-10-22 00:24:15 -07:00
Gabe Black
23f6196d61 X86: Implement genMachineCheckFault.
Even though this shouldn't ever be used, it might get called speculatively and
shouldn't panic.
2010-10-22 00:24:08 -07:00
Gabe Black
255685534a X86: Make syscall instructions non-speculative in SE. 2010-10-22 00:23:50 -07:00
Gabe Black
29676286c8 ISA: Simplify various implementations of completeAcc. 2010-10-22 00:23:19 -07:00
Gabe Black
bc49381287 ARM: Don't pretend to writeback registers in initiateAcc. 2010-10-22 00:22:59 -07:00
Steve Reinhardt
45aebaccde cache: minor SC assertion fix
Thanks to Joe Gross for finding/testing this.
2010-10-18 13:05:15 -07:00
Gabe Black
968447db66 MIPS: Get rid of the backdoor device copy/pasted from and only used in Alpha. 2010-10-17 23:15:53 -07:00
Gabe Black
b289966a78 Mem: Reclaim some request flags used by MIPS for alignment checking.
These flags were being used to identify what alignment a request needed, but
the same information is available using the request size. This change also
eliminates the isMisaligned function. If more complicated alignment checks are
needed, they can be signaled using the ASI_BITS space in the flags vector like
is currently done with ARM.
2010-10-16 00:00:54 -07:00
Gabe Black
ab9f062166 GetArgument: Rework getArgument so that X86_FS compiles again.
When no size is specified for an argument, push the decision about what size
to use into the ISA by passing a size of -1.
2010-10-15 23:57:06 -07:00
Gabe Black
b197a542b4 SPARC: Get rid of the copy/pasted StackTrace stolen from Alpha. 2010-10-14 14:02:23 -07:00
Gabe Black
930c653270 Mem: Change the CLREX flag to CLEAR_LL.
CLREX is the name of an ARM instruction, not a name for this generic flag.
2010-10-13 01:57:31 -07:00
Gabe Black
b273e0be33 X86: Detect attempts to load a 32 bit kernel and panic. 2010-10-10 20:39:26 -07:00
Gabe Black
157d6f9c2f SPARC: Make SPARC's ISA's clear function initialize everything it should.
Also make it not set some pointers to NULL potentially introducing a memory
leak. That should be done in the constructor.
2010-10-10 20:38:05 -07:00
Gabe Black
63fa65613e Alpha: Force all the IPRs to an initial, deterministic value when cleared. 2010-10-10 20:37:50 -07:00
Gabe Black
b4a76f0b0b Alpha: Initialize the data TLB mode IPR. 2010-10-10 20:37:39 -07:00
Gabe Black
9268f895d5 UART: Make the 8250's MCR return a deterministic value.
This change makes the 8250 device return the value it has for the MCR when
read instead of leaving the packet data unmodified/uninitialized. The value
the UART has for the MCR may not be right, but that's a separate issue that
apparently hasn't caused any problems to date.
2010-10-09 12:41:31 -07:00
Gabe Black
d4492190e6 Alpha: Fix Alpha NumMiscArchRegs constant.
Also add asserts in O3's Scoreboard class to catch bad indexes.
2010-10-04 11:58:06 -07:00
Ali Saidi
538acf2082 Power: Fix compile error from previous push. 2010-10-01 17:57:56 -05:00
Ali Saidi
dcaa0668ae ARM: Make the TLB a little bit faster by moving most recently used items to front of list 2010-10-01 16:04:04 -05:00
Ali Saidi
f0c0b8a7f6 ARM: Add a fake flash controller so that unmodified linux can boot
With this change an unmodified Linux kernel can boot in M5.
2010-10-01 16:04:02 -05:00
Prakash Ramrakhyani
9792bbc324 ARM: Fix some subtle bugs in the GIC
The GIC code can write to the registers with 8, 16, or 32 bit
accesses which could set/clear different numbers of interrupts.
2010-10-01 16:04:00 -05:00
Ali Saidi
521d68c82a ARM: Implement functional virtual to physical address translation
for debugging and program introspection.
2010-10-01 16:03:27 -05:00
Ali Saidi
518b5e5b1c Debug: Implement getArgument() and function skipping for ARM.
In the process, add skipFunction() to handle isa specific function skipping
instead of ifdefs and other ugliness. For almost all ABIs, 64 bit arguments can
only start in even registers.  Size is now passed to getArgument() so that 32
bit systems can make decisions about register selection for 64 bit arguments.
The number argument is now passed by reference because getArgument() will need
to change it based on the size of the argument and the current argument number.

For ARM, if the argument number is odd and a 64-bit register is requested the
number must first be incremented because all 64 bit arguments are passed
in an even argument register. Then the number will be incremented again to
access both halves of the argument.
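
The register selection can be sketched as follows. This is an illustration
only: the four-entry register array, the free-function signature, and the
low-half-first ordering are assumptions, and arguments passed on the stack
are not handled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: the first four arguments live in 32 bit registers r0-r3.
constexpr int NumArgRegs = 4;
using ArgRegs = std::array<uint32_t, NumArgRegs>;

// Hypothetical free-function version of getArgument(). The argument
// number is taken by reference because a 64 bit argument consumes two
// registers and may also need to skip one to reach an even register.
uint64_t
getArgument(const ArgRegs &regs, int &number, int size)
{
    if (size == 8) {
        // 64 bit arguments can only start in an even register.
        if (number % 2 != 0)
            number++;
        assert(number + 1 < NumArgRegs);

        // Assumption: the low half comes first, then the high half.
        uint64_t low = regs[number];
        number++;
        uint64_t high = regs[number];
        return low | (high << 32);
    }

    assert(number < NumArgRegs);
    return regs[number];
}
```

Starting with number == 1 and size == 8, the call skips r1, reads r2 and r3,
and leaves number at 3 for the caller to continue from.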
2010-10-01 16:02:46 -05:00
Ali Saidi
b331b02669 ARM: Clean up use of TBit and JBit.
Rather than constantly using ULL(1) << PcXBitShift, define those directly.
Additionally, add some helper functions to further clean up the code.
2010-10-01 16:02:45 -05:00
Ali Saidi
aef4a9904e CPU/Cache: Fix some errors exposed by valgrind 2010-09-30 09:35:19 -05:00
Gabe Black
c41e633e0e X86: Fix the RIP relative versions of the BT, BTC, BTR, and BTS instructions. 2010-09-29 11:31:03 -07:00
Steve Reinhardt
7bae1f5d43 python: get rid of internal.enums package.
Move generated enums into internal.params, which gets
imported into object.params, restoring backward
compatibility for scripts that expect to find them there.
2010-09-22 08:45:35 -07:00
Steve Reinhardt
e918536380 cache: improve coherence handling of writebacks
If we write back an exclusive copy, we now mark it
as such, so the cache receiving the writeback can
mark its copy as exclusive.  This avoids some
unnecessary upgrade requests when a cache later
tries to re-acquire exclusive access to the block.
2010-09-21 23:07:34 -07:00
Gabe Black
ab8d7eee76 CPU: Fix O3 and possible InOrder segfaults in FS. 2010-09-20 02:46:42 -07:00
Steve Reinhardt
3f9f4bf3d6 devices: undo cset 017baf09599f that added timer drain functions.
It's not the right fix for the checkpoint deadlock problem
Brad was having, and creates another bug where the system can
deadlock on restore.  Brad can't reproduce the original bug
right now, so we'll wait until it arises again and then try
to fix it the right way then.
2010-09-16 20:24:05 -07:00
Gabe Black
2dd9f4fcf0 X86: Make the halt microop non-speculative.
Executing this microop makes the CPU halt even if it was misspeculated.
2010-09-14 12:31:37 -07:00
Gabe Black
0bbd88eb40 X86: Make unrecognized instructions behave better in x86. 2010-09-14 12:27:30 -07:00
Gabe Black
0dd1f7f01a CPU: Trim unnecessary includes from some common files.
This reduces the scope of those includes and makes it less likely for there to
be a dependency loop. This also moves the hashing functions associated with
ExtMachInst objects to be with the ExtMachInst definitions and out of
utility.hh.
2010-09-14 00:29:38 -07:00
Gabe Black
8f3fbd2d13 CPU: Get rid of the now unnecessary getInst/setInst family of functions.
This code is no longer needed because of the preceding change which adds a
StaticInstPtr parameter to the fault's invoke method, obviating the only use
for this pair of functions.
2010-09-13 21:58:34 -07:00
Gabe Black
6833ca7eed Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.
Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
2010-09-13 19:26:03 -07:00
Nathan Binkert
2edfcbbaee swig: make all generated files go into the m5.internal package
This is necessary because versions of swig older than 1.3.39 fail to
do the right thing and try to do relative imports for everything (even
with the package= option to %module).  Instead of putting params in
the m5.internal.params package, put params in the m5.internal package
and make all param modules start with param_.  Same thing for
m5.internal.enums.

Also, stop importing all generated params into m5.objects.  They are
not necessary and now with everything using relative imports we wound
up with pollution of the namespace (where builtin-range got overridden).

--HG--
rename : src/python/m5/internal/enums/__init__.py => src/python/m5/internal/enums.py
rename : src/python/m5/internal/params/__init__.py => src/python/m5/internal/params.py
2010-09-12 15:41:34 -07:00
Nathan Binkert
afafaf1dcb style: fix sorting of includes and whitespace in some files 2010-09-10 14:58:04 -07:00
Nathan Binkert
47ef97b9ca scons: Stop building the big monolithic swigged params module
kill params.i and create a separate .i for each object (param, enums, etc.)
2010-09-09 14:26:29 -07:00
Nathan Binkert
e6ee56c657 init: don't build files that centralize python and swig code
Instead of putting all object files into m5/object/__init__.py, interrogate
the importer to find out what should be imported.
Instead of creating a single file that lists all of the embedded python
modules, use static object construction to put those objects onto a list.
Do something similar for embedded swig (C++) code.
2010-09-09 14:15:42 -07:00
Nathan Binkert
710ed8f492 scons: use code_formatter wherever we can in the build system 2010-09-09 14:15:41 -07:00
Nathan Binkert
c514ad9b09 code_formatter: make it easier to insert whitespace
Insert a newline by just doing "code()". indent() and dedent() now take a
"count" parameter to indent/dedent multiple levels.
2010-09-09 14:15:41 -07:00
Nathan Binkert
18ef1bcfa2 swig: don't override SWIG_name anymore
It doesn't appear to be necessary and it is somewhat odd.  I'm pretty
sure that the package parameter to %module does whatever this might
have been before.  It's necessary in future revisions anyway.
2010-09-09 14:15:40 -07:00
Steve Reinhardt
1249728494 cache: fail SC when invalidated while waiting for bus
Corrects an oversight in cset f97b62be544f.  The fix there only
failed queued SCUpgradeReq packets that encountered an
invalidation, which meant that the upgrade had to reach the L2
cache.  To handle pending requests in the L1 we must similarly
fail StoreCondReq packets too.
2010-09-09 14:40:19 -04:00
Steve Reinhardt
6dc599ea9b mem: fix functional accesses to deal with coherence change
We can't just obliviously return the first valid cache block
we find any more... see comments for details.
2010-09-09 14:40:19 -04:00
Steve Reinhardt
71aca6d29e cache: coherence protocol enhancements & bug fixes
Allow lower-level caches (e.g., L2 or L3) to pass exclusive
copies to higher levels (e.g., L1).  This eliminates a lot
of unnecessary upgrade transactions on read-write sequences
to non-shared data.

Also some cleanup of MSHR coherence handling and multiple
bug fixes.
2010-09-09 14:40:18 -04:00
Gabe Black
7c4dc4491a ARM: Get rid of the checkFpEnableFault function in ARM. 2010-08-31 09:50:49 -07:00
Gabe Black
ebf5c5b91b Alpha: Alpha's mt.hh was including mips header files. 2010-08-31 09:48:05 -07:00
Gabe Black
c9d01c6557 CPU: Get rid of the unused ev5_trap function on the simple and checker CPUs. 2010-08-31 09:47:29 -07:00
Gabe Black
794ca517f2 X86: Change the copyright holder to AMD.
I accidentally left myself as a placeholder copyright holder on this file when
I checked it in. Copyright should be assigned to AMD.
2010-08-27 15:35:36 -07:00
Steve Reinhardt
3ffc4505f7 mem: fix m5.fast compile bug in previous cset 2010-08-26 08:03:20 -07:00
Steve Reinhardt
1bf944be62 cache: fix a bug in atomic multilevel snoops 2010-08-25 21:55:55 -07:00
Steve Reinhardt
ee6a92863a memtest: fix/cleanup functional access testing
Don't assert that the response packet is marked as a response
since it won't always be so for functional accesses.

Also cleanup code to refer to functional accesses rather
than "probes" (old terminology), and mention in the
DPRINTF which type of access we're doing.
2010-08-25 21:55:44 -07:00
Ali Saidi
546eaa6109 CPU: Print out traces for faulting inst when the flag ExecFaulting is set 2010-08-25 19:10:43 -05:00
Min Kyu Jeong
dee8f3d500 ARM: Support unaligned memory access.
Without this flag set, page-crossing requests were not split into two mem
requests.

Depending on the alignment bit in the SCTLR, misaligned access could
raise a fault. However it seems unnecessary to implement that.
2010-08-25 19:10:43 -05:00
Gene WU
b52fed4747 ARM: Separate the queues of L1 and L2 walker states. 2010-08-25 19:10:43 -05:00
Min Kyu Jeong
c23e8c31eb ARM: Adding a bogus fault that does nothing.
This fault can be used to flush the pipe, not including the faulting instruction.

The particular case I needed this for was self-modifying code. It needed to
drain the store queue and force the following instruction to refetch from
icache. DCCMVAC cp15 mcr instruction is modified to raise this fault.
2010-08-25 19:10:43 -05:00
William Wang
8376f7bca3 ARM: Remove ALPHA KSeg functions.
These were erroneously copied years ago into the ARM directory.
2010-08-25 19:10:43 -05:00
Ali Saidi
c0b54f579c ARM: Limited implementation of dprintk.
Does not work with vfp arguments or arguments passed on the stack.
2010-08-25 19:10:43 -05:00
Min Kyu Jeong
e1168e72ca ARM: Fixed register flattening logic (FP_Base_DepTag was set too low)
When decoding a srs instruction, invalid mode encoding returns invalid instruction.
This can happen when garbage instructions are fetched from mispredicted path
2010-08-25 19:10:43 -05:00
Ali Saidi
edca5f7da6 ARM: Make VMSR, RFE PC/LR etc non speculative, and serializing 2010-08-25 19:10:43 -05:00
Gene WU
4d8f4db8d1 ARM: Use fewer micro-ops for register update loads if possible.
Allow some loads that update the base register to use just two micro-ops. Three
micro-ops are only used if the destination register matches the offset register
or the PC is the destination register. If the PC is updated it needs to be
the last micro-op otherwise O3 will mispredict.
2010-08-25 19:10:42 -05:00
Ali Saidi
c2d5d2b53d ARM: Set the high bits in the part number so it's considered new by some code. 2010-08-25 19:10:42 -05:00
Ali Saidi
99fafb72b8 ARM: Fix VFP enabled checks for mem instructions 2010-08-25 19:10:42 -05:00
Gabe Black
63464d950e ARM: Separate out the renamable bits in the FPSCR. 2010-08-25 19:10:42 -05:00
Gabe Black
93ce7238bf ARM: Eliminate some unused enums. 2010-08-25 19:10:42 -05:00
Gabe Black
0efe2f6769 ARM: Fix type comparison warnings in Neon. 2010-08-25 19:10:42 -05:00
Gabe Black
54a919f225 ARM: Implement CPACR register and return Undefined Instruction when FP access is disabled. 2010-08-25 19:10:42 -05:00
Gabe Black
6368edb281 ARM: Implement all ARM SIMD instructions. 2010-08-25 19:10:42 -05:00
Gabe Black
f4f6b31df1 ARM: Expand the mode checking utility functions.
inUserMode now can take either a threadcontext or a CPSR value directly. If
given a thread context it just extracts the CPSR and calls the other version.
An inPrivilegedMode function was also implemented which just returns the
opposite of inUserMode.
2010-08-25 19:10:41 -05:00
Ali Saidi
75955d6c42 Tracing: Fix trace so 'Predicated False' doesn't show up 2010-08-25 19:10:41 -05:00
Steve Reinhardt
62c06c1403 mem: fix dumb typo in copyrights 2010-08-25 14:08:27 -07:00
Brad Beckmann
e983ef9e8c testers: move testers to a new directory
This patch moves the testers to a new subdirectory under src/cpu and includes
the necessary fixes to work with latest m5 initialization patches.

--HG--
rename : configs/example/determ_test.py => configs/example/ruby_direct_test.py
rename : src/cpu/directedtest/DirectedGenerator.cc => src/cpu/testers/directedtest/DirectedGenerator.cc
rename : src/cpu/directedtest/DirectedGenerator.hh => src/cpu/testers/directedtest/DirectedGenerator.hh
rename : src/cpu/directedtest/InvalidateGenerator.cc => src/cpu/testers/directedtest/InvalidateGenerator.cc
rename : src/cpu/directedtest/InvalidateGenerator.hh => src/cpu/testers/directedtest/InvalidateGenerator.hh
rename : src/cpu/directedtest/RubyDirectedTester.cc => src/cpu/testers/directedtest/RubyDirectedTester.cc
rename : src/cpu/directedtest/RubyDirectedTester.hh => src/cpu/testers/directedtest/RubyDirectedTester.hh
rename : src/cpu/directedtest/RubyDirectedTester.py => src/cpu/testers/directedtest/RubyDirectedTester.py
rename : src/cpu/directedtest/SConscript => src/cpu/testers/directedtest/SConscript
rename : src/cpu/directedtest/SeriesRequestGenerator.cc => src/cpu/testers/directedtest/SeriesRequestGenerator.cc
rename : src/cpu/directedtest/SeriesRequestGenerator.hh => src/cpu/testers/directedtest/SeriesRequestGenerator.hh
rename : src/cpu/memtest/MemTest.py => src/cpu/testers/memtest/MemTest.py
rename : src/cpu/memtest/SConscript => src/cpu/testers/memtest/SConscript
rename : src/cpu/memtest/memtest.cc => src/cpu/testers/memtest/memtest.cc
rename : src/cpu/memtest/memtest.hh => src/cpu/testers/memtest/memtest.hh
rename : src/cpu/rubytest/Check.cc => src/cpu/testers/rubytest/Check.cc
rename : src/cpu/rubytest/Check.hh => src/cpu/testers/rubytest/Check.hh
rename : src/cpu/rubytest/CheckTable.cc => src/cpu/testers/rubytest/CheckTable.cc
rename : src/cpu/rubytest/CheckTable.hh => src/cpu/testers/rubytest/CheckTable.hh
rename : src/cpu/rubytest/RubyTester.cc => src/cpu/testers/rubytest/RubyTester.cc
rename : src/cpu/rubytest/RubyTester.hh => src/cpu/testers/rubytest/RubyTester.hh
rename : src/cpu/rubytest/RubyTester.py => src/cpu/testers/rubytest/RubyTester.py
rename : src/cpu/rubytest/SConscript => src/cpu/testers/rubytest/SConscript
2010-08-24 12:07:22 -07:00
Brad Beckmann
20b2f0ce9f MOESI_hammer: fixed bug for dma reads in single cpu systems 2010-08-24 12:06:53 -07:00
Gabe Black
c13640a89c Faults: Get rid of some commented out code in sim/faults.hh. 2010-08-23 16:23:47 -07:00
Gabe Black
25ffa8eb8b X86: Create a directory for files that define register indexes.
This is to help tidy up arch/x86. These files should not be used external to
the ISA.

--HG--
rename : src/arch/x86/apicregs.hh => src/arch/x86/regs/apic.hh
rename : src/arch/x86/floatregs.hh => src/arch/x86/regs/float.hh
rename : src/arch/x86/intregs.hh => src/arch/x86/regs/int.hh
rename : src/arch/x86/miscregs.hh => src/arch/x86/regs/misc.hh
rename : src/arch/x86/segmentregs.hh => src/arch/x86/regs/segment.hh
2010-08-23 16:14:24 -07:00
Gabe Black
7a6ed1b10b Power: Get rid of unused checkFpEnableFault.
This function was brought in from another ISA and doesn't actually do anything
or get used.
2010-08-23 16:14:23 -07:00
Gabe Black
943c171480 ISA: Get rid of old, unused utility functions cluttering up the ISAs. 2010-08-23 16:14:20 -07:00
Gabe Black
9581562e65 X86: Get rid of the flagless microop constructor.
This will reduce clutter in the source and hopefully speed up compilation.
2010-08-23 09:44:19 -07:00
Gabe Black
f6182f948b X86: Make the TLB fault instead of panic when something is unmapped in SE mode.
The fault object, if invoked, would then panic. This is a bit less direct, but
it means speculative execution won't panic the simulator.
2010-08-23 09:44:19 -07:00
Gabe Black
172e45fc97 X86: Make the x86 ExtMachInst serializable with (UN)SERIALIZE_SCALAR.
--HG--
rename : src/arch/x86/types.hh => src/arch/x86/types.cc
2010-08-23 09:44:19 -07:00
Gabe Black
249549f9c3 X86: Define a noop ExtMachInst. 2010-08-23 09:44:19 -07:00
Gabe Black
d43eb42d00 X86: Mark serializing macroops and regular instructions as such. 2010-08-23 09:44:19 -07:00
Gabe Black
69fc2af006 X86: Add a .serializing directive that makes a macroop serializing.
This directive really just tells the macroop to set IsSerializing and
IsSerializeAfter on its final microop.
2010-08-23 09:44:19 -07:00
Gabe Black
5a1dbe4d99 X86: Consolidate extra microop flags into one parameter.
This single parameter replaces the collection of bools that set up various
flavors of microops. A flag parameter also allows other flags to be set like
the serialize before/after flags, etc., without having to change the
constructor.
2010-08-23 09:44:19 -07:00
Gabe Black
b187e7c9cc CPU: Make the constants for StaticInst flags visible outside the class. 2010-08-23 09:44:19 -07:00
Min Kyu Jeong
d8d6b869a2 O3: Skipping mem-order violation check for uncachable loads.
Uncachable load is not executed until it reaches the head of the ROB,
hence cannot cause one.
2010-08-23 11:18:42 -05:00
Min Kyu Jeong
e6a0be648e ARM: Improve printing of uop disassembly. 2010-08-23 11:18:42 -05:00
Min Kyu Jeong
d2fac84b95 ARM: Clean up flattening for SPSR adding 2010-08-23 11:18:41 -05:00
Gene Wu
a02d82f9f8 ARM: Implement DBG instruction that doesn't do much for now. 2010-08-23 11:18:41 -05:00
Gene Wu
d6736384b2 MEM: Make CLREX a first class request operation and clear locks in caches when it is received 2010-08-23 11:18:41 -05:00
Gene Wu
23626d99af ARM: Make sure that software prefetch instructions can't change the state of the TLB 2010-08-23 11:18:41 -05:00
Gene Wu
1fd104fc35 ARM: Don't write tracedata on writes, it might have been freed already. 2010-08-23 11:18:41 -05:00
Gene Wu
9db2ab8a62 ARM: Implement CLREX init/complete acc methods 2010-08-23 11:18:41 -05:00
Gene Wu
f29e09746a ARM: Fix Uncachable TLB requests and decoding of xn bit 2010-08-23 11:18:41 -05:00
Gene Wu
4b9de42439 Devices: Allow a device to specify that a request is uncachable. 2010-08-23 11:18:41 -05:00
Gene Wu
aa601750f8 ARM: For non-cachable accesses set the UNCACHABLE flag 2010-08-23 11:18:41 -05:00
Gene Wu
7405f4b774 ARM: Implement DSB, DMB, ISB 2010-08-23 11:18:41 -05:00
Gene Wu
aabf478920 ARM: Get SCTLR TE bit from reset SCTLR 2010-08-23 11:18:41 -05:00
Gene Wu
1f032ad345 ARM: Implement CLREX 2010-08-23 11:18:41 -05:00
Gene Wu
66bcbec96e ARM: BX instruction can be conditional if last instruction in an IT block
Branches are allowed to be the last instruction in an IT block. Before it was
assumed that they could not. So Branches in thumb2 were Uncond.
2010-08-23 11:18:41 -05:00
Min Kyu Jeong
ad2c3b008d CPU: Print out flatten-out register index as with IntRegs/FloatRegs traceflag 2010-08-23 11:18:41 -05:00
Min Kyu Jeong
03286e9d4e CPU: Make Exec trace to print predication result (if false) for memory instructions 2010-08-23 11:18:41 -05:00
Min Kyu Jeong
92ae620be8 ARM: mark msr/mrs instructions as SerializeBefore/After
Since miscellaneous registers bypass wakeup logic, force serialization
to resolve data dependencies through them
* * *
ARM: adding non-speculative/serialize flags for instructions that change CPSR
2010-08-23 11:18:41 -05:00
Min Kyu Jeong
43c938d23e O3: Handle loads when the destination is the PC.
For loads where the PC is the destination, check if the load
was mispredicted again when the value being loaded returns from memory
2010-08-23 11:18:40 -05:00
Min Kyu Jeong
5f91ec3f46 ARM/O3: store the result of the predicate evaluation in DynInst or Threadstate.
This allows the CPU to handle predicated-false instructions accordingly.
This particular patch makes loads that are predicated-false to be sent
straight to the commit stage directly, not waiting for return of the data
that was never requested since it was predicated-false.
2010-08-23 11:18:40 -05:00
Min Kyu Jeong
7acf67971c ARM: adding genMachineCheckFault() stub for ARM that doesn't panic 2010-08-23 11:18:40 -05:00
Gene Wu
5486fa6612 ARM: DFSR status value for sync external data abort is expected to be 0x8 in ARMv7 2010-08-23 11:18:40 -05:00
Gene Wu
a993188034 ARM: Temporary local variables can't conflict with isa parser operands.
PC is an operand, so we can't have a temp called PC
2010-08-23 11:18:40 -05:00
Ali Saidi
0c434b7f56 ARM: Exclusive accesses must be double word aligned 2010-08-23 11:18:40 -05:00
Ali Saidi
5148c693d8 ARM: Add some registers for big loads/stores to support neon. 2010-08-23 11:18:40 -05:00
Ali Saidi
fc1730044e ARM: Decode neon memory instructions. 2010-08-23 11:18:40 -05:00
Gabe Black
d1362d582a ARM: Clean up the ISA desc portion of the ARM memory instructions. 2010-08-23 11:18:40 -05:00
Ali Saidi
ef3a3dc28a Loader: Don't insert symbols into the symbol table that begin with '$'. 2010-08-23 11:18:40 -05:00
Ali Saidi
230acc291c ARM: We don't currently support ThumbEE exceptions, so don't report that we do 2010-08-23 11:18:40 -05:00
Ali Saidi
c0ca01ec36 ARM: Change how the AMBA device ID checking is done to make it more generic 2010-08-23 11:18:40 -05:00
Ali Saidi
bb5377899a ARM: Add system for ARM/Linux and bootstrapping 2010-08-23 11:18:40 -05:00
Ali Saidi
8ed4f0a02c ARM: Add I/O devices for booting linux
--HG--
rename : src/dev/arm/Versatile.py => src/dev/arm/RealView.py
rename : src/dev/arm/versatile.cc => src/dev/arm/realview.cc
rename : src/dev/arm/versatile.hh => src/dev/arm/realview.hh
2010-08-23 11:18:40 -05:00
Ali Saidi
38cf6a164d ARM: Implement some more misc registers 2010-08-23 11:18:40 -05:00
Ali Saidi
b7b2eae6fa ARM: Fix an un-initialized variable bug 2010-08-23 11:18:39 -05:00
Ali Saidi
4ab68fc999 Loader: Use address mask provided to load*Symbols when loading the symbols from the symbol table. 2010-08-23 11:18:39 -05:00
Ali Saidi
f2642e2055 Loader: Make the load address mask be a parameter of the system rather than a constant.
This allows two different OS requirements for the same ISA to be handled.
Some OSes are compiled for a virtual address and need to be loaded into physical
memory that starts at address 0, while other bare metal tools generate
images that start at address 0.
2010-08-23 11:18:39 -05:00
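The load address mask described in f2642e2055 can be illustrated with a minimal sketch. The helper `apply_load_mask` and both mask values are hypothetical and chosen only for illustration; they are not taken from the simulator's source.

```python
def apply_load_mask(link_addr: int, load_addr_mask: int) -> int:
    """Map a link-time address to the physical load address by masking."""
    return link_addr & load_addr_mask


# Hypothetical per-system masks: an OS kernel linked at a high virtual
# address versus a bare-metal image that is already linked at address 0.
KERNEL_MASK = 0x0FFFFFFF      # strips the high virtual-address bits
BARE_METAL_MASK = 0xFFFFFFFF  # leaves the address unchanged

kernel_paddr = apply_load_mask(0xC0008000, KERNEL_MASK)
bare_paddr = apply_load_mask(0x00008000, BARE_METAL_MASK)
```

Making the mask a per-system parameter lets both images end up at the same physical address without a single ISA-wide constant.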
Min Kyu Jeong
d4e83a4001 ARM: Finish the timing translation when taking a fault. 2010-08-23 11:18:39 -05:00
Dam Sunwoo
cb76111a7e ARM: Use a stl queue for the table walker state 2010-08-23 11:18:39 -05:00
Ali Saidi
1d1837ee98 CPU: Set a default value when readBytes faults.
This was being done in read(), but if readBytes was called directly it
wouldn't happen. Also, instead of setting the memory blob being read to -1
which would (I believe) require using memset with -1 as a parameter, this now
uses bzero. It's hoped that its more specialized behavior will make it
slightly faster.
2010-08-23 11:18:39 -05:00
Ali Saidi
ac575a9d82 Compiler: Fixes for GCC 4.5. 2010-08-23 11:18:39 -05:00
Ali Saidi
7d191366e1 BASE: Fix genrand to generate both 0s and 1s when max equals one.
previously was only generating 0s.
2010-08-23 11:18:39 -05:00
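The commit message for 7d191366e1 does not say what caused the bug. One plausible cause, assumed here, is deriving the number of random bits from a logarithm that evaluates to zero when max is one. The sketch below shows a bounded generator that avoids that case; `genrand` is a stand-in written for this example, not the simulator's implementation.

```python
import random


def genrand(max_value: int, rng: random.Random) -> int:
    """Return a uniform integer in [0, max_value] by rejection sampling."""
    if max_value == 0:
        return 0
    # bit_length() is 1 when max_value == 1, so at least one random bit
    # is always drawn and both 0 and 1 can be produced.
    bits = max_value.bit_length()
    while True:
        candidate = rng.getrandbits(bits)
        if candidate <= max_value:
            return candidate
```

Drawing `rng.getrandbits(0)` would always yield 0, which matches the symptom the commit describes.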
Ali Saidi
7793773809 stats: Fix off-by-one error in distributions.
When max-min isn't evenly divisible by the bkt size, the bucket count would round down,
so it's possible to sample a distribution and have no place to put the sample.
When this case occurred the simulator would assert.
2010-08-23 11:18:39 -05:00
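The off-by-one in 7793773809 can be reproduced with a small worked example. The sketch assumes an inclusive sample range [lo, hi] and uses ceiling division as the fix; the function names are hypothetical and the actual patch may compute the count differently.

```python
def bucket_count(lo: int, hi: int, bkt: int) -> int:
    """Number of buckets needed to cover the inclusive range [lo, hi]."""
    span = hi - lo + 1
    return (span + bkt - 1) // bkt  # ceiling division


def bucket_index(value: int, lo: int, bkt: int) -> int:
    """Index of the bucket that a sample falls into."""
    return (value - lo) // bkt


lo, hi, bkt = 0, 10, 4
n_floor = (hi - lo + 1) // bkt      # rounds down: too few buckets
n_ceil = bucket_count(lo, hi, bkt)  # rounds up: covers the whole range
idx = bucket_index(hi, lo, bkt)     # bucket needed by the largest sample
```

With these values the rounded-down count leaves the largest sample without a bucket, while the rounded-up count has room for it.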
Gabe Black
fa01fbddeb X86: Get rid of unused file arguments.hh. 2010-08-22 18:42:23 -07:00
Gabe Black
4ad30a662d SPARC: Fix some style issues in utility.hh. 2010-08-22 18:39:39 -07:00
Gabe Black
5836023ab2 X86: Get rid of the unused getAllocator on the python base microop class.
This function is always overridden, and doesn't actually have the right
signature.
2010-08-22 18:24:09 -07:00
Brad Beckmann
8557480300 ruby: Added merge GETS optimization to hammer
Added an optimization that merges multiple pending GETS requests into a
single request to the owner node.
2010-08-20 11:46:14 -07:00
Brad Beckmann
908364a1c9 ruby: Fixed minor bug in ruby test for setting the request type 2010-08-20 11:46:14 -07:00
Brad Beckmann
e7f2da517a ruby: Stall and wait input messages instead of recycling
This patch allows messages to be stalled in their input buffers and wait
until a corresponding address changes state.  In order to make this work,
all in_ports must be ranked in order of dependence and those in_ports that
may unblock an address must wake up the stalled messages.  A lot of this
complexity is handled in slicc and the specification files simply
annotate the in_ports.

--HG--
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/StallAndWaitStatementAST.py
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/WakeUpDependentsStatementAST.py
2010-08-20 11:46:14 -07:00
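The stall-and-wait behavior described in e7f2da517a can be sketched roughly as follows. The `InPort` class and its method names are hypothetical stand-ins written for this example; the real logic is generated by slicc and is more involved.

```python
from collections import defaultdict, deque


class InPort:
    """Toy input buffer: messages are (address, request_type) tuples."""

    def __init__(self):
        self.ready = deque()
        self.stalled = defaultdict(deque)

    def stall_and_wait(self, addr):
        """Park the head message until the state of addr changes."""
        self.stalled[addr].append(self.ready.popleft())

    def wake_up_dependents(self, addr):
        """Put messages stalled on addr back at the front, in order."""
        waiting = self.stalled.pop(addr, deque())
        self.ready.extendleft(reversed(waiting))


port = InPort()
port.ready.extend([(0x40, "GETS"), (0x80, "GETX")])
port.stall_and_wait(0x40)                # 0x40 is in a transient state
after_stall = list(port.ready)
port.wake_up_dependents(0x40)            # 0x40 reached a stable state
after_wake = list(port.ready)
```

Unlike recycling, a stalled message is not re-examined until something that can unblock its address wakes it up.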
Brad Beckmann
af6b97e3ee ruby: Recycle latency fix for hammer
Patch allows each individual message buffer to have different recycle latencies
and allows the overall recycle latency to be specified at the cmd line. The
patch also adds profiling info to make sure no one processor's requests are
recycled too much.
2010-08-20 11:46:14 -07:00
Brad Beckmann
f57053473a MOESI_hammer: break down miss latency stalled cycles
This patch tracks the number of cycles a transaction is delayed at different
points of the request-forward-response loop.
2010-08-20 11:46:14 -07:00
Brad Beckmann
8b28848321 ruby: added probe filter support to hammer 2010-08-20 11:46:14 -07:00
Brad Beckmann
593ae7457e ruby: fixed DirectoryMemory's numa_high_bit configuration
This fix includes the off-by-one bit selection bug for numa mapping.
2010-08-20 11:46:13 -07:00
Brad Beckmann
ac5bb214e3 ruby: Reset ruby stats in RubySystem unserialize
The main purpose for clearing stats in the unserialize process is so
that the profiler can correctly set its start time to the unserialized
value of curTick.
2010-08-20 11:46:13 -07:00
Brad Beckmann
72044e3f5a ruby: Disable migratory sharing for token and hammer
This patch allows one to disable migratory sharing for those cache blocks that
are accessed by atomic requests.  While the implementations are different
between the token and hammer protocols, the motivation is the same.  For
Alpha, LLSC semantics expect that normal loads do not unlock cache blocks that
have been locked by LL accesses.  Therefore, locked blocks should not transfer
write permissions when responding to these load requests.  Instead, they
only transfer read permissions so that the subsequent SC access can possibly
succeed.
2010-08-20 11:46:13 -07:00
Brad Beckmann
bcdd19df03 ruby: Added SC fail indication to trace profiling 2010-08-20 11:46:13 -07:00
Brad Beckmann
283be34a99 devices: Fixed periodic interrupts to work with draining
Added drain functions to the RTC and 8254 timer so that periodic interrupts
stop when the system is draining.  This patch is needed to checkpoint in
timing mode.  Otherwise under certain situations, the event queue will never
be completely empty.
2010-08-20 11:46:13 -07:00
Brad Beckmann
b6d08e0455 ruby: Fixed RubyPort sendTiming callbacks
Fixed RubyPort schedSendTiming calls to match ruby frequency.
2010-08-20 11:46:13 -07:00
Brad Beckmann
45f6f31d7a ruby: fixed token bugs associated with owner token counts
This patch fixes several bugs related to previous inconsistent assumptions on
how many tokens the Owner had.  Mike Marty should have fixed these bugs years
ago.  :)
2010-08-20 11:46:13 -07:00
Brad Beckmann
fb2e0f56ef ruby: MOESI_CMP_token dma fixes
This patch fixes various protocol bugs regarding races between dma requests
and persistent requests.
2010-08-20 11:46:13 -07:00
Brad Beckmann
6a4f99899b ruby: Resurrected Ruby's deterministic tests
Added the request series and invalidate deterministic tests as new cpu models
and removed the no longer needed ruby tests

--HG--
rename : configs/example/rubytest.py => configs/example/determ_test.py
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/DirectedGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/DirectedGenerator.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/InvalidateGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/InvalidateGenerator.hh
rename : src/cpu/rubytest/RubyTester.cc => src/cpu/directedtest/RubyDirectedTester.cc
rename : src/cpu/rubytest/RubyTester.hh => src/cpu/directedtest/RubyDirectedTester.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/SeriesRequestGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/SeriesRequestGenerator.hh
2010-08-20 11:46:13 -07:00
Brad Beckmann
984adf198a ruby: Updated MOESI_hammer L2 latency behavior
Previously, the MOESI_hammer protocol calculated the same latency for L1 and
L2 hits.  This was because the protocol was written using the old ruby
assumption that L1 hits used the sequencer fast path.  Since ruby no longer
uses the fast-path, the protocol delays L2 hits by placing them on the
trigger queue.
2010-08-20 11:46:13 -07:00
Brad Beckmann
29c45ccd23 ruby: Reduced ruby latencies
The previous slower ruby latencies created a mismatch between the faster M5
cpu models and the much slower ruby memory system.  Specifically smp
interrupts were much slower and infrequent, as well as cpus moving in and out
of spin locks.  The result was many cpus were idle for large periods of time.

These changes fix the latency mismatch.
2010-08-20 11:46:12 -07:00
Brad Beckmann
8e5c441a54 ruby: fix ruby llsc support to sync sc outcomes
Added support so that ruby can determine the outcome of store conditional
operations and reflect that outcome to M5 physical memory and cpus.
2010-08-20 11:46:12 -07:00
Brad Beckmann
54d76f0ce5 ruby: Fixed L2 cache miss profiling
Fixed L2 cache miss profiling for the MOESI_CMP_token protocol
2010-08-20 11:46:12 -07:00
Brad Beckmann
a3b4b9b3e3 ruby: Added bcast msg profiling to hammer and token 2010-08-20 11:46:12 -07:00
Brad Beckmann
1f82eb1a03 ruby: Added consolidated network msg stats 2010-08-20 11:46:12 -07:00
Brad Beckmann
4b4e725921 ruby: Reincarnated the responding machine profiling
This patch adds back to ruby the capability to understand the response time
for messages that hit in different levels of the cache hierarchy.
Specifically add support for the MI_example, MOESI_hammer, and MOESI_CMP_token
protocols.
2010-08-20 11:46:12 -07:00
Brad Beckmann
9fb4381ddc MOESI_CMP_token: Fixed dma persistent lockdown bugs 2010-08-20 11:46:12 -07:00
Brad Beckmann
808701a10c memtest: Memtester support for DMA
This patch adds DMA testing to the Memtester and inherits many changes from
Polina's old tester_dma_extension patch.  Since Ruby does not work in atomic
mode, the atomic mode options are removed.
2010-08-20 11:46:12 -07:00
Brad Beckmann
64b2205992 ruby: Added ruby_request_type ostream def to libruby.hh 2010-08-20 11:46:12 -07:00
Brad Beckmann
d694cc1384 slicc: Consolidated the protocol stats printing
Created a separate ProfileDumper that consolidates the generated stats for
each controller of a certain type.
2010-08-20 11:46:12 -07:00
Brad Beckmann
09854be558 config: Added the topology description to m5 config.ini 2010-08-20 11:46:11 -07:00
Brad Beckmann
eb1e5636e3 ruby: Fixed printout when Sequencer detects a deadlock 2010-08-20 11:41:35 -07:00
Brad Beckmann
d7d73680c4 MESI_CMP_directory: bug fix for old PUTX requests 2010-08-20 11:41:35 -07:00
Steve Reinhardt
e0754c0f6c misc: add some AMD copyright notices
Meant to add these with the previous batch of csets.
2010-08-17 05:49:05 -07:00
Steve Reinhardt
164a211f10 x86: minor checkpointing bug fixes 2010-08-17 05:20:39 -07:00
Steve Reinhardt
f064aa3060 sim: revamp unserialization procedure
Replace direct call to unserialize() on each SimObject with a pair of
calls for better control over initialization in both ckpt and non-ckpt
cases.

If restoring from a checkpoint, loadState(ckpt) is called on each
SimObject.  The default implementation simply calls unserialize() if
there is a corresponding checkpoint section, so we get backward
compatibility for existing objects.  However, objects can override
loadState() to get other behaviors, e.g., doing other programmed
initializations after unserialize(), or complaining if no checkpoint
section is found.  (Note that the default warning for a missing
checkpoint section is now gone.)

If not restoring from a checkpoint, we call the new initState() method
on each SimObject instead.  This provides a hook for state
initializations that are only required when *not* restoring from a
checkpoint.

Given this new framework, do some cleanup of LiveProcess subclasses
and X86System, which were (in some cases) emulating initState()
behavior in startup via a local flag or (in other cases) erroneously
doing initializations in startup() that clobbered state loaded earlier
by unserialize().
2010-08-17 05:17:06 -07:00
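The two-path startup described in f064aa3060 can be sketched as below. This is a simplified model, not the simulator's code: the classes are hypothetical, a checkpoint is modeled as a plain dict of sections, and `load_state`, `init_state` stand in for the loadState() and initState() hooks named in the commit message.

```python
class SimObject:
    def __init__(self, name):
        self.name = name
        self.state = {}

    def unserialize(self, section):
        self.state = dict(section)

    def load_state(self, ckpt):
        # Default: unserialize only if a matching section exists,
        # which keeps existing objects working unchanged.
        if self.name in ckpt:
            self.unserialize(ckpt[self.name])

    def init_state(self):
        # Hook for setup that is needed only when NOT restoring.
        pass


class Process(SimObject):
    def init_state(self):
        self.state = {"pc": 0}


def instantiate(objects, ckpt=None):
    """Call exactly one of the two hooks on every object."""
    for obj in objects:
        if ckpt is not None:
            obj.load_state(ckpt)
        else:
            obj.init_state()
```

Because only one hook runs per object, fresh-start initialization can no longer clobber state that was loaded from a checkpoint.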
Steve Reinhardt
2519d116c9 sim: fold checkpoint restore code into instantiate()
The separate restoreCheckpoint() call is gone; just pass
the checkpoint dir as an optional arg to instantiate().
This change is a precursor to some more extensive
reworking of the startup code.
2010-08-17 05:17:06 -07:00
Steve Reinhardt
c2e1458746 sim: clean up child handling
The old code for handling SimObject children was kind of messy,
with children stored both in _values and _children, and
inconsistent and potentially buggy handling of SimObject
vectors.  Now children are always stored in _children, and
SimObject vectors are consistently handled using the
SimObjectVector class.

Also, by deferring the parenting of SimObject-valued parameters
until the end (instead of doing it at assignment), we eliminate
the hole where one could assign a vector of SimObjects to a
parameter then append to that vector, with the appended objects
never getting parented properly.

This patch induces small stats changes in tests with data races
due to changes in the object creation & initialization order.
The new code does object vectors in order and so should be more
stable.
2010-08-17 05:11:00 -07:00
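The deferred-parenting idea in c2e1458746 can be shown with a toy model. The class below is hypothetical and far simpler than the real configuration code; it only demonstrates why adopting children at the end catches objects appended after the assignment.

```python
class SimObject:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.params = {}

    def adopt_params(self):
        """Parent every SimObject held in a parameter (run once, at the end)."""
        for value in self.params.values():
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, SimObject) and child.parent is None:
                    child.parent = self
                    child.adopt_params()


system = SimObject("system")
cpus = [SimObject("cpu0")]
system.params["cpus"] = cpus     # assignment does not parent anything yet
cpus.append(SimObject("cpu1"))   # appended after the assignment
system.adopt_params()            # both cpus are adopted here
```

Had parenting happened at assignment time, `cpu1` would have been left without a parent.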
Steve Reinhardt
5ea906ba16 sim: move iterating over SimObjects into Python. 2010-08-17 05:08:50 -07:00
Steve Reinhardt
c2cce96a0b sim: fail on implicit creation of orphans via ports
Orphan SimObjects (not in the config hierarchy) could get
created implicitly if they have a port connection to a SimObject
that is in the hierarchy.  This means that there are objects on
the C++ SimObject list (created via the C++ SimObject
constructor call) that are unknown to Python and will get
skipped if we walk the hierarchy from the Python side (as we are
about to do).  This patch detects this situation and prints an
error message.

Also fix the rubytester config script which happened to rely on
this behavior.
2010-08-17 05:06:22 -07:00
Steve Reinhardt
1fbe466345 sim: make Python Root object a singleton
Enforce that the Python Root SimObject is instantiated only
once.  The C++ Root object already panics if more than one is
created.  This change avoids the need to track what the root
object is, since it's available from Root.getInstance() (if it
exists).  It's now redundant to have the user pass the root
object to functions like instantiate(), checkpoint(), and
restoreCheckpoint(), so that arg is gone.  Users who use
configs/common/Simulate.py should not notice.
2010-08-17 05:06:22 -07:00
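A minimal sketch of the singleton enforcement described in 1fbe466345 follows. The class is a simplified stand-in, and `get_instance` mirrors the Root.getInstance() accessor mentioned in the commit message; the error type and message are assumptions.

```python
class Root:
    _the_instance = None

    def __new__(cls):
        # Refuse to create a second Root, as the C++ side already does.
        if Root._the_instance is not None:
            raise RuntimeError("Attempt to allocate multiple instances of Root.")
        Root._the_instance = super().__new__(cls)
        return Root._the_instance

    @classmethod
    def get_instance(cls):
        """Return the single Root, or None if it has not been created."""
        return cls._the_instance
```

With the instance reachable through the class, functions such as instantiate() no longer need the root passed in as an argument.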
Steve Reinhardt
0685ae7a2d bus: clean up default responder code.
Clean up some minor things left over from the default responder
change in rev 9af6fb59752f.  Mostly renaming the 'responder_set'
param to 'use_default_range' to actually reflect what it does...
old name wasn't that descriptive in the first place, but now
it really doesn't make sense at all.

Also got rid of the bogus obsolete assignment to 'bus.responder'
which used to be a parameter but now is interpreted as an
implicit child assignment, and which was giving me problems in
the config restructuring to come.  (A good argument for not
allowing implicit child assignments, IMO, but that's water under
the bridge, I'm afraid.)

Also moved the Bus constructor to the .cc file since that's
where it should have been all along.
2010-08-17 05:06:21 -07:00
Gabe Black
c4ba6967a5 Inorder: Fix compilation of m5.fast.
printMemData is only used in DPRINTFs. If those are removed by compiling
m5.fast, that function is unused, gcc generates a warning, that gets turned
into an error, and the build fails. This change surrounds the function
definition with #if TRACING_ON so it only gets compiled in if the DPRINTFs do
too.
2010-08-14 01:00:45 -07:00
Gabe Black
961aafc044 Merge with head. 2010-08-13 06:16:30 -07:00
Gabe Black
aa8c6e9c95 CPU: Add readBytes and writeBytes functions to the exec contexts. 2010-08-13 06:16:02 -07:00
Gabe Black
65dbcc6ea1 InOrder: Clean up some DPRINTFs that print data sent to/from the cache. 2010-08-13 06:16:00 -07:00
Gabe Black
52a90a5998 CPU: Tidy up endianness handling for mmapped "IPR"s. 2010-08-13 06:10:45 -07:00
Joel Hestness
53c241fc16 TimingSimpleCPU: fix NO_ACCESS memory op handling
When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
to handle the case where the current status of the CPU is Running
and not DcacheWaitResponse or DTBWaitResponse
2010-08-12 17:16:02 -07:00
Timothy M. Jones
97d245278d Power: The condition register should be set or cleared upon a system call
return to indicate success or failure.
2010-07-22 18:54:37 +01:00
Timothy M. Jones
607f519800 LSQ Unit: After deleting part of a split request, set it to NULL so that it
isn't accidentally deleted again later (causing a segmentation fault).
2010-07-22 18:54:37 +01:00
Timothy M. Jones
28a5ea3f99 Port: Only indicate that a SimpleTimingPort is drained if its send event is
not scheduled, as well as the transmit list being empty.
2010-07-22 18:54:37 +01:00
Timothy M. Jones
e50a880297 O3CPU: Fix a bug where stores in the cpu were never marked as split. 2010-07-22 18:52:02 +01:00
Timothy M. Jones
0d301ca4c4 Syscall: Don't close the simulator's standard file descriptors. 2010-07-22 18:47:52 +01:00
Timothy M. Jones
9a3533ec84 O3CPU: O3's tick event gets squashed when it is switched out. When repeatedly
switching between O3 and another CPU, O3's tick event might still be scheduled
in the event queue (as squashed).  Therefore, check for a squashed tick event
as well as a non-scheduled event when taking over from another CPU and deal
with it accordingly.
2010-07-22 18:47:43 +01:00
Timothy M. Jones
8c76715979 Power: Provide a utility function to copy registers from one thread context
to another in the Power ISA.
2010-07-22 18:47:03 +01:00
Nathan Binkert
21bf6ff101 stats: unify the two stats distribution type better 2010-07-21 18:54:53 -07:00
Nathan Binkert
2a1309f213 stats: cleanup a few small problems in stats 2010-07-21 15:53:53 -07:00