Without this change 0 is always used for the youngest sequence number if
a squash occured and the ROB was empty (E.g. an instruction is marked
serializeAfter or a fetch stall prevents other instructions from issuing).
Using 0 there is a race to rename where an instruction that committed the
same cycle as the squashing instruction can have it's renamed state undone
by the squash using sequence number 0.
I'm not positive this is the correct fix, but it's working right now.
Either we need to do something like this, prevent the misc reg from being renamed at all,
or there something else going on. We need to find the root cause as to why
this is only a problem sometimes.
The squash inside the fetch unit should not attempt to remove them from the
branch predictor as non-control instructions are not pushed into the predictor.
When this condition occurs the cpu should restart the fetch stage to fetch from
the original execution path. Fault handling in the commit stage is cleaned up a
little bit so the control flow is simplier. Finally, if an instruction is being
used to carry a fault it isn't executed, so the fault propagates appropriately.
There were several copies of similar functions that looked
like they all replicated reschedule(), so I replaced them
with direct calls. Keeping this separate from the previous
cset since there may be some subtle functional differences
if the code ever reschedules an event that is scheduled but
not squashed (though none were detected in the regressions).
Events need to be scheduled on the queue assigned
to the SimObject, not on the global queue (which
should be going away).
Also cleaned up a number of redundant expressions
that made the code unnecessarily verbose.
These files really aren't general enough to belong in src/base.
This patch doesn't reorder include lines, leaving them unsorted
in many cases, but Nate's magic script will fix that up shortly.
--HG--
rename : src/base/sched_list.hh => src/cpu/sched_list.hh
rename : src/base/timebuf.hh => src/cpu/timebuf.hh
Ran all the source files through 'perl -pi' with this script:
s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;
Also did a little manual editing on some of the arch/*/isa_traits.hh files
and src/SConscript.
The store queue doesn't need to be ISA specific and architectures can
frequently store more than an int registers worth of data. A 128 bits seems
more common, but even 256 bits may be appropriate. Pretty much anything less
than a cache line size is buildable.
For SPARC ASIs are added to the ExtMachInst. If the ASI is changed simply
marking the instruction as Serializing isn't enough beacuse that only
stops rename. This provides a mechanism to squash all the instructions
and refetch them
ARM instructions updating cumulative flags (ARM FP exceptions and saturation
flags) are not serialized.
Added aliases for ARM FP exceptions and saturation flags in FPSCR. Removed
write accesses to the FP condition codes for most ARM VFP instructions: only
VCMP and VCMPE instructions update the FP condition codes. Removed a potential
cause of seg. faults in the O3 model for NEON memory macro-ops (ARM).
This change makes O3 flatten floating point destination registers, and also
fixes misc register flattening so that it's correctly repositioned relative to
the resized regions for integer and floating point indices.
It also fixes some overly long lines.
In the case of a split transaction and a cache that is faster than a CPU we
could get two responses before next_tick expires. Add an event that is
scheduled in this case and return false rather than asserting.
This change modifies the way prefetches work. They are now like normal loads
that don't writeback a register. Previously prefetches were supposed to call
prefetch() on the exection context, so they executed with execute() methods
instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs
are blank, meaning that they get executed, but don't actually do anything.
On Alpha dead cache copy code was removed and prefetches are now normal ops.
They count as executed operations, but still don't do anything and IsMemRef is
not longer set on them.
On ARM IsDataPrefetch or IsInstructionPreftech is now set on all prefetch
instructions. The timing simple CPU doesn't try to do anything special for
prefetches now and they execute with the normal memory code path.
This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.
PC type:
Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.
These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.
Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.
Advancing the PC:
The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.
One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.
Variable length instructions:
To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.
ISA parser:
To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.
Return address stack:
The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.
Change in stats:
There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.
TODO:
Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.
This reduces the scope of those includes and makes it less likely for there to
be a dependency loop. This also moves the hashing functions associated with
ExtMachInst objects to be with the ExtMachInst definitions and out of
utility.hh.
This code is no longer needed because of the preceeding change which adds a
StaticInstPtr parameter to the fault's invoke method, obviating the only use
for this pair of functions.
Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
Don't assert that the response packet is marked as a response
since it won't always be so for functional accesses.
Also cleanup code to refer to functional accesses rather
than "probes" (old terminology), and mention in the
DPRINTF which type of access we're doing.
When decoding a srs instruction, invalid mode encoding returns invalid instruction.
This can happen when garbage instructions are fetched from mispredicted path
Since miscellaneous registers bypass wakeup logic, force serialization
to resolve data dependencies through them
* * *
ARM: adding non-speculative/serialize flags for instructions change CPSR
THis allows the CPU to handle predicated-false instructions accordingly.
This particular patch makes loads that are predicated-false to be sent
straight to the commit stage directly, not waiting for return of the data
that was never requested since it was predicated-false.
This was being done in read(), but if readBytes was called directly it
wouldn't happen. Also, instead of setting the memory blob being read to -1
which would (I believe) require using memset with -1 as a parameter, this now
uses bzero. It's hoped that it's more specialized behavior will make it
slightly faster.
This patch adds DMA testing to the Memtester and is inherits many changes from
Polina's old tester_dma_extension patch. Since Ruby does not work in atomic
mode, the atomic mode options are removed.
printMemData is only used in DPRINTFs. If those are removed by compiling
m5.fast, that function is unused, gcc generates a warning, that gets turned
into an error, and the build fails. This change surrounds the function
definition with #if TRACING_ON so it only gets compiled in if the DPRINTFs do
to.
When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
to handle the case where the current status of the CPU is Running
and not DcacheWaitResponse or DTBWaitResponse
switching between O3 and another CPU, O3's tick event might still be scheduled
in the event queue (as squashed). Therefore, check for a squashed tick event
as well as a non-scheduled event when taking over from another CPU and deal
with it accordingly.
m5 doesnt do stats specific to binary and this resource request stat is probably only
useful for people who really know the ins/outs of the model anyway
replace priority queue with vector of lists(1 list per stage) and place inside a class
so that we have more control of when an instruction uses a particular schedule entry
...
also, this is the 1st step toward making the InOrderCPU fully parameterizable. See the
wiki for details on this process
- use InOrderBPred instead of Resource for DPRINTFs
- account for DELAY SLOT in updating RAS and in squashing
- don't let squashed instructions update the predictor
- the BTB needs to use the ASID not the TID to work for multithreaded programs
- add stats for BTB hits
Expand the help text on the --remote-gdb-port option so
people know you can use it to disable remote gdb without
reading the source code, and thus don't waste any time
trying to add a separate option to do that.
Clean up some gdb-related cruft I found while looking
for where one would add a gdb disable option, before
I found the comment that told me that I didn't need
to do that.
Suppose the saturating counters of a branch predictor contain n bits. When the
counter is between 0 and (2^(n-1) - 1), boundaries included, the branch is
predicted as not taken. When the counter is between 2^(n-1) and (2^n - 1),
boundaries included, the branch is predicted as taken.
when insts execute, they mark the time they finish to be used for subsequent isnts
they may need forwarding of data. However, the regdepmap was using the wrong
value to index into the destination operands of the instruction to be forwarded.
Thus, in some cases, we are checking to see if the 3rd destination register
for an instruction is executed at a certain time, when there is only 1 dest. register
valid. Thus, we get a bad, uninitialized time value that will stall forwarding
causing performance loss but still the correct execution.
In addition to obvious changes, this required a slight change to the slicc
grammar to allow types with :: in them. Otherwise slicc barfs on std::string
which we need for the headers that slicc generates.
make sure to only read 1 src reg. for write-hint and any other similar
'store' instruction. Reading the source reg when its not necessary
can cause the simulator to read from uninitialized values
These recordEvent() calls could cause crashes since they
access the req pointer after it's potentially been
deleted during a failed translation call. (Similar
problem to the traceData bug fixed in the previous cset.)
Moving them above the translation call (as was done
recentlyi in cset 8b2b8e5e7d35) avoids the crash
but doesn't work, since at that point we don't know if
the access is uncached or not.
It's not clear why these calls are there, and no one
seems to use them, so we'll just delete them. If they
are needed, they should be moved to somewhere that's
guaranteed to be after the translation completes but
before the request is possibly deleted, e.g., in
finishTranslation().
Accessing traceData (to call setAddress() and/or setData())
after initiating a timing translation was causing crashes,
since a failed translation could delete the traceData
object before returning.
It turns out that there was never a need to access traceData
after initiating the translation, as the traced data was
always available earlier; this ordering was merely
historical. Furthermore, traceData->setAddress() and
traceData->setData() were being called both from the CPU
model and the ISA definition, often redundantly.
This patch standardizes all setAddress and setData calls
for memory instructions to be in the CPU models and not
in the ISA definition. It also moves those calls above
the translation calls to eliminate the crashes.
When implementing timing address translations instead of atomic, I
forgot to preserve the faults that are returned from the read and
write calls. This patch reinstates them.
When each load or store is sent to the LSQ, we check whether it will cross a
cache line boundary and, if so, split it in two. This creates two TLB
translations and two memory requests. Care has to be taken if the first
packet of a split load is sent but the second blocks the cache. Similarly,
for a store, if the first packet cannot be sent, we must store the second
one somewhere to retry later.
This modifies the LSQSenderState class to record both packets in a split
load or store.
Finally, a new const variable, HasUnalignedMemAcc, is added to each ISA
to indicate whether unaligned memory accesses are allowed. This is used
throughout the changed code so that compiler can optimise away code dealing
with split requests for ISAs that don't need them.
This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.
The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.
- on certain retry requests you can get an assertion failure
- fix by allowing the request to literally "Retry" itself
if it wasnt successful before, and then block any requests
through cache port while waiting for the cache to be
made available for access
when threads are switching in/out the CPU, we need to keep
track of special cases like branches. Add appropriate
variables in ThreadState t track this and then use
these variables when updating pc after context switch
this will be used for when a thread comes back from a cache miss, it needs to update the PCs
because the inst might of been a branch or delayslot in which the next PC isnt always
a straight addition
allow a thread to wakeup and be activated after
it has been in suspended state and another
thread is switched out. Need to give
pipeline stages a "activateThread" function
so that can get to their suspended instruction
when the time is right.
give resources their own specific
activity to do for a "suspend" event
instead of defaulting to deactivating the thread for a
suspend thread event. This really matters
for the fetch sequence unit which wants to remove the
thread from fetching while other units want to
ignore a thread suspension. If you deactivate a thread
in a resource then you may lose some of the allotted
bandwidth that the thread is taking up...
update/add in the use of isThreadReady & isThreadSuspended
functions.Check in activateThread what list a thread is
on so it can be managed accordingly.
-Support ability to activate next ready thread after a cache miss
through the activateNextReadyContext/Thread() functions
-To support this a "readyList" of thread ids is added
-After a cache miss, thread will suspend and then call
activitynextreadythread
allow for events to schedule themselves later if desired. this is important
because of cases like where you need to activate a thread only after the previous
thread has been deactivated. The ordering there has to be enforced
add code to recognize memory stalls in resources and the pipeline as well
as squash a thread if there is a stall and we are in the switch on cache miss
model
add buffer for instructions to switch out to in a pipeline stage
can't squash the instruction and remove the pipeline so we kind of need
to 'suspend' an instruction at the stage while the memory stall resolves
for the switch on cache miss model
- loads were happening on same cycle as the address was generated which is slightly
unrealistic. Instead, force address generation to be on separate cycle from load
initiation
- also, mark the stages in a more traditional way (F-D-X-M-W)
This adds support for the 32-bit, big endian Power ISA. This supports both
integer and floating point instructions based on the Power ISA Book I v2.06.
When enabled, faulting instructions appear in the trace twice
(once when they fault and again when they're re-executed).
This flag is set by the Exec compound flag for backwards compatibility.
Get rid of misc.py and just stick misc things in __init__.py
Move utility functions out of SCons files and into m5.util
Move utility type stuff from m5/__init__.py to m5/util/__init__.py
Remove buildEnv from m5 and allow access only from m5.defines
Rename AddToPath to addToPath while we're moving it to m5.util
Rename read_command to readCommand while we're moving it
Rename compare_versions to compareVersions while we're moving it.
--HG--
rename : src/python/m5/convert.py => src/python/m5/util/convert.py
rename : src/python/m5/smartdict.py => src/python/m5/util/smartdict.py
TLBUnit no longer used and we also get rid of memAccSize and memAccFlags functions added to ISA and StaticInst
since TLB is not a separate resource to acquire. Instead, TLB access is done before any read/write to memory
and the result is checked before it's sent out to memory.
* * *
inorder was incorrectly storing FP values and confusing the integer/fp storage view of floating point operations. A big issue was knowing trying to infer when were doing single or double precision access
because this lets you know the size of value to store (32-64 bits). This isnt exactly straightforward since alpha uses all 64-bit regs while mips/sparc uses a dual-reg view. by getting this value from
the actual floating point register file, the model can figure out what it needs to store
Remove subinstructions eaComp/memAcc since unused in CPU Models. Instead, create eaComp that is visible from StaticInst object. Gives InOrder model capability of generating address without actually initiating access
* * *
Changes so that InOrder can work for a non-delay-slot ISA like Alpha. Typically, changes have to do with handling misspeculated branches at different points in pipeline
Edit AlphaISA to support the inorder model. Mostly alternate constructor functions and also a few skeleton multithreaded support functions
* * *
Remove namespace from header file. Causes compiler issues that are hard to find
* * *
Separate the TLB from the CPU and allow it to live in the TLBUnit resource. Give CPU accessor functions for access and also bind at construction time
* * *
Expose memory access size and flags through instruction object
(temporarily memAccSize and memFlags to get TLB stuff working.)
this was double scheduling itself (once in constructor and once in cpu code). also add support for stopping / starting
progress events through repeatEvent flag and also changing the interval of the progress event as well
For some reason o3 FS init() only called initCPU if the thread state
was Suspended, which was no longer the case. There's no apparent
reason to check, so I whacked the test completely rather than
changing the check to Halted.
The inorder init() was also updated to be symmetric, though the
previous code was just a fancy no-op.
This situation can arise now on the first fetch cycle after
the last active thread is halted. It seems easy enough to
deal with when it happens rather than trying to avoid it.
This provides a common initial status for all threads independent
of CPU model (unlike the prior situation where CPUs initialized
threads to inconsistent states).
This mostly matters for SE mode; in FS mode, ISA-specific startupCPU()
methods generally handle boot-time initialization of thread contexts
(since the right thing to do is ISA-dependent).
Basically merge it in with Halted.
Also had to get rid of a few other functions that
called ThreadContext::deallocate(), including:
- InOrderCPU's setThreadRescheduleCondition.
- ThreadContext::exit(). This function was there to avoid terminating
simulation when one thread out of a multi-thread workload exits, but we
need to find a better (non-cpu-centric) way.
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.
This model currently only works in MIPS_SE mode, so it will take some effort
to clean it up and make it generally useful. Hopefully people are willing to
help make that happen!
Make interrupts use the new wakeup method, and pull all of the interrupt
stuff into the cpu base class so that only the wakeup code needs to be updated.
I tried to make wakeup, wakeCPU, and the various other mechanisms for waking
and sleeping a little more sane, but I couldn't understand why the statistics
were changing the way they were. Maybe we'll try again some day.
the primary identifier for a hardware context should be contextId(). The
concept of threads within a CPU remains, in the form of threadId() because
sometimes you need to know which context within a cpu to manipulate.
SE. Process still keeps track of the tc's it owns, but registration occurs
with the System, this eases the way for system-wide context Ids based on
registration.
across the subclasses. generally make it so that member data is _cpuId and
accessor functions are cpuId(). The ID val comes from the python (default -1 if
none provided), and if it is -1, the index of cpuList will be given. this has
passed util/regress quick and se.py -n4 and fs.py -n4 as well as standard
switch.
The constructor no-longer schedules an event at construction and the implict conversion between int and bool was allowing the old code to compile without warning.
Signed-off By: Ali Saidi
the instruction after the hwrei to be fetched before the ITB/DTB_CM register is updated in a call pal
call sys and thus the translation fails because the user is attempting to access a super page address.
Minimally, it seems as though some sort of fetch stall or refetch after a hwrei is required. I think
this works currently because the hwrei uses the exec context interface, and the o3 stalls when that occurs.
Additionally, these changes don't update the LOCK register and probably break ll/sc. Both o3 changes were
removed since a great deal of manual patching would be required to only remove the hwrei change.
Make them easier to express by only having the cxx_type parameter which
has the full namespace name, and drop the cxx_namespace thing.
Add support for multiple levels of namespace.
Even though we're not incorrect about operator precedence, let's add
some parens in some particularly confusing places to placate GCC 4.3
so that we don't have to turn the warning off. Agreed that this is a
bit of a pain for those users who get the order of operations correct,
but it is likely to prevent bugs in certain cases.
Fix the logic in the LSQ that determines if there are any stores to
write back. In the commit stage, check for thread specific writebacks
instead of just any writeback.
python type of a latency. In addition, the multiple definitions of profile in the different cpu models caused
problems for intialization of the interval value. If a child class's profile value was defined, the parent
BaseCPU::ProfileEvent interval field would be initialized with a garbage value. The fix was to remove the
multiple redifitions of profile in the child CPU classes.
A whole bunch of stuff has been converted to use the new params stuff, but
the CPU wasn't one of them. While we're at it, make some things a bit
more stylish. Most of the work was done by Gabe, I just cleaned stuff up
a bit more at the end.
When invoking several copies of m5 on the same machine at the same
time, there can be a race for TCP ports for the terminal connections
or remote gdb. Expose a function to disable those ports, and have the
regression scripts disable them. There are some SimObjects that have
no other function than to be used with ports (NativeTrace and
EtherTap), so they will panic if the ports are disabled.
The notIdleFraction statistic isn't updated when the statistics reset, probably because the cpu Status information
was pulled into the atomic and timing cpus. This changeset pulls Status back into the BaseSimpleCPU object. Anyone
care to comment on the odd naming of the Status instance? It shouldn't just be status because that is confusing
with Port::Status, but _status seems a bit strage too.
This should help if somebody gets to the bug
fix before me (or someone else)...
--HG--
extra : convert_revision : 0ae64c58ef4f7b02996f31e9e9e6bfad344719e2
from the right point (#32 usually) instead of restarting at 0 and double-freeing.
Commented out assert line in free_list.hh that will check for when double-free condition
goes bad.
--HG--
extra : convert_revision : 08d5f9b6a874736e487d101e85c22aaa67bf59ae
where we defer a response to a read from a far-away cache A, then later
defer a ReadExcl from a cache B on the same bus as us. We'll assert
MemInhibit in both cases, but in the latter case MemInhibit will keep
the invalidation from reaching cache A. This special response tells
cache A that it gets the block to satisfy its read, but must immediately
invalidate it.
--HG--
extra : convert_revision : f85c8b47bb30232da37ac861b50a6539dc81161b
Also some bug fixes in MIPS ISA uncovered by g++ warnings
(Python string compares don't work in C++!).
--HG--
extra : convert_revision : b347cc0108f23890e9b73b3ee96059f0cea96cf6
SimObjects not yet updated:
- Process and subclasses
- BaseCPU and subclasses
The SimObject(const std::string &name) constructor was removed. Subclasses
that still rely on that behavior must call the parent initializer as
: SimObject(makeParams(name))
--HG--
extra : convert_revision : d6faddde76e7c3361ebdbd0a7b372a40941c12ed
It should be cleared prior to the call to recvRetry.
Add extra DPRINTF statement for clearer debugging output.
--HG--
extra : convert_revision : e2332754743f42d60e159ac89f6fb0fd8b7f57f8
Ignore different values or rcx and r11 after a syscall until either the local or remote value changes. Also change the codes organization somewhat.
--HG--
extra : convert_revision : 2c1f69d4e55b443e68bfc7b43e8387b02cf0b6b5
People should never put pointers in DPRINTFs; it messes up
tracediffs. Plus these used the FullCPU trace flag, which
is not right.
--HG--
extra : convert_revision : 82ed56757da0ad947c165ba205b5f752c85c6667
The correct order is unintuitively rax, rcx, rdx, rbx, etc, not rax, rbx, rcx, rdx.
--HG--
extra : convert_revision : 3abe6a723a6e30becfe34f8da707ea2ff5d4df77
Code was assuming that all argument registers followed in order from ArgumentReg0. There is now an ArgumentReg array which is indexed to find the right index. There is a constant, NumArgumentRegs, which can be used to protect against using an invalid ArgumentReg.
--HG--
extra : convert_revision : f448a3ca4d6adc3fc3323562870f70eec05a8a1f
creation and initialization now happens in python. Parameter objects
are generated and initialized by python. The .ini file is now solely for
debugging purposes and is not used in construction of the objects in any
way.
--HG--
extra : convert_revision : 7e722873e417cb3d696f2e34c35ff488b7bff4ed
src/cpu/simple/timing.cc:
Fix another SC problem.
src/mem/cache/cache_impl.hh:
Forgot to call makeTimingResponse() on uncached timing responses.
--HG--
extra : convert_revision : 5a5a58ca2053e4e8de2133205bfd37de15eb4209
into vm1.(none):/home/stever/bk/newmem-cache2
src/base/traceflags.py:
Hand merge.
--HG--
extra : convert_revision : 9e7539eeab4220ed7a7237457a8f336f79216924
src/cpu/memtest/memtest.cc:
Need to set packet source field so that response from cache
doesn't run into assertion failure when copying source to dest.
src/mem/packet.hh:
Copy source field when copying packets.
Assert that source is valid before copying it to dest
when turning packets around.
--HG--
extra : convert_revision : 09e3cfda424aa89fe170e21e955b295746832bf8
into ahchoo.blinky.homelinux.org:/home/gblack/m5/newmem-o3-micro
src/cpu/o3/fetch_impl.hh:
hand merge
--HG--
extra : convert_revision : 3f71f3ac2035eec8b6f7bceb6906edb4dd09c045
supposed to and make sure parameters have the right type.
Also make sure that any object that should be an intermediate
type has the right options set.
--HG--
extra : convert_revision : d56910628d9a067699827adbc0a26ab629d11e93
into doughnut.hpl.hp.com:/home/gblack/newmem-o3-micro
src/cpu/base_dyn_inst_impl.hh:
src/cpu/o3/fetch_impl.hh:
Hand merge
--HG--
extra : convert_revision : 0c0692033ac30133672d8dfe1f1a27e9d9e95a3d
time it was responded to is curTick. Doesn't change the results, but it does make implementation of nextCycle() more difficult
--HG--
extra : convert_revision : 67ed6261a5451d17d96d5df45992590acc353afc
into vm1.(none):/home/stever/bk/newmem-cache2
configs/example/memtest.py:
Hand merge redundant changes.
--HG--
extra : convert_revision : a2e36be254bf052024f37bcb23b5209f367d37e1
timing mode still broken.
configs/example/memtest.py:
Revamp options.
src/cpu/memtest/memtest.cc:
No need for memory initialization.
No need to make atomic response... memory system should do that now.
src/cpu/memtest/memtest.hh:
MemTest really doesn't want to snoop.
src/mem/bridge.cc:
checkFunctional() cleanup.
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.cc:
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_builder.cc:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/coherence/coherence_protocol.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/miss/SConscript:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
src/mem/packet.cc:
src/mem/packet.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/tport.cc:
More major reorg. Seems to work for atomic mode now,
timing mode still broken.
--HG--
extra : convert_revision : 7e70dfc4a752393b911880ff028271433855ae87
No need to initialize memory contents; should come up as 0.
src/cpu/memtest/memtest.cc:
No need to initialize memory contents; should come up as 0.
--HG--
extra : convert_revision : 1713676956f3d33b4686fee2650bd17027bcc495
Make code compatible with new decode method.
src/arch/alpha/remote_gdb.cc:
src/cpu/base_dyn_inst_impl.hh:
src/cpu/exetrace.cc:
src/cpu/simple/base.cc:
Make code compatible with new decode method.
src/cpu/static_inst.cc:
src/cpu/static_inst.hh:
Modified instruction decode method.
--HG--
extra : convert_revision : a9a6d3a16fff59bc95d0606ea344bd57e71b8d0a
src/arch/x86/predecoder.cc:
Seperate the pc-pc and the pc of the incoming bytes, and get rid of the "moreBytes" which just takes a MachInst. Also make the "opSize" field describe the number of bytes and not the log of the number of bytes.
--HG--
extra : convert_revision : 3a5ec7053ec69c5cba738a475d8b7fd9e6e6ccc0
src/arch/x86/isa/macroop.isa:
Make microOp vs microop and macroOp vs macroop capitilization consistent. Also fill out the emulation environment handling a little more, and use an object to pass around output code.
src/arch/x86/isa/microops/base.isa:
Make microOp vs microop and macroOp vs macroop capitilization consistent. Also adjust python to C++ bool translation.
--HG--
extra : convert_revision : 6f4bacfa334c42732c845f9a7f211cbefc73f96f
real thing. Also rename the null case to something that can
be a C++ symbol.
--HG--
extra : convert_revision : e3bfc4065b59c21f613e486d234711c48d7c9070
into ahchoo.blinky.homelinux.org:/home/gblack/m5/newmem-x86
src/cpu/simple/base.cc:
Hand merge
--HG--
extra : convert_revision : a2902ef9d917d22ffb9c7dfa2fd444694a65240d
Initialize a temporary variable for thread->readPC() at setupFetchRequest() to reduce function calls.
exec tracing isn't needed for m5.fast binaries
Moved MISCREG_GL, MISCREG_CWP, and MISCREG_TLB_DATA out of switch statement and use if blocks instead.
src/arch/sparc/miscregfile.cc:
Moved MISCREG_GL, MISCREG_CWP, and MISCREG_TLB_DATA out of switch statement and use if blocks instead.
src/cpu/simple/base.cc:
Assign traceData to be NULL at BaseSimpleCPU constructor.
Initialize a temporary variable for thread->readPC() at setupFetchRequest() to reduce function calls.
exec tracing isn't needed for m5.fast binaries
--HG--
extra : convert_revision : 5dc92fff05c9bde994f1e0f1bb40e11c44eb72c6
fix the timing cpu to handle receiving a nacked packet
src/cpu/simple/timing.cc:
make the timing cpu handle receiving a nacked packet
src/mem/bridge.cc:
src/mem/bridge.hh:
the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space
--HG--
extra : convert_revision : 5e12d0cf6ce985a5f72bcb7ce26c83a76c34c50a
src/cpu/simple/base.cc:
Cpu's should start as unallocated, not suspended
src/cpu/simple_thread.cc:
Wait for a thread to be assigned to activate the cpu
src/kern/tru64/tru64.hh:
When looking for a open cpu to assign threads, look for an unallocated one, not a suspended one.
--HG--
extra : convert_revision : 5e3ad2e96b4a715ed38293ceaccff5b9f4ea7985
src/cpu/o3/alpha/cpu_impl.hh:
Pass ISA-specific O3 CPU to FullO3CPU as a constructor parameter instead of using setCPU functions.
--HG--
extra : convert_revision : 74f4b1f5fb6f95a56081f367cce7ff44acb5688a
The removed ones were unnecessary. The commented out ones could be useful in the future, should this problem get fixed. See flyspray task #243.
src/cpu/o3/commit_impl.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/inst_queue_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/o3/lsq_unit_impl.hh:
src/cpu/o3/rename_impl.hh:
src/cpu/o3/rob_impl.hh:
Remove/comment out DPRINTFs that were causing a segfault.
--HG--
extra : convert_revision : b5aeda1c6300dfde5e0a3e9b8c4c5f6fa00b9862
src/cpu/o3/fetch.hh:
src/cpu/o3/fetch_impl.hh:
Update code so that the O3 CPU can handle not initially having anything hooked up to its ports.
--HG--
extra : convert_revision : 04bcef44e754735d821509ebd69b0ef9c8ef8e2c
into zamp.eecs.umich.edu:/z/ktlim2/clean/tmp/clean2
src/cpu/base_dyn_inst.hh:
Hand merge. Line is no longer needed because it's handled in the ISA.
--HG--
extra : convert_revision : 0be4067aa38759a5631c6940f0167d48fde2b680
1. Move interrupt handling to a separate function to clean up main commit() function a bit. Also gate the function call off properly based on whether or not there are outstanding interrupts, and the system is not in PAL mode.
2. Better handling of updating instruction's status bits. Instructions are not marked "atCommit" until other stages view it (pushed off to IEW/IQ), and they have been properly handled (faults).
3. Don't consider the ROB "empty" for the purpose of other stages until the ROB is empty, all stores have written back, and there was no store commits this cycle. The last is necessary in case a store committed, in which case it would look like all stores have written back but in actuality have not.
src/cpu/o3/commit.hh:
Slightly modify how interrupts are handled. Also include some extra bools to keep track of state properly.
src/cpu/o3/commit_impl.hh:
Slightly modify how interrupts are handled. Also include some extra bools to keep track of state.
General correctness updates, most specifically for when commit broadcasts to other stages that the ROB is empty.
--HG--
extra : convert_revision : 682ec6ccf4ee6ed0c8a030ceaba1c90a3619d102
src/cpu/o3/iew_impl.hh:
Allow for slightly more flexible handling of non-speculative instructions. They can be other classes now, such as loads or stores.
Also be sure to clear the state associated with squashes that are not used. i.e. if a squash due to a memory ordering violation happens on the same cycle as an older branch squashing, clear the state associated with the memory ordering violation.
Lastly don't consider uncached loads to officially be "at commit" until IEW receives the signal back from commit about the load.
src/cpu/o3/inst_queue_impl.hh:
Don't consider non-speculative instructions to be "at commit" until the IQ has received a signal from commit about the instruction. This prevents non-speculative instructions from being issued too early.
src/cpu/o3/mem_dep_unit_impl.hh:
Clear instruction's ability to issue if it's replayed.
--HG--
extra : convert_revision : d69dae878a30821222885485f4dee87170d56eb3
1. Requests are handled more properly now. They assume the memory system takes control of the request upon sending out an access.
2. load-load ordering is maintained.
src/cpu/base_dyn_inst.hh:
Update how requests are handled. The BaseDynInst should not be able to hold a pointer to the request because the request becomes owned by the memory system once it is sent out.
Also include some functions to allow certain status bits to be cleared.
src/cpu/base_dyn_inst_impl.hh:
Update how requests are handled. The BaseDynInst should not be able to hold a pointer to the request because the request becomes owned by the memory system once it is sent out.
src/cpu/o3/fetch_impl.hh:
General correctness fixes. retryPkt is not necessarily always set, so handle it properly. Also consider the cache unblocked only when recvRetry is called.
src/cpu/o3/lsq_unit.hh:
Handle requests a little more correctly. Now that the requests aren't pointed to by the DynInst, be sure to delete the request if it's not being used by the memory system.
Also be sure to not store-load forward from an uncacheable store.
src/cpu/o3/lsq_unit_impl.hh:
Check to make sure load-load ordering was maintained.
Also handle requests a little more correctly.
--HG--
extra : convert_revision : e86bead2886d02443cf77bf7a7a1492845e1690f
1. Set CPU ID in all modes for the O3 CPU.
2. Use nextCycle() function to prevent phase drift in O3 CPU.
3. Remove assertion in rename map that is no longer true.
src/cpu/o3/alpha/cpu_builder.cc:
Allow for CPU id in all modes, not just full system. Also include a parameter that was left out by accident.
src/cpu/o3/alpha/cpu_impl.hh:
Set the CPU ID properly.
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
Use nextCycle() function so that the CPU does not get out of phase when starting up from quiesces.
src/cpu/o3/rename_map.cc:
Remove assertion that is no longer true.
tests/configs/o3-timing.py:
Set CPU's id to 0.
--HG--
extra : convert_revision : 2b69c19adfce2adcc2d1939e89d702bd6674d5d5
src/arch/alpha/predecoder.hh:
src/arch/sparc/predecoder.hh:
Put in a missing include
src/cpu/exetrace.cc:
Convert the legion lockstep stuff from makeExtMI to the predecoder object.
--HG--
extra : convert_revision : 91bad4466f8db1447fff8608fa46a5f236dc3a89
into ahchoo.blinky.homelinux.org:/home/gblack/m5/newmem-x86
src/arch/mips/utility.hh:
src/arch/x86/SConscript:
Hand merge
--HG--
extra : convert_revision : 0ba457aab52bf6ffc9191fd1fe1006ca7704b5b0
Removed the getOpcode function from StaticInst which only made sense for Alpha.
Started implementing the x86 predecoder.
--HG--
extra : convert_revision : a13ea257c8943ef25e9bc573024a99abacf4a70d
src/arch/sparc/ua2005.cc:
fix interrupting when quisced. Since sticks correspond to instructions when not quisced we need to
check if were suspended and interrupt at the guess time
src/base/traceflags.py:
add trace flag for Iob
src/cpu/simple/base.cc:
Use Quisce instead of IPI trace flag
src/dev/sparc/iob.cc:
add some Dprintfs
--HG--
extra : convert_revision : 72e18fcc750ad1e4b2bb67b19b354eaffc6af6d5
src/cpu/memtest/memtest.cc:
Add the [] to a delete to make it work correctly
src/mem/cache/cache_impl.hh:
Fix one of the memory leaks
--HG--
extra : convert_revision : 64c7465c68a084efe38a62419205518b24d852a7
automatic. The point is that now a subdirectory can be added
to the build process just by creating a SConscript file in it.
The process has two passes. On the first pass, all subdirs
of the root of the tree are searched for SConsopts files.
These files contain any command line options that ought to be
added for a particular subdirectory. On the second pass,
all subdirs of the src directory are searched for SConscript
files. These files describe how to build any given subdirectory.
I have added a Source() function. Any file (relative to the
directory in which the SConscript resides) passed to that
function is added to the build. Clean up everything to take
advantage of Source().
function is added to the list of files to be built.
--HG--
extra : convert_revision : 103f6b490d2eb224436688c89cdc015211c4fd30
1. Make sure connectMemPorts() only gets called when the CPU's peer gets changed. This is done by making setPeer() virtual, and overriding it in the CPU's ports. When it gets called on a CPU's port (dcache specifically), it calls the normal setPeer() function, and also connectMemPorts().
2. Consolidate redundant code that handles switching in a CPU.
src/cpu/base.cc:
Move common code of switching over peers to base CPU.
src/cpu/base.hh:
Move common code of switching over peers to BaseCPU.
src/cpu/o3/cpu.cc:
Add in function that updates thread context's ports.
Also use updated function to takeOverFrom() in BaseCPU. This gets rid of some repeated code.
src/cpu/o3/cpu.hh:
Include function to update thread context's memory ports.
src/cpu/o3/lsq.hh:
Add function to dcache port that will update the memory ports upon getting a new peer.
Also include a function that will tell the CPU to update those memory ports.
src/cpu/o3/lsq_impl.hh:
Add function that will update the memory ports upon getting a new peer.
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Add function that will update thread context's memory ports upon getting a new peer.
Also use the new BaseCPU's take over from function.
src/cpu/simple/atomic.hh:
Add in function (and dcache port) that will allow the dcache to update memory ports when it gets assigned a new peer.
src/cpu/simple/timing.hh:
Add function that will update thread context's memory ports upon getting a new peer.
src/mem/port.hh:
Make setPeer virtual so that other classes can override it.
--HG--
extra : convert_revision : 2050f1241dd2e83875d281cfc5ad5c6c8705fdaf
don't create a new physPort/virtPort every time activateContext() is called
add the ability to tell a memory object to delete it's reference to a port and a method to have a port call deletePortRefs()
on the port owner as well as delete it's peer
still need to stop calling connectMemoPorts() every time activateContext() is called or we'll overflow the bus id and panic
src/cpu/thread_state.cc:
if we hav ea (phys|virt)Port don't create a new on, have it delete it's peer and then reuse it
src/mem/bus.cc:
src/mem/bus.hh:
add ability to delete a port by usig a hash_map instead of an array to store port ids
add a function to do deleting
src/mem/cache/cache.hh:
src/mem/cache/cache_impl.hh:
src/mem/mem_object.cc:
src/mem/mem_object.hh:
adda function to delete port references from a memory object
src/mem/port.cc:
src/mem/port.hh:
add a removeConn function that tell the owener to delete any references to the port and then deletes its peer
--HG--
extra : convert_revision : 272f0c8f80e1cf1ab1750d8be5a6c9aa110b06a4
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
--HG--
extra : convert_revision : cf82ee1ea20f9343924f30bacc2a38d4edee8df3
configs/common/FSConfig.py:
Use binaries we've compiled instead of the ones that come with Legion
src/arch/alpha/interrupts.hh:
get rid of post(int int_type) and add a get_vec function that gets the interrupt vector for an interrupt number
src/arch/sparc/asi.cc:
Add AsiIsInterrupt() to AsiIsMmu()
src/arch/sparc/faults.cc:
src/arch/sparc/faults.hh:
Add InterruptVector type
src/arch/sparc/interrupts.hh:
rework interrupts. They are no longer cleared when created... A I/O or ASI read/write needs to happen before they are cleared
src/arch/sparc/isa_traits.hh:
Add the "interrupt" trap types to isa traits
src/arch/sparc/miscregfile.cc:
add names for all the misc registers and possible post an interrupt when TL is changed.
src/arch/sparc/miscregfile.hh:
Add a helper function to post an interrupt when pil < some set softint
src/arch/sparc/regfile.cc:
src/arch/sparc/regfile.hh:
InterruptLevel shouldn't really live here, moved to interrupt.hh
src/arch/sparc/tlb.cc:
Add interrupt ASIs to TLB
src/arch/sparc/ua2005.cc:
Add checkSoftInt to check if a softint needs to be posted
Check that a tickCompare isn't scheduled before scheduling one
Post and clear interrupts on queue writes and what not
src/base/bitfield.hh:
Add an helper function to return the msb that is set
src/cpu/base.cc:
src/cpu/base.hh:
get rid of post_interrupt(type) since it's no longer needed.. Add a way to see what interrupts are pending
src/cpu/intr_control.cc:
src/cpu/intr_control.hh:
src/dev/alpha/tsunami_cchip.cc:
src/python/m5/objects/IntrControl.py:
Make IntrControl have a system pointer rather than using a cpu pointer to get one
src/dev/sparc/SConscript:
add iob to SConsscrip
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-atomic-dual/config.ini:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-atomic-dual/config.out:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-atomic/config.ini:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-atomic/config.out:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing-dual/config.ini:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing-dual/config.out:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing/config.ini:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing/config.out:
tests/quick/80.netperf-stream/ref/alpha/linux/twosys-tsunami-simple-atomic/config.ini:
tests/quick/80.netperf-stream/ref/alpha/linux/twosys-tsunami-simple-atomic/config.out:
update config.ini/out for intrcntrl not having a cpu pointer anymore
--HG--
extra : convert_revision : 38614f6b9ffc8f3c93949a94ff04b7d2987168dd
on in python. Fix the trace start code so it actually starts
when it is suppsed to. Make the Exec tracing stuff obey the
trace enabled flag.
--HG--
extra : convert_revision : 634ba0b4f52345d4bf40d43e239cef7ef43e7691
the traceflags infrastructure. InstExec is now just Exec
and all of the command line options are now trace options.
--HG--
extra : convert_revision : 4adfa9dfbb32622d30ef4e63c06c7d87da793c8f
into zeep.pool:/z/saidi/work/m5.newmem
src/cpu/simple/atomic.cc:
merge steve's changes in.
--HG--
extra : convert_revision : a17eda37cd63c9380af6fe68b0aef4b1e1974231
Add support for a twin 64 bit int load
Add Memory barrier and write barrier flags as appropriate
Make atomic memory ops atomic
src/arch/alpha/isa/mem.isa:
src/arch/alpha/locked_mem.hh:
src/cpu/base_dyn_inst.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_impl.hh:
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/arch/alpha/types.hh:
src/arch/mips/types.hh:
src/arch/sparc/types.hh:
add a largest read data type for statically allocating read buffers in atomic simple cpu
src/arch/isa_parser.py:
Add support for a twin 64 bit int load
src/arch/sparc/isa/decoder.isa:
Make atomic memory ops atomic
Add Memory barrier and write barrier flags as appropriate
src/arch/sparc/isa/formats/mem/basicmem.isa:
add post access code block and define a twinload format for twin loads
src/arch/sparc/isa/formats/mem/blockmem.isa:
remove old microcoded twin load coad
src/arch/sparc/isa/formats/mem/mem.isa:
swap.isa replaces the code in loadstore.isa
src/arch/sparc/isa/formats/mem/util.isa:
add a post access code block
src/arch/sparc/isa/includes.isa:
need bigint.hh for Twin64_t
src/arch/sparc/isa/operands.isa:
add a twin 64 int type
src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
add support for twinloads
add support for swap and conditional swap instructions
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/mem/packet.cc:
src/mem/packet.hh:
Add support for atomic swap memory commands
src/mem/packet_access.hh:
Add endian conversion function for Twin64_t type
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
Add support for atomic swap memory commands
Rename sc code to extradata
--HG--
extra : convert_revision : 69d908512fb34a4e28b29a6e58b807fb1a6b1656
function into Alpha ISA description. write now just generically
returns a result value if the res pointer is non-null (which means
we can only provide a res pointer if we expect a valid result
value).
--HG--
extra : convert_revision : fb1c315515787f5fbbf7d1af7e428bdbfe8148b8
Created MemCmd class to wrap enum and provide handy methods to
check attributes, convert to string/int, etc.
--HG--
extra : convert_revision : 57f147ad893443e3a2040c6d5b4cdb1a8033930b
fix unaligned accesses in mmaped disk device
src/arch/sparc/isa/decoder.isa:
get (ld|st)fsr ops working right. In reality the fp enable check needs to go higher up in the emitted code
src/arch/sparc/isa/formats/basic.isa:
move the cexec into the aexec field
src/cpu/exetrace.cc:
copy the exception state from legion when we get it wrong. We aren't going to get it right without an fp emulation layer
src/dev/sparc/mm_disk.cc:
src/dev/sparc/mm_disk.hh:
fix unaligned accesses in the memory mapped disk device
--HG--
extra : convert_revision : aaa33096b08cf0563fe291d984a87493a117e528
src/arch/sparc/floatregfile.cc:
fix fp read/writing to registers... looking for suggestions on cleaner ways if anyone has them
src/arch/sparc/isa/decoder.isa:
fix some fp implementations
src/arch/sparc/isa/formats/basic.isa:
add new fp op class that 0 cexec in fsr and sets rounding mode for the up comming op
src/arch/sparc/isa/includes.isa:
include the appropriate header files for the rounding code
src/arch/sparc/miscregfile.cc:
print fsr out when it's read/written and the Sparc traceflgas in on
src/cpu/exetrace.cc:
fix printing of float registers
--HG--
extra : convert_revision : 49faab27f2e786a8455f9ca0f3f0132380c9d992
src/arch/sparc/floatregfile.cc:
Fix serialization for fpreg
src/arch/sparc/intregfile.cc:
fix serialization for intreg
src/arch/sparc/miscregfile.cc:
fix serialization from miscreg
src/arch/sparc/pagetable.cc:
fix serialization for page table
src/arch/sparc/regfile.cc:
need to serialize nnpc
src/arch/sparc/tlb.cc:
write serialization code for tlb
src/cpu/base.cc:
provide a way to find the thread number a context is
serialize the instruction counter
src/cpu/base.hh:
provide a way to find the thread number a context is
and given a thread number find a context pointer
src/cpu/cpuevent.hh:
provide method to get thread context from a cpu event for serialization
src/dev/sparc/t1000.cc:
src/dev/sparc/t1000.hh:
nothing to serialize in t1000
src/sim/serialize.cc:
src/sim/serialize.hh:
Make findObj() work (it hasn't since we did the python conversion stuff)
--HG--
extra : convert_revision : a95bc4e3c3354304171efbe3797556fdb146bea2
make fp writes also chatty with the Sparc traceflag
src/arch/sparc/floatregfile.cc:
make fp writes also chatty with the Sparc traceflag
src/cpu/exetrace.cc:
fix comparing fp registers between legion and m5
--HG--
extra : convert_revision : f3703afae56249f137451262bc1b6919d465e714
SConstruct:
src/SConscript:
Add flags for Intel CC while i'm at it
src/base/compiler.hh:
the _Pragma stuff needst to be called this way unless someone happens to have a cleaner way
src/base/cprintf_formats.hh:
add std:: where appropriate
src/base/statistics.hh:
use this->map since icc was getting confused about std::map vs the locally defined map
src/cpu/static_inst.hh:
Add some more dummy returns where needed
src/mem/packet.hh:
add more dummy returns where needed
src/sim/host.hh:
use limits to come up with max tick
--HG--
extra : convert_revision : 08e9f7898b29fb9d063136529afb9b6abceab60c
into zower.eecs.umich.edu:/eecshome/m5/newmem
src/arch/sparc/isa/formats/mem/util.isa:
src/arch/sparc/isa_traits.hh:
src/arch/sparc/system.cc:
Hand Merge
--HG--
extra : convert_revision : d5e0c97caebb616493e2f642e915969d7028109c
some fixes to fp instructions to use the single precision registers
if this is an fp op emit fp check code
add fpregs to m5legion struct
src/arch/sparc/floatregfile.cc:
Make Sparc traceflag even more chatty
src/arch/sparc/isa/base.isa:
add code to check if the fpu is enabled
src/arch/sparc/isa/decoder.isa:
some fixes to fp instructions to use the single precision registers
fix smul again
fix subc/subcc/subccc condition code setting
src/arch/sparc/isa/formats/basic.isa:
src/arch/sparc/isa/formats/mem/util.isa:
if this is an fp op emit fp check code
src/cpu/exetrace.cc:
check fp regs as well as int regs
src/cpu/m5legion_interface.h:
add fpregs to m5legion struct
--HG--
extra : convert_revision : e7d26d10fb8ce88f96e3a51f84b48c3b3ad2f232
pretty close to compiling w/ suns compiler
briefly:
add dummy return after panic()/fatal()
split out flags by compiler vendor
include cstring and cmath where appropriate
use std namespace for string ops
SConstruct:
Add code to detect compiler and choose cflags based on detected compiler
Fix zlib check to work with suncc
src/SConscript:
split out flags by compiler vendor
src/arch/sparc/isa/decoder.isa:
use correct namespace for sqrt
src/arch/sparc/isa/formats/basic.isa:
add dummy return around panic
src/arch/sparc/isa/formats/integerop.isa:
use correct namespace for stringops
src/arch/sparc/isa/includes.isa:
include cstring and cmath where appropriate
src/arch/sparc/isa_traits.hh:
remove dangling comma
src/arch/sparc/system.cc:
dummy return to make sun cc front end happy
src/arch/sparc/tlb.cc:
src/base/compression/lzss_compression.cc:
use std namespace for string ops
src/arch/sparc/utility.hh:
no reason to say something is unsigned unsigned int
src/base/compression/null_compression.hh:
dummy returns to for suncc front end
src/base/cprintf.hh:
use standard variadic argument syntax instead of gnuc specefic renaming
src/base/hashmap.hh:
don't need to define hash for suncc
src/base/hostinfo.cc:
need stdio.h for sprintf
src/base/loader/object_file.cc:
munmap is in std namespace not null
src/base/misc.hh:
use M5 generic noreturn macros
use standard variadic macro __VA_ARGS__
src/base/pollevent.cc:
we need file.h for file flags
src/base/random.cc:
mess with include files to make suncc happy
src/base/remote_gdb.cc:
malloc memory for function instead of having a non-constant in an array size
src/base/statistics.hh:
use std namespace for floor
src/base/stats/text.cc:
include math.h for rint (cmath won't work)
src/base/time.cc:
use suncc version of ctime_r
src/base/time.hh:
change macro to work with both gcc and suncc
src/base/timebuf.hh:
include cstring from memset and use std::
src/base/trace.hh:
change variadic macros to be normal format
src/cpu/SConscript:
add dummy returns where appropriate
src/cpu/activity.cc:
include cstring for memset
src/cpu/exetrace.hh:
include cstring fro memcpy
src/cpu/simple/base.hh:
add dummy return for panic
src/dev/baddev.cc:
src/dev/pciconfigall.cc:
src/dev/platform.cc:
src/dev/sparc/t1000.cc:
add dummy return where appropriate
src/dev/ide_atareg.h:
make define work for both gnuc and suncc
src/dev/io_device.hh:
add dummy returns where approirate
src/dev/pcidev.hh:
src/mem/cache/cache_impl.hh:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.hh:
src/mem/dram.cc:
src/mem/packet.cc:
src/mem/port.cc:
include cstring for string ops
src/dev/sparc/mm_disk.cc:
add dummy return where appropriate
include cstring for string ops
src/mem/cache/miss/blocking_buffer.hh:
src/mem/port.hh:
Add dummy return where appropriate
src/mem/cache/tags/iic.cc:
cast hastSets to double for log() call
src/mem/physical.cc:
cast pmemAddr to char* for munmap
src/sim/byteswap.hh:
make define work for suncc and gnuc
--HG--
extra : convert_revision : ef8a1f1064e43b6c39838a85c01aee4f795497bd
Only allow writing/reading of 32 bits of Y
Only allow writing/reading 32 bits of pc when pstate.am
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
only erase a entry from the lookup table if it's valid
Put in a temporary check to make sure that lookup table and tlb array stay in sync
if we are interrupted in the middle of a mico-op, reset the micropc/nexpc
so we start on the first part of it when we come back
src/arch/sparc/isa/decoder.isa:
fix smul and sdiv to sign extend, and handle overflow/underflow corretly
Only allow writing/reading of 32 bits of Y
Only allow writing/reading 32 bits of pc when pstate.am
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
src/arch/sparc/isa/formats/mem/blockmem.isa:
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
src/arch/sparc/isa/includes.isa:
Use limits for 32bit underflow/overflow detection
src/arch/sparc/tlb.cc:
only erase a entry from the lookup table if it's valid
Put in a temporary check to make sure that lookup table and tlb array stay in sync
src/arch/sparc/tlb_map.hh:
add a print function to dump the tlb lookup table
src/cpu/simple/base.cc:
if we are interrupted in the middle of a mico-op, reset the micropc/nexpc
so we start on the first part of it when we come back
--HG--
extra : convert_revision : 50a23837fd888393a5c2aa35cbd1abeebb7f55d4
check writability of tlb cache entry before using
update tagaccess in places I forgot to
move the tlb privileged test up since it is higher priority
src/arch/sparc/faults.cc:
save only 32 bits of PC/NPC if Pstate.am is set
src/arch/sparc/isa/decoder.isa:
return only 32 bits of PC/NPC if Pstate.am is set
increment cleanwin correctly
src/arch/sparc/tlb.cc:
check writability of cache entry
update tagaccess in a few more places
move the privileged test up since it is higher priority
src/cpu/exetrace.cc:
mask off upper bits of pc if pstate.am is set before comparing to legion
--HG--
extra : convert_revision : 02a51c141ee3f9a2600c28eac018ea7216f3655c
Increment instruction count on first micro-op instead of last
src/arch/sparc/isa/decoder.isa:
Implement a twin load for ASI_LDTX_P(0xe2)
src/arch/sparc/isa/formats/mem/blockmem.isa:
set the new flag IsFirstMicroOp when needed
src/cpu/simple/atomic.cc:
Increment instruction count on first micro-op instead of last (because if we take a fault on a micro coded instruction it should be counted twice acording to legion)
src/cpu/static_inst.hh:
Add IsFirstMicroop flag to static insts
--HG--
extra : convert_revision : 02bea93d38c03bbafe4570665eb4c01c11caa2fc
src/arch/sparc/interrupts.hh:
fill in how we do interrupts on sparc a little bit.
1) create a bitfield for interrupts, and check that in checkInterrupts() to fast path CPU.
2) fill in getInterrupts() a little bit.
also, update the bitfield access to be HPSTATE::hpriv, etc.
src/arch/sparc/ua2005.cc:
1) update formatting
2) change the way interrupts are done to use the new way to tickle the CPU.
src/cpu/base.cc:
src/cpu/base.hh:
overload the post_interrupt function for SPARC interrupts - which are only denoted by a single int value.
--HG--
extra : convert_revision : 9074a003eff37a40dcce78f56d20f6cbcc453eb5
at slightly different times interrupts are seen the cycle before they happen in m5 so the pc gets changed early.
--HG--
extra : convert_revision : f237363eababb2aad67e5b41670cf40be048a042
src/cpu/o3/commit_impl.hh:
Oops, changed the logic a little bit. Fix it up to how it used to be.
--HG--
extra : convert_revision : df7f69b0997207b611374c3c92880f3a405e88be
Only print faults instructions that aren't traps or faulting loads
src/cpu/exetrace.cc:
Compare the legion and m5 tlbs and printout any differences
Only show differences if the instruction isn't a trap and isn't a memory
operation that changes the trap level (a fault)
src/cpu/m5legion_interface.h:
update the m5<->legion interface to add tlb data
--HG--
extra : convert_revision : 6963b64ca1012604e6b1d3c5e0e5f5282fd0164e
Also don't call (*activeThreads).end() over and over. Just
call activeThreads->end() once and save the result.
Make sure we always check that there are elements in the list
before we grab the first one.
--HG--
extra : convert_revision : d769d8ed52da99532d57a9bbc93e92ddf22b7e58
bugfixes and demap implementation in tlb
ignore some more differencs for one cycle
src/arch/sparc/isa/formats/mem/blockmem.isa:
twinx has 2 micro-ops
src/arch/sparc/isa/formats/mem/util.isa:
fix the fault check for twinx
src/arch/sparc/tlb.cc:
tlb bugfixes and write demapping code
src/cpu/exetrace.cc:
don't halt on a couple more instruction (ldx, stx) when things differ
beacuse of the way tlb faults are handled in legion.
--HG--
extra : convert_revision : 1e156dead6ebd58b257213625ed63c3793ef4b71
InstObjParam interface.
src/arch/alpha/isa/branch.isa:
src/arch/alpha/isa/fp.isa:
src/arch/alpha/isa/int.isa:
src/arch/alpha/isa/main.isa:
src/arch/alpha/isa/mem.isa:
src/arch/alpha/isa/pal.isa:
src/arch/mips/isa/formats/mem.isa:
src/arch/mips/isa/formats/util.isa:
Get rid of CodeBlock calls to adapt to new InstObjParam interface.
src/arch/isa_parser.py:
Check template code for operands (in addition to snippets).
src/cpu/o3/alpha/dyn_inst.hh:
Add (read|write)MiscRegOperand calls to Alpha DynInst.
--HG--
extra : convert_revision : 332caf1bee19b014cb62c1ed9e793e793334c8ee
src/arch/isa_parser.py:
Rearranged things so that classes with more than one execute function treat operands properly.
1. Eliminated the CodeBlock class
2. Created a SubOperandList
3. Redefined how InstObjParams is constructed
To define an InstObjParam, you can either pass in a single code literal which will be named "code", or you can pass in a dictionary of code snippets which will be substituted into the Templates. In order to get this to work, there is a new restriction that each template has only one function in it. These changes should only affect memory instructions which have regular and split execute functions.
Also changed the MiscRegs so that they use the instrunctions srcReg and destReg arrays.
src/arch/sparc/isa/formats/basic.isa:
src/arch/sparc/isa/formats/branch.isa:
src/arch/sparc/isa/formats/integerop.isa:
src/arch/sparc/isa/formats/mem/basicmem.isa:
src/arch/sparc/isa/formats/mem/blockmem.isa:
src/arch/sparc/isa/formats/mem/util.isa:
src/arch/sparc/isa/formats/nop.isa:
src/arch/sparc/isa/formats/priv.isa:
src/arch/sparc/isa/formats/trap.isa:
Rearranged to work with new InstObjParam scheme.
src/cpu/o3/sparc/dyn_inst.hh:
Added functions to access the miscregs using the indexes from instructions srcReg and destReg arrays. Also changed the names of the other accessors so that they have the suffix "Operand" if they use those arrays.
src/cpu/simple/base.hh:
Added functions to access the miscregs using the indexes from instructions srcReg and destReg arrays.
--HG--
extra : convert_revision : c91e1073138b72bcf4113a721e0ed40ec600cf2e
RangeSize as a function takes a start address, and a SIZE, and will make the range (start, start+size-1) for you.
src/cpu/memtest/memtest.hh:
src/cpu/o3/fetch.hh:
src/cpu/o3/lsq.hh:
src/cpu/ozone/front_end.hh:
src/cpu/ozone/lw_lsq.hh:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.hh:
Fix RangeSize arguments
src/dev/alpha/tsunami_cchip.cc:
src/dev/alpha/tsunami_io.cc:
src/dev/alpha/tsunami_pchip.cc:
src/dev/baddev.cc:
pioSize indicates SIZE, not a mask
--HG--
extra : convert_revision : d385521fcfe58f8dffc8622260937e668a47a948
exetrace.cc:
wrap this variable between FULL_SYSTEM #ifs
mmaped_ipr.hh:
fix for build
miscregfile.cc:
fixes for HPSTATE access during SE mode
src/arch/sparc/miscregfile.cc:
fixes for HPSTATE access during SE mode
src/arch/mips/mmaped_ipr.hh:
fix for build
src/cpu/exetrace.cc:
wrap this variable between FULL_SYSTEM #ifs
--HG--
extra : convert_revision : c5b9d56ab99018a91d04de47ba1d5ca7768590bb
Deal with block initializing stores (by doing nothing, at some point we might want to do the write hint 64 like thing)
Fix tcc instruction igoner in legion-lock stuff to be correct in all cases
Have console interrupts warn rather than panicing until we figure out what to do with interrupts
src/arch/sparc/miscregfile.cc:
src/arch/sparc/miscregfile.hh:
add a magic miscreg which reads all the bits the tlb needs in one go
src/arch/sparc/tlb.cc:
initialized the context type and id to reasonable values and handle block init stores
src/arch/sparc/tlb_map.hh:
fix bug in tlb map code
src/base/range_map.hh:
fix bug in rangemap code and add range_multimap
(these are probably useful for bus range stuff)
src/cpu/exetrace.cc:
fixup tcc ignore code to be correct
src/dev/sparc/t1000.cc:
make console interrupt stuff warn instead of panicing until we get interrupt stuff figured out
src/unittest/rangemaptest.cc:
fix up the rangemap unit test to catch the missing case
--HG--
extra : convert_revision : 70604a8b5d0553aa0b0bd7649f775a0cfa8267a5
This fixes a minor bug when multiple FU completions come back out of order (due to the order in which the FUs are freed up), and the oldest redirect isn't recorded properly. The eon benchmark should run now.
src/cpu/o3/iew_impl.hh:
Allow for multiple redirects to happen on a single cycle (only the one for the oldest instruction is passed on to commit).
--HG--
extra : convert_revision : b7d202dee1754539ed814f0fac59adb8c6328ee1
Fix fault formating and code for traps
fix a couple of bugs in the decoder
Cleanup/fix page table entry code
Implement more mmaped iprs, fix numbered tlb insertion code, add function to dump tlb contents
Don't panic if we differ from legion on a tcc instruction because of where legion prints its data and where we print our data
src/arch/sparc/faults.cc:
Fix fault formating and code for traps
src/arch/sparc/intregfile.hh:
allocate the correct number of global registers
src/arch/sparc/isa/decoder.isa:
fix a couple of bugs in the decoder: wrasi should write asi not ccr, done/retry should get hpstate from htstate
src/arch/sparc/pagetable.hh:
cleanup/fix page table code
src/arch/sparc/tlb.cc:
implement more mmaped iprs, fix numbered insertion code, add function to dump tlb contents
src/arch/sparc/tlb.hh:
add functions to write TagAccess register on tlb miss and to dump all tlb entries for debugging
src/cpu/exetrace.cc:
dump tlb entries on error, don't consider differences the cycle we take a trap to be bad.
--HG--
extra : convert_revision : d7d771900f6f25219f3dc6a6e51986d342a32e03
src/arch/sparc/asi.cc:
src/arch/sparc/asi.hh:
add sparc error asi
src/arch/sparc/faults.cc:
put a panic in if TL == MaxTL
src/arch/sparc/isa/decoder.isa:
Hpstate needs to be updated on a done too
src/arch/sparc/miscregfile.cc:
warn istead of panicing of fprs/fsr accesses
src/arch/sparc/tlb.cc:
add sparc error register code that just does nothing
fix a couple of other tlb bugs
src/arch/sparc/ua2005.cc:
fix implementation of HPSTATE write
src/cpu/exetrace.cc:
let exectrate mess up a couple of times before dying
src/python/m5/objects/T1000.py:
add l2 error status register fake devices
--HG--
extra : convert_revision : ed5dfdfb28633bf36e5ae07d244f7510a02874ca
src/cpu/o3/alpha/cpu.hh:
Got rid of some typedefs, and moved the tlbs to the base o3 cpu.
src/cpu/o3/alpha/thread_context.hh:
src/cpu/o3/cpu.cc:
Moved the tlbs to the base o3 cpu.
--HG--
extra : convert_revision : 1805613aa230b8974a226ee3d2584c85f7a578aa
into zower.eecs.umich.edu:/eecshome/m5/newmem
src/cpu/o3/commit_impl.hh:
Hand Merge
--HG--
extra : convert_revision : 6984db90d5b5ec71c31f1c345f5a77eed540059e
src/cpu/o3/thread_context_impl.hh:
Use flattened indices
src/cpu/simple_thread.hh:
Use flattened indices, and pass a thread context to setSyscallReturn rather than a register file.
src/cpu/thread_context.hh:
The SyscallReturn class is no longer in arch/syscallreturn.hh
--HG--
extra : convert_revision : ed84bb8ac5ef0774526ecd0d7270b0c60cd3708e
Protect other pieces of code so that sparc compiles SE again
src/arch/sparc/SConscript:
Add ua2005.cc back into SConscript
src/arch/sparc/miscregfile.hh:
add functions that deal with priv registers so we don't have to have a bunch of if defs and other ugliness
src/arch/sparc/mmaped_ipr.hh:
wrap handleIpr* with if full_system so it compiles under se
src/arch/sparc/ua2005.cc:
reorganize edit fs only miscreg functions
src/cpu/exetrace.cc:
protect legion code so it doesn't try to compile under se
--HG--
extra : convert_revision : 6b3c9f6f95b4da8544525f4f82e92861383ede76
configs/common/FSConfig.py:
seperate the hypervisor memory and the guest0 memory. In reality we're going to need a better way to do this at some point. Perhaps auto generating the hv-desc image based on the specified config.
src/arch/sparc/isa/decoder.isa:
change reads/writes to the [hs]tick(cmpr) registers to use readmiscregwitheffect
src/arch/sparc/miscregfile.cc:
For niagra stick and tick are aliased to one value (if we end up doing mps we might not want this).
Use instruction count from cpu rather than cycles because that is what legion does
we can change it back after were done with legion
src/base/bitfield.hh:
add a new function mbits() that just masks off bits of interest but doesn't shift
src/cpu/base.cc:
src/cpu/base.hh:
add instruction count to cpu
src/cpu/exetrace.cc:
src/cpu/m5legion_interface.h:
compare instruction count between legion and m5 too
src/cpu/simple/atomic.cc:
change asserts of packet success to if panics wrapped with NDEBUG defines
so we can get some more useful information when we have a bad address
src/dev/isa_fake.cc:
src/dev/isa_fake.hh:
src/python/m5/objects/Device.py:
expand isa fake a bit more having data for each size request, the ability to have writes update the data and to warn on accesses
src/python/m5/objects/System.py:
convert some tabs to spaces
src/python/m5/objects/T1000.py:
add more fake devices for each l1 bank and each memory controller
--HG--
extra : convert_revision : 8024ae07b765a04ff6f600e5875b55d8a7d3d276
into zower.eecs.umich.edu:/eecshome/m5/newmemmid
src/arch/sparc/isa_traits.hh:
src/arch/sparc/miscregfile.hh:
hand merge
--HG--
extra : convert_revision : 34f50dc5e6e22096cb2c08b5888f2b0fcd418f3e
src/arch/SConscript:
add mmaped_ipr.hh to switch headers
src/arch/sparc/asi.hh:
make ASI_IMPLICT=0 so by default nothing needs to be done
src/arch/sparc/miscregfile.hh:
miscregfile no longer needs to include asi.hh
src/arch/sparc/tlb.cc:
src/arch/sparc/tlb.hh:
implement panic instructions for mmaped ipr reads
src/cpu/simple/atomic.cc:
add check for mmaped iprs and handle them if it exists
src/mem/request.hh:
allocate space in the flags for mmaped iprs. Put in in the first 8 bits so that by default its fast. Move the other flags up 8 bits
--HG--
extra : convert_revision : 31255b0494588c4d06a727fe35241121d741b115
Right now this introduces a minor memory leak as old physPorts and virtPorts are not deleted when new ones are created. A flyspray task has been created for this issue. It can not be resolved until we determine how the bus will handle giving out ID's to functional ports that may be deleted.
src/cpu/o3/cpu.cc:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Change the setup of the physPort and virtPort to instead happen every time the CPU has a context activated. This is a little high overhead, but keeps it working correctly when the CPU does not have a physical memory attached to it until it switches in (like the case of switch CPUs).
src/cpu/o3/thread_context.hh:
Change function from being called at init() to just being called whenever the memory ports need to be connected.
src/cpu/o3/thread_context_impl.hh:
Update this to not delete the port if it's the same as the virtPort.
src/cpu/thread_context.hh:
Change function from being called at init() to whenever the memory ports need to be connected.
src/cpu/thread_state.cc:
Instead of initializing the ports, simply connect them, deleting any old ports that might exist. This allows these functions to be called multiple times.
src/cpu/thread_state.hh:
Ports are no longer initialized, but rather connected at context activation time.
--HG--
extra : convert_revision : e399ce5dfbd6ad658c953a7c9c7b69b89a70219e
src/arch/sparc/process.cc:
MachineBytes doesn't exist any more.
src/arch/sparc/regfile.cc:
Add in the miscRegFile for good measure.
src/cpu/o3/isa_specific.hh:
Add in a section for SPARC
src/cpu/o3/sparc/cpu.cc:
src/cpu/o3/sparc/cpu.hh:
src/cpu/o3/sparc/cpu_builder.cc:
src/cpu/o3/sparc/cpu_impl.hh:
src/cpu/o3/sparc/dyn_inst.cc:
src/cpu/o3/sparc/dyn_inst.hh:
src/cpu/o3/sparc/dyn_inst_impl.hh:
src/cpu/o3/sparc/impl.hh:
src/cpu/o3/sparc/params.hh:
src/cpu/o3/sparc/thread_context.cc:
src/cpu/o3/sparc/thread_context.hh:
Sparc version of this file.
--HG--
extra : convert_revision : 34bb5218f802d0a1328132a518cdd769fb59b6a4