Commit graph

73 commits

Author SHA1 Message Date
Andreas Sandberg 326662b01b arch, cpu: Factor out the ExecContext into a proper base class
We currently generate and compile one version of the ISA code per CPU
model. This is obviously wasting a lot of resources at compile
time. This changeset factors out the interface into a separate
ExecContext class, which also serves as documentation for the
interface between CPUs and the ISA code. While doing so, this
changeset also fixes up interface inconsistencies between the
different CPU models.

The main argument for using one set of ISA code per CPU model has
always been performance as this avoid indirect branches in the
generated code. However, this argument does not hold water. Booting
Linux on a simulated ARM system running in atomic mode
(opt/10.linux-boot/realview-simple-atomic) is actually 2% faster
(compiled using clang 3.4) after applying this patch. Additionally,
compilation time is decreased by 35%.
2014-09-03 07:42:22 -04:00
Curtis Dunham fe27f937aa arch: teach ISA parser how to split code across files
This patch encompasses several interrelated and interdependent changes
to the ISA generation step.  The end goal is to reduce the size of the
generated compilation units for instruction execution and decoding so
that batch compilation can proceed with all CPUs active without
exhausting physical memory.

The ISA parser (src/arch/isa_parser.py) has been improved so that it can
accept 'split [output_type];' directives at the top level of the grammar
and 'split(output_type)' python calls within 'exec {{ ... }}' blocks.
This has the effect of "splitting" the files into smaller compilation
units.  I use air-quotes around "splitting" because the files themselves
are not split, but preprocessing directives are inserted to have the same
effect.

Architecturally, the ISA parser has had some changes in how it works.
In general, it emits code sooner.  It doesn't generate per-CPU files,
and instead defers to the C preprocessor to create the duplicate copies
for each CPU type.  Likewise there are more files emitted and the C
preprocessor does more substitution that used to be done by the ISA parser.

Finally, the build system (SCons) needs to be able to cope with a
dynamic list of source files coming out of the ISA parser. The changes
to the SCons{cript,truct} files support this. In broad strokes, the
targets requested on the command line are hidden from SCons until all
the build dependencies are determined, otherwise it would try, realize
it can't reach the goal, and terminate in failure. Since build steps
(i.e. running the ISA parser) must be taken to determine the file list,
several new build stages have been inserted at the very start of the
build. First, the build dependencies from the ISA parser will be emitted
to arch/$ISA/generated/inc.d, which is then read by a new SCons builder
to finalize the dependencies. (Once inc.d exists, the ISA parser will not
need to be run to complete this step.) Once the dependencies are known,
the 'Environments' are made by the makeEnv() function. This function used
to be called before the build began but now happens during the build.
It is easy to see that this step is quite slow; this is a known issue
and it's important to realize that it was already slow, but there was
no obvious cause to attribute it to since nothing was displayed to the
terminal. Since new steps that used to be performed serially are now in a
potentially-parallel build phase, the pathname handling in the SCons scripts
has been tightened up to deal with chdir() race conditions. In general,
pathnames are computed earlier and more likely to be stored, passed around,
and processed as absolute paths rather than relative paths.  In the end,
some of these issues had to be fixed by inserting serializing dependencies
in the build.

Minor note:
For the null ISA, we just provide a dummy inc.d so SCons is never
compelled to try to generate it. While it seems slightly wrong to have
anything in src/arch/*/generated (i.e. a non-generated 'generated' file),
it's by far the simplest solution.
2014-05-09 18:58:47 -04:00
Curtis Dunham e651188f75 arch: remove 'null update' check in isa-parser
SCons already does this for all build steps.
2014-04-23 05:17:57 -04:00
Yasuko Eckert 2c293823aa cpu: add a condition-code register class
Add a third register class for condition codes,
in parallel with the integer and FP classes.
No ISAs use the CC class at this point though.
2013-10-15 14:22:44 -04:00
Steve Reinhardt 219c423f1f cpu: rename *_DepTag constants to *_Reg_Base
Make these names more meaningful.

Specifically, made these substitutions:

s/FP_Base_DepTag/FP_Reg_Base/g;
s/Ctrl_Base_DepTag/Misc_Reg_Base/g;
s/Max_DepTag/Max_Reg_Index/g;
2013-10-15 14:22:43 -04:00
Nilay Vaish 3700e5448a ISA Parser: Allow predication of source and destination registers
This patch is meant for allowing predicated reads and writes. Note that this
predication is different from the ISA provided predication. They way we
currently provide the ISA description for X86, we read/write registers that
do not need to be actually read/written. This is likely to be true for other
ISAs as well. This patch allows for read and write predicates to be associated
with operands. It allows for the register indices for source and destination
registers to be decided at the time when the microop is constructed. The
run time indicies come in to play only when the at least one of the
predicates has been provided. This patch will not affect any of the ISAs that
do not provide these predicates. Also the patch assumes that the order in
which operands appear in any function of the microop is same across all the
functions of the microops. A subsequent patch will enable predication for the
x86 ISA.
2012-06-03 10:59:04 -05:00
Ali Saidi 6df196b71e O3: Clean up the O3 structures and try to pack them a bit better.
DynInst is extremely large the hope is that this re-organization will put the
most used members close to each other.
2012-06-05 01:23:09 -04:00
Gabe Black eae1e97fb0 ISA: Make the decode function part of the ISA's decoder. 2012-05-25 00:55:24 -07:00
Gabe Black 997cbe1c09 ISA parser: Use '_' instead of '.' to delimit type modifiers on operands.
By using an underscore, the "." is still available and can unambiguously be
used to refer to members of a structure if an operand is a structure, class,
etc. This change mostly just replaces the appropriate "."s with "_"s, but
there were also a few places where the ISA descriptions where handling the
extensions themselves and had their own regular expressions to update. The
regular expressions in the isa parser were updated as well. It also now
looks for one of the defined type extensions specifically after connecting "_"
where before it would look for any sequence of characters after a "."
following an operand name and try to use it as the extension. This helps to
disambiguate cases where a "_" may legitimately be part of an operand name but
not separate the name from the type suffix.

Because leaving the "_" and suffix on the variable name still leaves a valid
C++ identifier and all extensions need to be consistent in a given context, I
considered leaving them on as a breadcrumb that would show what the intended
type was for that operand. Unfortunately the operands can be referred to in
code templates, the Mem operand in particular, and since the exact type of Mem
can be different for different uses of the same template, that broke things.
2011-09-26 23:48:54 -07:00
Gabe Black f370ac5c18 ISA parser: Don't look for operands in strings. 2011-09-08 03:21:14 -07:00
Gabe Black f4dc64655f ISA parser: Match /* */ and // style comments.
Comments should not be scanned for operands, and we should look for both /* */
style and // style.
2011-09-08 03:20:05 -07:00
Gabe Black a7dcd19fa0 ISA: Get rid of the unused mem_acc_type template parameter. 2011-07-11 04:47:06 -07:00
Nathan Binkert 3d252f8e5f grammar: better encapsulation of a grammar and parsing
This makes it possible to use the grammar multiple times and use the multiple
instances concurrently.  This makes implementing an include statement as part
of a grammar possible.
2011-07-05 18:30:04 -07:00
Gabe Black 63a934d152 ISA parser: Define operand types with a ctype directly. 2011-07-05 16:52:15 -07:00
Gabe Black f16179eb21 ISA parser: Simplify operand type handling.
This change simplifies the code surrounding operand type handling and makes it
depend only on the ctype that goes with each operand type. Future changes will
allow defining operand types by their ctypes directly, convert the ISAs over
to that style of definition, and then remove support for the old style. These
changes are to make it easier to use non-builtin types like classes or
structures as the type for operands.
2011-07-05 16:48:18 -07:00
Gabe Black ab3704170e ISA parser: Loosen the regular expressions matching filenames.
The regular expressions matching filenames in the ##include directives and the
internally generated ##newfile directives where only looking for filenames
composed of alpha numeric characters, periods, and dashes. In Unix/Linux, the
rules for what characters can be in a filename are much looser than that. This
change replaces those expressions with ones that look for anything other than
a quote character. Technically quote characters are allowed as well so we
should allow escaping them somehow, but the additional complexity probably
isn't worth it.
2011-06-07 00:46:54 -07:00
Gabe Black 57ed5e77fe ISA parser: Set up op_src_decl and op_dest_decl for pc operands. 2011-03-24 13:55:16 -04:00
Steve Reinhardt d650f4138e scons: show sources and targets when building, and colorize output.
I like the brevity of Ali's recent change, but the ambiguity of
sometimes showing the source and sometimes the target is a little
confusing.  This patch makes scons typically list all sources and
all targets for each action, with the common path prefix factored
out for brevity.  It's a little more verbose now but also more
informative.

Somehow Ali talked me into adding colors too, which is a whole
'nother story.
2011-01-07 21:50:13 -08:00
Gabe Black 4c9b023a7a ISA: Get the parser to support pc state components more elegantly. 2010-12-07 23:08:05 -08:00
Ali Saidi d4767f440a SCons: Cleanup SCons output during compile 2010-11-15 14:04:04 -06:00
Gabe Black 6f4bd2c1da ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.
2010-10-31 00:07:20 -07:00
Gabe Black 1c0d9806e5 ARM: Fix custom writer/reader code for non indexed operands. 2010-06-02 12:57:59 -05:00
Nathan Binkert 629e8df196 isa_parser: move the operand map stuff into the ISAParser class. 2010-02-26 18:14:48 -08:00
Nathan Binkert 4db57edade isa_parser: move more support functions into the ISAParser class 2010-02-26 18:14:48 -08:00
Nathan Binkert 5ad139375e isa_parser: move more stuff into the ISAParser class 2010-02-26 18:14:48 -08:00
Nathan Binkert 4ef6e129d6 isa_parser: move the formatMap and exportContext into the ISAParser class 2010-02-26 18:14:48 -08:00
Nathan Binkert 4e105f6fe1 isa_parser: Make stack objects class members instead of globals 2010-02-26 18:14:48 -08:00
Nathan Binkert b4178b1ae7 isa_parser: add a debug variable that changes how errors are reported.
This allows us to get tracebacks in certain cases where they're more
useful than our error message.
2010-02-26 18:14:48 -08:00
Nathan Binkert 40a05f04fb isa_parser: Use an exception to flag error
This allows the error to propagate more easily
2010-02-26 18:14:48 -08:00
Nathan Binkert f82a92925c isa_parser: Move more stuff into the ISAParser class 2010-02-26 18:14:48 -08:00
Nathan Binkert f7a627338c isa_parser: move code around to prepare for putting more stuff in the class 2010-02-26 18:14:48 -08:00
Nathan Binkert eb4ce01056 isa_parser: simple fixes, formatting and style 2010-02-26 18:14:48 -08:00
Nathan Binkert 52ccfde2cd isa_parser: allow negative integer literals 2009-11-05 17:21:25 -08:00
Nathan Binkert 708faa7677 compile: wrap 64bit numbers with ULL() so 32bit compiles work
In the isa_parser, we need to check case statements.
2009-11-08 13:31:59 -08:00
Timothy M. Jones 835a55e7f3 POWER: Add support for the Power ISA
This adds support for the 32-bit, big endian Power ISA. This supports both
integer and floating point instructions based on the Power ISA Book I v2.06.
2009-10-27 09:24:39 -07:00
Nathan Binkert baca1f0566 isa_parser: Turn the ISA Parser into a subclass of Grammar.
This is to prepare for future cleanup where we allow SCons to create a
separate grammar class for each ISA
2009-09-23 18:28:29 -07:00
Nathan Binkert e9288b2cd3 scons: add slicc and ply to sys.path and PYTHONPATH so everyone has access 2009-09-22 15:24:16 -07:00
Gabe Black dc0a017ed0 isa_parser: Get rid of the now unused ControlBitfieldOperand. 2009-07-20 20:20:17 -07:00
Gabe Black 60577eb4ca ISAs: Get rid of the IControl operand type.
A separate operand type is not necessary to use two bitfields to generate the
index.
2009-07-10 01:21:04 -07:00
Gabe Black 25884a8773 Registers: Get rid of the float register width parameter. 2009-07-08 23:02:20 -07:00
Gabe Black b8b7c7314a ISA parser: Allow alternative read/write code for operands. 2009-07-08 23:02:19 -07:00
Korey Sewell 63db33c4b1 isa-parser: made a few changes, but not author-worthy 2009-05-12 15:01:13 -04:00
Ali Saidi 3a3e356f4e style: Remove non-leading tabs everywhere they shouldn't be. Developers should configure their editors to not insert tabs 2008-09-10 14:26:15 -04:00
Korey Sewell 789153dff6 Get MIPS simple regression working. Take out unecessary functions "setShadowSet", "CacheOp"
--HG--
extra : convert_revision : a9ae8a7e62c27c2db16fd3cfa7a7f0bf5f0bf8ea
2007-11-15 03:10:41 -05:00
Korey Sewell 2692590049 Add in files from merge-bare-iron, get them compiling in FS and SE mode
--HG--
extra : convert_revision : d4e19afda897bc3797868b40469ce2ec7ec7d251
2007-11-13 16:58:16 -05:00
Gabe Black fb6cdf09cb X86: Make a microcode branch microop.
Also some touch up for ruflag.

--HG--
extra : convert_revision : 829947169af25ca6573f53b9430707101c75cc23
2007-08-07 15:19:26 -07:00
Gabe Black 009df5ff1e Fixed line number accounting
--HG--
extra : convert_revision : 4c046493b98ce4a766c2121710d70650cb6a801e
2007-07-20 23:12:26 -07:00
Korey Sewell ac19e0c505 FINISH off merge of mips mt/dsp isa extensions by adding the ControlBitfieldOPerand to ISA Parser. Now, while things do build, we have to fix broken functionality...
src/arch/isa_parser.py:
    add back deleted writeback in Control Operand

--HG--
extra : convert_revision : dba11af220a1281fa53f79d87e4f8752bdfc56db
2007-06-22 21:09:35 -04:00
Korey Sewell c6d137f565 add Control Bitfield class
--HG--
extra : convert_revision : 31e7243c8820cb9f6744c53c417460dee9adaf44
2007-06-22 20:09:46 -04:00
Gabe Black a3ae9486d5 Merge zizzer.eecs.umich.edu:/bk/newmem
into  doughnut.mwconnections.com:/home/gblack/m5/newmem-x86

--HG--
extra : convert_revision : 276d00a73b1834d5262129c3f7e0f7fae18e23bc
2007-05-25 19:29:32 -07:00