Commit graph

51 commits

Author SHA1 Message Date
Steve Reinhardt 6677b9122a mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from
LLSC operations.  Using "locked" by itself should be
obviously ambiguous now.
2015-03-23 16:14:20 -07:00
Curtis Dunham fe27f937aa arch: teach ISA parser how to split code across files
This patch encompasses several interrelated and interdependent changes
to the ISA generation step.  The end goal is to reduce the size of the
generated compilation units for instruction execution and decoding so
that batch compilation can proceed with all CPUs active without
exhausting physical memory.

The ISA parser (src/arch/isa_parser.py) has been improved so that it can
accept 'split [output_type];' directives at the top level of the grammar
and 'split(output_type)' python calls within 'exec {{ ... }}' blocks.
This has the effect of "splitting" the files into smaller compilation
units.  I use air-quotes around "splitting" because the files themselves
are not split, but preprocessing directives are inserted to have the same
effect.

Architecturally, the ISA parser has had some changes in how it works.
In general, it emits code sooner.  It doesn't generate per-CPU files,
and instead defers to the C preprocessor to create the duplicate copies
for each CPU type.  Likewise there are more files emitted and the C
preprocessor does more substitution that used to be done by the ISA parser.

Finally, the build system (SCons) needs to be able to cope with a
dynamic list of source files coming out of the ISA parser. The changes
to the SCons{cript,truct} files support this. In broad strokes, the
targets requested on the command line are hidden from SCons until all
the build dependencies are determined, otherwise it would try, realize
it can't reach the goal, and terminate in failure. Since build steps
(i.e. running the ISA parser) must be taken to determine the file list,
several new build stages have been inserted at the very start of the
build. First, the build dependencies from the ISA parser will be emitted
to arch/$ISA/generated/inc.d, which is then read by a new SCons builder
to finalize the dependencies. (Once inc.d exists, the ISA parser will not
need to be run to complete this step.) Once the dependencies are known,
the 'Environments' are made by the makeEnv() function. This function used
to be called before the build began but now happens during the build.
It is easy to see that this step is quite slow; this is a known issue
and it's important to realize that it was already slow, but there was
no obvious cause to attribute it to since nothing was displayed to the
terminal. Since new steps that used to be performed serially are now in a
potentially-parallel build phase, the pathname handling in the SCons scripts
has been tightened up to deal with chdir() race conditions. In general,
pathnames are computed earlier and more likely to be stored, passed around,
and processed as absolute paths rather than relative paths.  In the end,
some of these issues had to be fixed by inserting serializing dependencies
in the build.

Minor note:
For the null ISA, we just provide a dummy inc.d so SCons is never
compelled to try to generate it. While it seems slightly wrong to have
anything in src/arch/*/generated (i.e. a non-generated 'generated' file),
it's by far the simplest solution.
2014-05-09 18:58:47 -04:00
Curtis Dunham 7f1603d207 arch: remove inline specifiers on all inst constrs, all ISAs
With (upcoming) separate compilation, they are useless.  Only
link-time optimization could re-inline them, but ideally
feedback-directed optimization would choose to do so only for
profitable (i.e. common) instructions.
2014-05-09 18:58:46 -04:00
Andreas Sandberg 654d1e675a x86: Add support for loading 32-bit and 80-bit floats in the x87
The x87 FPU supports three floating point formats: 32-bit, 64-bit, and
80-bit floats. The current gem5 implementation supports 32-bit and
64-bit floats, but only works correctly for 64-bit floats. This
changeset fixes the 32-bit float handling by correctly loading and
rounding (using truncation) 32-bit floats instead of simply truncating
the bit pattern.

80-bit floats are loaded by first loading the 80-bits of the float to
two temporary integer registers. A micro-op (cvtint_fp80) then
converts the contents of the two integer registers to the internal FP
representation (double). Similarly, when storing an 80-bit float,
there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that
convert an internal FP register to 80-bit and stores the upper 64-bits
or lower 32-bits to an integer register, which is the written to
memory using normal integer stores.
2013-09-30 12:00:20 +02:00
Gabe Black a7859f7e45 X86: Fix address size handling so real mode works properly.
Virtual (pre-segmentation) addresses are truncated based on address size, and
any non-64 bit linear address is truncated to 32 bits. This means that real
mode addresses aren't truncated down to 16 bits after their segment bases are
added in.
2012-03-31 12:27:33 -07:00
Gabe Black 997cbe1c09 ISA parser: Use '_' instead of '.' to delimit type modifiers on operands.
By using an underscore, the "." is still available and can unambiguously be
used to refer to members of a structure if an operand is a structure, class,
etc. This change mostly just replaces the appropriate "."s with "_"s, but
there were also a few places where the ISA descriptions where handling the
extensions themselves and had their own regular expressions to update. The
regular expressions in the isa parser were updated as well. It also now
looks for one of the defined type extensions specifically after connecting "_"
where before it would look for any sequence of characters after a "."
following an operand name and try to use it as the extension. This helps to
disambiguate cases where a "_" may legitimately be part of an operand name but
not separate the name from the type suffix.

Because leaving the "_" and suffix on the variable name still leaves a valid
C++ identifier and all extensions need to be consistent in a given context, I
considered leaving them on as a breadcrumb that would show what the intended
type was for that operand. Unfortunately the operands can be referred to in
code templates, the Mem operand in particular, and since the exact type of Mem
can be different for different uses of the same template, that broke things.
2011-09-26 23:48:54 -07:00
Gabe Black aade13769f ISA: Use readBytes/writeBytes for all instruction level memory operations. 2011-07-02 22:34:29 -07:00
Gabe Black 2f72d6a1f4 X86: Fix store microops so they don't drop faults in timing mode.
If a fault was returned by the CPU when a store initiated it's write, the
store instruction would ignore the fault. This change fixes that.
2011-07-02 22:31:22 -07:00
Gabe Black efb9f7c2ae X86: Eliminate an unused argument for building store microops. 2011-06-21 19:28:14 -07:00
Gabe Black 2e4fb3f139 X86: Mark IO reads and writes as non-speculative. 2011-03-01 22:42:59 -08:00
Gabe Black 72d35701e9 X86: Mark prefetches as such in their instruction and request flags. 2011-03-01 22:42:18 -08:00
Gabe Black 4e1adf85f7 X86: Don't read in dest regs if all bits are replaced.
In x86, 32 and 64 bit writes to registers in which registers appear to be 32 or
64 bits wide overwrite all bits of the destination register. This change
removes false dependencies in these cases where the previous value of a
register doesn't need to be read to write a new value. New versions of most
microops are created that have a "Big" suffix which simply overwrite their
destination, and the right version to use is selected during microop
allocation based on the selected data size.

This does not change the performance of the O3 CPU model significantly, I
assume because there are other false dependencies from the condition code bits
in the flags register.
2011-02-13 17:44:24 -08:00
Gabe Black cb22bead7d X86: Get rid of the stupd microop. 2011-02-02 19:57:12 -08:00
Gabe Black 9581562e65 X86: Get rid of the flagless microop constructor.
This will reduce clutter in the source and hopefully speed up compilation.
2010-08-23 09:44:19 -07:00
Gabe Black 5a1dbe4d99 X86: Consolidate extra microop flags into one parameter.
This single parameter replaces the collection of bools that set up various
flavors of microops. A flag parameter also allows other flags to be set like
the serialize before/after flags, etc., without having to change the
constructor.
2010-08-23 09:44:19 -07:00
Nathan Binkert 13d64906c2 copyright: Change HP copyright on x86 code to be more friendly 2010-05-23 22:44:15 -07:00
Gabe Black 53086dfefe X86: Make x86 use PREFETCH instead of PF_EXCLUSIVE. 2009-11-08 22:49:57 -08:00
Gabe Black d0d597004f X86: Preserve the NO_ACCESS flag when giving CDA a specialized interface. 2009-08-23 14:16:58 -07:00
Gabe Black ba6b8389ee X86: Take limitted advantage of the compilers type checking for microop operands. 2009-07-16 09:29:29 -07:00
Gabe Black ee7055c289 X86: Put the StoreCheck flag with the others, and don't collide with other flags. 2009-04-23 01:43:00 -07:00
Gabe Black d90456a486 X86: Implement the stul microop.
This microop does a store and unlocks the requested address. The RISC86
microop ISA doesn't seem to have an equivalent to this, so I'm guessing that
the store following an ldstl is automatically unlocking. We don't do it this
way for performance reasons since the behavior is the same.
2009-04-19 04:55:58 -07:00
Gabe Black d2554ff030 X86: Implement the ldstl microop.
This microop does a load, checks that a store would succeed, and locks the
requested address.
2009-04-19 04:55:43 -07:00
Gabe Black 35eea4191b X86: LEA calculates an address before segmentation. 2009-04-19 03:24:51 -07:00
Gabe Black 8a1eb7e8be X86: Take address size into account when computing an effective address. 2009-02-27 09:25:16 -08:00
Gabe Black 9dfa3f7f73 X86: Fix segment limit checks. 2009-02-27 09:23:50 -08:00
Gabe Black 06ff83e1b9 X86: Implement a basic prefetch instruction. 2009-02-25 10:19:22 -08:00
Gabe Black 5f0428ef9f X86: Use the right portion of a register for stores. 2009-02-25 10:19:14 -08:00
Gabe Black dc53ca89f6 X86: Add a flag to force memory accesses to happen at CPL 0. 2009-02-25 10:18:22 -08:00
Gabe Black 1b336a8fe7 X86: Make the stupd microop not update registers in initiateAcc. 2009-02-25 10:15:56 -08:00
Gabe Black a1aba01a02 CPU: Get rid of translate... functions from various interface classes. 2009-02-25 10:15:34 -08:00
Gabe Black 115b1a7ed3 X86: Autogenerate macroop generateDisassemble function. 2009-01-06 22:55:27 -08:00
Gabe Black 8c15518f30 X86: Fix completeAcc get call. 2008-11-09 21:55:43 -08:00
Gabe Black 98d2ca403e X86: Implement the INVLPG instruction and the TIA microop.
--HG--
extra : convert_revision : 31db1ee082f6c3ca5443cba1eb335e408661ead2
2008-02-26 23:39:22 -05:00
Gabe Black aaa30714b3 X86: Various fixes to indexing segmentation related registers
--HG--
extra : convert_revision : 3d45da3a3fb38327582cfdfb72cfc4ce1b1d31af
2007-11-12 14:37:54 -08:00
Gabe Black 421aea980f X86: Implement the cda microop which checks if an address is legal to write to.
--HG--
extra : convert_revision : afe20649180dd59ad0702b98f7293be6c9226359
2007-10-22 14:30:56 -07:00
Gabe Black 4d15e4cf7b X86: Implement the stupd microop ("store with update", not "stupid") and use it in ENTER.
--HG--
extra : convert_revision : 9151f701162d31ef26298497467c42b7b0ed85d5
2007-10-21 18:44:50 -07:00
Gabe Black 9498e536c0 X86: Implement MSR reads and writes and the wrsmr and rdmsr instructions.
There are no priviledge checks, so these instructions will all work in all
modes.

--HG--
extra : convert_revision : ff893eb569313d8aecbfffb47bcbd1c2d65cd393
2007-10-12 16:37:55 -07:00
Gabe Black 7c521db9de X86: Implement the ldst microop and put it in existing microcode where appropriate.
--HG--
extra : convert_revision : f08bd725d07a501bb7a0ce91590b5d37db99c6f3
2007-10-02 22:08:09 -07:00
Gabe Black 22830c0747 X86: Add load and store microops that use the fp registers.
--HG--
extra : convert_revision : 153a055e888d8c47d59758a599dbd38f63008137
2007-08-29 20:36:12 -07:00
Gabe Black fcd04f953c X86: Remove x86 code that attempted to fix misaligned accesses.
--HG--
extra : convert_revision : 42f68010e6498aceb7ed25da278093e99150e4df
2007-08-26 20:30:36 -07:00
Gabe Black 802f13e6bd X86: Make 64 bit unaligned accesses work as well as the other sizes.
There is a fundemental flaw in how unaligned accesses are supported, but this
is still an improvement.

--HG--
extra : convert_revision : 1c20b524ac24cd4a812c876b067495ee6a7ae29f
2007-08-04 20:22:20 -07:00
Gabe Black e410a925df X86: Start implementing segmentation support.
Make instructions observe segment prefixes, default segment rules, segment
base addresses.
Also fix some microcode and add sib and riprel "keywords" to the x86
specialization of the microassembler.

--HG--
extra : convert_revision : be5a3b33d33f243ed6e1ad63faea8495e46d0ac9
2007-08-04 20:12:54 -07:00
Gabe Black e5e5b0119d X86: Fix for compilation bug with new cache code.
--HG--
extra : convert_revision : 073c6db0796cd2c11b8293b382b438a2a959b821
2007-08-01 12:49:58 -07:00
Gabe Black c0670187c5 X86: Add functions to read and write to an exec context.
These functions take care of calling the thread contexts read and write functions with the right sized data type, and handle unaligned accesses.

--HG--
extra : convert_revision : b4b59ab2b22559333035185946bae3eab316c879
2007-07-26 22:08:35 -07:00
Gabe Black f09847c7a6 Make load and store ops use the appropriate sized data access.
--HG--
extra : convert_revision : 6b808586fab10ca433ef04b062bf701b906634b9
2007-07-20 15:02:09 -07:00
Gabe Black cfadef74d1 x86 fixes
Make the emulation environment consider the rex prefix.
Implement and hook in forms of j, jmp, cmp, syscall, movzx
Added a format for an instruction to carry a call to the SE mode syscalls system
Made memory instructions which refer to the rip do so directly
Made the operand size overridable in the microassembly
Made the "ext" field of register operations 16 bits to hold a sparse encoding of flags to set or conditions to predicate on
Added an explicit "rax" operand for the syscall format
Implemented syscall returns.

--HG--
extra : convert_revision : ae84bd8c6a1d400906e17e8b8c4185f2ebd4c5f2
2007-07-19 15:15:47 -07:00
Gabe Black 05a33a443f Make store microops actually store instead of load.
--HG--
extra : convert_revision : fe90f8adc96dd0e680cfa45e4c510a906046ae3d
2007-07-18 17:45:06 -07:00
Gabe Black 4f7809d5e6 Pull some hard coded base classes out of the isa description.
--HG--
rename : src/arch/x86/isa/base.isa => src/arch/x86/isa/outputblock.isa
extra : convert_revision : 7954e7d5eea3b5966c9e273a08bcd169a39f380c
2007-07-14 17:14:19 -07:00
Gabe Black a68ddf685c Make memory instructions work better, add more macroop implementations, add an lea microop, move EmulEnv into it's own .cc and .hh.
--HG--
extra : convert_revision : 1212b8463eab1c1dcba7182c487d1e9184cf9bea
2007-06-20 15:02:50 +00:00
Gabe Black 6e286cddfa Get rid of the immediate and displacement components of the EmulEnv struct and use them directly out of the instruction. The extra copies are conceptually realistic but are just innefficient as implemented. Also don't use the zeroeth microcode register for general storage since it's now the zero register, and implement a load and a store microops.
--HG--
extra : convert_revision : 0686296ca8b72940d961ecc6051063bfda1e932d
2007-06-19 14:18:25 +00:00