sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nilay Vaish	4ccdf8fb81	x86: set op class of two fp instructions This patch sets op class of two fp instructions: movfp and pop x87 stack as IntAluOp since these instructions do not make use of the fp alu.	2014-09-01 16:55:49 -05:00
Curtis Dunham	fe27f937aa	arch: teach ISA parser how to split code across files This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory. The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect. Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser. Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build. Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution.	2014-05-09 18:58:47 -04:00
Curtis Dunham	7f1603d207	arch: remove inline specifiers on all inst constrs, all ISAs With (upcoming) separate compilation, they are useless. Only link-time optimization could re-inline them, but ideally feedback-directed optimization would choose to do so only for profitable (i.e. common) instructions.	2014-05-09 18:58:46 -04:00
Andreas Sandberg	654d1e675a	x86: Add support for loading 32-bit and 80-bit floats in the x87 The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and stores the upper 64-bits or lower 32-bits to an integer register, which is the written to memory using normal integer stores.	2013-09-30 12:00:20 +02:00
Andreas Sandberg	c299dcedc6	x86: Fix re-entrancy problems in x87 store instructions X87 store instructions typically loads and pops the top value of the stack and stores it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.	2013-09-30 11:51:25 +02:00
Andreas Sandberg	d06064c386	x86: Add support for maintaining the x87 tag word The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism. The new tag words is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states.	2013-06-18 16:36:08 +02:00
Andreas Sandberg	5d584934ad	x86: Make fprem like the fprem on a real x87 The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes correupted. This changeset changes the fprem implementation to work like the one on real hardware.	2013-06-18 16:10:42 +02:00
Andreas Sandberg	de89e133d8	x86: Fix the flag handling code in FABS and FCHS This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.	2013-06-18 16:10:21 +02:00
Nilay Vaish	fba40864aa	x86: add op class for int and fp microops in isa description Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class.	2013-05-21 11:33:57 -05:00
Nilay Vaish	5c940fec0a	x86: implement some of the x87 instructions This patch implements ftan, fprem, fyl2x, fld* floating-point instructions.	2013-03-11 13:15:46 -05:00
Nilay Vaish	91b00d98a5	x86: implement fabs, fchs instructions	2013-01-15 07:43:19 -06:00
Nilay Vaish	23ba6fc5fb	x86: implement x87 fp instruction fsincos This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:45 -06:00
Nilay Vaish	6369df59c8	x86: Add a separate register for D flag bit The D flag bit is part of the cc flag bit register currently. But since it is not being used any where in the implementation, it creates an unnecessary dependency. Hence, it is being moved to a separate register.	2012-09-11 09:25:43 -05:00
Nilay Vaish	4d4d212ae9	X86: Split Condition Code register This patch moves the ECF and EZF bits to individual registers (ecfBit and ezfBit) and the CF and OF bits to cfofFlag registers. This is being done so as to lower the read after write dependencies on the the condition code register. Ultimately we will have the following registers [ZAPS], [OF], [CF], [ECF], [EZF] and [DF]. Note that this is only one part of the solution for lowering the dependencies. The other part will check whether or not the condition code register needs to be actually read. This would be done through a separate patch.	2012-05-22 11:29:53 -05:00
Andreas Hansson	b6aa6d55eb	clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6 This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb".	2012-04-14 05:43:31 -04:00
Gabe Black	997cbe1c09	ISA parser: Use '_' instead of '.' to delimit type modifiers on operands. By using an underscore, the "." is still available and can unambiguously be used to refer to members of a structure if an operand is a structure, class, etc. This change mostly just replaces the appropriate "."s with "_"s, but there were also a few places where the ISA descriptions where handling the extensions themselves and had their own regular expressions to update. The regular expressions in the isa parser were updated as well. It also now looks for one of the defined type extensions specifically after connecting "_" where before it would look for any sequence of characters after a "." following an operand name and try to use it as the extension. This helps to disambiguate cases where a "_" may legitimately be part of an operand name but not separate the name from the type suffix. Because leaving the "_" and suffix on the variable name still leaves a valid C++ identifier and all extensions need to be consistent in a given context, I considered leaving them on as a breadcrumb that would show what the intended type was for that operand. Unfortunately the operands can be referred to in code templates, the Mem operand in particular, and since the exact type of Mem can be different for different uses of the same template, that broke things.	2011-09-26 23:48:54 -07:00
Gabe Black	9581562e65	X86: Get rid of the flagless microop constructor. This will reduce clutter in the source and hopefully speed up compilation.	2010-08-23 09:44:19 -07:00
Gabe Black	5a1dbe4d99	X86: Consolidate extra microop flags into one parameter. This single parameter replaces the collection of bools that set up various flavors of microops. A flag parameter also allows other flags to be set like the serialize before/after flags, etc., without having to change the constructor.	2010-08-23 09:44:19 -07:00
Nathan Binkert	13d64906c2	copyright: Change HP copyright on x86 code to be more friendly	2010-05-23 22:44:15 -07:00
Gabe Black	ba6b8389ee	X86: Take limitted advantage of the compilers type checking for microop operands.	2009-07-16 09:29:29 -07:00
Gabe Black	115b1a7ed3	X86: Autogenerate macroop generateDisassemble function.	2009-01-06 22:55:27 -08:00
Gabe Black	06d2d54b57	X86: Fix the movfp microop. --HG-- extra : convert_revision : 23829782a2802a97a05e4dfdb5dd38fbe4165a90	2007-10-02 22:58:04 -07:00
Gabe Black	a75b6f5106	X86: Move the fp microops to their own file with their own base classes in C++ and python. --HG-- extra : convert_revision : 9cd223f2005adb36fea2bb56fa39793a58ec958c	2007-09-19 18:27:55 -07:00

23 commits