sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Andreas Sandberg	8d2c3735d9	arch: Include generated decoder header after normal headers The generated decoder header defines macros that represent bit fields within instructions. These fields typically have short names that conflict with names in other header files. Include the generated header after all normal header to avoid this issue. Change-Id: I53d149b75432c20abdbf651e32c3c785d897973b Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>	2017-02-27 12:06:00 +00:00
Fernando Endo	6c72c35519	cpu, arm: Distinguish Float* and SimdFloat, create FloatMem opClass Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which distinguishes writes to the INT and FP register banks. Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72, where the "latency" of FMADD is 3 if the next instruction is a FMADD and has only the augend to destination dependency, otherwise it's 7 cycles. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 14:58:45 -05:00
Steve Reinhardt	5200e04e92	arch, x86: add support for arrays as memory operands Although the cache models support wider accesses, the ISA descriptions assume that (for the most part) memory operands are integer types, which makes it difficult to define instructions that do memory accesses larger than 64 bits. This patch adds some generic support for memory operands that are arrays of uint64_t, and specifically a 'u2qw' operand type for x86 that is an array of 2 uint64_ts (128 bits). This support is unused at this point, but will be needed shortly for cmpxchg16b. Ideally the 128-bit SSE memory accesses will also be rewritten to use this support. Support for 128-bit accesses could also have been added using the gcc __int128_t extension, which would have been less disruptive. However, although clang also supports __int128_t, it's still non-standard. Also, more importantly, this approach creates a path to defining 256- and 512-byte operands as well, which will be useful for eventual AVX support.	2016-02-06 17:21:20 -08:00
Steve Reinhardt	f5343df1e1	arch: get rid of dummy var init MemOperand variables were being initialized to 0 "to avoid 'uninitialized variable' errors" but these no longer seem to be a problem (with the exception of one use case in POWER that is arguably broken and easily fixed here). Getting rid of the initialization is necessary to set up a subsequent patch which extends memory operands to possibly not be scalars, making the '= 0' initialization no longer feasible.	2016-02-06 17:21:20 -08:00
Steve Reinhardt	90c279e4b1	arch: clean up isa_parser error handling Although some decent error messages were getting generated inside isa_parser.py, they weren't always getting printed because of the screwy way we were handling exceptions. (Basically an inner exception would get hidden by an outer exception, and the more informative inner error message would not get printed.) Also line numbers were messed up, since they were taken from the lexer, which is typically a token (or more) ahead of the grammar rule that's being matched. Using the 'lineno' attribute that PLY associates with the grammar production is more accurate. The new LineTracker class extends lineno to track filenames as well as line numbers.	2015-10-06 17:26:50 -07:00
Nilay Vaish	aafa5c3f86	revert 5af8f40d8f2c	2015-07-28 01:58:04 -05:00
Nilay Vaish	608641e23c	cpu: implements vector registers This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now.	2015-07-26 10:21:20 -05:00
Gabe Black	3069c28a02	arch: Allow named constants as decode case values. The values in a "bitfield" or in an ExtMachInst structure member may not be a literal value, it might select from an arbitrary collection of options. Instead of using the raw value of those constants in the decoder, it's easier to tell what's going on if they can be referred to as a symbolic constant/enum. To support that, the ISA description language is extended slightly so that in addition to integer literals, the case value for decode blobs can also be a string literal. It's up to the ISA author to ensure that the string evaluates to a legal constant value when interpreted as C++.	2014-12-04 15:52:48 -08:00
Andreas Hansson	deb2200671	scons: Address issues related to gcc 4.9.1 Fix a few minor issues to please gcc 4.9.1. Removing the '-fuse-linker-plugin' flag means no libraries are part of the LTO process, but hopefully this is an acceptable loss, as the flag causes issues on a lot of systems (only certain combinations of gcc, ld and ar work).	2014-09-27 09:08:34 -04:00
Mitch Hayenga	fd722946dd	arch: Properly guess OpClass from optional StaticInst flags isa_parser.py guesses the OpClass if none were given based upon the StaticInst flags. The existing code does not take into account optionally set flags. This code hoists the setting of optional flags so OpClass is properly assigned.	2014-09-03 07:42:32 -04:00
Andreas Sandberg	326662b01b	arch, cpu: Factor out the ExecContext into a proper base class We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoids indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%.	2014-09-03 07:42:22 -04:00
Curtis Dunham	fe27f937aa	arch: teach ISA parser how to split code across files This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory. The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect. Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser. Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) 
Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build. Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution.	2014-05-09 18:58:47 -04:00
Curtis Dunham	e651188f75	arch: remove 'null update' check in isa-parser SCons already does this for all build steps.	2014-04-23 05:17:57 -04:00
Yasuko Eckert	2c293823aa	cpu: add a condition-code register class Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.	2013-10-15 14:22:44 -04:00
Steve Reinhardt	219c423f1f	cpu: rename _DepTag constants to _Reg_Base Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;	2013-10-15 14:22:43 -04:00
Nilay Vaish	3700e5448a	ISA Parser: Allow predication of source and destination registers This patch is meant for allowing predicated reads and writes. Note that this predication is different from the ISA provided predication. The way we currently provide the ISA description for X86, we read/write registers that do not need to be actually read/written. This is likely to be true for other ISAs as well. This patch allows for read and write predicates to be associated with operands. It allows for the register indices for source and destination registers to be decided at the time when the microop is constructed. The run time indices come into play only when at least one of the predicates has been provided. This patch will not affect any of the ISAs that do not provide these predicates. Also the patch assumes that the order in which operands appear in any function of the microop is the same across all the functions of the microops. A subsequent patch will enable predication for the x86 ISA.	2012-06-03 10:59:04 -05:00
Ali Saidi	6df196b71e	O3: Clean up the O3 structures and try to pack them a bit better. DynInst is extremely large; the hope is that this re-organization will put the most used members close to each other.	2012-06-05 01:23:09 -04:00
Gabe Black	eae1e97fb0	ISA: Make the decode function part of the ISA's decoder.	2012-05-25 00:55:24 -07:00
Gabe Black	997cbe1c09	ISA parser: Use '_' instead of '.' to delimit type modifiers on operands. By using an underscore, the "." is still available and can unambiguously be used to refer to members of a structure if an operand is a structure, class, etc. This change mostly just replaces the appropriate "."s with "_"s, but there were also a few places where the ISA descriptions were handling the extensions themselves and had their own regular expressions to update. The regular expressions in the isa parser were updated as well. It also now looks for one of the defined type extensions specifically after connecting "_" where before it would look for any sequence of characters after a "." following an operand name and try to use it as the extension. This helps to disambiguate cases where a "_" may legitimately be part of an operand name but not separate the name from the type suffix. Because leaving the "_" and suffix on the variable name still leaves a valid C++ identifier and all extensions need to be consistent in a given context, I considered leaving them on as a breadcrumb that would show what the intended type was for that operand. Unfortunately the operands can be referred to in code templates, the Mem operand in particular, and since the exact type of Mem can be different for different uses of the same template, that broke things.	2011-09-26 23:48:54 -07:00
Gabe Black	f370ac5c18	ISA parser: Don't look for operands in strings.	2011-09-08 03:21:14 -07:00
Gabe Black	f4dc64655f	ISA parser: Match /* */ and // style comments. Comments should not be scanned for operands, and we should look for both /* */ style and // style.	2011-09-08 03:20:05 -07:00
Gabe Black	a7dcd19fa0	ISA: Get rid of the unused mem_acc_type template parameter.	2011-07-11 04:47:06 -07:00
Nathan Binkert	3d252f8e5f	grammar: better encapsulation of a grammar and parsing This makes it possible to use the grammar multiple times and use the multiple instances concurrently. This makes implementing an include statement as part of a grammar possible.	2011-07-05 18:30:04 -07:00
Gabe Black	63a934d152	ISA parser: Define operand types with a ctype directly.	2011-07-05 16:52:15 -07:00
Gabe Black	f16179eb21	ISA parser: Simplify operand type handling. This change simplifies the code surrounding operand type handling and makes it depend only on the ctype that goes with each operand type. Future changes will allow defining operand types by their ctypes directly, convert the ISAs over to that style of definition, and then remove support for the old style. These changes are to make it easier to use non-builtin types like classes or structures as the type for operands.	2011-07-05 16:48:18 -07:00
Gabe Black	ab3704170e	ISA parser: Loosen the regular expressions matching filenames. The regular expressions matching filenames in the ##include directives and the internally generated ##newfile directives were only looking for filenames composed of alphanumeric characters, periods, and dashes. In Unix/Linux, the rules for what characters can be in a filename are much looser than that. This change replaces those expressions with ones that look for anything other than a quote character. Technically quote characters are allowed as well so we should allow escaping them somehow, but the additional complexity probably isn't worth it.	2011-06-07 00:46:54 -07:00
Gabe Black	57ed5e77fe	ISA parser: Set up op_src_decl and op_dest_decl for pc operands.	2011-03-24 13:55:16 -04:00
Steve Reinhardt	d650f4138e	scons: show sources and targets when building, and colorize output. I like the brevity of Ali's recent change, but the ambiguity of sometimes showing the source and sometimes the target is a little confusing. This patch makes scons typically list all sources and all targets for each action, with the common path prefix factored out for brevity. It's a little more verbose now but also more informative. Somehow Ali talked me into adding colors too, which is a whole 'nother story.	2011-01-07 21:50:13 -08:00
Gabe Black	4c9b023a7a	ISA: Get the parser to support pc state components more elegantly.	2010-12-07 23:08:05 -08:00
Ali Saidi	d4767f440a	SCons: Cleanup SCons output during compile	2010-11-15 14:04:04 -06:00
Gabe Black	6f4bd2c1da	ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors. This change is a low level and pervasive reorganization of how PCs are managed in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about, the PC and the NPC, and the lsb of the PC signaled whether or not you were in PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next micropc, x86 and ARM introduced variable length instruction sets, and ARM started to keep track of mode bits in the PC. Each CPU model handled PCs in its own custom way that needed to be updated individually to handle the new dimensions of variability, or, in the case of ARM's mode-bit-in-the-pc hack, the complexity could be hidden in the ISA at the ISA implementation's expense. Areas like the branch predictor hadn't been updated to handle branch delay slots or micropcs, and it turns out that had introduced a significant (10s of percent) performance bug in SPARC and to a lesser extent MIPS. Rather than perpetuate the problem by reworking O3 again to handle the PC features needed by x86, this change was introduced to rework PC handling in a more modular, transparent, and hopefully efficient way. PC type: Rather than having the superset of all possible elements of PC state declared in each of the CPU models, each ISA defines its own PCState type which has exactly the elements it needs. A cross product of canned PCState classes are defined in the new "generic" ISA directory for ISAs with/without delay slots and microcode. These are either typedef-ed or subclassed by each ISA. To read or write this structure through a Context, you use the new pcState() accessor which reads or writes depending on whether it has an argument. If you just want the address of the current or next instruction or the current micro PC, you can get those through read-only accessors on either the PCState type or the Contexts. These are instAddr(), nextInstAddr(), and microPC().
Note the move away from readPC. That name is ambiguous since it's not clear whether or not it should be the actual address to fetch from, or if it should have extra bits in it like the PAL mode bit. Each class is free to define its own functions to get at whatever values it needs however it needs to to be used in ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the PC and into a separate field like ARM. These types can be reset to a particular pc (where npc = pc + sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as appropriate), printed, serialized, and compared. There is a branching() function which encapsulates code in the CPU models that checked if an instruction branched or not. Exactly what that means in the context of branch delay slots which can skip an instruction when not taken is ambiguous, and ideally this function and its uses can be eliminated. PCStates also generally know how to advance themselves in various ways depending on if they point at an instruction, a microop, or the last microop of a macroop. More on that later. Ideally, accessing all the PCs at once when setting them will improve performance of M5 even though more data needs to be moved around. This is because often all the PCs need to be manipulated together, and by getting them all at once you avoid multiple function calls. Also, the PCs of a particular thread will have spatial locality in the cache. Previously they were grouped by element in arrays which spread out accesses. Advancing the PC: The PCs were previously managed entirely by the CPU which had to know about PC semantics, try to figure out which dimension to increment the PC in, what to set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction with the PC type itself. 
Because most of the information about how to increment the PC (mainly what type of instruction it refers to) is contained in the instruction object, a new advancePC virtual function was added to the StaticInst class. Subclasses provide an implementation that moves around the right element of the PC with a minimal amount of decision making. In ISAs like Alpha, the instructions always simply assign NPC to PC without having to worry about micropcs, nnpcs, etc. The added cost of a virtual function call should be outweighed by not having to figure out as much about what to do with the PCs and mucking around with the extra elements. One drawback of making the StaticInsts advance the PC is that you have to actually have one to advance the PC. This would, superficially, seem to require decoding an instruction before fetch could advance. This is, as far as I can tell, realistic. fetch would advance through memory addresses, not PCs, perhaps predicting new memory addresses using existing ones. More sophisticated decisions about control flow would be made later on, after the instruction was decoded, and handed back to fetch. If branching needs to happen, some amount of decoding needs to happen to see that it's a branch, what the target is, etc. This could get a little more complicated if that gets done by the predecoder, but I'm choosing to ignore that for now. Variable length instructions: To handle variable length instructions in x86 and ARM, the predecoder now takes in the current PC by reference to the getExtMachInst function. It can modify the PC however it needs to (by setting NPC to be the PC + instruction length, for instance). This could be improved since the CPU doesn't know if the PC was modified and always has to write it back. ISA parser: To support the new API, all PC related operand types were removed from the parser and replaced with a PCState type. There are two warts on this implementation. 
First, as with all the other operand types, the PCState still has to have a valid operand type even though it doesn't use it. Second, using syntax like PCS.npc(target) doesn't work for two reasons, this looks like the syntax for operand type overriding, and the parser can't figure out if you're reading or writing. Instructions that use the PCS operand (which I've consistently called it) need to first read it into a local variable, manipulate it, and then write it back out. Return address stack: The return address stack needed a little extra help because, in the presence of branch delay slots, it has to merge together elements of the return PC and the call PC. To handle that, a buildRetPC utility function was added. There are basically only two versions in all the ISAs, but it didn't seem short enough to put into the generic ISA directory. Also, the branch predictor code in O3 and InOrder were adjusted so that they always store the PC of the actual call instruction in the RAS, not the next PC. If the call instruction is a microop, the next PC refers to the next microop in the same macroop which is probably not desirable. The buildRetPC function advances the PC intelligently to the next macroop (in an ISA specific way) so that that case works. Change in stats: There was no change in stats except in MIPS and SPARC in the O3 model. MIPS runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could likely be improved further by setting call/return instruction flags and taking advantage of the RAS. TODO: Add != operators to the PCState classes, defined trivially to be !(a==b). Smooth out places where PCs are split apart, passed around, and put back together later. I think this might happen in SPARC's fault code. Add ISA specific constructors that allow setting PC elements without calling a bunch of accessors. Try to eliminate the need for the branching() function.
Factor out Alpha's PAL mode pc bit into a separate flag field, and eliminate places where it's blindly masked out or tested in the PC.	2010-10-31 00:07:20 -07:00
Gabe Black	1c0d9806e5	ARM: Fix custom writer/reader code for non indexed operands.	2010-06-02 12:57:59 -05:00
Nathan Binkert	629e8df196	isa_parser: move the operand map stuff into the ISAParser class.	2010-02-26 18:14:48 -08:00
Nathan Binkert	4db57edade	isa_parser: move more support functions into the ISAParser class	2010-02-26 18:14:48 -08:00
Nathan Binkert	5ad139375e	isa_parser: move more stuff into the ISAParser class	2010-02-26 18:14:48 -08:00
Nathan Binkert	4ef6e129d6	isa_parser: move the formatMap and exportContext into the ISAParser class	2010-02-26 18:14:48 -08:00
Nathan Binkert	4e105f6fe1	isa_parser: Make stack objects class members instead of globals	2010-02-26 18:14:48 -08:00
Nathan Binkert	b4178b1ae7	isa_parser: add a debug variable that changes how errors are reported. This allows us to get tracebacks in certain cases where they're more useful than our error message.	2010-02-26 18:14:48 -08:00
Nathan Binkert	40a05f04fb	isa_parser: Use an exception to flag error This allows the error to propagate more easily	2010-02-26 18:14:48 -08:00
Nathan Binkert	f82a92925c	isa_parser: Move more stuff into the ISAParser class	2010-02-26 18:14:48 -08:00
Nathan Binkert	f7a627338c	isa_parser: move code around to prepare for putting more stuff in the class	2010-02-26 18:14:48 -08:00
Nathan Binkert	eb4ce01056	isa_parser: simple fixes, formatting and style	2010-02-26 18:14:48 -08:00
Nathan Binkert	52ccfde2cd	isa_parser: allow negative integer literals	2009-11-05 17:21:25 -08:00
Nathan Binkert	708faa7677	compile: wrap 64bit numbers with ULL() so 32bit compiles work In the isa_parser, we need to check case statements.	2009-11-08 13:31:59 -08:00
Timothy M. Jones	835a55e7f3	POWER: Add support for the Power ISA This adds support for the 32-bit, big endian Power ISA. This supports both integer and floating point instructions based on the Power ISA Book I v2.06.	2009-10-27 09:24:39 -07:00
Nathan Binkert	baca1f0566	isa_parser: Turn the ISA Parser into a subclass of Grammar. This is to prepare for future cleanup where we allow SCons to create a separate grammar class for each ISA	2009-09-23 18:28:29 -07:00
Nathan Binkert	e9288b2cd3	scons: add slicc and ply to sys.path and PYTHONPATH so everyone has access	2009-09-22 15:24:16 -07:00
Gabe Black	dc0a017ed0	isa_parser: Get rid of the now unused ControlBitfieldOperand.	2009-07-20 20:20:17 -07:00
Gabe Black	60577eb4ca	ISAs: Get rid of the IControl operand type. A separate operand type is not necessary to use two bitfields to generate the index.	2009-07-10 01:21:04 -07:00
Gabe Black	25884a8773	Registers: Get rid of the float register width parameter.	2009-07-08 23:02:20 -07:00
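
Commit ab3704170e in the log above describes loosening the filename patterns for ##include directives from "alphanumeric characters, periods, and dashes" to "anything other than a quote character". The Python sketch below illustrates that difference only. Both patterns and the `included_files` helper are hypothetical and written for this example; they are not copied from isa_parser.py, whose actual expressions may differ.

```python
import re

# Hypothetical "strict" pattern: filenames limited to alphanumerics,
# periods, and dashes, as the commit message describes the old behavior.
STRICT_INCLUDE = re.compile(
    r'^[ \t]*##include[ \t]+"(?P<filename>[A-Za-z0-9.\-]+)"[ \t]*$',
    re.MULTILINE,
)

# Hypothetical "loose" pattern: any run of characters other than a quote.
LOOSE_INCLUDE = re.compile(
    r'^[ \t]*##include[ \t]+"(?P<filename>[^"]*)"[ \t]*$',
    re.MULTILINE,
)


def included_files(text, pattern):
    """Return the filenames named by ##include lines that `pattern` matches."""
    return [m.group("filename") for m in pattern.finditer(text)]


# Illustrative input: the second filename contains a slash, a space, and a plus.
sample = '##include "decoder.isa"\n##include "formats/mem ops+v2.isa"\n'

strict_hits = included_files(sample, STRICT_INCLUDE)
loose_hits = included_files(sample, LOOSE_INCLUDE)
```

With this input the strict pattern skips the second directive, while the loose one accepts it. As the commit message notes, a filename containing a quote character would still not be matched by the loose form.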


83 commits