sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Dam Sunwoo	2c1e344313	cpu: generate SimPoint basic block vector profiles This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch.	2013-04-22 13:20:31 -04:00
Ali Saidi	c9e4678c16	cpu: fix a switching issue with the o3 cpu. This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for a icache access.	2013-04-22 13:20:31 -04:00
Nilay Vaish	ac778b1d02	o3cpu: commit: changes interrupt handling Currently the commit stage keeps a local copy of the interrupt object. Since the interrupt is usually handled several cycles after the commit stage becomes aware of it, it is possible that the local copy of the interrupt object may not be the interrupt that is actually handled. It is possible that another interrupt occurred in the interval between interrupt detection and interrupt handling. This patch creates a copy of the interrupt just before the interrupt is handled. The local copy is ignored.	2013-03-29 14:05:26 -05:00
Andreas Hansson	08c1835bef	cpu: Remove CpuPort and use MasterPort in the CPU classes This patch changes the port in the CPU classes to use MasterPort instead of the derived CpuPort. The functions of the CpuPort are now distributed across the relevant subclasses. The port accessor functions (getInstPort and getDataPort) now return a MasterPort instead of a CpuPort. This simplifies creating derivative CPUs that do not use the CpuPort.	2013-03-26 14:46:42 -04:00
Andreas Hansson	2ca42cd626	cpu: Avoid including inorder TLBUnit to avoid gcc LTO bug This patch comments out the inclusion of the inorder TLBUnit which is only used in the 9-stage pipeline. With the TLBUnit present, gcc >= 4.6 in combination with LTO ends up throwing away the definition of the TLBUnit destructor, and consequently fail to link. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53808 for more details about the bug, and http://gcc.gnu.org/ml/gcc/2012-06/msg00397.html for the discussion thread that also touches on similar issues seen with clang.	2013-03-20 06:41:23 -04:00
Andreas Sandberg	fc6f569d94	cpu: Fix state transition bug in the traffic generator The traffic generator used to incorrectly determine the next state in when state 0 had a non-zero probability. Due to the way the next transition was determined, state 0 could never be entered other than as an initial state. This changeset updates the transitition() method to correctly handle such cases and cases where the transition matrix is a 1x1 matrix.	2013-03-12 18:41:29 +01:00
Ali Saidi	f205d83359	cpu: fix a switching issue with the o3 cpu. This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for a icache access.	2013-03-04 23:33:47 -05:00
Andreas Hansson	a62afd094b	scons: Fix warnings issued by clang 3.2svn (XCode 4.6) This patch fixes the warnings that clang3.2svn emit due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.	2013-02-19 05:56:08 -05:00
Andreas Hansson	319443d42d	scons: Add warning for missing declarations This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.	2013-02-19 05:56:07 -05:00
Andreas Hansson	c10098f28b	scons: Fix up numerous warnings about name shadowing This patch address the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.	2013-02-19 05:56:06 -05:00
Andreas Hansson	5c7ebee434	x86: Move APIC clock divider to Python This patch moves the 16x APIC clock divider to the Python code to avoid the post-instantiation modifications to the clock. The x86 APIC was the only object setting the clock after creation time and this required some custom functionality and configuration. With this patch, the clock multiplier is moved to the Python code and the objects are instantiated with the appropriate clock.	2013-02-19 05:56:06 -05:00
Andreas Hansson	0622f30961	mem: Add predecessor to SenderState base class This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.	2013-02-19 05:56:05 -05:00
Andreas Sandberg	3af59ab386	cpu: Document exec trace flags	2013-02-15 17:40:10 -05:00
Geoffrey Blake	ca96e7bff1	cpu: Avoid duplicate entries in tracking structures for writes to misc regs setMiscReg currently makes a new entry for each write to a misc reg without checking for duplicates, this can cause a triggering of the assert if an instruction get replayed and writes to the same misc regs multiple times. This fix prevents duplicate entries and instead updates the value.	2013-02-15 17:40:10 -05:00
Geoffrey Blake	8e79c68936	cpu: Fix rename mis-handling serializing instructions when resource constrained The rename can mis-handle serializing instructions (i.e. strex) if it gets into a resource constrained situation and the serializing instruction has to be placed on the skid buffer to handle blocking. In this situation the instruction informs the pipeline it is serializing and logs that the next instruction must be serialized, but since we are blocking the pipeline defers this action to place the serializing instruction and incoming instructions into the skid buffer. When resuming from blocking, rename will pull the serializing instruction from the skid buffer and the current logic will see this as the "next" instruction that has to be serialized and because of flags set on the serializing instruction, it passes through the pipeline stage as normal and resets rename to non-serializing. This causes instructions to follow the serializing inst incorrectly and eventually leads to an error in the pipeline. To fix this rename should check first if it has to block before checking for serializing instructions.	2013-02-15 17:40:10 -05:00
Matt Horsnell	e88e7d88b9	o3: fix tick used for renaming and issue with range selection Fixes the tick used from rename: - previously this gathered the tick on leaving rename which was always 1 less than the dispatch. This conflated the decode ticks when back pressure built in the pipeline. - now picks up tick on entry. Added --store_completions flag: - will additionally display the store completion tail in the viewer. - this highlights periods when large numbers of stores are outstanding (>16 LSQ blocking) Allows selection by tick range (previously this caused an infinite loop)	2013-02-15 17:40:09 -05:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Andreas Sandberg	1eec115c31	cpu: Refactor memory system checks CPUs need to test that the memory system is in the right mode in two places, when the CPU is initialized (unless it's switched out) and on a drainResume(). This led to some code duplication in the CPU models. This changeset introduces the verifyMemoryMode() method which is called by BaseCPU::init() if the CPU isn't switched out. The individual CPU models are responsible for calling this method when resuming from a drain as this code is CPU model specific.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7f1263f144	cpu: Make checker CPUs inherit from CheckerCPU in the Python hierarchy Checker CPUs currently don't inherit from the CheckerCPU in the Python object hierarchy. This has two consequences: * It makes CPU model discovery from the Python world somewhat complicated as there is no way of testing if a CPU is a checker. * Parameters are duplicated in the checker configuration specification. This changeset makes all checker CPUs inherit from the base checker CPU class.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7cd1fd4324	cpu: Add CPU metadata om the Python classes The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?	2013-02-15 17:40:08 -05:00
Ali Saidi	4412046041	cpu: include set in o3/commit_impl. While the majority of compilers seemed to pickup set from else where, one version of gcc 4.7 complains, so explictly add it.	2013-02-15 17:40:08 -05:00
Ali Saidi	7ae06a3b3b	cpu: fix case with o3 cpu blocking and unblocking decode in cycle Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.	2013-02-15 17:40:08 -05:00
Ali Saidi	b84bd3028c	cpu: Fix a livelock in the o3 cpu. Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.	2013-02-15 17:40:07 -05:00
Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)	dbeabedaf0	branch predictor: move out of o3 and inorder cpus This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh	2013-01-24 12:28:51 -06:00
Andrea Pellegrini	11d5ffa108	o3 cpu: fix zero reg problem There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.	2013-01-22 00:13:28 -06:00
Nilay Vaish	fc57ae6401	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.	2013-01-22 00:10:10 -06:00
Joel Hestness	1429d21244	O3 IEW: Make incrWb and decrWb clearer Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".	2013-01-19 15:14:54 -06:00
Nilay Vaish	5b6f972750	ruby: remove calls to g_system_ptr->getTime() This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.	2013-01-17 13:10:12 -06:00
Nilay Vaish	f7c0ba406e	base simple cpu: removes commented out code about cache ops	2013-01-12 22:11:16 -06:00
Nilay Vaish	25ec278a0b	x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.	2013-01-12 22:09:48 -06:00
Andreas Sandberg	009970f59b	cpu: Unify the serialization code for all of the CPU models Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.	2013-01-07 13:05:52 -05:00
Andreas Sandberg	e09e9fa279	cpu: Flush TLBs on switchOut() This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.	2013-01-07 13:05:48 -05:00
Andreas Sandberg	1814a85a05	cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	9e8003148f	cpu: Make sure that a drained atomic CPU isn't executing ucode Currently, the atomic CPU can be in the middle of a microcode sequence when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * Since curMacroStaticInst is populated when executing microcode, repeated switching between CPUs executing microcode leads to incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset fixes a bug where the multiple switches to the same atomic CPU sometimes corrupts the target state because of dangling pointers to the currently executing microinstruction. Note: This changeset moves tick event descheduling from switchOut() to drain(), which makes timing consistent between just draining a system and draining /and/ switching between two atomic CPUs. This makes debugging quite a lot easier (execution traces get the same timing), but the latency of the last instruction before a drain will not be accounted for correctly (it will always be 1 cycle). Note 2: This changeset removes so_state variable, the locked variable, and the tickEvent from checkpoints since none of them contain state that needs to be preserved across checkpoints. The so_state is made redundant because we don't use the drain state variable anymore, the lock variable should never be set when the system is drained, and the tick event isn't scheduled.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	f9bcf46371	cpu: Make sure that a drained timing CPU isn't executing ucode Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out with while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	52ff37caa3	cpu: Fix broken thread context handover The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fca4fea769	cpu: Fix O3 LSQ debug dumping constness and formatting	2013-01-07 13:05:46 -05:00
Andreas Sandberg	8db27aa230	cpu: Fix broken squashAfter implementation in O3 CPU Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	a2077ccf02	o3 cpu: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	2cfe62adc4	cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	f7da0fddd1	cpu: Remove unused params.hh header file in inorder CPU	2013-01-07 13:05:45 -05:00
Andreas Sandberg	a7e0cbeb36	cpu: Introduce sanity checks when switching between CPUs This patch introduces the following sanity checks when switching between CPUs: * Check that the set of new and old CPUs do not overlap. Having an overlap between the set of new CPUs and the set of old CPUs is currently not supported. Doing such a switch used to result in the following assertion error: BaseCPU::takeOverFrom(BaseCPU): \ Assertion `!new_itb_port->isConnected()' failed. Check that all new CPUs are in the switched out state. * Check that all old CPUs are in the switched in state.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	901258c22b	cpu: Correctly call parent on switchOut() and takeOverFrom() This patch cleans up the CPU switching functionality by making sure that CPU models consistently call the parent on switchOut() and takeOverFrom(). This has the following implications that might alter current functionality: * The call to BaseCPU::switchout() in the O3 CPU is moved from signalDrained() (!) to switchOut(). * A call to BaseSimpleCPU::switchOut() is introduced in the simple CPUs.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	4ae02295d5	cpu: Unify SimpleCPU and O3 CPU serialization code The O3 CPU used to copy its thread context to a SimpleThread in order to do serialization. This was a bit of a hack involving two static SimpleThread instances and a magic constructor that was only used by the O3 CPU. This patch moves the ThreadContext serialization code into two global procedures that, in addition to the normal serialization parameters, take a ThreadContext reference as a parameter. This allows us to reuse the serialization code in all ThreadContext implementations.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	6daada2701	cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	e2dad8236a	cpu: Implement a flat register interface in thread contexts Some architectures map registers differently depending on their mode of operations. There is currently no architecture independent way of accessing all registers. This patch introduces a flat register interface to the ThreadContext class. This interface is useful, for example, when serializing or copying thread contexts.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	17b47d35e1	arch: Move the ISA object to a separate section After making the ISA an independent SimObject, it is serialized automatically by the Python world. Previously, this just resulted in an empty ISA section. This patch moves the contents of the ISA to that section and removes the explicit ISA serialization from the thread contexts, which makes it behave like a normal SimObject during serialization. Note: This patch breaks checkpoint backwards compatibility! Use the cpt_upgrader.py utility to upgrade old checkpoints to the new format.	2013-01-07 13:05:42 -05:00
Andreas Sandberg	7eb0fb8b6e	cpu: Check that the memory system is in the correct mode This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).	2013-01-07 13:05:41 -05:00
Andreas Hansson	ccb6c64047	cpu: Share the send functionality between traffic generators This patch moves the packet creating and sending to a member function in the shared base class to avoid code duplication.	2013-01-07 13:05:37 -05:00
Andreas Hansson	1da209140c	cpu: Add support for protobuf input for the trace generator This patch adds support for reading input traces encoded using protobuf according to what is done in the CommMonitor. A follow-up patch adds a Python script that can be used to convert the previously used ASCII traces to protobuf equivalents. The appropriate regression input is updated as part of this patch.	2013-01-07 13:05:37 -05:00

1 2 3 4 5 ...

1343 commits