sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Marco Elver	9649395f85	cpu, o3: Ignored invalidate causing same-address load reordering In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later load's fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.	2014-12-02 06:08:03 -05:00
Andreas Hansson	74bbe20141	cpu: Always mask the snoop address when performing lock check Ensure the snoop address check is always using a cache-block aligned address. This patch updates Alpha and Mips to match the other ISAs.	2014-12-02 06:08:00 -05:00
Stephan Diestelhorst	810349a8a7	cpu: Move packet deallocation to recvTimingResp in the O3 CPU Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.	2014-12-02 06:07:58 -05:00
Andreas Hansson	5c84157c29	mem: Relax packet src/dest check and shift onus to crossbar This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars. The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits.	2014-12-02 06:07:56 -05:00
Andreas Hansson	ea5ccc7041	mem: Clean up packet data allocation This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data). The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations. All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not).	2014-12-02 06:07:54 -05:00
Andreas Hansson	f012166bb6	mem: Cleanup Packet::checkFunctional and hasData usage This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading.	2014-12-02 06:07:52 -05:00
Andreas Hansson	a2ee51f631	mem: Make the requests carried by packets const This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly.	2014-12-02 06:07:50 -05:00
Andreas Hansson	fa60d5cf27	mem: Make Request getters const This patch tidies up the Request class, making all getters const. The odd one out is incAccessDepth which is called by the memory system as packets carry the request around. This is also const to enable the packet to hold on to a const Request.	2014-12-02 06:07:48 -05:00
Andreas Hansson	3d6ec81e66	mem: Add checks and explanation for assertMemInhibit usage	2014-12-02 06:07:46 -05:00
Andreas Hansson	41846cb61b	mem: Assume all dynamic packet data is array allocated This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic were in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part of the patch, it renames dataDynamicArray to dataDynamic.	2014-12-02 06:07:43 -05:00
Andreas Hansson	5df96cb690	mem: Remove redundant Packet::allocate calls This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions. The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch.	2014-12-02 06:07:41 -05:00
Andreas Hansson	0706a25203	mem: Use const pointers for port proxy write functions This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works). The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage.	2014-12-02 06:07:38 -05:00
Andreas Hansson	9779ba2e37	mem: Add const getters for write packet data This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.	2014-12-02 06:07:36 -05:00
Andreas Hansson	25bfc24999	mem: Remove null-check bypassing in Packet::getPtr This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone objects or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.	2014-12-02 06:07:34 -05:00
Omar Naji	0e63d2cd62	mem: Add a GDDR5 DRAM config This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies.	2014-12-02 06:07:32 -05:00
Andreas Hansson	b0aa5a326d	stats: Bump stats after static analysis fixes Fixing up the uninitialised values changes two of the x86 Linux boot regressions slightly.	2014-11-24 09:03:39 -05:00
Andreas Hansson	d66b14ca61	misc: Another round of static analysis fixups Mostly addressing uninitialised members.	2014-11-24 09:03:38 -05:00
Alexandru Dutu	1f539f13c3	mem: Page Table map api modification This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	c11bcb8119	mem: Multi Level Page Table bug fix The multi level page table was giving false positives for already mapped translations. This patch fixes the bogus behavior.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	e4859fae5b	mem: Page Table long lines Trimmed down all the lines greater than 78 characters.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	a19cf6943b	config, kvm: Enabling KvmCPU in SE mode This patch modifies se.py such that it can now use kvm cpu model.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	f743bdcb69	x86: Segment initialization to support KvmCPU in SE This patch sets up low and high privilege code and data segments and places them in the following order: cs low, ds low, ds, cs, in the GDT. Additionally, a syscall and page fault handler for KvmCPU in SE mode are defined. The order of the segment selectors in GDT is required in this manner for interrupt handling to work properly. Segment initialization is done for all the thread contexts.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	adbaa4dfde	kvm, x86: Adding support for SE mode execution This patch adds methods in KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	335514dfdc	cpuid, x86: Enabling more features in CPUid Adding more features in the CPUid with the purpose of supporting running the KvmCPU in SE mode.	2014-11-23 18:01:08 -08:00
Steve Reinhardt	252a463b6b	Backed out prior changeset f9fb64a72259 Back out use of importlib to avoid implicitly creating dependency on Python 2.7.	2014-11-23 18:00:47 -08:00
Gabe Black	12243a3835	config: ruby: Get rid of an "eval" and an "exec" operating on generated code. We can get the same result using importlib.	2014-11-23 05:55:26 -08:00
Gabe Black	2d2a5aa410	x86: Update stats for the new Linux delay port.	2014-11-21 17:22:19 -08:00
Gabe Black	8bbfb1b39d	x86: pc: Put a stub IO device at port 0xed which the kernel can use for delays. There was already a stub device at 0x80, the port traditionally used for an IO delay. 0x80 is also the port used for POST codes sent by firmware, and that may have prompted adding this port as a second option.	2014-11-21 17:22:02 -08:00
Nilay Vaish	708e80d9bb	configs: small fix to ruby portion of fs.py and se.py In fs.py the io port controller was being attached to the iobus multiple times. This should be done only once. In se.py, the option use_map was being set which no longer exists.	2014-11-18 19:17:29 -06:00
Gabe Black	b5fd6050a2	dev: Use fixed size member variables to describe fixed size PL111 registers.	2014-11-18 02:38:23 -08:00
Gabe Black	a08cfd797b	vnc: Add a conversion function for bgr888.	2014-11-17 01:45:42 -08:00
Gabe Black	aceeecb192	x86: Fix setting segment bases in real mode. The data size used for actually writing the base value for the segment was the default size, but really it should set the entire value without any possible truncation.	2014-11-17 01:00:53 -08:00
Gabe Black	f8603fa120	x86: Fix some bugs in the real mode far jmp instruction. The far pointer should be shifted right to get the selector value, not left. Also, when calculating the width of the offset, the wrong register was used in one spot.	2014-11-17 00:20:01 -08:00
Gabe Black	7739c24fbe	x86: APIC: Only set deliveryStatus if our IPI is going somewhere. Otherwise the IPI which isn't sent will never arrive, and the deliveryStatus bit will never be cleared.	2014-11-17 00:19:07 -08:00
Gabe Black	79e7ca307e	x86: APIC: Fix the getRegArrayBit function. The getRegArrayBit function extracts a bit from a series of registers which are treated as a single large bit array. A previous change had modified the logic which figured out which bit to extract from ">> 5" to "% 5" which seems wrong, especially when other, similar functions were changed to use "% 32".	2014-11-17 00:17:06 -08:00
Gabe Black	994c44035d	x86: Update the stats for the x86 FS o3 boot test.	2014-11-17 00:16:36 -08:00
Gabe Black	d228db1143	x86: Fix the CPUID Long Mode Address Size function. The value in EAX has an 8 bit field for the linear address size and one for the physical address size when calling that function. A recent change implemented it but returned 0xff for both of those fields. That implies that linear and physical addresses are 255 bits wide which is wrong. When using the KVM CPU model this causes an error, presumably because some of those bits are actually reserved, or the CPU or kernel realizes 255 bits is a bad value. This change makes those values 48.	2014-11-16 23:12:42 -08:00
Andrew Bardsley	27b7b9e561	config: Fix checkpoint restore in C++ config example This patch fixes the checkpoint restore option in the example of C++ configuration (util/cxx_config). The fix introduces a call to config_manager->startup() (which calls startup on all SimObjects managed by that manager) to replicate the loop of SimObject::startup calls in src/python/m5/simulate.py::simulate guarded by need_startup. As util/cxx_config/main.cc is a C++ analogue of src/python/m5/simulate.py, it should make a similar set of calls.	2014-11-14 03:54:02 -05:00
Andreas Hansson	481eb6ae80	arm: Fixes based on UBSan and static analysis Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes are simply ensuring proper initialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.	2014-11-14 03:53:51 -05:00
Andreas Hansson	9ffe0e7ba6	mem: Clarify unit of DRAM controller buffer size	2014-11-14 03:53:48 -05:00
Andreas Hansson	4583a5114a	stats: Bump regressions to match latest changes Updates after timezone hiccup and sorting of dictionary items in the SimObject.	2014-11-12 09:05:25 -05:00
Mitch Hayenga	9d6d8e02aa	mem: Delete unused variable in Garnet NetworkLink With recent changes OSX clang compilation fails due to an unused variable.	2014-11-12 09:05:23 -05:00
Ali Saidi	b6f32253dd	arm: Fix timing wakeup with LLSC	2014-11-12 09:05:22 -05:00
Andreas Hansson	7d05895120	sim: Sort SimObject descendants and ports This patch fixes a number of occurrences where the sorting order of the objects was implementation defined.	2014-11-12 09:05:21 -05:00
Andreas Hansson	cc336ecb5e	base: Revert 9277177eccff and use getenv/setenv for UTC time This patch reverts changeset 9277177eccff which does not do what it was intended to do. In essence, we go back to implementing mkutctime much like the non-standard timegm extension.	2014-11-12 09:05:20 -05:00
Nilay Vaish	02b4605da0	stats: changes to x86 o3 fs and sparc fs regression tests.	2014-11-11 14:17:10 -06:00
Marc Orr	bf80734b2c	x86 isa: This patch attempts an implementation of mwait. Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:22 -06:00
Marc Orr	3947f88d0f	tests: A test program for the new mwait implementation. This is a simple test program for the new mwait implementation. It uses m5threads to create two threads of execution in syscall emulation mode that interact using the mwait instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:21 -06:00
Andrew Lukefahr	bd32d55a2c	cpu: Minor Draining Bug Fixes a bug where Minor drains in the midst of committing a conditional store. While committing a conditional store, lastCommitWasEndOfMacroop is true (from the previous instruction) as we still haven't finished the conditional store. If a drain occurs before the cache response, Minor would check just lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch, which increases the streamSeqNum. This caused the conditional store to be squashed when the memory responded and it completed. However, to the memory the store succeeded, while to the instruction sequence it never occurred. In the case of an LLSC, the instruction sequence will replay the squashed STREX, which will fail as the cache is no longer in LLSC. Then the instruction sequence will loop back to a LDREX, which receives the updated (incorrect) value. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:21 -06:00
Nilay Vaish	a75e27b4a6	stats: updates due to changes to ruby	2014-11-06 05:42:21 -06:00
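
Note on 74bbe20141 (cpu: Always mask the snoop address when performing lock check): the idea of comparing cache-block aligned addresses can be sketched roughly as below. This is not the gem5 code; the names Addr, blockAlign and snoopHitsLock are hypothetical, and aligning the locked address as well is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;  // hypothetical alias, chosen for this sketch

// Round an address down to the start of its cache block.
// block_size is assumed to be a non-zero power of two.
Addr blockAlign(Addr addr, Addr block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    return addr & ~(block_size - 1);
}

// Compare block addresses rather than byte addresses, so that a snoop to
// any byte of the locked block is treated as a hit.
bool snoopHitsLock(Addr snoop_addr, Addr locked_addr, Addr block_size)
{
    return blockAlign(snoop_addr, block_size) ==
           blockAlign(locked_addr, block_size);
}
```

With 64-byte blocks, a snoop to 0x1238 matches a lock held on 0x1200, while a snoop to 0x1240 falls in the next block and does not.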
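
Note on 79e7ca307e (x86: APIC: Fix the getRegArrayBit function): a minimal sketch of the conventional way to index a bit in a series of 32-bit registers treated as one bit array. It only illustrates the arithmetic; the signature below is an assumption and differs from the function in gem5.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Extract bit `index` from N 32-bit registers viewed as one large bit array.
// Hypothetical signature, used only to show the index arithmetic.
template <std::size_t N>
bool getRegArrayBit(const std::array<std::uint32_t, N> &regs, unsigned index)
{
    assert(index < N * 32);
    const unsigned word = index >> 5;  // which register: index / 32
    const unsigned bit = index % 32;   // which bit inside that register
    return ((regs[word] >> bit) & 1u) != 0;
}
```

For index 35 this selects register 1, bit 3. Taking the index modulo 5 instead would give 0 for the same input, which is why "% 5" cannot be the intended bit position.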
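
Note on d228db1143 (x86: Fix the CPUID Long Mode Address Size function): a sketch of how the two 8-bit fields fit into EAX, assuming the usual layout for this CPUID function (physical address size in bits 7:0, linear address size in bits 15:8). The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Pack the physical and linear address sizes (in bits) into an EAX value.
// Assumed layout: bits 7:0 physical size, bits 15:8 linear size.
std::uint32_t packAddressSizes(std::uint8_t phys_bits,
                               std::uint8_t linear_bits)
{
    // Sizes above 64 are rejected here as a sanity check for the sketch.
    assert(phys_bits <= 64 && linear_bits <= 64);
    return (static_cast<std::uint32_t>(linear_bits) << 8) | phys_bits;
}

// Recover the individual fields from a packed EAX value.
std::uint8_t physBits(std::uint32_t eax) { return eax & 0xff; }
std::uint8_t linearBits(std::uint32_t eax) { return (eax >> 8) & 0xff; }
```

Under that layout, 48 bits for both fields gives 0x3030, whereas 0xff in both fields would decode as 255-bit addresses.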
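
Note on 481eb6ae80 (arm: Fixes based on UBSan and static analysis): a sketch of sign extension that returns uint64_t, so that a later left shift operates on an unsigned value. The function name and template form are assumptions for illustration, not the gem5 implementation.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low N bits of val to 64 bits, staying in unsigned
// arithmetic throughout. Hypothetical helper for illustration.
template <unsigned N>
std::uint64_t signExtend(std::uint64_t val)
{
    static_assert(N >= 1 && N <= 64, "N must be in the range 1..64");
    const std::uint64_t mask = ~std::uint64_t{0} >> (64 - N);  // low N bits
    const std::uint64_t sign = std::uint64_t{1} << (N - 1);    // sign bit
    val &= mask;
    return (val & sign) ? (val | ~mask) : val;
}
```

Because the result is unsigned, an expression such as signExtend<8>(0x80) << 1 is well defined, which would not be the case for a negative signed value before C++20.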

10576 commits