Commit graph

23 commits

Author SHA1 Message Date
Alexandru Dutu c8cf71f1a0 gpu-compute: Added method to compute the actual workgroup size
This patch adds a method to the Wavefront class to compute the actual workgroup
size. This can be different from the maximum workgroup size specified when
launching the kernel through the NDRange object. Current solution is still not
optimal, as we are computing these for each wavefront and the dispatcher also
needs to have this information and can't actually call
Wavefront::computeActuallWgSz before the wavefronts are being created. A long
term solution would be to have a Workgroup class that deals with all these
details.
2016-10-04 13:03:52 -04:00
Tony Gutierrez 84f9747688 gpu-compute: fix typo in GPUDispatcher 2016-09-16 14:47:19 -04:00
Alexandru Dutu bd65ec0744 gpu-compute: Adding context serialization methods to Wavefront
This patch adds methods to serialize the context of a particular wavefront
to the simulated system memory. Context serialization is used when a wavefront
is preempeted (i.e. context switch).
2016-09-16 12:32:36 -04:00
Alexandru Dutu e9b14d5111 gpu-compute: Refactoring Wavefront::dynWaveId 2016-09-16 12:31:46 -04:00
Alexandru Dutu 498d0e63e5 gpu-compute: Adding vector register file debug messages
This patch introduces DPRINTFs for reading and writing to and from the vector
register file.
2016-09-16 12:30:05 -04:00
Alexandru Dutu 7918376450 gpu-compute: Changing reconvergenceStack type
std::stack has no iterators, therefore the reconvergence stack can't be
iterated without poping elements off. We will be using std::list instead to be
able to iterate for saving and restoring purposes.
2016-09-16 12:29:01 -04:00
Alexandru Dutu d5c8c5d3db gpu-compute: Adding ioctl for HW context size
Adding runtime support for determining the memory required by a SIMD engine
when executing a particular wavefront.
2016-09-16 12:27:56 -04:00
Alexandru Dutu 589e13a23b gpu-compute: Wavefront refactoring
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-16 12:26:52 -04:00
Alexandru Dutu e9fe1b838b gpu-compute: Remove WFContext
WFContext struct is currently unused and it has been rendered not useful in
saving and restoring the context of a Wavefront. Wavefront class should be
sufficient for that purpose and the runtime can figure out the memory size
it will need to allocate for a Wavefront through an IOCTL.
2016-09-16 12:26:03 -04:00
Michael LeBeane 6a668d0c0c gpu-compute: Fix bug with return in cfg
Connecting basic blocks would stop too early in kernels where ret was not the
last instruction.  This patch allows basic blocks after the ret instruction
to be properly connected.
2016-09-13 23:11:20 -04:00
jkalamat 3724fb15fa gpu-compute: parametrize Wavefront size
Eliminate the VSZ constant that defined the Wavefront size (in numbers of work
items); replaced it with a parameter in the GPU.py configuration script.
Changed all data structures dependent on the Wavefront size to be dynamically
sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at
initialization time.
2016-06-09 11:24:55 -04:00
David Guillen Fandos 70798b1ba0 stats: Fixing regStats function for some SimObjects
Fixing an issue with regStats not calling the parent class method
for most SimObjects in Gem5. This causes issues if one adds new
stats in the base class (since they are never initialized properly!).

Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-06 17:16:43 +01:00
Tuan Ta b6d20c25c3 gpu-compute: Fixed a bug in global memory pipeline
Added a condition when inflightStores is incremented to prevent a deadlock
caused by many memory fence requests generated by a CU
2016-06-03 16:20:08 -04:00
Tony Gutierrez 7dad4377ec gpu-compute: fix bug in GPUDynInst::isScalarRegister() 2016-05-16 15:36:24 -04:00
Tony Gutierrez bb83fa2051 gpu-compute: fix spacing in GPUDynInst ctor 2016-05-06 17:00:54 -04:00
Tony Gutierrez 4f3139e696 gpu-compute: fix uninitialized member bug in GPUDynInst
the n_reg field in the GPUDynInst is not currently set in the constructor.
if it is not set externally, there are assertion failures that may occur
if the random value it gets is just right. here we set it to 0 by default.
2016-05-06 16:44:38 -04:00
Mitch Hayenga c75ff71139 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
2016-04-07 09:30:20 -05:00
jkalamat 1ab75c3ee2 gpu-compute: remove unused variable from scoreboard check stage
appease clang by removing the unused private member variable,
'numGlbMemPipes', from the scoreboard check stage
2016-03-21 11:26:23 -04:00
Steve Reinhardt f6cd7a4bb7 syscall_emul: move mmapGrowsDown() to LiveProcess
The mmapGrowsDown() method was a static method on the OperatingSystem
class (and derived classes), which worked OK for the templated syscall
emulation methods, but made it hard to access elsewhere.  This patch
moves the method to be a virtual function on the LiveProcess method,
where it can be overridden for specific platforms (for now, Alpha).

This patch also changes the value of mmapGrowsDown() from being false
by default and true only on X86Linux32 to being true by default and
false only on Alpha, which seems closer to reality (though in reality
most people use ASLR and this doesn't really matter anymore).

In the process, also got rid of the unused mmap_start field on
LiveProcess and OperatingSystem mmapGrowsUp variable.
2016-03-17 10:29:32 -07:00
Andreas Hansson 8faeec44a6 base: Fix gpu-compute output stream creation
Match changes in output stream.
2016-03-04 20:14:10 -05:00
John Kalamatianos a28a234069 gpu: fix bugs with MemFence, Flat Instrs and Resource utilization
Both Memory Fence is now flagged as Global Memory only to avoid resource
oversubscribing.
Flat instructions now check for Shared Memory resource busy to avoid
oversubscribing resources.
All WaitClass resources now use cycles (not ticks) to register the number
of pipe stages between Scoreboard and Execute to be consistent with
instruction scheduling logic which always used clock cycles.
2016-02-18 10:42:03 -05:00
Tony Gutierrez 9a0f1be21f gpu-compute: remove brig_object.hh from hsa_object.cc
brig_object.hh is specific to the HSAIL ISA, and hence should not be
included in ISA-agnostic code.
2016-02-17 11:46:02 -05:00
Tony Gutierrez 1a7d3f9fcb gpu-compute: AMD's baseline GPU model 2016-01-19 14:28:22 -05:00