CUDA: reduce cn-v8 shared mem footprint
Keep only half of the AES matrix in shared memory and compute the other half in place. This reduces the shared memory needed per block and so increases the possible occupancy.
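A minimal sketch of the idea, not the code from this PR. It assumes the AES matrix is the usual set of four 256-entry 32-bit lookup tables in which each table is a byte rotation of the first, so tables 2 and 3 can be derived from tables 0 and 1 with a 16-bit rotate. Under that assumption the shared memory use drops from 4 KiB to 2 KiB per block. All names (`rotl32`, `aes_init_half`, `aes_lookup`, `demo_kernel`, `t0_global`) are hypothetical, and the rotation direction depends on the byte order of the table.

```cuda
#include <cstdint>

// Rotate a 32-bit word left by n bits (n is 8 or 16 here, never 0).
__host__ __device__ inline uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

// Fill shared memory with tables 0 and 1 only: 2 * 256 words = 2 KiB.
// Assumption: table 1 is table 0 rotated left by 8 bits.
__device__ void aes_init_half(const uint32_t* t0_global, uint32_t* smem)
{
    for (unsigned i = threadIdx.x; i < 256u; i += blockDim.x) {
        uint32_t w = t0_global[i];
        smem[i] = w;                    // table 0
        smem[256u + i] = rotl32(w, 8);  // table 1
    }
}

// Look up entry b of table k (0..3) using only the stored half.
// Tables 2 and 3 are computed on the fly from tables 0 and 1.
__host__ __device__ inline uint32_t aes_lookup(const uint32_t* half, unsigned k, uint8_t b)
{
    uint32_t w = half[(k & 1u) * 256u + b];
    return (k & 2u) ? rotl32(w, 16) : w;
}

// Hypothetical demo kernel: each thread performs one table lookup.
__global__ void demo_kernel(const uint32_t* t0_global, const uint8_t* in,
                            uint32_t* out, unsigned n)
{
    __shared__ uint32_t smem[512];      // half of the 1024-word full matrix
    aes_init_half(t0_global, smem);
    __syncthreads();                    // tables must be complete before use

    unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        out[idx] = aes_lookup(smem, idx & 3u, in[idx]);
}
```

The trade-off is one extra rotate on half of the lookups in exchange for a smaller shared memory allocation, which lets more blocks be resident per multiprocessor when shared memory is the limiting resource.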