- Feb 07, 2019
-
-
psychocrypt authored
@xmrig reported that driver 19.2.1 for Vega also creates invalid results if pragma unroll is used for the Groestl algorithm.
-
- Feb 02, 2019
-
-
psychocrypt authored
Windows driver creates wrong code if unroll is used.
-
- Feb 01, 2019
-
-
psychocrypt authored
Use the algorithm names from `cryptonight.hpp` instead of numbers within the OpenCL kernel.
-
- Jan 30, 2019
-
-
psychocrypt authored
- fix broken turtle coin
- fix non cn_gpu algorithms
-
fireice-uk authored
Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>
Co-authored-by: fireice-uk <fireice-uk@users.noreply.github.com>
-
- Jan 25, 2019
-
-
Brandon Lehmann authored
-
- Dec 06, 2018
-
-
psychocrypt authored
Since #2080 bittube2 is broken.
- reintroduce special AES function for bittube2
-
- Dec 03, 2018
-
-
psychocrypt authored
NVIDIA uses clang as its device compiler, so the reciprocal optimizations were disabled with #2104.
- re-enable optimized reciprocal calculation
-
- Dec 02, 2018
-
-
psychocrypt authored
- fix broken compile: change used `ULL` to `UL` because `UL` is defined as 64bit
- fix memory copy to shared memory via vload8 (somehow it creates wrong accesses)
-
- Nov 30, 2018
-
-
psychocrypt authored
Use an optimized reciprocal calculation without a lookup table for non-clang (ROCm) OpenCL.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
- Nov 29, 2018
-
-
LPHuynh authored
-
- Nov 21, 2018
-
-
psychocrypt authored
Use `mul24` to speed up the scratchpad index calculation.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
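To illustrate what `mul24` computes, here is a minimal host-side C sketch that emulates the OpenCL built-in for unsigned operands. OpenCL only defines the result when both operands fit in 24 bits, so the sketch rejects anything larger. The names `mul24_emulated` and `scratchpad_offset`, as well as the thread index and stride parameters, are hypothetical and are not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates OpenCL mul24 for unsigned operands in [0, 2^24 - 1].
 * Out-of-range operands give an implementation-defined result in OpenCL,
 * so this sketch refuses them instead of guessing. */
static uint32_t mul24_emulated(uint32_t x, uint32_t y)
{
    assert(x <= 0xFFFFFFu && y <= 0xFFFFFFu);
    /* Keep the low 32 bits of the product, like a 32-bit multiply. */
    return (uint32_t)((uint64_t)x * (uint64_t)y);
}

/* Hypothetical index calculation: thread index times a per-thread stride. */
static uint32_t scratchpad_offset(uint32_t thread_idx, uint32_t stride)
{
    return mul24_emulated(thread_idx, stride);
}
```

As long as both factors stay below 2^24, the result is identical to a full 32-bit multiplication, which is why the replacement is only safe for small indices and strides.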
-
psychocrypt authored
Add new striding index where the memory is chunked by the size of the work group (worksize).
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
small optimization for non cryptonight_v8 algorithms
-
- Nov 20, 2018
-
-
SChernykh authored
- optimize division
-
psychocrypt authored
- change a few 64bit variables into 32bit
- provide type-qualified defines
-
- Nov 19, 2018
-
-
psychocrypt authored
- remove useless `clFinish`
- avoid downloading the number of threads for skein & co. and always start as many threads as in all other kernels (terminate useless threads)
-
psychocrypt authored
Reduce local memory footprint to increase the occupancy.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
- Nov 16, 2018
-
-
psychocrypt authored
define shared memory in the outer scope
-
SChernykh authored
- optimize kernel cn0 and cn2
- optimize fast int math
- use more 32bit variables
Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>
-
- Nov 06, 2018
-
-
SChernykh authored
Optimize the division in cryptonight_heavy and cryptonight_haven.
Import of https://github.com/xmrig/xmrig-amd/pull/185/commits/5d9b9334654df25cea7707f667990fd1577ed290
-
- Oct 10, 2018
-
-
psychocrypt authored
In the current implementation the bit align uses a signed integer, which pulls in ones if the sign bit is set.
- cast to unsigned integer before using the bitshift
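The effect can be reproduced on the host. The following C sketch assumes that the bit align concatenates two 32-bit words and shifts the pair to the right; the function names are hypothetical and the code is not the kernel's implementation. Right-shifting a negative signed value is implementation-defined in C and usually replicates the sign bit, whereas the unsigned variant always shifts in zeros.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed on the signed value: for negative v most compilers use an
 * arithmetic shift, which fills the upper bits with ones. */
static uint32_t shift_right_as_signed(int32_t v, uint32_t shift)
{
    assert(shift < 32u);
    return (uint32_t)(v >> shift);
}

/* The fix: cast to unsigned first, so zeros are shifted in. */
static uint32_t shift_right_as_unsigned(int32_t v, uint32_t shift)
{
    assert(shift < 32u);
    return (uint32_t)v >> shift;
}

/* Sketch of a bit align on unsigned words: concatenate hi:lo, shift right,
 * keep the low 32 bits. */
static uint32_t bitalign_u32(uint32_t hi, uint32_t lo, uint32_t shift)
{
    assert(shift < 32u);
    uint64_t wide = ((uint64_t)hi << 32) | (uint64_t)lo;
    return (uint32_t)(wide >> shift);
}
```

With `INT32_MIN` and a shift of 4, the unsigned variant returns `0x08000000`, while the signed variant typically has its top four bits set as well.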
-
- Oct 05, 2018
-
-
psychocrypt authored
With ROCm we fought invalid shares for a long time. This is now solved with ROCm 1.9 and this tiny fix. It is not fully clear where a memory optimization kicks in and breaks the `Groestl` kernel if the variables `M` and `H` are not `volatile`. The performance will not change with this fix. The fix was tested with ROCm 1.9 on a VEGA64 and an RX570.
-
- Oct 04, 2018
-
-
Tony Butler authored
-
- Sep 30, 2018
-
-
psychocrypt authored
add cpu implementation for the final monero POW
-
- Sep 19, 2018
-
-
psychocrypt authored
- fix code style issues
- fix spelling issue
- fix asm to support newer clang versions
-
psychocrypt authored
Add option `unroll` for OpenCL to allow better tuning of the main POW kernel.
-
psychocrypt authored
Create a special pass for NVIDIA GPUs to load memory chunks first into the shared memory.
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
- implement cryptonight_v8
- update auto adjust to fit the special requirements of `cryptonight_v8`
- add fast math integer implementation for `sqrt`, `reciprocal` and `division`
Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
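For readers unfamiliar with integer-only math routines, the sketch below shows a textbook digit-by-digit integer square root in C. It is not the implementation added by this commit, which is tuned for the GPU; the name `isqrt_u64` is hypothetical. It only illustrates how a square root can be computed without floating point.

```c
#include <stdint.h>

/* Textbook bitwise integer square root: returns floor(sqrt(n)). */
static uint32_t isqrt_u64(uint64_t n)
{
    uint64_t result = 0;
    uint64_t bit = (uint64_t)1 << 62;   /* largest power of four in 64 bits */

    /* Start at the highest power of four that is not larger than n. */
    while (bit > n) {
        bit >>= 2;
    }

    while (bit != 0) {
        if (n >= result + bit) {
            n -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)result;
}
```

The loop uses only comparisons, additions, subtractions and shifts, which is the general idea behind replacing floating-point `sqrt` with integer arithmetic.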
-
psychocrypt authored
If the first bit of the nonce is `1` (which is very often the case with a NiceHash pool), some OpenCL implementations may handle the 64bit representation of the 32bit nonce on the device side as a signed integer. A right bitshift then pulls ones from the upper part of the 64bit nonce representation into the 32bit part of the nonce, and the computed share is invalid.
- explicitly cast the nonce on the device to `uint` to avoid any side effects
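A small host-side C sketch of the failure mode, assuming the nonce is widened to 64 bits through a signed type. The helper names are hypothetical and the code is not taken from the kernel. The conversion to `int32_t` wraps on mainstream compilers, which is what the sketch relies on.

```c
#include <stdint.h>

/* Widening through a signed type sign-extends: if bit 31 of the nonce is set,
 * the upper 32 bits of the 64-bit value become all ones. */
static uint64_t widen_nonce_signed(uint32_t nonce)
{
    return (uint64_t)(int64_t)(int32_t)nonce;
}

/* Widening an unsigned value zero-extends, which is what the explicit
 * cast to `uint` guarantees. */
static uint64_t widen_nonce_unsigned(uint32_t nonce)
{
    return (uint64_t)nonce;
}

/* Right shift of the widened value, truncated back to 32 bits. */
static uint32_t shifted_nonce(uint64_t wide, unsigned shift)
{
    return (uint32_t)(wide >> shift);
}
```

For the nonce `0x80000001` and a shift of 8, the zero-extended path yields `0x00800000`, while the sign-extended path yields `0xFF800000`, so the two disagree exactly when the top bit of the nonce is set.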
-
- Jul 14, 2018
-
-
psychocrypt authored
- add cryptonight_heavy derivative cryptonight_bittube2
- add coin bittube
- remove coin ipbc because this coin is now called bittube
-
- Jul 08, 2018
-
-
psychocrypt authored
- explicit loop unrolling based on changes in @imperdin's fork https://github.com/imperdin/xmr-stak/blob/master/xmrstak/backend/amd/amd_gpu/opencl/cryptonight.cl
-
- Jun 10, 2018
-
-
havenprotocol authored
- update pools.txt
- add new algorithm `cryptonight_haven`
- update all backends
-
- Jun 07, 2018
-
-
psychocrypt authored
- rename cryptonight_fast to cryptonight_masari
- set dev pool to cryptonight_monero
-
- Jun 05, 2018
-
-
gnock authored
-
- May 03, 2018
-
-
Tony Butler authored
-
- May 01, 2018
-
-
psychocrypt authored
solve #1494
- add algorithm `cryptonight_v7_stellite` (internally named `cryptonight_stellite`)
-
- Apr 22, 2018
-
-
psychocrypt authored
- add algorithm `cryptonight_lite_v7_xor`
- update documentation
-
- Apr 08, 2018
-
-
psychocrypt authored
- remove version numbers within the kernel
- create separate program context for each mining algorithm
- remove kernel `cn1_monero`; it is now integrated in `cn1`
- rename `cnX` kernels to `cnX + algorithmNumber`
-
- Apr 01, 2018
-
-
psychocrypt authored
fix #1218
- remove inline function with ugly macro :-(
-