- Dec 29, 2018
-
-
psychocrypt authored
In the current implementation the POW algorithm in the dev pool section of a currency is not taken into account during binary creation. This PR changes that behavior and allows creating binaries for more than two POW algorithms.
-
- Dec 07, 2018
-
-
psychocrypt authored
new bug fix version
-
- Dec 06, 2018
-
-
psychocrypt authored
Since #2080 bittube2 is broken. - reintroduce special AES function for bittube2
-
- Dec 04, 2018
-
-
MarosM authored
-
- Dec 03, 2018
-
-
psychocrypt authored
The default value for interleave was wrongly set to 50. Remove the value and take the default from the default constructor instead of side-channeling it from the JSON parser.
-
psychocrypt authored
Cleanup missing change from #2101
-
psychocrypt authored
NVIDIA uses clang as its device compiler, so the reciprocal optimizations were disabled with #2104. - re-enable optimized reciprocal calculation
-
- Dec 02, 2018
-
-
psychocrypt authored
Add an option to brute force intensity settings and lock in at the intensity with the highest hashrate. - update documentation of the `interleave` option to mention the side effect with `auto-tune` - disable `interleave` auto adjustment if `auto-tune` is enabled - jconf: add `auto-tune` as optional option
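The brute-force search described in this commit can be pictured roughly as follows. This is a minimal sketch, not the miner's actual code: the function name `pickBestIntensity` and the `measureHashrate` callback are hypothetical and stand in for whatever benchmarking the backend really performs.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of the miner): tries every candidate
// intensity, measures the hash rate through the supplied callback and
// returns the candidate with the highest measured rate.
// Returns 0 if the candidate list is empty.
std::size_t pickBestIntensity(
    const std::vector<std::size_t>& candidates,
    const std::function<double(std::size_t)>& measureHashrate)
{
    std::size_t best = 0;
    double bestRate = -1.0;
    for (std::size_t intensity : candidates) {
        const double rate = measureHashrate(intensity);
        if (rate > bestRate) {
            bestRate = rate;
            best = intensity;
        }
    }
    return best;
}
```

A caller would pass the intensities to probe and a function that runs a short benchmark at the given intensity; the returned value is the one to lock in.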
-
psychocrypt authored
- fix broken compile: change used `ULL` to `UL` because `UL` is defined as 64bit - fix memory copy to shared memory via vload8 (somehow it creates wrong accesses)
-
psychocrypt authored
The auto config now generates two threads per GPU by default for AMD devices. - remove the safety margin of 128MiB only from the max available GPU memory, not from the available memory for one alloc call - extend the memory documentation in amd.txt
-
psychocrypt authored
Due to a wrong implementation clamp was not working.
-
- Dec 01, 2018
-
-
psychocrypt authored
-
- Nov 30, 2018
-
-
psychocrypt authored
For OpenCL compilers other than clang (ROCm), use an optimized reciprocal calculation without a lookup table. Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
Disable compatibility mode if intensity is a multiple of worksize. In that case an enabled compatibility mode only slows down the miner.
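The rule in this commit boils down to a divisibility check. The sketch below is only an illustration; the function name `needsCompatibilityMode` and its parameters are assumptions and do not come from the miner's source.

```cpp
#include <cstddef>

// Hypothetical predicate (illustrative only): compatibility mode stays
// enabled only if the user requested it AND the intensity is not a
// multiple of the worksize. A worksize of 0 is treated as "unknown" and
// leaves the requested setting untouched.
bool needsCompatibilityMode(std::size_t intensity,
                            std::size_t worksize,
                            bool requested)
{
    if (worksize == 0) {
        return requested;
    }
    return requested && (intensity % worksize != 0);
}
```

With an intensity of 1024 and a worksize of 8 the guard is dropped; with 1000 and 16 it is kept, because 1000 is not divisible by 16.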
-
- Nov 29, 2018
-
-
LPHuynh authored
-
- Nov 27, 2018
-
-
psychocrypt authored
- `monero` - remove fork from cn-v7 to cn-v8 - remove dev pool fork from cn-v7 to cn-v8
-
psychocrypt authored
If two threads are using the same GPU device, the start time of each hash round is optimized based on the average time needed to calculate a bunch of hashes. This way of optimizing the hash rate was first introduced by @SChernykh. This implementation is based on the implementation in xmrig but differs in the details. - introduce a new config option `interleave` - implement thread interleaving
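One way to picture the interleaving idea is a delay computed from the average round time. The sketch below is an assumption-laden illustration, not the implementation from this commit: the function `interleaveDelayMs`, its parameters, and the interpretation of `interleave` as a percentage of the average round time are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch: returns how many milliseconds a thread should wait
// before starting its next hash round so that it starts at least
// `interleavePercent` percent of an average round after the previously
// started thread on the same GPU.
std::int64_t interleaveDelayMs(std::int64_t nowMs,
                               std::int64_t lastStartMs,
                               double avgRoundMs,
                               double interleavePercent)
{
    // Desired gap between the two thread start times.
    const double targetGapMs = avgRoundMs * interleavePercent / 100.0;
    // Time that has already passed since the other thread started.
    const double elapsedMs = static_cast<double>(nowMs - lastStartMs);
    // Wait only for the remaining part of the gap, never a negative time.
    return static_cast<std::int64_t>(std::max(0.0, targetGapMs - elapsedMs));
}
```

If a round takes 500 ms on average and the percentage is 40, a thread that becomes ready 100 ms after the other thread started would wait another 100 ms.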
-
- Nov 21, 2018
-
-
psychocrypt authored
Use `mul24` to speedup the scratchpad index calculation. Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
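For readers unfamiliar with the OpenCL built-in `mul24`: it multiplies two integers using only their low 24 bits, which some GPUs can do faster than a full 32-bit multiply. The sketch below emulates the unsigned variant on the host. The name `mul24u` is hypothetical, and masking out-of-range operands is a choice made here; OpenCL leaves the result for out-of-range inputs implementation-defined.

```cpp
#include <cstdint>

// Host-side emulation of unsigned OpenCL mul24 (illustrative only).
// Only the low 24 bits of each operand take part in the multiplication;
// the result is the low 32 bits of the product.
constexpr std::uint32_t mul24u(std::uint32_t x, std::uint32_t y)
{
    return (x & 0x00FFFFFFu) * (y & 0x00FFFFFFu);
}
```

This is only valid as a replacement for a normal multiply when both operands are known to fit into 24 bits, for example small thread indices and strides.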
-
psychocrypt authored
Add new striding index where the memory is chunked by the size of the work group (worksize). Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
small optimization for non cryptonight_v8 algorithms
-
- Nov 20, 2018
-
-
SChernykh authored
- optimize division
-
SChernykh authored
optimize cryptonight_heavy diff
-
psychocrypt authored
- change a few 64bit variables into 32bit. - provide type-qualified defines
-
- Nov 19, 2018
-
-
psychocrypt authored
- remove useless `clFinish` - avoid downloading the number of threads for skein&co and always start as many threads as in all other kernels (terminate useless threads)
-
psychocrypt authored
Reduce local memory footprint to increase the occupancy. Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
port optimizations from OpenCL. Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
-
psychocrypt authored
Use only the half AES matrix and compute the other half in place. This PR increases the possible occupancy.
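The standard AES lookup tables are byte rotations of one another, which is what makes storing only part of them possible. The sketch below illustrates the idea under assumptions: the struct `HalfAesTables`, the choice to store tables 0 and 1, and the derivation of tables 2 and 3 by a 16-bit rotation are hypothetical and may differ from the layout used in this PR.

```cpp
#include <array>
#include <cstdint>

// Rotate a 32-bit word left by `bits` (any value; reduced modulo 32).
constexpr std::uint32_t rotl32(std::uint32_t v, unsigned bits)
{
    return (bits % 32 == 0) ? v
                            : (v << (bits % 32)) | (v >> (32 - bits % 32));
}

// Hypothetical layout: only two of the four 256-entry tables are stored.
// Tables 2 and 3 are derived on the fly by rotating tables 0 and 1 by
// 16 bits, halving the memory needed for the lookup tables.
struct HalfAesTables {
    std::array<std::uint32_t, 256> t0;
    std::array<std::uint32_t, 256> t1;

    std::uint32_t lookup(unsigned table, std::uint8_t idx) const
    {
        switch (table & 3u) {
            case 0: return t0[idx];
            case 1: return t1[idx];
            case 2: return rotl32(t0[idx], 16);
            default: return rotl32(t1[idx], 16);
        }
    }
};
```

The trade-off is one extra rotation per derived lookup in exchange for less shared memory per block, which is what allows a higher occupancy.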
-
psychocrypt authored
Port the OpenCL optimized division to CUDA. Co-authored-by:
SChernykh <sergey.v.chernykh@gmail.com>
-
- Nov 17, 2018
-
-
psychocrypt authored
If CUDA is loaded before AMD but no CUDA is available, it can happen that the embedded OpenCL code is empty. This is only an issue if the binary is built statically on a different system.
-
- Nov 16, 2018
-
-
psychocrypt authored
define shared memory in the outer scope
-
SChernykh authored
x-ref: https://github.com/xmrig/xmrig-amd/pull/192
-
SChernykh authored
- optimize kernel cn0 and cn2 - optimize fast int math - use more 32bit variables Co-authored-by:
psychocrypt <psychocryptHPC@gmail.com>
-
- Nov 14, 2018
-
-
psychocrypt authored
bump version for next release
-
- Nov 06, 2018
-
-
SChernykh authored
optimize the division in cryptonight_heavy and cryptonight_haven import of https://github.com/xmrig/xmrig-amd/pull/185/commits/5d9b9334654df25cea7707f667990fd1577ed290
-
- Oct 25, 2018
-
-
psychocrypt authored
-
- Oct 24, 2018
-
-
psychocrypt authored
In the CUDA backend for monero we always start twice as many threads as needed. Those threads are then removed after the AES matrix is copied to the shared memory. Nevertheless, it is the result of a copy-paste bug. - start correct number of threads for `monero`
-
- Oct 23, 2018
-
-
Jason Rhinelander authored
-
- Oct 17, 2018
-
-
fireice-uk authored
Co-authored-by:
Hans Kristian Rosbach <hk-git@circlestorm.org>
-
- Oct 16, 2018
-
-
psychocrypt authored
-
psychocrypt authored
Fix the fix from #1945. The initial fix produces invalid results.
-