Commits · a39e63fab2a17385f8def05ebc3814f0921c5962 · Recolic / azure-cloud-mining-script

Dec 02, 2018
- Merge pull request #2107 from psychocrypt/fix-clamp · a39e63fa
  fireice-uk authored Dec 02, 2018
```
fix clamp implementation
```
  a39e63fa
- fix clamp implementation · b606304b
  psychocrypt authored Dec 02, 2018
```
Due to a wrong implementation clamp was not working.
```
  b606304b
Dec 01, 2018
- Merge pull request #2104 from psychocrypt/topic-optimizev2Reciprocal · a8d09606
  fireice-uk authored Dec 01, 2018
```
OpenCL: optimize reciprocal calculation
```
  a8d09606
- Merge pull request #2097 from LPHuynh/cn_superfast · 19331413
  fireice-uk authored Dec 01, 2018
```
Please add Cryptonight-Superfast
```
  19331413
- Merge pull request #2102 from psychocrypt/topic-compModeOpti · 63359986
  fireice-uk authored Dec 01, 2018
```
OpenCL: comp mode optimization
```
  63359986
Nov 30, 2018

OpenCL: optimize reciprocal calculation · bc91088a

psychocrypt authored Nov 30, 2018



For non-clang (ROCm) OpenCL, use an optimized reciprocal calculation without a lookup table.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

bc91088a
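
The commit message does not say how the table-free reciprocal is computed. As a general illustration only, and not the code from this commit, the sketch below uses Newton-Raphson refinement from a linear initial estimate, which is one common way to avoid a lookup table. The function name `reciprocal` and the iteration count are assumptions made for the example.

```python
import math


def reciprocal(d: float, iterations: int = 4) -> float:
    """Approximate 1/d for d > 0 without a lookup table (illustrative only)."""
    if d <= 0.0:
        raise ValueError("d must be positive")
    # Split d into m * 2**e with m in [0.5, 1).
    m, e = math.frexp(d)
    # Linear initial estimate of 1/m on [0.5, 1); relative error is at most 1/17.
    r = 48.0 / 17.0 - (32.0 / 17.0) * m
    # Each Newton-Raphson step r <- r * (2 - m * r) roughly squares the error.
    for _ in range(iterations):
        r = r * (2.0 - m * r)
    # Undo the scaling: 1/d = (1/m) * 2**-e.
    return math.ldexp(r, -e)
```

A GPU kernel would do the same refinement in integer or single-precision arithmetic; the floating-point version here only shows the shape of the iteration.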

OpenCL: comp mode optimization · 307dda83

psychocrypt authored Nov 30, 2018

Disable compatibility mode if the intensity is a multiple of the worksize. In that case, an enabled compatibility mode only slows down the miner.

307dda83
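
The rule in 307dda83 can be sketched as a small check. This is a minimal sketch, assuming that compatibility mode exists to guard a partially filled last work group; the function name `needs_comp_mode` and its parameters are hypothetical and do not come from the miner's source.

```python
def needs_comp_mode(intensity: int, worksize: int, comp_mode: bool = True) -> bool:
    """Return True only if the user enabled the mode and the guard is actually needed."""
    if worksize <= 0:
        raise ValueError("worksize must be positive")
    # When intensity divides evenly into work groups, every group is full,
    # so the extra bounds check would cost time without protecting anything.
    return comp_mode and intensity % worksize != 0
```

For example, `needs_comp_mode(1024, 8)` is `False` because 1024 is a multiple of 8, while `needs_comp_mode(1000, 16)` is `True`.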

Nov 29, 2018
- Added Cryptonight-Superfast · 053190bb
  LPHuynh authored Nov 29, 2018
  
  053190bb
- Merge pull request #2101 from psychocrypt/topic-updateCurrencyAlgorithms · 1cb4f5e7
  fireice-uk authored Nov 29, 2018
```
update currencies
```
  1cb4f5e7
Nov 28, 2018
- Merge pull request #2100 from psychocrypt/topic-threadInterleaving · 0ca76d96
  fireice-uk authored Nov 28, 2018
```
OpenCL: thread interleaving
```
  0ca76d96
Nov 27, 2018

update currencies · 159e6959

psychocrypt authored Nov 28, 2018

- `monero` - remove fork from cn-v7 to cn-v8
- remove dev pool fork from cn-v7 to cn-v8

159e6959

OpenCL: thread interleaving · d8316f7d

psychocrypt authored Nov 27, 2018

If two threads are using the same GPU device, the start time of each hash round is optimized based on the average time needed to calculate a batch of hashes.

This way of optimizing the hash rate was first introduced by @SChernykh. This implementation is based on the one in xmrig but differs in the details.

- introduce a new config option `interleave`
- implement thread interleaving

d8316f7d
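
The idea described in d8316f7d can be sketched as follows. This is a rough model under stated assumptions, not the miner's implementation: the class `RoundTimer`, the function `start_delay`, the smoothing factor, and the even spacing of threads are all hypothetical, and the `interleave` config option is not modelled because the commit message does not describe its semantics.

```python
class RoundTimer:
    """Tracks a running average of hash-round durations for one GPU device."""

    def __init__(self, smoothing: float = 0.1) -> None:
        self.smoothing = smoothing
        self.average = None  # no sample recorded yet

    def record(self, duration: float) -> float:
        """Fold one measured round duration into the running average."""
        if self.average is None:
            self.average = duration
        else:
            self.average += self.smoothing * (duration - self.average)
        return self.average


def start_delay(now: float, other_thread_start: float,
                average_round: float, threads_on_device: int = 2) -> float:
    """Seconds to wait so rounds on a shared device start evenly spaced."""
    target_gap = average_round / threads_on_device
    elapsed = now - other_thread_start
    # Never return a negative delay: if the gap already passed, start now.
    return max(0.0, target_gap - elapsed)
```

With an average round of 1.0 s and two threads, a thread that is ready 0.25 s after the other thread started would wait another 0.25 s, so the two rounds overlap by about half a round.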

Nov 22, 2018
- Merge pull request #2089 from psychocrypt/topic-OpenCLOptimizeStridedIndex1 · 76f0de7f
  fireice-uk authored Nov 22, 2018
```
OpenCL: optimize strided index 1
```
  76f0de7f
Nov 21, 2018
- Merge pull request #2088 from psychocrypt/topic-newStridedIndex · ff204b22
  fireice-uk authored Nov 21, 2018
```
OpenCL: add strided_index 3
```
  ff204b22
- OpenCL: optimize strided index 1 · 39fa7c62
  psychocrypt authored Nov 21, 2018
```
Use `mul24` to speed up the scratchpad index calculation.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
```
  39fa7c62
- OpenCL: add strided_index 3 · 3c9442ce
  psychocrypt authored Nov 21, 2018
```
Add a new strided index in which the memory is chunked by the size of the work group (worksize).

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
```
  3c9442ce
- Merge pull request #2087 from psychocrypt/topic-cn1Optimization · 11387f7c
  fireice-uk authored Nov 21, 2018
```
OpenCL: cnv8 optimization
```
  11387f7c
- Merge pull request #2086 from psychocrypt/topic-amdOptimizeCNv8Div · c6846a82
  fireice-uk authored Nov 21, 2018
```
OpenCL: optimize cn-v8 div
```
  c6846a82
- Merge pull request #2084 from psychocrypt/topic-amd32bit · b06747f9
  fireice-uk authored Nov 21, 2018
```
AMD: use more 32bit operations
```
  b06747f9
- Merge pull request #2081 from psychocrypt/topic-reduceAPIOverhead · 7b7d4492
  fireice-uk authored Nov 21, 2018
```
OpenCL reduce API overhead
```
  7b7d4492
- OpenCL: cn1 optimization · 33e5825c
  psychocrypt authored Nov 21, 2018
```
small optimization for non cryptonight_v8 algorithms
```
  33e5825c
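
The two strided-index commits above (39fa7c62 and 3c9442ce) both concern how a thread locates its scratchpad elements. The sketch below models OpenCL's `mul24` built-in for unsigned operands and uses it in a layout where memory is chunked by work group. The layout, the function `chunked_index`, and its parameter names are hypothetical; the commit messages do not give the actual index formula.

```python
MASK24 = 0xFFFFFF
MASK32 = 0xFFFFFFFF


def mul24(x: int, y: int) -> int:
    """Model of OpenCL mul24 for unsigned operands: multiply the low 24 bits.

    Real OpenCL leaves operands outside the 24-bit range implementation-defined;
    masking them is just one possible model.
    """
    return ((x & MASK24) * (y & MASK24)) & MASK32


def chunked_index(global_id: int, element: int,
                  worksize: int, elements_per_thread: int) -> int:
    """Hypothetical layout: one contiguous chunk per work group,
    with the threads of that group interleaved inside the chunk."""
    group, local_id = divmod(global_id, worksize)
    chunk_base = group * worksize * elements_per_thread
    # element and worksize are small, so the cheaper 24-bit multiply suffices.
    return chunk_base + mul24(element, worksize) + local_id
```

In this model, neighbouring threads of one work group touch neighbouring addresses for the same `element`, and no two (thread, element) pairs map to the same address.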
Nov 20, 2018
- Merge pull request #2085 from psychocrypt/topic-amdOptimizeDiv · 1b2b4d30
  fireice-uk authored Nov 20, 2018
```
OpenCL: optimize cn-heavy div
```
  1b2b4d30
- OpenCL: optimize cn-v8 div · bff5b000
  SChernykh authored Nov 20, 2018
```
- optimize division
```
  bff5b000
- Merge pull request #2078 from psychocrypt/topic-cudaReduceSharedMemFootprint · b7ffd6b9
  fireice-uk authored Nov 20, 2018
```
CUDA: reduce cn-v8 shared mem footprint
```
  b7ffd6b9
- Merge pull request #2080 from psychocrypt/topic-reduceSharedMemUsage · 26830090
  fireice-uk authored Nov 20, 2018
```
OpenCL: reduce local mem footprint
```
  26830090
- Merge pull request #2079 from psychocrypt/topic-cnv8OptimizeDiv · a7e30eb5
  fireice-uk authored Nov 20, 2018
```
CUDA: optimize cn-v8 div
```
  a7e30eb5
- OpenCL: optimize cn-heavy div · 9813e1c0
  SChernykh authored Nov 20, 2018
```
optimize cryptonight_heavy div
```
  9813e1c0
- AMD: use more 32bit operations · f40c54e3
  psychocrypt authored Nov 20, 2018
```
- change a few 64bit variables to 32bit.
- provide type-qualified defines
```
  f40c54e3
- Merge pull request #2077 from psychocrypt/topic-optimizeCUDAHeavyDiv · 6a95f0bb
  fireice-uk authored Nov 20, 2018
```
CUDA: optimize cn-heavy div
```
  6a95f0bb
Nov 19, 2018

OpenCL reduce API overhead · 6c563c9d

psychocrypt authored Nov 19, 2018

- remove unnecessary `clFinish` calls
- avoid downloading the number of threads for skein & co. and always start as many threads as in all other kernels (terminating the unneeded threads)

6c563c9d

OpenCL: reduce local mem footprint · 6f283928

psychocrypt authored Nov 19, 2018



Reduce the local memory footprint to increase occupancy.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

6f283928

CUDA: optimize cn-v8 div · 4a7fde13

psychocrypt authored Nov 19, 2018



port optimizations from OpenCL.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

4a7fde13

CUDA: reduce cn-v8 shared mem footprint · ae8ba7f0

psychocrypt authored Nov 19, 2018

Use only half of the AES matrix and compute the other half in place.
This PR increases the possible occupancy.

ae8ba7f0
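
Why half of the tables can be recomputed is easiest to see in a small model. The sketch below assumes that the "AES matrix" in ae8ba7f0 refers to the four 256-entry lookup tables used by table-based AES, and that bytes are packed with row 0 in the lowest byte; neither is stated in the commit message. It shows that table `k` is table 0 rotated left by `8 * k` bits, so any table can be derived from another with one rotate. The names `table_word` and `rotl32` are hypothetical.

```python
# Columns of the AES MixColumns matrix, one per lookup table.
MIX_COLUMNS = [(2, 1, 1, 3), (3, 2, 1, 1), (1, 3, 2, 1), (1, 1, 3, 2)]


def xtime(b: int) -> int:
    """Multiply a byte by 2 in the AES field GF(2^8)."""
    return ((b << 1) ^ (0x1B if b & 0x80 else 0)) & 0xFF


def gf_mul(b: int, coeff: int) -> int:
    """Multiply a byte by 1, 2 or 3 in GF(2^8)."""
    if coeff == 1:
        return b
    if coeff == 2:
        return xtime(b)
    return xtime(b) ^ b  # 3 * b = 2 * b + b


def table_word(s: int, k: int) -> int:
    """Entry of table k for S-box output byte s, row 0 in the lowest byte."""
    word = 0
    for row, coeff in enumerate(MIX_COLUMNS[k]):
        word |= gf_mul(s, coeff) << (8 * row)
    return word


def rotl32(value: int, bits: int) -> int:
    """Rotate a 32-bit word left by the given number of bits."""
    bits %= 32
    return ((value << bits) | (value >> (32 - bits))) & 0xFFFFFFFF
```

Because `table_word(s, k) == rotl32(table_word(s, 0), 8 * k)` holds for every byte `s`, a kernel can keep only some of the tables in shared memory and rebuild the rest with rotates, trading a few instructions for a smaller footprint. Which tables the commit actually keeps is not stated.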

CUDA: optimize cn-heavy div · 0c1d805a

psychocrypt authored Nov 19, 2018



port the optimized OpenCL division to CUDA

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

0c1d805a

Nov 17, 2018
- Merge pull request #2071 from psychocrypt/topic-changeBackendLoadOrder · 447fef4b
  fireice-uk authored Nov 17, 2018
```
change load order for backends
```
  447fef4b
- Merge pull request #2070 from psychocrypt/topic-amdRefactoring · 1755f5e8
  fireice-uk authored Nov 17, 2018
```
Topic amd refactoring
```
  1755f5e8
- change load order for backends · cf959a1c
  psychocrypt authored Nov 17, 2018
```
If CUDA is loaded before AMD but no CUDA is available, it can happen that the embedded OpenCL code is empty.
This is only an issue if the binary is built statically on a different system.
```
  cf959a1c
Nov 16, 2018

fix ROCm compile · 18dbff68
psychocrypt authored Nov 17, 2018
```
define shared memory in the outer scope
```
18dbff68
optimize cn-heavy div · e6177f1c
SChernykh authored Nov 17, 2018
```
x-ref: https://github.com/xmrig/xmrig-amd/pull/192
```
e6177f1c

Optimize OpenCL · 28ef8e3d

SChernykh authored Nov 16, 2018



- optimize kernel cn0 and cn2
- optimize fast int math
- use more 32bit variables

Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>

28ef8e3d