Commits · a39ee0886cf613b70490164c4f33b066230709bc · Recolic / azure-cloud-mining-script

Dec 29, 2018

OpenCL: allow more than two algorithms · a39ee088

psychocrypt authored 6 years ago

In the current implementation, the POW algorithm in the dev pool section of a
currency is not taken into account during binary creation.
This PR changes that behavior and allows binaries to be created for more than two POW algorithms.

a39ee088

Dec 07, 2018
- version increase to 2.7.1 · 44976485
  psychocrypt authored 6 years ago
```
new bug fix version
```
  44976485
Dec 06, 2018

fix bittube2 · e01eebc2

psychocrypt authored 6 years ago

Since #2080 bittube2 is broken.

- reintroduce special AES function for bittube2

e01eebc2

Dec 04, 2018
- Grammar fix · a7bdd603
  MarosM authored 6 years ago
  
  a7bdd603
Dec 03, 2018

fix default interleave value · 05b4976d

psychocrypt authored 6 years ago

The default value for interleave was wrongly set to 50.

Remove the value and take the default from the default constructor instead of side-channeling it from the JSON parser.

05b4976d

remove usage of cn_v7 if cn_v8 is enabled · 54b71cae
psychocrypt authored 6 years ago
```
Cleanup missing change from #2101
```
54b71cae

OpenCL: enable cn_v8 optimization for NVIDIA · ab19d370

psychocrypt authored 6 years ago

NVIDIA uses clang as its device compiler, so the reciprocal optimization was disabled with #2104.

- re-enable optimized reciprocal calculation

ab19d370

Dec 02, 2018

OpenCL: auto tuning option · af87b408

psychocrypt authored 6 years ago

Add an option to brute force intensity settings and lock in at the intensity with the highest hashrate.

- update documentation of the `interleave` option to mention the side effect with `auto-tune`
- disable `interleave` auto adjustment if `auto-tune` is enabled
- jconf: add `auto-tune` as optional option
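
The brute-force search described in this commit can be sketched roughly as follows. This is a minimal illustration, not the miner's actual code: `pick_best_intensity`, its parameters, and the `measure_hashrate` callback are all hypothetical names, and the real option may step through intensities or average measurements differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>

// Hypothetical helper (not taken from the miner): try every intensity in
// [min_intensity, max_intensity] in increments of `step`, measure the hash
// rate through the caller-supplied callback, and return the intensity that
// produced the highest rate.
std::size_t pick_best_intensity(
    std::size_t min_intensity,
    std::size_t max_intensity,
    std::size_t step,
    const std::function<double(std::size_t)>& measure_hashrate)
{
    assert(step > 0 && min_intensity <= max_intensity);

    std::size_t best_intensity = min_intensity;
    double best_rate = std::numeric_limits<double>::lowest();

    for (std::size_t i = min_intensity; i <= max_intensity; i += step) {
        const double rate = measure_hashrate(i);
        // Strict comparison: on a tie the lower intensity is kept.
        if (rate > best_rate) {
            best_rate = rate;
            best_intensity = i;
        }
    }
    return best_intensity;
}
```

Once the search returns, the caller would keep mining at the returned intensity, which corresponds to the "lock in" step in the commit message.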

af87b408

OpenCl: fix NVIDIA · 1b27f0f3

psychocrypt authored 6 years ago

- fix broken compile: change `ULL` to `UL` because `UL` is defined as 64-bit
- fix memory copy to shared memory via vload8 (it somehow creates a wrong access)

1b27f0f3

OpenCL: auto config two threads per GPU · e46226fa

psychocrypt authored 6 years ago

The auto config now generates two threads per GPU by default for AMD devices.

- subtract the 128 MiB safety margin only from the maximum available GPU memory, not from the memory available for a single alloc call
- extend the memory documentation in amd.txt

e46226fa

fix clamp implementation · b606304b
psychocrypt authored 6 years ago
```
Due to a wrong implementation clamp was not working.
```
b606304b

Dec 01, 2018
- increase version to 2.7.0 · e69f101a
  psychocrypt authored 6 years ago
  
  e69f101a
Nov 30, 2018

OpenCL: optimize reciprocal calculation · bc91088a

psychocrypt authored 6 years ago


For OpenCL compilers other than clang (ROCm), use an optimized reciprocal calculation without a lookup table.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

bc91088a

OpenCL: comp mode optimization · 307dda83

psychocrypt authored 6 years ago

Disable compatibility mode if intensity is a multiple of worksize. In that case, an enabled compatibility mode would only slow down the miner.
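
The decision above comes down to a divisibility check. The sketch below assumes that compatibility mode exists to guard the surplus threads created when the launch size is rounded up to a multiple of worksize. The commit message does not state this, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: compatibility mode is only useful when the requested
// intensity does not fill whole work groups, so surplus threads exist.
bool needs_compat_mode(std::size_t intensity, std::size_t worksize)
{
    assert(worksize > 0);
    return intensity % worksize != 0;
}

// Hypothetical helper: the number of threads actually launched when the
// intensity is rounded up to the next multiple of worksize.
std::size_t padded_global_size(std::size_t intensity, std::size_t worksize)
{
    assert(worksize > 0);
    return ((intensity + worksize - 1) / worksize) * worksize;
}
```

With an intensity of 1000 and a worksize of 16, the padded launch size is 1008 and the eight extra threads need guarding. With 1024 and 8 there are no extra threads, so the check can be dropped.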

307dda83

Nov 29, 2018
- Added Cryptonight-Superfast · 053190bb
  LPHuynh authored 6 years ago
  
  053190bb
Nov 27, 2018

update currencies · 159e6959

psychocrypt authored 6 years ago

- `monero` - remove fork from cn-v7 to cn-v8
- remove dev pool fork from cn-v7 to cn-v8

159e6959

OpenCL: thread interleaving · d8316f7d

psychocrypt authored 6 years ago

If two threads are using the same GPU device, the start time of each hash round is optimized based on the average time needed to calculate a batch of hashes.

This way of optimizing the hash rate was first introduced by @SChernykh. This implementation is based on the implementation in xmrig but differs in the details.

- introduce a new config option `interleave`
- implement thread interleaving
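
A much simplified sketch of the interleaving idea follows. It assumes the goal is for the second thread on a GPU to start roughly half an average round after the first. The names `RoundAverage` and `interleave_delay_ms` are hypothetical, and the actual implementation (and the meaning of the `interleave` value) differs in the details.

```cpp
#include <cassert>
#include <cstdint>

// Running mean of the time one hash round takes (hypothetical helper).
struct RoundAverage {
    double avg_ms = 0.0;
    std::uint64_t samples = 0;

    void add(double round_ms)
    {
        ++samples;
        avg_ms += (round_ms - avg_ms) / static_cast<double>(samples);
    }
};

// Assumption: the ideal offset between two threads sharing a GPU is half
// of the average round time. Returns how long the calling thread should
// wait before starting its next round.
std::uint64_t interleave_delay_ms(std::uint64_t now_ms,
                                  std::uint64_t other_start_ms,
                                  std::uint64_t avg_round_ms)
{
    assert(now_ms >= other_start_ms);
    const std::uint64_t target_offset = avg_round_ms / 2;
    const std::uint64_t elapsed = now_ms - other_start_ms;
    return elapsed >= target_offset ? 0 : target_offset - elapsed;
}
```

If the other thread started 100 ms ago and a round takes 400 ms on average, this sketch asks the caller to wait another 100 ms. If the other thread is already more than half a round ahead, no wait is added.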

d8316f7d

Nov 21, 2018

OpenCl: optimize strided index 1 · 39fa7c62

psychocrypt authored 6 years ago


Use `mul24` to speed up the scratchpad index calculation.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>
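
For readers unfamiliar with `mul24`: OpenCL's built-in multiplies two integers using only their low 24 bits, which some GPUs execute faster than a full 32-bit multiply. The host-side sketch below emulates the unsigned variant. The index formula in `strided_chunk_index` is a hypothetical layout chosen for illustration, not the kernel's actual one.

```cpp
#include <cassert>
#include <cstdint>

// Host-side emulation of OpenCL's unsigned mul24. This version masks both
// operands to 24 bits; in OpenCL the result for operands outside that
// range is implementation-defined. The result is the low 32 bits of the
// product.
std::uint32_t mul24_emulated(std::uint32_t x, std::uint32_t y)
{
    const std::uint64_t product =
        static_cast<std::uint64_t>(x & 0xFFFFFFu) * (y & 0xFFFFFFu);
    return static_cast<std::uint32_t>(product);
}

// Hypothetical strided layout: chunk `chunk` of every thread is stored
// contiguously, so the position is chunk * num_threads + thread_id.
std::uint32_t strided_chunk_index(std::uint32_t chunk,
                                  std::uint32_t num_threads,
                                  std::uint32_t thread_id)
{
    // mul24 is only safe when both factors fit into 24 bits.
    assert(chunk < (1u << 24) && num_threads < (1u << 24));
    return mul24_emulated(chunk, num_threads) + thread_id;
}
```

The substitution is only valid because both factors of an index calculation like this stay well below 2^24, which the assertion makes explicit.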

39fa7c62

OpenCL: add strided_index 3 · 3c9442ce

psychocrypt authored 6 years ago


Add a new strided index where the memory is chunked by the size of the work group (worksize).

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

3c9442ce

OpenCL: cn1 optimization · 33e5825c
psychocrypt authored 6 years ago
```
small optimization for non cryptonight_v8 algorithms
```
33e5825c

Nov 20, 2018
- OpenCl: optimize cn-v8 div · bff5b000
  SChernykh authored 6 years ago
```
- optimize division
```
  bff5b000
- OpenCL: optimize cn-heavy div · 9813e1c0
  SChernykh authored 6 years ago
```
optimize cryptonight_heavy div
```
  9813e1c0
- AMD: use more 32bit operations · f40c54e3
  psychocrypt authored 6 years ago
```
- change a few 64-bit variables into 32-bit.
- provide type-qualified defines
```
  f40c54e3
Nov 19, 2018

OpenCL reduce API overhead · 6c563c9d

psychocrypt authored 6 years ago

- remove useless `clFinish`
- avoid downloading the number of threads for skein & co. and always start as many threads as in all other kernels (terminate useless threads)

6c563c9d

OpenCL: reduce local mem footprint · 6f283928

psychocrypt authored 6 years ago


Reduce the local memory footprint to increase the occupancy.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

6f283928

CUDA: optimize cn-v8 div · 4a7fde13

psychocrypt authored 6 years ago


port optimizations from OpenCL.

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

4a7fde13

CUDA: reduce cn-v8 shared mem footprint · ae8ba7f0

psychocrypt authored 6 years ago

Use only half of the AES matrix and compute the other half in place.
This PR increases the possible occupancy.
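
The idea of storing only half of the AES lookup tables can be illustrated on the host. The four standard AES T-tables are byte rotations of one another, so two of them can be derived from the other two with a rotate. Which half the CUDA kernel keeps, and how it derives the rest, is an assumption here. `HalfTables` and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Table = std::array<std::uint32_t, 256>;

inline std::uint8_t rotl8(std::uint8_t v, int s)
{
    return static_cast<std::uint8_t>((v << s) | (v >> (8 - s)));
}

inline std::uint32_t rotl32(std::uint32_t v, int s)
{
    return (v << s) | (v >> (32 - s));
}

// Multiply by 2 in GF(2^8) with the AES reduction polynomial.
inline std::uint8_t xtime(std::uint8_t v)
{
    return static_cast<std::uint8_t>((v << 1) ^ ((v & 0x80) ? 0x1B : 0x00));
}

// Compute the AES S-box algorithmically (generator 3, affine transform).
std::array<std::uint8_t, 256> make_sbox()
{
    std::array<std::uint8_t, 256> sbox{};
    std::uint8_t p = 1, q = 1;
    do {
        p = static_cast<std::uint8_t>(p ^ xtime(p)); // p *= 3
        q = static_cast<std::uint8_t>(q ^ (q << 1)); // q /= 3 (four steps)
        q = static_cast<std::uint8_t>(q ^ (q << 2));
        q = static_cast<std::uint8_t>(q ^ (q << 4));
        if (q & 0x80) q ^= 0x09;
        sbox[p] = static_cast<std::uint8_t>(
            q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;
    return sbox;
}

// Entry of table T_k for S-box output s: the column (2s, s, s, 3s)
// rotated by k bytes, packed little-endian.
std::uint32_t t_entry(std::uint8_t s, int k)
{
    const std::uint8_t s2 = xtime(s);
    const std::uint8_t col[4] = {s2, s, s, static_cast<std::uint8_t>(s2 ^ s)};
    std::uint32_t w = 0;
    for (int b = 0; b < 4; ++b)
        w |= static_cast<std::uint32_t>(col[b]) << (8 * ((b + k) % 4));
    return w;
}

// Assumed storage layout: keep T0 and T2, derive T1 and T3 by rotation.
struct HalfTables { Table t0, t2; };

HalfTables make_half_tables()
{
    const auto sbox = make_sbox();
    HalfTables h{};
    for (int x = 0; x < 256; ++x) {
        h.t0[x] = t_entry(sbox[x], 0);
        h.t2[x] = t_entry(sbox[x], 2);
    }
    return h;
}

// Check that the rotated entries equal the directly computed T1 and T3.
bool half_tables_match_full()
{
    const auto sbox = make_sbox();
    const HalfTables h = make_half_tables();
    for (int x = 0; x < 256; ++x) {
        if (rotl32(h.t0[x], 8) != t_entry(sbox[x], 1)) return false;
        if (rotl32(h.t2[x], 8) != t_entry(sbox[x], 3)) return false;
    }
    return true;
}
```

In this sketch the stored tables take 2 KiB instead of the 4 KiB needed for all four, at the cost of one rotate per derived lookup.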

ae8ba7f0

CUDA: optimize cn-heavy div · 0c1d805a

psychocrypt authored 6 years ago


port OpenCl optimized division to CUDA

Co-authored-by: SChernykh <sergey.v.chernykh@gmail.com>

0c1d805a

Nov 17, 2018

change load order for backends · cf959a1c

psychocrypt authored 6 years ago

If CUDA is loaded before AMD but no CUDA is available, it can happen that the embedded OpenCL code is empty.
This is only an issue if the binary is built statically on a different system.

cf959a1c

Nov 16, 2018

fix ROCm compile · 18dbff68
psychocrypt authored 6 years ago
```
define shared memory in the outer scope
```
18dbff68
optimize cn-heavy div · e6177f1c
SChernykh authored 6 years ago
```
x-ref: https://github.com/xmrig/xmrig-amd/pull/192
```
e6177f1c

Optimize OpenCl · 28ef8e3d

SChernykh authored 6 years ago


- optimize kernel cn0 and cn2
- optimize fast int math
- use more 32bit variables

Co-authored-by: psychocrypt <psychocryptHPC@gmail.com>

28ef8e3d

Nov 14, 2018
- version increase to 2.6.0 · 6ac129fe
  psychocrypt authored 6 years ago
```
bump version for next release
```
  6ac129fe
Nov 06, 2018

AMD: speedup cryptonight_heavy division · bfb3243c

SChernykh authored 6 years ago

optimize the division in cryptonight_heavy and cryptonight_haven

import of https://github.com/xmrig/xmrig-amd/pull/185/commits/5d9b9334654df25cea7707f667990fd1577ed290

bfb3243c

Oct 25, 2018
- update version to 2.5.2 · c5b7c80b
  psychocrypt authored 6 years ago
  
  c5b7c80b
Oct 24, 2018

NVIDIA: fix wrong number of threads · 954296ed

psychocrypt authored 6 years ago

In the CUDA backend for monero we always start twice as many threads as needed.
Those threads are then removed after the AES matrix is copied to shared memory.
Nevertheless, this is the result of a copy-paste bug.

- start correct number of threads for `monero`

954296ed

Oct 23, 2018
- Update for upcoming Graft CNv2 fork · a4a96570
  Jason Rhinelander authored 6 years ago
  
  a4a96570
Oct 17, 2018
- Add console warning on GPU error. · 924abda0
  fireice-uk authored 6 years ago
```
Co-authored-by: Hans Kristian Rosbach <hk-git@circlestorm.org>
```
  924abda0
Oct 16, 2018
- update version to 2.5.1 · dd906d5a
  psychocrypt authored 6 years ago
  
  dd906d5a
- fix AMD driver 14 · 6fc6e3a5
  psychocrypt authored 6 years ago
```
Fix the fix from #1945. The initial fix produces invalid results.
```
  6fc6e3a5