  1. Nov 30, 2018
    • psychocrypt's avatar
      OpenCL: comp mode optimization · 307dda83
      psychocrypt authored
      Disable compatibility mode if intensity is a multiple of worksize. In that case, enabling compatibility mode only slows down the miner.
      307dda83
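The rule in this commit can be sketched as a small predicate. This is a minimal illustration, not the miner's actual code: the function name `useCompMode` and its parameters are hypothetical, and it assumes compatibility mode exists only to guard work-items in a partially filled work-group.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: decide whether the compatibility-mode guard is still
// needed for a given launch configuration.
inline bool useCompMode(bool requested, std::size_t intensity, std::size_t worksize)
{
    assert(worksize > 0);
    // If intensity is a multiple of worksize, every work-group is full, so
    // there are no surplus work-items that the guard would have to mask out.
    const bool hasPartialGroup = (intensity % worksize) != 0;
    return requested && hasPartialGroup;
}
```

With `worksize = 8`, an intensity of 1000 would turn the guard off even if the user requested it, while 1001 would keep it on.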
  2. Nov 27, 2018
    • psychocrypt's avatar
      OpenCL: thread interleaving · d8316f7d
      psychocrypt authored
      If two threads use the same GPU device, the start time of each hash round is optimized based on the average time needed to calculate a batch of hashes.
      
      This way of optimizing the hash rate was first introduced by @SChernykh. This implementation is based on the one in xmrig but differs in the details.
      
      - introduce a new config option `interleave`
      - implement thread interleaving
      d8316f7d
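One way to picture the idea is a helper that spaces round starts evenly across the average round time. This is only a sketch under stated assumptions: the function `interleaveDelayMs`, its parameters, and the even-spacing rule are hypothetical and are not taken from xmr-stak or xmrig.

```cpp
#include <algorithm>

// Hypothetical helper: how long (in milliseconds) a thread should wait before
// starting its next hash round so that rounds on a shared GPU are evenly spaced.
//   avgRoundMs        - running average duration of one hash round
//   sinceOtherStartMs - time elapsed since the other thread started its round
//   threadsOnDevice   - number of threads sharing the device
inline double interleaveDelayMs(double avgRoundMs, double sinceOtherStartMs, int threadsOnDevice)
{
    // Nothing to interleave with a single thread or without timing data.
    if (threadsOnDevice < 2 || avgRoundMs <= 0.0)
        return 0.0;
    // Assumed target: starts are offset by an equal share of the round time.
    const double targetOffsetMs = avgRoundMs / threadsOnDevice;
    return std::max(0.0, targetOffsetMs - sinceOtherStartMs);
}
```

For two threads and a 100 ms average round, a thread that is ready 20 ms after the other one started would wait another 30 ms under this rule.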
  3. Nov 21, 2018
  4. Nov 20, 2018
  5. Nov 19, 2018
  6. Nov 17, 2018
    • psychocrypt's avatar
      change load order for backends · cf959a1c
      psychocrypt authored
      If CUDA is loaded before AMD but no CUDA is available, it can happen that the embedded OpenCL code is empty.
      This is only an issue if the binary is built statically on a different system.
      cf959a1c
  7. Nov 16, 2018
  8. Nov 06, 2018
  9. Oct 24, 2018
    • psychocrypt's avatar
      NVIDIA: fix wrong number of threads · 954296ed
      psychocrypt authored
      In the CUDA backend for monero we always start twice as many threads as needed.
      Those threads are then removed after the AES matrix is copied to shared memory.
      Nevertheless, this is the result of a copy-paste bug.
      
      - start correct number of threads for `monero`
      954296ed
  10. Oct 16, 2018
  11. Oct 15, 2018
    • psychocrypt's avatar
      fix broken AMD OpenCL compile · 2a0d565b
      psychocrypt authored
      The AMD compiler for OpenCL shipped with the driver 14XX is broken
      and cannot compile xmr-stak since the monero v8 changes were introduced.
      
      - work around a simple compare
      - add new device define `OPENCL_DRIVER_MAJOR`
      2a0d565b
  12. Oct 11, 2018
    • psychocrypt's avatar
      NVIDIA: support for multiple CUDA libs · 732b0e41
      psychocrypt authored
      Allow shipping the miner with multiple CUDA backends that depend on different driver versions.
      This makes it possible to support Turing/Volta and old Fermi GPUs within one release.
      
      - add support to search for the first working CUDA backend
      - add some more messages to support better debugging (if a user has some issues)
      732b0e41
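The "first working backend" search can be sketched as a loop over candidate libraries. This is an illustrative outline only: `firstWorkingBackend`, the candidate names, and the `tryLoad` callback are hypothetical. In a real miner the callback would wrap the platform's dynamic loader plus a device check.

```cpp
#include <functional>
#include <string>
#include <vector>

// Try each candidate backend in order and return the name of the first one
// that loads. An empty string means no backend worked.
// `tryLoad` is an assumed callback that returns true on success.
inline std::string firstWorkingBackend(
    const std::vector<std::string>& candidates,
    const std::function<bool(const std::string&)>& tryLoad)
{
    for (const auto& name : candidates)
    {
        if (tryLoad(name))
            return name;
        // A real implementation would log the failure here to aid debugging.
    }
    return std::string();
}
```

Ordering the candidates from newest to oldest driver requirement would let recent GPUs pick the newest library while older ones fall through to a compatible build.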
  13. Oct 10, 2018
    • SChernykh's avatar
      NVIDIA: tweak `get_reciprocal` · b1504b36
      SChernykh authored
      - remove helper array to perform division
      - tweak `get_reciprocal`
      b1504b36
    • psychocrypt's avatar
      NVIDIA: rename config option `comp_mode` · bd4a4c94
      psychocrypt authored
      The name `comp_mode` for a memory load pattern is a badly chosen name.
      Therefore I changed it to `mem_mode`, which also gives us the possibility
      to add new modes later if needed.
      
      - rename `comp_mode` to `mem_mode`
      - fix documentation
      bd4a4c94
    • psychocrypt's avatar
      fix right bitshift in `amd_bitalign` · b4387ac0
      psychocrypt authored
      In the current implementation the bit align uses a signed integer, which results in ones being
      shifted in when the sign bit is set.
      
      - cast to unsigned integer before using bitshift
      b4387ac0
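The difference between the two shifts can be shown in plain C++. This is a generic illustration, not the OpenCL code from the commit, and the helper names are hypothetical. The shift count is assumed to be less than 32.

```cpp
#include <cstdint>

// Right shift on a signed value: on typical compilers (and guaranteed from
// C++20 on) this is an arithmetic shift that copies the sign bit into the
// vacated high bits.
inline std::uint32_t shiftRightSigned(std::int32_t v, unsigned n)
{
    return static_cast<std::uint32_t>(v >> n);
}

// Casting to unsigned first turns it into a logical shift that fills the
// vacated high bits with zeros.
inline std::uint32_t shiftRightUnsigned(std::int32_t v, unsigned n)
{
    return static_cast<std::uint32_t>(v) >> n;
}
```

For a value with only the sign bit set, the unsigned variant shifts that single bit down, whereas the signed variant typically drags a run of ones in from the top.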
    • psychocrypt's avatar
      CUDA: fix invalid results · ed2168b4
      psychocrypt authored
      If `comp_mode` is false, the results on a Windows platform will be invalid.
      The reason is that `ulong4` is 16 bytes on Windows and 32 bytes on Linux.
      
      thx @xmrig for finding and solving the issue
      
      fix #1873
      ed2168b4
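The size mismatch follows from `unsigned long` being 4 bytes on 64-bit Windows (LLP64) and 8 bytes on 64-bit Linux (LP64). A minimal sketch of a platform-independent alternative, assuming a hypothetical struct `U64x4` that is not the type used in the actual fix:

```cpp
#include <cstdint>

// Hypothetical stand-in for a vector of four 64-bit values. Using a
// fixed-width type instead of `unsigned long` keeps the layout identical
// on Windows and Linux.
struct U64x4
{
    std::uint64_t x, y, z, w;
};

// Fails at compile time if the layout ever differs from 4 x 8 bytes.
static_assert(sizeof(U64x4) == 32, "U64x4 must be 32 bytes on every platform");
```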
  14. Oct 08, 2018
  15. Oct 07, 2018
    • psychocrypt's avatar
      fix crash with monero and strided_index · 1c0ef154
      psychocrypt authored
      Strided index 1 is not allowed for cryptonight_v8 and monero.
      If the dev pool is set to monero and the user tuned their settings for
      another currency, the miner will crash if strided index or memChunk does not
      meet the requirements for mining monero.
      This PR detects wrong configurations and sets strided index and memChunk to a valid
      value, but only for cryptonight_v8. The user pool settings are only changed if monero or
      cryptonight_v8 is selected.
      1c0ef154
    • psychocrypt's avatar
      OpenCL: fix definition range for unroll · 746037d8
      psychocrypt authored
      fix #1870
      
      - remove zero from the valid definition range for the loop unroll option
      746037d8
  16. Oct 06, 2018
  17. Oct 05, 2018
    • psychocrypt's avatar
      fix invalid shares · 8e1e7447
      psychocrypt authored
      With ROCm we fought invalid shares for a long time. This is now solved with ROCm 1.9 and
      this tiny fix.
      It is not fully clear where a memory optimization kicks in and breaks the `Groestl` kernel if the variables `M` and `H` are not `volatile`.
      The performance will not change with this fix.
      
      The fix is tested with rocm 1.9 with a VEGA64 and a RX570
      8e1e7447
    • psychocrypt's avatar
      CUDA: tune cryptonight_v8 · 99a12cb6
      psychocrypt authored
      Read memory in bigger chunks per thread to increase the used memory bandwidth.
      For Kepler and Fermi GPUs, use the old autosuggestion instead of the new settings for cryptonight_v8.
      99a12cb6
    • psychocrypt's avatar
      add cpu family and model detection · 21ce0385
      psychocrypt authored
      
      Add helper functions to select the asm version based on the number of hashes per thread and the family name of the CPU.
      
      - use the new CPU type functions to fix the wrong AMD family detection in `autoAdjust.hpp`
      - allow to set the asm version to `auto`
      - rename asm option `intel` to `intel_avx`
      - rename asm option `ryzen` to `amd_avx`
      
      Co-authored-by: fireice-uk <fireice-uk@users.noreply.github.com>
      21ce0385
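Family and model detection on x86 usually starts from the EAX value of CPUID leaf 1. The sketch below decodes that value with the standard display family/model rules. The names `CpuSignature` and `decodeCpuSignature` are hypothetical and this is not the helper added by the commit.

```cpp
#include <cstdint>

struct CpuSignature
{
    std::uint32_t family;
    std::uint32_t model;
};

// Decode the display family and model from EAX of CPUID leaf 1.
inline CpuSignature decodeCpuSignature(std::uint32_t eax)
{
    const std::uint32_t baseFamily = (eax >> 8) & 0xF;
    const std::uint32_t baseModel = (eax >> 4) & 0xF;
    const std::uint32_t extFamily = (eax >> 20) & 0xFF;
    const std::uint32_t extModel = (eax >> 16) & 0xF;

    CpuSignature sig{baseFamily, baseModel};
    // The extended family is only added when the base family is 0xF.
    if (baseFamily == 0xF)
        sig.family += extFamily;
    // The extended model supplies the high nibble for base families 6 and 0xF.
    if (baseFamily == 0x6 || baseFamily == 0xF)
        sig.model += extModel << 4;
    return sig;
}
```

A signature of `0x00800F11` decodes to family 23 (0x17), model 1, and `0x000506E3` decodes to family 6, model 94 (0x5E). A selector for the asm version could then branch on the vendor together with these two numbers.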