SChernykh
50bdaba526
Fixed Debug build in Visual Studio
2020-10-27 14:08:36 +01:00
SChernykh
4bac3e7695
Fix 32-bit compilation
2020-10-07 18:19:35 +02:00
xmrig
59bd6d4187
Merge pull request #1878 from SChernykh/dev
...
Fixed ARM compilation
2020-10-07 23:11:39 +07:00
SChernykh
166c011d37
Fixed ARM compilation
2020-10-07 18:09:42 +02:00
xmrig
1289942567
Merge pull request #1876 from SChernykh/dev
...
RandomX: added `huge-pages-jit` config parameter
2020-10-07 22:48:57 +07:00
SChernykh
44dcded866
RandomX: added huge-pages-jit config parameter
...
Set to `false` by default. Gives a 0.2% boost on Ryzen 7 3700X with 16 threads, but hashrate on Ryzen may vary between launches. Use with caution.
2020-10-07 17:42:55 +02:00
cohcho
a705ab775b
RandomX: align args
...
tempHash/output must be 16-byte aligned for randomx_calculate_hash{,_first,_next}
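
The alignment requirement above can be illustrated with a minimal sketch. `HashBuffer` and `isAligned16` are hypothetical names introduced here for illustration, not identifiers from this commit; the 32-byte size assumes the standard RandomX hash length.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: a RandomX hash is 32 bytes long.
constexpr std::size_t kHashSize = 32;

// Hypothetical wrapper: alignas(16) makes every instance start on a
// 16-byte boundary, whether it lives on the stack or inside another object.
struct alignas(16) HashBuffer {
    std::uint8_t data[kHashSize];
};

// Returns true when the pointer is a multiple of 16, i.e. safe to use
// with aligned 128-bit loads and stores.
inline bool isAligned16(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

static_assert(alignof(HashBuffer) == 16, "HashBuffer must be 16-byte aligned");
```

A caller could check `isAligned16(buffer.data)` in a debug build before passing the buffer to the hashing functions.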
2020-10-07 14:47:18 +00:00
xmrig
116fb3d3f9
Merge pull request #1864 from cohcho/soft_aes_optimization2
...
soft_aes: fix previous optimization
2020-10-05 12:20:41 +07:00
cohcho
5f0f2506e8
soft_aes: fix previous optimization
...
The previously removed unrolled variant is faster on some CPUs
Some CPUs are faster with the added unrolled variant
On some CPUs the best variant depends on the number of threads
2020-10-04 14:47:58 +00:00
SChernykh
ebf259fa7c
RandomX: removed rx/loki
...
Loki forks to PoS on October 9th.
2020-10-02 17:02:52 +02:00
XMRig
d45bb24a32
Renamed WITH_SSE to WITH_SSE4_1 and make it work on all platforms.
2020-10-01 11:00:08 +07:00
SChernykh
7b4f768114
RandomX: optimized soft AES code
...
Unrolled loop was 5-10% slower depending on CPU.
2020-09-29 21:22:11 +02:00
xmrig
dfab81e9fa
Merge pull request #1858 from SChernykh/dev
...
RandomX: removed duplicate constants in Blake2b
2020-09-27 16:51:03 +07:00
SChernykh
3025c265e8
RandomX: removed duplicate constants in Blake2b
2020-09-27 11:50:08 +02:00
xmrig
ee603ab9e2
Merge pull request #1857 from SChernykh/dev
...
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 16:47:56 +07:00
SChernykh
84f8a0dc54
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 11:46:32 +02:00
cohcho
9be3b69109
soft_aes: fix previous optimization
...
the best order of hash/fill/prefetch depends on hw/soft AES
only hw AES is faster after previous optimization
2020-09-25 15:26:19 +00:00
SChernykh
1e26e58660
Fix for ARM compilation
2020-09-23 11:44:08 +02:00
SChernykh
9768bf65d1
RandomX: improved performance of GCC-compiled binaries
...
The JIT compiler was slower than in MSVC-compiled binaries. Up to +0.1% speedup on rx/wow on Linux.
2020-09-22 13:48:11 +02:00
SChernykh
891a46382e
RandomX: AES improvements
...
- A bit faster hardware AES code when compiled with MSVC
- More reliable software AES benchmark
2020-09-21 17:51:08 +02:00
SChernykh
c7476e076b
RandomX refactoring, moved more stuff to compile time
...
Small x86 JIT compiler speedup.
2020-09-18 20:51:25 +02:00
SChernykh
8d1168385a
RandomX: returned old soft AES impl and auto-select between the two
2020-09-15 20:48:27 +02:00
SChernykh
a05393727c
RandomX: added performance profiler (for developers)
...
Also optimized Blake2b SSE4.1 code size to avoid code cache pollution.
2020-09-12 23:07:52 +02:00
SChernykh
4a9db89527
RandomX: added SSE4.1-optimized Blake2b
...
+0.15% on `rx/0`
+0.3% on `rx/wow`
2020-09-10 14:28:40 +02:00
SChernykh
a84b45b1bb
RandomX: added parameter for scratchpad prefetch mode
...
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
2020-09-04 16:16:07 +02:00
XMRig
72c8404d18
Fix compile warnings.
2020-08-24 10:04:46 +07:00
XMRig
3e4bf8cd6c
Fix compile warning
2020-08-17 06:08:14 +07:00
XMRig
00b4ae9c36
Fixed compile warning and updated build.uv.sh.
2020-08-16 16:03:27 +07:00
SChernykh
5926dee354
RandomX JIT: optimized address mask calculation
2020-08-12 16:45:16 +02:00
SChernykh
5bc89fdc8b
Fixed RandomX initialization for VS debug builds
2020-07-21 10:10:07 +02:00
SChernykh
3d740e81a2
RandomX: tweaked Ryzen code
...
Very small speedup
2020-07-05 16:06:59 +02:00
XMRig
b34e3e1a7b
Remove unused code.
2020-05-04 02:07:38 +07:00
SChernykh
80d944bf82
Optimized RandomX dataset initialization
...
- Use a single Argon2 implementation
- Auto-select the fastest Argon2 implementation for RandomX
2020-05-03 20:44:59 +02:00
XMRig
c18478a6b4
Small cleanups.
2020-05-03 13:38:34 +07:00
SChernykh
bfd017d064
Refactored CFROUND
2020-04-21 15:44:04 +02:00
SChernykh
abb3340cc7
RandomX JIT refactoring
...
- Smaller memory footprint
- A bit faster overall
2020-04-09 14:24:54 +02:00
SChernykh
92810ad761
Fixed VM destruction
2020-04-08 08:31:53 +02:00
SChernykh
39bd3ca1da
Fix off-by-one error
2020-04-07 18:53:08 +02:00
SChernykh
4d0edde66d
Fixed pool lock
2020-04-07 18:48:02 +02:00
SChernykh
69cbfd682a
Use node number instead of affinity
2020-04-07 18:46:22 +02:00
SChernykh
6ae37a9519
Pooled allocation of RandomX VMs
...
+0.5% speedup on Zen2 when the whole L3 cache is used (16 threads on 3700X/3800X, 32 threads on 3950X).
2020-04-07 18:31:35 +02:00
kevacoin
0528ccd01e
Added Keva.
2020-03-04 16:23:33 -08:00
XMRig
616c52f266
#1572 Fix compile warning.
2020-03-01 11:59:53 +07:00
SChernykh
131085be80
Optimized CFROUND
...
Shorter version using BMI2 instructions
2020-02-21 19:00:58 +01:00
SChernykh
e1b8f52e59
Fixed 32-bit compilation
2020-02-21 16:08:23 +01:00
SChernykh
0caeb41bff
Tuned JIT compiler
...
0.3-0.4% speedup depending on CPU.
2020-02-20 20:59:22 +01:00
SChernykh
ffc9f67751
Crash fix for Bulldozer CPUs
2020-02-02 17:16:59 +01:00
SChernykh
cd763be05b
Fix compile error
2020-01-24 14:09:07 +01:00
SChernykh
42a7194e93
Fix crash on Linux
2020-01-24 13:34:12 +01:00
SChernykh
9f1753cc4f
Optimized CFROUND
2020-01-22 20:11:00 +01:00
SChernykh
d342968211
Added support for BMI2 instructions
2020-01-21 19:44:56 +01:00
SChernykh
f80177cbd3
Optimizations for AMD Bulldozer
...
- Added support for XOP instructions
- Enabled Ryzen code for Bulldozer because it's faster there too
2020-01-15 13:04:26 +01:00
SChernykh
73722ce186
JIT compiler: removed unnecessary memcpy from generateProgram()
2020-01-13 18:00:41 +01:00
SChernykh
eb20dfbc94
JIT compiler tweaks
2020-01-06 13:57:48 +01:00
SChernykh
c9f90e6770
Refactor Ryzen fix to fix compilation issues
2019-12-31 11:55:07 +02:00
XMRig
ac4086b273
Fix build.
2019-12-28 02:00:08 +07:00
SChernykh
3a2941b719
Fix for 1st-gen Ryzen crashes
2019-12-27 12:40:38 +02:00
XMRig
dbb721cb5e
Removed "rx/v" algorithm.
2019-12-26 22:34:19 +07:00
XMRig
22eca8e0d5
Fixed memory allocation checks.
2019-12-25 04:39:21 +07:00
Tony Butler
45412a2ace
Add MoneroV (rx/v) algorithm [based on MoneroOcean/master]
2019-12-18 16:17:22 -07:00
SChernykh
c01c035269
Fixed crash with GCC compiler
2019-12-18 17:32:57 +01:00
SChernykh
f85aba5d21
Fixed AVX detection
2019-12-18 12:20:21 +01:00
SChernykh
f8bf8fddd9
Update jit_compiler_x86_static.S
2019-12-18 09:13:21 +01:00
SChernykh
7459677fd5
Add vzeroupper for processors with AVX
...
To avoid false dependencies on upper 128 bits of YMM registers.
2019-12-18 09:12:25 +01:00
SChernykh
4da37baf8c
RandomSFX (Safex Cash variant) support
2019-12-16 19:36:29 +01:00
SChernykh
ef522f6404
Update jit_compiler_x86_static.S
2019-12-09 20:30:37 +01:00
SChernykh
763691fa4b
More optimizations for Ryzen
2019-12-09 20:29:05 +01:00
XMRig
d32df84ca5
Memory allocation refactoring.
2019-12-08 23:17:39 +07:00
SChernykh
028b335bac
Fix GCC compilation
2019-12-08 16:51:37 +01:00
SChernykh
ffec421408
Fixed indentation
2019-12-08 16:20:46 +01:00
SChernykh
d0df824599
Optimized dataset read for Ryzen CPUs
...
Removed register dependency in dataset read, +0.8% speedup on average.
2019-12-08 16:14:02 +01:00
SChernykh
1fbbae1e4a
Added 1GB hugepages support for Linux
2019-12-05 19:39:47 +01:00
SChernykh
84d7eb05f3
RandomX fixes
...
Intel JCC erratum fix and various other improvements, see more here: https://www.phoronix.com/scan.php?page=article&item=intel-jcc-microcode&num=1
2019-12-01 08:46:35 +01:00
SChernykh
e3f726796b
Use XMRIG_ARMv8 macro
2019-11-15 16:12:26 +01:00
SChernykh
3953568a0e
Fix for 32-bit ARM compilation
2019-11-15 16:00:48 +01:00
SChernykh
472ec1a0e6
Fix function names for clang on Apple
2019-11-12 14:42:21 +01:00
SChernykh
578bebb04d
Prefer sys_icache_invalidate on iOS
...
Also break compilation with error if clear cache is not available
2019-10-18 18:17:57 +02:00
SChernykh
5611249af7
Fixed __builtin___clear_cache detection
2019-10-18 18:04:13 +02:00
SChernykh
0ad992985c
Update jit_compiler_a64.cpp
2019-10-18 16:36:50 +02:00
SChernykh
1a66c3f1a1
Update jit_compiler_a64.cpp
2019-10-18 16:32:01 +02:00
SChernykh
a2ef2fd9d9
Update jit_compiler_a64.cpp
2019-10-18 16:28:49 +02:00
SChernykh
998c55030a
Fixed code cache cleanup on iOS/Darwin
2019-10-18 16:26:15 +02:00
XMRig
5c02cb50da
Fix copy/paste typo.
2019-10-18 21:26:15 +07:00
SChernykh
432addab33
Fix ARM64 code alignment
2019-10-18 16:18:45 +02:00
XMRig
10d292092a
#1246 Fixed build on iOS.
2019-10-18 12:02:10 +07:00
SChernykh
c9798ba2e9
Sync with latest RandomX code
...
Fix a possible out-of-bounds access in superscalar generator
2019-10-13 22:13:29 +02:00
SChernykh
2b29a4c20f
RandomX (Arqma variant) support
2019-10-08 19:00:19 +02:00
SChernykh
10f9b29e03
Refactored JIT compiler for x86, small RandomX speedup
2019-10-05 21:40:21 +02:00
SChernykh
1bba25e080
Set scratchpad pointer to null by default
...
To avoid freeing random blocks of memory in some cases.
2019-09-24 08:53:00 +02:00
SChernykh
c6096c3c34
Workaround for a bug in binutils-2.32-1 on ARM
...
ldr/madd instruction sequence makes compiled binary crash, so separate them.
2019-09-23 23:12:40 +02:00
XMRig
cbdf1e6c09
Revert instructions_portable.cpp to avoid warning on gcc compilers.
2019-09-22 00:59:53 +07:00
SChernykh
38f4f4f695
Added JIT compiler for RandomX on ARMv8
2019-09-21 10:10:52 +02:00
XMRig
6f5d175d12
Fix compile warning, mostly struct/class inconsistency.
2019-09-13 18:21:05 +07:00
SChernykh
2322e3bcf7
RandomX: optimized loading from scratchpad
...
Prefetches scratchpad data as soon as possible to calculate data address for the next load.
Up to ~1.4% speedup on Ryzen 7 3700X @ 4.1 GHz, RAM 3200 MHz 14-14-14-28 with optimized sub-timings:
Variant|Before H/S|After H/S
-------|----------|---------
rx/0|8663|8777
rx/wow|9867|10009
rx/loki|8652|8731
2019-09-11 19:10:01 +02:00
SChernykh
dc5843651b
Optimized CFROUND
...
One less micro-op
2019-09-04 20:47:47 +02:00
SChernykh
d3f98ef7bc
RandomX optimizations
...
- Optimized soft AES code, up to +30% hashrate on CPU without AES support
- Added prefetch for the first dataset access, up to +0.1% hashrate
2019-09-04 19:24:12 +02:00
Matt Smith
df973763bb
Fix linker marking entire executable as executable stack
...
See: https://wiki.ubuntu.com/SecurityTeam/Roadmap/ExecutableStacks
See: https://wiki.gentoo.org/wiki/Hardened/GNU_stack_quickstart
2019-08-29 14:12:43 +01:00
SChernykh
0a58781b0c
Reverted intrin_portable.h
2019-08-28 07:20:01 +02:00
SChernykh
8b84d7650b
Optimized RandomX JIT compiler
...
Hashrate improved by 0.5-1.5% depending on RandomX version and CPU.
2019-08-27 20:18:56 +02:00
SChernykh
21a56c9cbf
Updated RandomX
2019-08-27 16:12:13 +02:00