GPU threading optimization for frequent work updates #4

cdetrio · 2017-04-23T19:06:53Z

No description provided.

chfast

This works without issues on Nvidia OpenCL. It can be merged now or can wait for incoming changes.

chfast · 2017-04-25T19:13:45Z

Can you rebase this?

pilsy · 2017-06-29T04:18:10Z

libethcore/EthashGPUMiner.cpp

+		//if (m_abort || m_owner->shouldStop())
+		//	return (m_aborted = true);
+		//return false;
+		// TODO: still need a way to stop the GPU thread


Does this still need to be solved?

Yes, that's still needed. For example, if the miner's internet connection goes down then maybe there should be a timeout so the GPU stops mining when there are no new work packages coming in. Currently it will just continue to mine uncles indefinitely.

The changes I was working on don't quite fit with yours: I am changing the way the kernels are run. Instead of running one kernel, then launching a second before reading the results of the first, then swapping and repeating, I am switching to one main kernel plus one that halts/aborts the job.

So rather than having the kernel finish and come back to the CPU so often, when a job comes in the miner is created and starts working, and it won't stop until you abort its execution. The idea is that the CPU is notified of newWork and simply fires off the abort kernel to kill the currently running one, then queues up the new job and fires it off.

How I'm accomplishing this with the kernels:

Create two command queues in a context.

Create two kernels, one to do the work and another to halt execution. Each kernel has access to a shared global buffer.

Load the first kernel into queue1.

Load the second kernel into queue2 when we want to halt execution.

Alternatively, we could use an out-of-order queue and load the second kernel into the same command queue to halt execution. But I think that would be more trouble, because we would then have to be very careful to call clFinish/clFlush where necessary. However, it may be a more natural way of writing this. What do you think?
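A rough CPU-side analogy of the two-kernel scheme, in case it helps the discussion. This is not OpenCL: two std::thread objects stand in for the two command queues, and a struct with atomics stands in for the shared global buffer. The names SharedState, workKernel, haltKernel and runJobUntilAborted are made up for this sketch. Whether a real device actually runs kernels from two queues concurrently depends on the OpenCL implementation, so treat this only as an illustration of the control flow.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>

// Stand-in for the global buffer that both kernels can see.
struct SharedState {
    std::atomic<bool> abortFlag{false};
    std::atomic<std::uint64_t> batches{0};
};

// Stand-in for the main kernel: keeps working until the flag is set.
void workKernel(SharedState& state) {
    while (!state.abortFlag.load(std::memory_order_acquire)) {
        // A real kernel would hash a batch of nonces here.
        state.batches.fetch_add(1, std::memory_order_relaxed);
    }
}

// Stand-in for the halt kernel: its only job is to set the flag.
void haltKernel(SharedState& state) {
    state.abortFlag.store(true, std::memory_order_release);
}

// Starts one job, lets it complete at least minBatches batches, then
// aborts it. Returns the number of batches the worker completed.
std::uint64_t runJobUntilAborted(std::uint64_t minBatches) {
    SharedState state;
    std::thread queue1(workKernel, std::ref(state));  // "queue 1"
    while (state.batches.load(std::memory_order_relaxed) < minBatches) {
        std::this_thread::yield();  // host waits, e.g. for new work
    }
    std::thread queue2(haltKernel, std::ref(state));  // "queue 2"
    queue2.join();
    queue1.join();
    return state.batches.load();
}
```

On new work the host would call something like runJobUntilAborted once per job: fire the halt, wait for the worker to exit, then start the next job.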

@pilsy I don't quite understand what you're describing, but I also have limited knowledge of OpenCL kernels. I just posted an explanation of the optimization in this PR. Perhaps you could expand a bit on your scheme in the conversation thread?

The only other code I haven't yet pushed is some profiling lines. I'm hoping to finally push the profiling code over the next few days, and double check the improvement.

To simplify: the main kernel is while (true) { if (abort) break; ... }, and the second one is just abort = true;.

But I don't get why you need the second kernel to set the abort flag. Isn't a buffer shared with the CPU enough?

Back to the problem, I think the miner thread should just have a flag to stop (maybe it already has). The decision when to stop should be delegated to the network thread. E.g. if the stratum thread is not able to get the work for some time it should stop the miner thread. Does it make sense?
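A minimal sketch of that delegation, assuming the network thread keeps a timestamp of the last work package it received. WorkWatchdog and its methods are hypothetical names, not existing ethminer code. Time points are passed in explicitly so the logic can be checked without sleeping.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper owned by the network (e.g. stratum) thread.
class WorkWatchdog {
public:
    explicit WorkWatchdog(std::chrono::seconds timeout,
                          Clock::time_point now = Clock::now())
        : m_timeout(timeout), m_lastWork(now) {
        assert(timeout.count() > 0);
    }

    // Call whenever a new work package arrives.
    void onNewWork(Clock::time_point now = Clock::now()) { m_lastWork = now; }

    // True once no work has arrived for longer than the timeout.
    // The network thread would then set the miner thread's stop flag.
    bool shouldStopMiner(Clock::time_point now = Clock::now()) const {
        return now - m_lastWork > m_timeout;
    }

private:
    std::chrono::seconds m_timeout;
    Clock::time_point m_lastWork;
};
```

The miner thread itself stays simple this way: it only reads a stop flag, and the policy (how long to wait, whether to resume) lives next to the connection code.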

cdetrio · 2017-06-29T21:40:15Z

The idea here is to reduce the "switching cost" when new work packages arrive.

Currently, whenever a new work package arrives the setWork function is called, which triggers a pause()/kickOff() cycle. pause() sets m_abort to true, which causes the kernel loop to break when it calls the searched hook. This means that the GPU thread (i.e. the kernel loop) is stopped and restarted on every new work package.

This PR is an optimization which keeps the GPU thread running continuously, instead of stopping and restarting. Rather than the GPU thread calling a hook function on every loop iteration to check if the loop should break, it calls the hook function to check if a new work package has arrived, gets the new work if so, writes the new work to the GPU buffer, and then continues the kernel loop.
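The loop described above could look roughly like the following. This is a simplified, single-threaded sketch, not the code in the PR: WorkPackage is reduced to one integer, the hook signature GetNewWork is an assumption, and searchLoop records which header each batch ran against instead of launching a kernel.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Simplified stand-in for a work package (the real one carries hashes).
struct WorkPackage {
    std::uint64_t header;
};

// Hook polled on every iteration. Returns true and fills `out` only
// when new work has arrived since the last call. Assumed signature.
using GetNewWork = std::function<bool(WorkPackage& out)>;

// Runs `iterations` search batches without ever leaving the loop.
// Returns the header each batch was searched against.
std::vector<std::uint64_t> searchLoop(WorkPackage current,
                                      const GetNewWork& getNewWork,
                                      int iterations) {
    assert(iterations >= 0);
    std::vector<std::uint64_t> searched;
    for (int i = 0; i < iterations; ++i) {
        WorkPackage fresh;
        if (getNewWork(fresh)) {
            // Real code would rewrite the header in the GPU buffer here.
            current = fresh;
        }
        // Stand-in for launching one batch on the GPU.
        searched.push_back(current.header);
    }
    return searched;
}
```

The point of the shape is that a new work package costs one buffer write inside the loop, rather than a pause()/kickOff() cycle that tears the loop down and starts it again.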

ghost · 2017-07-05T23:59:34Z

The idea sounds very good to me. I think continuous GPU work instead of start/stop behaviour will improve our miner a little bit. 👍

@cdetrio Is your code ready to be merged, and is it well tested (in terms of "still working and same or better hashrate than the current master code")?

chfast · 2017-07-06T06:31:40Z

This code requires some internal changes and small improvements, but it should work.

chfast · 2017-07-06T15:10:45Z

ethminer/MinerAux.h

@@ -855,19 +854,6 @@ class MinerCLI
 					}
 					//exit(0);
 				}
-				else if (EthashAux::eval(previous.seedHash, previous.headerHash, solution.nonce).value < previous.boundary)


Do you know what this was about?

It was to support the second of two work packages, current.headerHash and previous.headerHash, where previous is the "stale" block. It checked whether a solution was for the previous block. There was an issue about it here: https://github.com/Genoil/cpp-ethereum/issues/36 (claims to be a 1.2% improvement).

When new work updates are coming very frequently (e.g. adding new tx's to the pending block every 1 second), sometimes a solution to previous is found just as a new one is arriving. But when the new work package arrives, the one in current becomes previous, and the one in previous is purged. So keeping around only the last two work packages is not enough to prevent throwing away solutions.

Rather than keep around three work packages to check solutions against, the solutions themselves now contain the work package information (seedHash, headerHash, nonce, boundary). So now it just checks that the solution is valid before submitting it to the client (eth_submitWork), rather than checking if it's a solution to the current (or previous) work package.
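To illustrate the self-contained solution idea: the sketch below shrinks the 256-bit hashes to 64-bit integers and replaces EthashAux::eval with a trivial toyEval function, so it is not the real algorithm. The Solution struct layout and the isValid helper are assumptions based on the four fields listed above.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: real hashes and boundaries are 256-bit values.
struct Solution {
    std::uint64_t seedHash;
    std::uint64_t headerHash;
    std::uint64_t nonce;
    std::uint64_t boundary;
};

// Toy stand-in for the Ethash evaluation (NOT the real algorithm).
std::uint64_t toyEval(std::uint64_t seedHash, std::uint64_t headerHash,
                      std::uint64_t nonce) {
    return (seedHash ^ headerHash) + nonce;
}

// A solution is checked against the work package it was found for,
// which it carries itself, regardless of which package is current now.
bool isValid(const Solution& s) {
    return toyEval(s.seedHash, s.headerHash, s.nonce) < s.boundary;
}
```

Because the check no longer consults current or previous, a solution found for a package that has since been replaced two or three times can still be validated and submitted.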

cdetrio · 2017-07-06T16:35:31Z

Rebased on master.

EDIT: fixed rebase

cdetrio · 2017-07-06T20:54:51Z

Pushed some profiling branches.

Baseline (unoptimized) work switching cost: https://github.com/ethereum-mining/ethminer/commits/switch-profile

Optimized fast switching cost: https://github.com/ethereum-mining/ethminer/compare/fast-switch-profile

Obsolete

ghost · 2017-07-26T20:25:55Z

Hi @cdetrio / @chfast,
to me these improvements sound very good. What is the current state? Is it ready to be merged?

chfast · 2017-07-27T10:41:32Z

I went through a big refactoring first. Some of the changes are in the master branch already (don't stop after finding a solution). Some are pending: #200. Some are missing (don't restart mining on new work, just push the new work package to the miner thread).

This PR was left for reference, because rebasing it makes no sense now.

chfast · 2017-09-13T12:00:06Z

This has been done in #217.

chfast previously approved these changes Apr 24, 2017

cdetrio force-pushed the no-stale branch 3 times, most recently from 8ffa298 to f1588eb Compare April 24, 2017 15:22

cdetrio changed the title ~~submit solutions for all work packages~~ GPU threading optimization for frequent work updates Apr 24, 2017

chfast force-pushed the master branch from 7ad10e3 to aabbf15 Compare May 24, 2017 08:56

chfast force-pushed the no-stale branch from f1588eb to b24e315 Compare May 30, 2017 15:42

pilsy reviewed Jun 29, 2017

chfast force-pushed the no-stale branch 2 times, most recently from 6af7ce4 to bf82318 Compare July 3, 2017 20:55

chfast reviewed Jul 6, 2017

cdetrio force-pushed the no-stale branch from bf82318 to 99d19cc Compare July 6, 2017 16:06

cdetrio force-pushed the no-stale branch from 99d19cc to c54d25b Compare July 6, 2017 16:54

chfast force-pushed the no-stale branch from 6f29e8d to 7b9c0a6 Compare July 7, 2017 10:19

chfast force-pushed the no-stale branch 2 times, most recently from 6de5b48 to 2cd97ff Compare July 7, 2017 20:26

cdetrio and others added 3 commits July 18, 2017 11:21

getNewWork hook function for openCL GPU thread

f3d00e5

Simplify code

8886109

Fix uninitialized variable

3e9b5f1

chfast force-pushed the no-stale branch from 2cd97ff to 3e9b5f1 Compare July 18, 2017 09:22

Clean up code around start_nonce

8a816a5

chfast mentioned this pull request Aug 2, 2017

CLMiner: Optimization for frequent work updates #217

Merged

chfast closed this Sep 13, 2017

MariusVanDerWijden mentioned this pull request Oct 27, 2017

CUDA: Optimization for frequent work updates #359

Closed

kkkrackpot mentioned this pull request Dec 7, 2017

GTX960 + Linux: benchmark OK, but cannot mine #423

Closed

aotto1968 mentioned this pull request Jan 8, 2018

bin/ethminer → SEGFAULT #512

Closed

joequant mentioned this pull request Mar 14, 2018

asio crash on socket disconnect #887

Closed

This was referenced Jun 24, 2018

-X Mixed Mode is Not Using all CUDA Cards & appears to be reducing the speed too.. #1301

Closed

ethminer 0.15.0rc2 -X Mixed Mode doesn't use all CUDA Cards & Reduces the speed #1306

Closed

MiningGuru mentioned this pull request Aug 19, 2018

Reconnect to nanopool after ADSL/relogin after 24hours forced separation no new jobs! #1471

Closed

AndreaLanfranchi added a commit that referenced this pull request Nov 26, 2018

Merge pull request #4 from AndreaLanfranchi/enable-cuda10-build

e679983

Enable cuda10 build

mattglt mentioned this pull request May 10, 2021

Crash with SIGSEGV Error #2258

Open

hlfritz mentioned this pull request May 19, 2021

Boost fails to download for configure/build #2309

Open
