Skip to content
This repository was archived by the owner on Apr 24, 2022. It is now read-only.

GPU threading optimization for frequent work updates #4

Closed
wants to merge 4 commits into from

Conversation

cdetrio
Copy link
Contributor

@cdetrio cdetrio commented Apr 23, 2017

No description provided.

chfast
chfast previously approved these changes Apr 24, 2017
Copy link
Contributor

@chfast chfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works without issues on Nvidia OpenCL. It can be merged now or can wait for incoming changes.

@cdetrio cdetrio force-pushed the no-stale branch 3 times, most recently from 8ffa298 to f1588eb Compare April 24, 2017 15:22
@cdetrio cdetrio changed the title submit solutions for all work packages GPU threading optimization for frequent work updates Apr 24, 2017
@chfast
Copy link
Contributor

chfast commented Apr 25, 2017

Can you rebase this?

//if (m_abort || m_owner->shouldStop())
// return (m_aborted = true);
//return false;
// TODO: still need a way to stop the GPU thread
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still need to be solved?

Copy link
Contributor Author

@cdetrio cdetrio Jun 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's still needed. For example, if the miner's internet connection goes down then maybe there should be a timeout so the GPU stops mining when there's no new work packages coming in. Currently it will just continue to mine uncles indefinitely.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change's that i was doing up doesn't quite fit in perfectly with your changes, i am changing the way the kernels are run from the current way. Instead of running 1 kernel then running 2nd kernel before reading the results of 1st, then swap/swap... changing it over to use one main kernel, and one that halts/aborts the job.

So rather than having the kernel finishing and coming back to CPU so often, when a job comes in that miner is created and starts working, it won't ever stop working until you abort it's execution - the idea being that on the CPU you are notified of newWork and simply fire off the abort kernel to kill off the current running one, then queue up this job and fire it off.

How i'm accomplishing this with the kernels:

  1. Create two command queues in a context.
  2. Create two kernels, one to do the work and another to halt execution. Each kernel has access to a shared global buffer.
  3. Load the first kernel into queue1.
  4. Load the second kernel into queue2 when we want to halt execution.

(Alternatively we could use an out-of-order queue and load the second kernel into the same command queue to halt execution -- But i think that would be more trouble because we have to then be very careful and use clFinish/clFlush as required/necessary. However it may be a more natural way of writing/doing this... what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pilsy I don't quite understand what you're describing, but I also have limited knowledge of openCL kernels. I just posted an explanation of the optimization in this PR. Perhaps you could expand a bit on your scheme, in the convo thread?

The only other code I haven't yet pushed is some profiling lines. I'm hoping to finally push the profiling code over the next few days, and double check the improvement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify: the main kernel is while(true) { if (abort); break; ...}, the second one is just abort = true;.

But I don't get why you need the second kernel to set to abort flag. A buffer shared with CPU is not enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back to the problem, I think the miner thread should just have a flag to stop (maybe it already has). The decision when to stop should be delegated to the network thread. E.g. if the stratum thread is not able to get the work for some time it should stop the miner thread. Does it make sense?

@cdetrio
Copy link
Contributor Author

cdetrio commented Jun 29, 2017

The idea here is to reduce the "switching cost" when new work packages arrive.

Currently, whenever a new work package arrives the setWork function is called, which triggers a pause()/kickOff() cycle. pause() sets m_abort to true, which causes the kernel loop to break when it calls the searched hook. This means that the GPU thread (i.e. the kernel loop) is stopped and restarted on every new work package.

This PR is an optimization which keeps the GPU thread running continuously, instead of stopping and restarting. Rather than the GPU thread calling a hook function on every loop iteration to check if if the loop should break, it calls the hook function to check if a new work package has arrived, gets the new work if so, writes the new work to the GPU buffer, and then continues the kernel loop.

@chfast chfast force-pushed the no-stale branch 2 times, most recently from 6af7ce4 to bf82318 Compare July 3, 2017 20:55
@ghost
Copy link

ghost commented Jul 5, 2017

The idea sounds very good to me. I think the idea of a fluent GPU work instead of a start/stop behaviour will improve our miner a little bit. 👍

@cdetrio Is you code somehow ready to get merged and is well tested (in terms of "still working and same or better hashrate then the current master code")?

@chfast
Copy link
Contributor

chfast commented Jul 6, 2017

This code requires some internal changes and small improvements. But should work.

@@ -855,19 +854,6 @@ class MinerCLI
}
//exit(0);
}
else if (EthashAux::eval(previous.seedHash, previous.headerHash, solution.nonce).value < previous.boundary)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what was this about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to support the second of two work packages: current.headerHash and previous.headerHash, where previous is the "stale" block. This was to check if a solution was for the previous block, there was an issue about it here: https://github.com/Genoil/cpp-ethereum/issues/36 (claims to be a 1.2% improvement).

When new work updates are coming very frequently (e.g. adding new tx's to the pending block every 1 second), sometimes a solution to previous is found just as a new one is arriving. But when the new work package arrives, the one in current becomes previous, and the one in previous is purged. So keeping around only the last two work packages is not enough to prevent throwing away solutions.

Rather than keep around three work packages to check solutions against, the solutions themselves now contain the work package information (seedHash, headerHash, nonce, boundary). So now it just checks that the solution is valid before submitting it to the client (eth_submitWork), rather than checking if its a solution to the current (or previous) work package.

@cdetrio
Copy link
Contributor Author

cdetrio commented Jul 6, 2017

Rebased on master.

EDIT: fixed rebase

@cdetrio
Copy link
Contributor Author

cdetrio commented Jul 6, 2017

Pushed some profiling branches.

Baseline (unoptimized) work switching cost: https://github.com/ethereum-mining/ethminer/commits/switch-profile

Optimized fast switching cost: https://github.com/ethereum-mining/ethminer/compare/fast-switch-profile

@chfast chfast dismissed their stale review July 7, 2017 10:20

Obsolate

@chfast chfast force-pushed the no-stale branch 2 times, most recently from 6de5b48 to 2cd97ff Compare July 7, 2017 20:26
@ghost
Copy link

ghost commented Jul 26, 2017

Hi @cdetrio / @chfast ,
for me this improvements sounds very good - what is the current state? Is it ready to get merged?

@chfast
Copy link
Contributor

chfast commented Jul 27, 2017

I went through big refactoring first. Some of the changes are in the master branch already (don't stop after finding solution). Some are pending: #200. Some are missing (don't restart mining on new work, just push new work package to the miner thread).

This PR was left for reference, because rebasing it make no sense now.

@chfast
Copy link
Contributor

chfast commented Sep 13, 2017

This has been done in #217.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants