Skip to content

Pycaffe improvements from OpenCL branch #6209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

Noiredd
Copy link
Member

@Noiredd Noiredd commented Feb 2, 2018

I'm bringing @naibaf7's pycaffe improvements from OpenCL branch to Master. Some pretty cool features have been implemented, most importantly exposure of SolverParameter, possibility to reinstatiate Solver, which would address #2766, and even update its parameters on-the-fly (#2686).
Somewhat related PRs: #4342, #5549 and #6238.

Unfortunately many of the changes are interleaved with the evolution of that branch itself (modifications of MemoryDataLayer, adoption of int_tp type, some multi-device stuff, etc.) so it's not possible to just pull them over to master - plenty of conflicts arise due to interdependencies.
This is a WIP with a request for feedback or ideas, especially from you @naibaf7 as the original author of those changes. I have tracked the history of OpenCL's pycaffe, dissecting every commit since July 2015, cherry-picking the relevant ones and cleaning them from OpenCL-specific code. Here is my plan of attack for this (ticked the already copied commits):

  • 61a05f3 Python update, Lint fixes. and d82979b Softmax fix for >4 shape dims.
    A heavy modification of the MemoryDataLayer and other stuff which I do not want to incorporate here; however, it exposes Solver::max_iter and allows passing a layer index in Net_SetInputArrays, so I just took the relevant pieces as a new commit.
  • faf003a Pycaffe check update. and 5ebebfb Pycaffe fixed.
    ignored, follow-ups to the MemoryDataLayer and int_tp changes
  • 01139b0 Fix for Malis, Softmax. and 165d231 Removed duplicate function.
    ignored, irrelevant to Master but introduce some conflicts around .def("step")
  • 984745d Expose SolverParameter to Python.
    This introduces an issue because we already have a binding to SolverParameter (added in e21b420 Python Multi-GPU). Since this commit registers a new binding, every time we import caffe Python interpreter will now output the following warning:
    /usr/local/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for caffe::SolverParameter already registered; second conversion method ignored.
    from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
    I am tempted to remove the old binding (perhaps moving the layer_wise_reduce accessor to the new one since unlike max_iter and display it isn't doubled) but I'm not sure what consequences will that have. What do you think @naibaf7?
  • e2f8e70 Improved pycaffe interface.
    Changes meaning of get_solver - will cause make pytest throw errors until 188aeae is merged.
  • e2f8e70 Extended python interface.
    ignored, requires some other commit for Caffe::EnumerateDevices
  • 52732be Merge, stronger PyCaffe interface. (renames SolverParam to SolverParameter)
  • 97f9648 Multi-device advancements.
    ignored, the change only relevant to OpenC's multi-device support
  • 03276d4 updated pycaffe init (adds get_solver_from_file to __init__.py)
  • 96b84b6 Expose snapshot format to python (I fixed a minor syntax error here)
  • a35a577 Python expose layers like blobs. with its lint fixing commits, c65e85b and 6b88d6f, and a bugfix 8719549
    Adds Net_SetLayerInputArrays which is specific to OpenCL's MemoryDataLayer modifications. For now I rolled back the function to use legacy accessors, but let me know if you have a better idea.
  • a4c269e Python interface extensions.
    Exposes NetParameter and NetState. I made sure the changes to Net_Init_Load and Net_SetLayerInputArrays it introduces stay in line with the rollback.
  • 188aeae PyCaffe fixes for solver interface, ensure GIL lock in python layer.
    Solves test problems introduced in e2f8e70, also changes behavior of the Python layer. Tests pass however so I'm leaving this for now.
  • 2daf478 Python NetSpec extension for layer bundles and its bugfix, 1d82ba2
    ignored, it looks like an OpenCL specific thing. Am I right, @naibaf7?
  • e61d542 LINT fixes.
    ignored, a large collection of fixes which will instead be put in respective commits
  • 59113c0 Removed duplicate layer parameter in PyCaffe.
    ignored, it fixes duplicated LayerParameter which does not appear here (for some reason)
  • 70858cb PyCaffe fixes.
    ignored, I fixed lint stuff on the fly, and changes to draw.py already are in master
  • 77c3428 Fix MSVC error for PyCaffe. -- is this needed on master too?
  • removed the old SolverParameter binding to avoid the "already registered" issue.

I'd like to hear comments on:

  • 984745d - how to deal with the doubly registered SolverParameter issue? (for now I removed the old binding, moving the layer_wise_reduce accessor to the new one)
  • a35a577 - is rolling back to legacy accessor the right thing to do (and have I done it right?)
  • 188aeae - should the GIL be brought to master? (currently I pulled it as is)
  • 2daf478 - can we safely ignore it? (currently it is ignored)
  • 77c3428 - what does it do and do we want it on master? (currently I pulled it)

EDIT: rebased this to most recent master, added the exposed ApplyUpdate method.

@naibaf7
Copy link
Member

naibaf7 commented Feb 2, 2018

Looks good to me so far. I'll look over it in more detail after one of the Master branch maintainers had a peek.

@Noiredd
Copy link
Member Author

Noiredd commented Feb 2, 2018

FYI: I pulled the remaining commits, sometimes changing them as I saw fit. Updated the description to reflect the changes, added a short summary below to ask about the issues I see right now. Tests should also pass now.

@Noiredd Noiredd changed the title [not ready] Pycaffe improvements from OpenCL branch Pycaffe improvements from OpenCL branch Feb 2, 2018
@Noiredd Noiredd force-pushed the pycaffe-improvements branch from 28a109e to abe8689 Compare February 12, 2018 11:34
@Noiredd Noiredd force-pushed the pycaffe-improvements branch from abe8689 to 0754447 Compare February 14, 2018 17:07
@shelhamer
Copy link
Member

I will have a look this week! Thanks @Noiredd and @naibaf7

@naibaf7
Copy link
Member

naibaf7 commented Feb 15, 2018

@shelhamer @Noiredd
Actually I'm on my way to take this a lot further.
With my next update to OpenCL Caffe, there'll be:

  • INT8, INT16, INT32, INT64 quantization (on CUDA and OpenCL)
  • HALF, FLOAT and DOUBLE precision (on CUDA and OpenCL)
  • Device abstractions
  • Writing GPU layers for Caffe in Python [WIP for a future release]

And it will all be exposed to Python.
In the works here, if you want to see a preview of the new codebase:
https://github.com/naibaf7/caffe

@naibaf7
Copy link
Member

naibaf7 commented Feb 17, 2018

Check the current progress here. It is now in a working state:
https://github.com/naibaf7/caffe/blob/master/python/caffe/_caffe.cpp

The really great thing about this tweak is that all new data types are accessible, but the old python code written for caffe still works in it's full extent, without any changes needed. So it's even a non-breaking improvement.

@Noiredd
Copy link
Member Author

Noiredd commented Feb 20, 2018

@naibaf7 That is pretty spectacular! Thanks for giving a preview. Will the new numeric types be somehow documented? I did take a look but I'm not sure how to use them.

Also, does that render this PR pointless? I won't mind closing if your change will be incorporated in its place.

@naibaf7
Copy link
Member

naibaf7 commented Feb 20, 2018

@Noiredd
I think you should go ahead for the time being. My new branch will need 2-3 months longer to fully mature into "production ready". It also requires the device abstracted backend (see here: https://github.com/naibaf7/caffe/tree/master/src/caffe/backend) and cannot be used with the current Master branch of Caffe.

Yes the numeric types will be documented. Float16 can already be used equally to float32 and float64, including unit tests (thanks to Intel, @gongzg, who started the float16 implementation; subsequently perfected by my efforts).
The integer (8, 16, 32, 64) quantization need some more testing and unit tests, I will check layer for layer on compatibility there.

Noiredd and others added 11 commits February 21, 2018 11:52
cherry-picked from 984745d

Conflicts:
	python/caffe/_caffe.cpp
	src/caffe/solver.cpp
cherry-picked from e2f8e70
this introduces pytest errors (will get fixed in 188aeae)

Conflicts:
	python/caffe/_caffe.cpp
cherry-picked from 52732be

Conflicts:
	python/caffe/__init__.py
	python/caffe/pycaffe.py
cherry-picked from 03276d4

Conflicts:
	python/caffe/__init__.py
cherry-picked from 96b84b6
fixed syntax error around bp::enum_
cherry-picked from a35a577
includes lint fixes c65e85b and 6b88d6f
also removed MemoryDataLayer change dependency

Conflicts:
	python/caffe/_caffe.cpp
cherry-picked from a4c269e
removed dependency on MemoryDataLayer modifications

Conflicts:
	python/caffe/__init__.py
	python/caffe/_caffe.cpp
	python/caffe/pycaffe.py
cherry-picked from 77c3428

Conflicts:
	python/caffe/_caffe.cpp
@Noiredd Noiredd force-pushed the pycaffe-improvements branch from 0754447 to cd100c0 Compare February 26, 2018 10:56
@Noiredd
Copy link
Member Author

Noiredd commented Feb 26, 2018

Update: I pulled all of the commits (assuming for now that 77c3428 is needed) and removed the old binding for SolverParameter*. This PR is complete, albeit based on some assumptions.

* - this change breaks #6238 as it added base_lr accessor to the now removed binding.

@Noiredd
Copy link
Member Author

Noiredd commented Aug 17, 2018

@shelhamer What's our status on this? Since we merged #6238, this causes some conflicts. Should I fix those and marge or is there an alternative plan?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants