Skip to content

Add Python generator #24

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 7, 2015
Merged

Add Python generator #24

merged 2 commits into from
Sep 7, 2015

Conversation

mitya57
Copy link
Contributor

@mitya57 mitya57 commented Jul 29, 2015

I am not the original author of these generators; @shibukawa is. However I needed to use the Python generator, so I have fixed some compilation errors in the code, rebased it against the current snowball code, and submitted this pull request.

The original code by Yoshiki can be found at shibukawa/snowball (in snowball directory).

Python needs no introduction. JSX is a statically typed JavaScript-like language that compiles to JavaScript, the compiler can be found at jsx/JSX (requires node.js).

Update: I excluded the JSX generator from this pull request

@mitya57
Copy link
Contributor Author

mitya57 commented Aug 5, 2015

@stefanor I would like you to also review this as you maintain snowball in Debian.

With make dist_libstemmer_python you get a tarball, after unpacking which you can use setup.py to build/install the Python module. If you remove the last line in dist_libstemmer_python code in GNUmakefile, then you'll get the unpacked directory without the need to deal with tarball.

I would like to see python-snowballstemmer and python3-snowballstemmer Debian packages built this way. I need these packages because they are dependencies for Sphinx 1.3.

Note that the Python part has a different version number, I have not changed this to not break reverse dependencies which may expect it to be as it is.

@mitya57
Copy link
Contributor Author

mitya57 commented Aug 25, 2015

@ojwb Ping?

ojwb added a commit that referenced this pull request Sep 1, 2015
Having two different build systems is a maintenance headache, so we
really need to pick one.  Richard says he lacks the time, and I don't
understand cmake, plus we have PR #24 which updates the makefiles but
not the cmake build system.
This was referenced Sep 1, 2015
@mitya57 mitya57 changed the title Add Python and JSX generators Add Python generator Sep 3, 2015
@mitya57 mitya57 force-pushed the master branch 3 times, most recently from acaddf7 to c3455d6 Compare September 3, 2015 16:55
@mitya57
Copy link
Contributor Author

mitya57 commented Sep 3, 2015

@ojwb To simplify things a bit, I removed JSX stuff from this pull request. For the Python part, I addressed all your comments.

If someone (me?) needs the JSX part in the future, I will be happy to propose another pull request to add it.

@@ -1,4 +1,9 @@
language: c
compiler: gcc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about travis, but don't we still want to specify the compiler for C code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately Travis doesn't seem to support different languages. So you can either specify multiple compliers or multiple Python versions.

I wanted to test with both Python 2.7 and Python ≥ 3.3, but now as you committed 30e10d9 to specify multiple compilers, I reverted my change and it tests only Python 2.7 now.

If in future Travis moves to a newer OS that has Python ≥ 3.3 (Ubuntu 12.04 doesn't), we will be able to install it using apt-get and test with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 5 September 2015 at 02:33, Dmitry Shachnev wrote:

@@ -1,4 +1,9 @@
-language: c
-compiler: gcc

Unfortunately Travis doesn't seem to support different languages. So you can either specify multiple compliers or multiple Python versions.

I wanted to test with both Python 2.7 and Python ≥ 3.3, but now as you committed 30e10d9 to specify multiple compilers, I reverted my change and it tests only Python 2.7 now.

If in future Travis moves to a newer OS that has Python ≥ 3.3 (Ubuntu 12.04 doesn't), we will be able to install it using apt-get and test with it.

You should be able to achieve that using multiple entries for the same
compiler with different env: settings in each.

Something like this (inspired by swig's .travis.yml):

compiler:
  - clang
  - gcc
    env: PYTHON=python2.7
  - gcc
    env: PYTHON=python3.2
  - gcc
    env: PYTHON=python3.3

Cheers,
Olly

@ojwb
Copy link
Member

ojwb commented Sep 3, 2015

To simplify things a bit, I removed JSX stuff from this pull request.

It will probably be less painful to do two separate merges, so that seems a good plan, though it's annoying github seems to have now just thrown away my comments on that code. It really does have a terrible patch reviewing workflow.

ojwb added a commit that referenced this pull request Sep 3, 2015
Having two different build systems is a maintenance headache, so we
really need to pick one.  Richard says he lacks the time, and I don't
understand cmake, plus we have PR #24 which updates the makefiles but
not the cmake build system.

Rebased version of PR #25.
@mitya57
Copy link
Contributor Author

mitya57 commented Sep 4, 2015

I will look at your last comments a bit later, just…

It will probably be less painful to do two separate merges, so that seems a good plan, though it's annoying github seems to have now just thrown away my comments on that code. It really does have a terrible patch reviewing workflow.

They aren't thrown away. Your first batch of comments is still available on mitya57/snowball@fb158c6aef6ae606.

@mitya57 mitya57 force-pushed the master branch 2 times, most recently from 0aeef16 to 67aa026 Compare September 4, 2015 14:59
@mitya57
Copy link
Contributor Author

mitya57 commented Sep 4, 2015

While it is quite easy to convert any kind of letter to lowercase in Python (using those with diacritics), it looks like I should mimic the C behavior and only convert ASCII letters, to make the testsuite pass.

This generator was originally written by Yoshiki Shibukawa,
I have fixed some compilation errors and rebased against the
current snowball code.
This is needed to match C behavior and make the testsuite pass.
@mitya57
Copy link
Contributor Author

mitya57 commented Sep 4, 2015

Updated the pull request, and made the Travis build finally pass.

@ojwb ojwb merged commit e7e180d into snowballstem:master Sep 7, 2015
@ojwb
Copy link
Member

ojwb commented Sep 7, 2015

I've set up builds with Python 2.7 and 3 in 8189e60, but the I couldn't seem to access your original commit adding the different builds to see which python interpreters you specified.

The default travis image doesn't see to have python3 so that build currently fails - I guess we need to install it with apt if we aren't using the python-specific travis builder (that is what SWIG seems to do).

Could you let me know which Python versions you specified before (or send a PR to update .travis.yml suitably)?

@ojwb
Copy link
Member

ojwb commented Sep 7, 2015

They aren't thrown away. Your first batch of comments is still available on mitya57/snowball@fb158c6.

Curiously they are now shown as "ojwb commented on an outdated diff" here, which I'm sure they weren't before. Anyway, good that they are still accessible.

@mitya57
Copy link
Contributor Author

mitya57 commented Sep 7, 2015

The default travis image doesn't see to have python3 so that build currently fails - I guess we need to install it with apt if we aren't using the python-specific travis builder (that is what SWIG seems to do).

Could you let me know which Python versions you specified before (or send a PR to update .travis.yml suitably)?

Travis uses Ubuntu 12.04 where 2.7 is available out-of-the-box and 3.2 is in the archive. This code doesn't work with 3.2 and works only with ≥ 3.3, so we can't use apt-get for now.

When one specifies language: python and python: [version-list], then the needed Python versions are installed not via apt-get, but via some hidden internal methods, which are not available to us when language is c.

ojwb added a commit that referenced this pull request Sep 8, 2015
As discussed in #24, there doesn't seem a good way to do it currently.
@ojwb
Copy link
Member

ojwb commented Sep 8, 2015

Hmm, what a pain.

They really should have just put the extra python versions in a package repo and then they'd be trivial to mix and match. Having to pick a language-specific VM base image only really works for single-language projects.

This will only get worse as more backends are added to Snowball - there's Javascript and C# on the cards. Not sure I see a good way to handle this, short of having branches with a different .travis.yml.

@mitya57
Copy link
Contributor Author

mitya57 commented Sep 8, 2015

This will only get worse as more backends are added to Snowball - there's Javascript and C# on the cards. Not sure I see a good way to handle this, short of having branches with a different .travis.yml.

If it will support versions of node.js and mono in Ubuntu 12.04, then it'll be possible to test it after installing them via apt-get.

By javascript, do you mean the JSX generator, or something else? That said, do you want me to propose the pull request for JSX (I don't need it much myself)?

@ojwb
Copy link
Member

ojwb commented Sep 8, 2015

If it will support versions of node.js and mono in Ubuntu 12.04, then it'll be possible to test it after installing them via apt-get.

The Python bindings work with versions of Python in Ubuntu 12.04, but we can't test with all the versions we'd like to (and which we could in a travis python VM image).

I gather node.js is a pretty fast moving target, so it seems likely the node.js travis VM image has custom installed newer versions of node and/or related software which we can't simply install onto a different image with apt-get. We might be able to test with the packaged versions, but that isn't ideal.

To handle Python and C, maybe starting from the Python image and installing the relevant packages for C compilation would work, but as soon as we want the custom-installed versions from more than one VM image that trick stops working.

I'm not sure it's worth putting energy into working around these deficiencies in travis - hopefully they'll move on to a newer Linux version soon and this won't be such an issue.

I think it would be useful to get the JSX generator merged too. The main issue seemed to be that the patch contains a hash function apparently taken from a random forum posting which lacked any licensing information, but simply replacing that is clearly a feasible fix if it comes to it.

@ojwb
Copy link
Member

ojwb commented Jan 20, 2016

travis-ci now have trusty available in beta, and there they've scrapped the array of different images - there's just a minimal one, and one with everything. So I've updated to use that and now we're testing with python2, python3, and pypy (since that was documented as working).

@mitya57
Copy link
Contributor Author

mitya57 commented Jan 21, 2016

@ojwb: Thanks, and nice, it even works with Python 2.6 and PyPy!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants