Skip to content

Add github actions build file for MinGW-W64 #102

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

tobiasherzke
Copy link
Contributor

FYI, this is how we build liblsl on windows for the MinGW compiler. As liblsl is a C++ library, users who write applications that use liblsl need to have their liblsl compiled with the same compiler as their application.

The build file is derived from your cross-platform build file. Compare against that file to find the differences. I was not sure how to integrate the different default shell settings required for this build (msys2) and all other platforms (bash) in github actions.

I have disabled examples and unit-tests because I get a linker error in one of them. I do not remember which, I remember that it is possible to fix the linker command manually quite easily, but I do not know enough cmake to fix it in the build system.

This pull request is mainly FYI. I for one would very much like if you accept it or merge into your cross-platform build file, but your priorities may lie elsewhere.

@tstenner
Copy link
Collaborator

Thanks for the PR! Compatibility with MinGW is a good thing and once I've figured out how to mark CI targets as optional (i.e., only run them when explicitly requested) I'll add it.

As it is, it shouldn't be needed. Even though liblsl is a C++ library, the entire API/ABI has only plain C types / functions and the C++ include header calls the C functions behind the scenes so liblsl compiled with MSVC should be loadable from a program compiled with MinGW and vice versa.

I have disabled examples and unit-tests because I get a linker error in one of them.

Fixed in 7fcc1bf.

I was not sure how to integrate the different default shell settings required for this build (msys2) and all other platforms (bash) in github actions.

For entire steps, you can add conditions, e.g. if: startsWith(matrix.config.os, 'ubuntu-') (see
https://github.com/labstreaminglayer/AppTemplate_cpp_qt/blob/3b5eed70d10f3c45af3ce5a9645685e6ecfd81d4/.github/workflows/cppcmake.yml#L38 ), for short command blocks something like this:

if [[ "${{ matrix.config.os }}" == ubuntu-* ]]; then
cmake -DCPACK_DEBIAN_PACKAGE_SHLIBDEPS=ON .
sudo dpkg -i package/*.deb
cmake --build build --config Release -j --target package
fi

@tstenner
Copy link
Collaborator

I tried integrating your changes into the main CI config, but the compiler / shell has an issue with a string definition passed via the CLI:

/C/ProgramData/chocolatey/bin/g++.exe -DBOOST_ALL_NO_LIB -DBOOST_ASIO_STANDALONE -DBOOST_THREAD_DONT_PROVIDE_INTERRUPTIONS -DLIBLSL_EXPORTS -DLSLNOAUTOLINK -D_WIN32_WINNT=0x0601 -Dlsl_EXPORTS -DLSL_LIBRARY_INFO_STR=\"git:c744dc95c47d60a9d8356ad4fc31f0126f6f44e6/branch:refs/heads/githubmingw/build:Release/compiler:GNU-8.1.0/link:SHARED\" @CMakeFiles/lsl.dir/includes_CXX.rsp -O3 -DNDEBUG -fvisibility=hidden -fno-keep-inline-dllexport -MD -MT CMakeFiles/lsl.dir/src/buildinfo.cpp.obj -MF CMakeFiles/lsl.dir/src/buildinfo.cpp.obj.d -o CMakeFiles/lsl.dir/src/buildinfo.cpp.obj -c /D/a/liblsl/liblsl/src/buildinfo.cpp
<command-line>: error: too many decimal points in number
D:/a/liblsl/liblsl/src/buildinfo.cpp:6:9: note: in expansion of macro 'LSL_LIBRARY_INFO_STR'
  return LSL_LIBRARY_INFO_STR;
         ^~~~~~~~~~~~~~~~~~~~
D:/a/liblsl/liblsl/src/buildinfo.cpp: In function 'const char* lsl_library_info()':
<command-line>: error: found ':' in nested-name-specifier, expected '::'
D:/a/liblsl/liblsl/src/buildinfo.cpp:6:9: note: in expansion of macro 'LSL_LIBRARY_INFO_STR'
  return LSL_LIBRARY_INFO_STR;
         ^~~~~~~~~~~~~~~~~~~~

Neither MSVC on Windows nor g++ on Linux have a problem with -DLSL_LIBRARY_INFO_STR=\"git:c744dc95c47d60a9d8356ad4fc31f0126f6f44e6/branch:refs/heads/githubmingw/build:Release/compiler:GNU-8.1.0/link:SHARED\" . CMake 3.21 might have a fix for it, so I'll wait for that and try again.

@tobiasherzke
Copy link
Contributor Author

Interesting! Can you point me to the build file and an github actions log, I would be interested in finding the cause.

@tstenner
Copy link
Collaborator

Sure, the log is here and the commit here. Just in case you can't access the build log, I've also uploaded it: buildlog.txt.gz

I think the main issue is that the build steps are running in the bash shell, not the msys shell, but I'd rather avoid changing the shell for every step.

@tobiasherzke
Copy link
Contributor Author

Thanks for the pointers. I still do not understand how this double-decimal-point error occurs.

But I saw that your build was using an older compiler installed by chocolatey and the bash shell from the git installation instead of from msys2.

I agree that it is undesirable to extend every step with a shell setting. Alternative suggestion here: https://github.com/tobiasherzke/liblsl/commit/13dac557fc4dcc77b75ebbbfc670714b84a3ec47 the shell is again specified only once in the file, but in a different place to allow a different shell for MinGW.

This change builds the liblsl DLL. The unit test stage still fails, but it is one step further.

@tstenner
Copy link
Collaborator

I still do not understand how this double-decimal-point error occurs.

Without the quotes, the compiler tries to make sense of git:c744dc95c47d60a9d8356ad4fc31f0126f6f44e6/branch:refs/heads/githubmingw/build:Release/compiler:GNU-8.1.0 and for some reason the 8.1.0 is the first thing it trips over.

I agree that it is undesirable to extend every step with a shell setting. Alternative suggestion here: tobiasherzke@13dac55 the shell is again specified only once in the file, but in a different place to allow a different shell for MinGW.

Nice, I didn't know you could put the defaults: in the build: section. The shell: parameter could be even shorter: shell: ${{ matrix.config.name == 'windows-mingw' && 'msys2 {0}' || 'bash' }}

This change builds the liblsl DLL. The unit test stage still fails, but it is one step further.

A segfault shouldn't happen, especially when testing the exported methods. The segfault seems to happen in the multithreaded lsl_last_error test, so maybe the warning about the ignored thread attribute should've been an error instead. I don't have a MinGW setup at hand right now, so it'd be great if you could take a look at this.

@tobiasherzke
Copy link
Contributor Author

Reproducing on a local windows installation:

In order to reproduce the steps from the build file, I need

  • Msys2 with packages git, make, mingw64/gcc installed
  • Cmake installed from the regular cmake windows installer, because the msys cmake package does not understand -G 'Msys Makefiles'
  • I also installed the mingw64-gdb for debugging the segmentation fault.

When executing the steps from the Makefile locally, I can reproduce the segmentation fault.

When reproducing the segmentation fault in the debugger, it does not give me any information in the backtrace.

In order to recompile with debug info, I have changed "Release" to "Debug" in the two cmake invocations that had "Release".

After doing this, there is no directory "install", but I can find the two test executables in build/testing.

The internal tests succeed as before.

The exported tests fail, but without producing the segmentation fault. The output is:

user@DESKTOP MINGW64 ~/liblsl/build
$ ./testing/lsl_test_exported.exe --order rand --wait-for-keypress never --durations yes
2021-06-27 16:53:59.845 (   0.003s) [        A9E48EB7]         api_config.cpp:231   INFO| Loaded default config
2021-06-27 16:53:59.846 (   0.003s) [        A9E48EB7]             common.cpp:64    INFO| git:lslgitrevision/branch:lslgitbranch/build:Debug/compiler:GNU-10.3.0/link:SHARED
1.064 s: datatransfer - int32_t
1.625 s: resolve multiple streams
1.055 s: datatransfer - char
2021-06-27 16:54:04.106 (   4.263s) [IO_movetest     ]   inlet_connection.cpp:45    WARN| The stream named 'movetest' can't be recovered automatically if its provider crashes because it doesn't have a unique source ID
3.074 s: move C++ API types
0.543 s: fullinfo
0.001 s: multithreaded lsl_last_error

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lsl_test_exported.exe is a Catch v2.13.6 host application.
Run with -? for options

-------------------------------------------------------------------------------
lsl_last_error size
-------------------------------------------------------------------------------
C:/msys64/home/user/liblsl/testing/test_ext_streaminfo.cpp:22
...............................................................................

C:/msys64/home/user/liblsl/testing/test_ext_streaminfo.cpp:25: FAILED:
  REQUIRE( lsl_last_error()[512] == 0 )
with expansion:
  6 == 0

0.000 s: lsl_last_error size
1.054 s: TypeConversion
1.063 s: datatransfer - float
1.614 s: timeouts
2021-06-27 16:54:11.453 (  11.611s) [IO_timesync     ]   inlet_connection.cpp:45    WARN| The stream named 'timesync' can't be recovered automatically if its provider crashes because it doesn't have a unique source ID
1.704 s: simple timesync
3.243 s: Flush
0.002 s: Invalid queries are caught before sending the query
0.037 s: resolve from streaminfo
1.079 s: datatransfer - int64_t
1.070 s: datatransfer - double
1.069 s: datatransfer - int16_t
1.107 s: data datatransfer
0.000 s: streaminfo
===============================================================================
test cases:  19 |  18 passed | 1 failed
assertions: 722 | 721 passed | 1 failed

but the diagnostic output is not stable: in the next run, it is

C:/msys64/home/user/liblsl/testing/test_ext_streaminfo.cpp:25: FAILED:
  REQUIRE( lsl_last_error()[512] == 0 )
with expansion:
  'o' == 0

and in the next

C:/msys64/home/user/liblsl/testing/test_ext_streaminfo.cpp:25: FAILED:
  REQUIRE( lsl_last_error()[512] == 0 )
with expansion:
  '▒' == 0

Hope this helps and you can make sense of it. My first guess when seeing this would be usage of uninitialized memory. (In that case, running the same test on Linux with valgrind would reliably diagnose it as such. It could also explain a segfault.)

@tobiasherzke
Copy link
Contributor Author

Valgrind does not report uninitialized memory on Linux (see https://github.com/tobiasherzke/liblsl/commit/af99b67582149c430ca696fe452e9194be80aabf#diff-71cc08be7d433f3916799b252b7ae38be718a330946ec1399e8361276cd6bb40) which makes this explanation unlikely. Something else indeterministic must be going on.

@tstenner
Copy link
Collaborator

Hope this helps and you can make sense of it.

As always I appreciate your thoroughness, and while typing out why this shouldn't happen and MinGW is to blame, I found the error but it still doesn't fix the issue.

The last_error variable is thread_local (see the PR #75), so each thread's last_error has to be zero-initialized at start of each thread (common.cpp:19: thread_local char last_error[512] = {0};). It's only written in two places: lsl_c_api_helpers.hpp:10: strncpy(last_error, e.what(), sizeof(last_error) - 1);, and in both places with strncpy so the last element should always be zero.

That being said, it's time to quote Phil Karlton:

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

The last element of char last_error[512] is, of course last_error[511].

After fixing it, it still segfaults. Maybe MinGW really is to blame?

tstenner and others added 2 commits June 28, 2021 21:10
Use github actions to also build for MinGW-W64. Uses msys2
environment.

For the current gcc-10.3 compiler on msys,
* link-time optimization needs to be disabled to avoid
  linker errors,
* -O3 needs to be disabled to avoid errors with thread-
  local storage across DLL borders.
@tobiasherzke
Copy link
Contributor Author

Maybe MinGW really is to blame?

Initially, I found this hard to believe, because it's usually not the compiler's fault. But now, after extensive recompilations, printfs and retryings, I think you are right. The current MinGW-W64 compiler 10.3 (as packaged by msys2) somehow messes up the thread-local storage when that TLS is defined in a DLL, and optimization is -O3.

I'll try to update this pull request with a working modification of cppcmake.yml. Because I do not know cmake very well, I'll simply use --target=Debug for MinGW to disable -O3 in the proposed update. The additional -g does no harm IMO.

What my proposed change does not fix is the thread-local-storage in the loguru library that you use by source inclusion. I believe you will be able to fix that by replacing declspec(thread) with thread_local, but I have not tested this on all platforms.

@tobiasherzke tobiasherzke force-pushed the tobiasherzke/pullrequest102 branch from c60ae9e to 05fcd77 Compare June 28, 2021 19:51
@tobiasherzke
Copy link
Contributor Author

tobiasherzke commented Jun 29, 2021

What my proposed change does not fix is the thread-local-storage in the loguru library

0bd89ad demonstrates this problem by adding a test which exposes it.

@tobiasherzke
Copy link
Contributor Author

tobiasherzke commented Jun 29, 2021

e0fce48 suggests a fix.

@tstenner
Copy link
Collaborator

Maybe MinGW really is to blame?

Initially, I found this hard to believe, because it's usually not the compiler's fault.

Normally I'd agree, but when the compiler has an internal error for code MSVC, GCC and Clang accept without any warnings it's reason to get suspicious.

The current MinGW-W64 compiler 10.3 (as packaged by msys2) somehow messes up the thread-local storage when that TLS is defined in a DLL, and optimization is -O3.
I'll try to update this pull request with a working modification of cppcmake.yml. Because I do not know cmake very well, I'll simply use --target=Debug for MinGW to disable -O3 in the proposed update. The additional -g does no harm IMO.

Debug builds will be far slower, so before disabling optimizations completely we should test if -Og, -O1 or -O2 will also work. This is best done by overriding CMAKE_CXX_FLAGS_RELEASE (as done here:
tstenner@1041509).

Building against all optimization levels -O2, -O1, and -Og also crash, so the faulty optimization pass is one of

-fauto-inc-dec -fcombine-stack-adjustments -fcompare-elim -fcprop-registers -fdce -fdefer-pop -fforward-propagate -fguess-branch-probability
           -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable -fmerge-constants  -fomit-frame-pointer -freorder-blocks -fshrink-wrap -fshrink-wrap-separate -fsplit-wide-types -fssa-backprop 
           -ftree-ccp -ftree-ch -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-forwprop -ftree-fre -ftree-phiprop -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-ter -funit-at-a-time

e0fce48 suggests a fix.

Looks good to me. We (and for authorship credits, I mean you 😄) should submit a PR to the upstream repository (https://github.com/emilk/loguru). thread_local is plain C++11 so even on Windows it should be supported by all major compilers.

@tstenner
Copy link
Collaborator

And it's fixed: tstenner@a1f8679

@tstenner
Copy link
Collaborator

I have pushed the changes that allow MinGW to build liblsl even with optimizations. There's one issue: as liblsl is compiler-agnostic, the packages are named after the platform and architecture and the windows package names are already taken by the MSVC packages. I can think of two options:

  1. build with MinGW, but skip uploading packages so they have to be downloaded manually from the Github Actions page
  2. merge all changes except the entry in matrix.config. That way you can just carry a single (smaller) commit that removes some targets and adds the MinGW target.

@tobiasherzke
Copy link
Contributor Author

I have pushed the changes that allow MinGW to build liblsl even with optimizations.

Thank you!

There's one issue: as liblsl is compiler-agnostic, the packages are named after the platform and architecture and the windows package names are already taken by the MSVC packages. I can think of two options:

1. build with MinGW, but skip uploading packages so they have to be downloaded manually from the Github Actions page

2. merge all changes except the entry in `matrix.config`. That way you can just carry a single (smaller) commit that removes some targets and adds the MinGW target.

You need to decide this, obviously this is not my decision. I would prefer 1) because then github actions will inform you if any changes that you make to liblsl causes problems with MinGW. This can be a liability because, as we have seen here, the MinGW compiler has some peculiarities. But it is also a chance for you to detect more possible problems in the code because one more compiler will mean more/different code analysis during compilation. The number of places in the code that were causing problems were very few, after all. And if in future the liability aspect causes too much work, you can always disable the MinGW build then. Especially if you do not publish it with your releases, then you will also not grow any significant user base that would depend on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants