Skip to content

Fix building under windows targets. #210

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 20, 2018

Conversation

crlf0710
Copy link
Contributor

@crlf0710 crlf0710 commented Aug 14, 2018

This commit adds support for building under windows-gnu target. (and windows-msvc target, see below)

This works if:

  • You've got same version mingw gcc with the one shipped with rust.
  • You've got pkg-config and c dependency libraries built with the correct gcc version above.

I can compile this branch locally and run the tests. There's one test still failing, which is the sha256 comparison in formats.rs. Still don't know why yet. I will look into it later.

@pkgw
Copy link
Collaborator

pkgw commented Aug 15, 2018

Thanks very much for submitting, on a first glance this looks great!

It will be important to identify why the SHA256 test is failing. But if the reason turns out to be something that is basically superficial, we should be able to modify the tests to vary their expectations depending on the platform.

Two additional items that would be great to bring along with this if you can find the time:

  • Some kind of docs on the Windows build process; perhaps best as a PR to the website repo
  • Setup of a Windows CI framework so that we don't break Windows in the future. I know that AppVeyor can do this.

@pkgw pkgw self-assigned this Aug 15, 2018
@crlf0710
Copy link
Contributor Author

Yes, i think it's important too. However i'm not so familiar with TeX... So i'll try to understand what's in the content of the "plain.fmt" file, and compare it to what the linux version generates... It will take several days, and it will be useful if you can point me to a few web pages about them.

And i'm not satisfied my current building process, the "c dependencies" part is still a bit hacky. I'll make use of AppVeyor to find out a clean way to do it.

In the meantime i'll also try to make the windows-msvc version build too.

I plan to write some docs on the Windows build process when all of these are ready.

@pkgw
Copy link
Collaborator

pkgw commented Aug 15, 2018

The first thing I'd suggest is doing a binary diff on your format and the Linux version ... the difference might just be an extra zero byte at the end of the file, or something simple like that. In which case it will probably not be so hard to locate the relevant code. Another possibility is that there is code to byte-swap the format files and maybe everything is getting byte-swapped relative to the reference files.

The relevant C code starts here and I think is all in that file. It refers to "dump" files rather than "format" files. The format is a little complex but I have trouble seeing how anything platform-dependent would sneak into the higher-level code, so it probably has to do with the low-level I/O. As ever, TeX: The Program is the canonical reference for the original design ... as written in a language no one else uses, for mathematicians, for a different version of the code :-/.

I'm afraid that know very little about how to build software on Windows so I don't feel that I can contribute much to the bigger infrastructure, but thank you for taking the effort to polish the build process!

@codecov
Copy link

codecov bot commented Aug 15, 2018

Codecov Report

Merging #210 into master will decrease coverage by <.01%.
The diff coverage is 53.73%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #210      +/-   ##
==========================================
- Coverage   37.89%   37.88%   -0.01%     
==========================================
  Files         131      131              
  Lines       60339    61500    +1161     
==========================================
+ Hits        22864    23300     +436     
- Misses      37475    38200     +725
Impacted Files Coverage Δ
tectonic/bibtex.c 0% <ø> (ø) ⬆️
src/lib.rs 100% <ø> (ø) ⬆️
src/driver.rs 54.27% <0%> (+1.74%) ⬆️
src/io/format_cache.rs 63.26% <0%> (+4.93%) ⬆️
src/io/local_cache.rs 0% <0%> (ø) ⬆️
tectonic/dpx-pdfdoc.c 33.9% <100%> (+5.88%) ⬆️
tectonic/xetexini.c 86.55% <100%> (ø) ⬆️
src/engines/mod.rs 74.31% <100%> (+9.85%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 532e428...0e42f80. Read the comment docs.

@crlf0710
Copy link
Contributor Author

@pkgw I traced the program a little, and now i think i understand where did the difference come from.
The difference happens within a variable called "font_check".
Somehow the original program's design is using 1-based array. and leaving the 0 indexed item unused. Since the memory comes from malloc, it is even not initialized once.

But during dumping (the place you pointed out), the whole memory is dumped out, including the 0 position. So in theory the 0 position contains garbage. (Accessing it is UB, i think). On linux, the data dumped out over there is 0. On Windows, it's some random bytes, thus the difference.

So, what do you think we should do?

@pkgw
Copy link
Collaborator

pkgw commented Aug 15, 2018

@crlf0710 Nice sleuthing! If that's the case, we should either write and read the array starting at the 1 index, or explicitly zero out the 0th element. I am curious to take a look at this but am busy so might not get a chance to soon — if you are able to implement a fix, feel free to do so. You will have to run the Linux version and calculate the new SHA256 sum yourself if the format file changes.

Once again, nice work!

@crlf0710
Copy link
Contributor Author

I don't want to change much of the logic happening inside TeX, so i just changed xmalloc to ensure all memory is zero-initialized when allocated. And the test now passes. Is that ok?

@crlf0710
Copy link
Contributor Author

crlf0710 commented Aug 16, 2018

Also i've previously added some hacks to make the code buildable in MSVC too. I've got several files from postgresql's windows port code in tectonic/msvc directory. The licenses are permissive. Please review them when you have time.

Now the MSVC version can build. It can also correctly generate the md5_of_hello.pdf. But it errors when generating plain.fmt. I'll investigate into it before weekend. However the MSVC version is affected by a few issues.

Blocking issues:

Nice to have:

@crlf0710 crlf0710 changed the title Fix building under windows-gnu target. Fix building under windows targets. Aug 16, 2018
@pkgw
Copy link
Collaborator

pkgw commented Aug 16, 2018

That's very rapid progress!

  1. Some large buffers are allocated with xmalloc, so I think it would be good for performance not to zero-initialize everything that comes out of that function. Can you please instead just zero out the one item in the font_check array?

  2. That's exciting news about MSVC! However, to keep the complexity of this PR down, can we keep it focused on the windows-gnu target? It will be easier for me to review the details this way. Once it's working and merged, MSVC can come in another PR. If a better MSVC approach will be coming it's OK to skip the documentation and CI setup in this PR.

@crlf0710
Copy link
Contributor Author

Progress update: I've written appveyor configuration.
See https://ci.appveyor.com/project/crlf0710/tectonic/build/1.0.14

There're test failures related with failuring to lock the manifest files in cache.
Very suspicious.

Sadly i can't directly debug this on machine. From my network here i can't access the online bundle, and i'm always using (dogfooded) offline mode (#211).

I'm not if this behavior is something wrong with interaction with fs2, or is it just an appveyor problem.

@pkgw
Copy link
Collaborator

pkgw commented Aug 20, 2018

Thanks for the update. I would really like to get this merged, so I will see if I can investigate on my end.

@crlf0710
Copy link
Contributor Author

crlf0710 commented Aug 20, 2018

Hi, @pkgw. I've known the cause of the failure of locking problem. It's because windows doesn't support locking a file in append mode.

But at the same time, i read the code a little and it seems to me that "formats" in cache is not locked at all? It seems to me that when a few instances is running simultaneously nothing will stop them from trying to add the plain format into formats and collide. However i'm not able to run this part of code locally and can't be sure of it...

@pkgw
Copy link
Collaborator

pkgw commented Aug 20, 2018

@crlf0710 Dashing off a quick response here — several of the operations in the caching code rely on Unix filesystem semantics, where file renames are atomic and can be used to set up cache files without any explicit locking. If Windows doesn't provide the same semantics, we'll need some new code to lock appropriately.

@crlf0710
Copy link
Contributor Author

@pkgw Ok, i think that's where the problem is. I don't think MoveFileEx() in winapi provides the same semantics. Anyway i've rebased, moved MSVC stuff off the shelf and got everything i'm certain of here.

The appveyor setting is included, you can set it up in ci.appveyor.com, but because of the cache locking problem the tests won't pass now.

@pkgw
Copy link
Collaborator

pkgw commented Aug 20, 2018

OK, let's get this merged! Note that we're still not properly supporting Windows because of the locking issues, but at least now we'll compile, and the CI infrastructure will be in place to see if we can get things working.

@pkgw pkgw merged commit 091fdac into tectonic-typesetting:master Aug 20, 2018
@pkgw pkgw mentioned this pull request Aug 20, 2018
@crlf0710 crlf0710 deleted the win-gnu branch September 9, 2019 16:46
Mrmaxmeier pushed a commit to Mrmaxmeier/tectonic that referenced this pull request Nov 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants