Skip to content

Use utf-8 to decode filename #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 26, 2019
Merged

Use utf-8 to decode filename #18

merged 1 commit into from
Jan 26, 2019

Conversation

ken2812221
Copy link

@ken2812221 ken2812221 commented Jul 28, 2018

See bitcoin/bitcoin#13869

Enable unicode support for leveldb on Windows

CI result for applying this change is available in bitcoin/bitcoin#13787

@maflcko
Copy link

maflcko commented Jul 29, 2018

This won't cause any issues when down/upgrading from 0.16 to 0.17 while keeping the datadir?

@ken2812221
Copy link
Author

ken2812221 commented Jul 29, 2018

@MarcoFalke The user must change their config file to be utf-8 encoded if the file contains any non-ascii characters in it. Before 0.17, they are read as ansi-encoded characters on Windows. Except this, I can't see it could any issue here.

@ken2812221
Copy link
Author

ken2812221 commented Jul 29, 2018

I'll add the caution in the description of bitcoin/bitcoin#13426

@ken2812221
Copy link
Author

ken2812221 commented Jul 31, 2018

@MarcoFalke Do you mean applying this change to 0.16? Is so, it will not cause any problem because we don't define UNICODE in 0.16

@ken2812221 ken2812221 changed the title Use utf-8 if UNICODE is defined Use utf-8 string if UNICODE is defined Aug 6, 2018
@ken2812221 ken2812221 changed the title Use utf-8 string if UNICODE is defined Use utf-8 string if UNICODE is defined and add compatibility with MSVC Oct 2, 2018
@ken2812221 ken2812221 changed the title Use utf-8 string if UNICODE is defined and add compatibility with MSVC Use utf-8 string if UNICODE is defined Oct 3, 2018
@luke-jr
Copy link
Member

luke-jr commented Oct 18, 2018

This no longer checks for UNICODE, so @MarcoFalke 's question should be re-answered (and the PR description updated)

@ken2812221
Copy link
Author

ken2812221 commented Oct 18, 2018

@luke-jr If we don't backport this change into older version of Bitcoin Core. Then there would have no problem because it's using utf-8 facet to imbue path on current master. I checked if UNICODE defined to make sure this change can be backported.

@ken2812221 ken2812221 changed the title Use utf-8 string if UNICODE is defined Use utf-8 to decode filename Oct 18, 2018
@luke-jr
Copy link
Member

luke-jr commented Oct 18, 2018

Why is it a problem to backport now?

@ken2812221
Copy link
Author

Because it use MBCS encoding on boost::filesystem::path on Bitcoin Core <= 0.17. It changed to utf-8 by bitcoin/bitcoin#13877

@laanwj
Copy link
Member

laanwj commented Jan 16, 2019

Needs rebase after #14

@ken2812221
Copy link
Author

@sipsorcery @NicolasDorier @xaya Are you willing to review this and give me an ACK or a NACK.

@xaya
Copy link

xaya commented Jan 23, 2019

@ken2812221 I don't know if it causes or solves any other issues but it definitely solved this issue here:
bitcoin/bitcoin#15092

so ack for our issue

@sipsorcery
Copy link

utACK f8e797a

@NicolasDorier
Copy link

NicolasDorier commented Jan 24, 2019

Sorry I am kind of lost about understanding those encoding problem.
This code seems correct to me after reviewing it, but I lack good understanding of the problem.

I don't understand why there is methods for both std::wstring and std::string and why Bitcoin Core is using std::string. And I feel that trying to investigate will bring me down a very deep in the rabbit hole about the weirdness of encoding and legacy windows code, to which I am not too eager to jump in.

I see nothing wrong with this code, and wish it to be merged as it fix bitcoin/bitcoin#15092 though.

@ken2812221
Copy link
Author

@laanwj Could this be merged?

@laanwj laanwj merged commit f8e797a into bitcoin-core:bitcoin-fork Jan 26, 2019
laanwj added a commit that referenced this pull request Jan 26, 2019
f8e797a Use utf-8 to decode filename (Chun Kuan Lee)

Pull request description:

  See bitcoin/bitcoin#13869

  Enable unicode support for leveldb on Windows

  CI result for applying this change is available in bitcoin/bitcoin#13787

Tree-SHA512: 860261f973ec7aec8d3051632be8154d87854df8a604ef10b9171701f132c4ba9855ca97fc6e2d529ba322a8100e1e160d5d0f2afe558158bde89979815b5246
@ken2812221 ken2812221 deleted the windows-env branch January 26, 2019 20:51
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Jan 31, 2019
82dcacb msvc: build leveldb locally (Chun Kuan Lee)
5209106 msvc: build secp256k1 locally (Chun Kuan Lee)

Pull request description:

  In current MSVC build setup, the code depends on leveldb and secp256k1 that are installed from vcpkg which is not controlled by us. If we update our code, we have to wait for vcpkg port being merged.

  This PR move them from vcpkg to local branch to make it as same as autoconf.

  The leveldb changes is based on bitcoin-core/leveldb-subtree#14 and bitcoin-core/leveldb-subtree#18

Tree-SHA512: aa2cc1c3191e8d9cab23d555da4be296314c46d944f452c2ec6202b1779e4cc223b603e589b38196cd2c793a03a8bb0ba128cc66256b35a58c5e7bb358475206
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants