Old 12th February 2019, 11:39   #1
Pawel
Moderator
 
Join Date: Aug 2004
Location: Poland
Posts: 528
NSIS UTF-8 to ANSI conversion problem

I was foolish enough to decide to add Chinese language support to my project.
I thought it would be nice and easy. I was wrong! It seems there are problems I cannot resolve myself.

So, I will try to describe the problem, and I hope some of you can help me.

My NSIS installer (Unicode) uses UTF-8 (no BOM) encoded language files. Each file contains all the strings that I display on the installer pages and that I use to generate files (configuration files) on the user's PC during installation.
With the UTF-8 encoded Chinese language file the installer itself works fine: everything is displayed correctly (my Chinese translator says it is OK).

The problem is with the files that I generate at runtime, during installation, on the user's PC (they have to be generated dynamically because their content depends on what the user selects during installation).
I tried to use what NSIS offers. As far as I know, NSIS supports the "FileWrite" and "FileWriteUTF16LE" commands. I want to write the strings below to a file:

LangString UFM_COMPILE_DATE ${LANG_ID} "构建日期: " ; Built Date:
LangString UFM_INSTALL_DATE ${LANG_ID} "安装日期:" ; Installation date:
LangString UFM_INSTALL_PATH ${LANG_ID} "安装目录:" ; Installation Directory:

1. FileWrite creates an ANSI file:

Code:
    ClearErrors
    FileOpen $R0 "$INSTDIR\FILE.MNU" w
    ${If} ${Errors}
        ; Error! Do something
    ${Else}
        FileWrite $R0 '; Example TC Menu File$\r$\n'
        FileWrite $R0 '; Chinese - Simplified (China) [zh-CN]$\r$\n'
        FileWrite $R0 '$\r$\n'
        FileWrite $R0 'POPUP "$(UFM_COMPILE_DATE)"$\r$\n'
        FileWrite $R0 '  MENUITEM "$(UFM_INSTALL_DATE)", cm_SetAttrib$\r$\n'
        FileWrite $R0 '  MENUITEM "$(UFM_INSTALL_PATH)", cm_VersionInfo$\r$\n'
        FileWrite $R0 'END_POPUP$\r$\n'
        FileClose $R0
    ${EndIf}
But the example above generates a file with the wrong encoding: all the Chinese strings come out wrong.

2. FileWriteUTF16LE creates a Unicode (UTF-16LE) file:

Code:
    ClearErrors
    FileOpen $R0 "$INSTDIR\FILE.MNU" w
    ${If} ${Errors}
        ; Error! Do something
    ${Else}
        FileWriteUTF16LE $R0 '; Example TC Menu File$\r$\n'
        FileWriteUTF16LE $R0 '; Chinese - Simplified (China) [zh-CN]$\r$\n'
        FileWriteUTF16LE $R0 '$\r$\n'
        FileWriteUTF16LE $R0 'POPUP "$(UFM_COMPILE_DATE)"$\r$\n'
        FileWriteUTF16LE $R0 '  MENUITEM "$(UFM_INSTALL_DATE)", cm_SetAttrib$\r$\n'
        FileWriteUTF16LE $R0 '  MENUITEM "$(UFM_INSTALL_PATH)", cm_VersionInfo$\r$\n'
        FileWriteUTF16LE $R0 'END_POPUP$\r$\n'
        FileClose $R0
    ${EndIf}
The generated file looks nice (the Chinese characters look fine), but my Chinese translator says it is displayed wrong. This might be a problem with the application I use to display the file, Total Commander (which also does not accept a BOM).
But regardless of whether the fault lies with TC or with the encoding conversion, I cannot use this.

So, my Chinese translator says that only ANSI GB2312 (Simplified) works.


Finally, the question: how do I write (generate) a file with ANSI GB2312 (Simplified) encoding from NSIS, using a UTF-8 (no BOM) language file?

- Example NSIS Language file (with input strings): http://www.meggamusic.co.uk/shup/154...IMPCHINESE.nsh
- Example Output Chinese file: http://www.meggamusic.co.uk/shup/154...IMPCHINESE.MNU

Is it possible to create, with NSIS, a file encoded as ANSI GB2312 (Simplified) from a UTF-8 (no BOM) encoded language file?
How can I do this?

Any help is much appreciated!
I am lost with this Chinese magic :P
-Pawel
Old 12th February 2019, 15:28   #2
Anders
Moderator
 
Join Date: Jun 2002
Location: ${NSISDIR}
Posts: 5,012
0)

Why are your .nsh files without BOM?

1)

FileWrite will never work on anything except "Chinese" machines; it is documented to convert the string to ANSI using the ACP (the active ANSI code page of the machine the installer runs on).
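
A minimal sketch of how one might check which code page FileWrite would convert to on a given machine. It assumes the System plugin and LogicLib are available; the installer name, output file name and messages are made up for illustration, and 936 is the Windows code page identifier for Simplified Chinese (GBK).

```nsis
Unicode True
Name "ACP check"              ; hypothetical name
OutFile "acpcheck.exe"        ; hypothetical output file
RequestExecutionLevel user
!include LogicLib.nsh

Section
  ; GetACP returns the active ANSI code page of the system as an integer
  System::Call 'KERNEL32::GetACP()i.r0'
  ${If} $0 = 936
    DetailPrint "ACP is 936: FileWrite would convert to Simplified Chinese (GBK)."
  ${Else}
    DetailPrint "ACP is $0: FileWrite would convert to that code page, not to 936."
  ${EndIf}
SectionEnd
```

On a machine where the second message appears, Chinese text passed to FileWrite is converted to a code page that most likely cannot represent it.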

2)

That file does not have a BOM; use FileWriteUTF16LE /BOM $R0 "" before writing the strings. Most editors that can read UTF-16 will support the BOM on these files. (Technically, all editors that can read UTF-16 can also read the BOM because it is a valid Unicode character, but they might not recognize it as a BOM and switch to UTF-16 automatically.)
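
As a rough sketch of that suggestion (the installer name, output file name and target path are assumptions, not taken from the project in question), the BOM is written once on the freshly opened, empty file and the strings follow:

```nsis
Unicode True
Name "BOM example"            ; hypothetical name
OutFile "bom-example.exe"     ; hypothetical output file
RequestExecutionLevel user

Section
  ClearErrors
  FileOpen $R0 "$TEMP\utf16le.txt" w
  IfErrors done
  ; Write only the byte order mark while the file is still empty
  FileWriteUTF16LE /BOM $R0 ""
  ; Then write the text itself as UTF-16LE
  FileWriteUTF16LE $R0 "${U+5b89}${U+88c5}${U+65e5}${U+671f}:$\r$\n"
  FileClose $R0
  done:
SectionEnd
```

This only helps when the consuming program accepts UTF-16LE with a BOM.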

?)

To write in a custom codepage you must do the conversion and writing yourself:

Code:
    !macro WriteConvertedUnicodeToFile handle codepage string
    Push $0
    Push $1
    Push ${handle}
    # Size in bytes of the register memory that @r0 can use as a buffer
    !define /ReDef /Math REGMEMNSCAP ${NSIS_MAX_STRLEN} - 24
    !define /ReDef /Math REGMEMNSCAP ${REGMEMNSCAP} * ${NSIS_CHAR_SIZE}
    # Convert the UTF-16 string to the requested codepage; the byte count is left on the stack
    System::Call 'KERNEL32::WideCharToMultiByte(i${codepage},i0,ws,i-1,@r0,i${REGMEMNSCAP},p0,p0)i.s' `${string}`
    Pop $1
    # TODO: Error check: Failed if $1 is 0 or $1 > ${REGMEMNSCAP}
    IntOp $1 $1 - 1 # Don't write \0
    System::Call 'KERNEL32::WriteFile(ps,pr0,ir1,*i,p0)i.r1'
    # TODO: Error check: Failed if $1 is 0
    Pop $1
    Pop $0
    !macroend

    Unicode True
    LangString UFM_INSTALL_DATE 0 "${U+5b89}${U+88c5}${U+65e5}${U+671f}:" # "安装日期:" without Unicode .nsi
    Section
    FileOpen $R0 "$%temp%\utf8.txt" w
    !insertmacro WriteConvertedUnicodeToFile $R0 65001 'Hello "$(UFM_INSTALL_DATE)" World$\r$\n'
    FileClose $R0
    FileOpen $R0 "$%temp%\20936.txt" w
    !insertmacro WriteConvertedUnicodeToFile $R0 20936 'Hello "$(UFM_INSTALL_DATE)" World$\r$\n'
    FileClose $R0
    SectionEnd
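
Applied to the menu file from the first post, the same macro could be used roughly as follows. This is only a sketch: it assumes the WriteConvertedUnicodeToFile macro above is already defined earlier in the script and that the UFM_* LangStrings exist for the active language. MNU_CP is a hypothetical define name, set here to the same code page as the example above.

```nsis
; Fragment, not a complete script: relies on the WriteConvertedUnicodeToFile
; macro and the UFM_* LangStrings being defined elsewhere (assumption).
!include LogicLib.nsh
!define MNU_CP 20936 ; hypothetical name; same code page as the example above

Section
  ClearErrors
  FileOpen $R0 "$INSTDIR\FILE.MNU" w
  ${If} ${Errors}
    DetailPrint "Cannot open $INSTDIR\FILE.MNU for writing"
  ${Else}
    ; Each line is converted from UTF-16 to MNU_CP and written as raw bytes
    !insertmacro WriteConvertedUnicodeToFile $R0 ${MNU_CP} '; Example TC Menu File$\r$\n'
    !insertmacro WriteConvertedUnicodeToFile $R0 ${MNU_CP} 'POPUP "$(UFM_COMPILE_DATE)"$\r$\n'
    !insertmacro WriteConvertedUnicodeToFile $R0 ${MNU_CP} '  MENUITEM "$(UFM_INSTALL_DATE)", cm_SetAttrib$\r$\n'
    !insertmacro WriteConvertedUnicodeToFile $R0 ${MNU_CP} '  MENUITEM "$(UFM_INSTALL_PATH)", cm_VersionInfo$\r$\n'
    !insertmacro WriteConvertedUnicodeToFile $R0 ${MNU_CP} 'END_POPUP$\r$\n'
    FileClose $R0
  ${EndIf}
SectionEnd
```

Keeping the code page in one define makes it easy to switch to another value if the translator reports that the output still displays incorrectly.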

IntOp $PostCount $PostCount + 1
Old 12th February 2019, 16:15   #3
Pawel
Moderator
 
Anders,
First of all, thank you for your answer and for the time you spent helping me.

Quote:
Originally Posted by Anders
Why are your .nsh files without BOM?
No reason. It works fine without a BOM, so I didn't touch it. I will try to change it and use a BOM in the future.

Quote:
Originally Posted by Anders
FileWrite will never work on anything except "Chinese" machines, it is documented to convert the string to ACP ANSI.
Yes, now I know. I just tried everything I thought might work. Everything! (The main problem is that I can't tell whether the Chinese text is OK or not; each time I have to ask my translator.)

Quote:
Originally Posted by Anders
That file does not have a BOM, use FileWriteUTF16LE /BOM $R0 "" before writing the strings. Most editors that can read UTF-16 will support the BOM on these files. (Technically, all editors that can read UTF-16 can also read the BOM because it is a valid Unicode character but they might not recognize it as a BOM and switch to UTF-16 automatically)
Yes, you are absolutely right. Normally I use UTF-16LE with a BOM. In this case (when creating the MNU file) a BOM is not accepted: the program only accepts ANSI and UTF-8, and this is not going to change in the near future, for compatibility reasons. So I was creating a UTF-16LE (no BOM) file and then converting it to UTF-8 (no BOM), but that did not work for Chinese anyway. I hope you understand now why I hate this encoding stuff.

Quote:
Originally Posted by Anders
To write in a custom codepage you must do the conversion and writing yourself:
As I understand it, this little macro will make me happy again, right?
I will test it, and I hope it resolves all my problems.

-Pawel
Old 12th February 2019, 17:28   #4
Pawel
Moderator
 
Anders,
I ran a few tests and it seems to work fine. Thanks!
This will make everything much easier. Now I only have to implement it.

-Pawel
Go Back   Winamp & Shoutcast Forums > Developer Center > NSIS Discussion
