The Delphi Compiler and UTF-8 Encoded Source Code Files With no BOM

A thread by DP News-Robot · started on 4 Oct 2018
Posted 4 Oct 2018, 12:10
Last week a (large) customer sent me an email saying he was experiencing issues when compiling the same project on different machines. It turned out the difference was in the format of the source code files, and the root cause was a unit saved as UTF-8 but without a BOM. The reason? One of the developers is using Visual Studio Code... and the solution is either changing that or using a compiler flag. But before I get to the solution, let me show you the problem with a very simple test case.

Delphi and Source Files Encoding

To test the scenario, we (it was one of the architects who came up with the simple scenario) created a simple VCL application with code like the following:

    var
      strEuro: String = 'Euro=€';

    procedure TForm16.Button1Click(Sender: TObject);
    begin
      Button1.Caption := 'Hello ' + strEuro;
    end;

By default, the editor in the Delphi IDE uses ANSI encoding, and everything works fine. You can use the editor context menu, pick the File Format submenu, select UTF8 (I know, the hyphen is missing), and everything keeps working as expected. Notice, though, that the length in bytes of the string changes, as you need multiple bytes to represent the Euro symbol in UTF-8.
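The change in byte length can be checked with a small console program. This is only a sketch: the program name is made up, it assumes a Windows target, and it assumes Windows-1252 as the ANSI code page (yours may differ). The Euro sign is written as the character code #$20AC so that the example does not depend on how its own source file is encoded.

```delphi
program EuroBytes; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // 'Euro=' followed by the Euro sign (U+20AC), given as a character code
  Sample = 'Euro='#$20AC;

var
  Ansi1252: TEncoding;
begin
  // Assumption: Windows-1252 stands in for "ANSI" on this machine.
  // TEncoding.GetEncoding returns a new instance that the caller must free.
  Ansi1252 := TEncoding.GetEncoding(1252);
  try
    Writeln('Characters:    ', Length(Sample));                       // 6
    Writeln('Bytes (1252):  ', Ansi1252.GetByteCount(Sample));        // 6
    Writeln('Bytes (UTF-8): ', TEncoding.UTF8.GetByteCount(Sample));  // 8
  finally
    Ansi1252.Free;
  end;
end.
```

In Windows-1252 the Euro sign is the single byte 80, while UTF-8 needs the three bytes E2 82 AC, which accounts for the two extra bytes.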

Enter Visual Studio Code

While many modern editors use UTF-8 as their standard file format (a nice option we are also considering as the default in RAD Studio), Visual Studio Code (or VSC) is one of the few that prefers using UTF-8 with no BOM. The BOM (Byte Order Mark) is a sequence of bytes (3 bytes for UTF-8: EF BB BF) at the start of a file that makes it simple for an editor or a tool processing the file (like a compiler) to figure out the encoding. Without it, since UTF-8 and ANSI text look identical in many cases, you might have to read an entire file to see if it is UTF-8 or ANSI encoded.
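To see whether a given unit carries the marker, it is enough to look at its first three bytes. The following is a minimal sketch, not what the compiler or the IDE actually does internally; the program and function names are made up for illustration.

```delphi
program CheckBom; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Returns True when the file starts with the UTF-8 BOM bytes EF BB BF.
function HasUtf8Bom(const FileName: string): Boolean;
var
  Stream: TFileStream;
  Head: array[0..2] of Byte;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Files shorter than three bytes cannot contain the BOM
    Result := (Stream.Read(Head, SizeOf(Head)) = SizeOf(Head)) and
      (Head[0] = $EF) and (Head[1] = $BB) and (Head[2] = $BF);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount < 1 then
    Writeln('Usage: CheckBom <file>')
  else if HasUtf8Bom(ParamStr(1)) then
    Writeln('UTF-8 BOM found')
  else
    Writeln('No UTF-8 BOM');
end.
```

A missing BOM does not tell you which encoding the file uses; it only means a tool has to guess or be told.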

When you open a Delphi unit in VSC, it keeps the existing encoding. If it is UTF-8 with BOM, it remains as such. But for a new file the default is UTF-8 with no BOM, and any file can be saved in that format from the encoding options in the status bar.



Now if you do this and compile the application again, you'll get a "nice" caption for your button, with garbled characters in place of the Euro symbol.
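What goes wrong can be reproduced outside the compiler by encoding the string as UTF-8 and then decoding those bytes as if they were ANSI. This is a sketch of the effect only, assuming Windows-1252 as the ANSI code page; the program name is made up. It prints code points rather than the characters themselves, so the result does not depend on the console's own code page.

```delphi
program Mojibake; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Utf8Bytes: TBytes;
  Ansi1252: TEncoding;
  Misread: string;
  Ch: Char;
begin
  // The bytes as a UTF-8 editor would store them (Euro sign = E2 82 AC)
  Utf8Bytes := TEncoding.UTF8.GetBytes('Euro='#$20AC);

  // Assumption: the reader treats the file as Windows-1252 because no BOM is present
  Ansi1252 := TEncoding.GetEncoding(1252);
  try
    Misread := Ansi1252.GetString(Utf8Bytes);
  finally
    Ansi1252.Free;
  end;

  // 8 characters instead of 6: the three Euro bytes become three characters
  Writeln('Characters after misreading: ', Length(Misread));

  // Prints: 0045 0075 0072 006F 003D 00E2 201A 00AC
  for Ch in Misread do
    Write(IntToHex(Ord(Ch), 4), ' ');
  Writeln;
end.
```

The last three code points (U+00E2, U+201A, U+00AC) are the characters that show up in place of the single Euro sign.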



Use a BOM or a Compiler Flag

Once you realize the issue, the solution is not that complex to achieve.

On one side, you can make sure your editors save the UTF-8 files with a BOM. The compiler sees it, and all works fine.
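If changing editor settings is not practical for everyone on the team, files that are already known to be UTF-8 can be marked after the fact. The sketch below prepends the BOM when it is missing. The program and procedure names are made up, and it assumes the file really is UTF-8, because running it on an ANSI file would mislabel that file. Try it on a copy first.

```delphi
program AddBom; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Prepends the UTF-8 BOM to a file unless it already starts with one.
// Assumption: the file content is valid UTF-8; this is not verified here.
procedure EnsureUtf8Bom(const FileName: string);
var
  Content, Bom: TBytes;
  Stream: TFileStream;
begin
  Content := TFile.ReadAllBytes(FileName);
  Bom := TEncoding.UTF8.GetPreamble; // EF BB BF

  if (Length(Content) >= 3) and (Content[0] = Bom[0]) and
     (Content[1] = Bom[1]) and (Content[2] = Bom[2]) then
    Exit; // already marked, nothing to do

  // Rewrite the file: BOM first, then the original bytes unchanged
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Bom[0], Length(Bom));
    if Length(Content) > 0 then
      Stream.WriteBuffer(Content[0], Length(Content));
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount < 1 then
    Writeln('Usage: AddBom <file>')
  else
    EnsureUtf8Bom(ParamStr(1));
end.
```

Working at the byte level leaves the rest of the file untouched, so line endings and content stay exactly as the editor saved them.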

On the other side, you can tell the compiler to consider the source code file as UTF-8, regardless of the BOM. You can do this using the compiler flag:

    --codepage:65001

or setting the codepage in the compiler options to that value.



Good Workaround, Future Solution

I hope this will help you avoid a similar issue. We are researching making the codepage detection automatic, so that the compiler could pick the right option by itself. This is not something we plan to address shortly, though, as the R&D team is busy with the coming release.
