Re: Searching for a string with a 'decomposed' character
However, I see no reason for such a two-way split in character encoding.
Re: Searching for a string with a 'decomposed' character
Someone secretly changed 3 bits there. :shock:
There is now similar nonsense for smileys as well ... sex and race (skin color). Although in that case they wanted to leave it incomplete rather than include everything several times over.
Re: Searching for a string with a 'decomposed' character
Hi,
That is a job for completely different APIs, such as FindNLSString and FindNLSStringEx. In fact, all the functions on the linked page can do what you want, with more or less parameter tweaking.
Windows has both National Language Support (NLS) and International Components for Unicode (ICU), but there are differences: ICU was introduced with the Windows 10 Creators Update, while NLS has been there since Windows Vista (at least I think so).
Anyway, here is a simple test for your search problem and its solution. I will paste the code and attach the project, because I can't trust browsers to keep the encoding right.
The data file in question, minimized and edited a little, is MacOS_ItunesContent_Small.txt.
Delphi source code:
unit Unit11;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes,
  Vcl.Graphics, Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls;

type
  TForm11 = class(TForm)
    Button1: TButton;
    Memo1: TMemo;
    Memo2: TMemo;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    procedure SearchStringList(AList: TStringList);
    procedure SearchStringFindNLSString(AList: TStringList);
    procedure SearchStringFindNLSStringEx(const LOCALE_NAME: string; AList: TStringList);
  end;

var
  Form11: TForm11;

implementation

{$R *.dfm}

const
  // These two constants render identically. Presumably they differ only in
  // their encoding (precomposed vs. decomposed umlauts); use the attached
  // project to get the exact bytes.
  SUB_STR_1 = 'Götterdämmerung';
  SUB_STR_2 = 'Götterdämmerung';

procedure TForm11.FormCreate(Sender: TObject);
begin
  Button1.Click;
end;

procedure TForm11.SearchStringFindNLSString(AList: TStringList);

  procedure DoFindWithFindNLSString(const SubStr: string);
  var
    i, Res, Found: Integer;
  begin
    for i := 0 to AList.Count - 1 do
    begin
      // LOCALE_USER_DEFAULT = $400; -1 means "null-terminated string"
      Res := FindNLSString(LOCALE_USER_DEFAULT, FIND_FROMSTART,
        PChar(AList.Strings[i]), -1, PChar(SubStr), -1, @Found);
      if (Res <> -1) and (Found > 0) then
        Memo1.Lines.Add(IntToStr(i));
    end;
  end;

begin
  Memo1.Lines.Add('finding lines with SUB_STR_1 = ' + SUB_STR_1);
  DoFindWithFindNLSString(SUB_STR_1);
  Memo1.Lines.Add('finding lines with SUB_STR_2 = ' + SUB_STR_2);
  DoFindWithFindNLSString(SUB_STR_2);
end;

procedure TForm11.SearchStringFindNLSStringEx(const LOCALE_NAME: string; AList: TStringList);

  procedure DoFindWithFindNLSStringEx(const SubStr: string);
  var
    i, Res, Found: Integer;
  begin
    for i := 0 to AList.Count - 1 do
    begin
      Res := FindNLSStringEx(PChar(LOCALE_NAME), FIND_FROMSTART,
        PChar(AList.Strings[i]), -1, PChar(SubStr), -1, @Found, nil, nil, 0);
      if (Res <> -1) and (Found > 0) then
        Memo1.Lines.Add(IntToStr(i));
    end;
  end;

begin
  Memo1.Lines.Add('finding lines with SUB_STR_1 = ' + SUB_STR_1);
  DoFindWithFindNLSStringEx(SUB_STR_1);
  Memo1.Lines.Add('finding lines with SUB_STR_2 = ' + SUB_STR_2);
  DoFindWithFindNLSStringEx(SUB_STR_2);
end;

procedure TForm11.SearchStringList(AList: TStringList);

  // Plain code-unit comparison: matches only the exact same encoding.
  procedure DoFindWithPos(const SubStr: string);
  var
    i: Integer;
  begin
    for i := 0 to AList.Count - 1 do
      if Pos(SubStr, AList.Strings[i]) > 0 then
        Memo1.Lines.Add(IntToStr(i));
  end;

begin
  Memo1.Lines.Add('finding lines with SUB_STR_1 = ' + SUB_STR_1);
  DoFindWithPos(SUB_STR_1);
  Memo1.Lines.Add('finding lines with SUB_STR_2 = ' + SUB_STR_2);
  DoFindWithPos(SUB_STR_2);
end;

procedure TForm11.Button1Click(Sender: TObject);
var
  sList: TStringList;
  i: Integer;
begin
  sList := TStringList.Create;
  try
    sList.LoadFromFile('MacOS_ItunesContent_Small.txt');
    if sList.Count = 0 then
      Exit;

    // Show the numbered input lines in Memo2, the search results in Memo1.
    for i := 0 to sList.Count - 1 do
      Memo2.Lines.Add(IntToStr(i) + #9 + sList.Strings[i]);

    Memo1.Lines.Add('Searching using Pos');
    SearchStringList(sList);

    Memo1.Lines.Add(#13#10'Searching with FindNLSString');
    SearchStringFindNLSString(sList);

    Memo1.Lines.Add(#13#10'Searching with FindNLSStringEx and LOCALNAME=''''');
    SearchStringFindNLSStringEx('', sList);
  finally
    sList.Free;
  end;
end;

end.
Attachment 57176
The project with the data file: Attachment 57178

Notes on FindNLSString and FindNLSStringEx:
1) Although the documentation of FindNLSString advises moving to FindNLSStringEx, Notepad.exe still uses FindNLSString!
2) FindNLSStringEx takes a locale name (a plain string) instead of the structured locale identifier (LCID) that FindNLSString takes. It gets complicated very fast when you need to chain many languages, so sticking to the default (USER or SYSTEM) is easier, and in that case it is better and easier to use FindNLSString.
3) Implementing a similar algorithm in pure Pascal is a huge job unless you depend on either NLS or the ICU library, and, well, such a dependency would render any implementation useless and a waste of time.

Hope that helps.
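Because the two search constants in the posted code look identical once a browser has touched them, here is a minimal console sketch that spells out both forms with explicit code points. The constant names, the `ContainsLinguistic` helper, and the program name are hypothetical and not part of the attached project. The FindNLSString call simply reuses the parameter pattern from the test project above, so treat this as an illustration under those assumptions, not as a definitive implementation.

```delphi
program DecomposedSearchDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

const
  // Precomposed form: U+00F6 (o with diaeresis), U+00E4 (a with diaeresis).
  PRECOMPOSED = 'G'#$00F6'tterd'#$00E4'mmerung';
  // Decomposed form: base letter followed by U+0308 (combining diaeresis).
  DECOMPOSED = 'Go'#$0308'tterda'#$0308'mmerung';

// Hypothetical helper: linguistic substring test via FindNLSString,
// using null-terminated strings (-1 as length) as in the project above.
function ContainsLinguistic(const Source, SubStr: string): Boolean;
var
  Found: Integer;
begin
  Found := 0;
  Result := FindNLSString(LOCALE_USER_DEFAULT, FIND_FROMSTART,
    PChar(Source), -1, PChar(SubStr), -1, @Found) <> -1;
end;

var
  Line: string;
begin
  // Pretend this line came from the macOS-generated data file.
  Line := 'Wagner: ' + DECOMPOSED;

  // Pos compares UTF-16 code units, so only the identical encoding matches.
  Writeln('Pos, decomposed needle:            ', Pos(DECOMPOSED, Line) > 0);
  Writeln('Pos, precomposed needle:           ', Pos(PRECOMPOSED, Line) > 0);

  // FindNLSString compares linguistically for the given locale.
  Writeln('FindNLSString, decomposed needle:  ', ContainsLinguistic(Line, DECOMPOSED));
  Writeln('FindNLSString, precomposed needle: ', ContainsLinguistic(Line, PRECOMPOSED));
end.
```

The four-digit hex literals (`#$00F6`, `#$0308`) keep the source file independent of editor and browser encoding, which is exactly the problem that made attaching the project necessary.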
Re: Searching for a string with a 'decomposed' character
After smoking a cigarette, here are a few more notes:
4) There is nice, if slightly complicated, documentation for locale names on Windows. If you want to use multiple locales, you can only do that via the sort, so the value can be:
Quote:
5) Notice that I tried to stay away from the lengths of the parameters I passed to the FindNLSString functions, because this is tricky and can easily go wrong and cause an overflow, producing a wide range of problems from an AV inside the OS API to simply corrupted data. The reason is that these functions require the length in chars, and here is the problem with this terminology: what is a char? Is it a renderable character, or a code unit, which is 2 bytes in Delphi by default, just as the Windows WideChar is 2 bytes by default? The trouble comes from using or mixing different APIs such as WideCharToMultiByte. That one in particular can be very dangerous because it can handle almost everything, and with MultiByte you should never assume 2 bytes for the output, or even for the input, of that API.
The moral is to do as I do: stay as far away as you can from calculating or passing the length, and stay on the safe side by relying on null-terminated strings while passing -1 as the length. The API is safer that way.
Again, hope that helps somebody!
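To illustrate the "what is a char?" point: if you do pass explicit lengths, they are counts of UTF-16 code units, which is what Delphi's `Length` returns for a `string`. The sketch below is only an illustration; the constant names and the program name are made up for the example. The length returned through the last parameter describes the match in the source string, so as far as I know it does not have to equal the length of the search string.

```delphi
program LengthInCodeUnitsDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

const
  // Decomposed: each umlaut is two code units (base letter + U+0308).
  SOURCE_STR = 'Go'#$0308'tterda'#$0308'mmerung';
  // Precomposed: each umlaut is one code unit.
  SEARCH_STR = 'G'#$00F6'tterd'#$00E4'mmerung';

var
  Res, Found: Integer;
begin
  // Both render as the same 15 visible characters,
  // but Length counts code units: 17 versus 15.
  Writeln('Length(SOURCE_STR) = ', Length(SOURCE_STR));
  Writeln('Length(SEARCH_STR) = ', Length(SEARCH_STR));

  // Explicit lengths, in code units (NOT bytes, NOT visible characters).
  Found := 0;
  Res := FindNLSString(LOCALE_USER_DEFAULT, FIND_FROMSTART,
    PChar(SOURCE_STR), Length(SOURCE_STR),
    PChar(SEARCH_STR), Length(SEARCH_STR), @Found);

  // Res is the 0-based match position or -1; Found is the length of the
  // matched part of the source, again in code units.
  Writeln('Res = ', Res, ', Found = ', Found);
end.
```

Passing `Length(S) * SizeOf(Char)` or a byte count from another API here would be the kind of mix-up described in note 5, which is why passing -1 with null-terminated strings remains the simpler choice.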
Re: Searching for a string with a 'decomposed' character
Hi,
Once you know how to do it, it's actually trivial! FindNLSString is indeed the really simple solution. Many thanks to everyone who contributed good tips, not least to Himitsu, but especially to Kas Ob., who helped me significantly with his detailed contribution!
Regards, LP
Delphi-PRAXiS (c) 2002 - 2023 by Daniel R. Wolf, 2024-2025 by Thomas Breitkreuz