Update from ANSI to Unicode

A thread by WojTec · started on 1 Dec 2013 · last post on 2 Dec 2013
WojTec

Registered since: 17 May 2007
480 posts

Delphi XE6 Professional

#1

Update from ANSI to Unicode

1 Dec 2013, 11:45
Delphi version: 2010
Original function in C++:

Code:
unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
   // 'm' and 'r' are mixing constants generated offline.
   // They're not really 'magic', they just happen to work well.

   const unsigned int m = 0x5bd1e995;
   const int r = 24;

   // Initialize the hash to a 'random' value

   unsigned int h = seed ^ len;

   // Mix 4 bytes at a time into the hash

   const unsigned char * data = (const unsigned char *)key;

   while(len >= 4)
   {
      unsigned int k = *(unsigned int *)data;

      k *= m;
      k ^= k >> r;
      k *= m;
      
      h *= m;
      h ^= k;

      data += 4;
      len -= 4;
   }
   
   // Handle the last few bytes of the input array

   switch(len)
   {
   case 3: h ^= data[2] << 16;
   case 2: h ^= data[1] << 8;
   case 1: h ^= data[0];
           h *= m;
   };

   // Do a few final mixes of the hash to ensure the last few
   // bytes are well-incorporated.

   h ^= h >> 13;
   h *= m;
   h ^= h >> 15;

   return h;
}
Delphi translation (ANSI version):

Delphi source code:
function Murmur2(const S: AnsiString; const Seed: Cardinal = $9747b28c): Cardinal;
const
  // 'm' and 'r' are mixing constants generated offline.
  // They're not really 'magic', they just happen to work well.
  m = $5bd1e995;
  r = 24;
var
  hash: LongWord;
  len: LongWord;
  k: LongWord;
  data: Integer;
begin
  len := Length(S);

  //The default seed, $9747b28c, is from the original C library

  // Initialize the hash to a 'random' value
  hash := seed xor len;

  // Mix 4 bytes at a time into the hash
  data := 1;

  while(len >= 4) do
  begin
      k := PLongWord(@S[data])^;

      k := k*m;
      k := k xor (k shr r);
      k := k*m;

      hash := hash*m;
      hash := hash xor k;

      data := data+4;
      len := len-4;
  end;

  { Handle the last few bytes of the input
          S: ... $69 $18 $2f
  }

  Assert(len <= 3);
  if len = 3 then
      hash := hash xor (LongWord(s[data+2]) shl 16);
  if len >= 2 then
      hash := hash xor (LongWord(s[data+1]) shl 8);
  if len >= 1 then
  begin
      hash := hash xor (LongWord(s[data]));
      hash := hash * m;
  end;

  // Do a few final mixes of the hash to ensure the last few
  // bytes are well-incorporated.
  hash := hash xor (hash shr 13);
  hash := hash * m;
  hash := hash xor (hash shr 15);

  Result := hash;
end;


I don't like AnsiString, so I'm trying to change the parameter to string:

function Murmur2(const AValue: string; const Seed: Cardinal = $9747b28c): Cardinal;

The result is different from the ANSI version. I think the problem is here:

k := PLongWord(@AValue[data])^;

How can I fix it?

Also, is this line still valid?

data := 1;
mjustin

Registered since: 14 Apr 2008
3,004 posts

Delphi 2009 Professional

#2

Re: Update from ANSI to Unicode

1 Dec 2013, 12:10
The input data seems not to be a string but a byte array. I would use an array of byte (TBytes type) to avoid the danger of string encoding conversion related bugs.
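A minimal sketch of what that could look like, assuming the same algorithm as the AnsiString version above but operating on a TBytes array. The name Murmur2Bytes and the two compiler directives are my own additions, not part of the original library. Like the original, it reads four bytes at a time without regard to alignment or byte order.

```delphi
program Murmur2BytesDemo;
{$APPTYPE CONSOLE}
{$OVERFLOWCHECKS OFF} // the multiplications are meant to wrap around at 32 bits
{$RANGECHECKS OFF}

uses
  SysUtils; // TBytes, TEncoding (System.SysUtils in XE2 and later)

// Hypothetical name: MurmurHash2 over a raw byte array.
function Murmur2Bytes(const Data: TBytes; const Seed: Cardinal = $9747b28c): Cardinal;
const
  m: Cardinal = $5bd1e995;
  r = 24;
var
  len, i: Integer;
  k: Cardinal;
begin
  len := Length(Data);          // number of BYTES, not characters
  i := 0;                       // dynamic arrays are 0-based
  Result := Seed xor Cardinal(len);

  // Mix 4 bytes at a time into the hash
  while len >= 4 do
  begin
    k := PCardinal(@Data[i])^;
    k := k * m;
    k := k xor (k shr r);
    k := k * m;
    Result := Result * m;
    Result := Result xor k;
    Inc(i, 4);
    Dec(len, 4);
  end;

  // Handle the last 1..3 bytes
  if len = 3 then
    Result := Result xor (Cardinal(Data[i + 2]) shl 16);
  if len >= 2 then
    Result := Result xor (Cardinal(Data[i + 1]) shl 8);
  if len >= 1 then
  begin
    Result := Result xor Data[i];
    Result := Result * m;
  end;

  // Final mixing
  Result := Result xor (Result shr 13);
  Result := Result * m;
  Result := Result xor (Result shr 15);
end;

begin
  // The caller decides which bytes represent the text, here UTF-8.
  Writeln(Murmur2Bytes(TEncoding.UTF8.GetBytes('abc')));
end.
```

With this signature the hash depends only on the bytes passed in, so the choice of encoding is made explicitly by the caller.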
Michael Justin
WojTec

Registered since: 17 May 2007
480 posts

Delphi XE6 Professional

#3

Re: Update from ANSI to Unicode

1 Dec 2013, 12:42
If I use bytes as input, how do I use it for strings and other data?
Sir Rufo

Registered since: 5 Jan 2005
Location: Stadthagen
9,454 posts

Delphi 10 Seattle Enterprise

#4

Re: Update from ANSI to Unicode

1 Dec 2013, 13:03
Simply convert the strings into a byte array.

Just keep in mind that AnsiString uses 1 byte per Char and UnicodeString uses 2 bytes per Char, so the same text yields different bytes and therefore a different hash.
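A small sketch to make the size difference concrete. It also shows why the PLongWord(@AValue[data])^ line from the first post behaves differently with string: Length counts characters, not bytes, so one 4-byte read covers only two characters of a UnicodeString. The program name is arbitrary.

```delphi
program CharSizeDemo;
{$APPTYPE CONSOLE}

var
  A: AnsiString;
  U: string; // UnicodeString since Delphi 2009
begin
  A := 'abc';
  U := 'abc';
  // Same text, same Length, different number of bytes in memory
  Writeln(Length(A) * SizeOf(AnsiChar)); // 3
  Writeln(Length(U) * SizeOf(Char));     // 6
end.
```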
As soon as you do it right, it works
Zertifikat: Sir Rufo (Fingerprint: ‎ea 0a 4c 14 0d b6 3a a4 c1 c5 b9 dc 90 9d f0 e9 de 13 da 60)
WojTec

Registered since: 17 May 2007
480 posts

Delphi XE6 Professional

#5

Re: Update from ANSI to Unicode

1 Dec 2013, 13:41
OK, maybe that's a good idea, but let's get back to the problem: ANSI --> Unicode?
mjustin

Registered since: 14 Apr 2008
3,004 posts

Delphi 2009 Professional

#6

Re: Update from ANSI to Unicode

1 Dec 2013, 14:02
Quote: "OK, maybe that's a good idea, but let's get back to the problem: ANSI --> Unicode?"

The Delphi Unicode string stores code page information in its metadata. If your input data is meant to be just raw binary data, without regard to encoding and code pages, you will not want this string type.

RawByteString is a string type that is not tied to a particular code page, so it can be used for binary data. But watch out for compiler warnings about implicit string type conversions.

TBytes would be the appropriate data type; RawByteString is merely easier to use as an AnsiString replacement.
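A short sketch of the RawByteString route, assuming the hash function were declared with a RawByteString parameter. ByteLen is a hypothetical stand-in for such a function and only reports how many bytes it would see. The conversion from string is done explicitly with UTF8Encode, which avoids an implicit conversion at the call site.

```delphi
program RawByteDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in for a hash function taking raw bytes.
function ByteLen(const S: RawByteString): Integer;
begin
  Result := Length(S); // for AnsiString-family strings, Length counts bytes
end;

var
  U: string;
begin
  U := 'h' + #$00E9;               // "h" followed by e-acute
  Writeln(Length(U));              // 2 characters
  Writeln(ByteLen(UTF8Encode(U))); // 3 bytes: $68 $C3 $A9
end.
```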
Michael Justin
WojTec

Registered since: 17 May 2007
480 posts

Delphi XE6 Professional

#7

Re: Update from ANSI to Unicode

1 Dec 2013, 15:47
RawByteString is good because it's easy. TBytes would be better, as you said, but I don't really know how to use it (how to do the conversion) with arbitrary data: strings, streams, etc. How do I convert a string to bytes? Or a stream?
Sir Rufo

Registered since: 5 Jan 2005
Location: Stadthagen
9,454 posts

Delphi 10 Seattle Enterprise

#8

Re: Update from ANSI to Unicode

2 Dec 2013, 08:20
Have a look at SysUtils.TEncoding and a closer look at SysUtils.TEncoding.GetBytes.
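For illustration, a minimal sketch of both conversions asked about: string to bytes via TEncoding.GetBytes, and stream to bytes via a small helper. StreamToBytes is a hypothetical helper name, not an RTL function, and the choice of UTF-8 is just an assumption for the example.

```delphi
program ToBytesDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: reads the whole stream into a byte array.
function StreamToBytes(Stream: TStream): TBytes;
begin
  SetLength(Result, Integer(Stream.Size));
  Stream.Position := 0;
  if Length(Result) > 0 then
    Stream.ReadBuffer(Result[0], Length(Result));
end;

var
  Bytes: TBytes;
  B: Byte;
  MS: TMemoryStream;
begin
  // String -> bytes: the encoding is chosen explicitly
  Bytes := TEncoding.UTF8.GetBytes('abc');
  for B in Bytes do
    Write(IntToHex(B, 2), ' ');          // 61 62 63
  Writeln;

  // Stream -> bytes
  MS := TMemoryStream.Create;
  try
    MS.WriteBuffer(Bytes[0], Length(Bytes));
    Writeln(Length(StreamToBytes(MS)));  // 3
  finally
    MS.Free;
  end;
end.
```

Whichever encoding is picked, it has to be the same everywhere the hash is computed, otherwise equal strings produce different hashes.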
As soon as you do it right, it works
Zertifikat: Sir Rufo (Fingerprint: ‎ea 0a 4c 14 0d b6 3a a4 c1 c5 b9 dc 90 9d f0 e9 de 13 da 60)
WojTec

Registered since: 17 May 2007
480 posts

Delphi XE6 Professional

#9

Re: Update from ANSI to Unicode

2 Dec 2013, 12:00
OK, thanks