
Smart replace of special characters by normal characters


This topic contains 6 replies, has 3 voices, and was last updated by  Andi 4 years, 4 months ago.

Viewing 7 posts - 1 through 7 (of 7 total)
  • #597

    Andi
    Moderator

    I have a requirement to replace special characters with plain characters. For instance, ö becomes o, á becomes a, č becomes c, etc.

    The background is that we have customers from all over Europe who enter their addresses. We have to forward these addresses to a service provider that unfortunately cannot handle special characters.

    The obvious idea is to have a mapping list that defines all replacements, and then to iterate over this list to replace every special character in the address.

    This will work, of course, but I don’t like it. First, I have to manage all possible values (probably 100 or even more) in a list; second, I have to iterate over this list for each address part, such as name, street, city, etc.

    Isn’t there a smarter approach? Maybe transforming the text into an encoding that does not support such special characters, e.g. ASCII, could work? Or are there any other suggestions?
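
The mapping-list idea does not necessarily need one replace() call per entry. As a minimal sketch in Python (the language, the function name `replace_special` and the table contents are assumptions for illustration, not anything the Bridge provides), a translation table lets each address part be converted in a single pass:

```python
# Hypothetical mapping table; extend it with whatever characters
# actually occur in the addresses.
REPLACEMENTS = {
    "ä": "a", "ö": "o", "ü": "u",
    "á": "a", "é": "e", "č": "c",
    "ß": "ss", "ø": "o",
}

# str.maketrans builds a lookup keyed by code point, so translate()
# walks the string once instead of calling replace() per mapping entry.
TABLE = str.maketrans(REPLACEMENTS)


def replace_special(text: str) -> str:
    """Return text with every mapped character replaced; others are kept."""
    return text.translate(TABLE)


print(replace_special("Schröder, Bécaille"))
```

Characters that are missing from the table pass through unchanged, so the list still has to be maintained by hand.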

     

     

    #598

    Marcel R.
    Blocked

    Use a hash table. I don’t know of any other approach. See also the discussions and links in http://stackoverflow.com/questions/17215431/unicode-to-ascii-standardized-transcription

    #601

    Andi
    Moderator

    Indeed, the approach of encoding to ASCII does not work: the characters with diacritics get lost.

    I have also done some research on the web. The best approach I have found so far is http://stackoverflow.com/questions/863800/replacing-diacritics-in-javascript. The JavaScript approach is perhaps a little more efficient than iterating over a list of value mappings and applying the replace() function.

    I’ll keep you updated in case I really have to implement the requirement 😉
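
To illustrate why plain ASCII encoding loses characters, and one common alternative, here is a small Python sketch (Python and the helper name `strip_diacritics` are assumptions for illustration). It uses Unicode canonical decomposition (NFD) to split each accented letter into a base letter plus combining marks, and then drops the marks:

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    # NFD splits a precomposed letter such as "ö" into the base letter "o"
    # followed by the combining diaeresis U+0308.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


name = "Dvořák"
# Encoding straight to ASCII discards the whole character, not just the accent.
print(name.encode("ascii", "ignore").decode("ascii"))  # Dvok
print(strip_diacritics(name))                          # Dvorak
```

Letters without a canonical decomposition, such as ø or ß, are left untouched by this sketch and would still need a mapping entry.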

    #602

    Alfred
    Moderator

    ICU can do a transformation from Latin to ASCII: http://unicode.org/repos/cldr/trunk/common/transforms/Latin-ASCII.xml

    If this ICU transformation is the right solution for you, Andi, and you really have to implement it, then we can enhance the Bridge with a string method that performs the transformation using ICU.

    You can find more information about ICU transformations here: http://userguide.icu-project.org/transforms/general

    #603

    Andi
    Moderator

    Hi Alfred,

    I’m not sure. I understand that the XML you refer to defines rules for transforming Latin to ASCII, but I could not find such a rule for simple diacritics like ä, é, etc.

    But if you can implement the functionality shown at http://demo.icu-project.org/icu-bin/translit, this would solve my problem. When I insert the sample ‘Accents’ and use Latin as source1 and ASCII as target1, I get the result I need:

    Input:

    “La mort d’Olivier Bécaille” — Émile Zola;
    “Das Vermächtnis des alten Pilgers” von Rainer M. Schröder (Österreich, ÖVP);
    “Smyčcový koncert As dur” — Antonín Dvořák.
    objektiv på 32 mm, og 8x forstørrelse.

    Result:
    “La mort d’Olivier Becaille” — Emile Zola;
    “Das Vermachtnis des alten Pilgers” von Rainer M. Schroder (Osterreich, OVP);
    “Smyccovy koncert As dur” — Antonin Dvorak.
    objektiv pa 32 mm, og 8x forstorrelse.

     

    Can you implement this functionality ?
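
For comparison, the letters in this sample can be reproduced without ICU. The following Python sketch is only an approximation under stated assumptions: the function name `to_plain` and the `FALLBACK` table are hypothetical, the table covers just a handful of letters, and punctuation is left as it is. It is not the ICU Latin-ASCII transform and not a Bridge feature.

```python
import unicodedata

# Hypothetical fallbacks for letters that have no canonical decomposition
# and therefore survive the combining-mark removal below.
FALLBACK = str.maketrans({
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
    "đ": "d", "Đ": "D",
    "æ": "ae", "Æ": "AE",
    "ß": "ss",
})


def to_plain(text: str) -> str:
    """Strip diacritics via NFD, then apply the small fallback table."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.translate(FALLBACK)


samples = [
    "Rainer M. Schröder (Österreich, ÖVP)",
    "Antonín Dvořák",
    "objektiv på 32 mm, og 8x forstørrelse.",
]
for line in samples:
    print(to_plain(line))
```

The decomposition step handles ö, é, č, ř and å, while ø in the last sample line only becomes o because of the fallback entry.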

    #604

    Alfred
    Moderator

    Hi Andi,

    Yes, I can implement the same functionality. In the document http://unicode.org/repos/cldr/trunk/common/transforms/Latin-ASCII.xml I found this comment: “Here we remove accents from Latin characters.”

    #605

    Andi
    Moderator

    Hi Alfred,

    I have checked it again with my customer, and yes, this is exactly what would solve the problem. So I’m looking forward to this new feature in E2E Bridge! 🙂

