Ingres Community Forum



2009-07-09   #1
Junior Member

Join Date: Jul 2009
Posts: 6
Charset in OR

Hi

Currently, II_CHARSETII in our Ingres installation is set to ISO88591, but this character set lacks some Welsh characters that I want to use. I have found the character set ISO 8859-14 (ISO885914), which encodes the characters used in Celtic languages, including the Welsh ones.

ISO885914 does not exist in the charsets folder of our Ingres installation, so presumably it is not shipped with Ingres 2.6.
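
As a quick illustration of the gap described above, here is a minimal sketch that checks which encodings can represent the Welsh circumflexed w and y. It uses Python's own codec names (`iso8859_1`, `cp1252`, `iso8859_14`, `utf_8`), which are not Ingres charset names, and it says nothing about what Ingres itself ships or supports. The helper `can_encode` is a hypothetical name.

```python
# Sketch only: Python codec names, not Ingres II_CHARSET values.
WELSH = "ŵŷŴŶ"  # Welsh letters that are missing from ISO 8859-1


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for codec in ("iso8859_1", "cp1252", "iso8859_14", "utf_8"):
    print(f"{codec:11} {can_encode(WELSH, codec)}")
```

In this check only ISO 8859-14 and UTF-8 can hold all four letters.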

My questions are as follows:
1. What impact might changing the II_CHARSETII setting have on the database/application?

2. If Ingres isn’t shipped with ISO885914, is it possible to find out whether it is supported?

3. Is an unload/reload required after changing the II_CHARSETII setting?

4. Can changing II_CHARSETII cause issues for interface scripts that process text using the operating system?

Thanks
Posted by MohitG
2009-07-09   #2
Ingres Corp

Join Date: Mar 2007
Location: On the OpenROAD
Posts: 666

Why can't you stay with ISO88591?
I know that it doesn't contain the Welsh characters, but as long as all your clients interpret the corresponding code points the same way, that shouldn't matter.
E.g. the Euro symbol € is not part of the ISO8859-1 character set either (ISO8859-15 contains it),
but I can use it as long as all my clients interpret the code point 0x80 as a € symbol.
Posted by Bodo
2009-07-09   #3
Ingres Community

Join Date: Mar 2007
Posts: 47

Quote:
Originally Posted by MohitG
1. What impact might changing the II_CHARSETII setting have on the database/application?
Non-ASCII object names and user data (i.e. characters that do not fit into 7 bits) may appear corrupted to remote Ingres Net users whose character set differs from the one (originally) defined on the server.

Collation sequences will not work as expected.

In short, changing the charset on the fly is NOT a good idea (and is not supported).

Quote:
Originally Posted by MohitG
2. If Ingres isn’t shipped with ISO885914, is it possible to find out whether it is supported?
If it is not shipped with the product, it is not supported. If you need a specific character set supported, I encourage you to log an enhancement request (at servicedesk.ingres.com); you can then talk to product management about it.

Quote:
Originally Posted by MohitG
3. Is an unload/reload required after changing the II_CHARSETII setting?
Yes, along with transcoding. However, I would not change the charset; I would migrate to another database installation.

E.g. assume you have installation II with charset = iso-8859-1. Create a new installation, say I2, with the required character set, e.g. win1252.

You would need to transcode the data before making the transition to the new database/installation. The easiest way to do this is to unload/reload over Ingres Net, but I would advise calling support and opening an issue at servicedesk.ingres.com first to understand the implications.
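
To make "transcoding" concrete, here is a minimal sketch of what it means at the byte level: decode with the old character set, re-encode with the new one, and flag anything that has no equivalent. This is a generic illustration with Python codec names, not the Ingres unload/reload procedure; `transcode` and `find_untranscodable` are hypothetical helper names and the sample rows are made up.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from the src character set into the dst character set."""
    return data.decode(src).encode(dst)


def find_untranscodable(rows, src: str, dst: str):
    """Return the indexes of rows that cannot be represented in dst."""
    bad = []
    for i, row in enumerate(rows):
        try:
            transcode(row, src, dst)
        except UnicodeError:  # covers both decode and encode failures
            bad.append(i)
    return bad


# Made-up sample data, held as Windows-1252 bytes.
rows = [b"caf\xe9", b"\x80 5"]
print(find_untranscodable(rows, "cp1252", "iso8859_1"))
print(transcode(rows[0], "cp1252", "utf_8"))
```

Running a check like this over unloaded data before the reload shows which values would be lost or altered in the target character set.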

Quote:
Originally Posted by MohitG
4. Can changing II_CHARSETII cause issues for interface scripts that process text using the operating system?
Only if the data changes too. If you use a different charset name/setting but continue to use the same character set data you used before (i.e. you lie about the charset), you will be OK. This is what Bodo was describing: if you can ensure that all clients consistently use the same encoding for data and all CLAIM the same Ingres charset, everything will be fine.

One final idea is to use Unicode, which supports (almost) every character/language we have on the planet. The disadvantage is that you may need to make application changes to handle it, either by using NVARCHAR types (e.g. UCS2 encoding) or by using UTF8, which has some byte/character length semantic issues to deal with.
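
The byte/character length issue can be sketched as follows. The example assumes a column limit that is counted in bytes, which is an assumption for illustration and not a statement about how any particular Ingres type measures length. `fits_in_bytes` is a hypothetical helper.

```python
def fits_in_bytes(text: str, max_bytes: int, codec: str = "utf-8") -> bool:
    """True if text, once encoded, fits in a field limited to max_bytes."""
    return len(text.encode(codec)) <= max_bytes


# Characters vs. encoded size in UTF-8 and in a 2-byte encoding (UTF-16LE here).
for word in ("Llyn", "Llŷn", "€"):
    print(word, len(word), len(word.encode("utf-8")), len(word.encode("utf-16-le")))

print(fits_in_bytes("Llyn", 4))  # 4 characters, 4 bytes
print(fits_in_bytes("Llŷn", 4))  # 4 characters, but 5 bytes in UTF-8
```

So a field sized for N single-byte characters may be too small for the same text once accented letters take two or three bytes each.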

Chris
Posted by clach04
2009-07-13   #4
Junior Member

Join Date: Jul 2009
Posts: 6

Thanks Bodo. Can you please explain in a bit more detail how you display the € symbol on an OpenROAD frame when it is not part of the charset, and whether it can also be printed from OR?
Posted by MohitG
2009-07-13   #5
Ingres Corp

Join Date: Mar 2007
Location: On the OpenROAD
Posts: 666

The Windows fonts I am using all contain the € symbol, and I can type it with a key combination on my German keyboard (<AltGr>+<E>).
So my fields can contain the € symbol, as most of the "Windows: Western" fonts (including those mapped to the OpenROAD System Fonts) have it at position 0x80.
Therefore I have no problem using it, even with my II_CHARSET set to ISO88591 (or even not set at all).
What you need to display Welsh characters in OpenROAD is just a Celtic/Welsh font for Windows.
Then you either have to change your font mapping in your font file (II_FONT_FILE), or explicitly select the native font for your fields.
The II_CHARSET setting actually doesn't matter for OpenROAD in this respect. It is only important when Ingres/Net comes into play: if II_CHARSET differs between client and server, the data is transliterated between the two character sets.
When the data is retrieved, the transliteration is done in the reverse direction as well.
Therefore it's important that all clients use the same II_CHARSET.
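
A toy model of that round trip, to show why mixed client settings cause trouble. It is only a sketch: Python codecs stand in for II_CHARSET values, `store` and `fetch` are hypothetical names, and replacing unmappable characters with `?` is an assumption made for the example, not documented Ingres/Net behaviour.

```python
def store(text: str, client_cs: str, server_cs: str) -> bytes:
    """Client -> server: bytes in client_cs are transliterated to server_cs."""
    sent = text.encode(client_cs)
    return sent.decode(client_cs).encode(server_cs, errors="replace")


def fetch(stored: bytes, server_cs: str, client_cs: str) -> str:
    """Server -> client: reverse transliteration, then display in client_cs."""
    received = stored.decode(server_cs).encode(client_cs, errors="replace")
    return received.decode(client_cs)


SERVER = "iso8859_15"                     # assumed server character set
row = store("10 €", "cp1252", SERVER)     # written by a cp1252 client

print(fetch(row, SERVER, "cp1252"))       # reader with the same client charset
print(fetch(row, SERVER, "iso8859_1"))    # reader with a different client charset
```

The reader that shares the writer's character set gets the original text back, while the other one gets a substitute for the € symbol.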
Posted by Bodo
2009-08-24   #6
Ingres Community

Join Date: Mar 2007
Posts: 10

The display of a character has nothing to do with the II_CHARSET setting for a non-Unicode application. The OS uses the Windows code page to display the character.

Typically on Windows, with the WIN1252 code page, the Euro symbol has a single-byte code point of 0x80. The II_CHARSET settings on the Ingres client and the Ingres server together control how that code point is transliterated on its way from the client to the server.

When the data is retrieved by the client, the character that the code point represents is determined by the value of the code point and the code page.

An example is what happens when the II_CHARSET of the server is ISO88591. The ISO-8859-1 character set does not contain a code point for the Euro symbol. If the client code page is WIN1252, the client sends a code point of 0x80 to the server. As long as the Ingres server does not attempt to interpret this code point as a Euro sign, everything will be okay. Any display or interpretation of the 0x80 code point done in the client will show the correct Euro sign.
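
The point that the same byte means different things under different code pages can be seen in a few lines. This is a sketch using Python codec names as stand-ins for the Windows code page and the ISO character sets; the helper `describe` is hypothetical.

```python
RAW = b"\x80"  # the single byte a WIN1252 client sends for the Euro sign


def describe(raw: bytes, codepage: str) -> str:
    """Show which Unicode character a byte maps to under a given code page."""
    ch = raw.decode(codepage)
    kind = "printable" if ch.isprintable() else "control character"
    return f"{codepage:11} U+{ord(ch):04X} ({kind})"


for codepage in ("cp1252", "iso8859_1", "iso8859_15"):
    print(describe(RAW, codepage))

# Where the Euro sign actually lives in ISO 8859-15:
print("€".encode("iso8859_15"))
```

Only the WIN1252 interpretation yields the Euro sign; under the ISO sets the byte 0x80 is an unprintable control code, which is why it survives as long as nothing on the way tries to give it a meaning.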
Posted by wridu01

© 2009 Ingres Corporation. All Rights Reserved