OME: User Defined Functions
From Ingres Community Wiki
OME: Example Functions as an introduction to Opensource
Written By: Martin Bowes
Contents |
Introduction.
In this discussion I will present an example, which is deliberately incomplete. If you are interested in working in the Opensource then you should try to complete the program and then extend it as per the suggestions I'll make later.
Here goes...
What is OME?
The Object Management Extension (OME) is used to create customised datatypes &/or functions which can then be used by the Ingres Installation exactly as if they were part of the standard Ingres datatype and functions sets.
The OME has existed in Ingres for a considerable period, but is probably one of the least utilised features of the product. This is not surprising - how many of us really need to make a User Defined Type? I certainly have never needed to do so - although I have been tempted on a few occassions!
More usefully, the OME allows the creation of User Defined Functions (UDFs) which can manipulate most of the standard Ingres datatypes as well as any datatypes you create yourself. It is on the UDFs that I wish to concentrate as these will be a very useful introduction to some of the concepts you will need to be familiar with when working on the Opensource.
Example. checksum(digest, string)
The following discussion focuses on a specific example I was asked to make at Oxford University. The checksum function simply outputs a varchar representing the chosen digests summation of an input string. ie we can ask the function to return the 'md5' sum of a string, or the 'md2' sum or the 'sha1' sum.
The example given will restrict itself to using a varchar datatype for the string.
Documentation
The Ingres OME reference guide is available in all the documentation sets. There are examples provided in $II_SYSTEM/ingres/demo/udadts.
Personally speaking, I find that trawling through the 'demo' looking for specific examples of what is indicated in the documentation set a fairly good start point. Get used to searching for things! When you start on Opensource you will be spending a lot of time just looking for stuff.
What do you need to get started?
Can you write in C? You don't need to be a C-Guru or Druid. Just competent will do. Hell, I'm barely competent and I'm still having a go! You need to know about structures and pointers at the very least.
It isn't necessary, but it won't hurt to have a code repository such as CVS or Subversion available.
Program Overview
The OME 'stuff' is compiled into a dynamic library and linked with the ingres server (iimerge) to form a new server binary.
So the first step is to write a C program with the function and all the necessary support stuff for the Ingres system to find and manage the function, its parameters and its return type.
An OME function
Fine, but what does a function look like? If you read the OME manual you'll find that OME provides four basic function types. However three of these are used only on User Defined Types and we need not concern ourselves with those. So I'll restrict this discussion to the so called 'NORMAL' functions.
Conceptually our checksum function could be represented as:
varchar = checksum(varchar, varchar).
Eg. select checksum('md5', 'Hi There' + X'0A')
We should receive the answer: dedc1f20145899a3f253a03ffc752d6b
But C doesn't know what a varchar is. But thats OK because the Ingres system will abstract it's datatypes into a structure that C can understand. That structure is called II_DATA_VALUE, more on that later.
If you think about it for a while, you start to realise that the Ingres system also needs to mandate a format for a function. Afterall, its going to have to call this function using only the stuff it knows about.
As a quick precis on the limitations of functions:
- The function may have zero, one or two input parameters.
- The function must have a result, which is passed by reference to/from the function.
- The parameters and result are all of type II_DATA_VALUE
- The function returns a status value indicating success or failure.
When all of this is rolled togethor, the function syntax is:
II_STATUS
function_executor_code (
/* Inputs */
II_SCB *scb
/* zero, one or two input parameters */
[, II_DATA_VALUE *p1 [, II_DATA_VALUE *p2 ] ]
/* Output */
, II_DATA_VALUE *rdv)
{
....
}
II_DATA_VALUE
The key to handling the function is to pass the parameter data to the function, decode those parameters within the function and then build a structure that represents the answer.
Simple!
Function Definitions: IIADD_FO_DFN
You have to tell Ingres what the new functions are called and tie them to a unique function identifier number. This is achieved by an array of the IIADD_FO_DFN structures. This is actually pretty trivial, as you'll see in the following code example.
Function Instance Definitions: IIADD_FI_DFN
A function may have one or more Function Instance definitions. Essentially these allow different parameter datatypes to the function. They may also specify different result types, different result lengths and different function executor code to perform the function for this instance.
This information is stored in an array of IIADD_FI_DFN structures. The only kicker is it has a specific sort order. But with a little care its also fairly easy to make. See the following code example.
Code example
Hands up all those who guessed I have no idea about encryption, digests or ssl 'do-hickeys'? Your dead set right! But I can read a man page. The following code is really an OME function wrapped around what the man page said to do!
#include <openssl/evp.h> /* required for checksum */
/* max string length returned by checksum */
#define MAX_CHECKSUM_LENGTH 48 + sizeof(short)
/*Easy way to start counting the fod_id's*/
#define UDF_CHECKSUM II_OPSTART
/*Easy way to start counting the fid_id's */
#define UDF_FI_CHECKSUM_VARCHAR II_FISTART
II_STATUS
checksum (
II_SCB *scb,
II_DATA_VALUE *p1, /* digest_name */
II_DATA_VALUE *p2, /* text to hash */
II_DATA_VALUE *rdv) /* return_text */
{
EVP_MD_CTX mdctx;
const EVP_MD *md;
unsigned char md_value[EVP_MAX_MD_SIZE];
char digest[33], xx[3], msg[128];
unsigned short true_length;
int md_len, i;
rdv->db_prec = 0;
OpenSSL_add_all_digests();
true_length=*(short* )p1->db_data;
if (true_length > 32) {
strncpy(digest, (char *)p1->db_data + sizeof(short), 32);
digest[32]='\0';
sprintf(msg,
"checksum(): digest name commencing '%s' is too long to be valid!", digest);
us_error(scb, 0x200010, msg);
return(II_ERROR);
};
/*
** Confirm p1 is a known digest
*/
strncpy(digest, (char *)p1->db_data + sizeof(short), true_length);
digest[true_length]='\0';
md = EVP_get_digestbyname(digest);
if(!md) {
sprintf(msg, "checksum(): Unknown digest '%s'", digest);
us_error(scb, 0x200011, msg);
return(II_ERROR);
};
/*
** The 2nd parameter is passed in a 'varchar' like format regardless of
** its true data type. Hence allow for a two character length offset at the
** start of the data
*/
true_length=*(short *)p2->db_data;
EVP_DigestInit(&mdctx, md);
EVP_DigestUpdate(&mdctx, (char *)p2->db_data + sizeof(short), true_length);
EVP_DigestFinal(&mdctx, md_value, &md_len);
/* Check return length is sane */
if (md_len * 2 > MAX_CHECKSUM_LENGTH) {
sprintf(msg,
"checksum(): Return length of %d exceeds maximumm permitted (%d)",
md_len * 2, MAX_CHECKSUM_LENGTH
);
us_error(scb, 0x200011, msg);
return(II_ERROR);
};
/* And fill in the data...
** Note that this is a direct rip-off of the man page...
*/
*((short *) rdv->db_data)=md_len * 2;
strcpy((char *)rdv->db_data + 2, "");
strcat(msg,"Result: ");
for(i = 0; i < md_len; i++) {
sprintf(xx, "%02x", md_value[i]);
xx[2]='\0';
strcat((char *)rdv->db_data + 2, xx);
strcat(msg,xx);
};
return(II_OK);
}; /* checksum */
static IIADD_FO_DFN Function_Definitions[]={
{/* Define new function "checksum"*/
II_O_OPERATION, /*fod_object_type*/
{"checksum"}, /*fod_name*/
UDF_CHECKSUM, /*fod_id*/
II_NORMAL /*fod_type*/
}
};
/*Need an array of 2-byte integers to hold the function argument datatypes*/
static II_DT_ID UD_2_VC[] = {II_VARCHAR, II_VARCHAR};
static IIADD_FI_DFN Function_Instances[] = {
{
II_O_FUNCTION_INSTANCE, /* fid_object_type */
UDF_FI_CHECKSUM_VARCHAR, /* fid_id*/
II_NO_FI, /* fid_cmplmnt*/
UDF_CHECKSUM, /* fid_opid=fod_id from function definition */
II_NORMAL, /* fid_optype */
II_FID_F0_NOFLAGS, /* fid_attributes*/
0, /* fid_wslength*/
2, /* fid_numargs*/
UD_2_VC, /* fid_args, a pointer to an array of datatypes*/
II_VARCHAR, /* fid_result, result is an integer*/
II_RES_FIXED, /* fid_rltype*/
MAX_CHECKSUM_LENGTH , /* fid_rlength */
0, /* fid_rprec */
checksum, /* fid_routine */
0 /* lenspec_routine */
} /*checksum(varchar, varchar)*/
};
Compilation
Your first exercise is to make that code example compile! What's there is very close. But you will need to add a few more things to complete it.
Hint: Read the Manual! Look in II_SYSTEM/ingres/demo/udadts for examples.
Ultimatly you need to make a shared object library, here's a Makefile that may help...
# Make OME Stuff: libOME.so
CCFLAGS=-c -fPIC -O0 -o$@
LDFLAGS=-shared
CC=cc
INC=-I${II_SYSTEM}/ingres/files
# In all Ingres...
LIBPATH=-L${II_SYSTEM}/ingres/lib
LIBS=-lcrypto -lssl
#
all: OME_functions.o
$(CC) $(LDFLAGS) -olibOME.so OME_functions.o
OME_functions.o:OME_functions.c
$(CC) OME_functions.c $(CCFLAGS) $(LIBPATH) $(INC)
Linking
Once the code compiles it needs to be linked to the server (iimerge) process. This is achieved via the 'iilink' utility. This is an interactive utility. When asked, you will need to provide it with the full pathname of the shared object library you have created.
Furthermore, you will be asked if you wish to create a server with a file extension. My suggestion is that you do this! I use the extension 'OME'.
You'll quickly find that the iilink can't find the libssl! To fix this you'll need to edit the file $II_SYSTEM/ingres/utility/iisysdep and alter the setting of the variable LDLIBMACH to include a '-lssl'.
Have another try with iilink. Having made the new iimerge.OME, my suggestion is that you:
cd $II_SYSTEM/ingres/bin mv iimerge iimerge.patch_number ln -s ./iimerge.OME iimerge
Once done you may restart the installation. Check the errlog after the startup and see if its all cool!
Testing
Just a simple SQL monitor session can be used to put this new function through its paces... Hints.
- 1. Can Ingres recognise the checksum() as a function it knows?
- 2. Ensure the error messages in the function are displayed correctly. Try using an invalid digest name.
- 3. Compare with the md5sum of a small file containing the same text as your test case within SQL.
Hints on Management
Congratulations you now have an installation with OME!
There are a few issues in regards to ongoing management of the installation you should be aware of. Namely:
- 1. Every subsequent patch install is going to blow away your OME functions. If you want to keep them you'll have to remember to relink them after every patch install. It helps if you have reminder in place to do this. In fact the symlink of iimerge --> iimerge.OME I suggested you put in place earlier is useful in this area. The patch installer notices the symlink, complains bitterly and stops! You will need to manually remove the symlink for the patch install to work. Thats a powerful hint that after the patch install, you may want to relink your OME!
- 2. Ditto you may need to edit iisysdep after each patch install for the iilink to work.
Exercises for the Reader
Now its your turn! After you get the checksum() code to compile and work it's time to look at what else can be done with it. Here are some suggestions.
- 1. Extend the function to also work on binary datatypes.
- Hint: You only need to add an FID that allows for binary datatypes. The same function executor code can be used.
- 2. Extend the function to work on long datatypes.
- Hint: Make some new function executor code just for the long datatypes.
- Question: How are you going to test this?
- 3. Add an external lenspec routine.
- Question: Why should the external lenspec routine have a default length?
Categories: DBMS | Contributors | Examples | Articles
