Random Ingres Coding Thoughts

From Ingres Community Wiki

(Karl Schendel)

This is a more or less random collection of thoughts, style points, things to watch out for, and code explanations. Most of the focus is on the Ingres back-end (DBMS server), although I have a very few things to say about the front-end tools as well.

Note: This is an updated version of the article originally posted as a Sponsored Advertisement on DevX.com

Coding Style

Some thoughts that supplement or extend the practices listed in the Ingres Coding Style document.

NULL vs zero. NULL is a pointer. Zero is a number. Don't mix them; this is the source of many of the compiler warnings in the current source. [Winter '07] I don't feel the need to cast NULL, because NULL is conceptually a generic pointer. (Even though it's usually not implemented as (void *)0.) Casting NULL to a specific pointer type is acceptable but not necessary. Zero should ideally not be used as a pointer, even if you cast it. This practice (using eg (i4 *)0 as NULL) is common in older code but should be deprecated. Never say "if (number == NULL)". Never say "if (pointer == 0)". Fix code that mixes up NULL and 0 when you get a chance.
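A minimal sketch of the distinction in plain C. The routine and variable names are invented for illustration and are not taken from the Ingres source.

```c
#include <stddef.h>

/* Hypothetical helper: count the entries in a NULL-terminated
** array of string pointers. */
int demo_count_entries(const char *const *list)
{
    int count = 0;                  /* a number: use zero */

    if (list == NULL)               /* a pointer: compare against NULL */
        return 0;
    while (list[count] != NULL)     /* pointer test again, so NULL */
        count++;
    return count;
}
```

Every comparison here matches the kind of thing being compared: counts against 0, pointers against NULL.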

Declaration indentation. Some authors line up variable names in a declaration list, so:

    i4      foo;
    u_i4    bar;

while other authors just use a space:

    i4 foo;
    u_i4 bar;

Either way is acceptable. If names are lined up, they should be lined up on a reasonable tab-stop (in the sense of the Style manual, ie 4 spaces, tab, tab+4, etc). What is not acceptable is a mess with variable names indented to random-looking places, usually with a mix of spaces and tabbing.

Generic pointers. There are two issues here: the type of a generic pointer, and when to use one. Traditionally in Ingres, the PTR type has been used for both a byte pointer (char *) and a generic pointer. All modern compilers implement void *, which is a much better generic pointer. New code that needs a generic pointer in a struct member or routine parameter should use void *.

Many existing structures use PTR for pointer members because the specific pointer type is not always available at that point. (Ingres coding style typedefs all struct types, and so the typedef must be available at the point of use.) For structure members only, it is acceptable to use the "struct foo *" form; this compiles even if the definition of "struct foo" is not available to the struct definition. This method is preferred in new code over the old use of PTR with casting, as it produces compiler warnings if the wrong pointer is used with the struct member. If a struct must hold a truly anonymous pointer, or if the pointer type is hidden in another facility and should not be seen even as "struct foo *", then use a void * rather than using PTR.
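As a rough illustration of the two cases above, the sketch below uses a "struct foo *" style member for a pointer whose definition is not visible, and void * for a truly anonymous pointer. All names (demo_lock, DEMO_TABLE, demo_attach) are hypothetical.

```c
#include <stddef.h>

/* "struct demo_lock" is never defined in this file; the incomplete
** type is enough to declare pointers to it. */
struct demo_lock;

typedef struct demo_table
{
    int               tbl_id;
    struct demo_lock *tbl_lock;     /* typed pointer, definition hidden */
    void             *tbl_owner;    /* truly anonymous: void *, not PTR */
} DEMO_TABLE;

/* Passing anything other than a struct demo_lock * for "lock" draws a
** compiler diagnostic, which a PTR member with casts would not. */
void demo_attach(DEMO_TABLE *tbl, struct demo_lock *lock, void *owner)
{
    tbl->tbl_lock = lock;
    tbl->tbl_owner = owner;
}
```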

Condition-less if's. By which I mean "if (var) ...". This is a construct that is easy to abuse; take a few extra keystrokes for readability. An if-test should omit the explicit test only in two cases: if the variable is a bool, or if it's a pointer and you're testing for not NULL. These forms are acceptable:

    if (bool) ...
    if (! bool) ...
    if (ptr) ...                meaning if (ptr != NULL)

Other forms are simply confusing and it's better to explicitly write the test. The code compiles the same, and the additional 2 or 3 characters of typing are not going to hurt you. Avoid:

    if (!ptr) ...               meaning if (ptr == NULL), please avoid
    if (int_var) ...            avoid unless int_var is a zero/nonzero status variable
    if (STcompare(a,b)) ...     avoid this one especially, it's anti-English and confusing
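Here is a small sketch of the explicit style. The standard strcmp stands in for STcompare, on the assumption that STcompare follows the same "zero means equal" convention; the routine name is made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical routine: true if both names are present and identical. */
bool demo_same_name(const char *a, const char *b)
{
    if (a == NULL || b == NULL)     /* explicit NULL test */
        return false;
    if (strcmp(a, b) == 0)          /* explicit: zero means "equal" */
        return true;
    return false;
}
```

Written as "if (strcmp(a, b))", the second test would read like "if the strings compare" while actually meaning "if they differ".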

Assignment expressions. C originally made assignment an expression, rather than a statement, mainly so that compilers wouldn't have to do common subexpression analysis and decent register allocation. Modern C compilers aren't so dumb anymore, so the choice of whether to write:

    foo = expr;
    if (foo != 0) ...

versus

    if ( (foo = expr) != 0) ...

should be made based on which is more readable and less error prone – and not based on "efficiency" concerns. Probably 9 times out of 10, the first form is easier to read, and is certainly less error prone.

Commenting. The Coding Style guide mentions this, but it bears repeating: comment your code. If you are the world's most elegant programmer, your code might be "self-documenting" (probably a lot less than you think); but your intent is most certainly not self-evident. Routine header comments and code block comments are for "why am I doing this", not so much for "what am I doing".

Bit maps/bit vectors. Never hardcode a primitive C type (i4 or i8) to hold a bit map. Bit maps have a habit of expanding over time. Either define a macro-parameterized array type (PST_J_MASK is one of many examples), or dynamically allocate the bit map. Use the BTxxx functions available in the CL. Avoid by-hand bit manipulation unless you use a parameterized loop.
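A minimal sketch of a macro-parameterized bit map type. The names and the size of 200 are invented; this is not how PST_J_MASK is defined, and real Ingres code would call the CL BTxxx routines rather than the stand-in helpers shown here.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Growing the map means changing DEMO_MAX_COLS and nothing else. */
#define DEMO_MAX_COLS   200
#define DEMO_WORD_BITS  (sizeof(unsigned int) * CHAR_BIT)
#define DEMO_MAP_WORDS  ((DEMO_MAX_COLS + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

typedef struct
{
    unsigned int map[DEMO_MAP_WORDS];
} DEMO_COL_MASK;

void demo_mask_clear(DEMO_COL_MASK *m)
{
    memset(m->map, 0, sizeof(m->map));
}

void demo_mask_set(DEMO_COL_MASK *m, unsigned int bit)
{
    m->map[bit / DEMO_WORD_BITS] |= 1u << (bit % DEMO_WORD_BITS);
}

bool demo_mask_test(const DEMO_COL_MASK *m, unsigned int bit)
{
    return (m->map[bit / DEMO_WORD_BITS] & (1u << (bit % DEMO_WORD_BITS))) != 0;
}

/* A parameterized loop: the bound comes from the macro, not a literal. */
int demo_mask_count(const DEMO_COL_MASK *m)
{
    int count = 0;
    unsigned int bit;

    for (bit = 0; bit < DEMO_MAX_COLS; bit++)
    {
        if (demo_mask_test(m, bit))
            count++;
    }
    return count;
}
```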

Control flow. Be aware that Ingres, and the server in particular, likes to use dummy for (;;) loops as a device to break forward from for error handling. In most cases, the loop never loops. It might be better to use a do...while (0) construct if no actual looping ever occurs, just to highlight that fact. Limited use of goto for error exit and error cleanup is OK.
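The break-forward idiom might look something like the following, written here with do...while (0) to make it plain that nothing loops. The status codes and routines are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_OK       0
#define DEMO_BADARG   1
#define DEMO_NOMEM    2
#define DEMO_BADCHAR  3

/* Upper-case a string in place; reject anything outside 7-bit ASCII. */
static int demo_upper_in_place(char *s)
{
    for (; *s != '\0'; s++)
    {
        unsigned char c = (unsigned char) *s;

        if (c > 127)
            return DEMO_BADCHAR;
        *s = (char) toupper(c);
    }
    return DEMO_OK;
}

/* Return an upper-cased copy of src in *result; caller frees it. */
int demo_dup_upper(const char *src, char **result)
{
    char *copy = NULL;
    int   status = DEMO_OK;

    *result = NULL;
    do      /* never loops; exists only so errors can break forward */
    {
        if (src == NULL)
        {
            status = DEMO_BADARG;
            break;
        }
        copy = malloc(strlen(src) + 1);
        if (copy == NULL)
        {
            status = DEMO_NOMEM;
            break;
        }
        strcpy(copy, src);
        status = demo_upper_in_place(copy);
        if (status != DEMO_OK)
            break;
        *result = copy;     /* success: ownership passes to the caller */
        copy = NULL;
    } while (0);

    free(copy);             /* error cleanup; free(NULL) is harmless */
    return status;
}
```

One thing to watch with this idiom: a break inside a real inner loop only leaves that inner loop, which is why the per-character work lives in its own routine here.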


General Server Observations

QUEL. You don't need to know QUEL to work on the server, but it will help; especially if you plan on doing parser or optimizer work. In particular, try to gain at least some understanding of QUEL aggregates, which are very different from SQL aggregates. SQL aggregates (and SQL queries in general) are inherently single-pass. In QUEL, one query can ask for multiple unrelated aggregates, nested aggregates, and aggregates with different WHERE clauses. The QUEL aggregate BY-list is more complex than the simple SQL group-by.

ICE server directories. The ascf, awsf, and urs facilities under back are for the ICE server: a tool for feeding data into HTML pages, now de-emphasized if not positively deprecated. For most normal DBMS server work, you can ignore them, but don't break them. Don't forget these directories when you grep for where-used.

Threads. Ingres is a multi-threaded server. It can use an internal threads model (user level Ingres threads), or an OS-threads model. Internal threading is slowly dying off, and indeed is disallowed on the newer platforms; but it's not dead yet. OS-threading is used where the OS supports a posix or posix-like threading model. Where OS threads are available, the threading model is switchable to internal threads at startup time. This means that threading-model tests in the code must be runtime tests, not compile time #if's.
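To illustrate the runtime-versus-compile-time point, here is a sketch in which a compile-time macro only says whether the platform can do OS threads, while the model actually in use is a flag set at startup. The macro, flag, and routine names are all hypothetical and do not correspond to real Ingres symbols.

```c
#include <stdbool.h>

/* Assumption for this sketch: the platform supports OS threads. */
#define DEMO_HAVE_OS_THREADS 1

typedef enum { DEMO_IO_BLOCKING, DEMO_IO_SLAVE } DEMO_IO_MODE;

/* Set once at server startup, from configuration. */
static bool demo_os_threaded = false;

void demo_startup(bool want_os_threads)
{
#if DEMO_HAVE_OS_THREADS
    /* Capability is compile-time; the choice is still made at startup. */
    demo_os_threaded = want_os_threads;
#else
    (void) want_os_threads;
    demo_os_threaded = false;
#endif
}

/* Code that cares about the threading model tests the flag at runtime. */
DEMO_IO_MODE demo_io_mode(void)
{
    if (demo_os_threaded)
        return DEMO_IO_BLOCKING;    /* blocking call in the thread */
    return DEMO_IO_SLAVE;           /* hand the I/O to a slave */
}
```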

If OS threads are used, in-thread blocking system calls are used for I/O. If internal threads are used, the DBMS server is effectively single-threaded as far as the OS is concerned. So, all I/O is done by some sort of slave to avoid blocking the server (allowing it to run another internal thread). Traditionally, an I/O slave is a separate process, which receives I/O requests through shared memory, and does I/O either through a shared memory buffer or directly into a shared memory buffer cache. Async I/O is also available; however the current implementation runs what is essentially an in-process asynchronous I/O slave thread, and the overhead is significant on most platforms.

An Ingres thread can be roughly classified into one of four types: A regular user session thread; a monitor thread; an internal DMF thread; or a factotum thread. User sessions normally run regular queries, but they might be running Star queries (in a Star server), or RAAT queries (row-at-a-time, deprecated). Monitor threads are for monitoring and shutting down the server. Internal DMF threads do DMF things like write-behind, and live almost entirely within DMF. Factotum threads are temporary worker threads for doing things like parallel sorting or parallel query execution. A factotum is spawned from a user session thread, and will often share some resources with the parent thread.

Stack space. A thread executes using its own thread stack. Ingres thread stacks are unusually large compared to non-DBMS threaded programs; the current default is over 128K per thread, or more on 64-bit platforms. The large stack is necessary to accommodate extensive recursion, particularly in parsing and optimizing. You need to do your part in saving stack space by avoiding large auto allocations in routines you write. Avoid putting large structures or large data values on the stack; use heap memory if necessary. (Of course common sense must intervene as well; a heap allocation in some low level routine called millions of times per query would not be a good idea.)

Multiple DBMS servers. It's easy to fall into the habit of thinking that everything you need is there in your server address space. This is false, three different ways. First, even with OS-threading, one address space only scales so far. (Common wisdom is about 400-500 user sessions per DBMS server; this is probably right to within a factor of 2 or so.) When running internal-threads, with Ingres doing its best to schedule threads in user-land, the agreeable level is more like 50-60 sessions per server. After that, if the installation is to support more users, additional DBMS server(s) are started. These servers may share DBMS cache, but do not share anything else (not RDF cache, nor QSF memory, nor DMF table control blocks).

Second, even if one single DBMS server is running, one still has to deal with the recovery server. Note that this is really only a concern for DMF. The recovery server is sort of a hot-spare for DMF, and serves as a master controller for locking and logging (among other things). DMF and its buffer cache must be ready to deal with the recovery server performing some operation on behalf of the regular server.

Third, Ingres supports shared-nothing clustering if the underlying platform supports a suitable distributed lock manager (DLM). This is an extreme variant of the first case, where not even the locking/logging system is shared directly. The locking system is shared indirectly, via the DLM, and acts as if it were cluster-wide. The logging system runs individual logs per cluster node.

Ingres does not always do the greatest job of detecting and invalidating obsolete objects installation-wide. It works, but not always efficiently. Make things better, not worse.

STAR (distributed) server. Ingres/Star is a distributed database facility. A distributed database is constructed from registrations (views, essentially) from one or more local databases. Queries against the distributed database are decomposed into queries against the local databases. Partial results from the local database servers are combined in the Star server.

Star actually issues textual queries to the local servers. This makes a Star server more of a text processing machine than a normal database machine. On the other hand, a Star server needs most of the query grammar and much of the basic query optimization that a DBMS server needs. (And some of query execution, too, but essentially none of DMF.) The original Star used separate code, and that didn't work out too well. So a Star server is built from the same code as a regular DBMS server.

Many of the comments will say something like "is this a distributed thread?", the implication being that distributed (Star) threads can coexist with regular non-Star threads. This is in fact not the case, although a combined Star/non-Star server is something that's been talked about for years. A server is either Star, or not Star, and all threads in a server are either Star, or not Star.

It's easy to forget about Star, but don't. If you are extending syntax you will certainly have to at least consider Star, if for no other reason than to issue error messages. Star has the biggest impact on the parser, the optimizer, and RDF (caching of various Star catalogs and remote fetching of table information).

Cross-facility calls. Are absolutely prohibited, no matter what. Every server facility (except ULF) has some kind of xxx_call interface; use it. ULF is the closest thing the server has to a global utility library; something that is really and truly global in nature can go into ULF somewhere (creating a new subdirectory if needed).

(Having said the above, there are a few direct cross-facility calls; for example, OPF calls pst_resolve in the parser directly. This is evil, wrong, and broken; feel free to fix this sort of thing any time you find it.)

ADF does not count as a server facility in the sense of having an adf_call (it doesn't). Calls to specific functions in ADF are allowed, especially calls to the equivalent of user level functions. (Eg a call to adu_dategmt is OK.) A cross-call to a lower level ADF utility (for instance, adu_3day_of_date) would be strongly discouraged. ADF does have a way to call back into server facilities in a controlled manner; see adufexi.c.

Where to put new code. Usually it's obvious. Sometimes, you are adding some kind of global definition, and the proper place to put it may not be quite as obvious. Some rules of thumb:

  • System-wide low-level things can go into CL if they are clearly "compatibility library" related, or into GL if it's a low level general front-end-or-back-end utility type of thing.
  • If your new definition or routine is closely related to ADF, it should go into common/adf – even if it's only used by the DBMS server. The adf/adc and adf/adt subdirectories have a lot of backend-only routines.
  • The only place for definitions that are truly global, and not specific to some other subsystem like ADF, is common/hdr/hdr/iicommon.h. There is no "becommon.h" for definitions global to back-end and common, nor is there an "fecommon.h" for definitions global to front-end and common. Use iicommon.h for either.
  • Definitions that are backend-wide, but don't need to be known outside of back (ie not common), can go into one of the back/hdr/hdr files. dbdbms.h is the place to put generic backend-only definitions.
  • As mentioned above, code that belongs to the backend only, is not ADF-related, and is not specific to any facility, can go into a ULF subdirectory.
  • If the proper facility is obvious, but the proper subdirectory is not, choose whatever seems clearest. New facility subdirectories can be created but this should be done sparingly and for good reason.
  • Locking and logging definitions are a bit odd, for historical reasons. The "public" definitions are in the GL header directory, with additional private declarations in dmf/hdr.

Query "main loop". For most purposes, the sequencer (scsqncr.c) can be considered the "main loop". The real "main loop" for a thread is in the CL, subdirectory cs or csmt, CS_setup or CSMT_setup. Unless you are chasing a problem with client program interaction (interrupts, say), you probably don't need to deal with the CS_setup loop.

Facility Control Blocks. Most if not all facilities have some sort of Server Control Block containing facility parameters (from startup), and information needed globally for the facility. It's better to keep facility global data in the appropriate control block, and not in a bunch of random global variables. Most facilities will also have some sort of Session Control Block, one instance per user session. Calls to a facility are directed through an xxx_call interface, usually passing a facility Request Control Block containing the parameters needed for the request, and supplying a place to put the answer. In some cases (eg DMF) there are a variety of Request Control Blocks, with the specific one to use dependent on the function. More usually there is one Request Control Block type (or some small number of them), as opposed to many different variants specific to different functions. Finally, there will be a variety of facility specific control blocks to handle the facility's specific functionality.

To take a specific example, QEF has one QEF_S_CB (QEF server control block); one QEF_CB per session; and a QEF_RCB request control block for encoding most QEF function requests. The QEF_RCB usually sticks around for the life of the QEF request, as it's used to return result status; this is typical. QEF also uses many subsidiary control blocks during a request: QEE_DSH data segment headers, QEN_NODE query plan nodes, QEF_QP_CB query plan headers, and many more.
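The general shape of the pattern can be sketched with an invented facility "XYZ". None of these structures, opcodes, or fields are real QEF definitions; they only mirror the server CB / session CB / request CB / xxx_call arrangement described above.

```c
#include <stddef.h>

typedef struct { int xyz_max_value; } XYZ_S_CB;     /* server CB: startup parameters */
typedef struct { int xyz_requests_seen; } XYZ_CB;   /* session CB: one per session */

typedef struct                                      /* request CB */
{
    XYZ_CB *xyz_session;        /* session making the request */
    int     xyz_in_value;       /* request parameter */
    int     xyz_out_value;      /* place to put the answer */
    int     xyz_error;          /* result status, lives as long as the request */
} XYZ_RCB;

#define XYZ_DOUBLE  1
#define XYZ_NEGATE  2

#define XYZ_OK      0
#define XYZ_BADOP   1
#define XYZ_BADVAL  2

/* Facility-global data lives in the server CB, not in loose globals. */
static XYZ_S_CB Xyz_svcb = { 1000 };

/* The single entry point other facilities are allowed to use. */
int xyz_call(int opcode, XYZ_RCB *rcb)
{
    rcb->xyz_error = XYZ_OK;
    rcb->xyz_session->xyz_requests_seen++;

    if (rcb->xyz_in_value > Xyz_svcb.xyz_max_value)
    {
        rcb->xyz_error = XYZ_BADVAL;
        return rcb->xyz_error;
    }
    switch (opcode)
    {
    case XYZ_DOUBLE:
        rcb->xyz_out_value = rcb->xyz_in_value * 2;
        break;
    case XYZ_NEGATE:
        rcb->xyz_out_value = -rcb->xyz_in_value;
        break;
    default:
        rcb->xyz_error = XYZ_BADOP;
        break;
    }
    return rcb->xyz_error;
}
```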

The session control blocks for a thread's facilities are allocated from the same memory block as the main SCF session control block, at thread startup time. This makes it easy to locate the session control block for any facility from a session's ID (which is the same as the session's SCF session control block address), without requiring auxiliary arrays of pointers to session control blocks.

Thread safety. Global data must be read and written in a thread-safe manner. For historical reasons, Ingres is rather simplistic in its handling of thread safety, and usually insists on a mutex being held to read or write global data. There is a CSadjust_counter routine which uses compare-and-swap hardware (if available) to do un-mutexed counter arithmetic. Dirty reads of global data are often done, and this is OK as long as the reading code is aware that the read can return garbage. (Typically, the dirty read is done to test for the possibility of some condition; if the condition may exist, a mutex is taken and the read repeated safely for verification.)
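The dirty-read-then-verify pattern might be sketched as below. This uses POSIX mutexes and a C11 relaxed atomic load as stand-ins; it does not use the Ingres CS mutex routines, and older Ingres code would typically do the dirty read as a plain unsynchronized load. All names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_int      demo_pending = 0;    /* updated only under demo_mutex */

void demo_add_work(void)
{
    pthread_mutex_lock(&demo_mutex);
    demo_pending++;
    pthread_mutex_unlock(&demo_mutex);
}

/* Returns true if a unit of work was claimed. */
bool demo_take_work(void)
{
    bool took = false;

    /* Dirty read: cheap, possibly stale. It only says whether the
    ** condition might hold, so nothing is decided on it alone. */
    if (atomic_load_explicit(&demo_pending, memory_order_relaxed) == 0)
        return false;

    /* The condition may exist: take the mutex and re-read for real. */
    pthread_mutex_lock(&demo_mutex);
    if (demo_pending > 0)
    {
        demo_pending--;
        took = true;
    }
    pthread_mutex_unlock(&demo_mutex);
    return took;
}
```

The common "nothing to do" case never touches the mutex, which is the whole point of the dirty read.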

Deciding the scope of a mutex is something of an art. A mutex that locks too much data will become a bottleneck and prevent scaling. Mutexes that lock too little data will lead to a profusion of mutexes, and time wasted taking and releasing them. Nested mutexes must be taken in the same order at all times, or deadlock is possible. This ordering should be documented (!).
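A small sketch of a documented mutex order, again using POSIX mutexes rather than Ingres CS calls, with invented names. The rule is written once, at the declarations, and every routine follows it.

```c
#include <pthread.h>

/* MUTEX ORDER (documented): demo_list_mutex is always taken before
** demo_entry_mutex. Release in the reverse order. */
static pthread_mutex_t demo_list_mutex  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_entry_mutex = PTHREAD_MUTEX_INITIALIZER;

static int demo_list_count = 0;     /* protected by demo_list_mutex */
static int demo_entry_refs = 0;     /* protected by demo_entry_mutex */

void demo_add_entry(void)
{
    pthread_mutex_lock(&demo_list_mutex);       /* first */
    pthread_mutex_lock(&demo_entry_mutex);      /* second */
    demo_list_count++;
    demo_entry_refs++;
    pthread_mutex_unlock(&demo_entry_mutex);
    pthread_mutex_unlock(&demo_list_mutex);
}

/* Returns 1 if an entry was removed, 0 if the list was empty. */
int demo_remove_entry(void)
{
    int removed = 0;

    pthread_mutex_lock(&demo_list_mutex);       /* same order as above */
    pthread_mutex_lock(&demo_entry_mutex);
    if (demo_list_count > 0)
    {
        demo_list_count--;
        demo_entry_refs--;
        removed = 1;
    }
    pthread_mutex_unlock(&demo_entry_mutex);
    pthread_mutex_unlock(&demo_list_mutex);
    return removed;
}
```

If demo_remove_entry took the entry mutex first, two threads running the two routines concurrently could each hold one mutex and wait forever for the other.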

CL Routine names and debugging. The CL is meant for both front-end and back-end of Ingres. When the CL appears in a front-end client, it may be tucked away in an Ingres shared library; but it may also be linked in with the client from libingres.a, in the traditional manner. Therefore some effort was made to make all the global names in CL start with II. For example, MEreqmem is defined in gl/hdr/me.h as IIMEreqmem. So, if you are looking to put a breakpoint on MEreqmem in a debugger, you'll get a "no such symbol"; when this happens, try putting II on the front. The II prefixing was not done consistently; roughly half the CL uses the II prefix, the rest doesn't.
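The renaming can be pictured with a made-up routine. This is only modeled on the MEreqmem/IIMEreqmem case described above; DEMOlength and IIDEMOlength are hypothetical names, and the real me.h may be arranged differently.

```c
#include <stddef.h>

/* What a header might do: every use of the short name is rewritten
** by the preprocessor before the compiler ever sees it. */
#define DEMOlength IIDEMOlength

/* The source says DEMOlength, but the symbol that ends up in the
** object file is IIDEMOlength. */
size_t DEMOlength(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}
```

In a debugger, a breakpoint on DEMOlength would fail with an unknown-symbol error, while one on IIDEMOlength would work, because only the prefixed name exists after preprocessing.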
