Monday, August 29, 2016

Creating a MariaDB MaxScale router module

I wanted to do some tests with MariaDB MaxScale and realized that the two existing routers (beyond the binlog router that is, which is a bit special) didn't do what I wanted them to do. What I was looking for was a simple round-robin feature and none of readconnroute nor readwritesplit could be configured to do this, they are just too smart for my simple experiment.

Why would you want a round-robin router then? Well, one use case is when you are INSERTing a lot of data and you just want to persist it. You don't have the use case where you have to SELECT data from all servers, but in the case you need it, you just select from all servers until you find what you need. Let's think about log data that you don't care much about but that  you for some reason need to retain, maybe for corporate policy reasons or legal reasons. Using round-robin could, in theory, give you better performance, but that would require something way smarter than what I am proposing here. Rather, you get INSERT availability, i.e. you will always have some server to insert into and secondly, you get INSERT sharding, which is basic but useful, you only store so much data on each server.

So, let's get to work. To begin with you need the MaxScale source tree, place yourself in some directory where you want this and do this:
$ git clone https://github.com/mariadb-corporation/MaxScale.git
Now you should have a directory called MaxScale, so pop in there are, create a build directory and then run cmake to configure MaxScale itself:
$ cd MaxScale
$ mkdir build
$ cd build
$ cmake ..
$ make
These are the quick instructions and you will probably find that you lack some dependencies. The full instructions for how to do this is available as part of the sample code as presented later in this document, and that is available from Sourceforge. Browse to https://sourceforge.net/projects/mxsroundrobin/files and then click on roundrobin 1.0 where you find a pdf with detailed instructions. Also there is a tgz there will all the sourcecode presented late in this blog.

So, now we have something to work with and the plan is to introduce a new router module in this tree. To begin with pop over to where routers module code is and create a directory for our code there:
$ cd ../server/modules/routing
$ mkdir roundrobin
$ cd roundrobin
Before we can start building some code, let's look at the basics of what kind of code gets into a module.

A plugin is a shared object that is loaded by MaxScale core when it starts. Early on when MaxSCale starts it reads the configuration file, /etc/maxscale.cnf by default, and in there each service defines a router. Note that several services can use the same router so our code we write later has to take this into account. Look at this extract of a service section for example:
[Read-Write Service]
type=service
router=readwritesplit
The router here tells MaxScale to look for a readwritesplit module, or in technical terms, it will load the shared library: libreadwritesplit.so. After loading this library successfully, MaxScale has to figure out a few things about this module, like it's name and version and above all, the entry points for the functions that MaxScale will call when processing a connection. In addition we need to define a few structs that are passed around these different calls to give the different router functions some context. Lets start with a header file rooundrobin.h in the roundrobin directory:
#ifndef ROUNDROBIN_H
#define ROUNDROBIN_H
#include <server.h>

typedef struct tagROUNDROBININST *PROUNDROBININST;

typedef struct tagROUNDROBIN_CLIENT_SES {
  SPINLOCK lock;
  bool bClosed;
  SERVER **pBackends;
  SESSION *pSession;
  DCB **pdcbClients;
  unsigned int nBackends;
  unsigned int nCurrBackend;
  PROUNDROBININST pRouter;
  struct tagROUNDROBIN_CLIENT_SES *pNext;
} ROUNDROBIN_CLIENT_SES, *PROUNDROBIN_CLIENT_SES;

typedef struct tagROUNDROBININST {
  SERVICE *pService;
  PROUNDROBIN_CLIENT_SES pConnections;
  SPINLOCK lock;
  SERVER **pBackends;
  unsigned int nBackends;
  struct tagROUNDROBININST *pNext;
} ROUNDROBININST;
#endif
As you can see, the main thing here is that I define and typedef two structs. As I said, I have mostly been looking at other existing routers and grabbed the stuff in there, so I can't explain all aspects of these structs, but let's look at a few members:
  • These structs are in a linked list and the pNext member is a pointer to the next element in this list.
  • The lock members is a reference to a spinlock associated with the struct.
  • The pBackends member is a pointer to an array of pointers to the database SERVERS that this service is attached to.
  • The pbcdClients member is an array of pointers to DCDs. A DCB is the Descriptor Control Block which is a generic descriptor of a connection inside MaxScale, be it a server or a client. In this case this is the DCBs to the SERVERs in pBackends.
  • The nBackends is the number of elements in the pBackends and pdcbClients arrays.
  • The pRouter member is a pointer to the ROUNDROBININST for the connection.
That is the most of that, the next step now is to start with the more exiting stuff of the actual code take make up this module. The main source file we work with here is roundrobin.c and we need a few basics in this. Let's have a look the beginning of roundrobin.c:
#include <my_config.h>
#include <router.h>
#include <query_classifier.h>
#include <mysql_client_server_protocol.h=>
#include "roundrobin.h"

/* Macros. */
#define ROUNDROBIN_VERSION "1.0.0"

/* Globals. */
MODULE_INFO info = {
  MODULE_API_ROUTER,
  MODULE_GA,
  ROUTER_VERSION,
  "A simple roundrobin router"
};
static PROUNDROBININST pInstances;

/* Function prototypes for API. */
static ROUTER *CreateInstance(SERVICE *service, char **options);
static void *CreateSession(ROUTER *pInstance, SESSION *session);
static void CloseSession(ROUTER *pInstance, void *session);
static void FreeSession(ROUTER *pInstance, void *session);
static int RouteQuery(ROUTER *pInstance, void *session, GWBUF *queue);
static void Diagnostic(ROUTER *pInstance, DCB *dcb);
static void ClientReply(ROUTER *pInstance, void *router_session,
  GWBUF *queue, DCB *backend_dcb);
static void HandleError(ROUTER *pInstance, void *router_session,
  GWBUF *errmsgbuf, DCB *backend_dcb, error_action_t action,
  bool *succp);
static int GetCapabilities();

static ROUTER_OBJECT RoundRobinRouter = {
  CreateInstance,
  CreateSession,
  CloseSession,
  FreeSession,
  RouteQuery,
  Diagnostic,
  ClientReply,
  HandleError,
  GetCapabilities
};
Let's now look at what is going on here. To begin with I include a few necessary files, including roundrobin.h that we created above and then a macro is defined. Then the MODULE_INFO struct follows. The information in this is used by MaxScale to get information on the router, but if you leave this out, currently MaxScale will start anyway. The command show modules in maxadmin will return the information in this struct for the module.


Then follows a number of function prototypes, and these are needed here before the ROUTER_OBJECT struct, and this is the key to the router as it provides the entry points for MariaDB itself. Again, I will not specify exactly what all of these do, I have mostly just grabbed code from other routers.

Following this we need some basic functions that all routers implement, to initialize the module, get the version and a function to return the ROUTER OBJECT defined above:
/*
 * Function: ModuleInit()
 * Initialize the Round Robin router module.
 */
void ModuleInit()
   {
   MXS_NOTICE("Initialise roundrobin router module version " ROUNDROBIN_VERSION ".");
   pInstances = NULL;
   } /* End of ModuleInit(). */


/*
 * Function: version()
 * Get the version of the roundrobin router
 */
char *version()
   {
   return ROUNDROBIN_VERSION;
   } /* End if version(). */


/*
 * Function: GetModuleObject()
 * Get the object that describes this module.
 */
ROUTER_OBJECT *GetModuleObject()
   {
   return &RoundRobinRouter;
   } /* End of GetModuleObject(). */
With that we have completed the housekeeping code and are ready to look at the functions that implement the actual functionality. We'll look at CreateInstance first which, as the name implies, creates an snstance of RoundRobin. Note that within a running MaxScale there might well be more than one instance, one for each RoundRobin service.
/*
 * Function: CreateInstance()
 * Create an instance of RoundRobing router.
 */
ROUTER *CreateInstance(SERVICE *pService, char **pOpts)
   {
   PROUNDROBININST pRet;
   PROUNDROBININST pTmp;
   SERVER_REF *pSvcRef;
   unsigned int i;

   MXS_NOTICE("Creating roundrobin router instance.");
/* Allocate the RoundRobin instance struct. */
   if((pRet = malloc(sizeof(ROUNDROBININST))) == NULL)
      return NULL;
   pRet->pService = pService;
   pRet->pConnections = NULL;
   pRet->pNext = NULL;
   pRet->nBackends = 0;

/* Count the number of backend servers we manage. */
   for(pSvcRef = pService->dbref; pSvcRef != NULL; pSvcRef = pSvcRef->next)
      pRet->nBackends++;

/* Allocate space for the backend servers and add to the instance struct. */
   if((pRet->pBackends = calloc(pRet->nBackends, sizeof(SERVER *))) == NULL)
      {
      free(pRet);
      return NULL;
      }

   spinlock_init(&pRet->lock);

/* Set up list of servers. */
   for(i = 0, pSvcRef = pService->dbref; pSvcRef != NULL; i++, pSvcRef = pSvcRef->next)
      pRet->pBackends[i] = pSvcRef->server;

/* Set up instance in list. */
   if(pInstances == NULL)
      pInstances = pRet;
   else
      {
      for(pTmp = pInstances; pTmp->pNext != NULL; pTmp = pTmp->pNext)
         ;
      pTmp->pNext = pRet;
      }

   MXS_NOTICE("Created roundrobin router instance.");
   return (ROUTER *) pRet;
   } /* End of CreateInstance(). */
Again, nothing really exiting is happening, I create a struct that defines the instance, initialize it and add it to the linked list of instances that I maintain. Also I get references to the backend servers that this instance use and set up the array for the and I also initialize the spinlock. With that, we are done. Then there is the issue of creating a session, and this function gets called when a client connects to MaxScale through the port that is linked to RoundRobin.
/*
 * Function: CreateSession()
 * Create a session in the RoundRobin router.
 */
void *CreateSession(ROUTER *pInstance, SESSION *session)
   {
   PROUNDROBIN_CLIENT_SES pRet;
   PROUNDROBIN_CLIENT_SES pTmp;
   PROUNDROBININST pRoundRobinInst = (PROUNDROBININST) pInstance;
   unsigned int i;

/* Allocating session struct. */
   if((pRet = malloc(sizeof(ROUNDROBIN_CLIENT_SES))) == NULL)
      return NULL;
   spinlock_init(&pRet->lock);
   pRet->pNext = NULL;
   pRet->nCurrBackend = 0;
   pRet->pSession = session;
   pRet->pRouter = pRoundRobinInst;
   pRet->nBackends = pRoundRobinInst->nBackends;

/* Allocating backends and DCBs. */
   if((pRet->pBackends = calloc(pRet->nBackends, sizeof(SERVER *))) == NULL)
      {
      free(pRet);
      return NULL;
      }
   if((pRet->pdcbClients = calloc(pRet->nBackends, sizeof(DCB *))) == NULL)
      {
      free(pRet->pBackends);
      free(pRet);
      return NULL;
      }

/* Set servers and DCBs. */
   for(i = 0; i < pRet->nBackends; i++)
      {
      pRet->pBackends[i] = pRoundRobinInst->pBackends[i];
      pRet->pdcbClients[i] = NULL;
      }

/* Place connecting last in list of connections in instance. */
   spinlock_acquire(&pRoundRobinInst->lock);
   if(pRoundRobinInst->pConnections == NULL)
      pRoundRobinInst->pConnections = pRet;
   else
      {
      for(pTmp = pRoundRobinInst->pConnections; pTmp->pNext != NULL; pTmp = pTmp->pNext)
         ;
      pTmp->pNext = pRet;
      }
   spinlock_release(&pRoundRobinInst->lock);

   return (void *) pRet;
   } /* End of CreateSession(). */
This is also pretty basic stuff, the server pointers are copied from the instance (do I need to do this you ask? Answer is, I don't know but I do know that what I do here works). I also clear the DCB pointers, these are created on an as-needed base later in the code.

Following this are a couple of basic housekeeping functions that I am not showing here, actually I'm just going to show one more function, which is RouteQuery. This is, as the name implies, the function that gets called to do what we are actually writing this code for, routing queries. Before I show that code, I have to explain that this is very simplistic code. To being with, it doesn't implement "session commands", these are commands that really should be run on all backends, like setting the current database, handling transactions and such things. As I said, I do not implement this and this is one of the major shortcomings on this code that makes it much less generally applicable. But it still has use cases. Secondly, I have tried to make sure that the code works, more than optimizing it to death, so maybe I grab the spinlock too often and maybe I am too picky with allocating/deallocating the DCBs, I let others answer that.

The role of the function at hand is to handle an incoming query and pass it along to one of the servers defined for the service in question. In the general case, the most complicated part of this is selection of which server to route the query to and handling of session commands. I have simplified this by only having a very simple routing algorithm where I store the index of the last used backed for a connection in the nCurrBackend member, and for each query this is incremented until nBackends is reached where it is reset to 0. And for the complexity of session commands, I just don't implement them.

So, lets have a look at what the RouteQuery function looks like:
/*
 * Function: RouteQuery()
 * Route a query in the RoundRobin router.
 */
int RouteQuery(ROUTER *instance, void *session, GWBUF *queue)
   {
   PROUNDROBIN_CLIENT_SES pSession = (PROUNDROBIN_CLIENT_SES) session;
   DCB *pDcb;
   int nRet;
   unsigned int nBackend;

   MXS_NOTICE("Enter RoundRobin RouteQuery.");
   queue = gwbuf_make_contiguous(queue);

   spinlock_acquire(&pSession->lock);
/* Check for the next running backend. Set non-running backend DCBs to NULL. */
   for(nBackend = pSession->nCurrBackend; nBackend < pSession->nBackends; nBackend++)
      {
/* If this server is up, then exit this loop now. */
      if(!SERVER_IS_DOWN(pSession->pBackends[nBackend]))
         break;

/* If the server is down and the DCB is non-null, then free the DCB and NULL it now. */
      if(pSession->pdcbClients[nBackend] != NULL)
         {
         dcb_close(pSession->pdcbClients[nBackend]);
         pSession->pdcbClients[nBackend] = NULL;
         }
      }
/* If I couldn't find a backend after the current, then look through the ones before. */
   if(nBackend >= pSession->nBackends)
      {
      for(nBackend = 0; nBackend <= pSession->nCurrBackend; nBackend++)
         {
         if(!SERVER_IS_DOWN(pSession->pBackends[nBackend]))
            break;
         if(pSession->pdcbClients[nBackend] != NULL)
            {
            dcb_close(pSession->pdcbClients[nBackend]);
            pSession->pdcbClients[nBackend] = NULL;
            }
         }

/* Check that I really found a suitable backend. */
      if(nBackend > pSession->nCurrBackend)
         {
         spinlock_release(&pSession->lock);
         MXS_NOTICE("No suitable RoundRobin running server found in RouteQuery.");
         return 0;
         }
      }

   pDcb = pSession->pdcbClients[nBackend];
/* If backend DCB wasn't set, then do that now. */
   if(pDcb == NULL)
      pDcb = pSession->pdcbClients[nBackend] = dcb_connect(pSession->pBackends[nBackend],
        pSession->pSession,
        pSession->pBackends[nBackend]->protocol);
   spinlock_release(&pSession->lock);

/* Route the query. */
   nRet = pDcb->func.write(pDcb, queue);

/* Move to next dcb. */
   pSession->nCurrBackend = nBackend;
   if(++pSession->nCurrBackend >= pSession->nBackends)
      pSession->nCurrBackend = 0;

   MXS_NOTICE("Exit RoundRobin RouteQuery.");
   return 1;
   } /* End of RouteQuery(). */
So, what is going on here? First I check for a backend, first the ones starting with the current one (which is badly named, this is actually the one after the current) and then until I find a server that is running. If I find a non-Running server I skip that one, after having closed the associated DCB. If I can't find a server after the current one, I start again from the first, processing servers in the same way.

Following this I should have a server, then I check if the DCB is open, and if not I open it now. After that I do the actual routing of the query, move not the next backend and then return. Simple as that. As I have stated, this is a very simple router, but it does work, within the given limitations, and it should be good enough as a crude example.

Before I can test my code, I have to set it up for inclusion in the build process and do a few other mundane tasks, but that is all documented in the pdf that comes with the code, download the package from Sourceforge.

Happy SQLing
/Karlsson

No comments: