New macro system for C/C++

Now is probably a good time to start talking about my own macro system for C/C++. I’ve already written a bit about it in the first post of this cycle. The codename for the project is RMPP, which stands for Real Macro Preprocessor (for C/C++). Why did I call it that? Because the current preprocessor is way too limited to be considered a real one compared to the others I’ve discussed. The biggest drawback of the current approach, as I see it, is the lack of scriptability. By this I mean that one can only define simple substitutions using #define macros and that’s pretty much it. You can’t include any logic, flow control, loops and so on, just basic substitutions. I’ve noticed that just by adding the ability to write some code that generates the macro output, instead of using fixed text, the macro ecosystem in C/C++ becomes much richer. Take this macro for example:

#include <stdio.h>

int main(int argc, char *argv[])
{
        // The bit string is read from argv[2].
        if (argc != 3)
        {
                fprintf(stderr, "USAGE: @bits 010101010101 @end\n");
                return -1;
        }

        unsigned int result = 0;

        for (char *bits = argv[2]; *bits; bits++)
        {
                switch (*bits)
                {
                case ' ':
                case '\t':
                case '\n':
                case '\r':
                        // Whitespace between digits is ignored.
                        break;
                case '0':
                        result <<= 1;
                        break;
                case '1':
                        result <<= 1;
                        result |= 1;
                        break;
                default:
                        fprintf(stderr, "ERROR: Syntax error - only 0s and 1s are allowed in @bits invocation.\n");
                        return -2;
                }
        }

        // Emit the value as a hexadecimal literal.
        printf("0x%08X", result);
        return 0;
}

It takes the binary representation of a number as input (a notation C/C++ doesn’t support) and outputs its hexadecimal representation. It makes sense to put such a conversion inside a macro, so that it happens at preprocessing time instead of at run-time. Even a macro as simple as this one can’t be written using the current C/C++ approach.

As you can see, the macro itself is written in C++. It also has to be compiled to work. My idea was: why limit macros to some scripting language, or why use a single language at all? I decided to use a UNIX-like approach, i.e. anything that takes in text and outputs text can be used as a macro. Obviously it’s not the same, but you can think of it as, for example, writing macros for Nemerle using C# or IronPython or Python or anything 😉 Now the big difference is that my preprocessor still operates at the source code level. It doesn’t know the first thing about ASTs. This is both its strength and its weakness. It is a good thing because this is what makes it possible to use any program as a macro. It is a bad thing because it’s far less ambitious than writing my own C++ parser ;]
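To make the "any program can be a macro" idea a bit more concrete, here is a minimal sketch of how a preprocessor could hand the work off to an external program and collect its output. This is not RMPP's actual code: run_macro_program is a hypothetical helper, and the sketch assumes a POSIX system, since popen and pclose are POSIX functions rather than standard C++.

```cpp
#include <stdio.h>   // popen/pclose (POSIX), fread
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `command` through the shell and returns
// everything the program wrote to its standard output.
// Assumption: POSIX popen is available. A real preprocessor would also
// need to quote parameters carefully before building the command line.
std::string run_macro_program(const std::string &command)
{
    FILE *pipe = popen(command.c_str(), "r");
    if (!pipe)
        throw std::runtime_error("could not start: " + command);

    std::string output;
    char buffer[256];
    size_t n;
    // Collect the program's output chunk by chunk.
    while ((n = fread(buffer, 1, sizeof buffer, pipe)) > 0)
        output.append(buffer, n);

    // A non-zero status is treated as a failed macro expansion.
    if (pclose(pipe) != 0)
        throw std::runtime_error("macro program failed: " + command);

    return output;
}
```

The preprocessor would then paste the returned text in place of the macro invocation. How parameters are actually passed to the program is left open here.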

In the current implementation macros have to be called like this:

int x = @bits 01001000101 @end;

I thought that what MetaLua’s author wrote about macros sticking out of the regular code being a good thing is correct and worth taking into consideration. Among many other characters, @ seemed to be good enough for the role of a warning sign. It should convey a clear message: THIS IS MAC-RO (you’ve watched 300, right? 😉 ). So you start a macro invocation with an @ followed by the macro’s name and then a list of parameters separated by : like this:

@someweirdmacro param1 : param2 : param3 @end
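As an illustration of the syntax above, here is a minimal sketch of how such an invocation could be split into a name and parameters. MacroCall, trim and parse_macro_call are hypothetical names made up for this example, not RMPP's real code, and the sketch assumes the whole invocation has already been cut out of the source text.

```cpp
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct MacroCall
{
    std::string name;
    std::vector<std::string> params;
};

// Strips leading and trailing whitespace.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parses text of the form "@name p1 : p2 : p3 @end".
// Returns false if the text does not look like an invocation.
bool parse_macro_call(const std::string &text, MacroCall &out)
{
    const std::string terminator = "@end";
    std::string t = trim(text);

    if (t.size() < 1 + terminator.size() || t[0] != '@')
        return false;
    if (t.compare(t.size() - terminator.size(), terminator.size(), terminator) != 0)
        return false;

    // Drop the leading '@' and the trailing "@end".
    std::string body = t.substr(1, t.size() - 1 - terminator.size());

    // The macro name runs up to the first whitespace character.
    std::string::size_type name_end = body.find_first_of(" \t\r\n");
    out.name = body.substr(0, name_end);
    out.params.clear();
    if (out.name.empty())
        return false;
    if (name_end == std::string::npos || trim(body.substr(name_end)).empty())
        return true; // no parameters

    // Split the rest on ':' and trim each parameter.
    std::string rest = body.substr(name_end);
    std::string::size_type start = 0;
    for (;;)
    {
        std::string::size_type colon = rest.find(':', start);
        if (colon == std::string::npos)
        {
            out.params.push_back(trim(rest.substr(start)));
            break;
        }
        out.params.push_back(trim(rest.substr(start, colon - start)));
        start = colon + 1;
    }
    return true;
}
```

Fed the invocation above, this yields the name someweirdmacro and the three parameters param1, param2 and param3. Text that doesn't start with @ or doesn't finish with @end is rejected.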

You end it with an @end. I’m not sure if this is optimal. It certainly is distinguishable from regular code but perhaps even too much so? Anyway I can still change it to anything before it becomes an industry standard 😉 Perhaps:

@someweirdmacro(param1, param2, param3)

would be better… I would definitely like it to be different from regular macro invocations, so writing just:

someweirdmacro(param1, param2, param3)

wouldn’t be good enough.

I’ve also introduced code quotation blocks, similar to Nemerle’s. If you’d like to pass a block of code as a parameter to a macro invocation, you can do it like this:

@loopmacro 1 : 10 : <[
        printf("Loop iteration: %d", _loopmacro_iter);
]> @end

As you can see, inside a quotation block the : is not treated as a parameter separator. You can also observe the use of a variable generated by the macro. For now I’ve decided that a convention over configuration approach to macro hygiene is best. The convention for macro variables is an underscore (_) followed by the macro name, another underscore (_) and then the variable name, and macros are not supposed to generate any other variables.
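Here is a minimal sketch of both ideas: a parameter splitter that leaves everything inside <[ ... ]> alone, and a helper that builds a variable name following the naming convention. trim, split_params and hygienic_name are hypothetical names for this example, not RMPP's real code. Support for nested quotation blocks is an assumption of mine, and the delimiters are kept in the output.

```cpp
#include <string>
#include <vector>

// Strips leading and trailing whitespace.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits a parameter list on ':' but ignores any ':' found inside a
// <[ ... ]> quotation block.
// Known limitation of this sketch: a "]>" that occurs naturally in the
// quoted code (e.g. "a[i]>b") would be mistaken for the closing delimiter.
std::vector<std::string> split_params(const std::string &text)
{
    std::vector<std::string> params;
    std::string current;
    int depth = 0; // number of quotation blocks we are currently inside

    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        if (text.compare(i, 2, "<[") == 0)
        {
            ++depth;
            current += "<[";
            ++i; // skip the second delimiter character
        }
        else if (depth > 0 && text.compare(i, 2, "]>") == 0)
        {
            --depth;
            current += "]>";
            ++i;
        }
        else if (text[i] == ':' && depth == 0)
        {
            // A separator outside any quotation ends the current parameter.
            params.push_back(trim(current));
            current.clear();
        }
        else
        {
            current += text[i];
        }
    }
    params.push_back(trim(current));
    return params;
}

// Builds a variable name following the convention: _<macro>_<variable>.
std::string hygienic_name(const std::string &macro, const std::string &variable)
{
    return "_" + macro + "_" + variable;
}
```

For the loopmacro example this gives three parameters: 1, 10 and the whole quoted block, with the : inside "Loop iteration: %d" left untouched. hygienic_name("loopmacro", "iter") produces _loopmacro_iter.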

Yeah, well… so this was an initial discussion of my take on a new macro system for C/C++. I’m hoping for some feedback, although it’s quite unlikely given my blog’s popularity 😉 Anyway, stay tuned for more!
