Tuesday, January 19, 2021

Sometimes writing your own compiler (really just a sort of parser) helps you fix a nasty bug that the real compiler does not catch.

Today I was working on software to which tons of data fields are going to be added. Usually the compiler catches fatal logic errors (such as calling a method that is not defined on a given class).

However, Groovy is often tricky, since it is half typed language and half scripting language.


This means code compiles even though it is not legitimate at the coding level (where a strictly typed language would raise a compilation error). For instance,


enum SomeGroovy {
    ELEM1(ClassA.class, "method1"),
    ELEM2(ClassA.class, "method2"),
    ...
    ELEMk(ClassA.class, "methodk"),
    ELEMk+1(ClassA.class, "methodk+1"),
    ...
}


and this class is 400-500 lines long in total, filled with these convoluted enum elements.

Another program accesses this class through (I believe) reflection or something similar, and binds each of SomeGroovy's enum elements to the corresponding ClassA.methodX.
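Why the compiler cannot help here: the member names are plain strings, so they only get looked up when the program runs. Below is a minimal Python sketch of the same situation, assuming the names really are resolved by reflection. `ClassA`, `ELEMENTS`, and `find_unresolvable` are made-up stand-ins for illustration, not the real code.

```python
# Hypothetical stand-in for the real ClassA; only method1 and method2 exist.
class ClassA:
    method1 = "first"
    method2 = "second"


# Same shape as the enum: (owner class, member name given as a string).
ELEMENTS = [
    (ClassA, "method1"),
    (ClassA, "method2"),
    (ClassA, "method3"),  # no such member, yet nothing complains until lookup
]


def find_unresolvable(elements):
    """Return the names that a reflective lookup would fail on."""
    return [name for owner, name in elements if not hasattr(owner, name)]


if __name__ == "__main__":
    print(find_unresolvable(ELEMENTS))
```

The file parses and loads without complaint. The bad name only shows up once something actually tries to resolve it.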


But...


One day I found Groovy raising the following error:


tons of java stacks....


the class SomeGroovy cannot be initialized.


What the hell? I ran `grails compile` and there was no compilation error, yet somehow this happens.

Since my feature branch changes dozens of files (which depend on each other), I thought the postmortem would be best handled with Git, like this:


$ git diff --name-only develop | tee revert.sh


then I edited revert.sh so that it reads


git checkout develop -- file1
git checkout develop -- file2
git checkout develop -- file3
git checkout develop -- file4
...


This makes the working tree identical to the develop branch. You can see which files you have reverted with the git status command, then restore them one by one, running the (failing) test after each.
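The restore-one-by-one loop can also be scripted. This is only a rough sketch of the idea, not what I actually ran: `find_culprit`, `git_restore`, and `make_test_runner` are hypothetical helpers. It assumes you are on the feature branch with every changed file currently reverted to its develop version, and that the test command (whatever it is in your project) exits with 0 on success.

```python
import subprocess


def find_culprit(files, restore, run_test):
    """Restore files one at a time; return the first one that breaks the test."""
    for path in files:
        restore(path)
        if not run_test():
            return path
    return None


def git_restore(path):
    # Bring back the feature-branch (HEAD) version of a file that was
    # reverted to develop.
    subprocess.run(["git", "checkout", "HEAD", "--", path], check=True)


def make_test_runner(command):
    # `command` is whatever runs the failing test in your project (an assumption).
    def run_test():
        return subprocess.run(command).returncode == 0
    return run_test
```

You would call it as `find_culprit(files, git_restore, make_test_runner(command))`, where `files` is the list produced by `git diff --name-only develop`. Passing `restore` and `run_test` in as functions keeps the loop itself free of Git, so it can be tried out with fakes first.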


That is how I found out that the SomeGroovy enum class was the culprit of this entire mess, yet it was still hard to detect the cause because the stack trace and the error message show nothing useful.


Okay, then we can conclude that:


Groovy compiler SUCKS. 


But that is only a complaint, and a dev who stops at complaining is not tech savvy, nothing but an incompetent issue consumer.


So I thought:


Okay, then let's make a semi-compiler for this SomeGroovy enum class.


I wrote the compiler script as follows (it implements no real algorithm, no graph algorithm is included, it is just a miniature piece of crap, so do not compare it with a real compiler such as gcc):


Let me obfuscate the code a bit, since I don't want to disclose anything company specific:

-----------------------------------------------------------------------------------------------------------


#!/usr/bin/env python3

import re

# A field declaration in ClassA looks like "<Type> <name>".
FIELD_DECL = re.compile(r"^(String|EnumFieldElement|Integer|float|SelectOption|Date)\s")
# An enum element of interest looks like ELEM(FixedData.class, "name").
ENUM_ELEM = re.compile(r".*\(.*Data.*")

# Collect every field name declared in ClassA.
with open("/something/ClassA.groovy") as f:
    class_lines = f.read().split("\n")

xs = []
for line in class_lines:
    # Strip slashes, so commented-out declarations are counted too.
    line = line.replace("/", "").strip()
    if FIELD_DECL.match(line):
        xs.append(line.split()[1])

# Check every enum element against the collected field names.
with open("something/SomeGroovy.groovy") as f:
    enum_lines = f.read().split("\n")

yes = []
no = []
for line in enum_lines:
    if ENUM_ELEM.match(line):
        line = line.replace("/", "").strip()
        # Everything after the first "(" is the argument list: class, then name.
        pieces = [p for p in line.split("(") if p != ""]
        ys = [a.replace('"', "").strip().replace(")", "").replace(".class", "")
              for a in pieces[1].split(",") if a != ""]
        if ys[0] == "FixedData":
            if ys[1] in xs:
                yes.append(ys)
            else:
                no.append(ys)

for y in yes:
    print(f"YES: {y[1]}")

for y in no:
    print(f"NO: {y[1]}")


------------------------------------------------------------------------------------------------------------


This code has some duplication and the verbosity is nasty, but I don't care. It is an improvisation, nothing but an auxiliary script.


But the impact of this script is huge!


Once I ran this script, I saw:
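If the duplication ever starts to bother you, the two passes can be folded into two small functions that work on text instead of file paths, which also makes them easy to try on a tiny sample. This is a sketch, not the script I ran: `declared_fields`, `missing_fields`, and the tighter `ENUM_ELEM` regex are my own assumptions about how the lines are shaped.

```python
import re

# "<Type> <name>" declarations; the type list is taken from the script above.
FIELD_DECL = re.compile(
    r"^(String|EnumFieldElement|Integer|float|SelectOption|Date)\s+(\w+)"
)
# Assumed element shape: NAME(Owner.class, "member")
ENUM_ELEM = re.compile(r'\(\s*(\w+)\.class\s*,\s*"([^"]+)"')


def declared_fields(class_source):
    """Names declared as '<Type> <name>' in the class source."""
    names = []
    for line in class_source.split("\n"):
        m = FIELD_DECL.match(line.replace("/", "").strip())
        if m:
            names.append(m.group(2))
    return names


def missing_fields(enum_source, fields, owner="FixedData"):
    """Names referenced by enum elements of `owner` that are not declared."""
    return [name for cls, name in ENUM_ELEM.findall(enum_source)
            if cls == owner and name not in fields]


if __name__ == "__main__":
    # Tiny made-up sample, only to show the shape of the input.
    class_src = "class FixedData {\n    String method1\n    Integer method2\n}"
    enum_src = ('enum SomeGroovy {\n'
                '    ELEM1(FixedData.class, "method1"),\n'
                '    ELEM3(FixedData.class, "method3"),\n'
                '}')
    for name in missing_fields(enum_src, declared_fields(class_src)):
        print(f"NO: {name}")
```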


YES: method1

YES: method2

YES: method3

YES: method4

.....

NO: method(k)

NO: method(k+1)


and there is the culprit. Once I commented out the NO entries in the enum file, the regression was fixed!!!


-------------------------------------------------------------------------------------------------------------

Conclusion:


Coding in an IDE is fancy, but at the same time it makes it a bit harder for people to reach for various languages and write tiny scripts that make life easier. Perl, Python, or even Ruby in particular let you write an easy script to supplement a crappy compiler that never catches fatal logic-level errors. In a mid-sized or mega IT corporation there are already politics in place, and it is almost impossible for you to change anything. But there is always a way to get along with it. (Oh, this is why startups are always comfortable for high performers!!!)


