This describes how remake works internally.
The whole thing works like this:
Initializes everything, including parsing parameters.
Loads the cache, if one is present in the current directory and loading is not forbidden by a command-line parameter.
Builds the hierarchy of remake scripts and modules and creates contexts for them. Uses data from the cache if it was loaded and is not disabled by a command-line parameter.
Computes the list of parameters.
Computes the dependencies of the targets, their dependencies, and so on. It then builds a plan from them and runs it.
Saves the cache for the next run.
The hierarchy builder’s task is to find out what script files and modules should be loaded, plug them together and get contexts and syntax trees for them.
The hierarchy builder starts with a REM script in the current working directory, unless something else is specified on the command line.
For each file it works with, it first checks the cache to see whether the file has changed since last time. If it has not, it loads the syntax tree and context for the file from the cache. Otherwise, it parses the input file and creates a context for it.
Then it checks whether the file contains a parent statement. If it does, the statement is evaluated and the whole of remake restarts with the directory and file it specifies. This lets remake walk up the directories until it reaches the top-level project directory, where the cache from last time is stored and to which all file names are relative.
Next, it looks for modules. For every import statement, the corresponding file is found and processed recursively, like any normal file. Its context is plugged into the current one as a child.
Descend statements are handled similarly; the only difference is that the value of the statement is computed in the current context.
Variables are evaluated lazily: a variable is evaluated only when it is needed. This means the order of evaluation is not defined by the language, and some variables may never be evaluated at all. Once a variable is evaluated, its value is remembered so it does not have to be computed again. Variables are even stored in the cache for future use.
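The behaviour described above can be illustrated with a minimal Python sketch. It is not remake's implementation: the `LazyVariable` class, the `traced` helper and the variable names are all hypothetical, chosen only to show evaluation on first use, memoization, and a variable that is never evaluated.

```python
from typing import Any, Callable, Dict, List


class LazyVariable:
    """A variable whose value is computed on first use and then remembered."""

    def __init__(self, compute: Callable[[], Any]) -> None:
        self._compute = compute   # hypothetical: function producing the value
        self._evaluated = False
        self._value: Any = None

    def get(self) -> Any:
        if not self._evaluated:
            self._value = self._compute()
            self._evaluated = True
        return self._value


evaluation_log: List[str] = []


def traced(name: str, compute: Callable[[], Any]) -> Callable[[], Any]:
    """Wrap a computation so that each real evaluation is recorded."""
    def thunk() -> Any:
        evaluation_log.append(name)
        return compute()
    return thunk


variables: Dict[str, LazyVariable] = {}
variables["cc"] = LazyVariable(traced("cc", lambda: "gcc"))
variables["cflags"] = LazyVariable(traced("cflags", lambda: "-O2"))
variables["unused"] = LazyVariable(traced("unused", lambda: "never needed"))
variables["command"] = LazyVariable(
    traced("command",
           lambda: variables["cc"].get() + " " + variables["cflags"].get())
)

result = variables["command"].get()   # evaluates command, then cc and cflags
variables["command"].get()            # served from the remembered value
```

After this runs, `evaluation_log` holds each of `command`, `cc` and `cflags` exactly once, and `unused` does not appear in it because nothing asked for its value.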
Expressions are evaluated recursively: when the value of an expression is computed, its subexpressions are evaluated first and the final result is computed from their results.
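As a rough illustration of recursive evaluation, the following Python sketch walks a tiny expression tree. The node types `Literal`, `Concat` and `ListOf` are assumptions made up for this example and do not correspond to remake's actual expression kinds.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Literal:
    value: str


@dataclass
class Concat:
    parts: List["Expr"]   # hypothetical: string concatenation


@dataclass
class ListOf:
    items: List["Expr"]   # hypothetical: list construction


Expr = Union[Literal, Concat, ListOf]


def evaluate(expr: Expr):
    """Evaluate subexpressions first, then combine their results."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, Concat):
        return "".join(evaluate(part) for part in expr.parts)
    if isinstance(expr, ListOf):
        return [evaluate(item) for item in expr.items]
    raise TypeError(f"unknown expression node: {expr!r}")


objects = ListOf([
    Concat([Literal("main"), Literal(".o")]),
    Literal("util.o"),
])
value = evaluate(objects)
```

Here `value` ends up as `["main.o", "util.o"]`: the inner `Concat` is evaluated before the enclosing list is assembled.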
The planner has several sets of objects:
Objects not yet scheduled.
Scheduled with unsatisfied dependencies.
Scheduled with satisfied dependencies.
Working set (currently being run).
Already completed objects.
Attached object values.
First, all objects are put into the unscheduled set. Each one is taken in turn and its dependencies are computed. If a dependency is still in the unscheduled set, it is scheduled recursively before this one. After its dependencies are computed, the object is put into either the scheduled set with satisfied dependencies or the one with unsatisfied dependencies.
When everything is scheduled, objects are taken one by one from the set with satisfied dependencies. Each is put into the working set. If any of its dependencies changed since the last remake run, it is executed. After that, it is moved from the working set to the completed one. All objects that depend on it are updated and, if it was their last unsatisfied dependency, they are moved from the unsatisfied set to the satisfied set.
The attached set stands somewhat apart from the rest. It exists only because an object can be split into multiple values if it creates more than one file. One of the values goes through the whole process described above, and the other values stay in the attached set until that one value is processed and put into the completed set.
The planner can fail for several reasons. Leaving aside the obvious ones, such as someone putting a function into dependencies, the main one is a circular dependency. If a circular dependency exists, at some point there will be no objects in the set with satisfied dependencies while some remain in the set with unsatisfied dependencies.
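The scheduling and the circular-dependency check can be sketched in Python as follows. This is a simplified model under several assumptions, not remake's code: the attached set is left out, every object must appear as a key of `dependencies`, dependency lists contain no duplicates, and an object counts as changed if it is in the `changed` set or was executed earlier in the same run. The names `run_plan` and `CircularDependencyError` are hypothetical.

```python
from typing import Callable, Dict, List, Set


class CircularDependencyError(Exception):
    pass


def run_plan(dependencies: Dict[str, List[str]],
             changed: Set[str],
             execute: Callable[[str], None]) -> List[str]:
    unscheduled = set(dependencies)
    unsatisfied: Dict[str, int] = {}   # object -> dependencies still pending
    satisfied: List[str] = []          # ready to run, in FIFO order
    working: Set[str] = set()
    completed: Set[str] = set()
    dependents: Dict[str, List[str]] = {name: [] for name in dependencies}

    def schedule(obj: str) -> None:
        if obj not in unscheduled:
            return                     # already scheduled (or being scheduled)
        unscheduled.discard(obj)
        for dep in dependencies[obj]:
            schedule(dep)              # dependencies are scheduled first
            dependents[dep].append(obj)
        if dependencies[obj]:
            unsatisfied[obj] = len(dependencies[obj])
        else:
            satisfied.append(obj)

    for obj in dependencies:
        schedule(obj)

    dirty = set(changed)               # assumption: changes propagate upward
    executed: List[str] = []
    while satisfied:
        obj = satisfied.pop(0)
        working.add(obj)
        if any(dep in dirty for dep in dependencies[obj]):
            execute(obj)
            executed.append(obj)
            dirty.add(obj)
        working.discard(obj)
        completed.add(obj)
        for dependent in dependents[obj]:
            unsatisfied[dependent] -= 1
            if unsatisfied[dependent] == 0:
                del unsatisfied[dependent]
                satisfied.append(dependent)

    # Nothing is ready but something still waits: a cycle must exist.
    if unsatisfied:
        raise CircularDependencyError(sorted(unsatisfied))
    return executed


graph = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c"],
    "util.o": ["util.c"],
    "main.c": [],
    "util.c": [],
}
ran: List[str] = []
order = run_plan(graph, changed={"main.c"}, execute=ran.append)
```

With only `main.c` changed, the sketch executes `main.o` and then `app`, and skips `util.o`. A graph such as `{"a": ["b"], "b": ["a"]}` leaves both objects in the unsatisfied set with nothing ready to run, so `CircularDependencyError` is raised.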
The cache’s task is to hold state between runs. It stores three important things:
Size and modification time of relevant files (to know whether they have changed since last time).
Contexts with variables. These are needed to hold stored dependencies, variables provided on the command line and outputs from external commands (detections, etc.).
Parsed syntax trees. This is because the variables reference them.
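The first item, detecting a changed file from its size and modification time, might look like this in Python. The function names and the in-memory `recorded` dictionary are assumptions for illustration; remake keeps this information in its binary cache.

```python
import os
import tempfile
from typing import Dict, Tuple

Fingerprint = Tuple[int, int]   # (size in bytes, modification time in ns)


def fingerprint(path: str) -> Fingerprint:
    info = os.stat(path)
    return (info.st_size, info.st_mtime_ns)


def has_changed(path: str, recorded: Dict[str, Fingerprint]) -> bool:
    """A file counts as changed if it is unknown, missing, or differs."""
    old = recorded.get(path)
    if old is None:
        return True
    try:
        return fingerprint(path) != old
    except FileNotFoundError:
        return True


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.rem")
    with open(path, "w") as handle:
        handle.write("first version\n")
    recorded = {path: fingerprint(path)}

    unchanged_result = has_changed(path, recorded)

    with open(path, "a") as handle:
        handle.write("an extra line\n")     # the size now differs
    changed_result = has_changed(path, recorded)

    unknown_result = has_changed(os.path.join(directory, "absent.rem"), recorded)
```

Comparing size together with modification time is cheap, but it can miss an edit that preserves both; that trade-off is inherent to this kind of check.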
Note: The cache is probably the most problematic and buggy part of remake. If you think you have found a bug or encountered odd behaviour, try disabling the cache to see whether it disappears. Remake even contains two workarounds for bugs that are probably unfixable in the current design.
The cache is binary and therefore not portable. But that is not a problem, since if you copy the code to another machine, you need to compile it differently anyway.
The cache uses two tricks. The first is how structures full of pointers are stored. They are put verbatim into the cache, and a translation table is added that records the address at which each structure resided. After loading, the old, now invalid, pointers are found in the newly loaded data and changed to the new addresses.
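The pointer translation idea can be simulated in Python, although remake itself works on raw binary data. In this sketch, which is only an analogy, `id()` stands in for a memory address, a list of tuples stands in for the verbatim dump, and the `Node` class, `dump` and `load` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    def __init__(self, name: str, next: Optional["Node"] = None) -> None:
        self.name = name
        self.next = next


# (old address of the node, payload, old address of the next node or 0)
Record = Tuple[int, str, int]


def dump(nodes: List[Node]) -> List[Record]:
    """Store each node with its pointers left as the old addresses."""
    return [
        (id(node), node.name, id(node.next) if node.next is not None else 0)
        for node in nodes
    ]


def load(records: List[Record]) -> List[Node]:
    table: Dict[int, Node] = {}   # translation table: old address -> new node
    loaded: List[Node] = []
    # First pass: recreate every structure and note where it now lives.
    for old_address, name, _ in records:
        node = Node(name)
        table[old_address] = node
        loaded.append(node)
    # Second pass: replace each stale pointer using the table.
    for node, (_, _, old_next) in zip(loaded, records):
        node.next = table[old_next] if old_next else None
    return loaded


c = Node("c")
b = Node("b", c)
a = Node("a", b)
records = dump([a, b, c])
restored = load(records)
```

After loading, `restored[0].next` is `restored[1]` rather than the original `b`, which shows that the links were rebuilt against the new objects.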
The second is invalidation. Most things can change, so cache objects have dependencies and are invalidated when any of their dependencies changes. When that happens, the cache pretends it does not contain the object.
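A minimal sketch of dependency-based invalidation, assuming each cached object records the state of its dependencies at the time it was stored. The `Cache` class, the `MISSING` sentinel and the idea of representing a dependency's state as a (size, modification time) pair are assumptions for this example.

```python
from typing import Any, Dict, Tuple

MISSING = object()   # returned when the cache "does not contain" the object


class Cache:
    def __init__(self) -> None:
        # key -> (value, state of each dependency when the value was stored)
        self._entries: Dict[str, Tuple[Any, Dict[str, Any]]] = {}

    def store(self, key: str, value: Any, dependencies: Dict[str, Any]) -> None:
        self._entries[key] = (value, dict(dependencies))

    def lookup(self, key: str, current: Dict[str, Any]) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return MISSING
        value, recorded = entry
        for name, state in recorded.items():
            if current.get(name) != state:
                return MISSING   # a dependency changed: pretend it is absent
        return value


cache = Cache()
cache.store("tree:main.rem", "parsed tree", {"main.rem": (120, 1000)})

hit = cache.lookup("tree:main.rem", {"main.rem": (120, 1000)})
miss = cache.lookup("tree:main.rem", {"main.rem": (135, 2000)})
```

Here `hit` is the stored value, while `miss` is `MISSING` because the recorded state of `main.rem` no longer matches.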
Parsing an input script is a four-phase process. First, the file is tokenized into basic elements such as operators, identifiers, strings and expression terminators. Comments are removed in this phase.
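The tokenizing phase can be sketched with a regular expression in Python. The concrete syntax used here (`#` comments, `;` as the expression terminator, double-quoted strings, and the operator set) is an assumption for illustration and may not match the REM language.

```python
import re
from typing import List, Tuple

# Assumed token syntax; the real REM grammar may differ.
TOKEN_PATTERN = re.compile(r"""
    (?P<comment>\#[^\n]*)
  | (?P<string>"(?:[^"\\]|\\.)*")
  | (?P<identifier>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<operator>[=+(),])
  | (?P<terminator>;)
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(source: str) -> List[Tuple[str, str]]:
    tokens: List[Tuple[str, str]] = []
    position = 0
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[position]!r} at offset {position}"
            )
        position = match.end()
        kind = match.lastgroup
        if kind in ("comment", "space"):
            continue   # comments and whitespace are dropped in this phase
        tokens.append((kind, match.group()))
    return tokens


tokens = tokenize('name = "app" + suffix; # build the name\n')
```

The resulting list contains an identifier, an operator, a string, another operator, an identifier and a terminator; the trailing comment leaves no token behind.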
Then a syntax tree is built using an algorithm very similar to the shunting-yard algorithm.
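For readers unfamiliar with it, here is the textbook shunting-yard algorithm adapted to build a tree instead of postfix output. It is a generic Python sketch, not remake's variant: the operators `+` and `*`, their precedences, and the tuple representation `(operator, left, right)` are assumptions, and malformed operand sequences are not diagnosed.

```python
from typing import List

PRECEDENCE = {"+": 1, "*": 2}   # assumed left-associative binary operators


def build_tree(tokens: List[str]):
    output: list = []           # finished operands and subtrees
    operators: List[str] = []   # pending operators and open parentheses

    def reduce_top() -> None:
        """Combine the two newest subtrees under the newest operator."""
        operator = operators.pop()
        right = output.pop()
        left = output.pop()
        output.append((operator, left, right))

    for token in tokens:
        if token in PRECEDENCE:
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                reduce_top()
            operators.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators and operators[-1] != "(":
                reduce_top()
            if not operators:
                raise SyntaxError("unbalanced ')'")
            operators.pop()     # discard the matching "("
        else:
            output.append(token)   # operand

    while operators:
        if operators[-1] == "(":
            raise SyntaxError("unbalanced '('")
        reduce_top()
    if len(output) != 1:
        raise SyntaxError("malformed expression")
    return output[0]


tree = build_tree(["a", "+", "b", "*", "(", "c", "+", "d", ")"])
```

For this input, `tree` is `("+", "a", ("*", "b", ("+", "c", "d")))`: multiplication binds tighter than addition, and the parentheses group `c + d`.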
After this, the syntax tree is optimised and slightly modified: single-value lists are replaced by their values, and inheritance is put under the class definition.
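The single-value list replacement can be pictured with a small Python sketch. It assumes a toy tree in which Python lists represent list nodes and tuples represent operator nodes; this representation and the function name are hypothetical, and the inheritance rewrite is not covered.

```python
def collapse_single_lists(node):
    """Replace every one-element list in the tree by its only element."""
    if isinstance(node, list):
        items = [collapse_single_lists(item) for item in node]
        return items[0] if len(items) == 1 else items
    if isinstance(node, tuple):
        # Operator node: rewrite the children, keep the node itself.
        return tuple(collapse_single_lists(child) for child in node)
    return node


before = ("+", ["a"], ["b", ["c"]])
after = collapse_single_lists(before)
```

Here `after` is `("+", "a", ["b", "c"])`: the one-element lists around `a` and `c` are gone, while the two-element list is kept.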
The final phase scans the tree and notes where the variable assignments are, which subfiles should be loaded, and so on.