Tag: a text file tag expansion tool
When programming, I noticed that my source files contain a lot of recurring information, such as the project name, the current project version, the project license and so on.
This kind of information is tedious to maintain, and would be better managed from a single location and then dispatched to the various source files. This information could be replaced by tags that could be expanded, pretty much like what CVS and other version control systems do. For instance, a file with the following line:
Project: $ProjectName v.$ProjectVersion
could be expanded to:
Project: myproject v.1.0.0
With this approach, I would only have to maintain a tag file, such as myproject.tags, which could look like this:
ProjectName: myproject
ProjectVersion: 1.0.0
...
and then my source files would simply reference the variables as illustrated above. This is the basic idea behind the tag project.
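The basic idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not how tag itself is implemented: the function names parse_tags and expand are hypothetical, the tag file is assumed to hold one "Name: value" pair per line, and the standard string.Template class stands in for the tag syntax because it happens to use the same $Name notation.

```python
from string import Template

def parse_tags(text):
    """Parse 'Name: value' lines into a dict (assumed tag file layout).

    Blank lines and lines without a colon (such as '...') are skipped.
    """
    tags = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            tags[name.strip()] = value.strip()
    return tags

def expand(text, tags):
    """Replace $Name references with their values.

    safe_substitute leaves references to unknown tags untouched
    instead of raising an error.
    """
    return Template(text).safe_substitute(tags)

# Hypothetical content of myproject.tags
TAG_FILE = """\
ProjectName: myproject
ProjectVersion: 1.0.0
"""

tags = parse_tags(TAG_FILE)
print(expand("Project: $ProjectName v.$ProjectVersion", tags))
# prints "Project: myproject v.1.0.0"
```

One caveat of borrowing string.Template is that it treats "$$" as an escaped dollar sign, which may or may not be wanted in source files that already contain dollar signs.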
Use cases
Here is a quick list of use cases for tag:
- For source code headers, allow global expansion of project name, version, license and such
- Allow automatic replication of text portions. It happens sometimes (if not often) that documentation strings are duplicated in different parts of the same or different source files (like this).
From tags to templates
Thinking about tags is rather simple: a value is bound to an identifier (the tag), and each recognized reference to this identifier in a given set of text files is replaced with the tag value.
This is sufficient for most cases, but sometimes the expansion (the replacement of the tag by its value) may require specific treatment, such as the following:
- Consider formatting constraints, like preventing the expanded tag from overflowing a line limit (say 79 chars)
- Justify the tag value in a given space, which may be a delimited range, including lines or line subsets
- Process the tag expansion, like converting it to lowercase or uppercase
- Tag values may be multiline and would then require a combination of processing and formatting
We now have to care about the tag expansion context, and how the tag is expanded. So basically a tag would not represent a value anymore, but rather a process that produces this value.
It is then clear that, in this regard, tags can be considered a way to make text file templates.
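The shift from "a tag is a value" to "a tag is a process" might look like the following sketch. Everything here is an assumption made for illustration: the names upper, wrapped and expand_line are hypothetical, and the expansion context is reduced to two things, the column where the tag reference starts and the line width limit.

```python
import re
import textwrap

# A tag reference is assumed to be '$' followed by an identifier.
TAG_REF = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def upper(value):
    """Build a tag that expands to its value in upper case."""
    return lambda column, width: value.upper()

def wrapped(value):
    """Build a tag that wraps its value to fit within the line limit.

    Continuation lines are indented with spaces up to the column
    where the tag reference started.
    """
    def expander(column, width):
        lines = textwrap.wrap(value, width=max(width - column, 1))
        return ("\n" + " " * column).join(lines)
    return expander

def expand_line(line, tags, width=79):
    """Expand tag references in one line.

    A tag is either a plain string or a callable taking (column, width).
    The column is measured in the unexpanded line, which is a
    simplification when several tags share a line.
    """
    def replace(match):
        tag = tags.get(match.group(1))
        if tag is None:
            return match.group(0)  # unknown tag: leave the reference as is
        if callable(tag):
            return tag(match.start(), width)
        return tag
    return TAG_REF.sub(replace, line)

tags = {
    "ProjectName": upper("myproject"),
    "ProjectVersion": "1.0.0",
    "Summary": wrapped("expands tags found in text files"),
}
print(expand_line("Project: $ProjectName v.$ProjectVersion", tags))
print(expand_line("Summary: $Summary", tags, width=30))
```

Plain string tags keep working unchanged, so the simple case stays simple and only the tags that need context pay for it.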
Preventing bloat
The problem now is that tag is susceptible to becoming bloated with rarely used or even unnecessary features.
Features
- Tag values can be defined in a specific tag file, or can be extracted from a given text file, referenced in a tag file.
- Tag values can be either simple values (strings) or portions of code that expand the tag according to the parsing context.
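As an illustration of the first feature, here is a minimal sketch of extracting a tag value from a text file. The "@begin name" and "@end name" markers are purely hypothetical, invented for this example, as is the function name extract_region; the actual way tag delimits a region is not specified here.

```python
def extract_region(text, name):
    """Return the lines between '@begin name' and '@end name' markers.

    The markers are a hypothetical syntax; they may be preceded by a
    comment prefix such as '#'. Returns None if the region is missing
    or never closed.
    """
    begin, end = f"@begin {name}", f"@end {name}"
    collected = None  # None until the begin marker has been seen
    for line in text.splitlines():
        stripped = line.strip()
        if collected is None:
            if stripped.endswith(begin):
                collected = []
        elif stripped.endswith(end):
            return "\n".join(collected)
        else:
            collected.append(line)
    return None

SOURCE = """\
# @begin usage
Run tag on a file.
Unknown tags are left alone.
# @end usage
print('hello')
"""

tags = {"Usage": extract_region(SOURCE, "usage")}
print(tags["Usage"])
```

The extracted text can then be stored like any other tag value, which covers the use case of replicating a documentation string in several places.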
References
Here is a list of programs that create new files by expanding templates:
- new is a program to create files from templates. It uses environment variables to expand the template placeholders. It is useful for creating new files, but cannot be used to fill placeholders in existing files unless they are templates.
- newfile is pretty much the same as the previously presented new utility.
- Desift is a very simple placeholder expansion utility.
Here is a list of preprocessors (most resemble the C preprocessor):
- Filepp is a generic preprocessor written in Perl that has the same syntax as the C preprocessor, but aims to be far more generic.
- lwpp is also a simple (if not simpler) file processor that is designed to conditionally output parts of an input file according to the values of variables defined in the environment.
- GPP is yet another C-like preprocessor, but it is very small and has various syntaxes to adapt to C, LaTeX and HTML files. It is worth a look for its simplicity.
Here is a list of complete templating systems:
- Cheetah Template is a very flexible, simple to use, Python templating system. Functions available in Cheetah can be written in Python, which offers the most flexibility.
- Chakotay preprocessor is a very complete preprocessor that sports a lot of primitive functions. The problem is that it has a really complex and unnatural syntax that tends to make it unreadable.
- FMPP uses Freemarker templates and fills them with values taken from different data sources. It seems quite versatile, but also rather big and slow.
Here is software that I cannot really fit into the three groups above:
- temgen is a "universal language generator", which appears to be more of a domain-specific language than a preprocessor, but with a really clean and simple syntax.
