What’s old is new again
We developers have a habit of reinventing the wheel over and over again. We often look at a piece of stable legacy code and think to ourselves that we could do it better. Many times there is nothing technically wrong with the code. It simply suffers from having been written at a time when the current set of tools, frameworks, and platforms didn’t exist.
In the continued interest of inventing wheels, I have a personal project to build a tool that will help me create databases as I work on the associated applications. Here are some of my requirements, although some are for personal or political reasons and not strictly for getting the job done.
- it runs from the command line (necessary for things like automated build tools)
- written in Python - to avoid Java/Oracle and because Python is cool
- uses a structured changelog file format that can be applied sequentially
Let’s consider that last requirement in more detail. The basic concept is that the developers will have a place where they store all of their changes in text files that can be read by the tool in order and applied to the database. After making a change to the database, that change can’t be made again, but the text file remains as a record of the change and can be used by the tool if, for example, we need to rebuild the database.
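To make that idea concrete, here is a minimal sketch of applying changes in order and remembering which ones have already been made. Everything in it is an assumption for illustration: the change ids, the `applied_changes` tracking table, and the use of an in-memory SQLite database are mine, not a description of the finished tool.

```python
import sqlite3

# Hypothetical changelog: an ordered list of (change_id, sql) pairs.
# In the real tool these would be read from text files on disk.
CHANGELOG = [
    ("001-create-person", "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002-add-email", "ALTER TABLE person ADD COLUMN email TEXT"),
]


def apply_changelog(conn, changes):
    """Apply each change once, in order, and return the ids applied on this run."""
    # The tracking table records which changes this database has already seen.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_changes (change_id TEXT PRIMARY KEY)"
    )
    already_applied = {
        row[0] for row in conn.execute("SELECT change_id FROM applied_changes")
    }
    newly_applied = []
    for change_id, sql in changes:
        if change_id in already_applied:
            continue  # this change was made before, so skip it
        conn.execute(sql)
        conn.execute(
            "INSERT INTO applied_changes (change_id) VALUES (?)", (change_id,)
        )
        newly_applied.append(change_id)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = apply_changelog(conn, CHANGELOG)
second_run = apply_changelog(conn, CHANGELOG)
print(first_run)   # both changes are applied on a fresh database
print(second_run)  # nothing is left to apply the second time
```

Rebuilding a database is then just a matter of pointing the same changelog at an empty database and running it again.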
There are many benefits to storing the changes or refactorings in text files, including the ability to share them easily with other members of the development team or apply them to a database in a different environment. We can also put them in a source control system like Git or Mercurial.
The text files have to be highly structured and correct for the tool to consume them. How can we control that?
The obvious answer to me would be to write them in XML. XML is highly structured and has the added benefit of XML-Schema which allows us to create enforceable, detailed design rules. Problems solved, right? Well, no. For some reason I can’t quite figure out, many developers detest working in XML. It can be a little difficult to read with all of its angle brackets and the tendency for developers to neglect proper indenting. Add to that the difficulty in finding an affordable editor that can verify an XML file against its schema in real time. Very few people are going to drop $500 - $1,000 on an editor for a hobby project, including me (I’m looking at you, Altova).
OK, so XML is out. What’s left and liked by the cool kids? JSON. JSON was the answer to the data interchange side of XML, and it aimed to make the text more human-friendly. It uses JavaScript-like syntax that is familiar to the current crop of developers. Also, while it arrived a little late and is still not fully mature, JSON has a companion specification, JSON-Schema, that would allow me to create a set of rules to validate the structure of the changelogs.
So JSON will be my changelog file type. Not so fast, mister. I find JSON to be harder to read than XML with its seemingly endless groups of curly braces and square brackets. Its vertical spacing can be a problem as well since I often wind up with a single object or two taking up the entire vertical space on my screen. Lastly, JSON does not directly support the use of multi-line strings. Granted there are some workarounds for this, but they are either kludgy or require additional dependencies just to read the file. I don’t want to introduce that sort of tool maintenance in what should be a relatively simple setup. Not having multi-line strings is going to be a massive issue for this project since we will regularly have blocks of SQL code embedded in our definitions. I don’t want to even think of how terrible it would be trying to edit a few hundred lines of a complicated stored procedure without being able to include a line break.
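Here is a small sketch of the multi-line problem using Python's standard `json` module. The two-line SQL statement is a made-up example; the point is only that a raw line break inside a JSON string is rejected, and that the encoder flattens the SQL onto one physical line with an escaped `\n`.

```python
import json

# A hypothetical two-line SQL statement, as a Python string with a real line break.
sql = "SELECT id, name\nFROM person"

# A raw line break inside a JSON string literal is rejected by the parser.
raw_document = '{"sql": "SELECT id, name\nFROM person"}'
try:
    json.loads(raw_document)
    raw_accepted = True
except json.JSONDecodeError:
    raw_accepted = False

# The encoder escapes the line break, so the SQL lands on one physical line.
encoded = json.dumps({"sql": sql})

print(raw_accepted)
print(encoded)
```

Now imagine that single escaped line stretched out to a few hundred lines of stored procedure.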
Well then, what else do we have? How about the new kid on the block, YAML? YAML can represent everything JSON can (YAML 1.2 was designed as a superset of JSON) and does a great job of eliminating as many unnecessary characters as possible. All those curly braces and brackets are gone. Vertically, it’s more compact than JSON, but not as much as XML. YAML derives its structure from indentation and, coincidentally, so does Python. Developers can’t write sloppy code and have it run: YAML is going to complain before the developer gets a mess that is too big of a problem to solve. Finally, YAML has excellent support for multi-line strings.
Great! What about schema? Well, YAML doesn’t currently have a schema language of its own, and one doesn’t seem to be on the radar for the near future. That’s not good. However, YAML 1.2 was designed so that JSON is a subset of YAML, so a parsed YAML document fits the same data model that JSON-Schema describes, and a few of the editor tools take advantage of that. What does that mean? We can define our changelogs in YAML files and validate them against a JSON-Schema file.
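The same check can be sketched outside the editor. This example assumes the third-party PyYAML and jsonschema packages are installed (my choice for illustration, not a requirement of anything above), and both the schema and the changelog fields (`id`, `sql`) are hypothetical placeholders rather than the real changelog format.

```python
# Hypothetical JSON-Schema for a changelog: a list of changes, each with an id and SQL.
SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "required": ["id", "sql"],
        "properties": {
            "id": {"type": "string"},
            "sql": {"type": "string"},
        },
        "additionalProperties": False,
    },
}

# Hypothetical changelog. The "|" block scalar keeps the SQL line breaks intact.
# The second entry is deliberately missing its "sql" property.
CHANGELOG_YAML = """\
- id: create-person
  sql: |
    CREATE TABLE person (
      id INTEGER PRIMARY KEY
    );
- id: missing-sql
"""


def validate_changelog(text, schema):
    """Parse YAML text and return a list of schema violation messages."""
    # Assumption: PyYAML and jsonschema are installed (pip install pyyaml jsonschema).
    # They are imported here so the definitions above load even without them.
    import yaml
    from jsonschema import Draft7Validator

    data = yaml.safe_load(text)
    validator = Draft7Validator(schema)
    return [error.message for error in validator.iter_errors(data)]
```

Calling `validate_changelog(CHANGELOG_YAML, SCHEMA)` on this sample should come back with a single complaint about the second entry, which has no `sql` property, while the first entry with its multi-line SQL passes.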
Setting up VS Code to validate YAML against a JSON-Schema file
VS Code is the hot new, free, cross-platform editor out of Microsoft. Go to the Extensions view and install the YAML extension from Red Hat. The editor will now be aware of YAML files and give you code hinting and structure validation as you type. After you have a workspace, you can associate a JSON-Schema file with your YAML files by going to Settings, Workspace settings, Extensions, and clicking YAML near the bottom. Scroll down to see Yaml: Schemas. In the window that appears, you’re going to add:
{
  ...,
  "settings": {
    "yaml.schemas": {
      "./relativepathtomy/schema.json": "somegloblike *.cl.yaml"
    }
  }
}
You should begin to see code hints and highlighted errors in the editor at that point. It can be a little fiddly, so you might have to play around with the paths until you get them just right.
Setting up JetBrains editors to validate YAML against a JSON-Schema file
For me, this was a little easier than the VS Code setup, but it’s similar. Go to Preferences, Languages & Frameworks, Schemas and DTDs, JSON Schema Mappings. Click the + to add a new definition, give it a name, choose the schema file, select a schema version (the current one for me is JSON Schema version 7), and in the box below set a file selection pattern to associate your YAML files with the schema file.
Now comes the hard part: I have to go about building that schema file.