Talk About Compilation
Building a hello-world C program is probably the first time most of us hear about compiling. The deeper we dig into the concepts behind it, the less approachable compiling seems. It can feel as if only nerds understand what a compiler does during compilation.
When I was a senior in college, the Fundamentals of Compiling course frightened me a lot.
Recently I found a project called the-super-tiny-compiler. It is awesome because it walks you through the basic concepts and steps of a compiler.
Structure of Compiler
As the picture below shows, these parts form a compiler: the front end (the tokenizer and the parser that does syntax analysis), the intermediate state (the AST), and the back end, which is responsible for transformation and code generation.
the-super-tiny-compiler has four functions.
- tokenize the source code string into tokens.
- parse the tokens into an AST structure.
- transform the AST (preparation for code generation).
- generate the final code.
You can easily learn the steps of compilation by reading its source code, which has well-written explanatory comments.
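The four steps above can be sketched in a few dozen lines. The block below is not the project's code, only a condensed illustration under some assumptions: the input is a Lisp-style call such as (add 2 (subtract 4 2)), the output is a C-style call, and all function and node names are chosen for this sketch.

```javascript
// Step 1: split the source string into tokens.
// Characters that match none of the patterns are silently skipped in this sketch.
function tokenize(input) {
  const tokens = []
  for (const [text] of input.matchAll(/[()]|\d+|[a-z]+/g)) {
    if (text === '(' || text === ')') tokens.push({ type: 'paren', value: text })
    else if (/^\d+$/.test(text)) tokens.push({ type: 'number', value: text })
    else tokens.push({ type: 'name', value: text })
  }
  return tokens
}

// Step 2: turn the flat token list into a tree (the AST).
function parse(tokens) {
  let current = 0
  function walk() {
    const token = tokens[current++]
    if (token.type === 'number') return { type: 'NumberLiteral', value: token.value }
    if (token.type === 'paren' && token.value === '(') {
      const node = { type: 'CallExpression', name: tokens[current++].value, params: [] }
      while (tokens[current].value !== ')') node.params.push(walk())
      current++ // skip the closing paren
      return node
    }
    throw new TypeError('Unexpected token: ' + token.value)
  }
  const body = []
  while (current < tokens.length) body.push(walk())
  return { type: 'Program', body }
}

// Step 3: reshape the AST into the form the target language expects.
function transform(node) {
  switch (node.type) {
    case 'Program':
      return {
        type: 'Program',
        body: node.body.map(n => ({ type: 'ExpressionStatement', expression: transform(n) })),
      }
    case 'CallExpression':
      return {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: node.name },
        arguments: node.params.map(transform),
      }
    default:
      return node
  }
}

// Step 4: print the transformed AST as source code.
function generate(node) {
  switch (node.type) {
    case 'Program': return node.body.map(generate).join('\n')
    case 'ExpressionStatement': return generate(node.expression) + ';'
    case 'CallExpression':
      return generate(node.callee) + '(' + node.arguments.map(generate).join(', ') + ')'
    case 'Identifier': return node.name
    case 'NumberLiteral': return node.value
    default: throw new TypeError('Unknown node type: ' + node.type)
  }
}

function compile(source) {
  return generate(transform(parse(tokenize(source))))
}

const output = compile('(add 2 (subtract 4 2))')
console.log(output) // add(2, subtract(4, 2));
```

Each stage only consumes the output of the previous one, which is why the stages can be read, tested, and replaced separately.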
If you developed front-end apps around 2016, you probably know Babel well. Babel is a famous compiler for JavaScript that supports many features, letting you parse code written in ES6, JSX, or Flow syntax. It then transforms the code so that the final output works in the environments you target.
Let's Talk About Babel
Babel has a large ecosystem that provides syntax presets, specific transform plugins, and a parser for users. babylon is the parser that works behind the scenes in Babel. You can access the AST of a piece of code by parsing it with babylon. That is really easy; just follow the tutorial in babel-plugin-handbook.
The most mysterious part is what an AST actually looks like. In the babylon syntax tree format, a simple function like
function square(n) { return n * n }
is parsed as:
- FunctionDeclaration:
  - id:
    - Identifier:
      - name: square
  - params [1]
    - Identifier
      - name: n
  - body:
    - BlockStatement
      - body [1]
        - ReturnStatement
          - argument
            - BinaryExpression
              - operator: *
              - left
                - Identifier
                  - name: n
              - right
                - Identifier
                  - name: n
See it? It's easy. It is one big object that you can walk with a depth-first traversal. It's just a tree whose nesting reflects how the code was parsed: the children of a node are the parts it is made of, and within an expression the deeper nodes are the ones evaluated first.
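To make the depth-first walk concrete, here is a minimal sketch that needs no Babel packages. It assumes a hand-written object that mirrors the tree above (a real babylon result also wraps it in File and Program nodes and carries location data), and the walk function is a hypothetical helper, not the babel-traverse API.

```javascript
// Hand-written AST for: function square(n) { return n * n }
const ast = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'square' },
  params: [{ type: 'Identifier', name: 'n' }],
  body: {
    type: 'BlockStatement',
    body: [{
      type: 'ReturnStatement',
      argument: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Identifier', name: 'n' },
        right: { type: 'Identifier', name: 'n' },
      },
    }],
  },
}

// Visit a node first, then descend into every child that is itself a node.
function walk(node, visit, depth = 0) {
  if (Array.isArray(node)) {
    node.forEach(child => walk(child, visit, depth))
    return
  }
  // Anything without a string `type` (names, operators, null) is not a node.
  if (node === null || typeof node !== 'object' || typeof node.type !== 'string') return
  visit(node, depth)
  for (const key of Object.keys(node)) walk(node[key], visit, depth + 1)
}

const visited = []
walk(ast, (node, depth) => visited.push('  '.repeat(depth) + node.type))
console.log(visited.join('\n'))
```

The printed indentation grows with the depth of each node, so the output reproduces the outline of the tree shown above.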
What about the whole compilation pipeline? I wrote a demo.
// babel tool set
const babylon = require('babylon')
const t = require('babel-types')
const traverse = require('babel-traverse').default
const generate = require('babel-generator').default

// source code
const jsxCode = `
<div>
  <img src="a.jpg" />
  <span>hello</span>
</div>
`

// parse the source into an AST, with the JSX syntax plugin enabled
const ast = babylon.parse(jsxCode, {plugins: ['jsx']})

// traversal transform: enter() is called for every node in the tree
traverse(ast, {
  enter(path) {
    const {node} = path
    // JSXIdentifier nodes hold tag names (opening and closing) as well as attribute names
    if (t.isJSXIdentifier(node)) {
      if (node.name === 'div') {
        node.name = 'View'
      } else if (['span', 'p', 'h1', 'h2', 'h3'].includes(node.name)) {
        node.name = 'Text'
      }
    }
  }
})

// code generation (an empty options object instead of null)
const gened = generate(ast, {}, jsxCode)

// result
console.log(gened.code)
What I want to do is transform simple ReactJS code into React Native-like code. Rather than replacing the keywords by hand, I would like to use the Babel toolset. Let's begin.
Above, you can see that I supply the source JSX code, parse it into a babylon AST, transform it, and finally generate the code with Babel itself.
The work includes four parts:
- Use babel-types to identify the specific nodes you would like to transform.
- Use babylon to parse the code into an AST.
- Use babel-traverse to traverse the whole AST and do the transformation.
- Output the final code with babel-generator.
These libraries are well known in the Babel ecosystem. They are your armory for taking code apart structurally.
If you want to learn more, take a close look at the babel-handbook repo.
For now, you have just finished the same work that the-super-tiny-compiler does: parsing, traversal, transforming, and generating. I like it very much because it is much easier for me to understand than what I learned about compiling a C++ program. There is no intermediate assembly code, just input to output.
Extensions
After reading this post and the source code of the tiny compiler, you might have many ideas for transformations in your own field. Let me give an example. I'm a front-end developer now, and I write advanced CSS every day. People may mention names like SASS and PostCSS, tools that let people write less vanilla CSS. How do they work? They recognize the more complex syntax and then output pure CSS code.
Think about the examples above and you'll find the same routine gets the job done. Advanced CSS is just another syntax for CSS, like ES6 is to ES5. What you need is a syntax parser that can build an AST for the CSS-like syntax, plus a generator that produces code from the given rules.
Let's see an example.
Given the CSS source below, written in vanilla CSS syntax, what about translating it into React Native stylesheets?
/* ItemActions */
.action {
  margin-top: 10px;
  margin-right: -25px;
  color: #8e8e8e;
}
.button {
  margin-right: 25px;
}
.icon {
  margin-left: -3px;
  fill: rgb(192, 192, 192);
}
.number {
  margin-left: -.3em;
}
We depend on a parser module called css, which you can find on npm.
const fs = require('fs')
const {parse} = require('css')

const log = console.log
const cssFilename = process.argv[2]
const css = fs.readFileSync(cssFilename, 'utf8')

// parse content to AST, including tokenizer
const ast = parse(css)
// keep only plain rules; other node types such as comments are skipped
const rules = ast.stylesheet.rules.filter(rule => rule.type === 'rule')

// margin-top -> marginTop
function camelCase(str) {
  return str.replace(/-([a-z])/g, (match, p1) => p1.toUpperCase())
}

// .action -> action, so the keys match the output shown below
function renameSelectorToKey(selector) {
  return camelCase(selector.replace(/\.|#/g, ''))
}

// 10px -> 10, -.3em -> -0.3; anything else (colors, keywords) stays a string.
// The pattern is anchored so 1.5em becomes 1.5 and unitless values do not throw.
function transformValue(value) {
  const matches = /^(-?\d*\.?\d+)(px|em)$/.exec(value)
  return matches ? parseFloat(matches[1]) : value
}

// transform AST: one {name, declarations} entry per rule
const ruleMap = rules.map(rule => {
  return {
    name: renameSelectorToKey(rule.selectors[0]),
    declarations: rule.declarations.reduce((m, declare) => {
      m[camelCase(declare.property)] = transformValue(declare.value)
      return m
    }, {}),
  }
})

// collect the entries into a single object keyed by rule name
const styleSheet = ruleMap.reduce((m, rule) => {
  m[rule.name] = Object.assign({}, rule.declarations)
  return m
}, {})

// Code Generation
const styleTemplate = (styleSheet) => {
  const content = JSON.stringify(styleSheet, null, 2)
    .replace(/"([^"]+)":/g, '$1:') // drop the quotes around keys
    .replace(/"/g, '\'') // use single quotes for string values
  return `module.exports = require('react-native').StyleSheet.create(${content})`
}

log(styleTemplate(styleSheet))
Finally we get:
module.exports = require('react-native').StyleSheet.create({
  action: {...},
  button: {...},
  icon: {...},
  number: {...},
})
That is what we expected from the compilation process.
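The per-rule conversion can be tried in isolation, without the css package or a file on disk. The sketch below assumes declarations shaped like the ones the demo reads (objects with property and value fields); the helper names toStyleValue and toStyleObject and the anchored regular expression are choices made for this illustration only.

```javascript
// margin-top -> marginTop
function camelCase(property) {
  return property.replace(/-([a-z])/g, (match, letter) => letter.toUpperCase())
}

// Strip a px/em unit and return a number; leave every other value untouched.
function toStyleValue(value) {
  const match = /^(-?\d*\.?\d+)(px|em)$/.exec(value)
  return match ? parseFloat(match[1]) : value
}

// Fold a list of declarations into one plain style object.
function toStyleObject(declarations) {
  const style = {}
  for (const { property, value } of declarations) {
    style[camelCase(property)] = toStyleValue(value)
  }
  return style
}

// Declarations written by hand, mirroring the .action and .number rules above.
const actionRule = [
  { property: 'margin-top', value: '10px' },
  { property: 'margin-right', value: '-25px' },
  { property: 'color', value: '#8e8e8e' },
]
const numberRule = [{ property: 'margin-left', value: '-.3em' }]

const actionStyle = toStyleObject(actionRule)
const numberStyle = toStyleObject(numberRule)
console.log(actionStyle) // { marginTop: 10, marginRight: -25, color: '#8e8e8e' }
console.log(numberStyle) // { marginLeft: -0.3 }
```

Note that dropping the unit treats em values as plain numbers, which is a simplification: the sketch does not convert em to any pixel size.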
Conclusion
You may know a little more about compilation after these two examples of compiling JSX and CSS. The concept can be applied almost everywhere: humans use compiled code to talk to machines, to write easier code, and so on. Keep this knowledge around; it may help you a lot. :)