Building a Linux Shell [Part III]

First published on hackernoon.com

By Mohammed Isam (@MIMA), GNU/Linux system administrator and programmer

This is part III of a tutorial on how to build a Linux shell. You can read the first two parts of this tutorial from these links: part I, part II.

NOTE: You can download the complete source code for Part II & III from this GitHub repository.

Parsing Simple Commands

In the previous part of this tutorial, we implemented our lexical scanner. Now let's turn our eyes to the parser.

Just to recap, the parser is the part of our Command Line Interpreter that calls the lexical scanner to retrieve tokens, then constructs an Abstract Syntax Tree, or AST, out of these tokens. This AST is what we'll pass to the executor to be, well, executed.

Our parser will contain only one function, parse_simple_command(). In the upcoming parts of this tutorial, we'll add more functions to enable our shell to parse loops and conditional expressions.

So let's start coding our parser. You can begin by creating a file named parser.h in the source directory, to which you'll add the following code:

    #ifndef PARSER_H
    #define PARSER_H

    #include "scanner.h"    /* struct token_s */
    #include "source.h"     /* struct source_s */

    struct node_s *parse_simple_command(struct token_s *tok);

    #endif

Nothing fancy, just declaring our sole parser function.

Next, create parser.c and add the following code to it:

    #include <unistd.h>
    #include "shell.h"
    #include "parser.h"
    #include "scanner.h"
    #include "node.h"
    #include "source.h"

    struct node_s *parse_simple_command(struct token_s *tok)
    {
        if(!tok)
        {
            return NULL;
        }

        /* root node of the command's AST */
        struct node_s *cmd = new_node(NODE_COMMAND);
        if(!cmd)
        {
            free_token(tok);
            return NULL;
        }

        /* remember the input source so we can ask for more tokens */
        struct source_s *src = tok->src;

        do
        {
            /* a newline token ends the simple command */
            if(tok->text[0] == '\n')
            {
                free_token(tok);
                break;
            }

            /* every other token becomes a child word node */
            struct node_s *word = new_node(NODE_VAR);
            if(!word)
            {
                free_node_tree(cmd);
                free_token(tok);
                return NULL;
            }
            set_node_val_str(word, tok->text);
            add_child_node(cmd, word);

            free_token(tok);

        } while((tok = tokenize(src)) != &eof_token);

        return cmd;
    }

Pretty simple, right?
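If you want to try the control flow of parse_simple_command() on its own, here is a minimal, self-contained sketch. It is not the tutorial's code: struct toy_source, toy_tokenize() and toy_parse_simple_command() are hypothetical stand-ins for the scanner and parser, and the words go into a fixed-size array instead of an AST. What it keeps is the shape of the loop: read tokens one by one and stop at a newline token or at the end of input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS    16
#define MAX_WORD_LEN 32

/* Hypothetical stand-in for the tutorial's input source. */
struct toy_source {
    const char *text;   /* input being scanned */
    size_t pos;         /* index of the next unread character */
};

/*
 * Copy the next token into buf. A newline is returned as the one-character
 * token "\n". Returns 0 at end of input (standing in for eof_token) and
 * 1 otherwise. Words longer than bufsize - 1 characters are truncated.
 */
int toy_tokenize(struct toy_source *src, char *buf, size_t bufsize)
{
    assert(src != NULL && buf != NULL && bufsize > 1);
    const char *s = src->text;

    /* skip blanks between tokens */
    while (s[src->pos] == ' ' || s[src->pos] == '\t')
        src->pos++;

    if (s[src->pos] == '\0')
        return 0;

    if (s[src->pos] == '\n') {
        src->pos++;
        buf[0] = '\n';
        buf[1] = '\0';
        return 1;
    }

    size_t n = 0;
    while (s[src->pos] != '\0' && s[src->pos] != ' ' &&
           s[src->pos] != '\t' && s[src->pos] != '\n') {
        if (n + 1 < bufsize)
            buf[n++] = s[src->pos];
        src->pos++;
    }
    buf[n] = '\0';
    return 1;
}

/*
 * Collect the words of one simple command into words[] and return how
 * many were stored. Stops at a newline token or at end of input.
 */
size_t toy_parse_simple_command(struct toy_source *src,
                                char words[][MAX_WORD_LEN],
                                size_t max_words)
{
    char tok[MAX_WORD_LEN];
    size_t count = 0;

    while (toy_tokenize(src, tok, sizeof tok)) {
        if (tok[0] == '\n')
            break;                      /* newline ends the command */
        if (count < max_words) {
            strcpy(words[count], tok);  /* tok always fits in a slot */
            count++;
        }
    }
    return count;
}
```

Calling toy_parse_simple_command() repeatedly on the same source walks through the input one command at a time, which is roughly how a shell's read-parse-execute loop would drive the real parser.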
To parse a simple command, we only need to call tokenize() to retrieve input tokens, one by one, until we get a newline token (which we test for in the line that reads: if(tok->text[0] == '\n')), or we reach the end of our input (we know this happened when we get an eof_token token; see the loop's conditional expression near the bottom of parse_simple_command()).

We use input tokens to create an AST, which is a tree-like structure that contains information about the components of a command. The details should be enough to enable the executor to properly execute the command. For example, the figure below shows what the AST of a simple command looks like.

Each node in the command's AST must contain information about the input token it represents (such as the original token's text). The node must also contain pointers to its child nodes (if the node is a root node), as well as its sibling nodes (if the node is a child node). Therefore, we'll need to define yet another structure, struct node_s, which we'll use to represent nodes in our AST.

Go ahead and create a new file, node.h, and add the following code to it:

    #ifndef NODE_H
    #define NODE_H

    enum node_type_e
    {
        NODE_COMMAND,       /* simple command */
        NODE_VAR,           /* variable name (or simply, a word) */
    };

    enum val_type_e
    {
        VAL_SINT = 1,       /* signed int */
        VAL_UINT,
        ...
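To make the child and sibling pointers described above more concrete, here is a small sketch of one way such a tree could be laid out. It is an assumption for illustration, not the tutorial's struct node_s: the names toy_node, first_child, next_sibling, toy_new_node(), toy_add_child() and toy_free_tree() are all hypothetical. A command node points at its first child, and each child points at the next word in the command.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum toy_node_type { TOY_NODE_COMMAND, TOY_NODE_VAR };

/* Hypothetical AST node: one parent, a linked list of children. */
struct toy_node {
    enum toy_node_type type;
    char *text;                     /* copy of the token text, or NULL */
    int children;                   /* number of direct children */
    struct toy_node *first_child;   /* head of the child list */
    struct toy_node *next_sibling;  /* next node under the same parent */
};

/* Allocate a node; text may be NULL for nodes that carry no string. */
struct toy_node *toy_new_node(enum toy_node_type type, const char *text)
{
    struct toy_node *node = calloc(1, sizeof *node);
    if (!node)
        return NULL;
    node->type = type;
    if (text) {
        node->text = malloc(strlen(text) + 1);
        if (!node->text) {
            free(node);
            return NULL;
        }
        strcpy(node->text, text);
    }
    return node;
}

/* Append child to the end of parent's child list, keeping word order. */
void toy_add_child(struct toy_node *parent, struct toy_node *child)
{
    assert(parent != NULL && child != NULL);
    if (!parent->first_child) {
        parent->first_child = child;
    } else {
        struct toy_node *last = parent->first_child;
        while (last->next_sibling)
            last = last->next_sibling;
        last->next_sibling = child;
    }
    parent->children++;
}

/* Free a node together with all of its descendants. */
void toy_free_tree(struct toy_node *node)
{
    if (!node)
        return;
    struct toy_node *child = node->first_child;
    while (child) {
        struct toy_node *next = child->next_sibling;
        toy_free_tree(child);
        child = next;
    }
    free(node->text);
    free(node);
}
```

With this layout, the command ls -l would be a TOY_NODE_COMMAND root whose first child holds "ls" and whose next sibling holds "-l". An executor could build an argument list by walking first_child and then following next_sibling until it reaches NULL.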
