2017-06-26 data

Learn just enough Linux to get things done

Different operating systems have long catered to different audiences: Windows for the business professional, Mac for the creative professional and Linux for the software developer. For OS providers, this sort of market segmentation greatly simplified product vision, technical requirements, user experience and marketing direction. However, it also reinforced workplace norms which bucket individuals into narrow, non-overlapping domains: business people can offer no insight into the creative process, and developers no insight into business problems.

In reality, knowledge and skill are fluid, spanning multiple disciplines and fields. The notion that “you can only be good at one thing” is not a roadmap to mastery but rather a prescription for premature optimization. You can only know what you’re good at once you’ve sampled a lot of things - and you may just find that you’re good at a lot of them.

For modern business analysts, bridging the gap between business and software development is especially important. Business analysts must be “dual platform,” able to leverage command-line tools available only on Linux (or OS X) yet still benefit from the power of Microsoft Office on Windows. Understandably, the world of Linux is intimidating for those with a business degree. Fortunately, as with most things, you only need to learn 20% of the information to accomplish 80% of the work. Here is my 20%.

Why modern business analysts should know Linux

Due to its open source roots, Linux benefited from the contributions of thousands of developers over time. They built programs and utilities not only to make their jobs easier, but also the jobs of programmers who followed them. As a result, open source development created a network effect: the more developers built utilities on the platform, the more other developers could leverage those utilities to write their programs right away.

What resulted was an expansive suite of programs and utilities (collectively, software) that were written in Linux, for Linux - much of which was never ported to Windows. One example of this is the popular version control system (VCS) called git. Developers could have written this software to work on Windows, but they didn’t. They wrote it to work on the command line for Linux because it was the ecosystem which already had all the tools they needed.

Concretely, development on Windows runs into two main problems:

  1. Basic tasks, like file parsing, job scheduling, and text search are more involved than running a command-line utility
  2. Programming languages (eg. Python, C++) and their associated code libraries will throw errors because they are expecting certain Linux parameters or file system locations

Together, this means more time spent rewriting basic tools already available in Linux and troubleshooting OS compatibility errors. This is not a surprise - the Windows ecosystem simply wasn’t designed with software development in mind.

With the case made for Linux development, let’s begin with the basics.

The fundamental unit of Linux: the “shell”

The shell (also known as the terminal, console or command line) is a text-based user interface through which commands are sent to the machine. On Linux, the shell’s default language is called bash. Unlike Windows users who primarily point-and-click inside of windows, Linux developers stick to their keyboard and type commands into the shell. While this transition is at first unnatural for those without a programming background, the benefits of developing in Linux easily outweigh the initial learning investment.

Learning the few important concepts

Compared to a full-fledged programming language, bash only has a few major concepts that need to be learned. Once these are covered, the rest of bash is just memorization. I’ll restate for clarity: being good at bash is simply memorizing about 20-30 commands and their most common arguments.

Linux seems impenetrable to non-developers because of the way that developers seem to effortlessly regurgitate esoteric terminal commands at will. The reality is that they committed only a few dozen commands to memory - for anything more complicated, they too (like all mere mortals) consult Google.

With that out of the way, here are the main concepts in bash.

Command syntax

Commands are case-sensitive and follow the syntax of: {command} {arguments..}

For example, in ‘grep -inr’, grep is the command (to search for a string of text) and -inr are flags/arguments which change what grep does by default. The only way to learn what these mean is to look them up through Google or by typing ‘man grep’. I recommend learning the commands and their most common arguments together; it’s too burdensome otherwise to remember what each and every flag does.

Directory aliases

For example, to change from the current directory to the parent directory, one would type: cd ..

Similarly, to copy a file located at “/path/to/file.txt” into the present directory, one would enter cp /path/to/file.txt . (note the period at the end of the command). Since these are no more than aliases, the actual path name could be used in their place instead.

STDIN / STDOUT

Anything you type into the window and submit (via ENTER) is called standard input (STDIN).

Anything that a program prints back out to the terminal (eg. text from within a file) is called standard output (STDOUT).

Piping

  1. |

    A pipe takes the STDOUT of the command to the left of the pipe and makes it the STDIN to the command on the right of the pipe.
    example: echo ‘test text’ | wc -l

  2. >

    A greater-than sign takes the STDOUT of the command on the left and writes/overwrites to a new file on the right
    example: ls > tmp.txt

  3. >>

    Two greater-than signs takes the STDOUT of the command on the left and appends to a new or existing file on the right.
    example: date » tmp.txt

Wildcards

You can think of this like SQL’s % symbol - for example, you might write “WHERE first_name LIKE ‘John%’” to catch any first name starting with John.

In bash, you would write “John*”. If you want to list all of the files ending with “.json” in a folder, you would write: “ls *.json”

Tab completion

Bash will often finish off commands intelligently for you if you start typing a command and hit your TAB key.

That being said, you should really use something like zsh or fish for autocomplete since it is hard to remember the commands and all their parameters - rather, these tools will autocomplete your commands based on your command history!

Quitting

Sometime’s you’ll get stuck in some program and you can’t get out. This is a very frequent occurrence for beginners in Linux and it is extremely demotivating. Often, quitting has something to do with q. It’s good to memorize the following and try them all when you’re trapped.

My memorized list of bash commands

Here are the commands I use most frequently in Linux (sorted from most to least frequently used). As I mentioned before, knowing just a handful of commands will accomplish the vast majority of programmable tasks you need to perform.

Advanced and infrequently commands

I find it’s good to keep a list of commands that are useful in certain situations (eg. which process is blocking a certain network port), even though those situations don’t happen very often. These are some uncommon commands I keep nearby: