Computers are much faster and more accurate than humans. However, that speed and accuracy are ornamental unless the user is capable of giving the computer instructions to follow.
The earliest computers were made to run only a single program, like adding two large numbers together. Having to physically rewire computers to run different programs meant that computers were only narrowly applicable.
Later computers used central processing units, which are small components capable of executing a series of simple calculations, storing and accessing partial results along the way. To give instructions to these computers, users would feed punch cards into the computer, where metal pins would poke across the card: if a hole was present, the pin would touch a metal barrel beneath the card, completing a circuit and communicating a 1 to the computer. If there was no hole, the paper would block the circuit from being completed, communicating a 0 to the computer. This improvement meant that a single computer could make millions of types of calculations instead of just one. But further improvements were required to increase the speed and robustness of human-to-computer communication.
After several innovations and iterations, these improvements were achieved through the keyboard and terminal. A screen would display a blinking cursor, and a user could type commands for the computer to execute.
You know the rest of the story, where Microsoft and Apple later came out with operating systems that would show pictures and icons that could be navigated with a mouse. These advancements made computers approachable for novices, and made the computer a common household item. No longer did users need to remember the commands to type; they could simply click corresponding buttons on a screen. However, these graphical user interfaces (GUIs), as they were called, came with costs to speed and flexibility.
Counterintuitively for most modern computer users, your options for how to use your computer are limited by the buttons and icons on your screen, and your speed by how quickly you can move your mouse cursor from one side of the screen to the other. For this reason, the terminal still plays a critical role in the workflow for programmers today. In this chapter, you'll learn basic commands for the terminal, how to navigate folders, and how to use Git to make the computer do more for you.
I feel that it is important to acknowledge that much of this may feel difficult and unintuitive at first. However, you will be surprised at how quickly this can become second nature. I encourage you to follow along with the instructions below, and to test out your curiosities as you go. Getting acquainted with the terminal and Git now will make it much easier to sort things out when you inevitably need them later.
You're certainly familiar with how to make a new folder on your desktop. I'm sure you could make a new folder within that one as well. However, you may not have been introduced to the idea that your desktop is itself a folder (aka directory), and it is inside its own folder, which is inside its own folder, etc. This doesn't go on for long, though. In a few steps, you'll hit the folder that contains everything: the root folder or root directory.
Called / on Mac and C:\ on Windows, the root folder
holds all folders and files you use on your computer. That means that every
folder and file can be uniquely described by what's called a path name.
For example, on my Mac, a file on my desktop called notes.txt has
the path name /Users/miles/Desktop/notes.txt. If I were on a Windows
computer, the path name would be C:\Users\miles\Desktop\notes.txt.
(For the rest of this chapter, I'll use the Mac path name, but the same principles
apply to Windows. See the note in the box below for information about Windows path names.)
As you can see, the desktop is stored inside a folder called miles
,
which is the username for my computer. This folder,
/Users/[your-username-here]/
, is referred to as the
home directory, and contains important folders like
Desktop
, Documents
and Downloads
.
The home directory can be abbreviated to the symbol ~
,
so the path for the notes.txt
file could also be expressed as
~/Desktop/notes.txt
.
Path names that start from the root folder are called full path names.
As is discussed in the next section, the terminal is always looking
into one folder at a time. Whatever folder the terminal is looking in
is called the current working directory (CWD) or simply
working directory. Since the terminal always knows where it is pointed,
you can give a path name that is relative to the CWD, i.e.,
a relative path name.
For example, if the terminal is looking into the home directory,
you can refer to the notes.txt
file by Desktop/notes.txt
.
Since the Desktop
folder is within
the home directory, the relative path is sufficient to describe where the file is.
Going even further, if the terminal is looking at the Desktop
folder,
the relative path for notes.txt
is just that: notes.txt
.
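The three ways of naming notes.txt described above can be tried out safely in a throwaway folder. This is only a minimal sketch, assuming a bash-like terminal: the folder named home is a hypothetical stand-in for a real home directory, and the commands used (mkdir, cd, echo, cat) are introduced later in this chapter.

```shell
# Build a throwaway layout; "home" is a hypothetical stand-in for /Users/miles.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/home/Desktop"
echo "some notes" >> "$sandbox/home/Desktop/notes.txt"

# Full path name: works from any working directory.
cat "$sandbox/home/Desktop/notes.txt"

# Relative path name from the stand-in home directory.
cd "$sandbox/home"
cat Desktop/notes.txt

# Relative path name from inside Desktop: just the file name.
cd Desktop
cat notes.txt
```

Each cat prints the same line, because all three path names point at the same file.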
In R programming, one of the most common uses of path names is to import data from a file. The following box gives instructions for how to quickly get the full path of a file on your computer.
On Windows: Hold shift as you right-click on the file. Select Copy as Path.
It is now in your clipboard and can be pasted anywhere.
A note on Windows path names: Windows uses backslashes (i.e., \) in its path names
by default. This is dumb because backslashes have a special use in programming,
so leaving them in path names can break your code. To avoid
this, you can simply replace C:\ with C:/ in a
full path, and replace all backslashes with forward slashes
(i.e., /) when you paste the path name into your code.
E.g., C:\Users\miles\wow.txt becomes C:/Users/miles/wow.txt, and miles\wow.txt becomes
miles/wow.txt. Windows will still be able to handle it.
On Mac: Hold option as you right-click on the file. Select
Copy "[filename]" as Pathname. It is now in your clipboard and can be pasted
anywhere.
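If you'd rather not edit a pasted Windows path by hand, the swap from backslashes to forward slashes can be sketched in the terminal. This assumes a bash-like terminal such as Git Bash and uses the standard tr utility, which this chapter does not otherwise cover; the path is the example from the note above.

```shell
# A Windows-style path, in single quotes so the backslashes are kept as-is.
windows_path='C:\Users\miles\wow.txt'

# tr swaps every backslash for a forward slash.
fixed_path="$(printf '%s\n' "$windows_path" | tr '\\' '/')"

# Prints the path with forward slashes only.
echo "$fixed_path"
```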
The terminal is a program that allows you to give instructions to your computer
by typing commands. The terminal is also called the command line or
command line interface (CLI). The terminal is a powerful tool that
allows you to do things that would be impossible or very difficult to do with
a graphical user interface (GUI). For example, imagine that you have a folder of
thousands of pictures that are named for the city and day they were taken, e.g.,
Chicago_Apr-12-23.png
, Ontario_Jun-08-03.png
, etc., and you
want to copy all the pictures taken in Miami in 2009 to a new folder. With a GUI, you would
have to move each file individually. With the terminal, you can do this in one line of code:
cp Miami*09.png new_folder
This command copies all files whose names start with Miami
and end in
09.png
with anything in between (the * is a wildcard that matches any run of characters;
it is not discussed further below, but is mentioned here in case you are curious)
to the folder new_folder
.
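To see the wildcard at work without touching any real pictures, here is a small sketch that runs in a throwaway folder. The file names are made up for illustration, and it assumes a bash-like terminal.

```shell
# Work in a throwaway folder so no real pictures are touched.
sandbox="$(mktemp -d)"
cd "$sandbox"

# Create empty stand-in pictures (hypothetical file names).
touch Miami_Jan-03-09.png Miami_Jul-19-09.png Miami_Mar-02-10.png
touch Chicago_Apr-12-23.png Ontario_Jun-08-09.png

# The destination folder must exist before copying into it.
mkdir new_folder

# Copy every file whose name starts with "Miami" and ends with "09.png".
cp Miami*09.png new_folder

# Only the two Miami pictures from 2009 end up in new_folder.
ls new_folder
```

Note that the Miami picture from 2010 and the Ontario picture from 2009 are left out, since each fails one half of the pattern.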
Of course, the tradeoff is that you have to learn the commands to type, and
then have to remember them when you need them. But the more you use the terminal,
the more you'll find that it is faster and more flexible than a GUI.
The terminal is a program that runs on your computer. On Mac, the terminal is
called Terminal
. On Windows, the old version is called
Command Prompt
, whereas the new one is called PowerShell
.
To complicate matters, each of these terminals has a different set of commands,
and that includes the two Windows terminals. For this reason, I recommend that
Windows users use Git Bash, which was installed when you installed Git
in the first chapter. Git Bash is a special terminal that runs on Windows
and uses the same commands as the Mac terminal. This means that everyone can learn
the same commands, and Windows users can follow along with the same instructions as Mac users.
The Mac commands also have the preferable syntax. See the box below for instructions on how to open the terminal.
On Mac: Use the Terminal program.
It can be quickly opened by pressing Cmd + Space, typing
Terminal and pressing Enter.
Older Macs use the bash terminal, whereas newer
Macs use the zsh terminal. The commands are the same, and
effectively the only difference you'll notice is the title of the window saying
bash or zsh.
On Windows: Use the Git Bash program.
It can be quickly opened by pressing Win, typing
Git Bash and pressing Enter.
In Git Bash, you can't copy and paste with Ctrl + C/V. There are other keyboard
shortcuts, but they're dumb. Instead, you'll probably do best right-clicking
and selecting Copy/Paste.
Follow the instructions below in your open terminal.
When you open the terminal, you'll see a screen that is mostly blank, except for
some text that follows something similar to one of these patterns:
username@computername working_directory %
, or
username@computername:working_directory$
.
For example, mine looks like miles@Miless-MacBook-Pro ~ %
.
This is called the command prompt, and it is where you type commands for the
computer to execute. Regardless of the exact pattern yours follows,
you'll likely see a ~
in the command prompt, which
is a shorthand for the home directory, as discussed above. For me,
that means my terminal is looking into /Users/miles,
my home directory. You'll see this command prompt every time you hit enter.
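If you're curious what ~ stands for on your own computer, you can ask the terminal to print it. This is a small sketch, assuming a bash or zsh terminal; echo is introduced later in this chapter, and HOME is a built-in variable that this chapter does not otherwise cover.

```shell
# The terminal replaces an unquoted ~ with the full path of your home directory.
echo ~

# The same path is stored in the HOME variable.
echo "$HOME"
```

Both lines should print the same full path name.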
Below, I'll introduce the most common commands for the terminal. Each demonstration builds upon the one previous. You can follow along by pulling up this chapter and the terminal at the same time on your screen. I encourage you to try out the commands yourself, and to experiment with them along the way. I'll warn you about anything "dangerous", and you'll learn through practice. The most important commands to learn are the first four, so prioritize understanding and remembering those.
A few tips for ease: if you're part-way through
typing a file name, you can hit tab
to autocomplete the rest of the name,
given that no other files/folders in the CWD start with the same characters.
Also, you can hit the up
arrow on your keyboard to cycle through your previous
commands. The down
arrow does the same in reverse.
pwd
, or print working directory:
To start, check the full path name of the folder you're in by typing
pwd
and hitting enter. This stands for print working directory,
referencing the working directory discussed above. If you're at your
home directory, you should see something like
/Users/[your-username-here]
.
cd
, or change directory:
Next, let's change the working directory to be the desktop. To do this, type
cd Desktop
and hit enter. After doing so, run pwd
again. You should
see that the working directory is now /Users/[your-username-here]/Desktop
.
ls
, or list: Next, let's see what is on your desktop.
To do this, type ls
and hit enter.
This command will list all the files and folders in the
working directory. You should see a list of all the files and folders on your desktop.
Now, try typing ls -a
and hitting enter. The -a
is called a flag.
Flags are used to modify the behavior of commands. In this case, the -a
flag
stands for all, and it modifies the ls
command so that it
lists all files and folders, including hidden files.
Hidden files are usually configuration files that you don't need to
worry about.
In the list, you'll also see . and .. as entries. These are special "folders" (quotations here
because they're more like links to folders) that are always present
in every folder. The single dot (.) is the CWD, and the double dot (..)
is the folder that contains the CWD (aka the parent directory of the CWD).
You can use these in path names to refer to the CWD
and the parent directory. For example, you
can go up one folder from your CWD by running the
command cd ..
.
mkdir
, or make directory:
Now, let's make a new folder on your desktop. To do this, run the command
mkdir new_folder
. This stands for make directory, and it
creates a new folder in the CWD. You can check that it was created by running
ls
again. You should see new_folder
listed.
touch
, the file creation command:
Now, let's create a new file in the new folder. To do this, first change directory
into that new folder: cd new_folder
. Then, run the command
touch new_file.txt
. The touch
command
creates a new, totally blank file with the name you give it.
You can check that it was created by running ls
again.
mv
, or move:
Now, let's move the file to the desktop. To do this, first change directory
back to the desktop: cd ..
.
Run pwd
again to make sure you were successful.
Then, run the command
mv new_folder/new_file.txt .
. The mv
command
stands for move, and it moves the file from the first path name
to the second path name. In this case, the first path name is
new_folder/new_file.txt
, and the second path name is
.
. As discussed above, .
is the CWD, so this command
moves the file from the new folder to the CWD, which is the desktop.
You can check that it was moved by running ls
again.
mv
again, but used for renaming:
Now, let's rename the file. To do this, run the command
mv new_file.txt renamed.txt
. This command
moves the file from the first path name to the second path name, but since
the second path name is in the same folder as the first, it effectively
renames the file. You can check that it was renamed by running ls
again.
cp
, or copy: Now, let's copy the file. To do this, run the command
cp renamed.txt renamed_copy.txt
. This command
copies the file from the first path name to the second path name. You can check
that it was copied by running ls
again.
rm
, or remove: This is the only command
so far that can be considered "dangerous", because it deletes things
permanently: files removed this way do not go to the trash. However, if you follow my instructions here, you'll
be fine, and you'll know enough to do your own research later. Now, let's remove a file.
To do this, run the command rm renamed_copy.txt
. This command
removes the file at the path name you give it. You can check that it was removed
by running ls
again.
echo
, a command to put text to the screen:
This one is simple: type echo "Hello, world!"
and hit enter,
and you'll see that the terminal prints Hello, world!
to the screen.
This command is most helpful in conjunction with the next command.
>>
, an operator to append content to a file:
If you've followed along so far, you should have an empty file named renamed.txt
on your desktop. Say you want to add some text to it. You can do this by running
echo "Hello, world!" >> renamed.txt
and hitting enter. This command
appends the text Hello, world!
to the file renamed.txt
.
That is, it adds the text to the end of the file. You can check that it was appended
with the next command.
cat
, or concatenate: This command
lets you see the contents of a file. To do this, run the command
cat renamed.txt
. You should see the text Hello, world!
printed to the screen. To play around with this a little more, try running
echo "How are the wife and kids?" >> renamed.txt
, and then
again running cat renamed.txt
(don't forget you can pull up
previous commands using the up
arrow key!). You'll see that using
>>
added the second line in addition to the first.
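If you'd like to replay the whole walkthrough in one go, the sketch below strings the same commands together. It assumes a bash-like terminal and uses a throwaway folder as a stand-in for the desktop, so nothing on your real desktop is touched.

```shell
# Use a throwaway folder as a stand-in for the desktop.
desktop="$(mktemp -d)"
cd "$desktop"

mkdir new_folder                     # make a folder
cd new_folder
touch new_file.txt                   # create an empty file inside it
cd ..                                # go back up to the stand-in desktop
mv new_folder/new_file.txt .         # move the file into the CWD
mv new_file.txt renamed.txt          # rename it
cp renamed.txt renamed_copy.txt      # copy it
rm renamed_copy.txt                  # permanently delete the copy
echo "Hello, world!" >> renamed.txt  # append a first line
echo "How are the wife and kids?" >> renamed.txt
ls                                   # lists new_folder and renamed.txt
cat renamed.txt                      # prints the two appended lines
```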
Although those were a lot of commands, I hope you realize that they're all relatively simple to use. Being even a little familiar with these commands will help you be a lot more comfortable when you see them out in the wild. Below is a brief summary of the commands you learned, and their syntax.
pwd
: print working directory to the screen.
cd [path-name]
: change directory to the given path name.
ls [-a]
: list all files and folders in the CWD.
Adding the -a
flag
lists hidden files as well.
mkdir [path-name]
: make a new folder at the given path name.
touch [path-name]
: make a new file at the given path name.
mv [path-name-1] [path-name-2]
: move the file/folder at the first path name
to the second path name. Additionally, this can be used to rename a file/folder.
cp [path-name-1] [path-name-2]
: copy the file/folder at the first path name
to the second path name.
rm [path-name]
: permanently remove the file at the given path name.
echo [text]
: print the given text to the screen.
[command] >> [path-name]
: append the output of the command to the
file at the given path name.
cat [path-name]
: print the contents of the file at the given path name
to the screen.
The terminal will tell you if you have asked it to do something it cannot do.
For example, if you accidentally type
la
instead of ls
, you will see an error message that says
zsh: command not found: la
or something like it. This of course means that the
command la
is not a valid command. When you see a message in the terminal after
running a command, read it! Oftentimes, we as computer users are conditioned to ignore
error messages, but the messages in the terminal are often helpful, and if they are
confusing, you can google them to understand what went wrong.
To begin, I will give some disambiguation: Git is a version control system (i.e., it helps keep track of past versions of files), and is open-source. Git Bash is a terminal program that gets installed with Git. GitHub is cloud storage for your folders of code, and is run by a private company.
As stated above, Git is a version control system. This means that it keeps track of
the changes you make to your files, and allows you to revert back to previous versions
of your files. For example, say you're working on a project that has a
data file (extension .csv
) and an R script (extension .R
).
As you edit the data file and add lines of code to your R script, you might
make a change that you later regret. Git allows you to revert both files back
to previous versions. In reality, the biggest way that Git is used in our field is to allow
multiple people to contribute to the same coding project through GitHub.
We'll start by talking about how Git works, and doing some practice
along the way. Then, we'll talk about GitHub, and how it can be used to store your
code in the cloud.
The first step with using Git is to initialize a Git repository. A Git repository is simply a folder that is being tracked by Git. Let's use some of the commands from the previous section in the terminal to make a practice example.
First, make a new folder on your desktop called practice_repo
or something
of the sort. Then, change directory into that folder: cd practice_repo
.
Then, run the command git status
. This command checks to see if there is
already a git repository in the CWD. Since we just made this folder, there is not, so
you should see a message that says
fatal: not a git repository (or any of the parent directories): .git
.
Next, run the command git init
. This command initializes a new git repository
in the CWD. You should see a message that says Initialized empty Git repository in
/Users/[your-user-name]/Desktop/practice_repo/.git
. From that message,
note that there is now
a new folder in practice_repo
called .git
. This folder is where
Git stores all the information about the repository. You can see this folder inside
practice_repo by running ls -a, which lists all files and folders, including hidden ones.
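The appearance of the hidden .git folder can be checked with a before-and-after listing. Here is a minimal sketch, assuming Git is installed and using a throwaway folder as a hypothetical stand-in for practice_repo.

```shell
# Work in a throwaway folder (a stand-in for practice_repo).
repo="$(mktemp -d)"
cd "$repo"

# Before initializing, the listing shows only . and ..
ls -a

# Initialize the repository; --quiet hides the confirmation message.
git init --quiet

# Now the hidden .git folder appears in the listing.
ls -a
```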
Now, run git status again. You should see a message that
says On branch main (or On branch master, depending on your version and settings of Git), and that there are no commits yet. A commit
is a "snapshot" of the files in the repository at a given time. You can think of it as a
save point. You can make as many commits as you want, and you can revert back to any
commit at any time. We'll now work on making our first commit.
First, let's make a new file in the repository. Run the command touch new_file.txt
.
Then, run git status
again. You should see a message that says
Untracked files:
, and that new_file.txt
is untracked.
This means that Git is aware that there is a new file, but it is not tracking it yet.
Note that git status is just a way for you to check the status of your repository;
running it never changes anything and is never required. It is just a helpful tool.
To start tracking that file, run the command git add new_file.txt
.
Then, run git status
again. You should see a message that says
Changes to be committed:
, and that new_file.txt
is new.
The effect of git add is that it tells Git to start tracking the file and to include it in the next commit.
This is called staging the file.
In reality, you will almost never use git add on a single file like we
have here, but rather on all files in the repository.
To do this, you'll run git add . where the
. represents the CWD, as discussed in previous sections.
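To watch files go from untracked to staged, you can compare the status before and after git add . in a throwaway repository. This is a sketch under a few assumptions: Git is installed, the file names are made up, and the --short flag (not covered in this chapter) is used only to keep the output compact.

```shell
# A throwaway repository with two new, empty files.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
touch new_file.txt other_file.txt

# Short status: ?? marks untracked files.
git status --short

# Stage everything in the CWD at once.
git add .

# Short status again: A marks files staged as new.
git status --short
```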
Now, let's make our first commit. Run the command git commit -m "my first commit"
.
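This command makes a commit, and the -m flag (for message)
attaches the message my first commit to it. If you leave out -m, Git opens a text editor
and asks you to type a message there, so including it is the simpler route.
You can see all the commits you've made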
by running the command git log
. This command lists all the commits you've
made, and the messages you gave them. You should see a commit with the message
my first commit
.
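The commit and log steps can be replayed in a throwaway repository. The sketch below assumes Git is installed; the name and email passed with -c are placeholders that apply to that one command only, so it runs even on a computer where Git has not been configured.

```shell
# A throwaway repository with one staged file.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
touch new_file.txt
git add .

# The -c options supply a placeholder identity for this one command only.
git -c user.name="Practice User" -c user.email="practice@example.com" \
    commit --quiet -m "my first commit"

# One line per commit: a short ID followed by the message.
git log --oneline
```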
Everything we've done so far has been local to your computer. That is, you've been making commits to your local repository, which is just a folder on your computer. No internet connection is required to do what we've done so far. However, the real power of Git comes from being able to push your local repository to a remote repository, or in other words, being able to save to the cloud. This is where GitHub comes in. GitHub is a website that allows you to store your repositories in the cloud. It is free to use, and is the most popular website for storing code.
In the subsection below, I'll walk you through how to create a GitHub account, and how to push your local repository to GitHub.
To start, go to github.com and create an account.
Once you've done that, you'll be taken to your dashboard. Click on your account
icon in the top right corner, and select Settings
. On the left hand
side, click SSH and GPG keys
. Here, we need to tell your GitHub account
to trust your computer. It will do this by using a public key that is unique
to your computer. Follow the instructions in the box below to get your public key.
In your terminal, make sure you are in your home directory. Run cd ~ to get there.
Run ls -a. If you see a folder called .ssh, you may already have an SSH key: skip
the following step, and come back to it only if the two key files described in the step after it are missing.
If you don't see that folder, continue to the next step.
Run ssh-keygen. This command will create
a new SSH key for you. It will ask you where you want to save the key.
The default location is fine, so just hit enter. It will then ask you
to enter a passphrase. This is optional, so you can just hit enter again.
It will then ask you to confirm the passphrase. Again, just hit enter.
You should see a message that includes some weird symbols in a box
that looks something like this:
The key's randomart image is:
+---[RSA 2048]----+
|             00b*|
|              +oo|
|              o. |
|           . ..  |
|         S+   ...|
|       0.=o  ....|
|     o+SS=...o00 |
|      +***=ooo.  |
|       EEo00=+   |
+----[SHA256]-----+
Run cd .ssh, and then ls.
You should see two files: id_rsa and id_rsa.pub (newer versions of ssh-keygen may
instead name them id_ed25519 and id_ed25519.pub; if so, use those names below).
These are your private and public keys, respectively. You should not share
your private key, but can share your public key. To do so, run the command
cat id_rsa.pub. You should see a long string of characters (it may wrap over several lines)
that starts with ssh-rsa (or ssh-ed25519). This is your public key. Select it
and copy it to your clipboard.
Back on GitHub, under Settings > SSH and GPG keys,
click the green New SSH key button. Give it a title (e.g., "MyMacbook"),
and paste your public key into the textbox. Click the green Add SSH key button.
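If you want to see what ssh-keygen produces before creating your real key, the sketch below writes a practice key pair into a throwaway folder instead of ~/.ssh. It assumes ssh-keygen is installed; the flags answer the interactive questions ahead of time, and the resulting practice keys are not meant to be added to GitHub.

```shell
# Write the practice keys to a throwaway folder instead of ~/.ssh.
keydir="$(mktemp -d)"

# -t picks the key type, -N "" sets an empty passphrase, -f names the output file,
# and -q keeps ssh-keygen quiet (no randomart box).
ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"

# Two files are created: the private key and the public key (.pub).
ls "$keydir"

# The public key is one long line that starts with ssh-rsa.
cat "$keydir/id_rsa.pub"
```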
Now your GitHub account will trust your computer. This means that you can
push (i.e., save) your local repositories to GitHub. The first time
you do this after adding your SSH key, you will be asked to confirm that you
trust GitHub. When that happens, type yes
and hit enter. You won't
have to do it again.
Now you'll be able to connect your local repositories to your GitHub account.
Create a new repository by clicking your profile icon in the top right corner,
and selecting
Your repositories
. Then, click the green New
button.
You'll be taken to a page
where you can name your repository, and give it a description if you wish.
For this case, you might name it practice_repo
. Click the green
Create repository
button.
Once you've done that, you'll be taken to a page that gives you instructions
for how to connect your local repository to your GitHub repository.
However, we have already completed some of those steps (e.g., initializing the
local repository with git init
, adding with git add
,
and committing with git commit
). So, you'll only need to do the
last three steps:
git branch -M main
git remote add origin git@github.com:[your-username]/[repo-name].git
git push -u origin main
Note that the second line uses the SSH link (i.e., starts with git@github.com
),
not the HTTPS link (i.e., starts with https://
). Although each line here has a purpose, you only have to run this when you
create a new repository, so I'll leave further research to you if you're curious.
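For the curious, those three commands can be tried offline. In the sketch below, a bare repository in a local folder plays the role of the GitHub repository, so a folder path appears where the SSH link would normally go. The folder names and the placeholder identity are assumptions made for illustration, and Git must be installed.

```shell
sandbox="$(mktemp -d)"

# A bare repository in a local folder stands in for the GitHub repository.
git init --quiet --bare "$sandbox/remote.git"

# A local repository with one commit, like the one built earlier.
mkdir "$sandbox/practice_repo"
cd "$sandbox/practice_repo"
git init --quiet
touch new_file.txt
git add .
git -c user.name="Practice User" -c user.email="practice@example.com" \
    commit --quiet -m "my first commit"

# The same three steps, with a folder path in place of the GitHub SSH link.
git branch -M main                            # name the current branch main
git remote add origin "$sandbox/remote.git"   # tell Git where the remote lives
git push --quiet -u origin main               # upload, and remember this remote

# The stand-in remote now holds the commit.
git --git-dir="$sandbox/remote.git" log --oneline main
```

Because of the -u flag on the first push, later pushes from this repository can be a plain git push.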
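If Git doesn't yet know who you are, it will stop and show the prompt below. This usually happens the first time you commit rather than when you push.
Git attaches a name and an email to every commit; use the email of your GitHub account so that GitHub can connect your commits to you.
Run both commands with your own details filled in, then re-run the command that failed.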
*** Please tell me who you are.
Run
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
to set your account's default identity.
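To get a feel for these settings without changing anything on your computer, here is a sketch that sets the identity for a single throwaway repository. Leaving out --global limits the setting to that one repository; the name and email are the placeholders from the prompt, and Git is assumed to be installed.

```shell
# A throwaway repository to hold the practice settings.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Without --global, the setting applies to this repository only.
git config user.name "Your Name"
git config user.email "you@example.com"

# Read the values back to confirm what Git will attach to commits made here.
git config user.name
git config user.email
```

For your real setup, you would use your own name and email and keep the --global flag, so the identity applies to every repository on your computer.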
If this new repository that you have created were something that you were actually working on, here's what your workflow would look like after this setup:
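git add . : stage all the changes you've made.
git commit -m "[your message here]" : save a snapshot of the staged changes, with a message.
git push : upload your new commits to GitHub.
git pull : download any commits on GitHub that you don't have locally yet (e.g., from a collaborator or another computer).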
In this chapter, you were exposed to a lot of information about the terminal, Git, and GitHub. As stated before, this information can be overwhelming at first. However, the more you use these tools, the more comfortable you'll become with them, and the more you'll get done while you code. Below, I've given you some practice exercises to help you get more comfortable with these tools.
In this section, you'll practice using the terminal. Use the commands you learned above to follow the sequential instructions below.
test.
test.
test_file.R.
"print(1)" to this file.
test_file.txt.
"Hello, world!" to this file.
test_file.R.
test_file.txt to the screen.
test_file_2.txt.
Here, you'll use the results of the previous practice exercise to practice using Git in the terminal.
test.
test.
test on their website with your account.
README.md.
"#Hello my friends" to this file.