Top 5 Linux Commands for Beginners

Data Science on the Command Line


As data sets get larger and more prevalent, researchers have to do a lot more of the legwork when it comes to core programming, and so spend more time with tools like Git and Linux (something we rarely had to do before!).

For the software engineers reading this post: you probably won’t find the following super useful. But as someone who’s been through those early self-taught days as a junior researcher, I feel the pain of budding data scientists and ML researchers!

Given all that, I thought about which commands I use daily and which commands I wished I had known earlier. So from that, I now present my top 5 Linux commands that have helped me in my career!


Command 1: grep

grep sounds like the noise frogs make, but it actually stands for “global regular expression print”. That long phrase doesn’t make much sense at first, but the essential use case for grep is to search for a particular string (or regular expression) in one or more files.

The command is fast and incredibly helpful when you’re trying to diagnose an issue on your production box, for example when you suspect that a TXT file contains some bad data.

As an example, say we’re searching for the string 'this' in any file whose name begins with 'demo_':

$ grep "this" demo_*
demo_file:this line is the 1st lower case line in this file.
demo_file:Two lines above this line is empty.
demo_file:And this is the last line.
demo_file1:this line is the 1st lower case line in this file.
demo_file1:Two lines above this line is empty.
demo_file1:And this is the last line.

Not so bad, huh? On the left-hand side we can see that two files begin with demo_ (demo_file and demo_file1), and each matching line is printed after its file name.
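A few grep options come up again and again once the basic search works. The following is a minimal sketch, assuming a throwaway file whose contents are made up for this demo; the flags themselves (-n, -i, -c, -r) are standard grep options.

```shell
# Create a small throwaway file to search (contents are made up for this demo).
tmpdir=$(mktemp -d)
printf 'this is line one\nThis is line two\n\nAnd this is the last line.\n' > "$tmpdir/demo_file"

# -n prefixes each matching line with its line number.
grep -n "this" "$tmpdir/demo_file"

# -i ignores case, so "This" matches as well.
grep -i "this" "$tmpdir/demo_file"

# -c prints only the number of matching lines.
match_count=$(grep -c "this" "$tmpdir/demo_file")
match_count_nocase=$(grep -ci "this" "$tmpdir/demo_file")
echo "$match_count"          # prints 2
echo "$match_count_nocase"   # prints 3

# -r searches every file under a directory.
grep -r "last line" "$tmpdir"

# Clean up the throwaway directory.
rm -rf "$tmpdir"
```

In practice -n tends to be the most useful of these when hunting for bad data, since it tells you which line to jump to.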

Command 2: wget

Now we move on to something a little more sophisticated, but still something we use quite a lot. The wget command is a handy utility for downloading files from the internet. It is non-interactive, so it can run unattended in scripts and cron jobs (pass -b if you also want it to go to the background).

The utility is called as follows:

wget <URL> -O <file_name>

The -O option is optional; without it, wget keeps the file name from the URL, as in this example:

wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.17.2.tar.xz
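For use in scripts, it can help to wrap the call in a small function. The sketch below is one way to do that; the download function name and its arguments are hypothetical helpers for illustration, while -c, -q and -P are standard wget options.

```shell
# Hypothetical helper (not part of wget): download a URL into a directory,
# resuming a partial download if one is already there.
download() {
    url=$1
    dest_dir=${2:-.}   # default to the current directory
    # -c resumes a partially downloaded file,
    # -q suppresses the progress output (handy in cron jobs),
    # -P sets the directory the file is saved into.
    wget -c -q -P "$dest_dir" "$url"
}

# Example call with the kernel tarball URL used above (needs network access):
# download https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.17.2.tar.xz /tmp
```

The function returns wget’s exit status, so a script can check whether the download succeeded before moving on.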

Command 3: wc

Often you have a file of arbitrary length and something smells fishy: maybe the file seems too small for the number of rows you expect, or sometimes you’re just curious how many words are in it. Either way, you want to inspect it a bit more and need a command to do so.

The wc command helps out here, as it counts a few different things for the file in question:

# wc --help

Usage: wc [OPTION]... [FILE]...
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
  -L, --max-line-length  print the length of the longest line
  -w, --words            print the word counts
      --help             display this help and exit
      --version          output version information and exit

So, say we want to count the number of lines in a file:

wc -l tecmint.txt

16 tecmint.txt

or maybe the number of characters:

wc -m tecmint.txt

112 tecmint.txt

Awesome!
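wc gets even more useful when combined with redirection and pipes. Here is a minimal sketch of the “does this file have as many rows as I expect?” check, assuming a made-up CSV with a header row and a status column.

```shell
# Create a small throwaway CSV (contents are made up for this demo).
tmpfile=$(mktemp)
printf 'id,status\n1,ok\n2,bad\n3,ok\n' > "$tmpfile"

# Reading from stdin (via <) prints just the number, without the file name.
line_count=$(wc -l < "$tmpfile")

# Subtract the header row to get the number of data rows.
data_rows=$((line_count - 1))
echo "data rows: $data_rows"   # prints "data rows: 3"

# wc also reads from a pipe, e.g. to count the lines that grep matched.
bad_rows=$(grep ",bad$" "$tmpfile" | wc -l)
bad_rows=$((bad_rows))         # strip any padding some wc versions add
echo "bad rows: $bad_rows"     # prints "bad rows: 1"

# Clean up the throwaway file.
rm -f "$tmpfile"
```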

Command 4: vi

The vi command is super helpful as it lets you open, explore and edit a file. It works as follows:

vi [filepath]

This takes you into a full-screen text editor. In this editor, you can use the following keys to navigate:

k        Up one line
j        Down one line
h        Left one character
l        Right one character (or use <Spacebar>)
w        Right one word
b        Left one word

In reality, though, you’ll pick up navigation pretty naturally. The following commands for saving and quitting will be the most useful:

ZZ       Write (if there were changes), then quit
:wq      Write, then quit
:q       Quit (will only work if file has not been changed)
:q!      Quit without saving changes to file

You’ll learn to love vi, I swear!

Command 5: CTRL+R

So I’ve saved the best for last, as I use this one all the time. CTRL+R isn’t really a command but a keyboard shortcut. It lets you search your command history: type something that resembles the command, and matching commands you’ve used before come up!

For example, say you’ve just run a really long command and for whatever reason your terminal session breaks and you have to re-run it. With this shortcut, you can quickly search for the command instead of reconstructing it from scratch!

Let’s say I’m trying to remember a command that begins with his, but I can’t remember all of it. I press CTRL+R, start typing, and see what it suggests:

$ history
bck-i-search: his_

Perfect! The history command has been suggested, and that’s exactly the command we were looking for. If you press Tab at this point, the line is filled in at the prompt, ready to edit or run (pressing Enter instead runs it straight away):

$ history

I’ve actually always struggled to use both Linux and Git, but over time I’ve managed to remember a few key commands that have helped my development as a researcher. I can work fairly independently now, and it’s thanks to the above command line tools that I’m able to do so.

Therefore, I really recommend spending a few hours getting used to Linux, as the small lessons you learn now will really help your use of the system going forward. It’s pure upside!


Thanks again! If you have any questions or need any help, please message me =]

Keep up to date with my latest work here!
