jamesnewbrain

the online braindump of James Fallisgaard

How to use Pelican, GitHub, and a DigitalOcean VPS to host a cool blog

Why host a blog like this?

The procedure outlined here, and this style of blog as a static site generated from code, is probably for nerds only. It's sooooo much easier to go sign up for a tumblr and be posting in literally 2 minutes. This whole process took me like a week of learning, setting up and documenting. Why put yourself through all this setup, instead of just running with tumblr? Because... it's fun? It's possibly professionally valuable? Maybe. I get the impression that the only people who will notice and appreciate that your blog is built on Pelican are other programmers, in which case it's just more signaling that you're "part of the club". Whatever your reason, you end up with a pretty sweet, svelte blog, and get a tingle of satisfaction knowing that you push builds of your blog to your personal virtualized Linux server and commit blog posts to GitHub... we all do what makes us happy I guess!

The fundamental ideas behind hosting a blog like this compared to other common styles of blog hosting:


Technology stack covered

This post documents the full process to bring this website/blog up and host it on the internet. The stack of technologies looks like:

LOCAL machine - running Mac OS 10.9:

Task                                                           | Tools used
Control REMOTE machine (your VPS) via SSH                      | BASH terminal
Python environment, synced between LOCAL and REMOTE            | virtualenv, pip
Design website layout/theme                                    | HTML/CSS editor (TextWrangler)
Write blog posts in Markdown                                   | Markdown editor (nvALT, Byword)
Create images or take photos                                   | Lightroom, ImageOptim, ImageAlpha
Test locally - build HTML from Markdown source                 | Pelican
Test locally - serve website, viewable on LOCAL machine        | Pelican and Python
Version control source (website layout, blog posts) to GitHub  | GitHub, BASH terminal

REMOTE machine - VPS w/ Ubuntu 12.04:

Task                                                           | Tools used
Security, firewall, other basic hardening of our VPS           | UFW, Fail2ban, etc.
Python environment, synced between LOCAL and REMOTE            | virtualenv, pip
Get GitHub-versioned website source code                       | GitHub
Build HTML from Markdown source                                | Pelican
Serve website to yourDNSdomain.com                             | nginx
Automated deployment to build/host site when GitHub updates    | Fabric

Layout of this document

I've read a huge number of blog posts, tutorials, and code documentation to end up with the process documented in this post (see reference links scattered throughout).

I document here the exact process I ended up using to set this website up from start to finish:

The biggest source of confusion for me, while reading through blog posts and documentation to configure all this stuff for the first time, was the lack of clarity about whether a given code block or piece of documentation referred to the LOCAL or the REMOTE (VPS) machine. I try to be crystal clear about this to eliminate that confusion, and I think the way this document flows will make more sense to a newbie.

Table of Contents

♫ Of course this is only a suggested procedure, and just like anything involving computers or programming, there are multiple ways to accomplish any given task, so feel free to adapt and leave suggestions in the comments when you find better means on your own!

Notes on reading this document

References are scattered throughout as direct links as we go.

Code block sections will look like:

# first line will denote "on REMOTE" or "on LOCAL" for clarity.
$ BASH commands will start with the dollar sign
# Code will be syntax highlighted according to language

➩ This symbol marks a specific action that should be followed.

♫ This symbol marks a note / commentary with more detail.

♫ I'll be referring to your VPS as REMOTE, and your local machine as LOCAL.


PART 1: Build website/blog locally

I. Establish a local directory for your website project

Create a root directory for your website project. This will be what we turn in to a Git repo that gets backed up on GitHub, and also contain the source that Pelican will build from.

My personal system is to have a folder in my user's home directory called dev, wherein I put one-word directories that become GitHub repos. Since my website has a DNS domain of jamesnewbrain.com, I created a folder at ~/dev/jamesnewbrain.

# on LOCAL:
$ mkdir -p ~/dev/jamesnewbrain/

♫ From now on, when you see ~/dev/jamesnewbrain in code blocks, please substitute your own root directory created in this step.

II. Setup Python environment on LOCAL (Python, pip, virtualenv, Pelican, Markdown)

Note that I referenced the following tutorials: dabapps.com, duncanlock.net, feross.org, and clemesha.org.

  1. Install Python (Mac OS 10.9 already has 2.7.5 installed)

  2. Install pip, the package manager for Python modules.

    ♫ You use sudo here to install pip globally on your machine, since you'll typically want to be able to install/update/uninstall Python modules outside of any specific virtual environment we set up.

    ♫ Note that LOCAL is a Mac, so there's no aptitude here; the simplest route to pip is easy_install, which ships with the system Python. (On the REMOTE Ubuntu machine, later, we'll install pip with aptitude along with the python-dev headers in case any Python libraries need compiling.)

    # on LOCAL (Mac OS):
    $ sudo easy_install pip
    
  3. Install virtualenv, the Python virtual environment management system.

    virtualenv allows you to compartmentalize sets of Python modules for specific projects from the globally installed modules on your entire system. This way you can have project-specific versions of modules and manage any conflicts between modules on a project-by-project basis. It also allows you to sync your "blessed" set of Python modules from your LOCAL machine with your REMOTE machine, which we will do later in this procedure.

    # on LOCAL:
    $ sudo pip install virtualenv
    
  4. From now on, don't use global pip commands; instead, use virtualenv

    ➩ Create a new virtualenv Python environment in your site's project folder:

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain/
    $ virtualenv env    # can be any <environment_name>
    
    # switch to the new environment
    $ cd env
    $ source bin/activate
    

    ♫ The name of the environment is now prefixed on the left of your command prompt, and the environment stays active for as long as the terminal window is open (or until you deactivate it).

    ♫ You can switch back to default python install with $ deactivate.

    ♫ Now you can use pip (without sudo) inside of your virtual environment to install Python modules ONLY IN YOUR CURRENT PROJECT:

    # on LOCAL:
    $ pip search <package_name>
    $ pip install <package_name>
    
  5. Install the Python packages we will use to generate our site into our virtual environment.

    a. Install Pelican

    # on LOCAL:
    $ pip install pelican
    

    b. Install Markdown (the Python package) explicitly, since Pelican doesn't pull it in automatically.

    # on LOCAL:
    $ pip install markdown
    

    c. Install BeautifulSoup, an HTML parser, which we will use later.

    # on LOCAL:
    $ pip install beautifulsoup4
    

    d. You can check what's installed in your virtualenv now with:

    # on LOCAL:
    $ pip freeze
    
    Jinja2==2.7.2
    Markdown==2.3.1
    MarkupSafe==0.18
    Pygments==1.6
    Unidecode==0.04.14
    beautifulsoup4==4.3.2
    blinker==1.3
    docutils==0.11
    feedgenerator==1.7
    pelican==3.3
    pytz==2013.9
    six==1.5.2
    wsgiref==0.1.2
    
  6. Save a file called requirements.txt, which contains the above list of Python packages installed in your virtual environment.

    # on LOCAL:
    $ pip freeze > requirements.txt
    

    ♫ Now you can move to any new machine and install the same Python environment quickly with:

    # on LOCAL:
    $ pip install -r requirements.txt
    

    ♫ You can also update all the modules quickly with:

    # on LOCAL:
    $ pip install --upgrade -r requirements.txt
    

III. Create a default Pelican blog

I read a lot of blog posts to kind of formulate the following configuration. Check out the following: cbracco.me, duncanlock.net, jamesmurty.com, gtmanfred.com, claudiodangelis.com, martinbrochhaus.com, and xlarrakoetxea.org.

  1. Use the Pelican wizard, pelican-quickstart, from your project's root directory to spin up a default Pelican blog.

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain
    $ pelican-quickstart
    

    ➩ Walk through the wizard, answering the questions with your own answers.

    Where do you want to create your new web site? [.] 
    What will be the title of this web site? jamesnewbrain
    Who will be the author of this web site? james fallisgaard
    What will be the default language of this web site? [en] en
    Do you want to specify a URL prefix? e.g., http://example.com   (Y/n) y
    What is your URL prefix? (see above example; no trailing slash) http://jamesnewbrain.com
    Do you want to enable article pagination? (Y/n) n
    Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n) y
    Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n) y
    Do you want to upload your website using FTP? (y/N) n
    Do you want to upload your website using SSH? (y/N) n
    Do you want to upload your website using Dropbox? (y/N) n
    Do you want to upload your website using S3? (y/N) n
    Do you want to upload your website using Rackspace Cloud Files? (y/N) n
    Done. Your new project is available at /Users/yames/dev/jamesnewbrain
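    ♫ It's worth glancing at what the wizard actually generated before moving on. The exact set of files depends on your wizard answers, but with the answers above you should see roughly the following:

    # on LOCAL:
    $ ls -1 ~/dev/jamesnewbrain   # expect roughly: Makefile, content/, develop_server.sh, fabfile.py, pelicanconf.py, publishconf.py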
    
  2. How to test your website locally using the Pelican devserver.

    pelican-quickstart creates a shell script, develop_server.sh, that you can run locally to start a loop: it detects changes in your Pelican project (changes to the config file, changes to blog post files), rebuilds the HTML automatically, and serves the site locally using Python's built-in HTTP server.

    This is very useful while writing blog posts or working on your site's theme: as soon as you save changes locally, you can see them reflected on a local version of your site just by refreshing your web browser window.

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain
    $ make devserver
    

    ➩ The default localhost port that Python's webserver will use is 8000. Navigate to http://localhost:8000/ in a web browser to see a preview.

    ♫ If you don't want to actually run the local webserver and just want to force a rebuild of the site's HTML, use make html.

    ➩ To regain access to your terminal, use ctl-c.

    ➩ Note that ctl-c leaves the Python webserver running as a background process. To kill that process as well, run:

    # on LOCAL:
    $ sh develop_server.sh stop
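    ♫ If make devserver complains that port 8000 is already in use, there's probably a stray webserver from an earlier session still running. A quick way to find and kill it (lsof ships with Mac OS):

    # on LOCAL:
    $ lsof -i -n -P | grep 8000     # find the PID listening on port 8000
    $ kill <process_id>             # then kill it by that PID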
    
  3. Customize the file/folder hierarchy to meet your own design goals.

    By default pelican-quickstart creates some files you may not end up needing. For example, it provides two ways to automate building your site: a Makefile (using make) and fabfile.py (using Fabric). In practice, you'll probably only use one of these methods and can delete the unused file. It also doesn't impose much in terms of folder hierarchy, leaving you with a simple content folder as a generic container for your blog posts and images. You are free to customize as you want, renaming files and folders and moving things around, as long as you update your pelicanconf.py to account for the changes in pathing.

    I ended up choosing the following hierarchy for my site. I will cover the significance of the different directories in this hierarchy throughout the document; for example, you have already seen the creation of the env directory to contain your Python virtual environment. I also note below the directories we will tell Git not to version control, as they are either built dynamically or contain externals that shouldn't be committed as source in our repository.

    jamesnewbrain/
    |-- .gitignore
    |-- Makefile
    |-- README.md
    |-- content/
    |   |-- extras/
    |   |-- images/
    |   |-- pages/
    |   `-- posts/
    |-- env/                 # ignored by Git
    |-- output/              # ignored by Git
    |-- plugins/             # ignored by Git
    |-- themes/              # ignored by Git
    |-- develop_server.sh
    |-- pelicanconf.py
    `-- requirements.txt
    

    ➩ If you rename or move either content or pelicanconf.py, modify develop_server.sh, Makefile and pelicanconf.py accordingly so that Pelican will continue to build correctly.

IV. GitHub versioning your website/blog project

  1. Create a Git repo for your project. (This is just a Git repo locally. You will tie it to GitHub as a backup service in a later step).

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain
    $ git init
    
  2. Create a .gitignore file for your project to ensure that only source and config files get synced with GitHub.

    ➩ Start by downloading a copy of GitHub's .gitignore template for Python and saving it as .gitignore in the root of your blog project (e.g. ~/dev/jamesnewbrain).

    ➩ Edit this .gitignore file and add the following lines:

    #Custom
    output/
    plugins/
    themes/
    *.pid
    

    ♫ You don't want to sync the output/ directory because this will contain the HTML that we will generate on the VPS from the GitHub-versioned source files.

    ♫ GitHub's .gitignore template for Python already includes ignoring env/ directories, so our virtualenv won't sync. Instead, the requirements.txt file used by pip will be synced in the root of our project. This is how we will deploy the same Python virtualenv to our REMOTE server.

    plugins/ and themes/ are externals from other repos, so we shouldn't commit those with our own source.
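    ♫ A quick sanity check that the ignore rules are doing what you expect (git check-ignore needs Git 1.8.2 or newer):

    # on LOCAL:
    $ git status --short            # output/, env/, plugins/, themes/ should not show up
    $ git check-ignore -v output/   # prints which .gitignore rule matched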

  3. Create a file README.md in your project's root.

    This will be GitHub's default readme file.

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain
    $ touch README.md
    

    ➩ I started with the following:

    # jamesnewbrain.com
    
    This is a static site generated by [Pelican](http://docs.getpelican.com/en/3.3.0/).
    

    ➩ Save/exit with ctl-x when you're done.

  4. Sync your local Git repo with GitHub.com.

    ➩ Start by committing the site to a local Git repo.

    # on LOCAL:
    $ git add .
    $ git status
    
    # On branch master
    #
    # Initial commit
    #
    # Changes to be committed:
    #   (use "git rm --cached <file>..." to unstage)
    #
    # new file:   .gitignore
    # new file:   Makefile
    # new file:   README.md
    # new file:   develop_server.sh
    # new file:   pelicanconf.py
    # new file:   requirements.txt
    #
    
    $ git commit -m "Initial commit of jamesnewbrain.com"
    $ git status
    

    ➩ Now let's synchronize our local repo with a remote repo at GitHub.com.

    ➩ First create an empty repo at GitHub.com so that you can get an HTTPS URL to push to from your local machine. GitHub will provide you with a HTTPS URL like: https://github.com/jfallisg/jamesnewbrain.git.

    ➩ Next, add this as the remote repository for your local Git repo.

    # on LOCAL:
    $ git remote add origin https://github.com/jfallisg/jamesnewbrain.git
    $ git push -u origin master
    

    ♫ There's a chance you set this up wrong, or made the mistake I did of copying in the SSH URL instead of the HTTPS one when you've previously only established credentials to sync with GitHub over HTTPS. Audit and remediate those issues with the following:

    # on LOCAL:
    $ git remote -v         # to tell you what you have set up as remote
    $ cat .git/config       # alternative to audit your settings
    $ git remote rm origin  # to remove the previous remote origin from your repo
    

    ➩ Now that you have established sync with GitHub, from now on, commit changes or new blog posts to your GitHub repo with:

    # on LOCAL:
    $ cd ~/dev/jamesnewbrain
    $ git status                        # optional, check for changes
    $ git add .
    $ git commit -m "describe changes"
    $ git push origin master
    

Congrats, your website is all set up locally! Now let's set up our VPS and actually host this thing on the internet!


PART 2: DigitalOcean VPS setup to host our Pelican blog

Once you've decided on going the VPS route (and not cloud-based hosting like Amazon EC2), you'll quickly narrow your VPS providers down to either DigitalOcean or Linode. Both have great reps. For my use case, DigitalOcean provided a dirt cheap $6/month hosting deal ($5 droplet plus $1 for automated backups of my server), which won me over.

I. Create a DigitalOcean droplet

See official DigitalOcean tutorial for more help.

  1. From the DigitalOcean dashboard, create Droplet, and select a meaningful/memorable hostname.
  2. Select size / price plan (I chose their cheapest droplet which gives you 512MB / 1CPU / 20GB SSD / 1TB x-fer for $5 per month).
  3. Select region / datacenter near you.
  4. Select distribution (I chose Ubuntu 12.04.3 x64).
  5. We will add our SSH key later in this procedure.
  6. Settings (I chose to enable the following: Enable VirtIO, Private Networking, Backups (for an extra $1 per month)).

DigitalOcean will then spin up your VPS and email you your VPS's IP address, and default root user password.

II. Configure VPS for remote access

  1. Connect to VPS

    # on LOCAL (this connects you to REMOTE):
    $ ssh root@your_vps_ip
    
  2. Set hostname and set Fully Qualified Domain Name (FQDN)

    I referenced these DigitalOcean tutorials, Set Hostname and Set FQDN.

    ♫ By default your DigitalOcean droplet's name is your hostname. I'll refer to these interchangeably as your_hostname.

    a. First check default hostname:

    # on REMOTE:
    $ hostname      # your_hostname == droplet name by default
    $ hostname -f   # your currently set FQDN, should be localhost by default
    

    b. Change FQDN to properly reflect our hostname/domain:

    # on REMOTE:
    $ nano /etc/hosts
    

    ➩ Then insert a line at the top of the list like:

    your_vps_ip     your_hostname.yourDNSdomain.com     your_hostname
    

    If it's not at the top of the list, localhost will continue to be returned in FQDN lookup. Verify hostname and FQDN work correctly now with another $ hostname -f command. You should get your_hostname.yourDNSdomain.com back.

  3. Configure DNS with DigitalOcean

    a. On your domain registrar's site, log in and point your domain's nameservers to DigitalOcean's nameservers.

    b. On DigitalOcean's dashboard, "Add Domain"

    c. Input yourDNSdomain.com, your_vps_ip, and your_hostname.

    This will create the A record for your domain.

    You should end up with a line like: A @ your_vps_ip

    d. Add CNAME records for "www" and the wildcard "*" that resolve to the default domain level ("@"). You should end up with two lines like: CNAME www @ and CNAME * @.
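    ♫ Once the nameserver change propagates, you can sanity-check the records from your LOCAL machine (dig ships with Mac OS):

    # on LOCAL:
    $ dig +short yourDNSdomain.com        # should eventually return your_vps_ip
    $ dig +short www.yourDNSdomain.com    # the www CNAME should resolve to the same place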

  4. Update Ubuntu on your droplet

    # on REMOTE:
    $ aptitude update
    $ aptitude upgrade
    

III. Basic security for your VPS

Ubuntu web server security is something I know next to nothing about, so I leaned heavily on the following references: feross.org, digitalocean.com/community, and from cbracco.me.

  1. Change root password from DigitalOcean's default.

    # on REMOTE:
    $ passwd        # you'll be prompted to enter a new root password
    
  2. Create a new username account to log in to instead of root.

    # on REMOTE:
    $ adduser your_username
    

    I used the same username as on my local machine, so that I have the option to log in with just $ ssh your_hostname.yourDNSdomain.com (SSH assumes your local username by default).

    ➩ Specify password, and you can leave other fields blank.

  3. Give sudo root privileges to this new user

    # on REMOTE:
    $ visudo
    

    ➩ In the nano text editor, navigate to # User privilege specification and add a line like the one for the "root" user, but with your new username: your_username ALL=(ALL:ALL) ALL. Then ctl-x and Y to save.

  4. Configure SSH to your server to disallow root logins, and operate on a different default port.

    # on REMOTE:
    $ nano /etc/ssh/sshd_config
    

    ➩ change Port 22 to Port your_SSH_port, where your_SSH_port is a number less than 1024, and not 22.

    ➩ change the rule for PermitRootLogin to no

    ➩ add the line UseDNS no to the bottom of the file

    ➩ add the line AllowUsers your_username to the bottom of the file.

    ➩ ctl-x to save and exit.

    # on REMOTE:
    $ reload ssh
    

    Before logging out of the root user, make sure everything is set up okay by opening a new Terminal window:

    # on LOCAL (in a new terminal window):
    $ ssh -p your_SSH_port your_username@yourDNSdomain.com # if DNS has propagated by now
    $ ssh -p your_SSH_port your_username@your_vps_ip       # alternatively
    

    ➩ If you successfully have logged in to the new user, close the terminal window that's logged in as root. From now on log in as this new user whenever you connect to your VPS.

  5. Configure SSH keys

    See DigitalOcean’s SSH Key tutorial.

    The point of the next section is to provide an easier and more secure way to log in to your REMOTE VPS using SSH keys.

    a. ssh-keygen will create a public/private key pair, saving the:

    • public key to /Users/your_username/.ssh/your_name_specified.pub (which will be copied to the remote server that we want to authenticate with)
    • private key to /Users/your_username/.ssh/your_name_specified (which we will keep on our local machine to do authentication).

      # on LOCAL:
      $ ssh-keygen -t rsa -C "your_email_address"
      

    ➩ You can specify a custom filename (your_name_specified) for the key pair.

    ➩ Enter a passphrase (if you want).

    b. Copy public key to the server

    # on LOCAL:
    $ scp -P your_SSH_port ~/.ssh/your_name_specified.pub your_username@yourDNSdomain.com:
    # on REMOTE:
    $ mkdir .ssh
    $ mv your_name_specified.pub .ssh/authorized_keys
    $ chown -R your_username:your_username .ssh
    $ chmod 700 .ssh
    $ chmod 600 .ssh/authorized_keys
    $ exit
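    ♫ As an aside, the scp-then-move dance can be collapsed into a single pipeline if you prefer; same end result, and you still want the chown/chmod fix-ups above afterwards:

    # on LOCAL:
    $ cat ~/.ssh/your_name_specified.pub | ssh -p your_SSH_port your_username@your_vps_ip "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"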
    

    ➩ Now you can connect to your REMOTE VPS without providing a password, as long as you're connecting from the machine where you generated the SSH key with one of the following:

    # on LOCAL:
    $ ssh -p your_SSH_port your_username@yourDNSdomain.com
    $ ssh -p your_SSH_port your_username@your_vps_ip       # alternatively
    

    c. To avoid typing all the above out, you can modify your local SSH config file.

    ♫ Note, this will also allow you to have multiple saved SSH keys, for example if you had multiple REMOTE servers that you liked to log in to from the same LOCAL machine with different keys associated with each.

    # on LOCAL:
    $ nano ~/.ssh/config
    

    ➩ Now you can set up remote servers you want to connect to, each getting their own block in this file. Use following format:

    Host do                                    # nickname of this server
        HostName yourDNSdomain.com             # this can also be an IP address
        User your_username
        Port your_SSH_port
        IdentityFile "~/.ssh/your_private_key"
    Host awshost1                              # another server you connect to
        HostName some_other_ip_address
        User some_other_username
        IdentityFile "~/.ssh/some_other_combined_key.pem"
    

    ➩ Once you're finished with this file, ctl-x to save and exit.

    From now on, on your local machine, you can log in to your remote servers without passwords or remembering the IP addresses, ports, and usernames for each one. You just have to remember the alias you set up as Host server_nickname in the .ssh/config file. I use ssh do so that my DigitalOcean VPS is just a couple keystrokes away from my local terminal window (see the example below). You'll still have to remember your password for root/sudo access, however.
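    ♫ For example, with the "do" block above in place, logging in and copying files both get short (scp honors the same ~/.ssh/config aliases, including the port and key):

    # on LOCAL:
    $ ssh do                         # log straight in to the DigitalOcean VPS
    $ scp some_local_file.txt do:~/  # copy a file up to your home dir on the VPS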

IV. More advanced (optional) network security for your VPS

I ended up following a lot of the configurations suggested by feross.org and cbracco.me.

  1. Install/config Fail2Ban

    Check out this DigitalOcean tutorial on installing fail2ban.

    a. Install it

    # on REMOTE:
    $ sudo aptitude install fail2ban
    

    b. Copy default to local jail file

    # on REMOTE:
    $ sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
    $ sudo nano /etc/fail2ban/jail.local
    

    c. Modify configuration file

    ➩ If you have a static IP on your local machine, add it to the ignoreip line

    ➩ Change bantime from 10 minutes to 1 hour

    ➩ Change destemail = your_email_address

    ➩ Change action default to action = %(action_mwl)s

    ➩ In [ssh] section, make sure enabled = true, and change port number to ours.

    ➩ In [ssh-ddos] section, make sure enabled = true, and change port number to ours.

    ➩ Use ctl-x to save and exit, and then $ sudo service fail2ban restart.
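    ♫ To confirm fail2ban picked up your changes and is watching the SSH jails, you can query its client (jail names follow the section headers in jail.local):

    # on REMOTE:
    $ sudo fail2ban-client status       # lists the active jails
    $ sudo fail2ban-client status ssh   # per-jail detail: currently failed/banned IPs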

  2. Set up Uncomplicated Firewall or UFW, which is a front-end to iptables.

    See this DigitalOcean tutorial on installing ufw firewall. See this wiki page for what ports default to what protocol, as used in following code block.

    # on REMOTE:
    $ sudo aptitude install ufw         # if not already installed
    $ sudo ufw status                   # verify UFW is off
    $ sudo ufw default deny incoming
    $ sudo ufw default allow outgoing
    $ sudo ufw logging on
    $ sudo ufw allow http/tcp
    $ sudo ufw allow 443                # this is https
    $ sudo ufw allow your_SSH_port/tcp  # enter your SSH port
    $ sudo ufw allow 21/tcp             # this is ftp
    

    Turn on and verify:

    # on REMOTE:
    $ sudo ufw enable
    $ sudo ufw status verbose
    

    ➩ To get a numbered list of what rules are established: $ sudo ufw status numbered

    ➩ Can then delete current rules with $ sudo ufw delete [number]

  3. Enable auto security updates

    a. Install unattended-upgrades

    # on REMOTE:
    $ sudo aptitude install unattended-upgrades
    $ sudo nano /etc/apt/apt.conf.d/10periodic
    

    b. Overwrite lines to read:

    APT::Periodic::Update-Package-Lists "1";
    APT::Periodic::Download-Upgradeable-Packages "1";
    APT::Periodic::AutocleanInterval "7";
    APT::Periodic::Unattended-Upgrade "1";
    

    c. Open

    # on REMOTE:
    $ sudo nano /etc/apt/apt.conf.d/50unattended-upgrades
    

    d. Overwrite lines to read:

    # in /etc/apt/apt.conf.d/50unattended-upgrades on REMOTE:
    Unattended-Upgrade::Allowed-Origins {
        "Ubuntu precise-security";      // "precise" is the codename for Ubuntu 12.04
    //      "${distro_id}:${distro_codename}-security";
    //      "${distro_id}:${distro_codename}-updates";
    //      "${distro_id}:${distro_codename}-proposed";
    //      "${distro_id}:${distro_codename}-backports";
    };
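    ♫ You can check that unattended upgrades will actually run with a dry run (note the binary is unattended-upgrade, singular):

    # on REMOTE:
    $ sudo unattended-upgrade --dry-run --debug   # simulates a run and prints what it would upgrade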
    
  4. Have system auto-reboot if it runs out of memory

    See article at fanclub.co.za for more detail.

    # on REMOTE:
    $ sudo nano /etc/sysctl.conf
    

    ➩ Add following lines to the bottom of the file, then ctl-x to save/exit:

    vm.panic_on_oom=1
    kernel.panic=10
    
  5. Secure shared memory

    # on REMOTE:
    $ sudo nano /etc/fstab
    

    ➩ Add the following line to the bottom of the file: tmpfs /dev/shm tmpfs defaults,noexec,nosuid 0 0, then ctl-x to save/exit.

    # on REMOTE:
    $ sudo mount -a
    
  6. Harden network with sysctl settings

    # on REMOTE:
    $ sudo nano /etc/sysctl.conf
    

    ➩ Uncomment the following lines (add any that are missing):

    net.ipv4.conf.default.rp_filter=1
    net.ipv4.conf.all.rp_filter=1
    net.ipv4.tcp_syncookies=1
    net.ipv4.conf.all.accept_redirects = 0
    net.ipv6.conf.all.accept_redirects = 0
    net.ipv4.conf.all.send_redirects = 0
    net.ipv4.conf.all.accept_source_route = 0
    net.ipv6.conf.all.accept_source_route = 0
    net.ipv4.conf.all.log_martians = 1
    

    Apply the new settings with:

    # on REMOTE:
    $ sudo sysctl -p
    
  7. Prevent IP spoofing

    # on REMOTE:
    $ sudo nano /etc/host.conf
    

    ➩ Add following line: nospoof on.

  8. Check for rootkits with RKHunter and CHKRootKit

    # on REMOTE:
    $ sudo aptitude install rkhunter chkrootkit
    $ sudo chkrootkit
    $ sudo rkhunter --update
    $ sudo rkhunter --propupd
    $ sudo rkhunter --check
    
  9. Analyze system log files with LogWatch

    # on REMOTE:
    $ sudo aptitude install sendmail
    $ sudo aptitude install logwatch libdate-manip-perl
    $ sudo logwatch | less
    $ sudo logwatch --mailto {your email address} --output mail --format html --range 'between -7 days and today'
    
  10. Audit system security with Tiger

    # on REMOTE:
    $ sudo aptitude install tiger
    $ sudo tiger
    $ sudo less /var/log/tiger/security.report.*
    

V. Set up nginx on your REMOTE VPS (web server to publish your website to the internet)

nginx is open source webserver software that we will run on our REMOTE server. One of its features is that you can host multiple domains (yourDNSdomain.com's) from a single server. We want to configure our REMOTE server to serve our website to the internet at yourDNSdomain.com. Check out this DigitalOcean tutorial for reference.

First: a couple useful nginx / port debug commands

  1. Start, stop, restart nginx on Ubuntu.

    Assuming your /etc/init.d/nginx is populated, which it should be with a default install of nginx on Ubuntu:

    # on REMOTE (Ubuntu):
    $ sudo service nginx stop
    $ sudo service nginx start
    $ sudo service nginx restart
    $ sudo service nginx reload
    
  2. It will be helpful to be able to ask either your LOCAL or REMOTE machine which ports are being listened on over TCP, and by which processes. This way you can verify your configuration and also see whether nginx or the Python web server is behaving the way you believe.

    # on REMOTE:
    $ sudo lsof -i -n -P | grep TCP
    # my LOCAL machine didn't require sudo:
    $ lsof -i -n -P | grep LISTEN
    $ lsof -i -n -P | grep TCP
    # if you know which specific port you want to ask for, say 8080:
    $ lsof -i -n -P | grep 8080
    
    # you're looking for a return like:
    nginx      987     root    9u  IPv4   8940      0t0  TCP *:80 (LISTEN)
    nginx      987     root   10u  IPv6   8941      0t0  TCP *:80 (LISTEN)
    nginx      998 www-data    9u  IPv4   8940      0t0  TCP *:80 (LISTEN)
    nginx      998 www-data   10u  IPv6   8941      0t0  TCP *:80 (LISTEN)
    
  3. Look up which IP address nginx is serving up websites on:

    # on REMOTE:
    $ ifconfig eth0 | grep inet | awk '{ print $2 }'
    
  4. I also found it useful to have a quick way to search the full file system from root for instances of a string; for example, to find all the directories and files on my REMOTE server dealing with nginx:

    # on REMOTE:
    $ find /. -iname "*nginx*" 2>/dev/null
    

    ♫ Redirecting errors to /dev/null mutes all the inevitable permission-denied warnings on protected files.

    ♫ This isn't really recommended on your LOCAL machine; you can run it there if you want to, it will just take a long time.

  5. To kill nginx processes, get the process ID of the master process with:

    # on REMOTE or LOCAL:
    $ ps -ax | grep nginx
    

    ➩ Then kill it:

    # on REMOTE or LOCAL:
    $ kill -s QUIT <process id>
    

Back to nginx configuration

Now here's where things became a bit confusing to me, as someone who hadn't used nginx before. I'll recount how I believe all this works. Again, if you know better, please leave me a comment and I'll correct the post for everyone!

Basically nginx allows for a global server setup, and then n-number of specially tailored setups for specific "virtual hosts" (in traditional Apache lingo), or "server blocks" (nginx's name for the same concept). Each of these virtual hosts can then be a separate web site with a different associated yourDNSdomain.com, all with the same server computer on the backend. Because some of the pathing gets confusing, use the following table as a reference as we work through this:

Description                                               | Path
Default installation directory (Ubuntu)                   | /etc/nginx/
Default installation directory (Mac OS)                   | /usr/local/etc/nginx/
Global config file                                        | /etc/nginx/nginx.conf
Available virtual host config files                       | /etc/nginx/sites-available
Default virtual host config file                          | /etc/nginx/sites-available/default
Backup of as-installed default virtual host config file   | /etc/nginx/sites-available/default.bak
My blog's virtual host config file                        | /etc/nginx/sites-available/jamesnewbrain.com
Sym-links pointing at enabled virtual host config files   | /etc/nginx/sites-enabled
My blog's Pelican-built HTML                              | ~/dev/jamesnewbrain/output
Dir with sym-link to my blog's HTML; used by nginx        | /var/www/jamesnewbrain.com/public_html
Default global logs end up in                             | /var/log/nginx
My blog's virtual host logs end up in                     | /var/www/jamesnewbrain.com/logs
Default 50x.html and index.html used globally by nginx    | /usr/share/nginx/www

  1. Install and start nginx

    # on REMOTE:
    $ sudo aptitude install nginx
    $ sudo service nginx start      # to start nginx
    

    ♫ My default nginx version as installed by aptitude in Ubuntu was nginx/1.1.19.

    ➩ Check nginx is on

    # on REMOTE:
    $ sudo lsof -i -n -P | grep LISTEN                 # make sure nginx is on
    $ ifconfig eth0 | grep inet | awk '{ print $2 }'   # which IP is it serving on?
    

    ➩ Visit the IP returned in browser to see "Welcome to nginx", this is your server!
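    ➩ You can also check from the command line on the server itself (install curl with sudo aptitude install curl if it isn't already present):

    # on REMOTE:
    $ curl -I http://localhost/     # expect an HTTP 200 response with a Server: nginx header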

  2. Confirm nginx is set to start automatically with the server.

    # on REMOTE:
    $ sudo update-rc.d nginx defaults
    

    ♫ If you get back System start/stop links for /etc/init.d/nginx already exist., it's already going to start automatically.

  3. Configure global nginx configuration at /etc/nginx/nginx.conf:

    ➩ Open nginx.conf for editing.

    # on REMOTE:
    $ sudo nano /etc/nginx/nginx.conf
    

    ➩ Uncomment # server_names_hash_bucket_size 64;.

    ➩ make sure include /etc/nginx/sites-enabled/*; is present and uncommented somewhere in nginx.conf.

  4. Customize the default nginx virtual hosts config.

    # on REMOTE:
    $ sudo cp /etc/nginx/sites-available/default /etc/nginx/sites-available/default.bak
    $ sudo nano /etc/nginx/sites-available/default
    

    ➩ modify sites-available/default in nano like this:

    server {
        listen   80; ## listen for ipv4; this line is default and implied
        listen   [::]:80 default ipv6only=on; ## listen for ipv6
    
        root /usr/share/nginx/www;
        index index.html index.htm index.php;
    
        # Make site accessible from http://localhost/
        server_name _;
        location / {
            try_files $uri $uri/ /index.html;
        }
    
        location /doc/ {
            alias /usr/share/doc/;
            autoindex on;
            allow 127.0.0.1;
            deny all;
        }
    
        # Redirect server error pages to the static page /50x.html
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/www;
        }
    
        # Pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        location ~ \.php$ {
            fastcgi_split_path_info ^(.+\.php)(/.+)$;
            fastcgi_pass unix:/tmp/php5-fpm.sock;
            fastcgi_index index.php;
            include fastcgi_params;
        }
    
        # Deny access to .htaccess files, if Apache's document root concurs
        # with Nginx’s one
        location ~ /\.ht {
            deny all;
        }
    }
    
  5. Set up nginx Virtual Hosts (server blocks) to host multiple websites on single server

    There's a decent tutorial I referenced from the DigitalOcean community.

    /var/www/ is the conventional root directory for the public_html content that your web server will host.

    a. Create a directory to hold new website's HTML:

    # on REMOTE:
    $ sudo mkdir -p /var/www/yourDNSdomain.com/public_html
    

    ➩ Also make a folder for the automated logs in the same area.

    # on REMOTE:
    $ sudo mkdir -p /var/www/yourDNSdomain.com/logs
    

    b. Grant ownership and modification permissions.

    # on REMOTE:
    $ sudo chown -R your_username:www-data /var/www/yourDNSdomain.com/public_html
    

    ➩ Give read access to everyone.

    # on REMOTE:
    $ sudo chmod -R 755 /var/www
    

    c. Create a test index.html page

    # on REMOTE:
    $ sudo nano /var/www/yourDNSdomain.com/public_html/index.html
    

    ➩ in nano, copy/paste:

    <html>
        <head>
            <title>yourDNSdomain.com</title>
        </head>
        <body>
            <h1>Good job man, you have set up a Virtual Host</h1>
        </body>
    </html>
    

    d. Create a virtual host config file for yourDNSdomain.com from a copy of the sites-available/default config document.

    I referenced cbracco.me some more here.

    # on REMOTE:
    $ sudo cp /etc/nginx/sites-available/default /etc/nginx/sites-available/yourDNSdomain.com
    $ sudo nano /etc/nginx/sites-available/yourDNSdomain.com
    

    ➩ Begin editing your custom virtual host config file

    ➩ Uncomment listen 80; so that traffic coming through port 80 will be directed to the site

    ➩ Change the root extension to match /var/www/yourDNSdomain.com/public_html

    ➩ Change the server_name to yourDNSdomain.com

    ♫ For an example configuration, see below:

    ...
    server {
            server_name www.jamesnewbrain.com;
    
            # rewrite www to non-www
            rewrite ^(.*) http://jamesnewbrain.com$1 permanent;
    }
    
    server {
            # Listening ports
            listen   80;                            ## listen for ipv4; this line is default and implied
            listen   [::]:80 default ipv6only=on;   ## listen for ipv6
    
            # Make site accessible from domain
            server_name jamesnewbrain.com;
    
            # Root directory
            root /var/www/jamesnewbrain.com/public_html;
            index index.html index.htm;
    
            # Logs
            access_log /var/www/jamesnewbrain.com/logs/access.log;
            error_log /var/www/jamesnewbrain.com/logs/error.log;
    
            # Includes
            include global/restrictions.conf;
    ...
    

    e. Create global/restrictions.conf:

    # on REMOTE:
    $ sudo mkdir /etc/nginx/global
    $ sudo nano /etc/nginx/global/restrictions.conf
    

    ➩ Edit file so that it resembles:

    # based on info from: http://cbracco.me/vps/#vhosts
    
    # Global restrictions configuration file.
    # Designed to be included in any server block.
    location = /favicon.ico {
        log_not_found off;
        access_log off;
    }
    
    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }
    
    # Deny all attempts to access hidden files such as .htaccess, .htpasswd, .DS_Store (Mac).
    # Keep logging the requests to parse later (or to pass to firewall utilities such as fail2ban)
    location ~ /\. {
        deny all;
    }
    
    # Deny access to any files with a .php extension in the uploads directory
    # Works in sub-directory installs and also in multisite network
    # Keep logging the requests to parse later (or to pass to firewall utilities such as fail2ban)
    location ~* /(?:uploads|files)/.*\.php$ {
        deny all;
    }
    

    f. Activate the host by symbolically linking the specific config you want to enable in sites-available with sites-enabled.

    # on REMOTE:
    $ sudo ln -s /etc/nginx/sites-available/yourDNSdomain.com /etc/nginx/sites-enabled/yourDNSdomain.com
    

    g. To avoid a "conflicting server name error", delete the default nginx virtual host.

    # on REMOTE:
    $ sudo rm /etc/nginx/sites-enabled/default
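    ♫ Before restarting nginx in the next step, it's worth sanity-checking the config files for syntax errors:

    # on REMOTE:
    $ sudo nginx -t     # tests the configuration and reports the first error it finds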
    

    h. Restart nginx

    # on REMOTE:
    $ sudo service nginx restart
    

You should now be able to actually visit yourDNSdomain.com and see the test index.html we made in nano a few steps back. If so, you've successfully connected the internet, via a DNS lookup of yourDNSdomain.com, to the IP address DigitalOcean provides your droplet, to an nginx virtual host, to an actual *.html file on your VPS! Good job! We'll be back to finish the job once we get your website actually building on your server with Pelican.

VI. Install Git

➩ Install Git on your REMOTE VPS.

# on REMOTE:
$ sudo aptitude install git-core

VII. Set up global Python environment on REMOTE

This is going to be a pretty similar procedure to our setup on our LOCAL machine, the difference being that we will also use the requirements.txt file generated by pip freeze so that our Python virtualenv on our REMOTE server matches that of our development environment.

  1. Install Python (Ubuntu 12.04 already has 2.7.3 installed)

  2. Install pip, the package manager for Python modules.

    ♫ You use sudo here to install pip globally on your machine, since you'll typically want to be able to install/update/uninstall Python modules outside of any specific virtual environment we set up.

    ♫ We also install python-dev headers in case we will be compiling any python libraries that need them.

    # on REMOTE:
    $ sudo aptitude install python-pip python-dev
    
  3. Install virtualenv, the Python virtual environment management system.

    ♫ Remember that virtualenv allows you to compartmentalize sets of Python modules for specific projects from the globally installed modules on your entire system. This way you can have project-specific versions of modules and manage any conflicts between modules on a project-by-project basis. It also allows you to sync your "blessed" set of Python modules from your LOCAL machine with your REMOTE machine, which we will do later in this procedure.

    # on REMOTE:
    $ sudo pip install virtualenv
    

VIII. Make a snapshot backup of your VPS now

The timing is good for a backup snapshot of your VPS, because all the software we need is installed and configured, but we haven't polluted the machine with any specific source code checkouts of our own. By saving a snapshot now, if we need to get back to a configured image of our system, we can, but without having to redo all the time-consuming sys-admin activities we did earlier.

➩ To take a snapshot of the droplet, you'll need to stop your droplet from the command line.

# on REMOTE:
$ sudo shutdown -h now

➩ Now from the DigitalOcean dashboard, select to take a snapshot. DigitalOcean will turn your server back on for you after they are done.

IX. Sync REMOTE with our website's GitHub repo, and finish Python environment setup

  1. We will now clone our GitHub repo to our REMOTE server. Again, I will show cloning this to ~/dev/jamesnewbrain/, so apply your own pathing as you like.

    ➩ Get the HTTPS URL for your website's GitHub repo from the GitHub website. Mine looks like https://github.com/jfallisg/jamesnewbrain.git.

    # on REMOTE:
    $ mkdir ~/dev
    $ cd ~/dev
    $ git clone your_github_HTTPS_URL
    

    ♫ You now have a copy of your current website repo on your remote server! We still have a few steps to get this thing working the way we want to.

    ➩ To update your REMOTE repo, use:

    # on REMOTE:
    $ cd ~/dev/jamesnewbrain
    $ git pull origin master
    
  2. Install Python environment from GitHub-synced requirements.txt

    # on REMOTE:
    $ cd ~/dev/jamesnewbrain           # the repo we just cloned
    $ virtualenv env                   # create a Python virtualenv
    $ source env/bin/activate          # activate our Python virtualenv
    $ pip install -r requirements.txt  # install our site's Python dependencies
    $ pip freeze                       # check how it went
    

X. Build HTML and serve the site to the internet

  1. Add a symbolic link between your blog's output/ dir and /var/www/yourDNSdomain.com/public_html.

    The idea here is to make a symbolic link between the contents of ~/dev/jamesnewbrain/output/, the location Pelican drops the HTML that it builds, and the location nginx is looking to serve HTML on its webserver.

    # on REMOTE:
    
    # first delete the test index.html file we made earlier
    $ sudo service nginx stop
    $ sudo rm -rf /var/www/jamesnewbrain.com/public_html
    
    # make the symbolic link
    $ sudo ln -s ~/dev/jamesnewbrain/output/ /var/www/jamesnewbrain.com/public_html
    
    # check it worked
    $ ls -alFGh /var/www/jamesnewbrain.com/     # if the symlink looks right, proceed
    $ sudo service nginx restart
    
  2. You can have Pelican on the REMOTE server build HTML now, which will populate the output folder in your site's directory.

    # on REMOTE:
    $ cd ~/dev/jamesnewbrain
    $ git pull origin master
    $ source env/bin/activate   # so that make html finds pelican in our virtualenv
    $ make html
    

    ➩ You should now be able to visit yourDNSdomain.com in an actual web browser and see your currently GitHub-committed Pelican-generated website!


PART 3: New blog posting workflow

I. Write a new blog post.

I write blog posts in Markdown, a plain-text format that allows for some simple markup for formatting. Pelican also supports reStructuredText (.rst), if you prefer that. Personally, I've gotten into the habit of taking notes in Markdown syntax so that I can easily have formatted text that is also treatable as source code.

➩ Save and edit your blogpost.md file in the blog_project_root/content/posts dir of your website repo, so that it will be visible to Pelican (a skeleton example follows below).
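♫ Pelican reads the post's title, date, category and so on from a metadata block at the top of the Markdown file. A minimal skeleton (the filename and values here are just examples) looks something like this:

Title: My first post
Date: 2014-03-01 10:00
Category: blog
Tags: pelican, python
Slug: my-first-post

The body of the post goes here, written in normal Markdown.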

II. Use Pelican's simple dev server to preview posts in browser as you edit.

♫ Remember that develop_server.sh will run in a loop, detecting changes in your Pelican project (changes to config file, changes to blog post files), rebuilding the HTML automatically, and serving the site locally using the Python HTTP web server to http://localhost:8000/.

# on LOCAL:
$ cd ~/dev/jamesnewbrain        # use your own project's repo dir
$ make devserver

➩ To regain access to your terminal, use ctl-c.

➩ This leaves the Python webserver running as a background process. To kill that process as well, run:

# on LOCAL:
$ sh develop_server.sh stop

III. Commit completed post source to GitHub

# on LOCAL:
$ cd ~/dev/jamesnewbrain            # use your own project's repo dir
$ git add .
$ git commit -m "describe changes"
$ git push origin master

IV. Deploy changes on remote server

# on REMOTE:
$ cd ~/dev/jamesnewbrain            # use your own project's repo dir
$ git pull origin master            # pull the changes you just pushed to GitHub
$ source env/bin/activate
$ make html
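If you find yourself typing that REMOTE sequence a lot, a throwaway shell alias saves some keystrokes (the name rebuild-blog is just an example; adjust the repo path to your own):

# on REMOTE:
$ echo "alias rebuild-blog='cd ~/dev/jamesnewbrain && git pull origin master && source env/bin/activate && make html'" >> ~/.bashrc
$ source ~/.bashrc
$ rebuild-blog      # pull, activate the virtualenv, and rebuild in one go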

THAT'S IT! Your website is updated on the internet!

Whew, I know that was kind of long-winded. And we haven't even gotten to customizing your website with Pelican plugins and custom CSS and all those personalizations that will make your site your own! Well, I'm working on a follow-up (and much, much shorter) post about site customization. So look forward to part 2, coming soon!

-James

