subfocal.net


Using bindfs to work around NFS3 limits

I have a pretty awesome little network storage drive made by Synology that I use for backups, large media files, and anything I may want to access across more than one computer. It has one unfortunate limitation: by default, my uid (user id) is 1026, and I can’t change it to anything below 1024 without causing problems. (discussion here)

Of course, my uid on most of my computers is 1000, the default for Debian-based Linux systems and probably many others. NFS operates on raw user ids (e.g. 1026) rather than user names (e.g. mike), which essentially requires these uids to line up on all your computers. The “correct” solution would be to change my uid on every OS I run to 1026 (what a hassle), possibly by using an LDAP server to host user accounts (total overkill for a small home network). Isn’t there some hacky solution, perhaps involving duct tape and zip ties?

It turns out there is: bindfs, a FUSE-based filesystem that supports a bunch of permissions and ownership tweaks and transformations. The feature I care about is uid mapping, but check out the site for other uses.

Example session:

$ mkdir mike mike_fixed
$ sudo mount -t nfs nas:/volume1/mike mike
$ ls -l mike/hello.txt
-rw-r--r-- 20 1026 mike  4096 May 27  2013 hello.txt
$ touch mike/hello.txt
touch: cannot touch `mike/hello.txt': Permission denied
$ sudo bindfs --map=1026/1000 mike mike_fixed
$ ls -l mike_fixed/hello.txt
-rw-r--r-- 20 mike mike  4096 May 27  2013 hello.txt
$ touch mike_fixed/hello.txt
(touch silently succeeds)

Notice how the file now appears owned by ‘mike’, who has uid 1000 on my local system. From now on, I can treat files on that bound filesystem as if they’re owned by me without any permissions headaches! (Also note that the gid (group id) already matched, but bindfs also supports mapping groups if this is needed.)
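To make the idea of uid mapping concrete, here is a minimal conceptual sketch in Python. It is not how bindfs is implemented; the function names are made up for illustration, and it only handles numeric uid pairs like the one in the session above.

```python
def parse_uid_map(spec):
    """Parse a map spec such as "1026/1000" into {1026: 1000}.

    Only numeric uid pairs are handled here; entries are separated by ':'.
    """
    mapping = {}
    for entry in spec.split(":"):
        source, target = entry.split("/")
        mapping[int(source)] = int(target)
    return mapping


def presented_uid(real_uid, mapping):
    """Return the uid a mapped view would show for a file owned by real_uid."""
    return mapping.get(real_uid, real_uid)


uid_map = parse_uid_map("1026/1000")
print(presented_uid(1026, uid_map))  # the NAS uid, seen through the mapping
print(presented_uid(0, uid_map))     # a uid with no mapping entry
```

The point is simply that the mapping is a lookup applied on the way through: files on the NAS keep their real owner, and only the view through the bound directory changes.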

    • #nfs
    • #linux
    • #synology
  • 3 weeks ago

Book Review: The Signal and the Noise

The Signal and the Noise: Why So Many Predictions Fail - But Some Don’t by Nate Silver
My rating: 4 of 5 stars

I followed Nate Silver’s blog (FiveThirtyEight) closely during the run-up to election day 2012. His premise was simple: grab every public poll possible, attempt to correct for pollsters’ known biases, and produce a forecast based on the result. Somehow no one had thought to do this before. Silver simply crunched the numbers and nailed the outcomes in every state. Meanwhile, pundits, bloggers, and assorted blowhards made predictions based on nothing but gut feeling and partisan hackery, and they mostly missed the mark (often by a wide margin).

I was looking forward to reading more about his methodology in this book, as well as his take on the principles involved in making predictions from noisy data. In this regard, I wasn’t disappointed. Silver does a good job of laying out the rules of the road:

  • It’s easy to mistake essentially random fluctuations for a meaningful pattern, and in some contexts (say, earthquake predictions), this can have devastating results.
  • Having a well-formed, testable theory is better than just looking for any correlations you can find in your data set.
  • Always make predictions and update your probability estimates like a good Bayesian. Your predictions should approach reality as you continually refine them.
  • Watch out for biases in yourself and in your data set.
  • Often overlooked: make sure incentives are aligned with the results you would like to achieve.

Also, some specific interesting facts:

  • Making a living at poker is really hard. Without any really bad players at the table, it’s nearly impossible for anyone but the top players to turn a profit.
  • The efficient market hypothesis doesn’t hold up to scrutiny; however, even though the stock market has discernible patterns, it may not be possible to exploit the patterns and consistently beat the market.
  • Weather prediction has gotten a lot better in the last couple decades, even though most people think it hasn’t.
  • Both earthquakes and terrorist attacks follow a power law distribution.

If you’re a stock trader, scientist, gambler, or simply someone who wants to form an accurate picture in a noisy environment, there’s something in this book for you. It’s nice to see this kind of clear-headed, rational thinking becoming sexier recently. The book is also well cited, which helps give weight to some of the more counterintuitive claims.

There was a missed opportunity to spend some time on results from the medical research industry. It’s well known that publication bias and other factors result in misleadingly positive results for new treatments, which ultimately go away after independent researchers attempt (unsuccessfully) to reproduce the results. It seems like a pertinent, prototypical case of finding patterns in noise, one which could have been instructive.

A final note: Silver is not the best writer; his prose is uneven and occasionally downright awkward. His casual style works fine for a blog, but here it diminishes the impact the book could otherwise have had. This is his first published book, and it shows. There are also a couple glaring mistakes that make me think he needed a better editor.


    • #books
  • 1 month ago

Stupid Bash tricks: show git branch in your window title

First, the code (add to your ~/.bash_profile):

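function git-title {
    local title
    # The assignment's exit status is that of the command substitution,
    # so the fallback below runs only when git fails.
    if ! title="branch: $(git rev-parse --abbrev-ref HEAD 2>/dev/null)"; then
        # Not a git repository
        title="bash"
    fi
    # ESC ] 2 ; <title> BEL sets the terminal window title
    echo -ne "\033]2;$title\007"
}

export PROMPT_COMMAND="git-title"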

How it works: the git-title function uses a terminal escape sequence to set your window title to “branch: foo” if your current working directory is inside a git repository, and to “bash” otherwise.

The PROMPT_COMMAND variable allows you to run a command every time bash is about to render a new prompt. I think the intent is that you can echo additional information to the terminal to make your prompt more robust, but I’m taking advantage of this hook to just make sure the title updates whenever you have potentially entered a repository, changed branch, etc.
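For anyone who wants to reuse the same trick outside of bash, here is a rough Python sketch of the two pieces: building the title escape sequence and asking git for the branch name. The function names are my own, and the fallback title of "bash" simply mirrors the shell function above.

```python
import subprocess


def title_escape_sequence(title):
    """Wrap a title in the 'set window title' sequence (ESC ] 2 ; ... BEL)."""
    return "\033]2;" + title + "\007"


def current_title():
    """Return "branch: <name>" inside a git repository, otherwise "bash"."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # git is not installed or could not be started
        return "bash"
    if result.returncode != 0:
        # Not a git repository
        return "bash"
    return "branch: " + result.stdout.strip()
```

Writing title_escape_sequence(current_title()) to the terminal (without a trailing newline) should have the same effect as the echo in the bash function.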

    • #git
    • #bash
  • 3 months ago

Quantum mechanics on a planetary scale

Jupiter, the largest planet in our solar system, produces brilliant auroras in its atmosphere. It might not seem too surprising, given that we can see them from our significantly less exotic vantage here on Earth, that such things would happen on other planets as well. However, without a phenomenon predicted by quantum mechanics and first observed in 1996, this would not be possible on Jupiter at all.

Aurora

Auroras occur when charged particles (electrons and ions), streaming steadily from our sun, encounter Earth’s magnetic field. They are guided along the magnetic field lines toward the north and south magnetic poles, accelerating and crashing into nitrogen and oxygen molecules in the upper atmosphere. These collisions are energetic enough to excite or ionize those atoms and molecules; as they drop back to their ground state, they emit photons (hence the light show!).

But wait, our planet has a huge magnetic field produced by interactions in its core (the liquid outer shell, in particular), primarily composed of the metals iron and nickel. In contrast, Jupiter is primarily (90%) composed of hydrogen and helium, as well as some other compounds such as methane, water, and ammonia. Yet, at cloud level, its magnetic field is ten times stronger than ours. Where is it coming from?

To understand these Jovian auroras, we have to turn to quantum mechanics.

Strange (but true)

It’s easy to think of particles, such as the protons and electrons in a hydrogen gas, as billiard balls bouncing around like well-behaved Newtonian objects. You’re probably vaguely aware that this view is incorrect, but probably only a little wrong (right?).

In the everyday world, the classical rules of physics seem natural: we have mass, we can touch things and push them around. We can make tools and even build rocket ships with these rules. We have to be reminded that these rules only seem natural because our intuitions are shaped by our experiences of the macroscopic world. Quantum mechanics is no different from classical physics in that there are rules that can be used to predict and explain behavior in the observed world.

The laws of quantum physics are really only “weird” because we didn’t grow up (and evolve) interacting with them directly. At the level of particles, there are no natural analogies to guide our thinking; there are only rules that we simply accept (since they have survived nearly a century of experimental examination). Two such rules of the very small are required to understand the behavior of a very large planet.

The first is the Pauli exclusion principle, a rule that says that fermions (“matter” particles, such as electrons) cannot have the same quantum state. This essentially forces electrons to stack up in very specific ways around atomic nuclei, in order to ensure that each electron has a distinct quantum state (spin, energy level, etc.).

A consequence of Pauli is that matter (made of fermions) takes up a substantial volume. If the exclusion principle did not exist, atoms would collapse to a much smaller size, where the nuclear and electromagnetic forces are balanced. Pauli essentially explained why we have volume at all. (This is what’s so mind-bending about quantum physics: it explains things you didn’t think needed explaining; worse, it explains them using rules that are seemingly arbitrary and unnecessary.)

The second rule we’ll need is Heisenberg’s uncertainty principle. This more well-known (in certain circles) rule states that you can never know the precise position and momentum of a particle simultaneously. More specifically, it states that the uncertainties (which can be quantified) of position and momentum have an inverse relationship — the more certainty you can create around one value, the less you’ll have of the other.

Note that Heisenberg wasn’t just describing a limitation of observational methods — for example that our instruments would interfere with the system and prevent measuring both values precisely. No, this is a bona fide Law of Nature, a fundamental property of the universe.

Idealism vs Degenerates

On Jupiter, gravity is so intense that its hydrogen is always under incredible pressure. It’s under so much pressure that its protons and electrons are extremely closely confined, close to the limits imposed by the Pauli exclusion principle. (Any closer, and two electrons would be forced into the same quantum state, which is forbidden.)

In these close quarters, the atoms in the liquid hydrogen have very certain positions — they are confined by immense gravity and therefore bound to a small range of positions. The uncertainty principle requires that, regardless of the liquid’s temperature, its particles have a high average momentum (in order to satisfy the requirement of high uncertainty!).

In an ideal gas, temperature and pressure are correlated — increase one and the other will increase proportionally. Here, however, the pressure of this liquid is not dictated by temperature. Instead it is a result of the high velocities required by Heisenberg, making it “degenerate” matter (rather than ideal).

Metallic hydrogen?

In this dense, high-pressure mix, the electrons of hydrogen atoms and their neighbors are so close together that it’s no longer clear which electron belongs to which proton. As well, electrons, having significantly less mass than protons, have higher average velocities. (Momentum is the product of mass and velocity, so protons with the same momentum as electrons would move much slower.)
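To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It uses the standard form of the uncertainty relation (position spread times momentum spread is at least half the reduced Planck constant) and a confinement scale of 1e-10 metres, which is an assumption picked purely for illustration, not a figure for Jupiter's interior. It also treats both particles non-relativistically.

```python
HBAR = 1.054571817e-34            # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg
PROTON_MASS = 1.67262192369e-27   # kg


def min_momentum_spread(delta_x):
    """Smallest momentum uncertainty allowed for a position uncertainty delta_x."""
    return HBAR / (2.0 * delta_x)


delta_x = 1e-10  # metres; an assumed confinement scale, for illustration only
delta_p = min_momentum_spread(delta_x)

# Same momentum scale, very different masses, so very different speeds.
electron_speed = delta_p / ELECTRON_MASS
proton_speed = delta_p / PROTON_MASS

print("momentum spread: %.3e kg*m/s" % delta_p)
print("electron speed scale: %.3e m/s" % electron_speed)
print("proton speed scale: %.3e m/s" % proton_speed)
print("speed ratio: %.0f" % (electron_speed / proton_speed))
```

Whatever confinement scale you plug in, the ratio of the two speeds is just the proton-to-electron mass ratio, which is why the electrons are the ones that end up roaming freely.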

The electrons thus are free to move independently of their protons, and this hydrogen soup becomes a metal — a conductor of electricity, and a very good one at that! (This also helps justify its place at the top of the alkali metal column in the periodic table.)

This form of hydrogen was predicted more than 75 years ago by Wigner and Huntington, and more recently, created in the lab. Scientists, working at Lawrence Livermore National Laboratory in 1996, used a high-velocity gun (called a “light gas gun”) to fire a compressing plate toward a small sample of liquid hydrogen. The hydrogen experienced a pressure of over a million atmospheres and briefly exhibited metallic properties — it transitioned from being an electrical insulator to having near zero resistance.

So, given a layer of conductive liquid hydrogen spinning inside Jupiter, we have an account of its impressive magnetic field and resulting auroras. As well, it provides a great illustration of the “reality” of quantum mechanics — its principles, operating at the subatomic level, provide the only explanation of a phenomenon occurring on a planetary scale.

    • #auroras
    • #jupiter
    • #quantum physics
    • #quantum mechanics
    • #metallic hydrogen
  • 6 months ago

Generate bit.ly links from the command line

I’m a command line junkie. Whether that’s a good or bad thing I’ll leave up to the reader. But since the reader is here, I’ll assume (s)he finds command line utilities helpful!

Lately, I’ve found myself generating short URLs for things often enough that I thought it would be nice to have a little script that generates them for me. Then I found python-bitly, an elegant little Python module that wraps the bit.ly API.

Over the next couple minutes, I unceremoniously hacked that module into a script that you can kick off whenever you want a bit.ly URL: bitly.py.

Prerequisites: python-bitly requires simplejson (or json), which you can install with: pip install simplejson. After this, register an account at bit.ly, grab your username and API key from the account page, and paste them into the script (API_USERNAME and API_KEY respectively).

Example session:

(crono:~)$ bitly.py 'http://github.com/'

Short URL: http://bit.ly/k7lifz

Get in the habit of quoting the URL, because a URL that contains ampersands (&) or semicolons (;) will cause problems with bash. Grab the script!
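If you ever build the command line from another program instead of typing it, the same quoting concern applies. Here is a small Python sketch using the standard library's shlex module; the URL is made up, and bitly.py is just the script name from the session above.

```python
import shlex

url = "http://example.com/search?q=bash&page=2;sort=asc"  # made-up URL

# Unquoted, bash would split the command at '&' and ';'.
# shlex.quote wraps the string so the shell passes it through as one argument.
quoted = shlex.quote(url)
command_line = "bitly.py " + quoted
print(command_line)

# Splitting the command line the way a POSIX shell would gives the URL back intact.
assert shlex.split(command_line) == ["bitly.py", url]
```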

    • #python
    • #bit.ly
    • #urls
    • #command line
  • 8 months ago

supybot-git: An IRCbot plugin for Git notifications

I was looking for a way to get git commit notifications in IRC, and there are several examples of bots that do this. However, being a Pythonista, if I’m going to run an IRC bot it’s going to be one written in Python. I figured I’d probably want to extend it, and in that case it ought to be in a language I love.

After some investigation, I found Supybot, a robust, extensible IRC bot written in — you guessed it — Python. It has a pretty slick installation wizard to get you up and running without fiddling with any configuration files. It also ships with a robust collection of plugins, and many have written third-party plugins as well.

Since no git notification plugin existed, I went ahead and wrote one. It’s been running on my IRC server for over a year now, and I recently decided to clean it up and open source it. I present supybot-git!


It’s pretty straightforward to get up and running, and has the following features:

  • Can monitor any number of repositories
  • Repositories are associated with a list of IRC channels:
    • repositories command lists repositories associated with the current channel
    • Notifications will appear on these channels
    • Users can display recent commit log on this channel (with the log command)
    • People on other channels will have no indication the repository exists
  • Asynchronous commit notification
  • Configurable polling frequency (default: every 2 minutes)
  • Configurable notification format (you can use the commit author, branch, message, provide a link to the commit, and more)
  • Reload configuration with rehash command

It’s built with the assumption that you may want to retain some privacy, i.e. monitor a closed-source git repository. This means that people in one channel are allowed to see commit information, but other channels will have no idea that the repository even exists. I currently have my bot monitoring six repositories across three IRC channels and it has been perfectly stable.
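The channel-scoping idea can be sketched in a few lines of Python. This is not the plugin's actual code or configuration format; the dictionary, the repository names, and both function names are hypothetical and only illustrate the visibility rule described above.

```python
# Hypothetical configuration: each repository lists the channels allowed to see it.
REPOSITORY_CHANNELS = {
    "website": ["#web", "#dev"],
    "secret-project": ["#internal"],
}


def repositories_for(channel):
    """Return the repositories visible from the given channel, sorted by name."""
    return sorted(
        name
        for name, channels in REPOSITORY_CHANNELS.items()
        if channel in channels
    )


def channels_to_notify(repository):
    """Return the channels that should receive commit notifications."""
    return REPOSITORY_CHANNELS.get(repository, [])


print(repositories_for("#dev"))
print(repositories_for("#random"))
```

A channel that is not listed for a repository gets an empty answer, which is the same thing it would see if the repository did not exist at all.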

Grab supybot-git and try it out! Let me know how it’s working for you.

Update Nov ‘12: Renamed some of the commands for simplicity’s sake, updated this post accordingly.

    • #supybot
    • #git
    • #python
    • #supybot-git
    • #irc
  • 8 months ago

Share your development server with a reverse ssh tunnel

Sometimes you want to allow someone access to your development server (e.g. a Django or Rails dev server) running on port 8000 on your laptop. Unless the other person is on the same subnet as you, it’s very likely there’s a firewall between you. (Whether you’re at home, on a company LAN, or at Starbucks.)

Assuming you have access to a Linux host that is publicly accessible, this is easy to work around. I personally have a tiny virtual host that gives me remote ssh access and runs a few little services for me, so I use this host.

This is a quick and dirty way to open up access to your dev server by using ssh and a publicly accessible remote server as your proxy. Here’s the entirety of my webtunnel.sh script:

#!/bin/bash
REMOTE="myvirtualhost.com"
echo "Opening tunnel to $REMOTE..."
ssh -nNT -o ServerAliveInterval=30 \
    -R $REMOTE:8000:localhost:8000 $REMOTE

Here’s what it looks like in action:

$ webtunnel.sh
Opening tunnel to myvirtualhost.com...

As long as this ssh process is around, the tunnel will continue to exist. It’s essentially a one-liner, but a little complicated, so I’ll explain the options:

  • -n: Redirect stdin from /dev/null, mainly useful if you plan on putting the ssh session in the background (with -f).
  • -N: Do not actually execute a remote shell, just connect and establish the requested port forwards.
  • -T: Don’t allocate a pseudo-TTY, since this is not intended to be an interactive shell.
  • -o ServerAliveInterval=30: Send a keepalive ping every 30 seconds. This will keep the TCP connection from being shut down due to inactivity if it is unused for several minutes.
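  • -R $REMOTE:8000:localhost:8000: The key bit of magic: establish a reverse tunnel from the remote host, port 8000, to your local host, port 8000. Any incoming connection to the remote server on port 8000 will be transparently routed to your local development server. (For the remote port to accept connections from other machines rather than only from the remote host itself, the remote sshd must have GatewayPorts set to yes or clientspecified.)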

As I hinted above, you can optionally pass -f to tell ssh to fork into the background after successfully connecting. I like to have it in the foreground, occupying a screen terminal, so that I don’t forget it’s open and I can kill it with CTRL-C whenever I want.
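If you would rather launch the tunnel from a Python script, the same options can be assembled as an argument list. This is just a sketch: the function name and its parameters are my own, and it only builds the command without running it.

```python
def tunnel_command(remote, port=8000, background=False):
    """Build the ssh argument list for a reverse tunnel on the given port."""
    flags = "-nNT"
    if background:
        flags += "f"  # fork into the background after connecting
    forward = "%s:%d:localhost:%d" % (remote, port, port)
    return [
        "ssh", flags,
        "-o", "ServerAliveInterval=30",
        "-R", forward,
        remote,
    ]


print(" ".join(tunnel_command("myvirtualhost.com")))
```

Passing the resulting list to subprocess.call would start the tunnel and block until ssh exits, which matches the foreground behavior described above.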

    • #ssh
    • #web-development
    • #ssh-tunnel
  • 8 months ago

Database file storage for Django

Django provides a good mechanism for handling file attachments as model fields, using the FileField and ImageField classes. These field types store the path to a file in the database, while facilitating the actual file storage on the filesystem through the Storage API. They even come with the interface widgets to handle uploads in the model admin, so it’s a simple feature to activate.

Motivation

File storage adds a new concern that your application did not previously have. Now, in addition to supporting database content (migrations, backups, and all the best practices that come with a database), you have user content living on the filesystem. This means a new area where you need to consider security (permissions that allow uploading without compromising the system), managing the uploaded files across redeployments, distributing them across application instances, etc.

When file storage is central to your application’s purpose, then these concerns are worth spending engineering time on. However, there are many times where file uploads are guaranteed to be small and won’t be heavily used throughout the app. In such cases, storing file content in the database is possible and can make your life a bit easier.

Database Storage

Database storage is an alternative to using the filesystem for uploaded file content: the whole thing can be stored in the database. Django doesn’t ship with this capability — probably because in many cases it is a very bad idea.

Some searching turned up a snippet that implements this basic idea. It unfortunately doesn’t use the Django database layer, and only supports Microsoft SQL Server as written. It would be possible to get working with other databases, given the correct connection string, but I’d really prefer to let Django abstract the database layer for me where possible.

Taking inspiration from the snippet, I’ve implemented a new database storage class that does use the Django database API. It’s extremely easy to get going, and should work with any database supported by Django (I’ve used it with SQLite and SQL Server so far).

Check it out:

  • django-database-storage on PyPI
  • django-database-storage on Github

Example Usage

Here’s a quick how-to on using this library. First, install it (ideally in a virtualenv):

$ pip install django-database-storage

Add a FileField to your model object (in models.py):

from database_storage import DatabaseStorage
...
DBS_OPTIONS = {
    'table': 'blog_attachments',
    'base_url': '/blog/attach/',
}

class BlogEntry(models.Model):
    attachment = models.FileField(
        upload_to='blog_attachments/',
        storage=DatabaseStorage(DBS_OPTIONS),
        null=True, blank=True)

Create a table in the database for storing the attachments. Add a file at your_app/sql/blogentry.sql (Django’s initial-SQL hook looks for sql/<modelname>.sql, with the model name in lowercase), which will execute when you run syncdb:

CREATE TABLE blog_attachments (
    filename VARCHAR(256) NOT NULL PRIMARY KEY,
    data TEXT NOT NULL,
    size INTEGER NOT NULL);

(The column names may be different from the above, but these are the defaults. If you use different column names, specify them in DBS_OPTIONS as 'name_column', 'data_column', and 'size_column'.)

Now any uploaded files will be saved into the database instead of the filesystem. Note that the data field is text; this is because Django doesn’t presently support blobs or non-unicode data returned from queries. DatabaseStorage transparently uses base64 encoding to store your data, so you can store arbitrary binary files, but they will use some extra space from the encoding.
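To get a feel for the size overhead, here is a tiny standalone Python sketch of a base64 round trip. It is not the library's code, just the standard library's base64 module applied to some arbitrary binary data.

```python
import base64

payload = bytes(range(256)) + b"\x00" * 44  # 300 arbitrary binary bytes

encoded = base64.b64encode(payload)  # ASCII-safe, so it fits in a TEXT column
decoded = base64.b64decode(encoded)  # what reading the file back would yield

print(len(payload), len(encoded))
assert decoded == payload
```

Base64 emits four output characters for every three input bytes, so the stored data ends up roughly a third larger than the original file.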

The last thing you need is a view to serve these files upon request. Add a url handler to urls.py:

(r'^blog/attach/(?P<filename>.+)$', 'blog_attach'),

…and the view to handle these requests (in views.py):

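import mimetypes

from django.http import Http404, HttpResponse

from database_storage import DatabaseStorage
# DBS_OPTIONS is the dict defined in models.py; import it from there.

def blog_attach(request, filename):
    # Read file from database
    storage = DatabaseStorage(DBS_OPTIONS)
    image_file = storage.open(filename, 'rb')
    if not image_file:
        raise Http404
    file_content = image_file.read()

    # Prepare response
    content_type, content_encoding = mimetypes.guess_type(filename)
    # content_type= replaces the older mimetype= keyword, removed in Django 1.7
    response = HttpResponse(content=file_content, content_type=content_type)
    response['Content-Disposition'] = 'inline; filename=%s' % filename
    if content_encoding:
        response['Content-Encoding'] = content_encoding
    return response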

That’s it! Now you have the ability to save user-uploaded files in the database, without having to worry about file management on the server. Remember, this is intended for simple use cases and small files. An application that relies heavily on file storage, or wishes to store files greater than 1MB or so, should absolutely use a more robust solution.

Finally, to see the full documentation, pull up the help in your Python shell:

$ python
...
>>> from database_storage import DatabaseStorage
>>> help(DatabaseStorage)

Let me know if you have any issues or questions!

    • #django
  • 9 months ago

Django migrations and your new development environment

If you’re coming to Django from another framework where migrations are a first-class concept, the lack thereof is one of the first things you’ll notice about Django. The de facto standard for Django migrations is a third-party app called South. (There are others, see the full list on Django’s wiki.)

South makes it very easy to create migrations to add or remove columns, create tables, and perform other routine database schema updates. It keeps these migrations in sequential order, so your team can all perform the same updates in the same order on their workstations.

The Problem

When you set up a new development environment (or server for that matter), you need to create the database schema your application requires. South expects you to do this by replaying your migration history:

$ ./manage.py syncdb
$ ./manage.py migrate
...
DatabaseException in 0003_some_ancient_migration

On a big, long-lived project, you’ll eventually have dozens or hundreds of migrations. These migrations are not regularly tested or even used by anyone. But if you use the above workflow, you’ll attempt to replay those dozens or hundreds of untested migrations on your system.

Code that is rarely used and never tested will rot; it’s just a matter of time. You’ll switch database versions, or some dependency will change, and your migrations no longer work. Really, is it even important that they work two years after they’re written? The purpose of a migration is to make a small, incremental change to an existing database, bringing it up to date. Forcing migrations to work over a period of years across a multitude of environments would require significant development effort, time which could be better spent on your product itself.

SyncDB

Normally, South hijacks your syncdb command and does not allow Django to create the tables for apps that normally use South’s migrations. Fortunately there’s an easy workaround: South lets you bypass this behavior and do a normal syncdb with the --all parameter. Run this way, syncdb will create tables for all of your apps, even those that contain South migrations.

Simply doing this would leave your database schema and what South thinks about your database schema out of sync. To remedy this, follow it up with a faked migration run:

$ ./manage.py syncdb --all
(Database is now ready to roll)
$ ./manage.py migrate --fake
(South pretends to run migrations and updates its internal state)

This is a much more reliable way to get a new environment up and running, and saves you the hassle of maintaining ancient migration code years after it was actually needed.

Note: If your migrations create data in the database (rather than just updating schema), skipping these migrations may leave your database setup incomplete. Normally, however, this will not be the case.
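If you want to fold the two steps into a setup script, something like the following Python sketch would do. The function name and the manage_py parameter are hypothetical; the commands themselves are exactly the two shown above, and the sketch only prints them.

```python
def bootstrap_commands(manage_py="./manage.py"):
    """Return the two commands, in order, that set up a fresh database."""
    return [
        [manage_py, "syncdb", "--all"],    # create tables for every app
        [manage_py, "migrate", "--fake"],  # record migrations as applied
    ]


for command in bootstrap_commands():
    print(" ".join(command))
```

Running each list through subprocess.check_call, in order, would stop at the first failure, so the faked migration run never happens against a half-created schema.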

    • #django
    • #databases
    • #database migrations
    • #migrations
    • #south
    • #django-south
  • 9 months ago

The best cover of Dr Wily’s theme (Mega Man 2) you’ll hear all day. It starts slowly, but stick around.

    • #vgm
    • #mega man
    • #video games
  • 9 months ago

About


I'm a software engineer in San Francisco. This is my less-than-focused blog.

I work at Goodreads. We help people find and share books they love.

I'm also on Github and Linkedin.
