Skip to main content

Drupal Node Access

Most website permissions work like this: you've got the anonymous public, who can see everything you've published, and a webmaster or two who can see and edit everything.

But once you add the complexity of users publishing their own content, or adding an 'intranet' where only some people are able to see private content, it gets tricky. Unlike Wordpress and Joomla that are very focussed on the simple website permissions model, Drupal was originally built as a platform for collaboration, and it's got some great tools for addressing all kinds of access use cases. In particular, sometime around when I started working with Drupal, the node access system of access control was born.

Unfortunately, the problem of access control is inherently challenging due to its complexity, and every time I try to implement it, I have to reread about the Drupal node access model to remember how it works. I've just had to do this again while implementing a customized Open Atrium installation for the Centre for Social Innovation, and so this posting is my attempt at writing it down to help me remember it for next time, and to share it with others.

When you step beyond the basic web publishing model, then the key question is: who can do what (view, edit, etc.) with what page? The obvious model is then to define those access permissions for each page, but after trying that once, you'll realise that you probably don't want to do that because:

a. it takes too long to set up (n nodes x m users x p permissions = n x m x p checkboxes to add).
b. it's impossibly cumbersome to maintain.

Or, put another way, we usually have a different mental model that includes some intermediate abstractions that helps us answer the question "who can do what". Actually, I'm oversimplifying already - I've been describing the problem of access to "node" pages - but what about those listing pages? You'll also want to restrict what's listed, depending on who's looking, right? Let's forget that problem for a second though.

As an example, in Drupal, users are grouped according to 'role', and nodes are grouped by 'type'. In a common scenario, you might want to restrict nodes of type "secret" to only be visible by users with the role "spy". In fact, that scenario is nicely covered by some standard modules, so you can implement that one without thinking about it too much. Another important example is in the case of "Organic Groups", where access for group nodes depends on a users' membership in and administrative permissions for the group.

Those two examples are not only the most common implementations of node access, they're also an excellent demonstration of how to understand and implement Drupal's node access system. The key to simplifying node access in Drupal is the abstraction of groups of related users and nodes. This is done with the concept of a 'realm' which is best thought of as the context of groups of users accessing a particular group of nodes. If you were just thinking of the first example, then this seems like excessive abstraction, but if you think of the Organic Groups example, you'll see why we need this, where the "group of users" abstraction is contextually dependent on the node you're looking at.

So: here's a recipe for implementing Drupal's node access system.

1. write down in english all the access control restrictions that your system needs. Give each group of users and nodes sensible names.

2. reread the example implementation in the (search for node_grants and the node_access_records hooks).

3. define the node_access_records function per node: for a given node, what are the relevant 'grants', i.e. permissions for each user group, where 'realm' corresponds to the user group within the context of that node.

4. define the node_grants function, which then converts the abstraction of the realm, operation and user into the actual yes or no.

So the machine actual flow will go like this:

1. find the corresponding row(s) in the node_access table for the current node.

2. for each row, run the corresponding node_grants function with the current user and realm to determine that user's permission for that node.

Be sure to test the outcome, particularly if you're protecting important content!

And finally - let's go back to the question of listing pages. If you've been following along so far, you may wonder why we have to use all these abstractions, and in fact, you wouldn't have to. You could (and can) just intervene with the nodeapi hook and hide nodes that the current user shouldn't have access to.

But it turns out that by using the 'realm' concept, and breaking the answer to the question into these two hooks, we can very efficiently generate lists of nodes per user that respect the node access.

If it sort of makes sense now, but you think someone has already solved your particular use case, then check out this overview of node access modules.

Popular posts from this blog

Me and varnish win against a DDOS attack.

This past month one of my servers experienced her first DDOS - a distributed denial of service attack. A denial of service attack (or DOS) just means an attempt to shut down an internet-based service by overwhelming it with requests. A simple DOS attack is usually relatively easy to deal with using the standard linux firewall called iptables.  The way iptables works is by filtering the traffic based on the incoming request source (i.e., the IP of the attacking machine). The attacking machine's IP can be added into your custom ip tables 'blacklist' to block all traffic from it, and it's quite scalable so the only thing that can be overwhelmed is your actual internet connection, which is hard to do.

The reason a distributed DOS is harder is because the attack is distributed from multiple machines. I first noticed an increase in my traffic about a day after it had started - it wasn't slowing down my machine, but it did show up as a spike in traffic. I quickly saw that…

Upgrading to Drupal 8 with Varnish, Time to Upgrade your Mental Model as well

I've been using Varnish with my Drupal sites for quite a long while (as a replacement to the Drupal anonymous page cache). I've just started using Drupal 8 and naturally want to use Varnish for those sites as well. If you've been using Varnish with Drupal in the past, you've already wrapped your head around the complexities of front-end anonymous page caching, presumably, and you know that the varnish module was responsible for translating/passing the Drupal page cache-clear requests to Varnish - explicitly from the performance page, but also as a side effect of editing nodes, etc.

But if you've been paying attention to Drupal 8, you'll know that it's much smarter about cache clearing. Rather than relying on explicit calls to clear specific or all cached pages, it uses 'cache tags' which require another layer of abstraction in your brain to understand.

Specifically, the previous mechanism in Drupal 7 and earlier was by design 'conservative' …

CiviCRM's invoice_id field and why you should love the hash

I've been banging my head against a distracted cabal of developers who seem to think that a particular CiviCRM core design, which I'm invested in via my contributed code, is bad, and that it's okay to break it.

This post is my attempt to explain why it was a good idea in the first place.

The design in question is the use of a hash function to populate a field called 'invoice_id' in CiviCRM's contribution table. The complaint was that this string is illegible to humans, and not necessary. So a few years ago some code was added to core, that ignores the current value of invoice_id and will overwrite it, when a human-readable invoice is generated.

The complaint about human-readability of course is valid, and the label on the field is misleading, but the solution is terrible for several reasons I've already written about.

In this post, I'd like to explain why the use of the hash value in the invoice_id field is actually a brilliant idea and should be embrac…