
Animations in Gephi

Gephi is a graph visualization tool that's great for plotting big datasets. If you have large, complex graphs to plot, Gephi offers plenty of layouts that you can apply to your data to turn a mess of a graph into something that hopefully makes sense.

But there's something Gephi can't do: animations. The layout algorithms look awesome while they work, and yet there's no easy way to turn them into a video. Capturing the screen is an alternative, but the resulting resolution is rather low.

It has been suggested that you could export images of the individual networks at every step and then put them together. Of course, the people who suggested this in the past have also...

Therefore, this post.

Note: this post is not an introduction to Gephi, and I'll assume that you know the basics. If you don't know them, these slides look quite good.

Requirements

  1. Gephi (obviously)
  2. ImageMagick (to post-process images in bulk)
  3. FFmpeg (to create the final video)

I am also assuming that you are using Linux. This is not a requirement, but it does make my life easier.
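
For what it's worth, on a Debian/Ubuntu system the last two can usually be installed straight from the package manager (Gephi itself is downloaded from its website):

sudo apt-get install imagemagick ffmpeg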

Preparing your data

The first step is to have some data to visualize. For the purposes of this exercise, you can download this zip file containing two files named nodes.csv and edges.csv. They form a simple directed graph that I generated from a sample of directories in my computer.
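
If you would rather build your own files instead, Gephi's spreadsheet importer recognizes a handful of default column names: Id (plus an optional Label) for the nodes table, and Source, Target and Type for the edges table. The contents below are made up and only illustrate the expected shape of the files:

nodes.csv:

Id,Label
0,home
1,documents
2,pictures

edges.csv:

Source,Target,Type
0,1,Directed
0,2,Directed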

Now, load them into Gephi:

  1. Start a new Project
  2. Load the nodes table: File → Import spreadsheet → nodes.csv. Follow the steps, but remember to select "Append to existing workspace" in the last one.
  3. Repeat the previous step, this time using the edges.csv file.

Next, choose a layout. In my experience it is better to try them all first, see which one gives the best result, and then generate the final graph from scratch. Generating the final graph will take a while, so it's better to do it only once you are sure about which parameters to use.

Exporting all frames

It is now time to run our first script. If you don't have the Scripting Plugin installed, you can install it now via Tools → Plugins. We will use it to write a very simple Python script that does the following:

  1. Run a single step of a layout
  2. Take note of the relative size of the graph (we'll come back to this)
  3. Export it to a PNG file
  4. Return to step 1.

In case you want to copy-paste it, this is the script I used. Don't forget to remove the comments first, because Gephi doesn't like them.

def make_frames(iters, layout, outdir):
    # As many iterations as frames we want to generate
    for i in range(iters):
        # Run a single step of our layout
        runLayout(layout, iters=1)
        # Calculate the bounding box of this specific graph.
        # Initialize it from the actual node coordinates; starting
        # from 0 would wrongly anchor the box at the origin.
        xs = [node.x for node in g.nodes]
        ys = [node.y for node in g.nodes]
        min_x = min(xs)
        max_x = max(xs)
        min_y = min(ys)
        max_y = max(ys)
        width = max_x - min_x
        height = max_y - min_y
        # Generate a file and include the graph's bounding box
        # information in its filename
        exportGraph("%s/%05d-%d-%d.png" % (outdir, i, width, height))

Once you have copied this script into the console, you can generate all animation frames with the command make_frames(100, FruchtermanReingold, "/tmp/"). This will run the FruchtermanReingold layout for 100 iterations, and will save the generated frames in the /tmp/ directory. Of course, you can choose other layouts (see the documentation for more info) and you can run the script for a larger number of steps. You can also customize the layout parameters in the regular Layout tab, and the script will pick them up.

The script will block Gephi entirely while it runs, so don't go for a really high number of steps right from the beginning. Start with 50-100 iterations, and only move to longer runs once you are happy with the result.
For a nicer effect, run a single iteration of the "Random" layout first. This will put all your nodes in a very small space, and the final effect will be like an explosion of nodes.
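
In case it helps, the equivalent console call would look something like the line below. The name RandomLayout is an assumption on my part - check the console's auto-completion for the exact identifier your Gephi version exposes.

# "RandomLayout" is an assumed name; verify it via the console's
# auto-completion before running this
runLayout(RandomLayout, iters=1)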

Generating the animation

A further issue to deal with is the changing size of the image canvas. If we set Gephi to generate a 1024x1024 output image but we only have two nodes close to each other, those two nodes will look huge. If we have thousands of widely dispersed nodes, however, each individual node will be barely visible. Therefore, if you made a video directly from the images generated in the previous section, you would almost certainly get a zoom effect, with nodes growing and shrinking as the graph becomes denser or sparser.

To avoid this, we need to scale all pictures proportionally. The Bash script below calculates the maximum theoretical size of our graph (based on those variables we added to the filenames before), scales all images down to the proper size (as defined by canvas_w and canvas_h), and places them in the center of a single-color canvas (see the rgb() call).

# Directory where all individual frames are
SOURCE_DIR=/tmp
# Directory where final frames will be stored
TARGET_DIR=/tmp/outdir
# Make sure the target directory exists
mkdir -p ${TARGET_DIR}

# Obtain the theoretical maximum width and height in the PNG frames
max_width=`ls ${SOURCE_DIR}/*png | cut -f 2 -d '-' | sort -n | tail -1`
max_height=`ls ${SOURCE_DIR}/*png | cut -f 3 -d '-' | cut -f 1 -d '.' | sort -n | tail -1`

# Give your desired canvas size
canvas_w=1024
canvas_h=1024

# Scaling factor for the frames, based on the largest theoretical dimension
if (( $max_width > $max_height ))
then
    factor=`bc -l <<< "$canvas_w/$max_width"`
else
    factor=`bc -l <<< "$canvas_h/$max_height"`
fi

# Generate the new frames
for file in ${SOURCE_DIR}/*png
do
    # Obtain the properties of the image
    frame=`echo $file | xargs -n 1 basename | cut -f 1 -d '-'`
    width=`echo $file | cut -f 2 -d '-'`
    height=`echo $file | cut -f 3 -d '-' | cut -f 1 -d '.'`
    # Calculate how the image should be scaled down
    new_width=`bc -l <<< "scale=0; ($width*$factor)/1"`
    new_height=`bc -l <<< "scale=0; ($height*$factor)/1"`
    # Put it all together
    convert $file -scale ${new_width}x${new_height} png:- | \
        convert - -gravity Center -background "rgb(255,255,255)" \
        -auto-orient -extent ${canvas_w}x${canvas_h} \
        ${TARGET_DIR}/${frame}.png
done

Once you have generated this second set of frames, you can create the final video by going to your TARGET_DIR directory and running the command

ffmpeg -framerate 30 -i %05d.png -c:v libx265 -r 24 final_video.mp4
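
If the resulting H.265 file refuses to play in your browser or video player of choice, a more widely supported combination (just a common FFmpeg recipe, not the one I used above) is H.264 with the yuv420p pixel format; the -crf value controls the quality, with lower numbers meaning better quality and larger files:

ffmpeg -framerate 30 -i %05d.png -c:v libx264 -pix_fmt yuv420p -crf 20 -r 24 final_video.mp4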

If your video is too slow, a higher framerate value will do the trick (and vice versa). The final result, heavily compressed for the internet, can be seen below:

Final thoughts

I hope you'll find this guide useful. I'm not going to say that all these steps are easy to follow, but at least you can set them up once and forget about them.

Some final points:

  • For an alternative method of graph generation involving nodes with timestamps, this script looks like the way to go.
  • I'm interested in unifying everything under a single script - I chose Python because it was easier than Java, but maybe developing a proper multi-platform plugin is the way to go. I can't promise I'll do it, but I'll think about it. In that same vein, perhaps the script should center on a specific node?
  • If you plan to publish your video on the internet by yourself, this post gives a nice overview of which standards to use. If you want to tweak the video quality, this SE question provides some magic incantations for FFmpeg.
  • Finally, and speaking of magic incantations, special thanks to this post for providing the right ImageMagick parameters.