How to Visualize a Graph with a Million Nodes
Large-scale graph visualizations are tricky. The more nodes and edges you have in your network, the more difficult it is to compute a layout for it. A graph layout defines where on the canvas each node will be placed. No layout, no visualization!

Rendering a large number of nodes and edges is also a challenge. For example, if you try to animate more than a few thousand objects with Scalable Vector Graphics (SVG), performance breaks down and you’ll need to find another technique. Can you take advantage of a Graphics Processing Unit (GPU) to help draw so many data points? Of course! Using WebGL to render complex visualizations is becoming more common. But is it possible to use that GPU power to calculate the layout for your graph as well? The answer is also yes, with Cosmograph!

GPU-accelerated Force Layout

One of the key techniques used in network graph visualizations is Force Layout. It is a type of physical simulation that defines several forces affecting the nodes of your graph. For example, a spring force between connected nodes pulls them together, a many-body repulsion force pushes nodes away from each other, and a gravity force brings non-connected parts of the graph together in the simulation space.

You can find numerous libraries that implement various kinds of Force Layout simulations. Almost all of them use the CPU to do the calculations, and they get slower as the number of nodes increases, usually choking at around 100,000 nodes.

GPU-based force layout algorithms are much less common because they are more difficult to write. Porting the traditional implementation of the Many-Body force (the most complex force in the simulation) to the GPU won’t make the calculations noticeably faster, because random memory access operations (reading or writing data from the computer’s memory) are slow on GPUs and you’ll need a lot of them (for example, when you need information about two different nodes to calculate the forces between them, and their data is stored far apart in memory).

Usually, when you need to visualize a big network, you have to use a desktop visualization tool like Gephi, which calculates the layout first using an optimized CPU algorithm and then visualizes the result. You can also use more sophisticated tools like Graphistry, which calculates the layout on its servers and then renders your graph in the browser using WebGL, or a powerful command-line tool called Graphviz.

However, we came up with a much more user-friendly solution. We developed a technique that allowed us to implement the Force Graph simulation fully on the GPU. It is amazingly fast and it works on the Web!

Figure: A screenshot from Cosmograph showing a network visualization that has 133K nodes and 321K edges

Cosmograph and how it works

Meet Cosmograph, the fastest network graph visualization tool that works in the browser. It’s capable of visualizing networks that have a million nodes and edges, and that’s not the limit! It’s free to use: anyone can go to https://cosmograph.app, upload a CSV, and get it visualized!

Cosmograph’s user interface (UI) is pretty minimal. When you open it, you’ll find hints and data examples right away. Let’s briefly go over the basics here so you don’t get lost when you run it for the first time.

Figure: Cosmograph’s data load interface

Your graph data will need to be stored in a CSV file that has at least two columns, …
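To make the three forces from the GPU-accelerated Force Layout section concrete, here is a minimal CPU sketch in Python. It is not Cosmograph's GPU technique. The function name `naive_force_layout`, the force constants, and the clamped integration step are all assumptions chosen for illustration. The nested loop over node pairs is the many-body step, and its cost grows with the square of the node count, which is one reason naive CPU layouts slow down on large graphs.

```python
import math
import random


def naive_force_layout(num_nodes, edges, iterations=500, spring=0.05,
                       repulsion=0.5, gravity=0.02, step=1.0,
                       max_move=0.1, seed=0):
    """Illustrative CPU force layout; returns a list of (x, y) tuples.

    All constants are hypothetical defaults picked for this sketch.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(num_nodes)]

    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(num_nodes)]

        # Many-body repulsion: every pair of nodes pushes apart.
        for i in range(num_nodes):
            for j in range(i + 1, num_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                dist2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
                dist = math.sqrt(dist2)
                magnitude = repulsion / dist2
                fx = magnitude * dx / dist
                fy = magnitude * dy / dist
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy

        # Spring force: connected nodes pull toward each other.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring * dx
            force[a][1] += spring * dy
            force[b][0] -= spring * dx
            force[b][1] -= spring * dy

        # Gravity: every node is pulled toward the origin.
        for i in range(num_nodes):
            force[i][0] -= gravity * pos[i][0]
            force[i][1] -= gravity * pos[i][1]

        # Move each node along its net force, clamping the displacement
        # so that nodes that start very close do not fly apart.
        for i in range(num_nodes):
            mx = step * force[i][0]
            my = step * force[i][1]
            moved = math.hypot(mx, my)
            if moved > max_move:
                mx *= max_move / moved
                my *= max_move / moved
            pos[i][0] += mx
            pos[i][1] += my

    return [(x, y) for x, y in pos]
```

For example, `naive_force_layout(4, [(0, 1), (2, 3)])` returns four positions for two connected pairs. In this sketch, a pair joined by an edge settles closer together than an unconnected pair, because the spring adds to gravity in countering the repulsion.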
This article first appeared on nightingaledvs.com.