2024-08-05 20:24:56 +02:00

Following up on yesterday's post, I have also been looking at
graphs the other way around: going from a scalable database to a
manageable graph involving, e.g., just one segment.

There are currently two ways to do this: 1) exporting the graph,
and 2) streaming the graph to and from the graph database. The
first option is obviously the simpler one, but it doesn't always
meet our needs. The second option is often what you need when
multiple analysts work on the same graph.

## Option 1: Exporting the Graph
For the first option you can use Gremlin's GraphML save function:

    // Open the Titan graph stored in HBase
    conf = new BaseConfiguration();
    conf.setProperty("storage.backend", "hbase");
    conf.setProperty("storage.hostname", "sandbox.hortonworks.com");
    conf.setProperty("storage.port", "2181");
    g = TitanFactory.open(conf);

    // Write the whole graph to a GraphML file
    g.saveGraphML('test.graphml')

The exported graph can then be opened in tools such as Gephi.

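
If the full export is still too big to handle, one way to get down to a single segment is to filter the GraphML file before opening it. The following is only a rough sketch in Python, not something from my actual setup: it assumes each vertex carries a property that identifies its segment (the key name ``segment`` and the sample values are hypothetical), and that properties are written as ``<data key="...">`` elements directly under each ``<node>``.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"
# Keep the GraphML namespace unprefixed when the tree is serialized again
ET.register_namespace("", GRAPHML_NS)

# Tiny hand-written example file; the "segment" property is a hypothetical name
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="segment" for="node" attr.name="segment" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="1"><data key="segment">maltego</data></node>
    <node id="2"><data key="segment">maltego</data></node>
    <node id="3"><data key="segment">articles</data></node>
    <edge id="e1" source="1" target="2" label="linked"/>
    <edge id="e2" source="2" target="3" label="linked"/>
  </graph>
</graphml>"""


def extract_segment(graphml_text, key, value):
    """Return GraphML text with only the nodes whose property `key` equals
    `value`, plus the edges that connect two of the kept nodes."""
    root = ET.fromstring(graphml_text)
    graph = root.find(f"{{{GRAPHML_NS}}}graph")

    kept_ids = set()
    for node in graph.findall(f"{{{GRAPHML_NS}}}node"):
        # Collect the node's properties as {key: text}
        props = {d.get("key"): d.text
                 for d in node.findall(f"{{{GRAPHML_NS}}}data")}
        if props.get(key) == value:
            kept_ids.add(node.get("id"))
        else:
            graph.remove(node)

    # Drop every edge that would otherwise point at a removed node
    for edge in graph.findall(f"{{{GRAPHML_NS}}}edge"):
        if edge.get("source") not in kept_ids or edge.get("target") not in kept_ids:
            graph.remove(edge)

    return ET.tostring(root, encoding="unicode")


segment_graphml = extract_segment(SAMPLE, "segment", "maltego")
```

For a real export you would read ``test.graphml`` from disk instead of using ``SAMPLE`` and write the result to a new file. Edges crossing the segment boundary are dropped here; whether that is what you want depends on the analysis.
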
You can also use the Gephi database API plugin for
Rexster. There's a Blueprints repo [1] which extends that. Here is
a short how-to for getting started with the Gephi development
environment, taken from the wiki pages of the plugin [2]:

1. Get the plugins from [3] and [4]
2. Open Gephi and go to ``Tools > Plugins > Downloaded > "Add Plugins..."``
3. Press install and follow the guidance; at the end you should
   restart Gephi
4. Go to ``File > Import Database``
5. Add the Rexster configuration to ``/etc/graph/rexster.xml`` (if
   issues arise when importing the database, look at [5])

``rexster.xml`` should look like this:

    <graph>
      <graph-name>RexterGraph</graph-name>
      <graph-type>com.tinkerpop.rexster.config.RexsterGraphGraphConfiguration</graph-type>
      <graph-buffer-size>100</graph-buffer-size>
      <graph-location>http://192.168.109.128:8182/graphs/titan</graph-location>
    </graph>

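
Since a typo in this file is an easy way to end up with a failing import, it can help to sanity-check it first. Below is a minimal sketch in Python that parses the ``<graph>`` block and checks the four elements shown above. Treating all four as required, and the specific checks themselves, are assumptions on my part and not something the plugin documents; the function name is made up for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# The elements used in the example config; assumed (not confirmed) to be required
EXPECTED = ("graph-name", "graph-type", "graph-buffer-size", "graph-location")

EXAMPLE = """<graph>
  <graph-name>RexterGraph</graph-name>
  <graph-type>com.tinkerpop.rexster.config.RexsterGraphGraphConfiguration</graph-type>
  <graph-buffer-size>100</graph-buffer-size>
  <graph-location>http://192.168.109.128:8182/graphs/titan</graph-location>
</graph>"""


def check_rexster_config(xml_text):
    """Parse a <graph> block and return its settings as a dict.
    Raises ValueError when something looks obviously wrong."""
    graph = ET.fromstring(xml_text)
    if graph.tag != "graph":
        raise ValueError(f"expected <graph> as root element, got <{graph.tag}>")

    settings = {}
    for name in EXPECTED:
        value = graph.findtext(name)
        if value is None or not value.strip():
            raise ValueError(f"missing or empty <{name}>")
        settings[name] = value.strip()

    # The buffer size should be a plain non-negative integer
    if not settings["graph-buffer-size"].isdigit():
        raise ValueError("<graph-buffer-size> is not an integer")

    # The location should be an HTTP(S) URL pointing at the Rexster server
    url = urlparse(settings["graph-location"])
    if url.scheme not in ("http", "https") or not url.hostname:
        raise ValueError("<graph-location> is not an HTTP(S) URL")

    return settings


settings = check_rexster_config(EXAMPLE)
```

This only catches structural mistakes; it does not check that the Rexster server is actually reachable at the given location.
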
You should end up with something like this in Gephi:

![A Rexster Graph Import to Gephi, from a Titan database. The graph consists of a variety of segments, such as articles from an article system and imported Maltego graphs](/static/img/data/rexster-import-gephi.png)

A Rexster graph import to Gephi, from a Titan database. The graph
consists of a variety of segments, such as articles from an
article system and imported Maltego graphs.

By the way, here is the cluster on the right. There seem to be
some interesting patterns inside it, so I suspect it comes from a
Maltego graph:

![](/static/img/data/gephi-cluster-maltego.png)

## Option 2: The Gephi Streaming API
For the second option I found the Gephi graph streaming API
[6]. I currently find it a little limited, in that it can only
provide collaboration between two Gephi instances using a Jetty
web server. It's pretty cool, but doesn't offer the integration I
am looking for. I'll get back to this later.

[1] https://github.com/datablend/gephi-blueprints-plugin
[2] https://github.com/datablend/gephi-blueprints-plugin/wiki
[3] https://github.com/downloads/datablend/gephi-blueprints-plugin/org-gephi-lib-blueprints.nbm
[4] https://github.com/downloads/datablend/gephi-blueprints-plugin/org-gephi-blueprints-plugin.nbm
[5] https://github.com/datablend/gephi-blueprints-plugin/issues/1
[6] https://marketplace.gephi.org/plugin/graph-streaming/