Issue description
The Blazegraph Backup API (/blazegraph/backup) produces a corrupted blazegraph.jnl file when used for online backups while OTNode (OriginTrail V8 Node) is running. The backup file, generated with block=true and with or without compress=true, fails to restore properly, resulting in a java.lang.IllegalStateException: Invalid data checksum error when starting Blazegraph with the restored journal. This causes OTNode to fail with a "Cannot connect to Triple store" error and Blazegraph to return HTTP 503 Service Unavailable for SPARQL queries.
Expected behavior
The Backup API should produce a consistent, uncorrupted blazegraph.jnl file that can be restored to a functional Blazegraph instance, allowing OTNode to connect to the triple store and SPARQL queries to execute without errors.
Actual behavior
The backup file (blazegraph-backup.jnl or blazegraph-backup.jnl.gz) is corrupted. When used to replace the active blazegraph.jnl, Blazegraph fails to start, logging a java.lang.IllegalStateException: Invalid data checksum from address: 72130541568, size: 1104. OTNode reports "Cannot connect to Triple store (OtBlazegraph), repository: privateCurrent, located at: http://localhost:9999/ retry number: 2/10". SPARQL queries to http://localhost:9999/blazegraph/namespace/dkg/sparql return HTTP 503 Service Unavailable.
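Before swapping journals, it may help to rule out the compression layer as the source of damage. The sketch below is only an assumption about how one might check that; `check_gzip_layer` is a hypothetical helper name, and `gzip -t` verifies the gzip container only, not the internal consistency of the journal. Since the failure is reported with and without compress=true, a passing check here narrows the problem down rather than explaining it.

```shell
# Hypothetical helper: returns 0 if the gzip container is intact, non-zero otherwise.
# It does NOT validate the Blazegraph journal inside the archive.
check_gzip_layer() {
  gzip -t "$1" 2>/dev/null
}

# Hypothetical helper: prints a one-line verdict for a backup archive.
report_gzip_layer() {
  if check_gzip_layer "$1"; then
    echo "gzip layer OK: $1"
  else
    echo "gzip layer damaged or not a gzip file: $1"
  fi
}
```

For the compressed backup from this report, the call would be `report_gzip_layer /root/blazegraph-backup.jnl.gz`, run before `gunzip`.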
Steps to reproduce the problem
- Restart Blazegraph and OTNode:
systemctl restart blazegraph otnode
- Run the Backup API command:
BLAZE_URL="http://localhost:9999/blazegraph/backup?block=true&compress=true"
BLAZE_OUTPUT_FILE="/root/blazegraph-backup.jnl.gz"
curl -X POST --data-urlencode "file=${BLAZE_OUTPUT_FILE}" "${BLAZE_URL}"
Alternatively, run the backup without compression:
BLAZE_URL="http://localhost:9999/blazegraph/backup?block=true"
BLAZE_OUTPUT_FILE="/root/blazegraph-backup.jnl"
curl -X POST --data-urlencode "file=${BLAZE_OUTPUT_FILE}" "${BLAZE_URL}"
If compressed, decompress the backup:
gunzip /root/blazegraph-backup.jnl.gz
- Stop Blazegraph and OTNode, then replace the active blazegraph.jnl with blazegraph-backup.jnl:
systemctl stop blazegraph otnode
mv /root/ot-node/blazegraph.jnl /root/ot-node/blazegraph.jnl.bak
mv /root/blazegraph-backup.jnl /root/ot-node/blazegraph.jnl
- Restart both services:
systemctl restart blazegraph
sleep 5s
systemctl restart otnode
- Observe OTNode error: "Cannot connect to Triple store (OtBlazegraph), repository: privateCurrent, located at: http://localhost:9999/ retry number: 2/10".
- Run a SPARQL query:
curl -X POST http://localhost:9999/blazegraph/namespace/dkg/sparql -H "Content-Type: application/sparql-query" --data 'SELECT (COUNT(*) AS ?totalTriples) WHERE { ?s ?p ?o }'
Observe response: HTTP 503 Service Unavailable.
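The fixed `sleep 5s` and the manual query in the steps above could be replaced by a small readiness poll, which makes the 503 visible as a status code. This is a minimal sketch, not part of the original report: the function names, the default of 10 attempts, and the 3-second delay are assumptions, and a cheap `ASK` query is swapped in for the `COUNT(*)` query to keep each probe light. The endpoint URL is the one from the report.

```shell
# Endpoint taken from the report; override SPARQL_URL for another namespace.
SPARQL_URL="${SPARQL_URL:-http://localhost:9999/blazegraph/namespace/dkg/sparql}"

# Returns 0 only when the given HTTP status code is 200.
is_ready_status() {
  [ "$1" = "200" ]
}

# Prints the HTTP status code of a cheap ASK query.
# curl prints 000 when no HTTP response was received.
sparql_status() {
  curl -s -o /dev/null -w '%{http_code}' -X POST "$SPARQL_URL" \
    -H 'Content-Type: application/sparql-query' \
    --data 'ASK { ?s ?p ?o }'
}

# Polls the endpoint up to $1 times, $2 seconds apart (assumed defaults: 10 and 3).
# Returns 0 as soon as the endpoint answers 200, 1 if it never does.
wait_for_blazegraph() {
  attempts="${1:-10}"
  delay="${2:-3}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    code="$(sparql_status)"
    if is_ready_status "$code"; then
      echo "Blazegraph ready after ${i} attempt(s)"
      return 0
    fi
    echo "attempt ${i}/${attempts}: HTTP ${code}" >&2
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A possible use is `systemctl restart blazegraph && wait_for_blazegraph && systemctl restart otnode`. With the restored journal described here, the poll would be expected to log HTTP 503 on every attempt and return 1, so OTNode is never restarted against an unavailable triple store.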
Specifications
Node version: OriginTrail node v8.0.11
Platform: Ubuntu 24.04 LTS
Node wallet: 0xe5Cc7fd75E87fD26EB6557236FE29566365Ba267
Node libp2p identity: 37
Error logs
Blazegraph logs (after restoring backup and restarting):
May 14 17:33:15 othub3 java[11878]: ERROR: Banner.java:134: Could not resolve name for host: java.net.UnknownHostException: othub3: othub3: Name or service not known
May 14 17:33:15 othub3 java[11878]: WARN : Banner.java:136: Falling back to null
May 14 17:33:15 othub3 java[11878]: WARN : NanoSparqlServer.java:517: Starting NSS
May 14 17:33:15 othub3 java[11878]: WARN : WebAppContext.java:554: Failed startup of context o.e.j.w.WebAppContext@5b94b04d{Bigdata,/blazegraph,jar:file:/root/ot-node/blazegraph.jar!/war,UNAVAILABLE}{jar:file:/root/ot-node/blazegraph.jar!/war}
May 14 17:33:15 othub3 java[11878]: java.lang.RuntimeException: java.lang.RuntimeException: addr=-19608250 : cause=java.lang.IllegalStateException: Invalid data checksum from address: 72130541568, size: 1104
May 14 17:33:15 othub3 java[11878]: at com.bigdata.rdf.sail.webapp.BigdataRDFServletContextListener.openIndexManager(BigdataRDFServletContextListener.java:816)
...
Caused by: java.lang.IllegalStateException: Invalid data checksum from address: 72130541568, size: 1104
May 14 17:33:15 othub3 java[11878]: at com.bigdata.rwstore.RWStore.getData(RWStore.java:2378)
...
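To see whether the same address fails on every start, the checksum errors can be pulled out of the service log and counted. The sketch below assumes the Blazegraph output lands in the systemd journal under the `blazegraph` unit, as the log prefix above suggests; `summarize_checksum_errors` is a hypothetical helper name, and the pattern is written against the exact message text shown in this report.

```shell
# Hypothetical helper: reads log text on stdin and prints each distinct
# checksum failure (address and size) prefixed by its occurrence count.
summarize_checksum_errors() {
  grep -oE 'Invalid data checksum from address: [0-9]+, size: [0-9]+' | sort | uniq -c
}
```

A possible invocation is `journalctl -u blazegraph --no-pager | summarize_checksum_errors`. A single repeated address would point at one damaged allocation in the journal, while many distinct addresses would suggest broader inconsistency; either reading is a hypothesis to check, not a diagnosis.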
SPARQL query response:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 Service Unavailable</title>
</head>
<body><h2>HTTP ERROR 503</h2>
<p>Problem accessing /blazegraph/namespace/dkg/sparql. Reason:
<pre> Service Unavailable</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.z-SNAPSHOT</a><hr/>
</body>
</html>
OTNode error:
Cannot connect to Triple store (OtBlazegraph), repository: privateCurrent, located at: http://localhost:9999 retry number: 2/10
Disclaimer
Please be aware that an issue reported on a public repository allows everyone to see your node logs, node details, and contact details. If you have any sensitive information, feel free to share it by sending an email to [email protected] instead.