Sometimes a Storage Node Operator sees the error "database disk image is malformed" in the log. This can happen after an unplanned shutdown or reboot. Any of the sqlite3 databases may become corrupted.
- First, try to verify the database with the built-in integrity check of SQLite3. This requires sqlite3 (v3.25.2 or later) to be installed. The installation steps depend on the OS.
- We will use Docker instead of direct installation (this option is available only for x86_64 CPUs; on ARM-based systems you will need to install sqlite3 via the package manager of your OS).
- Make a backup of all the sqlite3 databases. They are located in the storage folder of your data storage. For example, x:\storagenode is the data folder that you specified in the --mount type=bind,source=x:\storagenode,destination=/app/config option of the docker run command for your storagenode, so the databases are in x:\storagenode\storage. For the Windows GUI, the folder is the one specified in the storage.path: option, for example x:\storagenode\storage.
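The backup step can be scripted. The following is a minimal sketch, assuming a POSIX shell, that the storagenode is not running while the files are copied, and that the databases are the *.db files directly inside the storage folder. The function name backup_databases and the example paths are hypothetical.

```shell
# Sketch: copy every *.db file from the storage folder into a backup folder.
# backup_databases is a hypothetical helper, not part of the storagenode tooling.
backup_databases() {
    storage_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for db in "$storage_dir"/*.db; do
        # Skip the unexpanded pattern when no database files exist.
        [ -e "$db" ] || continue
        cp -p "$db" "$backup_dir"/ || return 1
        echo "backed up: $db"
    done
}

# Example call (adjust both paths to your setup):
# backup_databases /path/to/storage /path/to/db-backup
```

Keeping the backup outside the storage folder means the later repair steps cannot overwrite it by accident.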
Check for errors
Docker (repeat this step for each one of the databases):
docker run --rm -it --mount type=bind,source=x:\storagenode\storage\bandwidth.db,destination=/bandwidth.db sstc/sqlite3 sqlite3 /bandwidth.db "PRAGMA integrity_check;"
Direct installation (example for Debian-based Linux, which uses apt):
sudo apt update && sudo apt install sqlite3 -y
- Make sure that the version is v3.25.2 or later (sqlite3 --version prints it), otherwise the check will not work correctly.
- The check (perform for each database):
sqlite3 /path/to/storage/bandwidth.db "PRAGMA integrity_check;"
- If you see errors in the output, the check did not pass. In that case we will unload all uncorrupted data and load it back into a new database, although this can sometimes fail too. If no errors occur here, you can skip all the following steps and start the storagenode again.
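Because the check has to be repeated for every database, a loop can save some typing. This is a minimal sketch, assuming a directly installed sqlite3 and a POSIX shell; the function name check_databases and the example path are hypothetical. PRAGMA integrity_check prints the single line ok for a healthy database, so any other output is treated as a failure here.

```shell
# Sketch: run the integrity check on every *.db file in the storage folder.
# check_databases is a hypothetical helper name.
check_databases() {
    storage_dir=$1
    failed=0
    for db in "$storage_dir"/*.db; do
        [ -e "$db" ] || continue
        # A healthy database yields exactly "ok"; error text means the check failed.
        result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1)
        if [ "$result" = "ok" ]; then
            echo "OK:     $db"
        else
            echo "FAILED: $db"
            failed=1
        fi
    done
    # Non-zero exit status if at least one database did not pass.
    return "$failed"
}

# Example call: check_databases /path/to/storage
```

Only the databases reported as FAILED need the repair steps.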
If you were not lucky, then please try to fix the corrupted database(s) as shown below.
Tip. You can use tmpfs to restore your databases. It uses memory instead of disk and should take a lot less time than on an HDD (you can read more about the usage of tmpfs with Docker in the Use tmpfs mounts guide).
Open a shell inside the container:
docker run --rm -it --mount type=bind,source=x:\storagenode\storage,destination=/storage sstc/sqlite3 sh
- Or use your shell directly, if you have sqlite3 installed. Use the path to your storage instead of "/storage/".
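If you follow the tmpfs tip, the container can be started with an additional memory-backed mount, so that the dump and the rebuilt database are written to memory first. This is a sketch under assumptions: the mount point /ramdisk and the wrapper name open_repair_shell are hypothetical, and the available memory must be large enough for the database plus its dump.

```shell
# Sketch: open the sstc/sqlite3 container with the storage folder bound to /storage
# and a memory-backed scratch directory at /ramdisk (hypothetical mount point).
open_repair_shell() {
    storage_dir=$1
    docker run --rm -it \
        --mount "type=bind,source=$storage_dir,destination=/storage" \
        --mount "type=tmpfs,destination=/ramdisk" \
        sstc/sqlite3 sh
}

# Example call: open_repair_shell /path/to/storage
# Inside the container, work on a copy in memory and copy the result back:
#   cp /storage/bandwidth.db /ramdisk/
#   ...repair /ramdisk/bandwidth.db...
#   cp /ramdisk/bandwidth.db /storage/bandwidth.db
```

Anything left in the tmpfs mount is gone when the container exits, so the repaired file has to be copied back to /storage before leaving the shell.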
Now run these commands in the shell. You need to repeat the steps below, from making the copy to checking the size of the new file, for each one of the sqlite3 databases; any one of them may be corrupted.
cp /storage/bandwidth.db /storage/bandwidth.db.bak
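Open the database with sqlite3 (sqlite3 /storage/bandwidth.db); you will see the sqlite3 prompt. At that prompt, run the SQL script that unloads the data into /storage/dump_all.sql, which is the file edited in the next step.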
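A minimal sketch of such an unload, assuming the dump should be written to the /storage/dump_all.sql file that the next step reads, could use the standard sqlite3 dot-commands .mode, .output and .dump. The helper name dump_database is hypothetical, and the heredoc stands in for typing the commands at the prompt.

```shell
# Sketch: unload all readable data from a database into an SQL file.
# dump_database is a hypothetical helper name.
dump_database() {
    db=$1
    out=$2
    # The dot-commands below are read by sqlite3 from standard input.
    sqlite3 "$db" <<EOF
.mode insert
.output "$out"
.dump
.exit
EOF
}

# Example call: dump_database /storage/bandwidth.db /storage/dump_all.sql
```

For a badly damaged database sqlite3 may report errors while dumping; the resulting file then contains only the data that could still be read.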
Next, edit the SQL file to remove the transaction statements:
cat /storage/dump_all.sql | grep -v TRANSACTION | grep -v ROLLBACK | grep -v COMMIT >/storage/dump_all_notrans.sql
Remove the corrupted database (make sure that you have a backup!)
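A cautious way to do the removal is to delete the file only when the backup copy made earlier exists and is not empty. This is a minimal sketch, assuming the .bak naming from the cp command above; remove_corrupted is a hypothetical helper.

```shell
# Sketch: remove the corrupted database only if a non-empty backup sits next to it.
# remove_corrupted is a hypothetical helper name.
remove_corrupted() {
    db=$1
    if [ -s "$db.bak" ]; then
        rm -- "$db"
    else
        echo "no usable backup found at $db.bak, not removing $db" >&2
        return 1
    fi
}

# Example call: remove_corrupted /storage/bandwidth.db
```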
Now we will load the unloaded data into the new database
sqlite3 /storage/bandwidth.db ".read /storage/dump_all_notrans.sql"
Check that the new bandwidth.db has a size larger than 0:
ls -l /storage/bandwidth.db
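The same check can be done without reading the ls output by eye. This is a small sketch using the POSIX test operator -s, which is true only for an existing file whose size is greater than zero; verify_nonempty is a hypothetical name.

```shell
# Sketch: succeed only if the rebuilt database exists and is larger than 0 bytes.
verify_nonempty() {
    if [ -s "$1" ]; then
        echo "$1: size is larger than 0"
    else
        echo "$1: missing or empty" >&2
        return 1
    fi
}

# Example call: verify_nonempty /storage/bandwidth.db
```

A non-empty file only shows that something was loaded; running the integrity check on the new database again is a stronger confirmation.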
Exit from the container with the exit command (skip this step if you use a directly installed sqlite3).
If you are lucky and bandwidth.db and all other corrupted sqlite3 databases are fixed, then you can start the storagenode.
Warning. If you were not successful in fixing the database, then your statistics are lost.
To reduce the risk of database corruption in the future:
On Windows: disable the write cache.
On Docker: use the updated docker run command from the documentation: https://documentation.storj.io/setup/cli/storage-node#running-the-storage-node
Make sure that you are not using NFS or SMB to connect to the storage; they are not compatible with SQLite. The only working network protocol is iSCSI.
Make sure that your external USB drive has enough power and does not turn off during operation. It is better to avoid external USB drives and use only internal drives.