IPFS#

A distributed file system, similar in principle to BitTorrent (BT): files are split into chunks for storage and verification, each chunk and each level of the resulting hash tree corresponds to a CID, and a DHT (Distributed Hash Table) is used for lookup and routing.

IPFS Documentation#

https://docs.ipfs.io/ — mainly look at the Concepts and How-tos sections.
IPFS generates a different CID for each piece of content. If a fixed link is needed, it can be provided through IPNS; however, IPNS is not suitable for rapidly changing content, since updates propagate on a timescale of minutes.
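
As a rough sketch of how IPNS provides a stable name on top of changing CIDs (here <CID> and <PeerID> are placeholders, and the node is assumed to already hold the content):

# Publish a CID under this node's IPNS name
ipfs name publish /ipfs/<CID>
# Resolve the node's IPNS name back to the currently published CID
ipfs name resolve <PeerID>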

Some services using IPFS#

Gateways#

There is an official list of available gateways at https://ipfs.github.io/public-gateway-checker/, which can be used to pick a few tested, working gateways.
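
As a quick check that a gateway is working, any known CID can be requested over HTTP; for example, using the ipfs.io gateway and the go-ipfs readme CID that also appears later in this post:

# Fetch the go-ipfs readme through the public ipfs.io gateway
curl -s https://ipfs.io/ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme | head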

go-ipfs#

go-ipfs is the main open-source implementation of IPFS, written in Go: https://github.com/ipfs/go-ipfs/

Configuration instructions for go-ipfs#

Detailed configuration instructions can be found at https://github.com/ipfs/go-ipfs/blob/master/docs/config.md, which is more detailed than the manual.

Compiling go-ipfs#

Refer to https://github.com/ipfs/go-ipfs#download-and-compile-ipfs

The source code of the GitHub project is over 30MB, so cloning may take a while; if it is too slow or errors occur, a proxy may be needed. Compile with the make build command; most of the time is spent downloading dependencies.
Set GOPROXY before running it to avoid download timeouts.

# Check go version, must be above 1.13
go version
# Set proxy
git config --global https.proxy 'socks5://127.0.0.1:10080'
# Check
git config -l
# Clone the repository
git clone https://github.com/ipfs/go-ipfs.git
cd go-ipfs/
# Set GOPROXY
export GOPROXY=https://goproxy.cn
# Check
echo $GOPROXY
# Compile
make build
# Check compilation result
./cmd/ipfs/ipfs version

Running IPFS on Armbian for Amlogic S905L#

Download the arm64 build from the go-ipfs GitHub releases. The Armbian version here is 5.99, with kernel 5.3.0.

# Download
wget https://github.com/ipfs/go-ipfs/releases/download/v0.5.1/go-ipfs_v0.5.1_linux-arm64.tar.gz
# Extract
tar xvf go-ipfs_v0.5.1_linux-arm64.tar.gz
# Directly run install.sh to install, this script will copy the ipfs directory to /usr/local/bin
cd go-ipfs
./install.sh

Initialize the node and start it as a normal (non-root) user.

# Check version
ipfs --version
# Initialize node
ipfs init
# View instructions
ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme
 
# This command will not run in the background, it is recommended to create a screen session and execute it inside the screen session
ipfs daemon

If IPFS is not running on the current computer, the configuration needs to be modified in two places to access the web UI.

One is Addresses.API and Addresses.Gateway.

"Addresses": {
  ...
  "API": "/ip4/127.0.0.1/tcp/5001",
  "Gateway": "/ip4/127.0.0.1/tcp/8080"
}

Change the API address to /ip4/0.0.0.0/tcp/5001 to listen on all network interfaces. If the server has a public IP, this can be a security risk; in that case, bind it to the internal network interface address instead.
Change the Gateway address to an internal or public address as needed; if the IPFS node sits on an internal network, external access can go through NAT, so an internal address is sufficient.
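
Instead of editing ~/.ipfs/config by hand, the same changes can be made with the ipfs config command; a sketch, substituting addresses that fit your network (the internal address below is illustrative):

# Listen for API requests on all interfaces (or bind to an internal interface IP)
ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001
# Expose the gateway on an internal address
ipfs config Addresses.Gateway /ip4/192.168.13.25/tcp/8080
# Restart the daemon afterwards for the changes to take effect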

The other is API.HTTPHeaders, to avoid cross-origin (CORS) errors when accessing the IPFS web UI from a machine other than the one running IPFS.

"API": {
    "HTTPHeaders": {}
},

Add two entries under HTTPHeaders; the format can follow Gateway.HTTPHeaders.

"Access-Control-Allow-Methods": [
  "GET","PUT","POST"
],
"Access-Control-Allow-Origin": [
  "*"
]
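
The same headers can also be set from the command line with ipfs config --json; a sketch matching the values above:

# Allow the web UI to be loaded from another origin
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["*"]'
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Methods '["GET", "PUT", "POST"]'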

After modifying the configuration, restart IPFS; the web UI at http://IP:5001/webui/ then gives a graphical view of the current node status, the number and size of stored blocks, details of connected peers, and the current configuration. The interface even provides some demonstration files for browsing.

Through the Gateway, any file on IPFS whose CID can be resolved can be accessed. Accessed blocks are cached, so subsequent accesses use the cache directly instead of downloading from remote nodes again.

After selecting a file to upload, the transfer runs in the background (whether the tab can be closed has not been confirmed), and the growth in total storage can be seen on the status page. The delay before published content becomes visible varies with file size: files under 10MB can be discovered quickly through a remote gateway, while files close to 500MB may take roughly half an hour to an hour. This is related to how the node publishes its content CIDs to other nodes: a single block is 256KB (262144 bytes), so a 500MB file produces over 2,000 CIDs, which makes publishing take much longer.

When adding files in the web UI, the pin operation can only be performed after the upload completes. When adding a file by CID, submitting the CID alone does not trigger synchronization; pinning the CID makes IPFS start fetching the blocks from other nodes, and once the file has been fully synchronized it appears in the pin list.
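
The same pin-by-CID behavior is available from the command line; a sketch, with <CID> as a placeholder for the content to replicate locally:

# Fetch all blocks of the CID from other nodes and pin them
ipfs pin add <CID>
# Confirm that it now appears among the recursive pins
ipfs pin ls --type=recursive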

Installing as a service#

Create the file /etc/systemd/system/ipfs.service and write:

[Unit]
Description=IPFS Daemon
After=syslog.target network.target remote-fs.target nss-lookup.target
[Service]
Type=simple
ExecStart=/usr/local/bin/ipfs daemon --enable-namesys-pubsub
User=milton
[Install]
WantedBy=multi-user.target

Then add it through systemctl.
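
A minimal sketch of registering and starting the service (the same commands are used again in the cluster section below):

sudo systemctl daemon-reload
sudo systemctl enable ipfs
sudo systemctl start ipfs
sudo systemctl status ipfs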

Configuration instructions#

IPFS exposes three main ports: API, Gateway, and Swarm, among which:

  • API: defaults to port 5001, providing the web UI for managing and controlling IPFS. Care should be taken not to expose it to the public network when setting the listening interface.
  • Gateway: defaults to port 8080, providing content lookup and download services for ipfs/CID.
  • Swarm: defaults to port 4001, used for listening to requests from other IPFS nodes.
  • Addresses.NoAnnounce lists internal IPs to exclude from announcement; these addresses will not be announced. Be careful not to exclude 127.0.0.1 and ::1, as other nodes seem to use these to check whether the current node supports IPv4 or IPv6; if they are excluded, the node cannot maintain connections with other peers (it can ping and connect, but does not show up in swarm peers).
  • Swarm.AddrFilters lists internal IP ranges to ignore; peers that announce addresses within these ranges have those addresses filtered out.
  • Discovery.MDNS.Enabled: set to false to avoid mDNS peer discovery on the local network.
  • Peering.Peers lists peers that should be kept connected; see the sketch after this list for setting these options from the command line.
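
A sketch of adjusting a few of these options with ipfs config instead of editing the file directly (the filter range is illustrative, and the peer entry reuses the placeholder values from the Peering example below):

# Disable mDNS discovery on the local network
ipfs config --json Discovery.MDNS.Enabled false
# Ignore peers announcing addresses from a given internal range
ipfs config --json Swarm.AddrFilters '["/ip4/10.0.0.0/ipcidr/8"]'
# Keep a specific peer permanently connected
ipfs config --json Peering.Peers '[{"ID": "QmPeerID1", "Addrs": ["/ip4/18.1.1.1/tcp/4001"]}]'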

Fixed nodes#

For a self-built network, connections between your own nodes must be maintained. However, under IPFS's default behavior, even if the self-built nodes are set as bootstrap nodes, connections between them will still be closed once a node has been running for a while and its number of connected peers grows. To keep them connected, use the Peering section of the configuration file, formatted as follows (the second entry is the access address of ipfs.runfission.com).

{
  "Peering": {
    "Peers": [
      {
        "ID": "QmPeerID1",
        "Addrs": ["/ip4/18.1.1.1/tcp/4001"]
      },
      {
        "ID": "QmVLEz2SxoNiFnuyLpbXsH6SvjPTrHNMU88vCQZyhgBzgw",
        "Addrs": ["/ip4/3.215.160.238/tcp/4001", "/ip4/3.215.160.238/udp/4001/quic"]
      }
    ]
  }
  ...
}

For peers listed in the Peering configuration:

  1. In connection management, it will protect the connection between this node and the specified node; IPFS will never actively (automatically) close this connection, and it will not close this connection even when the connection count reaches the limit.
  2. The connection will be established at startup.
  3. If the connection is lost due to network reasons or the other node going offline, IPFS will continuously attempt to reconnect, with the interval between attempts randomly ranging from 5 seconds to 10 minutes.

Running IPFS under public NAT#

Operating environment#

The server with a public IP runs CentOS 7, with public IP 118.119.120.121 and internal IP 192.168.13.10.
The internal server runs Ubuntu 18.04, with internal IP 192.168.13.25.

Setting up port forwarding on the public server#

Configure the global forwarding switch.

firewall-cmd --permanent --zone=public --add-masquerade

Forward the public IP port 4002 to the internal server's port 4001.

# TCP port forwarding
firewall-cmd --permanent --zone=public --add-forward-port=port=4002:proto=tcp:toaddr=192.168.13.25:toport=4001
# UDP port forwarding
firewall-cmd --permanent --zone=public --add-forward-port=port=4002:proto=udp:toaddr=192.168.13.25:toport=4001
# Activate settings
firewall-cmd --reload
# Check
firewall-cmd --zone=public --list-all

If the gateway is an OpenWRT router, simply add forwarding rules in the Firewall -> Port Forwards section. Note that after adding the forwarding rules, you also need a traffic rule that allows WAN access to this device on this port.

Limiting the number of connected nodes#

Set the Swarm.ConnMgr.HighWater parameter. In version 0.6.0 this setting did not work well: after the server had been running for a long time, the number of connected peers would far exceed the limit. In version 0.7.0 it is effective.

"Swarm": {
    ...
    "ConnMgr": {
        "GracePeriod": "30s",
        "HighWater": 500,
        "LowWater": 100,
        "Type": "basic"
    },
    ...
}

Configure the IPFS service on the internal server.

# Installation process omitted
 
# Initialize as server mode
ipfs init --profile=server
 
# Modify configuration, see specific instructions below
vi .ipfs/config
 
# Start
ipfs daemon

The server mode has several changes compared to the normal mode:

  1. Addresses.NoAnnounce in server mode will list all internal IPs, which will not be announced.
  2. Swarm.AddrFilters in server mode will list all internal IPs, and peers connecting with internal IPs will be filtered out.
  3. Discovery.MDNS.Enabled in server mode is set to false to avoid initiating node searches in the internal network.

In addition to the API, Gateway, and API.HTTPHeaders settings used for normal nodes, you also need to configure Addresses.Announce with the public IP and forwarded port of this node.

"Announce": [
  "/ip4/118.119.120.121/tcp/4002",
  "/ip4/118.119.120.121/udp/4002/quic"
],
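
The announce addresses can also be written with ipfs config --json; a sketch using the example addresses above:

# Announce the public IP and forwarded port to other peers
ipfs config --json Addresses.Announce '["/ip4/118.119.120.121/tcp/4002", "/ip4/118.119.120.121/udp/4002/quic"]'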

Issues encountered#

Gateways configured with OpenWRT do not have this problem, but using CentOS as the gateway seems unable to pass the peer's real external IP through, so many third-party peers show up with the gateway's internal IP. Connections to these peers succeed, but ping fails. Therefore, the entries covering the gateway's subnet should be removed from Swarm.AddrFilters.

The specific reason can be seen in this node's swarm peers. Peers that this node connects to actively are recorded with their public IPs, but peers that connect in passively (i.e., through the gateway 118.119.120.121) are recorded with the gateway's internal IP (192.168.13.10). Under the AddrFilters rules these peers would be discarded, which leads to the situation where ipfs swarm connect succeeds but ipfs ping fails.

$ ipfs swarm peers
/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
/ip4/111.231.85.77/tcp/4001/p2p/QmWv1eLMNHPpwYKzREQBEpDfYjW6YXCrVpVyBZVjAuSd2i
...
/ip4/192.168.13.10/tcp/10041/p2p/QmXUZth5Pr2u1cW65F7gUeFwjkZFfduE1dwqiysNnrTwXd
/ip4/192.168.13.10/tcp/10053/p2p/QmPM3bepMUKpYTza67coD1Ar3efL7FPBFbGRMc42QLf4q9
/ip4/192.168.13.10/tcp/10202/p2p/QmVBzjW2MyNrSuR48GjcqB6SAJTnw8Y9zaFbgbegh4bRx4
/ip4/192.168.13.10/tcp/1024/p2p/QmbBEfFfw59Vya9CJuzgt4qj9f57sPZGiyK7fup8iKqkTr
/ip4/192.168.13.10/tcp/1025/p2p/QmcmhwCeLkBJvcq6KJzN58BRZg1B1N8m3uA3JyQKpVn64E
...
/ip4/192.168.13.10/udp/6681/quic/p2p/QmcSCBpek4YF5aAsY7bUMxiL7tacoYMeXUJUpU4wctqX4w
/ip4/192.168.13.10/udp/8050/quic/p2p/QmWeuXCNKAHfbineKMqo3U3dvVSz2em1w67pj5Up6tkUXo
/ip4/206.189.69.143/tcp/31071/p2p/12D3KooWHDr5W3Tse17mr4HSzuQm44dVQYp8Bb638mQknsyeHXSP
/ip4/206.189.69.250/tcp/30511/p2p/12D3KooWRd1BNPd8PMfxpCT7TNCFY4XSZsy8v8Cmm36H136yxzub
...

Next, check whether these passively connected peers can be pinged from this node; take one Peer ID and test it.

ipfs ping QmXUZth5Pr2u1cW65F7gUeFwjkZFfduE1dwqiysNnrTwXd
PING QmXUZth5Pr2u1cW65F7gUeFwjkZFfduE1dwqiysNnrTwXd.
Pong received: time=26.43 ms
Pong received: time=25.70 ms
Pong received: time=26.31 ms
...

This indicates that nodes recorded as internal IPs through the gateway are available and should be retained.

These nodes connected through the gateway can be divided into two categories: nodes without public IPs and nodes with public IPs:

  • For peers with public IPs, it is unclear whether this node dials back to their announced addresses after the initial connection. If it does, and updates each peer's address based on the result, these peers only briefly stay in the list shown with internal IPs before being updated to public addresses; other nodes can then reach them later through those public addresses.
  • For peers without public IPs, this node cannot dial back via an announced address and can only connect through the gateway's internal IP, so they remain in the list shown with internal IPs. These peers cannot be shared with other nodes.

Optimizing download speed#

If you want to read files via CID, first choose a faster gateway. Using the ipfs.io gateway to obtain files is the most reliable, but due to connection issues, the speed may be very slow.

If you want to distribute files by CID, make sure the gateway the other party uses is in the peer list of the IPFS instance that holds the file, so maintaining a list of fast gateways is important. Adding these gateways to your Peering.Peers greatly improves file publishing speed.
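
As a sketch, once a gateway's peer ID and multiaddress are known (placeholders below), the local node can connect to it directly, and the same entry can then be added to Peering.Peers to keep the connection:

# Connect to the gateway's IPFS node
ipfs swarm connect /ip4/<gateway-ip>/tcp/4001/p2p/<gateway-peer-id>
# Verify the connection
ipfs swarm peers | grep <gateway-peer-id>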

Upgrading IPFS#

For major version upgrades, use the fs-repo-migrations tool; refer to its documentation for instructions. It involves two steps: backing up the .ipfs directory and running the fs-repo-migrations command. The IPFS service must be stopped first.
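
A minimal sketch of the sequence, assuming the systemd service above and the default repository location in ~/.ipfs:

# Stop the running daemon first
sudo systemctl stop ipfs
# Back up the repository
cp -a ~/.ipfs ~/.ipfs.bak
# Migrate the repository, then install the new ipfs binary and restart
fs-repo-migrations
sudo systemctl start ipfs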

IPFS Desktop#

Install IPFS Desktop on Windows 10; by default it is installed in the user directory C:\Users\Milton\AppData\Local\Programs\IPFS Desktop. After installation, the program directory is 265MB,
and the data directory is C:\Users\Milton\.ipfs, with the same format and content as go-ipfs. Two IPFS Desktop processes and two ipfs processes run in the background, consuming about 500MB of memory.

The status window is essentially a webview displaying the content of the web UI.

The configuration only changes the connection manager water levels (LowWater/HighWater) from the defaults to 50 and 300; everything else remains the same.

Resources can be accessed if they exist (cached) on directly connected nodes; otherwise, they cannot be accessed.

IPFS Private Network and Cluster#

Refer to https://labs.eleks.com/2019/03/ipfs-network-data-replication.html

Run IPFS as a service, set to start on boot.

# Create service file
sudo vi /etc/systemd/system/ipfs.service
 
# File content
[Unit]
Description=IPFS Daemon
After=syslog.target network.target remote-fs.target nss-lookup.target
[Service]
Type=simple
ExecStart=/usr/local/bin/ipfs daemon --enable-namesys-pubsub
User=root
[Install]
WantedBy=multi-user.target
# End of file content
 
# Add service
sudo systemctl daemon-reload
sudo systemctl enable ipfs
sudo systemctl start ipfs
sudo systemctl status ipfs

Add IPFS Cluster as a service.

# Create service
sudo nano /etc/systemd/system/ipfs-cluster.service
 
# File content start, note that After includes the ipfs service to ensure startup order
[Unit]
Description=IPFS-Cluster Daemon
Requires=ipfs.service
After=syslog.target network.target remote-fs.target nss-lookup.target ipfs.service
[Service]
Type=simple
ExecStart=/home/ubuntu/gopath/bin/ipfs-cluster-service daemon
User=root
[Install]
WantedBy=multi-user.target
# End of file content
 
# Add to system services
sudo systemctl daemon-reload
sudo systemctl enable ipfs-cluster
sudo systemctl start ipfs-cluster
sudo systemctl status ipfs-cluster

Applications of IPFS#

In practical testing, for IPFS nodes on the public network, once a connection is established, CID publishing and reading are very fast: excluding inherent network latency, the time from request to the start of transmission is generally within 2 seconds, and the transfer speed depends on the bandwidth between the two endpoints.

File Sharing#

The indexing form of IPFS content is very suitable for file sharing among teams, as each modification generates an index change, allowing for version control. The distributed file access and download points facilitate cluster scalability, and the caching feature can reduce the impact of hot data on bandwidth resources.

Audio and Video Distribution#

IPFS could replace existing PT (private tracker) download networks. Since the individual files shared via PT are large, unified pin management is needed across all nodes in the group, ensuring access speed for hot content at every node while keeping enough replicas of long-tail content to prevent loss.

Streaming Media and Download Acceleration#

The characteristics of IPFS make it a natural replacement for CDN services, for static files such as images, CSS, JS, and compressed archives, as well as some live-streaming scenarios with less stringent latency requirements. With ISPs widely adopting IPv6, home broadband connections with public IPv6 addresses could be put to use for regional content acceleration.

libp2p#

libp2p is a well-packaged p2p library that has been split out of IPFS. It already includes mechanisms such as PeerId, Multiaddress, and protocol handlers, making it easy to build your own applications on top.

The Go language implementation can be found at https://github.com/libp2p/go-libp2p, with sample code at https://github.com/libp2p/go-libp2p-examples.

The usage in the code generally follows these steps:

  1. Create Host
  2. For the specified protocol, set StreamHandler on the Host
  3. If there are local ports providing services, create the corresponding service and listen on the port
  4. Specify the target node and protocol, create Stream
  5. Write data to the Stream
  6. Read data from the Stream, close or not close the Stream based on business needs

If the Host needs to keep running after startup, you can use the following methods:

// Method 1: block forever with an empty select
select {} // hang forever

// Method 2: use the built-in HTTP server's ListenAndServe
http.ListenAndServe(serveArgs, p)

// Method 3: block on a channel read
<-make(chan struct{}) // hang forever