Ways To Cover A Distance - A Dev's Perspective

The concept of “distance” isn’t just for maps and travel. In the realm of software development and systems engineering, “distance” can manifest in many forms: the latency between two servers, the number of lines separating two versions of code, or the geographical span a distributed system covers.

Understanding and effectively “covering” or measuring these distances is crucial for building robust, performant, and maintainable systems. This post dives into various interpretations of “distance” from a developer’s viewpoint, providing practical, runnable examples for each.

1. Measuring Network Distance: Latency & Connectivity

One of the most common “distances” we encounter is network latency. This is the time it takes for data to travel from one point to another. Minimizing this distance is vital for responsive applications.

Pinging for Basic Reachability and Latency

The ping command is your first stop for checking whether a host is reachable and for getting a rough idea of the round-trip time (RTT).

ping -c 4 google.com
PING google.com (142.250.186.110) 56(84) bytes of data.
64 bytes from dfw28s19-in-f14.1e100.net (142.250.186.110): icmp_seq=1 ttl=113 time=15.6 ms
64 bytes from dfw28s19-in-f14.1e100.net (142.250.186.110): icmp_seq=2 ttl=113 time=16.1 ms
64 bytes from dfw28s19-in-f14.1e100.net (142.250.186.110): icmp_seq=3 ttl=113 time=15.9 ms
64 bytes from dfw28s19-in-f14.1e100.net (142.250.186.110): icmp_seq=4 ttl=113 time=15.7 ms

--- google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 15.602/15.823/16.106/0.198 ms
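
If you want to track these numbers over time rather than read them by eye, the summary line is easy to parse. The following is a minimal sketch, assuming the Linux-style summary format shown above (other platforms word this line differently); parse_ping_rtt is a hypothetical helper name, not part of any tool.

```python
import re

# Matches the Linux-style summary line, e.g.
# "rtt min/avg/max/mdev = 15.602/15.823/16.106/0.198 ms"
RTT_PATTERN = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)

def parse_ping_rtt(output):
    """Return the RTT summary as a dict of floats (milliseconds), or None."""
    match = RTT_PATTERN.search(output)
    if match is None:
        return None
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, map(float, match.groups())))

# Sample text copied from the ping output above
sample = """--- google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 15.602/15.823/16.106/0.198 ms"""

stats = parse_ping_rtt(sample)
print(stats)
```

Returning None when the line is missing lets the caller decide how to treat an unreachable host or an unexpected output format.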

Tracing the Route: Hops and Bottlenecks

traceroute (or tracert on Windows) shows the path packets take to reach a destination, listing each hop (router) along with the round-trip time to that hop. This helps identify where network “distance” is being covered and where slowdowns might occur.

traceroute google.com
traceroute to google.com (142.250.186.110), 30 hops max, 60 byte packets
 1  _gateway (192.168.1.1)  0.370 ms  0.380 ms  0.372 ms
 2  <ISP_GATEWAY_IP> (<ISP_GATEWAY_IP>)  9.324 ms  9.326 ms  9.317 ms
 3  <ISP_HOP_1_IP> (<ISP_HOP_1_IP>)  10.021 ms  10.026 ms  10.016 ms
 4  <ISP_HOP_2_IP> (<ISP_HOP_2_IP>)  12.183 ms  12.190 ms  12.190 ms
 5  108.170.246.129 (108.170.246.129)  13.435 ms  13.435 ms  13.424 ms
 6  142.251.52.221 (142.251.52.221)  14.075 ms  14.067 ms  14.066 ms
 7  dfw28s19-in-f14.1e100.net (142.250.186.110)  15.864 ms  15.845 ms  15.834 ms
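
One rough way to read a trace like this is to look at how much latency each hop adds over the previous one. The sketch below does that with the first RTT sample of each hop, copied by hand from the output above; largest_latency_jump is a hypothetical helper. Treat the result as a hint rather than a diagnosis, since routers may answer probes slowly without actually slowing forwarded traffic.

```python
# (hop number, first RTT sample in ms), copied from the sample traceroute output
hops = [
    (1, 0.370),
    (2, 9.324),
    (3, 10.021),
    (4, 12.183),
    (5, 13.435),
    (6, 14.075),
    (7, 15.864),
]

def largest_latency_jump(hops):
    """Return (hop_number, added_ms) for the hop that adds the most latency."""
    previous_rtt = 0.0
    worst_hop, worst_delta = None, float("-inf")
    for hop, rtt in hops:
        delta = rtt - previous_rtt  # latency added since the previous hop
        if delta > worst_delta:
            worst_hop, worst_delta = hop, delta
        previous_rtt = rtt
    return worst_hop, worst_delta

hop, added_ms = largest_latency_jump(hops)
print(f"Largest jump at hop {hop}: +{added_ms:.3f} ms")
```

For this trace the biggest step is between the local gateway and the first ISP hop, which is typical for a home connection.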

Measuring Bandwidth Distance: Throughput

While latency is time, bandwidth is the capacity (how much data can cross the distance per second). iperf3 is a fantastic tool for measuring TCP and UDP throughput. You’ll need an iperf3 server running on one end.

First, start the server on one machine (e.g., server_A_ip):

# On server_A_ip
iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

Then, run the client from another machine (e.g., client_B_ip):

# On client_B_ip
iperf3 -c server_A_ip
Connecting to host server_A_ip, port 5201
[  5] local client_B_ip port 41184 connected to server_A_ip port 5201
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec  1.12 GBytes  9.63 Gbits/sec
[  5]   1.00-2.00   sec  1.12 GBytes  9.64 Gbits/sec
[  5]   2.00-3.00   sec  1.12 GBytes  9.61 Gbits/sec
[  5]   3.00-4.00   sec  1.12 GBytes  9.62 Gbits/sec
[  5]   4.00-5.00   sec  1.12 GBytes  9.63 Gbits/sec
[  5]   5.00-6.00   sec  1.12 GBytes  9.63 Gbits/sec
[  5]   6.00-7.00   sec  1.12 GBytes  9.64 Gbits/sec
[  5]   7.00-8.00   sec  1.12 GBytes  9.63 Gbits/sec
[  5]   8.00-9.00   sec  1.12 GBytes  9.62 Gbits/sec
[  5]   9.00-10.00  sec  1.12 GBytes  9.64 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  11.2 GBytes  9.63 Gbits/sec                  sender
[  5]   0.00-10.00  sec  11.2 GBytes  9.63 Gbits/sec                  receiver

iperf Done.

Note: The above iperf3 output shows high bandwidth, typical for a local network test. Over the internet, expect much lower numbers.
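
To see how latency and throughput combine, here is an idealized back-of-the-envelope estimate: one round trip plus payload size divided by throughput. It ignores TCP slow start, protocol overhead, and congestion. The 100 Mbit/s internet link and both RTT values are assumptions chosen for illustration, not measurements from this post.

```python
def estimate_transfer_seconds(size_bytes, throughput_bits_per_sec, rtt_seconds=0.0):
    """Idealized estimate: one round trip of latency plus size / throughput."""
    if throughput_bits_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return rtt_seconds + (size_bytes * 8) / throughput_bits_per_sec

one_gb = 1_000_000_000  # 1 GB (decimal) payload

# Local network: the 9.63 Gbit/s figure from the iperf3 run, assumed 0.4 ms RTT
lan = estimate_transfer_seconds(one_gb, 9.63e9, rtt_seconds=0.0004)

# Hypothetical internet link: assumed 100 Mbit/s and 16 ms RTT
wan = estimate_transfer_seconds(one_gb, 100e6, rtt_seconds=0.016)

print(f"LAN: {lan:.2f} s, WAN: {wan:.2f} s")
```

For large payloads the throughput term dominates; for small requests the round trip does, which is why both "distances" are worth measuring.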

HTTP/API Response Time Distance

When dealing with web services and APIs, the “distance” isn’t just network latency, but also the time the server takes to process the request. curl with its write-out (-w) option is excellent for this.

curl -s -o /dev/null -w "Connect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" https://www.google.com
Connect: 0.038590s
TTFB: 0.052825s
Total: 0.052932s
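
The fields are cumulative, each measured from the start of the request:

  • time_connect: Time until the TCP connection to the server was established (this includes the DNS lookup).
  • time_starttransfer: Time until the first byte was received from the server (TTFB). This includes connection time, the TLS handshake for HTTPS, sending the request, and server processing time.
  • time_total: Total time for the entire operation.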
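
Because curl reports each of these timings from the start of the request, subtracting neighbouring values gives a rough per-phase breakdown. The sketch below applies that to the sample numbers above; break_down_timings and its phase names are hypothetical, and the "wait" phase lumps together the TLS handshake, request upload, and server processing.

```python
def break_down_timings(time_connect, time_starttransfer, time_total):
    """Split curl's cumulative timings (seconds) into per-phase durations."""
    return {
        "connect": time_connect,
        "wait_for_first_byte": time_starttransfer - time_connect,
        "download": time_total - time_starttransfer,
    }

# Values copied from the sample curl output
phases = break_down_timings(0.038590, 0.052825, 0.052932)
for name, seconds in phases.items():
    print(f"{name}: {seconds * 1000:.3f} ms")
```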

2. Quantifying Code & File Differences: Version Control & Diffs

In software development, “distance” often refers to the changes between two versions of a file or codebase. Measuring and understanding this distance is fundamental for collaboration, debugging, and maintaining software.

The diff Utility: Comparing Files

The classic diff command shows you line-by-line differences between two files. This helps you quickly see the “distance” in content.

First, create two sample files:

echo -e "Line 1\nLine 2\nLine 3" > file1.txt
echo -e "Line A\nLine 2\nLine B\nLine 4" > file2.txt

Now, run diff:

diff file1.txt file2.txt
1c1
< Line 1
---
> Line A
3c3,4
< Line 3
---
> Line B
> Line 4

This output shows:

  • 1c1: Line 1 in file1 changed (c for changed) to line 1 in file2.
  • < Line 1: The content from file1.
  • > Line A: The content from file2.
  • 3c3,4: Line 3 in file1 changed to lines 3 and 4 in file2.
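
The same comparison can be done programmatically. The sketch below uses difflib from Python's standard library on the two sample files, held in memory as lists of lines; it is one possible approach and not how the diff utility itself is implemented.

```python
import difflib

file1 = ["Line 1", "Line 2", "Line 3"]
file2 = ["Line A", "Line 2", "Line B", "Line 4"]

matcher = difflib.SequenceMatcher(a=file1, b=file2)
opcodes = matcher.get_opcodes()

# Each opcode says how to turn a slice of file1 into a slice of file2
for tag, i1, i2, j1, j2 in opcodes:
    print(f"{tag:8} file1[{i1}:{i2}] -> file2[{j1}:{j2}]")

# ratio() is a similarity score in [0, 1]; 1.0 means identical sequences
similarity = matcher.ratio()
print(f"similarity: {similarity:.2f}")
```

The two "replace" opcodes correspond to the 1c1 and 3c3,4 hunks (with zero-based, end-exclusive indices), and the ratio gives a single number for how far apart the files are.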

git diff: Tracking Repository Distance

When working with version control systems like Git, git diff is indispensable. It shows the “distance” (changes) between commits, branches, or your working directory and the last commit.

Let’s set up a quick Git repo:

mkdir my_project && cd my_project
git init -b main
echo "Initial content" > README.md
git add README.md
git commit -m "Initial commit"
echo "Added a new line" >> README.md

Now, see the uncommitted changes:

git diff
diff --git a/README.md b/README.md
index 102c7f4..971e444 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,2 @@
 Initial content
+Added a new line

The + indicates an added line, showing the “distance” introduced by this new line.

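You can also compare between commits or branches. Commit the change first so that there are two commits to compare:

git commit -am "Add a new line"
git log --oneline
# (Note: the abbreviated hashes listed here, e.g., 'a1b2c3d', can be used in place of HEAD~1 and HEAD)
git diff HEAD~1 HEAD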
diff --git a/README.md b/README.md
index 102c7f4..971e444 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,2 @@
 Initial content
+Added a new line

This shows the difference between the current commit (HEAD) and the previous one (HEAD~1).
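
To turn a diff into a single number, you can count its added and removed lines. Git can report this itself (git diff --stat or --numstat); the sketch below is a deliberately naive version that works on unified diff text, with count_changed_lines as a hypothetical helper. It skips any line starting with "+++" or "---" as a file header, so it would miscount content lines that happen to begin that way.

```python
def count_changed_lines(diff_text):
    """Count added and removed lines in unified diff text (naive sketch)."""
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue  # file headers, not content changes
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed

# Sample text copied from the git diff output above
sample_diff = """\
diff --git a/README.md b/README.md
index 102c7f4..971e444 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,2 @@
 Initial content
+Added a new line
"""

added, removed = count_changed_lines(sample_diff)
print(f"+{added} / -{removed}")
```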

3. Calculating Geospatial Distance

For applications dealing with location (maps, delivery services, ride-sharing), “distance” means physical geographical separation. We often calculate this using coordinates (latitude and longitude).

Python and the Haversine Formula

The Haversine formula is commonly used to calculate the great-circle distance between two points on a sphere (like Earth) given their longitudes and latitudes.

import math

def haversine_distance(lat1, lon1, lat2, lon2):
    """
    Calculate the distance between two points on Earth using the Haversine formula.
    Latitudes and longitudes are in decimal degrees.
    Returns distance in kilometers.
    """
    R = 6371  # Earth radius in kilometers

    lat1_rad = math.radians(lat1)
    lon1_rad = math.radians(lon1)
    lat2_rad = math.radians(lat2)
    lon2_rad = math.radians(lon2)

    dlon = lon2_rad - lon1_rad
    dlat = lat2_rad - lat1_rad

    a = math.sin(dlat / 2)**2 + math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(dlon / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

    distance = R * c
    return distance

# Example: Distance between New York City and Los Angeles
# NYC: 40.7128° N, 74.0060° W
# LA: 34.0522° N, 118.2437° W

lat_nyc, lon_nyc = 40.7128, -74.0060
lat_la, lon_la = 34.0522, -118.2437

dist = haversine_distance(lat_nyc, lon_nyc, lat_la, lon_la)
print(f"Distance between NYC and LA: {dist:.2f} km")

# Example: Distance between London and Paris
# London: 51.5074° N, 0.1278° W
# Paris: 48.8566° N, 2.3522° E

lat_london, lon_london = 51.5074, -0.1278
lat_paris, lon_paris = 48.8566, 2.3522

dist_lp = haversine_distance(lat_london, lon_london, lat_paris, lon_paris)
print(f"Distance between London and Paris: {dist_lp:.2f} km")
Distance between NYC and LA: 3935.75 km
Distance between London and Paris: 343.56 km

Note: For accurate routing distances (e.g., driving distance, public transit), you’ll typically rely on specialized mapping APIs (like the Google Maps API or OpenStreetMap-based routing services), which consider road networks, traffic, and other factors beyond a simple great-circle calculation.

4. Bridging “Distance” in Distributed Systems & Data Transfer

In complex systems, “distance” can also refer to the logical or physical separation between components, data centers, or even distinct data sets. “Covering” this distance often means moving data, synchronizing states, or orchestrating tasks across different locations.

Efficient Data Transfer: rsync

rsync is a powerful utility for synchronizing files and directories, efficiently copying only the “distance” (differences) between source and destination. This is crucial for backups, deployments, and distributing large datasets.

First, create a source and destination directory, and some files:

mkdir source_data destination_data
echo "Hello, world!" > source_data/file1.txt
echo "Another line." >> source_data/file1.txt
echo "Unique file." > source_data/file2.txt

Now, transfer from source_data to destination_data:

rsync -av source_data/ destination_data/
sending incremental file list
./
file1.txt
file2.txt

sent 143 bytes  received 50 bytes  386.00 bytes/sec
total size is 35  speedup is 0.18

Let’s modify a file in source_data and add a new one:

echo "Updated content." >> source_data/file1.txt
echo "New file!" > source_data/file3.txt

Run rsync again to see it only transfer the “distance” (changes):

rsync -av source_data/ destination_data/
sending incremental file list
./
file1.txt
file3.txt

sent 143 bytes  received 50 bytes  386.00 bytes/sec
total size is 60  speedup is 0.31

Notice that file2.txt wasn’t re-transferred; only file1.txt (updated) and file3.txt (new) were sent. The byte counts in the summary lines are illustrative and will differ on your system.
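
By default, rsync decides which files to send with a "quick check" that compares file size and modification time. The sketch below imitates only that decision step for a flat directory; it is not rsync's implementation, it ignores the delta-transfer algorithm entirely, and files_needing_sync and write are hypothetical helpers.

```python
import os
import shutil
import tempfile

def files_needing_sync(src_dir, dst_dir):
    """List files in src_dir that are missing from dst_dir or differ in size/mtime."""
    pending = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        dst_path = os.path.join(dst_dir, name)
        if not os.path.isfile(src_path):
            continue  # keep the sketch flat: skip subdirectories
        if not os.path.exists(dst_path):
            pending.append(name)
            continue
        src_stat, dst_stat = os.stat(src_path), os.stat(dst_path)
        same_size = src_stat.st_size == dst_stat.st_size
        same_mtime = int(src_stat.st_mtime) == int(dst_stat.st_mtime)
        if not (same_size and same_mtime):
            pending.append(name)
    return pending

def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "source_data")
    dst = os.path.join(root, "destination_data")
    os.mkdir(src)
    os.mkdir(dst)
    write(os.path.join(src, "file1.txt"), "Hello, world!\nAnother line.\n")
    write(os.path.join(src, "file2.txt"), "Unique file.\n")

    first_pass = files_needing_sync(src, dst)
    for name in first_pass:
        # copy2 preserves the modification time, like rsync -a does
        shutil.copy2(os.path.join(src, name), os.path.join(dst, name))

    write(os.path.join(src, "file1.txt"),
          "Hello, world!\nAnother line.\nUpdated content.\n")
    write(os.path.join(src, "file3.txt"), "New file!\n")
    second_pass = files_needing_sync(src, dst)

print(first_pass)
print(second_pass)
```

The second pass skips file2.txt because its size and timestamp still match the copy in the destination.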

Copying Files in Kubernetes: kubectl cp

In cloud-native environments, your applications often run inside containers on remote nodes. kubectl cp bridges the “distance” between your local machine and a running pod, allowing you to copy files to or from it. Note that kubectl cp requires the tar binary to be present in the container image.

First, let’s deploy a simple Nginx pod.

kubectl run nginx --image=nginx --port=80
kubectl wait --for=condition=ready pod/nginx --timeout=90s

Now, create a local file and copy it into the Nginx pod:

echo "This is a test file for the Nginx pod." > local_test.txt
kubectl cp local_test.txt nginx:/tmp/container_test.txt
# No explicit output for successful copy.

Verify the file is in the pod by executing a command inside it:

kubectl exec nginx -- cat /tmp/container_test.txt
This is a test file for the Nginx pod.

To copy a file from the pod to your local machine:

kubectl cp nginx:/etc/nginx/nginx.conf ./nginx.conf.local
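# Usually no output on success; when copying from an absolute path you may see the harmless warning "tar: Removing leading `/' from member names".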

Now check your local directory:

ls nginx.conf.local
nginx.conf.local

This effectively “covers the distance” of file transfer between your local development environment and a remote containerized application.
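
If you copy files as part of a script, a thin wrapper around the CLI keeps the pod:path convention in one place. This is a minimal sketch that assumes kubectl is on your PATH and already configured for the cluster; kubectl_cp_args and copy_to_pod are hypothetical helpers, and only the -n (namespace) and -c (container) flags are used.

```python
import subprocess

def kubectl_cp_args(source, destination, namespace=None, container=None):
    """Build the argument list for `kubectl cp`; pod paths use the pod:path form."""
    args = ["kubectl", "cp", source, destination]
    if namespace:
        args += ["-n", namespace]
    if container:
        args += ["-c", container]
    return args

def copy_to_pod(local_path, pod, remote_path, **options):
    """Run kubectl cp and raise CalledProcessError if it exits non-zero."""
    args = kubectl_cp_args(local_path, f"{pod}:{remote_path}", **options)
    subprocess.run(args, check=True)

# Building the command does not need a cluster; running it does
upload = kubectl_cp_args("local_test.txt", "nginx:/tmp/container_test.txt")
print(" ".join(upload))
```

Passing the arguments as a list avoids shell quoting issues with unusual file names.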

Asynchronous Communication: Message Queues

While not a direct “measurement” of distance, message queues (like Apache Kafka, RabbitMQ, Redis Streams) are designed to bridge the temporal and spatial “distance” between different services or components in a distributed system. They allow producers to send data without knowing or waiting for consumers, enabling loose coupling and resilience.

Though a full working example would be too extensive for a blog post section, conceptually, a producer “sends” data across a conceptual distance to a queue, and a consumer “receives” it from that queue, potentially at a much later time or from a different geographical location. This system covers the distance by making the interaction asynchronous and robust.
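
As a much smaller stand-in for a real broker, the decoupling can be sketched in a single process with Python's standard queue and threading modules. This is not the Kafka, RabbitMQ, or Redis API; it only illustrates that the producer can finish before any consumer exists. The event names and the None sentinel are assumptions made for the example.

```python
import queue
import threading

message_queue = queue.Queue()
received = []

def producer():
    for i in range(3):
        message_queue.put(f"event-{i}")  # returns immediately; no consumer needed yet
    message_queue.put(None)  # sentinel: no more messages

def consumer():
    while True:
        message = message_queue.get()
        if message is None:
            break
        received.append(message)

producer_thread = threading.Thread(target=producer)
producer_thread.start()
producer_thread.join()  # the producer has finished before any consumer exists

consumer_thread = threading.Thread(target=consumer)
consumer_thread.start()
consumer_thread.join()

print(received)
```

A real message broker adds what this sketch lacks: persistence, delivery across machines, and multiple independent consumers.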

Conclusion

The concept of “distance” in development is multifaceted. Whether you’re optimizing network paths, comparing code versions, calculating geographical coordinates, or moving data across distributed systems, understanding and managing these distances is key to building efficient and scalable software. By leveraging the right tools and techniques, you can effectively measure, reduce, or bridge these gaps, ensuring your systems are performant, reliable, and maintainable.
