4x faster network file sync with rclone (vs rsync) (2025) (jeffgeerling.com)
237 points by indigodaddy 4 days ago | 116 comments
  • digiown8 hours ago

    Note there is no intrinsic reason running multiple streams should be faster than one [EDIT: "at this scale"]. It almost always indicates some bottleneck in the application or TCP tuning. (Though, very fast links can overwhelm slow hardware, and ISPs might do some traffic shaping too, but this doesn't apply to local links).

    SSH was never really meant to be a high performance data transfer tool, and it shows. For example, it has a hardcoded maximum receive buffer of 2MiB (separate from the TCP one), which drastically limits transfer speed over high BDP links (even a fast local link, like the 10gbps one the author has). The encryption can also be a bottleneck. hpn-ssh [1] aims to solve this issue but I'm not so sure about running an ssh fork on important systems.

    1. https://github.com/rapier1/hpn-ssh
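
    For a rough sense of where the bottleneck sits, it can help to compare raw TCP throughput with the same path pushed through SSH; hostnames below are placeholders:

      # raw TCP, no encryption: run "iperf3 -s" on the receiver, then on the sender:
      iperf3 -c receiver -t 10

      # same path through a single SSH channel (2 MiB window plus cipher overhead)
      dd if=/dev/zero bs=1M count=4096 | ssh receiver 'cat > /dev/null'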

    • bscphil38 minutes ago |parent

      > TCP tuning

      I think a lot of file transfer issues that occur outside of the corporate intranet world involve hardware that you don't fully control on (at least) one end. In science, for example, transferring huge amounts of data over long distances is pretty common, and I've had to do this on boxes that had poor TCP buffer configurations. Being able to multiplex your streams in situations like this is invaluable and I'd love to see more open source software that does this effectively, especially if it can punch through a firewall.

    • Aurornis7 hours ago |parent

      > Note there is no intrinsic reason running multiple streams should be faster than one.

      The issue is the serialization of operations. There is overhead for each operation which translates into dead time between transfers.

      However there are issues that can cause singular streams to underperform multiple streams in the real world once you reach a certain scale or face problems like packet loss.

      • nh26 hours ago |parent

        Is it certain that this is the reason?

        rsync's man page says "pipelining of file transfers to minimize latency costs" and https://rsync.samba.org/how-rsync-works.html says "Rsync is heavily pipelined".

        If pipelining is really in rsync, there should be no "dead time between transfers".

        • dekhn4 hours ago |parent

          The simple model for scp and rsync (it's likely more complex in rsync): loop over all files; for each file, determine its metadata with fstat, then fopen it and copy bytes in chunks until done, then proceed to the next iteration.

          I don't know what rsync does on top of that (pipelining could mean many different things), but my empirical experience is that copying one 1 TB file is far faster than copying 1 billion 1k files (both sum to ~1 TB), and that load balancing/partitioning/parallelizing the tool when copying large numbers of small files leads to significant speedups, likely because the per-file overhead is hidden by the parallelism (in addition to dealing with individual copies stalling due to TCP or whatever else).

          I guess the question is whether rsync is using multiple threads or otherwise accessing the filesystem in parallel, which I do not think it does, while tools like rclone, kopia, and aws sync all take advantage of parallelism (multiple ongoing file lookups and copies).
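
          As a rough illustration of the serial-vs-parallel difference (paths are placeholders, and the rclone remote is assumed to be already configured):

            # rsync: one process walks the tree and copies files one after another
            rsync -a /data/src/ host:/data/dst/

            # rclone: several files in flight at once, plus parallel directory scanning
            rclone copy /data/src remote:data/dst --transfers 16 --checkers 32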

          • nh24 hours ago |parent

            > I guess the question is whether rsync is using multiple threads or otherwise accessing the filesystem in parallel

            No, that is not the question. Even Wikipedia explains that rsync is single-threaded. And even if it were multithreaded "or otherwise" used concurrent file IO:

            The question is whether rsync _transmission_ is pipelined or not, meaning: Does it wait for 1 file to be transferred and acknowledged before sending the data of the next?

            Somebody has to go check that.

            If yes: Then parallel filesystem access won't matter, because a network roundtrip has brutally higher latency than reading data sequentially off an SSD.

            • dekhn4 hours ago |parent

              Note that rsync on many small files is slow even within the same machine (across two physical devices), suggesting that the network roundtrip latency is not the major contributor.

        • spockz5 hours ago |parent

          I’m not sure why, but just like with scp, I’ve achieved significant speed-ups by tarring the directory first (optionally compressing it), transferring, and then unpacking. Maybe because it lets the tar-and-send and the receive-and-untar steps happen on different threads?

          • poke64625 minutes ago |parent

            One of my "goto" tools is copying files over a "tar pipe". This avoids the temporary tar file. Something like:

              tar cf - *.txt | ssh user@host tar xf - -C /some/dir/
          • lelandbatey4 hours ago |parent

            It's typically a disk-latency thing, as just stat-ing the many files in a directory can have significant latency implications (especially on spinning HDDs) vs opening a single file (the tar) and read()-ing that one file into memory before writing to the network.

            If copying a folder with many files is slower than tarring that folder and then moving the tar (but not counting the untar), then disk latency is your bottleneck.

            • ahartmetz3 hours ago |parent

              Not useful very often, but fast and kind of cool: you can also just netcat the whole block device if you wanted a full filesystem copy anyway. Optionally zero all empty space first with a tool like zerofree, and use on-the-fly compression / decompression with lz4 or lzo. Of course, none of the block devices should be mounted, though you could probably get away with a source that's mounted read-only.

              dd is not a magic tool that can deal with block devices while others can't. You can just cp myLinuxInstallDisk.iso to /dev/myUsbDrive, too.
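
              Roughly like this (netcat syntax varies between variants, device names are placeholders, and nothing should be mounted read-write):

                # receiver: listen, decompress, write the block device
                nc -l 9000 | lz4 -d | dd of=/dev/sdb bs=1M

                # sender: read the device, compress on the fly, ship it over
                dd if=/dev/sda bs=1M | lz4 | nc receiver 9000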

            • spockz2 hours ago |parent

              Okay. In this case the whole operation is faster end to end. That includes the time it takes to tar and untar. Maybe those programs do something more efficient in disk access than scp and rsync?

      • wmf6 hours ago |parent

        The ideal solution to that is pipelining but it can be complex to implement.

    • mprovost8 hours ago |parent

      In general TCP just isn't great for high performance. In the film industry we used to use a commercial product Aspera (now owned by IBM) which emulated ftp or scp but used UDP with forward error correction (instead of TCP retransmission). You could configure it to use a specific amount of bandwidth and it would just push everything else off the network to achieve it.

      • nh25 hours ago |parent

        What does "high performance" mean here?

        I get 40 Gbit/s over a single localhost TCP stream on my 10 years old laptop with iperf3.

        So the TCP does not seem to be a bottleneck if 40 Gbit/s is "high" enough, which it probably is currently for most people.

        I have also seen plenty of situations in which TCP is faster than UDP in datacenters.

        For example, on Hetzner Cloud VMs, iperf3 gets me 7 Gbit/s over TCP but only 1.5 Gbit/s over UDP. On Hetzner dedicated servers with 10 Gbit links, I get 10 Gbit/s over TCP but only 4.5 Gbit/s over UDP. But this could also be due to my use of iperf3 or its implementation.

        I also suspect that TCP being a protocol whose state is inspectable by the network equipment between endpoints allows implementing higher performance, but I have not validated if that is done.
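
        For reference, the kind of iperf3 invocations I mean (single vs multiple streams, TCP vs UDP; hostname is a placeholder):

          iperf3 -s                     # on the server
          iperf3 -c server              # single TCP stream
          iperf3 -c server -P 4         # four parallel TCP streams
          iperf3 -c server -u -b 10G    # UDP at a 10 Gbit/s target rate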

        • KaiserPro2 hours ago |parent

          Aspera was/is designed for high-latency links, i.e. sending multiple terabytes from London to New Zealand, or LA.

          For that use case, Aspera was the best tool for the job. It's designed to be fast over links that a single TCP stream couldn't saturate.

          You could, if you were so bold, stack up multiple TCP links and send data down those. You got the same speed, but possibly not the same efficiency. It was a fucktonne cheaper to do though.

        • wtallisan hour ago |parent

          > I get 40 Gbit/s over a single localhost TCP stream on my 10 years old laptop with iperf3.

          Do you mean literally just streaming data from one process to another on the same machine, without that data ever actually transiting a real network link? There's so many caveats to that test that it's basically worthless for evaluating what could happen on a real network.

        • mprovostan hour ago |parent

          High performance means transferring files from NZ to a director's yacht in the Mediterranean with a 40Mbps satellite link and getting 40Mbps, to the point that the link is unusable for anyone else.

      • digiown8 hours ago |parent

        There's an open source implementation that does something similar but for a more specific use case: https://github.com/apernet/tcp-brutal

        There's gotta be a less antisocial way though. I'd say using BBR and increasing the buffer sizes to 64 MiB does the trick in most cases.
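
        Something along these lines on Linux (needs root, applies system-wide, and both ends need it):

          sysctl -w net.ipv4.tcp_congestion_control=bbr
          sysctl -w net.core.rmem_max=67108864
          sysctl -w net.core.wmem_max=67108864
          sysctl -w net.ipv4.tcp_rmem="4096 131072 67108864"
          sysctl -w net.ipv4.tcp_wmem="4096 16384 67108864"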

        • tclancy6 hours ago |parent

          Have you tried searching for "tcp-kind"?

        • Onavoan hour ago |parent

          Looks unmaintained.

          Can we throw a bunch of AI agents at it? This sounds like a pretty tightly defined problem, much better than wasting tokens on re-inventing web browsers.

      • pezgrande7 hours ago |parent

        Was the torrent protocol considered at some point? Always surprised how little presence it has in the industry considering how good the technology is.

        • gruez6 hours ago |parent

          If you strip out the swarm logic (ie. downloading from multiple peers), you're just left with a protocol that transfers big files via chunks, so there's no reason that'd be faster than any other sort of download manager that supports multi-thread downloads.

          https://en.wikipedia.org/wiki/Download_manager
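
          e.g. an ordinary multi-connection download manager gets you the same effect (URL is a placeholder):

            aria2c -x 16 -s 16 https://example.com/big.iso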

        • KaiserPro2 hours ago |parent

          Aspera did the chunking and encryption for you, and it looked and acted like SFTP.

          The cost of leaking data was/is catastrophic (as in company-ending), so paying a bit of money to guarantee that your data was being sent to the right place (point to point) and couldn't leak was a worthwhile tradeoff.

          For point-to-point transfers, torrenting has a lot higher overhead than you want. Plus, most clients have an anti-leeching setting, so you'd need not only a custom client, but a custom protocol as well.

          The idea is sound though: have an index file and then a list of chunks to pull over multiple TCP connections.

        • ambicapter7 hours ago |parent

          torrent is great for many-to-one type downloads but I assume GP is talking about single machine to single machine transfers.

      • robaatoan hour ago |parent

        So what do you use now in the film industry?

        • magarnicle44 minutes ago |parent

          I'm in a tiny part of the film industry. Bigger clients lend us licenses to Aspera and FileCatalyst when receiving files from them, but for our own trans-oceanic transfers I dug up an ancient program called Tsunami UDP and fixed it up just enough.

        • mprovostan hour ago |parent

          I suspect mostly Aspera because there are still no good alternatives.

      • adolph6 hours ago |parent

        Aspera's FASP [0] is very neat. One drawback is that, since the reliability work isn't done the traditional TCP way, it must be done on the CPU: say one packet is missing, or packets are sent out of order, the Aspera client fixes that instead of all of it being handled by TCP.

        As I understand it, this is also the approach of WEKA.io [1]. Another approach is RDMA [2] used by storage systems like Vast which pushes those order and resend tasks to NICs that support RDMA so that applications can read and write directly to the network instead of to system buffers.

        0. https://en.wikipedia.org/wiki/Fast_and_Secure_Protocol

        1. https://docs.weka.io/weka-system-overview/weka-client-and-mo...

        2. https://en.wikipedia.org/wiki/Remote_direct_memory_access

        • mprovostan hour ago |parent

          FASP uses forward error correction instead of retransmission. So instead of waiting for something not to show up on the other end and sending it again, it calculates parity and transmits slightly more data up front, with enough redundancy that the receiving end is capable of reconstructing any missing bits. This is basically how all storage systems work, not just Weka. You calculate enough parity bits to be able to reconstruct the missing data when a drive fails. The more disks you have, the smaller the parity overhead is. Object storage like S3 does this on a massive scale. With a network transfer you typically only need a few percent, unless it's really lossy like Wifi, in which case standards like 802.11n are doing FEC for you to reduce retransmissions at the TCP layer.

    • nh24 hours ago |parent

      > has a hardcoded maximum receive buffer of 2MiB

      For completeness, I want to add:

      The 2MiB are per SSH "channel" -- the SSH protocol multiplexes multiple independent transmission channels over TCP [1], and each one has its own window size.

      rsync and `cat | ssh | cat` only use a single channel, so if their counterparty is an OpenSSH sshd server, their throughput is limited by the 2MiB window limit.

      rclone seems to be able to use multiple ssh channels over a single connection; I believe this is what the `--sftp-concurrency` setting [2] controls.
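
      For illustration, something like this should keep more requests in flight (assuming an sftp remote named `remote:` is already configured; the paths are placeholders):

        rclone copy /data/src remote:dst --sftp-concurrency 64 --transfers 8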

      Some more discussion about the 2MiB limit and links to work for upstreaming a removal of these limits can be found in my post [3].

      Looking into it just now, I found that the SSH protocol itself already supports dynamically growing per-channel window sizes with `CHANNEL_WINDOW_ADJUST`, and OpenSSH seems to generally implement that. I don't fully grasp why it doesn't just use that to extend as needed.

      I also found that there's an official `no-flow-control` extension with the description

      > channel behaves as if all window sizes are infinite.
      >
      > This extension is intended for, but not limited to, use by file transfer applications that are only going to use one channel and for which the flow control provided by SSH is an impediment, rather than a feature.

      So this looks like it was designed exactly for the rsync case. But no software implements this extension!

      I wrote those things down in [4].

      It is frustrating to me that we're only a ~200 line patch away from "unlimited" instead of shitty SSH transfer speeds -- for >20 years!

      [1]: https://datatracker.ietf.org/doc/html/rfc4254#section-5

      [2]: https://rclone.org/sftp/#sftp-concurrency

      [3]: https://news.ycombinator.com/item?id=40856136

      [4]: https://github.com/djmdjm/openssh-portable-wip/pull/4#issuec...

    • softfalcon6 hours ago |parent

      > It almost always indicates some bottleneck in the application or TCP tuning.

      Yeah, this has been my experience with low-overhead streams as well.

      Interestingly, I see this "open more streams to send more data" pattern all over the place in file transfer tooling.

      Recent ones that come to mind have been BackBlaze's CLI (B2) and taking a peek at Amazon's SDK for S3 uploads with Wireshark. (What do they know that we don't seem to think we know?)

      It seems like they're all doing this? Which is maybe odd, because when I analyse what Plex or Netflix is doing, it's not the same? They do what you're suggesting, tune the application + TCP/UDP stack. Though that could be due to their 1-to-1 streaming use case.

      There is overhead somewhere and they're trying to get past it via semi-brute-force methods (in my opinion).

      I wonder if there is a serialization or loss handling problem that we could be glossing over here?
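
      For what it's worth, the AWS CLI exposes the same "more parallelism" knobs directly, which hints at where they expect the overhead to be (bucket and file names are placeholders):

        aws configure set default.s3.max_concurrent_requests 20
        aws configure set default.s3.multipart_chunksize 64MB
        aws s3 cp big.bin s3://my-bucket/big.bin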

      • KaiserPro2 hours ago |parent

        Memory and CPU are cheap (up to a point) so why not just copy/paste TCP streams. It neatly fits into multi-processing/threading as well.

        When we were doing 100TB backups of storage servers we had a wrapper that ran multiple rsyncs over the file system; that got throughput up to about 20 gigabits a second over the LAN.
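
        A minimal sketch of that pattern (paths are placeholders, and it assumes no spaces in directory names):

          cd /data && ls -d */ | xargs -P 8 -I{} rsync -a {} backup-host:/backup/{}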

      • digiown5 hours ago |parent

        Tuning on Linux requires root and is systemwide. I don't think BBR is even available on other systems. And you need to tune the buffer sizes of both ends too. Using multiple streams is just less of a hassle for client users. It can also fool some traffic shaping tools. Internal use is a different story.

      • PunchyHamster4 hours ago |parent

        That is a different problem. For S3-esque transfers you might very well be limited by the target's willingness to accept X MB/s per connection and not more, so starting parallel streams will make it faster.

        I used B2 as the third leg for our backups and pretty much had to give rclone more connections at once because the defaults were nowhere close to saturating bandwidth.

      • akdev1l5 hours ago |parent

        not sure about B2 but AWS S3 SDK not assuming that people will do any tuning makes total sense

        cuz in my experience no one is doing that tbh

        • slightlygrilled2 hours ago |parent

          I’ve found that with aws s3 it’s always been painful to get any good speed out of it unless it’s massive files you’re moving.

          Its baseline tuning seems to just assume large files, does no auto-scaling, and is mostly single-threaded.

          Then even when tuning it’s still painfully slow, again seemingly limited by its CPU processing and mostly on a single thread, which is highly annoying.

          Especially when you’re running it on a high core, fast storage, large internet connection machine.

          Just feels like there is a large amount of untapped potential in the machines…

          • odo12422 hours ago |parent

            It’s almost certainly also tuned to prevent excessive or “spiky” traffic to their service.

    • oceanplexian7 hours ago |parent

      Uhh.. I work with this stuff daily and there are a LOT of intrinsic reasons a single stream would be slower than running multiple: MPLS ECMP hashing you over a single path, a single loss event with a high BDP causing congestion control to kick in for a single flow, CPU IRQ affinity, and probably many more I’m not thinking of, like the inner workings of NIC offloading queues.

      Source: Been in big tech for roughly ten years now trying to get servers to move packets faster

      • digiown7 hours ago |parent

        Ha, it sounds like the best way to learn something is to make a confident and incorrect claim :)

        > MPLS ECMP hashing you over a single path

        This is kinda like the traffic shaping I was talking about though, but fair enough. It's not an inherent limitation of a single stream, just a consequence of how your network is designed.

        > a single loss event with a high BDP

        I thought BBR mitigates this. Even if it doesn't, I'd still count that as a TCP stack issue.

        At a large enough scale I'd say you are correct that multiple streams are inherently easier to optimize throughput for. But probably not on a single 1-10 Gbit link.

        • PunchyHamster4 hours ago |parent

          > This is kinda like the traffic shaping I was talking about though, but fair enough. It's not an inherent limitation of a single stream, just a consequence of how your network is designed.

          It is. One stream gets you the traffic of one path to the infrastructure. Multiple streams get you multiple, and possibly also hit different servers to accelerate it even more. It's just that the limitation isn't hardware but "our networking device has 4 10Gbit ports instead of a single 40Gbit port".

          Especially if the link is saturated, you'd essentially be taking n times your "fair share" of bandwidth on it.

    • yegle8 hours ago |parent

      The author tried running the rsyncd daemon, so it's not _just_ the ssh protocol.

    • yason6 hours ago |parent

      > Note there is no intrinsic reason running multiple streams should be faster than one

      If the server side scales (as cloud services do) it might end up using different end points for the parallel connections and saturate the bandwidth better. One server instance might be serving other clients as well and can't fill one particular client's pipe entirely.

    • Saris7 hours ago |parent

      Wouldn't lots of streams speed up transfers of thousands of small files?

      • digiown7 hours ago |parent

        If the application handles them serially, then yeah. But one can imagine the application opening files in threads, buffering them, and then finally sending them at full speed, so in that sense it is an application issue. If you truly have millions of small files, you're more likely to be bottlenecked by disk IO performance rather than application or network, though. My primary use case for ssh streams is zfs send, which is mostly bottlenecked by ssh itself.
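
        For comparison, the usual workaround sketch for zfs send on a trusted LAN (pool, snapshot and host names are placeholders; mbuffer's network mode replaces the ssh pipe entirely):

          # plain: ssh is the bottleneck
          zfs send tank/data@snap | ssh host zfs receive backup/data

          # receiver first: mbuffer -I 9090 -s 128k -m 1G | zfs receive backup/data
          # then the sender:
          zfs send tank/data@snap | mbuffer -O host:9090 -s 128k -m 1G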

        • catdog6 hours ago |parent

          It's an application issue but implementation wise it's probably way more straightforward to just open a separate network connection per thread.

    • dekhn8 hours ago |parent

      Single file overheads (opening millions of tiny files whose metadata is not in the OS cache and reading them) appears to be an intrinsic reason (intrinsic to the OS, at least).

      • pixl977 hours ago |parent

        IOPs and disk read depth are common limits.

        Depending on what you're doing it can be faster to leave your files in a solid archive that is less likely to be fragmented and get contiguous reads.

      • PunchyHamster4 hours ago |parent

        The majority of that will be big files. And on NVMe it is VERY fast even if you run single-threaded; 10Gbit should be easy.

    • einpoklum2 hours ago |parent

      > Note there is no intrinsic reason running multiple streams should be faster than one

      Inherent reasons or no, it's been my experience across multiple protocols, applications, network connections and environments, and machines on both ends, that, _in fact_, splitting data up and operating using multiple streams is significantly faster.

      So, ok, it might not be because of an "inherent reason", but we still have to deal with it in real life.

    • patmorgan235 hours ago |parent

      I mean, isn't a single TCP connection's throughput limited by the latency? Which is why on high(er) latency WAN links you generally want to open multiple connections for large file transfers.

      https://wintelguy.com/wanperf.pl
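
      Roughly, a single connection's ceiling is the window size divided by the round-trip time. Taking the 2 MiB SSH channel window mentioned upthread as an example:

        2 MiB / 50 ms RTT = 40 MiB/s (about 335 Mbit/s), no matter how fast the link is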

  • ericpauley8 hours ago

    Rclone is a fantastic tool, but my favorite part of it is actually the underlying FS library. I've started baking Rclone FS into internal Go tooling and now everything transparently supports reading/writing to either local or remote storage. Really great for being able to test data analysis code locally and then running as batch jobs elsewhere.

    • absoflutely6 hours ago |parent

      What kind of data analysis do you run with Go and do you use an open source library for it? My experience with stats libraries in Go has been lukewarm so far.

    • rsync5 hours ago |parent

      "Rclone is a fantastic tool, but my favorite part of it is actually the underyling FS library."

      Related to this is the very useful:

        rclone serve restic ...
      
      .. workflow that allows you to create append-only (immutable) backups.

      This howto is not rsync.net-specific - you can follow this recipe at any standard SSH endpoint:

      https://www.rsync.net/resources/notes/2025-q4-rsync.net_tech...

  • coreylane8 hours ago

    RClone has been so useful over the years I built a fully managed service on top of it specifically for moving data between cloud storage providers: https://dataraven.io/

    My goal is to smooth out some of the operational rough edges I've seen companies deal with when using the tool:

      - Team workspaces with role-based access control
      - Event notifications & webhooks – Alerts on transfer failure or resource changes via Slack, Teams, Discord, etc.
      - Centralized log storage
      - Vault integrations – Connect 1Password, Doppler, or Infisical for zero-knowledge credential handling (no more plain text files with credentials)
      - 10 Gbps connected infrastructure (Pro tier) – High-throughput Linux systems for large transfers
    • noname1208 hours ago |parent

      I hope that you sponsor the rclone project given that it’s the core of your business! I couldn’t find any indication online that you do give back to the project. I hope I’m wrong.

      • coreylane7 hours ago |parent

        I'm certainly planning on sponsoring the project as soon as possible, but so far I have zero paying customers, hopefully that will change soon

        • znnajdla6 hours ago |parent

          first thing that popped into my mind is that your free plan is crazy generous. cut it out.

          • PunchyHamster4 hours ago |parent

            first thing that popped into mine is $30/mo for running a vm with a command is something people will now just tell LLM to do

            • tonymet30 minutes ago |parent

              first thing that popped into my mind is that OP did a lot of hard work and doesn't need cynical and useless comments about it.

      • stronglikedan7 hours ago |parent

        that's just creepy and hella presumptuous

        • asacrowflies6 hours ago |parent

          Yeah I've seen this pop up in foss a lot lately and I don't like it.

      • sneak7 hours ago |parent

        Gifts do not confer obligation. If you give me a screwdriver and I use it to run my electrical installation service business, I don’t owe you a payment.

        This idea that one must “give back” after receiving a gift freely given is simply silly.

        • burnte5 hours ago |parent

          Yes but thank-yous are always good. Making sure the project sticks around is just smart.

        • MattGrommes6 hours ago |parent

          If your neighbor kept baking and giving you cookies, to the point where you were wrapping and reselling them at the market, don't you think you should do something for them in return?

      • jfbaro8 hours ago |parent

        Me too!

    • tonymet28 minutes ago |parent

      i had been thinking about this service for a long time, especially something supporting transforms and indexing for backups. great job spinning it up.

    • plasticsoprano8 hours ago |parent

      How do you deal with how poorly rclone handles rate limits? It doesn't honor dropbox's retry-after header and just adds an exponential back off that, in my migrations, has resulted in a pause of days.

      I've adjusted threads and the various other controls rclone offers but I still feel like I'm not seeing its true potential, because the second it hits a rate limit I can all but guarantee that job will have to be restarted with new settings.

      • darthShadow6 hours ago |parent

        > doesn't honor dropbox's retry-after header

        That hasn't been true for more than 8 years now.

        Source: https://github.com/rclone/rclone/blob/9abf9d38c0b80094302281...

        And the PR adding it: https://github.com/rclone/rclone/pull/2622

      • coreylane7 hours ago |parent

        I honestly haven't used it with Dropbox before; have you tried adjusting the --tpslimit 12 --tpslimit-burst 0 flags? Are you creating a dedicated API key for the transfer? Rate limits may vary between Plus/Advanced plans. forum.rclone.org is quite active; you may want to post more details there.

  • edvardsire3 hours ago

    Interesting that nobody has mentioned Warp speed Data Transfer (WDT) [1].

    From the readme:

    - Warp speed Data Transfer (WDT) is an embeddedable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.

    - Goal: Lowest possible total transfer time - to be only hardware limited (disc or network bandwidth not latency) and as efficient as possible (low CPU/memory/resources utilization)

    1. https://github.com/facebook/wdt

  • SloopJon2 hours ago

    The article links to a YouTube mini-review of USB enclosures from UGreen and Acasis, neither of which he loves.[1] I've been happy with the OWC 1M2 as a boot drive on a Mac Studio with Thunderbolt 5 ports.[2] I just noticed that there is an OWC 1M2 80G, based on USB4 v2.[3] I didn't know that was a thing, but I guess it's the USB cousin to Thunderbolt 5.

    [1] https://www.youtube.com/watch?v=gaV-O6NPWrI

    [2] https://eshop.macsales.com/shop/owc-express-1m2

    [3] https://eshop.macsales.com/item/OWC/US4V2EXP1M2/

  • newsoftheday7 hours ago

    I prefer rsync because of its delta transfer which doesn't resend files already on the destination, saving bandwidth. This combined with rsync's ability to work over ssh lets me sync anywhere rsync runs, including the cloud. It may not be faster than rclone but it is easier on bandwidth.

    • kbr20004 hours ago |parent

      The delta-transfer algorithm [0] is about detecting which chunks of a file differ on source and target [1], and limiting the transfer to those chunks. The savings depend on how and where they differ, and of course there are tradeoffs...

      You seem to be referring to the selection of candidate files to transfer (along several possible criteria like modification time, file size or file contents using checksumming) [2].

      Rsync is great. However, for huge filesystems (many files and directories) with relatively little change, you'll need to think about "assisting" it somewhat (by feeding it its candidates obtained in a more efficient way, using --files-from=). For example: in a renderfarm system you would have additions of files, not really updates. Keep a list of frames that have finished rendering (in a cinematic film production this could be e.g. 10h/frame), and use it to feed rsync. Otherwise you'll be spending hours for rsync to build its index (both sides) over huge filesystems, instead of transferring relatively few big and new files.

      In workloads where you have many sync candidates (files) that have a majority of differing chunks, it might rather be worth disabling the delta-transfer algorithm (--whole-file) and saving on the tradeoffs.

      [0] https://www.andrew.cmu.edu/course/15-749/READINGS/required/c...

      [1] https://en.wikipedia.org/wiki/Rsync#Determining_which_parts_...

      [2] https://en.wikipedia.org/wiki/Rsync#Determining_which_files_...
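
      To make those two knobs concrete, a rough sketch (paths and host are placeholders):

        # feed rsync a precomputed list of new files instead of letting it scan everything
        rsync -a --files-from=finished-frames.txt /render/ backup-host:/render/

        # skip the delta algorithm when most chunks differ anyway
        rsync -a --whole-file /render/ backup-host:/render/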

    • HPsquared7 hours ago |parent

      Rclone can "sync" with a range of different ways to check if the existing files are the same. If no hashes are available (e.g. WebDAV) I think you can set it to check by timestamp (with a tolerance) and size.

      Edit: oh I see, delta transfer only sends the changed parts of files?

      • newsoftheday6 hours ago |parent

        My understanding is that it only sends the changed parts of files (the diffs), which saves bandwidth.

        • mnw21caman hour ago |parent

          However this is only really good for slow connections. If your connection is faster than about 50MB/s then the delta calculation mechanism becomes the bottleneck. On fast connections you should use the -W option for rsync which switches the delta algorithm off.

    • plagiarist6 hours ago |parent

      Does rclone not do that? I thought they were specifically naming themselves similarly because they also did that.

      • newsoftheday6 hours ago |parent

        My understanding is that rclone does not do true delta sync sending only the differing parts of files like rsync.

  • ftchd5 hours ago

    Rclone is such an elegant piece of software, it reminds me of the time when most software worked well most of the time. There are few people who wouldn't benefit from it, either as a developer or end-user.

    I'm currently working on the GUI if you're interested: https://github.com/rclone-ui/rclone-ui

  • cachius8 hours ago

    rclone --multi-thread-streams allows transfers in parallel, like robocopy /MT

    You can also run multiple instances of rsync; the problem seems to be how to efficiently divide the set of files.
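
    e.g. something like this for a single large object (the remote name is a placeholder; --multi-thread-streams splits one big file into parallel chunks):

      rclone copy remote:big.bin /local/ --multi-thread-streams 8 --multi-thread-cutoff 256M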

    • cachius8 hours ago |parent

      > efficiently divide the set of files.

      It turns out, fpart does just that! Fpart is a Filesystem partitioner. It helps you sort file trees and pack them into bags (called "partitions"). It is developed in C and available under the BSD license.

      It comes with an rsync wrapper, fpsync. Now I'd like to see a benchmark of that vs rclone! via https://unix.stackexchange.com/q/189878/#688469 via https://stackoverflow.com/q/24058544/#comment93435424_255320...

      https://www.fpart.org/
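
      A rough fpsync invocation for reference (paths, host and chunk limits are just examples):

        # 8 parallel rsync workers, each chunk capped at ~2000 files or ~4 GiB
        fpsync -n 8 -f 2000 -s $((4*1024*1024*1024)) /data/src/ host:/data/dst/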

    • pama8 hours ago |parent

      Sometimes find (with desired maxdepth) piped to gnu-parallel rsync is fine.

    • SoftTalker8 hours ago |parent

      robocopy! Wow, blast from the past. Used to use it all the time when I worked in a Windows shop.

      • bob10298 hours ago |parent

        I am using robocopy right now on a project. The /MIR option is extremely useful for incrementally maintaining copies of large local directories.

    • adolph6 hours ago |parent

      My go-to for fast and easy parallelization is xargs -P.

        find a-bunch-of-files | xargs -P 10 do-something-with-a-file
      
             -P max-procs
             --max-procs=max-procs
                    Run up to max-procs processes at a time; the default is 1.
                    If max-procs is 0, xargs will run as many processes as
                    possible at a time.
      • akdev1l5 hours ago |parent

        note that one should use -print0 and -0 for safety
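
        i.e. something like:

          find a-bunch-of-files -print0 | xargs -0 -P 10 do-something-with-a-file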

        • adolph4 hours ago |parent

          Thanks! I've been using the -I{} do-something-to-file "{}" approach, which is also handy for times in which the input is one param among others. -0 is much faster.

          Edit: Looks like when doing file-by-file, -I{} is still needed:

            # find tmp -type f | xargs -0 ls
            ls: cannot access 'tmp/b file.md'$'\n''tmp/a file.md'$'\n''tmp/c file.md'$'\n': No such file or directory
          • elteto3 hours ago |parent

            You have to do `find ... -print0` so find also uses \0 as the separator.

          • akdev1l3 hours ago |parent

            find -print0 will print the files with null bytes as separators

            xargs -0 will use a null byte as separator for each argument

            printf 'a\0b\0c\0' | xargs -tI{} echo "file -> {}"

  • indigodaddy8 hours ago

    One thing that sets rsync apart perhaps is the handling of hard links when you don't want to send both/duplicated files to the destination? Not sure if rclone can do that.

  • kwanbix5 hours ago

    It is crazy to see how difficult google makes it for anyone to download their own pictures from google photos. Rclone used to allow you to download them, but not anymore. Only the ones uploaded by Rclone are available to download. I wish someone forced all cloud providers to allow you to download your own data. And no, Google Takeout doesn't count. It is horrible to use.

    • buu7094 hours ago |parent

      Not just bad to use, but doesn't fully work. I've been trying to get my photos off Google Photos to backup elsewhere, but takeout misses something like 20%-30% of them.

      • MisterTea3 hours ago |parent

        How did you verify that takeout shorted you on 20-30% which is a huge number? This worries me as I've done some takeouts but never fully poked through them.

      • kwanbix4 hours ago |parent

        yes, that is what I meant. Once I tried to download 600GB of photos, and it crashed.

  • xoa7 hours ago

    Thanks for sharing, hadn't seen it but at almost the same time he made that post I too was struggling to get decent NAS<>NAS transfer speeds with rsync. I should have thought to play more with rclone! I ended up using iSCSI but that is a lot more trouble.

    >In fact, some compression modes would actually slow things down as my energy-efficient NAS is running on some slower Arm cores

    Depending on the number/type of devices in the setup and usage patterns, it can be effective sometimes to have a single more powerful router and then use it directly as a hop for security or compression (or both) to a set of lower power devices. Like, I know it's not E2EE in the same way to send unencrypted data to one OPNsense router, Wireguard (or Nebula or whatever tunnel you prefer) to another over the internet, and then from there to a NAS. But if the NAS is in the same physically secure rack, directly attached by hardline to the router (or via an isolated switch), I don't think in practice it's enough less secure at the private service level to matter. If the router is a pretty important lynchpin anyway, it can be favorable to lean more heavily on that so one can go cheaper and lower power elsewhere. Not that more efficiency, hardware acceleration etc are at all bad, and conversely sometimes it might make sense to have a powerful NAS/other servers and a low power router, but there are good degrees of freedom there. Handier than ever in the current crazy times where sometimes hardware that was formerly easily and cheaply available is now a king's ransom or gone and one has to improvise.

  • Dunedan7 hours ago

    I wonder if at least part of the reason for the speed-up isn't the multi-threading, but instead that rclone maybe doesn't compress transferred data by default. That's what rsync does when using SSH, so for already compressed data (like videos for example) disabling SSH compression when invoking rsync speeds it up significantly:

      rsync -e "ssh -o Compression=no" ...
    • dspillettan hour ago |parent

      IIRC rsync uses your default SSH options, so turning off compression is only needed if your default config explicitly turns it on (generally or just for that host). When sending compressible content, using rsync's compression instead of SSH's is more effective when updating files, because even if it isn't sending everything it can use the existing data to form the compression dictionary window for what does get sent (though for sending whole files, SSH's compression may be preferable, as rsync is single-threaded and using SSH's compression moves that chunk of work to the SSH process).

    • nh26 hours ago |parent

      Compression is off by default in OpenSSH, at least `man 5 ssh_config` says:

      > Specifies whether to use compression. The argument must be yes or no (the default).

      So I'm surprised you see speedups with your invocation.

      • Dunedan4 hours ago |parent

        Good point. Seems like I enabled it in ~/.ssh/config ages ago and did forget about it. Nonetheless, it's good to check whether it's enabled when using rsync to transfer large, already well compressed files.

  • aidenn07 hours ago

    rclone is not as good as rsync for doing ad-hoc transfers; for anything not using the filesystem, you need to set up a configuration, which adds friction. It really is purpose-built for recurring transfers rather than "I need to move X to Y just once".

    • ruuda6 hours ago |parent

      We wrote https://github.com/chorusone/fastsync for fast ad-hoc transfers over multiple TCP streams.

  • KolmogorovComp7 hours ago

    Why are rclone/rsync never used by default for app updates? Especially games with large assets.

    • rjmunro7 hours ago |parent

      zsync is better for that. zsync precalculates all the hashes and puts them in a file alongside the main one. The client downloads the hashes, compares them to what it has, then downloads the parts it is missing.

      With rsync, you upload hashes of what you have, then the source has to do all the hashing work to figure out what to send you. It's slightly more efficient, but if you are supporting even tens of downloads it's a lot of work for the source.
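
      Roughly how that looks in practice (URL and filenames are placeholders):

        zsyncmake big.iso    # publisher: writes big.iso.zsync next to the file

        # client: reuses matching blocks from the old local copy, fetches the rest
        zsync -i old-big.iso https://example.com/big.iso.zsync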

      The other option is to send just a diff, which I believe e.g. Google Chrome does. Google invented Courgette and Zucchini which partially decompile binaries then recompile them on the other end to reduce the size of diffs. These only work for exact known previous versions, though.

      I wonder if the ideas of Courgette and Zucchini can be incorporated into zsync's hashes so that you get the minimal diff, but the flexibility of not having a perfect previous version to work from.

      • plagiarist6 hours ago |parent

        Do a CRDT but for binary executables

  • packetlost8 hours ago

    I use tab-complete to navigate remote folder structures with rsync all the time, does rclone have that?

    • nh26 hours ago |parent

      This is not a feature of rsync, but of your shell.

      So the question "does rclone have that" doesn't make much sense, because it usually wouldn't be rclone implementing it.

      For example, zsh does it here for rsync, which actually invokes `ssh` itself:

      https://github.com/zsh-users/zsh/blob/3e72a52e27d8ce8d8be0ee...

      https://github.com/zsh-users/zsh/blob/3e72a52e27d8ce8d8be0ee...

      That said, some CLI tools come with tools for shells to help them implement such things. E.g. `mytool completion-helper ...`

      But I don't get rclone SSH completions in zsh, as it doesn't call `_remote_files` for rclone:

      https://github.com/zsh-users/zsh/blob/3e72a52e27d8ce8d8be0ee...

  • rurban6 hours ago

    Thanks for the lms tips in the comments. Amazing!

  • tonymet3 hours ago

    golang concurrent IO is so accessible that even trivial IO transform scripts (e.g. compression, base64, md5sum/cksum) are very easy to multicore.

    You'd be astonished at how much faster even seemingly fast local IO can go when you unblock the IO

  • sneak7 hours ago

    What’s sad to me is that rsync hasn’t been touched to fix these issues in what feels like decades.

    • restalis3 hours ago |parent

      rsync does what it was designed to do, and the lack of scope creep is not a bad thing. There is "fpsync" - another tool on top of rsync (which was mentioned in one of the comments on the article's page) that covers the parallel processing use-case: https://manpages.debian.org/bullseye/fpart/fpsync.1.en.html

  • baal80spam9 hours ago

    I'll keep saying that rclone is a fantastic and underrated piece of software.

    • digiown8 hours ago |parent

      rclone is super cool, but unfortunately many of the providers it supports have such low rate limits that it's fairly difficult to use it to transfer much data at all.

      • plasticsoprano7 hours ago |parent

        This has been my problem. It's not necessarily that the rate limits are low (many can be gotten around by using multiple users to do the work, since the limits are per user), but how rclone handles those rate limits when it hits them. The exponential backoff will end up creating hours- and days-long delays that will screw up a migration.

        • PunchyHamster4 hours ago |parent

          I had to tweak some options because it was hitting the rate limit on B2, and the reaction of rclone to that was just... disabling deletes.

  • gjvc6 hours ago

    May 6, 2025 May 6, 2025 May 6, 2025 May 6, 2025 May 6, 2025 May 6, 2025 May 6, 2025